Ephemeris / 0.1.4
The archive of prediction markets
2026 / 04 / 17 / NYC
prototype / archive begins Q3 2026
01 Thesis

The Bloomberg Terminal
for prediction markets.

Every tick on Kalshi and Polymarket, paired with what the world knew at that moment. News, social, polling, on-chain, telemetry: synchronized, archived forever.

Ticks / year 1 target
2.4B
Context / year 1 target
18TB
Venues at launch
100%
Target cadence
1s
01.1 Fig. A / Sample query output · illustrative
$ eph query WILL_FED_CUT_MAY_2026 --range 2026-01-01..2026-04-17 --context on

┌─────────────┬───────┬───────┬────────┬───────────────────────────┐
│ timestamp   │ yes   │ no    │ volume │ context                   │
├─────────────┼───────┼───────┼────────┼───────────────────────────┤
│ 01-02 14:33 │ 0.42  │ 0.58  │  24100 │ CPI 2.8% YoY              │
│ 01-15 09:15 │ 0.51  │ 0.49  │  38402 │ Powell hawkish, Econ Club │
│ 02-28 16:00 │ 0.63  │ 0.37  │  67100 │ NFP miss, 143k v 190k     │
│ 03-19 14:00 │ 0.72  │ 0.28  │  89240 │ FOMC hold, dots softened  │
│ 04-17 12:00 │ 0.691 │ 0.309 │  12500 │ <illustrative>            │
└─────────────┴───────┴───────┴────────┴───────────────────────────┘

  price history / yes leg                           1s cadence
  0.80                                          ▂▄▆
  0.70                                  ▁▂▄▆██████▆
  0.60                        ▁▂▄▆████████████████▅
  0.50           ▁▁▂▂▃▄▅▆▇███████████████████
  0.40 ▁▁▁▂▂▃▃▄▅████████████████████████████▃
       └─────────────────────────────────────────────────
        JAN           FEB            MAR           APR

 287,419 rows · 1.3 MB parquet · 14 context fields
 joined to L2 book depth, r/federalreserve, Bloomberg econ wire,
  Atlanta Fed GDPNow, FOMC dot-plot deltas.
02 Thesis

Prediction markets are becoming the highest-signal source for event probabilities. No Bloomberg exists for this asset class yet.

02.1 / Context

A new asset class, no infrastructure.

Kalshi is CFTC-regulated. Polymarket has returned to US retail. Volume crossed ten billion dollars in 2025. Every major bank is researching a prediction-market desk. There is still no Bloomberg, no Refinitiv, no Koyfin. Tick-data vendors do not exist.

02.2 / Moat

History is a moat that compounds.

A competitor who starts in 2028 cannot retroactively capture what we capture during this launch window. Every day the archive runs, it becomes harder to replicate. This is not a feature that can be cloned. It is a position in time, and the clock starts in Q3 2026.

02.3 / Depth

Prices alone are not enough.

Backtesting event markets requires the full knowledge state at each timestamp — news, social, polls, weather, game scores. Ephemeris archives the world around every tick, synchronized to the second. Nobody else does this.
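
One way to read "the full knowledge state at each timestamp" is as a point-in-time (as-of) join: each tick is paired with the most recent context event published at or before it, and never with one from the future. The sketch below illustrates that idea under stated assumptions. The function name `as_of_join`, the tuple layout, and the sample rows (loosely modeled on Fig. A) are hypothetical and do not describe the Ephemeris implementation.

```python
from bisect import bisect_right
from datetime import datetime, timezone

def ts(s):
    """Parse an ISO-8601 UTC timestamp such as '2026-01-15T09:15:00Z'."""
    return datetime.strptime(s, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)

def as_of_join(ticks, context):
    """Pair each tick with the latest context event at or before its timestamp.

    ticks:   list of (timestamp, yes_price), any order
    context: list of (timestamp, text), any order
    Returns (timestamp, yes_price, text_or_None) tuples sorted by tick time.
    """
    context = sorted(context)
    ctx_times = [t for t, _ in context]
    joined = []
    for t, price in sorted(ticks):
        i = bisect_right(ctx_times, t)  # count of context events with time <= t
        text = context[i - 1][1] if i else None
        joined.append((t, price, text))
    return joined

# Hypothetical sample data, not real market history.
context = [
    (ts("2026-01-02T14:33:00Z"), "CPI 2.8% YoY"),
    (ts("2026-01-15T09:15:00Z"), "Powell hawkish, Econ Club"),
]
ticks = [
    (ts("2026-01-01T00:00:00Z"), 0.40),  # before any context exists
    (ts("2026-01-15T09:14:59Z"), 0.47),  # one second before the headline
    (ts("2026-01-15T09:15:00Z"), 0.51),
    (ts("2026-01-15T09:15:01Z"), 0.52),
]
rows = as_of_join(ticks, context)
```

In this sample, the tick at 09:14:59 still sees only the CPI headline, while the ticks at 09:15:00 and 09:15:01 see the Powell headline. That boundary is what keeps a backtest free of lookahead.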

03 Coverage

Two data layers. One timestamp.

Market layer Target · 1s

Venues
Kalshi, Polymarket. Manifold and Insight on roadmap.
Fields
Mid, bid/ask, L2 order book depth, trade prints, volume, open interest, resolution outcome.
Cadence
1 second for liquid contracts, 60 seconds for the long tail. All contracts, no gaps.
Coverage
Economics, politics, climate, crypto, sports, culture, science.
Format
Parquet, CSV, JSONL. SQL query interface on roadmap.
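
The cadence entry above ("1 second for liquid contracts, 60 seconds for the long tail ... no gaps") implies a fixed sampling grid rather than raw event times. The following is a minimal sketch of one common way to build such a grid, assuming the last observed price is carried forward between prints. `resample_ffill` and the sample prints are illustrative names and values, not part of any Ephemeris API.

```python
def resample_ffill(prints, start, end, step=1):
    """Sample irregular (epoch_second, price) prints onto a fixed grid.

    Each grid point carries the last price observed at or before it
    (forward fill); grid points before the first print are None.
    start and end are inclusive epoch seconds; step is the cadence in seconds.
    """
    prints = sorted(prints)
    grid = []
    last = None
    i = 0
    for t in range(start, end + 1, step):
        # Consume every print that happened at or before this grid point.
        while i < len(prints) and prints[i][0] <= t:
            last = prints[i][1]
            i += 1
        grid.append((t, last))
    return grid

# Hypothetical trade prints: (epoch second, yes price).
prints = [(100, 0.42), (103, 0.44), (104, 0.43)]
one_second = resample_ffill(prints, 99, 105, step=1)   # liquid-contract cadence
long_tail = resample_ffill(prints, 60, 180, step=60)   # long-tail cadence
```

Forward fill is only one possible gap policy. A real archive might instead store an explicit "no update" marker so that consumers can tell a quiet market from a missed sample.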

Context layer Target · 15s

News
Synchronized headline snapshots from Bloomberg, Reuters, AP, and topic-specific feeds at every tick.
Social
Reddit subreddit state, top comments, velocity. X/Twitter posts for flagged keywords.
Polling
538, Nate Silver, YouGov, Morning Consult, plus state-level trackers.
On-chain
ETH/BTC mempool, DEX volumes, wallet flows for crypto contracts.
Telemetry
Weather (NOAA), sports live scores, seismic (USGS), game state for esports markets.
// Fig. B. Cross-source row, single timestamp · illustrative
{
  "ts": "2024-11-05T23:14:07Z",
  "market": "polymarket:trump-wins-2024",
  "yes": 0.843, "no": 0.157,
  "orderbook": { "bids": [[0.842, 18200], [0.841, 47510]], "asks": [[0.844, 11200]] },
  "volume_1m": 2148530,
  "context": {
    "headlines": [
      { "src": "AP",        "ts": "23:12:41Z", "txt": "Trump projected winner in PA per AP" },
      { "src": "Bloomberg", "ts": "23:10:02Z", "txt": "Harris campaign halts call to supporters" }
    ],
    "reddit": { "r/politics": { "front_page_score": 0.91, "velocity": "+412/min" } },
    "polls":  { "silver_bulletin_538": { "trump_ev": 276, "delta_1h": "+9" } }
  }
}
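
As a rough illustration of how a row shaped like Fig. B might be consumed, the Python sketch below parses the market portion of that row and derives top-of-book values. It assumes each order book level is a `[price, size]` pair, as the figure suggests. The helper `top_of_book` is a hypothetical name, not a documented Ephemeris function.

```python
import json

# Market fields copied from Fig. B (context omitted for brevity).
row = json.loads("""
{
  "ts": "2024-11-05T23:14:07Z",
  "market": "polymarket:trump-wins-2024",
  "yes": 0.843, "no": 0.157,
  "orderbook": { "bids": [[0.842, 18200], [0.841, 47510]], "asks": [[0.844, 11200]] }
}
""")

def top_of_book(orderbook):
    """Return (best_bid, best_ask, mid, spread) from [price, size] levels."""
    best_bid = max(price for price, _ in orderbook["bids"])
    best_ask = min(price for price, _ in orderbook["asks"])
    mid = (best_bid + best_ask) / 2
    # Round to suppress binary floating-point noise in the derived values.
    return best_bid, best_ask, round(mid, 6), round(best_ask - best_bid, 6)

best_bid, best_ask, mid, spread = top_of_book(row["orderbook"])
bid_depth = sum(size for _, size in row["orderbook"]["bids"])  # total resting bid size
```

For the illustrative row, the best bid is 0.842 and the best ask is 0.844, so the mid comes out at 0.843, which matches the row's `yes` field.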
04 Technical

The details investors and engineers both ask.

Q.01

When does capture begin?

Q3 2026. No archive exists today. Year-1 targets: 2.4 billion market ticks across Kalshi and Polymarket, 18TB of synchronized context, growing by roughly 120GB per week. We are starting now because a competitor loses one unrecoverable day of history for every day they wait.

Scale
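
To give the tick target a sense of scale, the arithmetic below converts 2.4 billion ticks per year into an average rate. It is a back-of-envelope sketch that assumes round-the-clock capture over a 365-day year. The helper `yearly_ticks` and its liquid / long-tail split are hypothetical, since the source states only the total.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60   # 31,536,000
TICKS_TARGET = 2_400_000_000            # year-1 tick target from the text

# Average capture rate implied by the target, across both venues.
avg_ticks_per_second = TICKS_TARGET / SECONDS_PER_YEAR   # about 76.1

def yearly_ticks(n_liquid, n_tail):
    """Ticks per year for a hypothetical mix of contracts.

    n_liquid contracts sampled every second, n_tail contracts sampled
    every 60 seconds, all assumed to be open for the whole year.
    """
    return n_liquid * SECONDS_PER_YEAR + n_tail * (SECONDS_PER_YEAR // 60)

# 76 always-open contracts at 1s cadence land just under the target.
liquid_only = yearly_ticks(76, 0)       # 2,396,736,000
```

Under these assumptions, the target works out to roughly 76 contracts sampled continuously at 1s cadence. Sixty long-tail contracts at 60s cadence add as many ticks as one liquid contract.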
Q.02

Target capture latency.

1-second cadence for liquid contracts on both venues. Target wire-to-disk latency: under 400ms for Kalshi, under 900ms for Polymarket. Context sampled every 15s to 5min, depending on source. Engineering in progress; exact numbers at launch.

Perf
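
The wire-to-disk targets above can be expressed as a simple per-venue budget check. The sketch below assumes each captured message carries a receive timestamp and a write timestamp. The names `BUDGET_MS`, `wire_to_disk_ms`, and `over_budget`, along with the sample values, are hypothetical and only illustrate how such a target might be monitored.

```python
from datetime import datetime, timedelta

# Wire-to-disk targets from the text, in milliseconds ("under" means strictly below).
BUDGET_MS = {"kalshi": 400, "polymarket": 900}

def wire_to_disk_ms(received_at, written_at):
    """Latency between receiving a message and persisting it, in milliseconds."""
    return (written_at - received_at) / timedelta(milliseconds=1)

def over_budget(samples):
    """Return (venue, latency_ms) for samples that miss their venue's target.

    samples: iterable of (venue, received_at, written_at) with datetime values.
    """
    late = []
    for venue, received_at, written_at in samples:
        ms = wire_to_disk_ms(received_at, written_at)
        if ms >= BUDGET_MS[venue]:
            late.append((venue, ms))
    return late

# Hypothetical samples, not measured data.
t0 = datetime(2026, 7, 1, 12, 0, 0)
samples = [
    ("kalshi", t0, t0 + timedelta(milliseconds=250)),
    ("kalshi", t0, t0 + timedelta(milliseconds=450)),
    ("polymarket", t0, t0 + timedelta(milliseconds=450)),
]
late = over_budget(samples)
```

Here only the second Kalshi sample is flagged. The same 450ms delay is within the looser Polymarket target.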
Q.03

Can I get redistributable data?

Redistribution rights will be negotiated with Kalshi and Polymarket before public launch. Early access is research and internal-use only, under a standard data license. A public commercial API follows once rights are in place.

Legal
Q.04

Pricing.

Academic research: free. Individual traders: free tier plus metered query-volume pricing. Funds and institutions: custom, based on feed breadth and refresh rate. Quoted on request.

Price
Q.05

Why wait? Why not scrape this yourself?

Because starting now is the point. Today, no archive exists. In twelve months, one will, and it will belong to whoever is capturing during this window. The moat is time itself. A competitor who arrives in 2028 cannot retrieve what we capture in 2026 and 2027, no matter how good their engineers are.

Moat
05 Access

The record nobody is keeping.
Start the archive with us.