PMXT
When running a PMXT-backed quote-tick backtest, the loader fetches historical L2 order-book data one hour at a time. In the current codebase, the hour lookup order is:
- local filtered cache
- each explicit raw source in DATA.sources, left to right
- none
DATA.sources is prefix-driven on purpose: use local:, archive:, and
relay: only. Bare hosts, bare paths, and alias prefixes are not accepted.
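As a rough illustration of the prefix rule, the sketch below splits a DATA.sources entry into a kind and a target and rejects anything without one of the three prefixes. The function name `parse_source` and the exact entry spellings in the usage lines are assumptions for illustration, not the repository's actual parser.

```python
# Hypothetical sketch of prefix-driven source parsing; not the repo's real code.
ALLOWED_PREFIXES = ("local:", "archive:", "relay:")


def parse_source(entry: str) -> tuple[str, str]:
    """Split a DATA.sources entry into (kind, target).

    Raises ValueError for bare hosts, bare paths, or unknown prefixes.
    """
    for prefix in ALLOWED_PREFIXES:
        if entry.startswith(prefix):
            target = entry[len(prefix):]
            if not target:
                raise ValueError(f"source {entry!r} has a prefix but no target")
            # Drop the trailing colon so the kind reads "local", "archive", "relay".
            return prefix[:-1], target
    raise ValueError(
        f"source {entry!r} must start with one of {', '.join(ALLOWED_PREFIXES)}"
    )


if __name__ == "__main__":
    print(parse_source("local:/Volumes/LaCie/pmxt_raws"))
    print(parse_source("archive:https://r2.pmxt.dev"))
```

Failing loudly on an unprefixed entry keeps the lookup order explicit: a bare path is never silently guessed to be a local mirror.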
Three practical notes matter here:
- the shared public relay for this repository is mirror-first, so raw-hour serving is the supported shared-server path
- the public runner layer disables relay-hosted filtered parquet
- PMXT upstream raw hours live at flat object URLs like https://r2.pmxt.dev/polymarket_orderbook_YYYY-MM-DDTHH.parquet, while the local mirror serves those same files under dated /v1/raw/YYYY/MM/DD/... paths
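To make the flat upstream naming concrete, here is a minimal sketch that builds the upstream URL for one hour. The helper name `upstream_raw_url` is hypothetical, and treating the hour in the file name as UTC is an assumption based on the +00:00 timestamps in the runner output. The mirror path is left out because its file-name portion is not spelled out here.

```python
from datetime import datetime, timezone

# Base URL taken from the example above; the helper itself is hypothetical.
UPSTREAM_BASE = "https://r2.pmxt.dev"


def upstream_raw_url(hour: datetime) -> str:
    """Return the flat upstream object URL for the hour containing `hour`."""
    if hour.tzinfo is None:
        raise ValueError("hour must be timezone-aware")
    # Assumption: file names are keyed by UTC hour.
    hour_utc = hour.astimezone(timezone.utc)
    return f"{UPSTREAM_BASE}/polymarket_orderbook_{hour_utc:%Y-%m-%dT%H}.parquet"


if __name__ == "__main__":
    print(upstream_raw_url(datetime(2026, 2, 21, 19, tzinfo=timezone.utc)))
```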
After a successful fetch from a raw source, the result is written to the local filtered cache so subsequent runs are fast.
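The lookup order and the write-back step can be sketched as a small function. This is an illustration under stated assumptions, not the loader's real implementation: `resolve_hour`, the dict-backed cache, and the fetcher callables are all hypothetical stand-ins, and whether a confirmed miss is itself cached is not covered here, so the sketch does not cache it.

```python
from typing import Callable, Optional, Sequence

Rows = list  # stand-in for the filtered rows of one hour
Fetcher = Callable[[str], Optional[Rows]]  # returns None when the source lacks the hour


def resolve_hour(
    hour: str,
    cache: dict[str, Rows],
    sources: Sequence[tuple[str, Fetcher]],
) -> tuple[str, Rows]:
    """Resolve one hour: cache first, then each source left to right, then none."""
    if hour in cache:
        return "cache", cache[hour]
    for name, fetch in sources:
        rows = fetch(hour)
        if rows is not None:
            # Write back so the next run is served from the filtered cache.
            cache[hour] = rows
            return name, rows
    return "none", []


if __name__ == "__main__":
    cache: dict[str, Rows] = {}
    sources = [
        ("local raw", lambda hour: None),            # mirror does not have the hour
        ("archive", lambda hour: [{"price": 0.5}]),  # upstream does
    ]
    print(resolve_hour("2026-02-21T19", cache, sources))  # satisfied by "archive"
    print(resolve_hour("2026-02-21T19", cache, sources))  # now satisfied by "cache"
```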
Example Output
The timing harness prints one completion line per resolved hour and keeps an
aggregate progress bar for any hours that are still in flight. A representative
make backtest PMXT run looks like this:
make backtest
uv run python main.py
Running: polymarket_quote_tick_pmxt_panic_fade
PMXT source: explicit priority (cache -> local /Volumes/LaCie/pmxt_raws -> archive https://r2.pmxt.dev -> relay https://209-209-10-83.sslip.io)
Loading PMXT Polymarket market will-openai-launch-a-new-consumer-hardware-product-by-march-31-2026 (token_index=0, window_start=2026-02-21T16:00:00+00:00, window_end=2026-02-23T10:00:00+00:00)...
2026-02-21T18:00:00+00:00 0.002s 263 rows cache 2026-02-21T18
2026-02-21T17:00:00+00:00 0.002s 339 rows cache 2026-02-21T17
2026-02-21T16:00:00+00:00 0.003s 117 rows cache 2026-02-21T16
2026-02-21T15:00:00+00:00 0.553s 0 rows none
2026-02-21T19:00:00+00:00 6.466s 862 rows local raw 2026-02-21T19
2026-02-22T01:00:00+00:00 22.608s 4156 rows local raw 2026-02-22T01
2026-02-22T16:00:00+00:00 24.068s 3571 rows local raw 2026-02-22T16
Fetching hours (41/44 done, 3 active): 95%|████████████████████████████████████████████████████████████████████████████████████████████████████████▉| [02:43<00:07], prefetch: - local raw 2026-02-23T09 scan 529.8MiB 574b 220r 3.6s | local raw 2026-02-23T08 scan 563.5MiB 3.6s | +1 more
The important signals are:
- the PMXT source: line shows the exact cache, local, archive, and relay priority the runner will use
- each per-hour line shows the hour, load time, filtered row count, and the source that satisfied that hour
- cache, local raw, and none tell you whether the hour came from warm cache, the local raw mirror, or a confirmed miss
- done and active on the aggregate bar show how much of the window has completed and how many hours are still in flight
- the prefetch: segment shows the currently active raw-hour scans or transfers, including source, hour, bytes, and elapsed time
The exact timings, row counts, and active prefetch details vary with cache warmth, mirror speed, and the requested window.
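If you want to summarize a captured run, the per-hour completion lines are regular enough to parse. The sketch below is one possible approach. The regular expression is inferred only from the sample output above, and `parse_hour_line` and `HourLine` are hypothetical names, so treat it as a starting point rather than a stable log format.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional

# Pattern inferred from the sample lines above; the trailing hour label is optional
# because "none" lines do not carry one.
LINE_RE = re.compile(
    r"^(?P<hour>\S+)\s+(?P<seconds>\d+(?:\.\d+)?)s\s+(?P<rows>\d+) rows\s+"
    r"(?P<source>.+?)(?:\s+\d{4}-\d{2}-\d{2}T\d{2})?$"
)


class HourLine(NamedTuple):
    hour: datetime
    seconds: float
    rows: int
    source: str


def parse_hour_line(line: str) -> Optional[HourLine]:
    """Parse one per-hour completion line; return None for any other line."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    try:
        hour = datetime.fromisoformat(match["hour"])
    except ValueError:
        return None
    return HourLine(hour, float(match["seconds"]), int(match["rows"]), match["source"])


if __name__ == "__main__":
    sample = "2026-02-22T01:00:00+00:00 22.608s 4156 rows local raw 2026-02-22T01"
    print(parse_hour_line(sample))
```

Grouping the parsed lines by source gives a quick view of how much of a window was served from warm cache versus raw scans.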
Timing Expectations By Source
| Source | Typical time | When it happens |
|---|---|---|
| Local cache | <0.05s | Second run onward for the same market/token/hour |
| Local raw PMXT archive | local disk bound | You mirrored raw PMXT hours locally and pointed DATA.sources at local:/..., or used PMXT_RAW_ROOT for a lower-level loader workflow |
| Remote raw PMXT archive | network and file-size bound | Hour is missing from local cache and local raw mirror, so the client downloads the upstream raw parquet to a temp file and filters it locally |
| Relay raw mirror | network and file-size bound | A mirror-only relay serves /v1/raw/..., so the client downloads the raw parquet to a temp file and filters it locally |
| None | <1s | Hour does not exist yet |
How To See This Output
Timing is enabled by default in the interactive menu and direct script runners
that use @timing_harness.
Turn it off explicitly with:
BACKTEST_ENABLE_TIMING=0 make backtest
Or run any PMXT runner directly:
uv run python backtests/polymarket_quote_tick_pmxt_ema_crossover.py
You can also time a runner explicitly through the harness test helper:
uv run python backtests/_shared/_timing_test.py backtests/polymarket_quote_tick_pmxt_ema_crossover.py
Public PMXT examples are pinned to known-good sample windows in code so the
direct script paths stay runnable without editing the file first. If your local
raw mirror or relay lives somewhere else, update DATA.sources in the runner
file.