PMXT

When running a PMXT-backed quote-tick backtest, the loader fetches historical L2 order-book data one hour at a time. In the current codebase, the hour lookup order is:

  1. local filtered cache
  2. each explicit raw source in DATA.sources, left to right
  3. none

DATA.sources is prefix-driven on purpose: use local:, archive:, and relay: only. Bare hosts, bare paths, and alias prefixes are not accepted.

Two practical notes matter here:

  • the shared public relay for this repository is mirror-first, so raw-hour serving is the supported shared-server path
  • the public runner layer disables relay-hosted filtered parquet
  • PMXT upstream raw hours live at flat object URLs like https://r2.pmxt.dev/polymarket_orderbook_YYYY-MM-DDTHH.parquet, while the local mirror serves those same files under dated /v1/raw/YYYY/MM/DD/... paths

After a successful fetch from a raw source, the result is written to the local filtered cache so subsequent runs are fast.

Example Output

The timing harness prints one completion line per resolved hour and keeps an aggregate progress bar for any hours that are still in flight. A representative make backtest PMXT run looks like this:

make backtest
uv run python main.py

Running: polymarket_quote_tick_pmxt_panic_fade

PMXT source: explicit priority (cache -> local /Volumes/LaCie/pmxt_raws -> archive https://r2.pmxt.dev -> relay https://209-209-10-83.sslip.io)
Loading PMXT Polymarket market will-openai-launch-a-new-consumer-hardware-product-by-march-31-2026 (token_index=0, window_start=2026-02-21T16:00:00+00:00, window_end=2026-02-23T10:00:00+00:00)...
  2026-02-21T18:00:00+00:00      0.002s       263 rows  cache 2026-02-21T18
  2026-02-21T17:00:00+00:00      0.002s       339 rows  cache 2026-02-21T17
  2026-02-21T16:00:00+00:00      0.003s       117 rows  cache 2026-02-21T16
  2026-02-21T15:00:00+00:00      0.553s         0 rows  none
  2026-02-21T19:00:00+00:00      6.466s       862 rows  local raw 2026-02-21T19
  2026-02-22T01:00:00+00:00     22.608s      4156 rows  local raw 2026-02-22T01
  2026-02-22T16:00:00+00:00     24.068s      3571 rows  local raw 2026-02-22T16
Fetching hours (41/44 done, 3 active):  95%|████████████████████████████████████████████████████████████████████████████████████████████████████████▉| [02:43<00:07], prefetch: - local raw 2026-02-23T09 scan 529.8MiB 574b 220r 3.6s | local raw 2026-02-23T08 scan 563.5MiB 3.6s | +1 more

The important signals are:

  • the PMXT source: line shows the exact cache, local, archive, and relay priority the runner will use
  • each per-hour line shows the hour, load time, filtered row count, and the source that satisfied that hour
  • cache, local raw, and none tell you whether the hour came from warm cache, the local raw mirror, or a confirmed miss
  • done and active on the aggregate bar show how much of the window has completed and how many hours are still in flight
  • the prefetch: segment shows the currently active raw-hour scans or transfers, including source, hour, bytes, and elapsed time

The exact timings, row counts, and active prefetch details vary with cache warmth, mirror speed, and the requested window.

Timing Expectations By Source

Source Typical time When it happens
Local cache <0.05s Second run onward for the same market/token/hour
Local raw PMXT archive local disk bound You mirrored raw PMXT hours locally and pointed DATA.sources at local:/..., or used PMXT_RAW_ROOT for a lower-level loader workflow
Remote raw PMXT archive network and file-size bound Hour is missing from local cache and local raw mirror, so the client downloads the upstream raw parquet to a temp file and filters it locally
Relay raw mirror network and file-size bound A mirror-only relay serves /v1/raw/..., so the client downloads the raw parquet to a temp file and filters it locally
None <1s Hour does not exist yet

How To See This Output

Timing is enabled by default in the interactive menu and direct script runners that use @timing_harness.

Turn it off explicitly with:

BACKTEST_ENABLE_TIMING=0 make backtest

Or run any PMXT runner directly:

uv run python backtests/polymarket_quote_tick_pmxt_ema_crossover.py

You can also time a runner explicitly through the harness test helper:

uv run python backtests/_shared/_timing_test.py backtests/polymarket_quote_tick_pmxt_ema_crossover.py

Public PMXT examples are pinned to known-good sample windows in code so the direct script paths stay runnable without editing the file first. If your local raw mirror or relay lives somewhere else, update DATA.sources in the runner file.