Institutional-Grade
Market Microstructure
Market microstructure data for Hyperliquid and Polymarket. Hyperliquid ships as Zstd Parquet; Polymarket ships as raw hourly CLOB WebSocket JSONL.ZST.
What You Can Build
Real use cases. Not hypothetical demos.
Smart Money Detection
Track wallet-level order flow to identify informed traders. Cluster wallets by behavior, detect accumulation/distribution patterns before price moves.
Execution Quality Analysis
Maker/taker attribution with per-fill fees. Model slippage, measure fill rates, and benchmark execution quality across venues.
ML Signal Engineering
Block-sequenced features with causal ordering guaranteed. Build order flow imbalance, cancel ratios, and toxicity signals for price prediction models.
Market Microstructure Research
20-level orderbook snapshots plus funding rate dynamics across 10 assets. Study liquidity provision, order book imbalance (OBI) signals, and cross-asset correlations on a fully on-chain exchange.
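As a rough illustration of the OBI signal mentioned above, the sketch below computes a depth-limited order book imbalance from a single snapshot. The function name and the plain-list inputs are assumptions made for illustration; the delivered book schema may expose levels differently.

```python
def order_book_imbalance(bid_sizes, ask_sizes, depth=20):
    """Return (bid - ask) / (bid + ask) over the top `depth` levels.

    Inputs are hypothetical: resting size per level, best level first.
    """
    bid = sum(bid_sizes[:depth])
    ask = sum(ask_sizes[:depth])
    total = bid + ask
    # An empty book has no imbalance; avoid dividing by zero.
    return 0.0 if total == 0 else (bid - ask) / total


# Toy snapshot with three levels per side (made-up sizes).
obi = order_book_imbalance([3.0, 1.0, 2.0], [1.0, 1.0, 1.0])
print(obi)
```

Values near +1 indicate a bid-heavy book and values near -1 an ask-heavy book; the `depth` argument lets the same function be reused for top-of-book or full 20-level studies.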
Data Catalog
Every field verified. Every schema documented.
Orders
Per-block ~2s
Fills (L1)
Per-trade
L2 Orderbook
60s snapshots
Funding + OI
5 min
BTC Microstructure Book
Standalone BTC high-frequency historical data for spread dynamics, queue depth, liquidity shocks, adverse selection, and short-horizon execution research. Binance context data is included as a bonus for cross-venue validation.
Load in One Line
Hyperliquid loads as Parquet. Polymarket replays as raw compressed JSONL.
import polars as pl

# Load 163M+ order events with 100% order_id coverage
orders = pl.read_parquet("orders/*.parquet")
# Filter to a single market
btc = orders.filter(pl.col("coin") == "BTC")
# Preview the first rows across all markets
print(orders.head(3))
| block_height | timestamp_ms | wallet | coin | action | status | is_buy | price | size | order_type | tif | reduce_only | order_id | cloid |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| i64 | i64 | str | str | str | str | bool | f64 | f64 | str | str | bool | i64 | str |
| 0 | 1774396843090 | 0xce71..5a42 | BTC | Rejected | badAloPxRejected | false | 70525.0 | 0.12282 | Limit | Alo | false | 360336907743 | 0x0000.. |
| 0 | 1774396843090 | 0xf36c..e1a4 | PAXG | Filled | filled | false | 4548.0 | 0.56 | Limit | Alo | false | 360336907286 | 0x1490.. |
| 0 | 1774396843090 | 0x5483..a6b2 | SOL | PlaceOrder | open | true | 90.282 | 88.34 | Limit | Alo | false | 359298802930 | 0xc71b.. |
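Once loaded, rows like the ones above can be tallied by coin and action. This is a minimal sketch in plain Python, assuming rows arrive as dicts (for example via Polars' `iter_rows(named=True)`). The `action_counts` helper is hypothetical, and the three rows are copied from the sample output with most columns omitted.

```python
from collections import Counter


def action_counts(rows):
    """Count order events per (coin, action) pair."""
    return Counter((row["coin"], row["action"]) for row in rows)


# Subset of columns from the three sample rows shown above.
sample_rows = [
    {"coin": "BTC", "action": "Rejected", "status": "badAloPxRejected"},
    {"coin": "PAXG", "action": "Filled", "status": "filled"},
    {"coin": "SOL", "action": "PlaceOrder", "status": "open"},
]

counts = action_counts(sample_rows)
for (coin, action), n in sorted(counts.items()):
    print(coin, action, n)
```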
Delivery & Folder Structure
Private Google Drive delivery. UTC-organized files with source-specific formats.
10-Market Order Flow Delivery
The standard package ships as one clean folder with stream-first layout. Each stream is partitioned by UTC
date and time bucket, with a coin column covering all 10 supported Hyperliquid markets.
l1ticks_hyperliquid_YYYY-MM-DD_to_YYYY-MM-DD/
├── hl_orders/
│ └── YYYY-MM-DD/HH-MM.parquet
├── hl_fills/
│ └── YYYY-MM-DD/HH-MM.parquet
├── hl_book/
│ └── YYYY-MM-DD/HH-MM-SS.parquet
└── hl_funding/
  └── YYYY-MM-DD/HH.parquet
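To locate the file that should hold a given event, the layout above can be turned into a small path builder. This is a sketch under stated assumptions: the `partition_path` helper is hypothetical, the root folder name is a placeholder, and it assumes each file is named after the UTC time at the start of its bucket, which the listing does not state explicitly.

```python
from datetime import datetime, timedelta, timezone
from pathlib import PurePosixPath

# File-name time format per stream, taken from the folder tree above.
BUCKET_FORMAT = {
    "hl_orders": "%H-%M",
    "hl_fills": "%H-%M",
    "hl_book": "%H-%M-%S",
    "hl_funding": "%H",
}

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def partition_path(root, stream, timestamp_ms):
    """Build the relative file path for a UTC millisecond timestamp.

    Assumption: files are named after the UTC start of their time bucket.
    """
    ts = EPOCH + timedelta(milliseconds=timestamp_ms)
    day = ts.strftime("%Y-%m-%d")
    bucket = ts.strftime(BUCKET_FORMAT[stream])
    return PurePosixPath(root) / stream / day / f"{bucket}.parquet"


# timestamp_ms taken from the sample orders table; root name is a placeholder.
path = partition_path("l1ticks_hyperliquid", "hl_orders", 1774396843090)
print(path)
```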
BTC High-Frequency Historical Delivery
The BTC high-frequency package is delivered separately for microstructure research. The core product is BTC historical book data; Binance folders are included as bonus cross-venue context.
btc_high_frequency_historical/
├── hl_book/
│ └── YYYY-MM-DD/HH-MM-SS.parquet
├── hl_orders/
│ └── YYYY-MM-DD/HH-MM.parquet
├── hl_fills/
│ └── YYYY-MM-DD/HH-MM.parquet
├── hl_funding/
│ └── YYYY-MM-DD/HH.parquet
├── hl_candles/
│ └── YYYY-MM-DD.parquet
├── hl_snapshots/
│ └── YYYY-MM-DD.parquet
├── binance_futures_bbo/
│ └── YYYY-MM-DD/HH-MM.parquet
├── binance_spot_bbo/
│ └── YYYY-MM-DD/HH-MM.parquet
├── binance_futures_trades/
│ └── YYYY-MM-DD/HH-MM.parquet
└── binance_futures_liquidations/
  └── YYYY-MM-DD/HH-MM.parquet
Why This Data
Built for quants, not dashboards.
100% Order Lifecycle Tracking
Every order carries an exchange-assigned order_id. Deterministic place-to-fill linkage with no fuzzy matching. Track the full lifecycle: placement, rejection, fill, cancellation.
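The place-to-fill linkage described above amounts to grouping events by `order_id` and ordering them in time. The following is only a sketch: the `lifecycles` helper and the events are made up, while the column names and action values follow the sample orders table.

```python
from collections import defaultdict


def lifecycles(events):
    """Group events by order_id and list each order's actions in time order."""
    by_order = defaultdict(list)
    for event in events:
        by_order[event["order_id"]].append(event)
    return {
        order_id: [e["action"] for e in sorted(group, key=lambda e: e["timestamp_ms"])]
        for order_id, group in by_order.items()
    }


# Made-up events, deliberately out of order.
events = [
    {"order_id": 2, "timestamp_ms": 1_000, "action": "Rejected"},
    {"order_id": 1, "timestamp_ms": 3_000, "action": "Filled"},
    {"order_id": 1, "timestamp_ms": 1_000, "action": "PlaceOrder"},
]

paths = lifecycles(events)
print(paths)
```

Because the join key is an exact identifier, no price, size, or time tolerance is needed to match a fill to its placement.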
Wallet-Level Order Flow
Track individual accounts across executions. See who's placing, canceling, and getting filled.
Maker / Taker Attribution
Every fill tagged with execution role. Know who initiated each trade at the protocol level.
L1 Native Status Stream
14 granular status enums direct from the exchange engine. Distinguish accepted orders from ALO rejections, partial fills from full fills, triggered stops from manual cancels.
Parquet-Native
Zstd-compressed, columnar storage. One line of Python to load. No JSON parsing, no CSV headers.
Cross-Asset Coverage
Crypto majors, DeFi protocols, memecoins, and index perps, all traded on a decentralized order book. Order-level data across this mix of sectors is rarely available from a single venue.
Deterministic Clock Alignment
Data is snapped to strict UTC clock boundaries with zero drift and delivered in complete daily grids (e.g., exactly 1,440 order files per day at 60s intervals).
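A fixed daily grid makes completeness easy to check on the receiving side. The sketch below generates the 1,440 expected file names for one day, using the `HH-MM.parquet` naming from the folder layout, and reports any that are absent. Both helper names are hypothetical.

```python
def expected_minute_files():
    """File names for one complete UTC day at 60s intervals."""
    return [f"{h:02d}-{m:02d}.parquet" for h in range(24) for m in range(60)]


def missing_files(present):
    """Return expected file names that are absent from `present`."""
    have = set(present)
    return [name for name in expected_minute_files() if name not in have]


grid = expected_minute_files()
# Pretend one file failed to download.
gaps = missing_files(name for name in grid if name != "12-30.parquet")
print(len(grid), gaps)
```

With a real delivery, `present` could be the result of listing one `YYYY-MM-DD` folder.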
Zero Midnight Leakage
State persistence routing guarantees a 23:59:59 payload never contaminates the next day's partition, saving your quants hours of manual resampling.
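The partition guarantee above can be spot-checked by confirming that every timestamp in a day's files maps to that UTC date. This is a minimal sketch with hypothetical helper names; it assumes the `timestamp_ms` column holds Unix epoch milliseconds.

```python
from datetime import date, datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def utc_date(timestamp_ms):
    """UTC calendar date of an epoch-millisecond timestamp."""
    return (EPOCH + timedelta(milliseconds=timestamp_ms)).date()


def leaked_timestamps(timestamps_ms, partition_day):
    """Return timestamps that do not belong to the partition's UTC date."""
    return [t for t in timestamps_ms if utc_date(t) != partition_day]


day = date(2026, 3, 25)
# First value is from the sample orders table; the second is 1 ms before midnight.
stamps = [1774396843090, 1774396799999]
leaks = leaked_timestamps(stamps, day)
print(leaks)
```

An empty result for a real partition means no prior-day or next-day rows are present in it.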
Frequently Asked Questions
Everything you need to know about the data.
What is L1 Ticks?
L1 Ticks provides market microstructure data for Hyperliquid and Polymarket. Hyperliquid is captured from L1 infrastructure and delivered as Zstd-compressed Parquet. Polymarket is captured from CLOB WebSocket feeds and sold as raw hourly JSONL.ZST tick files.
How is the data captured?
For Hyperliquid, we operate L1 node infrastructure that reads native order status streams in real time. For Polymarket, the raw product comes from our CLOB WebSocket recorder and active market discovery. The paid Polymarket package does not include outside scraper archives or derived feature tables.
What format is the data in?
Hyperliquid packages ship as Apache Parquet files with Zstd level 3 compression. The Polymarket raw
package uses one format only: hourly ticks_ws_YYYY-MM-DDTHH.jsonl.zst files, with one raw
WebSocket payload per line.
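Replaying the Polymarket files can be sketched as follows. The helper names are hypothetical, the hour in the file name is assumed to be UTC, and `open_zst_lines` relies on the third-party `zstandard` package, which is an assumption about your environment rather than a requirement stated here. Payload contents are not interpreted, since their fields are not documented in this section.

```python
import io
import json
import re
from datetime import datetime, timezone

NAME_RE = re.compile(r"^ticks_ws_(\d{4}-\d{2}-\d{2})T(\d{2})\.jsonl\.zst$")


def file_hour(filename):
    """Parse the hour bucket from a ticks_ws_YYYY-MM-DDTHH.jsonl.zst name."""
    match = NAME_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename}")
    stamp = datetime.strptime(f"{match.group(1)} {match.group(2)}", "%Y-%m-%d %H")
    return stamp.replace(tzinfo=timezone.utc)  # assumption: names are in UTC


def iter_payloads(lines):
    """Yield one decoded JSON payload per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def open_zst_lines(path):
    """Yield decompressed text lines; needs the third-party `zstandard` package."""
    import zstandard  # imported lazily so the rest of the sketch runs without it

    with open(path, "rb") as fh:
        reader = zstandard.ZstdDecompressor().stream_reader(fh)
        yield from io.TextIOWrapper(reader, encoding="utf-8")


# Made-up payloads standing in for decompressed lines.
demo = list(iter_payloads(['{"seq": 1}\n', "\n", '{"seq": 2}\n']))
print(file_hour("ticks_ws_2026-03-25T00.jsonl.zst"), demo)
```

For a real file, the two pieces compose as `iter_payloads(open_zst_lines(path))`, which streams the archive without loading it fully into memory.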
What markets are covered?
Hyperliquid: BTC, ETH, SOL, HYPE, SPX, PAXG, XRP, LINK, AAVE, and DOGE perpetual futures — spanning crypto majors, DeFi, TradFi-bridge, and meme sectors.
Polymarket: Raw CLOB WebSocket tick archives for active prediction markets.
What is the BTC high-frequency historical package?
It is a standalone BTC research package centered on high-frequency L2 book history. It is designed for market microstructure work: queue depth, spread behavior, liquidity shocks, short-horizon alpha features, and execution-quality analysis. Binance futures and spot context data is included as bonus data for cross-venue comparison, but the core product is the BTC high-frequency book.
Is there a free sample?
Yes. Free public Kaggle samples are available for BTC high-frequency microstructure and 10-perp Hyperliquid order flow. Polymarket raw tick samples use the same replay format as the paid raw delivery.
Pricing
Start free. Scale when ready.
- BTC HF sample: book, orders, fills, funding
- 10-perp sample: orders, fills, book, funding
- Raw UTC-partitioned Parquet files
- No sign-up or Telegram required
- Daily Parquet delivery for 30 days
- All 10 perpetual markets
- Orders, fills, book, funding streams
- Auto-renews for continuous access
- 30 days from your chosen start date
- All 10 Hyperliquid perps
- Orders, fills, book, funding streams
- Wallet-level order flow included
- Google Drive delivery
- Turn-key Docker deploy on your hardware
- RAM-disk architecture (prevents NVMe burnout)
- Raw L1 block extraction + API fallbacks
- 24/7 Ops Bot (Telegram/Slack) included
- Maintenance SLA for network upgrades
- BTC microstructure book with sub-second update cadence
- 20 levels bid/ask with server_time, snapshot_id, order counts, and OBI
- Built for spread, queue, liquidity shock, and execution-quality studies
- Bonus Binance data: futures BBO, spot BBO, trades, and liquidations
- Delivered as Parquet historical data with monthly refresh access