Institutional-Grade Market Microstructure

Market microstructure data for Hyperliquid and Polymarket. Hyperliquid ships as Zstd Parquet; Polymarket ships as raw hourly CLOB WebSocket JSONL.ZST.

10 HL Perps
1 BTC HF Book
Raw PM WS Ticks
30B+ Rows Delivered
99.7% Uptime
2000+ Kaggle DLs
Daily Data Updates

What You Can Build

Real use cases. Not hypothetical demos.

Smart Money Detection

Track wallet-level order flow to identify informed traders. Cluster wallets by behavior, detect accumulation/distribution patterns before price moves.

Execution Quality Analysis

Maker/taker attribution with per-fill fees. Model slippage, measure fill rates, and benchmark execution quality across venues.

ML Signal Engineering

Block-sequenced features with causal ordering guaranteed. Build order flow imbalance, cancel ratios, and toxicity signals for price prediction models.

Market Microstructure Research

20-level orderbook snapshots + funding rate dynamics across 10 assets. Study liquidity provision, OBI signals, and cross-asset correlations on a fully on-chain exchange.

Data Catalog

Every field verified. Every schema documented.

BTC ETH SOL HYPE SPX PAXG XRP LINK AAVE DOGE

Orders

Per-block ~2s
block_height   Int64
timestamp_ms   Int64
wallet         String
coin           String
action         Enum
status         String (L1 native)
is_buy         Boolean
price          Float64
size           Float64
order_type     String
tif            String
reduce_only    Boolean
order_id       Int64 (100% populated)
cloid          String
1B+ events/day • 100% order_id coverage

Fills (L1)

Per-trade
block_height    Int64
timestamp_ms    Int64
wallet          String
coin            String
side            String
role            Maker / Taker
price           Float64
size            Float64
fee             Float64
is_liquidation  Boolean
tid             Int64
oid             Int64
1.6M+ fills/day • wallet-level tracking

L2 Orderbook

60s snapshots
timestamp_ms   Int64
coin           String
level          1–20
bid_price      Float64
bid_size       Float64
ask_price      Float64
ask_size       Float64
obi            Float64
20 levels bid + ask • all 10 assets • OBI pre-computed
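The catalog does not spell out the formula behind the pre-computed obi column. A common definition is the normalized size imbalance, (bid size - ask size) / (bid size + ask size), summed over the levels of interest. The sketch below assumes that definition; treat it as a way to sanity-check the column, not as the vendor's exact calculation. The function name and the toy sizes are illustrative.

```python
from typing import Sequence


def order_book_imbalance(bid_sizes: Sequence[float], ask_sizes: Sequence[float]) -> float:
    """Return (bid - ask) / (bid + ask) over the supplied levels, in [-1, 1].

    Assumed definition: the source does not document how `obi` is computed.
    """
    bid_total = sum(bid_sizes)
    ask_total = sum(ask_sizes)
    total = bid_total + ask_total
    if total == 0:
        return 0.0  # empty book: treat as balanced
    return (bid_total - ask_total) / total


# Toy three-level book; the sizes are made up for illustration.
bids = [3.0, 2.0, 1.0]
asks = [1.0, 1.0, 2.0]
obi = order_book_imbalance(bids, asks)
print(obi)
```

A positive value means more resting size on the bid side. If the delivered obi uses only the top level or a depth weighting, the numbers will differ from this sketch.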

Funding + OI

5 min
timestamp_ms   Int64
coin           String
open_interest  Float64
funding_rate   Float64
mark_price     Float64
oracle_price   Float64
premium        Float64
day_volume     Float64
1.4K+ readings/day • all 10 assets

Special Book Stream

BTC Microstructure Book

BTC only · sub-second

Standalone BTC high-frequency historical data for spread dynamics, queue depth, liquidity shocks, adverse selection, and short-horizon execution research. Binance context data is included as a bonus for cross-venue validation.

Queue dynamics • Spread behavior • Liquidity shocks • Execution quality • Binance bonus data

server_time    Int64
timestamp_ms   Int64
snapshot_id    Int64
coin           BTC
level          1–20
bid_price      Float64
bid_size       Float64
bid_n          Int32
ask_price      Float64
ask_size       Float64
ask_n          Int32
obi            Float64
BTC only • 20 levels bid + ask • order counts + OBI included

Load in One Line

Hyperliquid loads as Parquet. Polymarket replays as raw compressed JSONL.

orders_sample.py
import polars as pl

# Load 163M+ order events with 100% order_id coverage
orders = pl.read_parquet("orders/*.parquet")

# BTC-only view for per-coin analysis
btc = orders.filter(pl.col("coin") == "BTC")

# Preview the unfiltered frame, which mixes all coins
print(orders.head(5))

Output (first 3 of 5 rows shown, wallet and cloid abbreviated):
shape: (5, 14)
block_height timestamp_ms wallet coin action status is_buy price size order_type tif reduce_only order_id cloid
i64 i64 str str str str bool f64 f64 str str bool i64 str
0 1774396843090 0xce71..5a42 BTC Rejected badAloPxRejected false 70525.0 0.12282 Limit Alo false 360336907743 0x0000..
0 1774396843090 0xf36c..e1a4 PAXG Filled filled false 4548.0 0.56 Limit Alo false 360336907286 0x1490..
0 1774396843090 0x5483..a6b2 SOL PlaceOrder open true 90.282 88.34 Limit Alo false 359298802930 0xc71b..

Delivery & Folder Structure

Private Google Drive delivery. UTC-organized files with source-specific formats.

Standard Hyperliquid Package

10-Market Order Flow Delivery

BTC · ETH · SOL · HYPE · SPX · PAXG · XRP · LINK · AAVE · DOGE

The standard package ships as one clean folder with stream-first layout. Each stream is partitioned by UTC date and time bucket, with a coin column covering all 10 supported Hyperliquid markets.

l1ticks_hyperliquid_YYYY-MM-DD_to_YYYY-MM-DD/
├── hl_orders/
│   └── YYYY-MM-DD/HH-MM.parquet
├── hl_fills/
│   └── YYYY-MM-DD/HH-MM.parquet
├── hl_book/
│   └── YYYY-MM-DD/HH-MM-SS.parquet
└── hl_funding/
    └── YYYY-MM-DD/HH.parquet
All 10 markets in one schema • Zstd-compressed Parquet • No JSON parsing
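To locate the file that should hold a given event, the stream-first layout can be turned into a small path helper. This is a sketch under two assumptions that the page does not confirm: each filename labels the UTC time of its bucket, and the root folder name ("delivery" here) is a placeholder for the real package folder. The function name partition_path is illustrative.

```python
from datetime import datetime, timezone
from pathlib import PurePosixPath

# Time-bucket filename pattern per stream, copied from the folder tree.
BUCKET_FORMATS = {
    "hl_orders": "%H-%M",
    "hl_fills": "%H-%M",
    "hl_book": "%H-%M-%S",
    "hl_funding": "%H",
}


def partition_path(root: str, stream: str, ts: datetime) -> PurePosixPath:
    """Build <root>/<stream>/YYYY-MM-DD/<bucket>.parquet for a timestamp."""
    if ts.tzinfo is None:
        raise ValueError("ts must be timezone-aware")
    ts = ts.astimezone(timezone.utc)  # partitions are organized by UTC
    day = ts.strftime("%Y-%m-%d")
    bucket = ts.strftime(BUCKET_FORMATS[stream])
    return PurePosixPath(root) / stream / day / f"{bucket}.parquet"


# "delivery" is a hypothetical root folder name.
ts = datetime(2025, 1, 2, 3, 4, 5, tzinfo=timezone.utc)
orders_path = partition_path("delivery", "hl_orders", ts)
funding_path = partition_path("delivery", "hl_funding", ts)
print(orders_path)
print(funding_path)
```

Rejecting naive datetimes is deliberate: a local-time timestamp would silently map to the wrong UTC folder near midnight.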
Standalone Research Pack

BTC High-Frequency Historical Delivery

BTC only · Binance bonus data

The BTC high-frequency package is delivered separately for microstructure research. The core product is BTC historical book data; Binance folders are included as bonus cross-venue context.

btc_high_frequency_historical/
├── hl_book/
│   └── YYYY-MM-DD/HH-MM-SS.parquet
├── hl_orders/
│   └── YYYY-MM-DD/HH-MM.parquet
├── hl_fills/
│   └── YYYY-MM-DD/HH-MM.parquet
├── hl_funding/
│   └── YYYY-MM-DD/HH.parquet
├── hl_candles/
│   └── YYYY-MM-DD.parquet
├── hl_snapshots/
│   └── YYYY-MM-DD.parquet
├── binance_futures_bbo/
│   └── YYYY-MM-DD/HH-MM.parquet
├── binance_spot_bbo/
│   └── YYYY-MM-DD/HH-MM.parquet
├── binance_futures_trades/
│   └── YYYY-MM-DD/HH-MM.parquet
└── binance_futures_liquidations/
    └── YYYY-MM-DD/HH-MM.parquet
BTC microstructure core • Binance context included • Monthly refresh access

Why This Data

Built for quants, not dashboards.

100% Order Lifecycle Tracking

Every order carries an exchange-assigned order_id. Deterministic place-to-fill linkage with no fuzzy matching. Track the full lifecycle: placement, rejection, fill, cancellation.
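A minimal sketch of that linkage, in plain Python so it runs without dependencies. It assumes the oid column in fills carries the same identifier as order_id in orders, which the schemas suggest but the page does not state outright. The rows are made up; the action and status strings are the ones visible in the sample output.

```python
from collections import defaultdict

# Made-up rows using field names from the Orders and Fills schemas.
orders = [
    {"order_id": 1, "action": "PlaceOrder", "status": "open", "timestamp_ms": 1000},
    {"order_id": 2, "action": "Rejected", "status": "badAloPxRejected", "timestamp_ms": 1000},
    {"order_id": 1, "action": "Filled", "status": "filled", "timestamp_ms": 3000},
]
fills = [
    {"oid": 1, "tid": 10, "price": 100.0, "size": 0.5, "timestamp_ms": 3000},
    {"oid": 1, "tid": 11, "price": 100.5, "size": 0.5, "timestamp_ms": 3000},
]


def build_lifecycles(order_rows, fill_rows):
    """Group status events and trade ids under each order_id."""
    lifecycles = defaultdict(lambda: {"events": [], "fills": []})
    for row in order_rows:
        lifecycles[row["order_id"]]["events"].append((row["timestamp_ms"], row["status"]))
    for row in fill_rows:
        # Assumption: fills.oid refers to orders.order_id.
        lifecycles[row["oid"]]["fills"].append(row["tid"])
    for lifecycle in lifecycles.values():
        # Sort by time only; ties keep their original order.
        lifecycle["events"].sort(key=lambda event: event[0])
    return dict(lifecycles)


lifecycles = build_lifecycles(orders, fills)
print(lifecycles[1])
```

On the full dataset the same exact-key join would normally be done in Polars on order_id and oid.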

Wallet-Level Order Flow

Track individual accounts across executions. See who's placing, canceling, and getting filled.

Maker / Taker Attribution

Every fill tagged with execution role. Know who initiated each trade at the protocol level.
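With a role on every fill, per-wallet execution style reduces to a simple aggregation. The sketch below is plain Python over made-up rows. It assumes the role column holds the literal strings "Maker" and "Taker", as the schema lists them, and that fee is in quote-currency units. The function name and wallet addresses are illustrative.

```python
from collections import defaultdict

# Made-up fills using fields from the Fills schema.
fills = [
    {"wallet": "0xaaa", "role": "Taker", "price": 100.0, "size": 2.0, "fee": 0.09},
    {"wallet": "0xaaa", "role": "Maker", "price": 101.0, "size": 1.0, "fee": 0.01},
    {"wallet": "0xbbb", "role": "Maker", "price": 100.0, "size": 2.0, "fee": 0.03},
]


def summarize_by_wallet(rows):
    """Return maker/taker notional, total fees, and taker share per wallet."""
    stats = defaultdict(lambda: {"maker_notional": 0.0, "taker_notional": 0.0, "fees": 0.0})
    for row in rows:
        notional = row["price"] * row["size"]
        # Assumption: any role other than "Maker" is a taker fill.
        key = "maker_notional" if row["role"] == "Maker" else "taker_notional"
        stats[row["wallet"]][key] += notional
        stats[row["wallet"]]["fees"] += row["fee"]
    for wallet_stats in stats.values():
        total = wallet_stats["maker_notional"] + wallet_stats["taker_notional"]
        wallet_stats["taker_share"] = wallet_stats["taker_notional"] / total if total else 0.0
    return dict(stats)


summary = summarize_by_wallet(fills)
print(summary["0xaaa"]["taker_share"])
```

A taker share near 1 marks a wallet that mostly crosses the spread, while a share near 0 marks a passive liquidity provider.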

L1 Native Status Stream

14 granular status enums direct from the exchange engine. Distinguish accepted orders from ALO rejections, partial fills from full fills, triggered stops from manual cancels.

Parquet-Native

Zstd-compressed, columnar storage. One line of Python to load. No JSON parsing, no CSV headers.

Cross-Asset Coverage

Crypto majors, DeFi tokens, memecoins, and TradFi-linked perps, all from the same decentralized order book and in one schema. Order-level data with this breadth is rare.

Deterministic Clock Alignment

Files are aligned to strict UTC clock boundaries with zero drift and delivered as complete daily grids (e.g., exactly 1,440 order files/day at 60s intervals).
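The 1,440-files-per-day grid is easy to verify on receipt. The sketch below assumes the HH-MM.parquet naming shown in the folder structure for order files. The helper names are illustrative, and the set of present files is simulated rather than read from disk.

```python
def expected_minute_files() -> list[str]:
    """Every HH-MM.parquet name in one UTC day at 60s intervals."""
    return [f"{hour:02d}-{minute:02d}.parquet" for hour in range(24) for minute in range(60)]


def missing_files(present: set[str]) -> list[str]:
    """Names expected in the daily grid but absent from `present`."""
    return [name for name in expected_minute_files() if name not in present]


expected = expected_minute_files()

# Simulate a delivery with the first and last minute missing.
present = set(expected) - {"00-00.parquet", "23-59.parquet"}
gaps = missing_files(present)
print(len(expected), gaps)
```

Against a real delivery, present would be built from the filenames in one date folder, for example with pathlib's glob("*.parquet").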

Zero Midnight Leakage

State persistence routing guarantees a 23:59:59 payload never contaminates the next day's partition, saving your quants hours of manual resampling.
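The boundary rule can be spot-checked from timestamp_ms alone. This sketch is not the pipeline's routing logic, which the page does not describe; it only derives the UTC date each row should belong to, so a buyer can confirm that rows in a date folder match that date. The function name is illustrative.

```python
from datetime import datetime, timezone


def utc_partition_date(timestamp_ms: int) -> str:
    """Return the UTC calendar date (YYYY-MM-DD) for a millisecond timestamp."""
    # Integer division avoids float rounding right at the day boundary.
    return datetime.fromtimestamp(timestamp_ms // 1000, tz=timezone.utc).strftime("%Y-%m-%d")


# Last millisecond of 2025-01-01 and first millisecond of 2025-01-02 (UTC).
last_second = datetime(2025, 1, 1, 23, 59, 59, tzinfo=timezone.utc)
last_ms = int(last_second.timestamp()) * 1000 + 999
next_ms = last_ms + 1

print(utc_partition_date(last_ms), utc_partition_date(next_ms))
```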

Frequently Asked Questions

Everything you need to know about the data.

What is L1 Ticks?

L1 Ticks provides market microstructure data for Hyperliquid and Polymarket. Hyperliquid is captured from L1 infrastructure and delivered as Zstd-compressed Parquet. Polymarket is captured from CLOB WebSocket feeds and sold as raw hourly JSONL.ZST tick files.

How is the data captured?

For Hyperliquid, we operate L1 node infrastructure that reads native order status streams in real time. For Polymarket, the raw product comes from our CLOB WebSocket recorder and active market discovery. The paid Polymarket package does not include outside scraper archives or derived feature tables.

What format is the data in?

Hyperliquid packages ship as Apache Parquet files with Zstd level 3 compression. The Polymarket raw package uses one format only: hourly ticks_ws_YYYY-MM-DDTHH.jsonl.zst files, with one raw WebSocket payload per line.
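Replaying a Polymarket file means streaming the Zstd frame and decoding one JSON payload per line. The sketch below assumes the third-party zstandard package for decompression. The payload schema is not documented on this page, so the parser yields whatever JSON value each line contains, and the sample payloads are placeholders. The function names are illustrative.

```python
import io
import json
from typing import Iterable, Iterator


def parse_jsonl(lines: Iterable[str]) -> Iterator[object]:
    """Yield one decoded JSON value per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def read_tick_file(path: str) -> Iterator[object]:
    """Stream payloads from a .jsonl.zst file without loading it whole."""
    import zstandard  # third-party dependency, assumed to be installed

    with open(path, "rb") as handle:
        reader = zstandard.ZstdDecompressor().stream_reader(handle)
        yield from parse_jsonl(io.TextIOWrapper(reader, encoding="utf-8"))


# Placeholder payloads; real lines hold raw WebSocket messages.
sample = io.StringIO('{"a": 1}\n\n[{"b": 2}]\n')
payloads = list(parse_jsonl(sample))
print(payloads)
```

Keeping the line parser separate from the decompression step lets it be tested on plain text and reused if a file is decompressed by another tool first.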

What markets are covered?

Hyperliquid: BTC, ETH, SOL, HYPE, SPX, PAXG, XRP, LINK, AAVE, and DOGE perpetual futures, spanning crypto majors, DeFi, TradFi-bridge, and meme sectors.

Polymarket: Raw CLOB WebSocket tick archives for active prediction markets.

What is the BTC high-frequency historical package?

It is a standalone BTC research package centered on high-frequency L2 book history. It is designed for market microstructure work: queue depth, spread behavior, liquidity shocks, short-horizon alpha features, and execution-quality analysis. Binance futures and spot context data is included as bonus data for cross-venue comparison, but the core product is the BTC high-frequency book.

Is there a free sample?

Yes. Free public Kaggle samples are available for BTC high-frequency microstructure and 10-perp Hyperliquid order flow. Polymarket raw tick samples use the same replay format as the paid raw delivery.

Pricing

Start free. Scale when ready.

Free
$0
Public Kaggle samples for BTC high-frequency microstructure and 10-perp Hyperliquid order flow.
  • BTC HF sample: book, orders, fills, funding
  • 10-perp sample: orders, fills, book, funding
  • Raw UTC-partitioned Parquet files
  • No sign-up or Telegram required
Starter
$69 / month
Daily Parquet delivery for all 10 Hyperliquid perps. 30-day rolling access from your start date.
  • Daily Parquet delivery for 30 days
  • All 10 perpetual markets
  • Orders, fills, book, funding streams
  • Auto-renews for continuous access
Enterprise
Custom
We deploy our battle-tested L1 node infrastructure directly onto your servers.
  • Turn-key Docker deploy on your hardware
  • RAM-disk architecture (prevents NVMe burnout)
  • Raw L1 block extraction + API fallbacks
  • 24/7 Ops Bot (Telegram/Slack) included
  • Maintenance SLA for network upgrades
Standalone Research Pack
BTC High-Frequency Historical
$129 / month
BTC-only high-frequency historical book data for market microstructure research. Binance futures and spot context data is included as a bonus.
  • BTC microstructure book with sub-second update cadence
  • 20 levels bid/ask with server_time, snapshot_id, order counts, and OBI
  • Built for spread, queue, liquidity shock, and execution-quality studies
  • Bonus Binance data: futures BBO, spot BBO, trades, and liquidations
  • Delivered as Parquet historical data with monthly refresh access