
I measured every file in Solana's 464 TB on-chain archive


Everyone references “Old Faithful” for Solana’s historical data, but most devs don’t know the specifics. So I measured it.

I sent an HTTP HEAD request for every one of the 958 CAR files in Project Yellowstone’s Old Faithful archive and summed the reported sizes. Solana’s full historical blockchain comes to 464.6 TB.
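Here is a minimal sketch of that measurement, in case you want to reproduce it. The URL layout in `car_urls`, the `base_url` parameter, and all function names are my assumptions for illustration, not the archive's documented API, so check the real file listing before running it. It also assumes decimal terabytes (1 TB = 10^12 bytes).

```python
import urllib.request
from typing import Callable, Iterable, List


def car_urls(base_url: str, epochs: Iterable[int]) -> List[str]:
    """Build one URL per epoch. The path template is an assumption."""
    return [f"{base_url}/{epoch}/epoch-{epoch}.car" for epoch in epochs]


def head_content_length(url: str, timeout: float = 30.0) -> int:
    """Return the Content-Length reported by a HEAD request, in bytes."""
    req = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        length = resp.headers.get("Content-Length")
    if length is None:
        raise ValueError(f"no Content-Length header for {url}")
    return int(length)


def total_size_bytes(
    urls: Iterable[str],
    size_of: Callable[[str], int] = head_content_length,
) -> int:
    """Sum the size of every URL; size_of is injectable for offline testing."""
    return sum(size_of(url) for url in urls)


def to_terabytes(n_bytes: int) -> float:
    """Convert bytes to decimal terabytes (assumes 1 TB = 10**12 bytes)."""
    return n_bytes / 1e12
```

Calling `to_terabytes(total_size_bytes(car_urls(base_url, epochs)))` with the real base URL and epoch list gives the headline number. Because only headers are fetched, no file contents are downloaded.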

What Old Faithful actually is

Old Faithful is a free, CDN-hosted archive of every Solana block ever produced, stored in IPLD CAR format. It is not the RocksDB format that validators use. It does not contain account state. It’s purely transaction history: blocks, transactions, and their metadata.

Key details most people don’t know:

  • It’s 464.6 TB of raw data across 958 files
  • It explicitly does not contain account state
  • It’s IPLD format, not what validators store locally
  • It’s free but CDN-throttled at ~150 req/s per IP
  • It comes with limited indexes: signature-to-CID, CID-to-offset, and a gsfa (getSignaturesForAddress) index
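The throttle matters if you script anything against the CDN. One way to stay under it is to pace requests on the client side. The sketch below is a simple fixed-interval pacer. The `Pacer` class is hypothetical, and the 100 req/s figure in the usage note is just a safety margin I picked below the ~150 req/s limit.

```python
import time
from typing import Callable, Optional


class Pacer:
    """Space calls at least 1/max_per_second apart (single-threaded)."""

    def __init__(
        self,
        max_per_second: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.interval = 1.0 / max_per_second
        self.clock = clock
        self.sleep = sleep
        self.next_at: Optional[float] = None  # earliest time of next call

    def wait(self) -> None:
        """Block until the next request is allowed, then reserve a slot."""
        now = self.clock()
        if self.next_at is not None and now < self.next_at:
            self.sleep(self.next_at - now)
            now = self.next_at
        self.next_at = now + self.interval
```

Create one `Pacer(100)` and call `pacer.wait()` before each request. The clock and sleep functions are injectable so the pacing logic can be tested without real delays.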

Per-year breakdown

The archive is heavily weighted toward recent years. 2024 and 2025 dominate because Solana’s throughput has grown dramatically, and each slot now contains significantly more transactions than it did in 2020–2021.

The practical implication: if you’re building an indexer and only care about the last 12 months of data, you’re still looking at a substantial chunk of the total archive.
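If you have per-file sizes, a per-year breakdown is a small grouping step. This sketch assumes you already hold a mapping from epoch to size in bytes and a mapping from epoch to calendar year. Both inputs and the function names are hypothetical, and building the epoch-to-year mapping is left to you.

```python
from collections import defaultdict
from typing import Dict, Mapping


def bytes_by_year(
    epoch_sizes: Mapping[int, int],
    epoch_year: Mapping[int, int],
) -> Dict[int, int]:
    """Sum file sizes per calendar year."""
    totals: Dict[int, int] = defaultdict(int)
    for epoch, size in epoch_sizes.items():
        totals[epoch_year[epoch]] += size
    return dict(totals)


def shares(totals: Mapping[int, int]) -> Dict[int, float]:
    """Express each year's total as a fraction of the whole archive."""
    grand_total = sum(totals.values())
    return {year: size / grand_total for year, size in totals.items()}
```

Feeding the real per-epoch sizes through `shares(bytes_by_year(...))` shows how much of the archive falls inside whatever window you care about.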

What you can build on top

Old Faithful gives you the raw blocks. What’s missing are the secondary indexes that make the data queryable:

  • Address → transaction (what Helius sells as getTransactionsForAddress)
  • Program → accounts (what DAS API partially covers)
  • Mint → holders (what token registries provide)

Each of these indexes is relatively small — sub-TB — compared to the archive itself. The index is where the value lives, and it’s the cheapest part of the stack to host.
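To make the idea concrete, here is a toy in-memory version of the address → transaction index. The `Tx` record is a hypothetical stand-in for a transaction already decoded from a CAR file, and decoding itself is out of scope here. A real index would live on disk, but the shape is the same: an inverted map from address to the transactions that touch it.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Tx(NamedTuple):
    """Hypothetical decoded transaction; not Old Faithful's own schema."""
    signature: str
    slot: int
    account_keys: Tuple[str, ...]


def build_address_index(txs: Iterable[Tx]) -> Dict[str, List[Tuple[int, str]]]:
    """Map each address to its (slot, signature) pairs, newest slot first."""
    index: Dict[str, List[Tuple[int, str]]] = defaultdict(list)
    for tx in txs:
        # set() so an address listed twice in one transaction is indexed once
        for address in set(tx.account_keys):
            index[address].append((tx.slot, tx.signature))
    for entries in index.values():
        entries.sort(reverse=True)
    return dict(index)
```

Each entry stores only a slot and a signature rather than the transaction body, which is why an index like this stays small relative to the archive it points into.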

Why this matters

If you’re evaluating whether to self-host Solana data infrastructure or pay for a managed provider, the first thing you need is real numbers. Not estimates, not “it’s a lot” — actual measured sizes, actual throughput numbers, actual costs.

That’s what we’ve been doing at Fala Labs. Every number in our helius-replacement design comes from reproducible benchmarks you can run yourself.