The Blockchain Data Problem Is Bigger Than You Think

Published by CoinDesk

It seemed like a straightforward data point: the total supply of bitcoin had hit 17 million.

As some celebrated once the mark was hit on bitcoin data provider Blockchain's website, others took to Twitter to rain on their parade.

"Today I've learned that a lot of data sources are incorrectly reporting the total bitcoin supply. We haven't actually hit 17 million BTC yet."

Blockchain.info, one of the most popular and highly regarded sources for blockchain network data, was among the providers that had not accounted for instances in which bitcoin miners, due to bugs and other causes, did not claim their full block reward.
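The gap between the two figures can be sketched in a few lines of Python. This is a rough illustration, not any provider's actual method: it assumes the well-known bitcoin subsidy schedule (50 BTC per block, halving every 210,000 blocks), and the `unclaimed_sat` mapping is a hypothetical input with made-up values standing in for whatever shortfalls a node operator has measured.

```python
HALVING_INTERVAL = 210_000               # blocks between subsidy halvings
SATOSHI_PER_BTC = 100_000_000
INITIAL_SUBSIDY_SAT = 50 * SATOSHI_PER_BTC


def block_subsidy(height: int) -> int:
    """Scheduled subsidy (in satoshis) for the block at this height."""
    halvings = height // HALVING_INTERVAL
    if halvings >= 64:
        return 0
    return INITIAL_SUBSIDY_SAT >> halvings


def scheduled_supply(height: int) -> int:
    """Sum of scheduled subsidies for blocks 0..height inclusive."""
    total = 0
    remaining = height + 1
    era = 0
    while remaining > 0 and era < 64:
        blocks_in_era = min(remaining, HALVING_INTERVAL)
        total += blocks_in_era * (INITIAL_SUBSIDY_SAT >> era)
        remaining -= blocks_in_era
        era += 1
    return total


def adjusted_supply(height: int, unclaimed_sat: dict[int, int]) -> int:
    """Scheduled supply minus subsidy that miners failed to claim.

    unclaimed_sat maps block height -> satoshis left unclaimed in that
    block. It is a hypothetical input; real values must come from
    inspecting each block's coinbase transaction.
    """
    shortfall = sum(s for h, s in unclaimed_sat.items() if h <= height)
    return scheduled_supply(height) - shortfall


if __name__ == "__main__":
    height = 519_999
    # Illustrative, made-up shortfalls; NOT real chain data.
    example_unclaimed = {100_000: 1, 200_000: 1_250_000_000}
    print(scheduled_supply(height) / SATOSHI_PER_BTC)
    print(adjusted_supply(height, example_unclaimed) / SATOSHI_PER_BTC)
```

Under these assumptions, a site that reports only the scheduled figure would show 17 million at block height 519,999, while a tally that subtracts unclaimed rewards would still sit below the mark.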

These discrepancies in the total bitcoin supply metric are not the exception, but part of a larger problem that stems from the "opaque" methodologies blockchain data providers use, according to Greg Cipolaro, the CEO of Digital Asset Research (DAR), a firm that provides blockchain analysis to clients.

Still, many people who depend on public blockchain data don't realize how flawed some of this data is.

Due to the issues with public data sets, many blockchain data professionals avoid using them and instead use data they calculate internally whenever possible.

Chainalysis, a firm that analyzes blockchain data for clients including the U.S. Internal Revenue Service, is certainly skeptical.

Kimberley Grauer, Chainalysis' chief economist, said she prefers to use internal data because, "I know where the errors are; I know where the vulnerabilities are." DAR's Cipolaro echoed that, telling CoinDesk the company runs its own code, gleaning data from its own bitcoin node.

Still, despite their shortcomings, Cipolaro has high praise for the free sites that make bitcoin data available to the public.
