Benford’s Law is often used to test whether data has been manipulated. Could it also be used to test whether trading in Bitcoin is as dodgy as it’s made out to be?
By this stage of the interrogation, Tameer Hussain was wondering if he would ever see the light of day again. Staring directly into the eyes of the Director of the Pakistani Inter-Services Intelligence, he repeated,
“I’m telling you, I didn’t know about the American drone strike. If I had known, I would have asked them to call it off.”
The Director gave a half smile and with a slight motion of his head, signaled for Hussain’s interrogators to continue beating him,
“You see, I have a very simple rule. Do you know how I tell whether someone is telling the truth or not?”
Grabbing Hussain’s battered and bruised face in his large hand, the Director sneered,
“Because when I ask them, they give a simple ‘yes’ or ‘no’, nothing more. When a man lies, he tells stories. So I am asking you again: did you or did you not know about the drone strike before you came to see me?”
“Sahib, I am saying I didn’t know… ahhhh…”
Jolts of electricity from a nearby car battery were applied again to Hussain’s body as the room filled with the acrid smell of burning flesh.
In the world of international espionage, where everyone lies, it’s almost impossible to find a simple rule for telling whether someone is telling the truth. But what if such a rule were possible in the world of data?
That’s where Benford’s Law comes in.
Betting on Benford
For the uninitiated, Benford’s Law is simply the law of first digits: an observation about the frequency distribution of leading digits in many real-life sets of numerical data.
It can apply to volcanic eruptions or taxes, but essentially the law states that in many naturally occurring collections of numbers, the leading digit is likely to be small.
So for instance, in sets that obey the law, the number “1” appears as the leading significant digit about 30% of the time, while “9” appears as the leading significant digit less than 5% of the time.
Benford’s First Digit Distribution of Numbers in a Given Data Set
But Benford’s Law doesn’t stop there. It also makes predictions about the distribution of second digits, third digits, digit combinations and so on, with each later digit position closer to evenly distributed than the one before it.
And although the name Benford’s Law is the one that stuck, the phenomenon was first reported by Simon Newcomb, who noticed in 1881 that the early pages of logarithm tables, the ones covering numbers that begin with low digits, were more worn from use than the later pages.
Newcomb derived a mathematical rule for the probability p of first digits d occurring in the numbers of a given data set.
This rule is characterized by the following logarithmic function (with B as the logarithmic base), which empirically predicts the occurrence of first digits in a broad variety of data sets:
p(d) = log_B(d + 1) − log_B(d) = log_B(1 + 1/d), for d = 1, 2, …, B − 1
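As a minimal sketch, the rule can be evaluated directly, assuming its standard form p(d) = log_B(1 + 1/d). The function names below are illustrative, and the second-digit function simply adds up the probability of every two-digit prefix in base 10.

```python
import math

def first_digit_prob(d, base=10):
    """Benford probability that the leading digit is d (1 <= d < base)."""
    return math.log(1 + 1 / d, base)

def second_digit_prob(d2):
    """Base-10 Benford probability that the second digit is d2 (0-9).

    Sums the probability of every two-digit prefix d1 d2 over d1 = 1..9.
    """
    return sum(math.log10(1 + 1 / (10 * d1 + d2)) for d1 in range(1, 10))

first = {d: round(first_digit_prob(d), 3) for d in range(1, 10)}
second = {d: round(second_digit_prob(d), 3) for d in range(10)}
print(first)   # first digits: heavily skewed toward 1
print(second)  # second digits: much flatter
```

The first digit comes out at roughly 30.1% for “1” and 4.6% for “9”, while the second-digit probabilities only run from about 12% for “0” down to about 8.5% for “9”.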
Frank Benford made the same observation independently over five decades later and published what we now know today as “Benford’s Law.”
Since then, Benford’s Law has been applied to everything from elections and macroeconomic data to volcanoes and genomes, often as a way to detect fraud.
Basically, given a set of numbers, if they do not adhere to Benford’s Law, there’s a distinct possibility that the numbers have been manipulated.
Anytime a set of data does not exhibit “Benfordness,” further inquiry needs to be made.
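One common way to put a number on “Benfordness” is a chi-square goodness-of-fit test on the first digits. The sketch below is an assumption about how such a check might look, not a method taken from any of the studies mentioned here; the helper names are made up for illustration, and the two sample data sets are deliberately artificial.

```python
import math

# Chi-square critical value for 8 degrees of freedom at the 5% level.
CRITICAL_5PCT_8DF = 15.507

def leading_digit(x):
    """First significant digit of a nonzero number, read off scientific notation."""
    return int(f"{abs(x):.15e}"[0])

def benford_chi_square(values):
    """Chi-square statistic of observed first digits against Benford's Law."""
    digits = [leading_digit(v) for v in values if v != 0]
    n = len(digits)
    stat = 0.0
    for d in range(1, 10):
        expected = n * math.log10(1 + 1 / d)
        observed = digits.count(d)
        stat += (observed - expected) ** 2 / expected
    return stat

# Powers of 2 are a classic Benford-conforming sequence.
powers_of_two = [2 ** k for k in range(1, 501)]
# Every three-digit number once: each first digit is equally common.
three_digit = list(range(100, 1000))

for name, data in [("powers of two", powers_of_two), ("100..999", three_digit)]:
    stat = benford_chi_square(data)
    verdict = "looks Benford-like" if stat < CRITICAL_5PCT_8DF else "worth a closer look"
    print(f"{name}: chi-square = {stat:.1f} -> {verdict}")
```

A statistic above the critical value only says the digits are unlikely to have come from a Benford distribution. It says nothing about why, which is where the further inquiry comes in.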
All of which raises the question: could Benford’s Law also be used to detect manipulation in cryptocurrency markets?
Benford’s Law for Cryptocurrency Trading
First, let’s see if Benford’s Law detects any funny business when it comes to the daily high price for Bitcoin.
Using IBM’s SPSS Extension, Doug Stauber took the first digit of the daily high price for Bitcoin between 2017 and 2018, which produced results that did not closely match Benford’s Law.
(Credit: Doug Stauber. Source: https://developer.ibm.com/predictiveanalytics/2018/02/09/benfords-law-perfectly-describes-bitcoin-volume/)
Data from CoinDesk’s Bitcoin Price Index also suggests that the price of Bitcoin does not obey Benford’s Law.
The grey crosses represent starting digits that would obey Benford’s Law.
So is this the smoking gun that Bitcoin prices are manipulated?
Hold your horses there, Sherlock.
As useful as Benford’s Law is, it does not characterize every dataset, and a dataset has to meet several criteria before the law can be expected to show up in it.
To begin with, the data must be scale invariant: changing the units of the numbers shouldn’t change their first-digit distribution, otherwise that set of data can’t be subject to Benford’s Law.
So if the first-digit occurrence probability in a data set expressed in meters changes when the same data is expressed in feet, that dataset probably does not follow Benford’s Law. (Converting meters to kilometers would not be a useful check, because shifting the decimal point never changes the first digit.)
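The unit-change check can be sketched in a few lines. Both data sets below are synthetic and purely illustrative: one is spread evenly on a logarithmic scale (which follows Benford’s Law closely), the other evenly between 1 and 10 meters (which does not). The conversion to feet is an arbitrary choice of a factor that is not a power of ten.

```python
def first_digit_freqs(values):
    """Observed frequency of each leading digit 1..9."""
    counts = [0] * 10
    for v in values:
        counts[int(f"{abs(v):.15e}"[0])] += 1
    return [counts[d] / len(values) for d in range(1, 10)]

def max_shift(a, b):
    """Largest change in any single digit's frequency between two distributions."""
    return max(abs(x - y) for x, y in zip(a, b))

N = 60_000
FEET_PER_METER = 3.28084

# Synthetic lengths spread evenly on a log scale over six orders of magnitude.
log_uniform_m = [10 ** (6 * (k + 0.5) / N) for k in range(N)]
# Synthetic lengths spread evenly between 1 m and 10 m.
uniform_m = [1 + 9 * (k + 0.5) / N for k in range(N)]

for name, meters in [("log-uniform", log_uniform_m), ("uniform", uniform_m)]:
    feet = [m * FEET_PER_METER for m in meters]
    shift = max_shift(first_digit_freqs(meters), first_digit_freqs(feet))
    print(f"{name}: largest first-digit shift after unit change = {shift:.3f}")
```

The log-uniform set barely moves when the units change, while the uniform set sees the share of leading “1”s jump by more than 20 percentage points.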
Benford’s Law also requires base invariance, meaning that the same logarithmic rule must keep describing the first-digit probabilities when the base B of the underlying logarithmic function is changed.
Because Benford’s Law is a result of multiplicative processes, the argument goes that for a dataset to comply with it, the numerical data must be generated by something like a Markov chain.
A Markov chain is simply a stochastic model describing a sequence of possible events, in which the probability of each event depends only on the state attained in the previous event.
An example of a Markov chain is a game of snakes and ladders, or indeed any game which is determined entirely by dice, like craps.
A game like blackjack, by contrast, is not a Markov chain in this sense, because the cards already dealt act as a “memory” of the past. There are only four aces in a deck, so once all four have been dealt, the probability of the next card being an ace is zero, no matter what the most recent card was.
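To see how a memoryless multiplicative process can produce Benford-like digits, here is a small simulation. It is a sketch under stated assumptions: the growth factor (uniform between 0.5 and 2), the 50 steps and the 10,000 independent runs are arbitrary choices for illustration, not parameters taken from any real market.

```python
import math
import random

def simulate_final_value(rng, steps=50):
    """Run one multiplicative chain: each step depends only on the current value."""
    value = 1.0
    for _ in range(steps):
        value *= rng.uniform(0.5, 2.0)  # hypothetical random growth factor
    return value

def first_digit_freqs(values):
    """Observed frequency of each leading digit 1..9."""
    counts = [0] * 10
    for v in values:
        counts[int(f"{v:.15e}"[0])] += 1
    return [counts[d] / len(values) for d in range(1, 10)]

rng = random.Random(42)
finals = [simulate_final_value(rng) for _ in range(10_000)]
observed = first_digit_freqs(finals)

for d, freq in enumerate(observed, start=1):
    print(f"{d}: simulated {freq:.3f}   Benford {math.log10(1 + 1 / d):.3f}")
```

After enough multiplications the values are smeared across many orders of magnitude, and the simulated first-digit frequencies should land close to the Benford column.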
That’s why Benford’s Law works so well for observable economic data like GDP growth, employment rates or income development. It also explains why the law doesn’t work for datasets where the numbers are influenced by human emotion, for instance prices set at psychological thresholds, such as pricing a bottle of ketchup at US$1.99 instead of US$2.
It is entirely possible that Bitcoin’s price fails to conform to Benford’s Law because of manipulation. But because its price is driven in large part by competing narratives, psychological thresholds like US$10,000 and US$20,000 would make Bitcoin’s price less, well, Benfordian even without any manipulation.
So while Benford’s Law may not be useful to determine if Bitcoin’s price is being manipulated, could it be used to see if there’s any funny business with cryptocurrency exchange trading volume?
Stauber applied Benford’s Law to Bitcoin volume on the Kraken Exchange and came up with this data:
Surprisingly, Bitcoin’s trading volume on Kraken conformed remarkably well to Benford’s Law. That doesn’t mean that there’s no manipulation of traded volumes on Kraken, just that so far at least, Kraken’s volumes conform to Benford’s Law.
Stauber then went on to run the same analysis on altcoin volume on Poloniex, another cryptocurrency exchange, and churned out this data:
Again, and somewhat surprisingly, even altcoin volumes, with the possible exception of Bitcoin Cash, fell in line with Benford’s Law, so there is no smoking gun here either.
Both Poloniex and Kraken are U.S.-based and regulated exchanges, so there are strong disincentives to fake volume. To understand how Benford’s Law could be used to detect volume that may be fake, we need to expand the study to include cryptocurrency exchanges that are more “lightly” regulated.
Using data from Investing.com, the frequency distribution of the leading digit of daily Bitcoin trading volume looks like this across some popular exchanges:
Now surely this must be a smoking gun?
Just as in Sesame Street, if one of these kids is doing their own thing, then perhaps it’s better not to play their game.
One of these kids is not like the others.
While I’m not for a minute suggesting that any of these exchanges is manipulating their Bitcoin trading volumes, there are some glaring deviations from Benford’s Law that appear significant and require further investigation.
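As a starting point for that kind of investigation, exchanges can be ranked by how far their first-digit frequencies sit from Benford’s Law, for example with the mean absolute deviation (MAD). The sketch below uses synthetic stand-in numbers, not real volumes from any exchange; the names `exchange_a` and `exchange_b` and the helper functions are hypothetical.

```python
import math

BENFORD = [math.log10(1 + 1 / d) for d in range(1, 10)]

def first_digit_freqs(values):
    """Observed frequency of each leading digit 1..9, ignoring zeros."""
    counts = [0] * 10
    for v in values:
        if v != 0:
            counts[int(f"{abs(v):.15e}"[0])] += 1
    total = sum(counts[1:])
    return [counts[d] / total for d in range(1, 10)]

def benford_mad(values):
    """Mean absolute deviation between observed and Benford first-digit frequencies."""
    freqs = first_digit_freqs(values)
    return sum(abs(f - b) for f, b in zip(freqs, BENFORD)) / 9

# Synthetic stand-ins, NOT real exchange data.
daily_volumes = {
    # Spread evenly on a log scale over four orders of magnitude.
    "exchange_a": [10 ** (4 * (k + 0.5) / 2000) for k in range(2000)],
    # Packed into a narrow band between 5,000 and 25,000.
    "exchange_b": [5000 + 10 * k for k in range(2000)],
}

# Largest deviation first.
ranked = sorted(daily_volumes, key=lambda name: benford_mad(daily_volumes[name]), reverse=True)
for name in ranked:
    print(f"{name}: MAD = {benford_mad(daily_volumes[name]):.4f}")
```

A high MAD is a prompt to look closer, not a verdict. Volumes confined to a narrow range will score badly for entirely innocent reasons.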
Although Bitcoin’s price doesn’t adhere to Benford’s Law, there may be an innocent explanation: psychological threshold prices, like every “thousand” dollars, limit how useful Benford’s Law can be for detecting fraud or manipulation in the price of Bitcoin.
And that’s why many people who attempted to use Benford’s Law for stock prices did so with limited success. There’s just too much psychology and, likely, too much manipulation.
We all instinctively know that out there somewhere is a trader who received “inside information” before a stock’s price moved.
That Benford’s Law can’t reasonably be used for the market prices of things like stocks and cryptocurrencies perhaps says it all: all markets suffer from manipulation to one degree or another.
And in that sense, the fact that market prices can’t fit into Benford’s Law can be read as a sign that manipulation can and does exist, but it provides no useful information beyond that.
Knowing that markets are manipulated provides limited actionable insight; it is little more than food for thought.
But when it comes to cryptocurrency exchange volume, Benford’s Law may shine a spotlight on some of the cryptosphere’s worst excesses and, for some, help determine where they should be trading.