Welcome to Benford's Law Stocks, an Investment Research Tool for Your Toolbox

By BenfordsLawStocks.com Website Staff, March 2024

Have you heard of Benford's Law (also sometimes called the "first-digit law")? We first heard about it from the August 19, 2010 episode of the podcast Radiolab, which introduced the concept and briefly mentioned its application to the stock market. So what is Benford's Law and what is our website all about?

Let's start the introduction this way... Please come up with a random number. Got one? We'll pick one too. Ours was 54 — a number which starts with the digit 5. What digit did your number start with?

If you ask a computer to come up with a random number from, say, 1 to 100 (an exercise we just did for the purposes of this introduction), the odds are essentially equal that it will give you a number beginning with any one of the digits 1, 2, 3, 4, 5, 6, 7, 8, or 9.

To illustrate this by way of example, for the purposes of this introductory article we asked the computer 10,000 times for a random number from 1 to 100, and we got back 1,144 numbers that began with the digit 1, followed by 1,085 that began with the digit 2, and so on until 1,121 numbers that began with 9:

The random number generator produced 1,144 numbers that start with the digit 1:
14, 13, 18, 10, 18, 19, 12, 18, 10, 16, 16, 18, 18, 10, 17, 12, 14, 10, 15, 17, 11, 14, 16, 10, 16, Etc...

The random number generator produced 1,085 numbers that start with the digit 2:
24, 21, 22, 25, 23, 20, 23, 23, 25, 20, 2, 20, 22, 26, 21, 27, 2, 22, 23, 24, 28, 28, 26, 24, 2, Etc...

The random number generator produced 1,156 numbers that start with the digit 3:
36, 30, 31, 37, 39, 35, 34, 31, 3, 33, 36, 36, 39, 31, 36, 38, 39, 31, 32, 3, 32, 39, 37, 37, 37, Etc...

The random number generator produced 1,091 numbers that start with the digit 4:
4, 44, 45, 45, 45, 42, 43, 46, 45, 43, 49, 45, 43, 40, 41, 40, 48, 45, 4, 41, 42, 46, 47, 43, 47, Etc...

The random number generator produced 1,148 numbers that start with the digit 5:
51, 52, 52, 59, 54, 59, 50, 51, 52, 51, 55, 57, 53, 59, 50, 5, 52, 52, 59, 57, 59, 59, 54, 52, 53, Etc...

The random number generator produced 1,118 numbers that start with the digit 6:
66, 65, 65, 6, 68, 65, 67, 61, 66, 65, 66, 64, 65, 6, 66, 61, 62, 66, 69, 63, 6, 64, 62, 66, 61, Etc...

The random number generator produced 1,108 numbers that start with the digit 7:
79, 7, 70, 77, 71, 78, 79, 7, 74, 75, 78, 76, 7, 75, 71, 78, 79, 74, 72, 78, 77, 72, 74, 73, 74, Etc...

The random number generator produced 1,029 numbers that start with the digit 8:
8, 83, 88, 83, 88, 81, 85, 87, 85, 86, 87, 89, 81, 84, 89, 82, 85, 87, 85, 82, 82, 82, 82, 86, 89, Etc...

The random number generator produced 1,121 numbers that start with the digit 9:
96, 99, 98, 90, 97, 91, 99, 97, 94, 99, 93, 97, 93, 94, 99, 93, 95, 9, 93, 9, 96, 92, 9, 97, 9, Etc...


Since the numbers above were generated at random, the share starting with each digit wasn't precisely equal, but you can see how the odds point in that direction: among the numbers from 1 to 99, exactly eleven start with each digit (including 100 adds just one extra number to the 1's). When we chart the resulting percentages, the chart is basically "a flat line" where every bar is about the same height, because each digit's percentage share of the generated numbers is roughly the same:


First digit:     1     2     3     4     5     6     7     8     9
Share (%):    11.4  10.8  11.6  10.9  11.5  11.2  11.1  10.3  11.2


The bar chart you see above has nine bars, representing the percentage of numbers in the set that began with the digit 1 (11.4%), through 9 (11.2%). While there was a little variation due to random chance, each bar is at roughly the same height, meaning that the set had close-to-equal percentages of numbers that began with each of the nine digits.
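If you'd like to repeat this experiment yourself, here is a minimal Python sketch. It is not necessarily the code behind the figures above; the function names and the fixed seed are illustrative assumptions, and the counts you get will differ somewhat from ours.

```python
import random
from collections import Counter


def first_digit(n: int) -> int:
    """Return the leading digit of a positive integer."""
    return int(str(n)[0])


def uniform_first_digit_shares(draws=10_000, low=1, high=100, seed=0):
    """Draw integers uniformly from [low, high] and return the
    percentage share of draws that begin with each digit 1 through 9."""
    rng = random.Random(seed)  # the seed is an arbitrary choice, for repeatability
    counts = Counter(first_digit(rng.randint(low, high)) for _ in range(draws))
    return {d: 100 * counts[d] / draws for d in range(1, 10)}


shares = uniform_first_digit_shares()
for digit, pct in shares.items():
    print(f"{digit}: {pct:.1f}%")
```

Each share should land close to 11%, with the 1's very slightly favored because 100 also begins with a 1.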

And if you were to ask the average person about the prevalence of numbers "out there in the world" beginning with each digit, they might assume it follows a similar pattern to the above — i.e. equal odds. (And if that average person wanted to commit fraud by, say, coming up with a bunch of phony invoices, they might choose fake invoice amounts that are also randomly distributed across the digits in this way). But is the average person correct in thinking numbers naturally arrange by first-digit equally in this way?

Enter Frank Benford. We'll now need to transport ourselves backwards in history, to a time before computers — even before the time of pocket calculators, and all the way back to the days where if you needed a convenient tool to help you perform multiplication, you might go to your public library and borrow its copy of a physical book of logarithmic tables.

Book of logarithm tables


In these logarithmic table books, the pages were arranged in numerical order such that the 1's were among the beginning pages, the following pages held the 2's, and on through to the 9's at the pages at the tail end of the book.

Picture Frank Benford going to the library and pulling out a really old — i.e. very well used — copy of one of these books of logarithmic tables. And he can't help but notice something unexpected: towards the front of the book, the paper was much more smudged, crinkled, more clearly-used, than towards the end of the book, where the paper was comparably pristine.

Benford pondered this observation, and asked himself the question: Maybe there are more numbers "out there in the world" that start with 1 or 2, than there are with 8 or 9? Could that be?!? Maybe the distribution of numbers in the world isn't equal (like our chart earlier) but rather forms a different pattern?

And he set out to compile some statistics, to check. First he set about gathering various naturally-occurring sets of numbers, such as molecular weights of different chemicals; baseball statistics; census data; bank account balances; the revenues of all listed companies on the stock market... And next, for each set he organized the numbers within that set into the nine buckets, depending on whether the first digit of each number was 1 through 9.

And no matter where he looked — sizes of rivers; population counts; number of deaths; areas of counties — he kept encountering the same distribution pattern, over and over and over again, and it always looked something like this:

First digit:     1     2     3     4     5     6     7     8     9
Share (%):    30.1  17.6  12.5   9.7   7.9   6.7   5.8   5.1   4.6


There were more ones in the 1's bucket than twos in the 2's bucket, more twos than threes, more threes than fours, more fours than fives, more fives than sixes, more sixes than sevens, more sevens than eights, and more eights than nines!
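For readers who want the exact percentages, the usual mathematical statement of Benford's Law (not spelled out in this article) is that the share of numbers with first digit d is log10(1 + 1/d). A short sketch, with a function name of our own choosing:

```python
import math


def benford_share(d: int) -> float:
    """Expected percentage of numbers whose first digit is d (1 through 9),
    using the standard Benford formula log10(1 + 1/d)."""
    return 100 * math.log10(1 + 1 / d)


for d in range(1, 10):
    print(f"{d}: {benford_share(d):.1f}%")
```

Rounded to one decimal place, these values match the percentages listed above.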

This result may seem quizzical and counter-intuitive when you hear it for the first time, but for an illustrative example of why this actually makes sense, let's think about the stock market data set example that Benford included in his studies: the revenues that year for each and every public company on the stock market.

Consider that, based on the numbers at the time of this writing, year-over-year revenue growth among S&P 500 companies, looking back from 9/30/2023 to 2001, averaged approximately 4.3% per year, with a median of approximately 5.4%, over that span of more than two decades. Let's keep this "typical growth rate" in our heads for context.

Now imagine "typical company X" currently has annual revenue of 800K (putting them in the 8's first-digit bucket), and aspires to graduate up to the 9's bucket by growing their revenue to 900K. The percentage change in revenue that it will take to accomplish this goal, is +12.5%. And then, from 900K, to graduate from the 9's bucket up to the next digit (circling back to the 1's bucket), the required revenue growth is even less: an increase of +11.1% brings the company up to 1 million.

Ah, but now that the company has reached 1 million, think about how long they're going to stay in that 1's first-digit bucket... they have to grow through 1.1 million, 1.2 million, and so on; they're not going to reach the 2's first-digit until their revenue has doubled! A 100% increase in revenue is required to go from 1 to 2 million! Think of how long they'll need to stay in the 1's bucket, compared to the 9's!



Once they finally achieve 100% revenue growth and hit 2 million, the next first-digit bucket (which they'll reach at 3 million of revenue), is a closer hurdle: now, they require 50% growth in revenue to graduate from the 2's up to the 3's. And once they reach 3 million, the next digit (4) requires 33.3% revenue growth; from 4 to 5 requires 25% revenue growth; 5 to 6 requires 20% revenue growth... See how the percentage change requirement to jump to the next first-digit bucket keeps shrinking as revenue grows? Then one day they're at 9 million, and require just 11.1% revenue growth to reach 10 million...

But then once that 10 million mark has been reached, they're back in the 1's first-digit again: and once more they'll need 100% revenue growth (to 20 million) before they're back in the 2's! Think about how comparably little time they were just in the 9's, whereas now they'll be in the 1's for the entire time that they pass 11 million, 12 million ... 18 million, 19 million, and finally they're back in the 2's with 20 million. But from 20 million, to reach 30 million the hurdle is closer: 50% growth is required (just like it took 50% growth to get from 2 million to 3 million).

Having walked through the digits in this way, you can see that the pattern repeats over and over as the revenue number grows: a full 100% growth (a doubling) is needed to go from the 1's to the 2's, 50% from the 2's to the 3's, 33.3% from the 3's to the 4's, 25% from the 4's to the 5's, 20% from the 5's to the 6's, 16.67% from the 6's to the 7's, 14.29% from the 7's to the 8's, 12.5% from the 8's to the 9's, and 11.1% from the 9's back around to the 1's again.
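The percentages in this walk-through all come from the same small calculation: moving from first digit d to the next first digit means growing by a factor of (d + 1) / d. A quick sketch to check the arithmetic (the function name is ours):

```python
def growth_to_next_bucket(d: int) -> float:
    """Percent revenue growth needed to move from the start of first-digit
    bucket d to the start of the next bucket (9 wraps around to 1)."""
    return 100 * ((d + 1) / d - 1)


for d in range(1, 10):
    next_digit = d + 1 if d < 9 else 1
    print(f"{d}'s to {next_digit}'s: {growth_to_next_bucket(d):.2f}%")
```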

Actual company revenue growth rates of course can be variable, but if "typical company X" hypothetically were to achieve perfectly-steady year/year revenue growth throughout their journey from a tiny company to a behemoth, you can see how they'd spend much more time with their revenue happening to be in the 1's first-digit bucket than the 2's, that the next-highest amount of time would be spent in the 2's, followed by the 3's, etc., until the smallest amount of time would be lived with revenue in the 9's first-digit bucket.
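To put rough numbers on "time spent in each bucket," here is a sketch that assumes perfectly steady compound growth, which no real company achieves. It uses the approximate 5.4% median growth rate mentioned earlier purely as an illustrative input; the function names are our own.

```python
import math


def years_in_bucket(d: int, annual_growth: float) -> float:
    """Years a steadily growing company spends with revenue whose first digit
    is d, i.e. the time needed to grow by a factor of (d + 1) / d."""
    return math.log((d + 1) / d) / math.log(1 + annual_growth)


def time_shares(annual_growth: float) -> dict:
    """Percentage of the full 1-through-9 cycle spent in each bucket."""
    years = {d: years_in_bucket(d, annual_growth) for d in range(1, 10)}
    total = sum(years.values())
    return {d: 100 * y / total for d, y in years.items()}


growth = 0.054  # illustrative steady growth rate
for d, pct in time_shares(growth).items():
    print(f"{d}'s: {years_in_bucket(d, growth):.1f} years ({pct:.1f}% of the cycle)")
```

Under this assumption the growth rate cancels out of the percentages: whatever the steady rate, the share of time spent in each bucket works out to log10(1 + 1/d), which is the Benford's Law distribution.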

So looking across all publicly traded companies (from young start-ups to large behemoths and every company in between), when you take all of their annual revenues as a data set, you can envision how this is precisely the pattern you should expect to see in the data — more 1's than 2's, more 2's than 3's, etc. — and this relationship between the numbers, this downward-sloping curve, is what was described in the American Philosophical Society Proceedings in 1938, as "The Law of Anomalous Numbers."

The Law of Anomalous Numbers


But of course, most people today just refer to it as "Benford's Law."

And it turns out that the discrepancy visible between the set of numbers an average fraudster might create (i.e. someone who might pick their numbers without regard to the natural curve, such as picking at random), versus the normal Benford's Law curve of a "natural" data set, is highly useful in rooting out suspicious reports.
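To make that discrepancy concrete, here is a toy comparison. Both data sets are simulated assumptions rather than real invoices: the "fabricated" amounts are picked uniformly between 100 and 1,000, while the "natural-like" amounts are spread evenly on a logarithmic scale, which is one simple way to produce Benford-like data.

```python
import random
from collections import Counter


def digit_shares_of_values(values):
    """Percentage share of each first digit among positive amounts >= 1."""
    counts = Counter(int(str(int(v))[0]) for v in values)
    return {d: 100 * counts[d] / len(values) for d in range(1, 10)}


rng = random.Random(42)  # arbitrary seed, for repeatability
n = 20_000

# Hypothetical fraudster: amounts chosen uniformly at random.
fabricated = [rng.uniform(100, 1000) for _ in range(n)]
# Hypothetical "natural" data: amounts spread evenly across orders of magnitude.
natural_like = [10 ** rng.uniform(2, 5) for _ in range(n)]

fabricated_shares = digit_shares_of_values(fabricated)
natural_shares = digit_shares_of_values(natural_like)

for d in range(1, 10):
    print(f"{d}: fabricated {fabricated_shares[d]:.1f}%  natural-like {natural_shares[d]:.1f}%")
```

The fabricated column comes out roughly flat, while the natural-like column slopes downward from the 1's to the 9's.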

And that brings us to the reason we created this website. Although Benford had gathered the revenue figures from every company as a data set, what we were inspired to do with this website was to drill into each public company: to have our code study its individual quarterly 10Q and annual 10K filings and extract the numbers within each filing. Each filing typically includes various revenue line items, expense line items, balance sheet line items, share counts, and so on (we typically see around 1,000 numbers, and sometimes as many as 15,000 or more).

We created software to perform this task and then, for each filing, to arrange the extracted numbers from that filing into a chart, showing the percentage of 1's first-digit numbers, 2's, 3's, etc.
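The site's actual software is not shown here, but the general idea can be sketched in a few lines. Everything below is a hypothetical illustration: the regular expression, the function names, and the sample sentence are our assumptions, and a real filing parser would need to be far more careful.

```python
import re
from collections import Counter

# Matches digit runs with optional thousands separators and decimals, e.g. "769,995,602".
NUMBER_PATTERN = re.compile(r"\d[\d,]*(?:\.\d+)?")


def leading_digit(token: str):
    """Return the first non-zero digit in a numeric token, or None if there is none."""
    for ch in token:
        if ch in "123456789":
            return int(ch)
    return None


def first_digit_shares(text: str) -> dict:
    """Percentage share of each first digit among the numbers found in text."""
    counts = Counter()
    for token in NUMBER_PATTERN.findall(text):
        digit = leading_digit(token)
        if digit is not None:
            counts[digit] += 1
    total = sum(counts.values())
    if total == 0:
        return {d: 0.0 for d in range(1, 10)}
    return {d: 100 * counts[d] / total for d in range(1, 10)}


sample = "Revenue of 1,234 and expenses of 987; 769,995,602 shares outstanding; margin 0.45."
print(first_digit_shares(sample))
```

Note that this naive version also counts dates, years, and labels such as "10K" as numbers; whether and how to filter those out is a design decision for any real tool.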

And as our system began to process filing after filing, we quickly found that the numbers within each filing typically tend to follow the same familiar Benford's Law distribution pattern.

How useful is this in spotting potential fraud? Our first thought was to run some old Enron filings through the software; for instance here's the software's chart of their filing from 5/15/1998, which seems to have a suspicious count of first-digit-4's in it:

First digit:     1     2     3     4     5     6     7     8     9
Share (%):    30.7  19.4   9.0  12.5   7.5   6.9   5.2   4.6   4.2


As for modern-day accounting frauds among current public companies, they might be very difficult to spot. For one thing, any "professional criminal" probably already knows about Benford's Law, and would choose any "fake numbers" carefully so that they conform to it. Another difficulty in spotting fraud using Benford's Law is that a fraudster's selective manipulation of one or two very important numbers could go completely unnoticed in a filing with thousands of numbers in it.

However: we have to imagine that many cases of accounting fraud might begin with an innocent mistake by non-criminals, who then make very poor choices about how to proceed once they realize the mistake. Consider for example the following quote from this SEC filing which relates to an accounting scandal that ended in one of the largest class action lawsuit settlements in recent history:

"The Audit Committee based its conclusion on the preliminary findings of its investigation into concerns regarding accounting practices and other matters that first were reported to the Audit Committee on September 7, 2014. The Audit Committee promptly initiated an investigation, which is being conducted with the assistance of independent counsel and forensic experts ... the Audit Committee believes that the Company incorrectly included certain amounts related to its non-controlling interests in the calculation of adjusted funds from operations (“AFFO”), a non-U.S. GAAP financial measure, for the three months ended March 31, 2014 and, as a result, overstated AFFO for this period. The Audit Committee believes that this error was identified but intentionally not corrected, and other AFFO and financial statement errors were intentionally made, resulting in an overstatement of AFFO and an understatement of the Company’s net loss for the three and six months ended June 30, 2014."


The above quote suggests that the entire scandal began with an error; that once this error was identified, it was intentionally left uncorrected; and that, by our reading of the quote, additional "errors" were then intentionally introduced, apparently to compensate for the first. Anything intentionally introduced is, by definition, "non-natural" and thus might include numbers chosen by a person according to their particular thought process, and perhaps knowledge of Benford's Law was not part of that thought process.

We ran the above company's 2014 10K filing through our software, and here's the resulting chart:

First digit:     1     2     3     4     5     6     7     8     9
Share (%):    31.8  20.1  10.6   6.6   5.1   6.6  10.1   4.8   4.4


That 7's column in particular raises an eyebrow — seeing that, one might have the thought, "that seems quite a bit too high and I should investigate further", wouldn't you agree? Now it turns out that the 10K filing in question includes this line: "The number of outstanding shares of the registrant’s common stock on May 7, 2014 was 769,995,602 shares." Their share count (a number that would tend to get repeated quite often throughout a filing) began with the digit 7, which could definitely help explain an above-expected result in the 7's column.

But the main point is this: for our own investment research we have found that having a handy way to see how the numbers within a given company's 10Q and 10K filings compare with the expected Benford's Law distribution is a profoundly useful research tool. And we wanted to share our tool with the investing public. It can be of great use in flagging things that you would like to investigate more closely during your research. For example, while the chart above might prompt one to research why the 7's column came in so high, the chart below, from the very same company's final 10K filing before it was acquired by a larger competitor, looked like this:

First digit:     1     2     3     4     5     6     7     8     9
Share (%):    29.2  20.3  11.9   9.1   7.3   7.1   5.6   5.1   4.3


Seeing the above, one might think, "okay, that looks about normal."

Once you have learned about Benford's Law, it is natural to be curious how the filings of the stocks you own look in comparison to the normal Benford curve. Perhaps everything looks totally normal, which is great. But suppose the chart for one of your stocks looked like the one below... Wouldn't you want to know?

First digit:     1     2     3     4     5     6     7     8     9
Share (%):    18.4   8.1   5.1   3.7  13.2   8.8  10.3  25.0   7.4


So now having read this far, you will understand exactly what our website is all about, and how we are applying Benford's Law to stocks: the software we commissioned gathers and analyses 10Q and 10K filings, outputs charts like the ones you've seen above, and then also applies an in-house "scoring algorithm" (learn more about our scoring algorithm) in order to give an indication of how normal or abnormal the distribution pattern looks, versus the expectation based on Benford's Law.
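The in-house scoring algorithm itself is not described in this article. Purely as an assumed stand-in, one simple way to put a number on "how abnormal does this chart look" is the mean absolute deviation, in percentage points, between a filing's first-digit shares and the Benford expectation. The sketch below applies that idea to the four filing charts shown earlier; the function names and dictionary labels are ours.

```python
import math

# Expected Benford share (in percent) for first digits 1 through 9.
BENFORD = [100 * math.log10(1 + 1 / d) for d in range(1, 10)]


def mean_abs_deviation(shares):
    """Average absolute gap, in percentage points, between observed and Benford shares."""
    return sum(abs(s - b) for s, b in zip(shares, BENFORD)) / 9


def largest_excess_digit(shares):
    """First digit whose observed share most exceeds the Benford expectation."""
    excess = [s - b for s, b in zip(shares, BENFORD)]
    return excess.index(max(excess)) + 1


# Percentages copied from the charts in this article.
charts = {
    "Enron filing of 5/15/1998": [30.7, 19.4, 9.0, 12.5, 7.5, 6.9, 5.2, 4.6, 4.2],
    "2014 10K": [31.8, 20.1, 10.6, 6.6, 5.1, 6.6, 10.1, 4.8, 4.4],
    "Final 10K before acquisition": [29.2, 20.3, 11.9, 9.1, 7.3, 7.1, 5.6, 5.1, 4.3],
    "Hypothetical abnormal chart": [18.4, 8.1, 5.1, 3.7, 13.2, 8.8, 10.3, 25.0, 7.4],
}

for label, shares in charts.items():
    print(f"{label}: deviation {mean_abs_deviation(shares):.2f}, "
          f"largest excess in the {largest_excess_digit(shares)}'s")
```

By this measure the final 10K sits closest to the Benford curve and the last chart sits furthest from it, which agrees with how the charts look to the eye. Other measures, such as a chi-square statistic, would be equally reasonable choices.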

To explore our website further, you can jump to the top and enter any company name or ticker symbol into the search box in the main menu, or, browse through the latest filings our system has processed. Click through on any company for the full details. Thanks for visiting, and we hope our site becomes one of your go-to research tools for stock market research.

 

Benford's Law Stocks | www.BenfordsLawStocks.com
Copyright © 2023 - 2024, All Rights Reserved

Nothing in BenfordsLawStocks.com is intended to be investment advice, nor does it represent the opinion of, counsel from, or recommendations by BNK Invest Inc. or any of its affiliates, subsidiaries or partners. None of the information contained herein constitutes a recommendation that any particular security, portfolio, transaction, or investment strategy is suitable for any specific person. All viewers agree that under no circumstances will BNK Invest, Inc., its subsidiaries, partners, officers, employees, affiliates, or agents be held liable for any loss or damage caused by your reliance on information obtained. By visiting, using or viewing this site, you agree to the following Full Disclaimer & Terms of Use and Privacy Policy.