About Our Benford's Law Stocks Scoring Algorithm
By BenfordsLawStocks.com Website Staff, March 2024
Hello! Have you read our main introduction yet? This page is written for those who have, so if you have not, please do so before you continue reading. If you have read our introduction, then welcome back! Please read on.
In our introduction, we showed you some charts that looked something like this:
[Chart: percentage of numbers beginning with each first digit, columns 1 through 9]
The above represents the actual current median result for the latest 10-Q filing of every company in our coverage universe, i.e. aggregated across every company our software has successfully processed. It tracks almost exactly with what we would expect based on Benford's Law. Thus, we could consider the above to represent a "perfectly normal" filing.
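For readers who want to see where that expectation comes from: Benford's Law gives the expected share of numbers whose first digit is d as log10(1 + 1/d). The short Python sketch below computes those shares. The function name `benford_expected` is ours for illustration and is not taken from the website's software.

```python
import math


def benford_expected():
    """Expected share of each leading digit 1-9 under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


# Print the expected percentage for each first-digit column.
for digit, share in benford_expected().items():
    print(f"{digit}: {share:.1%}")
```

The nine shares add up to 100%, with the 1's column the tallest (about 30%) and each later column lower than the one before it.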
But there is of course a range of normal outcomes for each column, so the charts you will see throughout our website look a little different from the one above. Our charts also include a colored background area, like this:
[Chart: first-digit percentages, columns 1 through 9, with a colored background area]
What you see above in the colored background area represents the "normal range", meaning that across our entire coverage universe, most companies fell within that range.
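One way to picture such a range is as a band between a low and a high percentile of each column across all companies. The sketch below is a minimal illustration only: the 10th and 90th percentile cutoffs, the nearest-rank method, and the names `percentile` and `normal_range` are our assumptions, since this page does not state how "most companies" is defined.

```python
import math


def percentile(sorted_values, q):
    """Nearest-rank percentile (q from 0 to 100) of an already sorted list."""
    rank = max(1, math.ceil(q * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def normal_range(universe, low_q=10, high_q=90):
    """Per-digit (low, high) band across a universe of filings.

    universe: list of dicts mapping first digit (1-9) to its share (0..1),
    one dict per company. The 10/90 cutoffs are hypothetical.
    """
    band = {}
    for d in range(1, 10):
        values = sorted(company[d] for company in universe)
        band[d] = (percentile(values, low_q), percentile(values, high_q))
    return band
```

A column that lands outside its band is then "out of the typical range" for that digit.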
It is on this idea of what's "typical" that we based our scoring algorithm for each individual company's filings: once we have an idea of what's "normal", we can perform a comparison.
A second comparison we are able to make is against an individual company's "own history", i.e. the software can get a sense of what that company's filings typically look like on a Benford curve basis, and then check whether the newest filing deviates abnormally from that typical range.
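As a rough illustration of an own-history check, the sketch below flags any digit column in the newest filing that sits more than `k` standard deviations away from that company's historical average. The mean and standard deviation approach, the threshold `k=2.0`, and the function name `history_deviations` are all assumptions made for this example. The page does not say how the actual software defines a company's typical range.

```python
import statistics


def history_deviations(history, latest, k=2.0):
    """Return {digit: deviation} for columns that look unusual vs. history.

    history: list of dicts (digit -> share), one per prior filing (2 or more)
    latest:  dict (digit -> share) for the newest filing
    k:       hypothetical threshold in standard deviations
    """
    flagged = {}
    for d in range(1, 10):
        past = [filing[d] for filing in history]
        mean = statistics.mean(past)
        spread = statistics.stdev(past)
        deviation = latest[d] - mean
        if abs(deviation) > k * spread:
            flagged[d] = deviation
    return flagged
```

A company with only a few prior filings gives a noisy estimate of its own spread, so a check like this is most meaningful once a longer history is available.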
We turn these comparisons against the typical range into a score, where our algorithm makes certain deductions for things that look out of whack.
Here is one example chart, which our algorithm scored an 82 out of 100:
[Chart: first-digit percentages for the example filing, columns 1 through 9]
Starting from a possible score of 100, our algorithm made the following score deductions:
The 7's percentage was higher than the 6's percentage, increasing the penalty multiplier.
We deducted points (includes the impact of penalty multiplier) upon observing the 7's percentage was above the typical coverage universe range.
The 9's percentage was higher than the 8's percentage, increasing the penalty multiplier.
This is the 2nd column with percentage higher than the prior column, resulting in a penalty deduction.
We deducted points (includes the impact of penalty multiplier) upon observing the 9's percentage was above the typical coverage universe range.
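The deductions listed above can be sketched in Python as follows. This is a simplified illustration, not the website's actual algorithm: the multiplier step, the size of the repeat penalty, the points deducted per percentage point, the deduction for columns below the range, and the example band are all hypothetical, and the sketch makes no attempt to reproduce the score of 82.

```python
import math


def score_filing(shares, band, multiplier_step=0.5, repeat_penalty=2.0,
                 points_per_pct=1.0):
    """Return (score, notes) for one filing.

    shares: dict digit -> share of numbers starting with that digit (0..1)
    band:   dict digit -> (low, high) typical range across the universe
    All numeric parameters are hypothetical placeholders.
    """
    score = 100.0
    multiplier = 1.0
    rising_columns = 0
    notes = []
    for d in range(1, 10):
        # Benford's Law expects each column to be lower than the one before.
        if d > 1 and shares[d] > shares[d - 1]:
            rising_columns += 1
            multiplier += multiplier_step
            notes.append(f"{d}'s higher than {d - 1}'s: multiplier now {multiplier}")
            if rising_columns >= 2:
                score -= repeat_penalty
                notes.append(f"rising column #{rising_columns}: -{repeat_penalty}")
        low, high = band[d]
        # Distance outside the typical range, in percentage points.
        # Penalizing columns below the range is an assumption of this sketch.
        outside = max(shares[d] - high, low - shares[d], 0.0) * 100
        if outside > 0:
            deduction = outside * points_per_pct * multiplier
            score -= deduction
            notes.append(f"{d}'s outside typical range: -{deduction:.2f}")
    return max(score, 0.0), notes


# Hypothetical filing in which the 7's exceed the 6's and the 9's exceed the 8's.
example = {1: 0.30, 2: 0.18, 3: 0.12, 4: 0.10, 5: 0.08,
           6: 0.06, 7: 0.07, 8: 0.04, 9: 0.05}
# Hypothetical band: Benford's expected share plus or minus one percentage point.
example_band = {d: (math.log10(1 + 1 / d) - 0.01, math.log10(1 + 1 / d) + 0.01)
                for d in range(1, 10)}
score, notes = score_filing(example, example_band)
print(round(score, 1))
for note in notes:
    print(note)
```

Because the multiplier grows each time a column rises above the one before it, the same out-of-range column costs more points when it appears later in a filing that already shows such rises.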
Once our software has scored every company in our coverage universe on their latest filings, we can also see how any individual company compares to the scores of others. In the case of the above example, this was a normal score compared to other filers, falling within the middle 50%.
We can also see how the data set size compares — i.e. how many total numbers our system extracted from each filing, and how that count compared to other filers. For the above example, 5,891 numbers were found, which is a large count compared to other filers, putting the company within the highest 10%. Knowing the number count can be useful, because for filings with very small counts, even one or two additional numbers in a given column can have a large impact — thus readers can take the output with the appropriate grain of salt.
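Both comparisons, score versus other filers and number count versus other filers, amount to a percentile rank. A minimal sketch follows. The "middle 50%" and "highest 10%" labels come from the text above, while the remaining bucket labels, the cutoffs, the tie handling, and the function names are our assumptions.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(value, universe_values):
    """Percent of the universe at or below value, counting ties as half."""
    ordered = sorted(universe_values)
    below = bisect_left(ordered, value)
    ties = bisect_right(ordered, value) - below
    return 100 * (below + 0.5 * ties) / len(ordered)


def describe(rank):
    """Map a percentile rank to a plain-language bucket (cutoffs hypothetical)."""
    if rank >= 90:
        return "highest 10%"
    if rank >= 75:
        return "highest 25%"
    if rank >= 25:
        return "middle 50%"
    if rank >= 10:
        return "lowest 25%"
    return "lowest 10%"
```

The same two functions work for scores and for number counts, since each is just one value being placed within a list of values from all other filers.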
The above example filing was filed in a format called XBRL (eXtensible Business Reporting Language). In cases like this, our algorithm extracts all "tagged" numbers presented by the company. For any non-XBRL filings, our algorithm uses its own internal set of rules to extract what it believes to be numbers found throughout the filing. Repeated numbers are allowed, i.e. we are not just looking for the set of "unique numbers" in the filing.
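Once the numbers are extracted, turning them into the chart columns is a matter of counting first digits. The sketch below shows one way to do that, keeping repeated numbers as described above. It does not cover the extraction step itself (XBRL tags or the internal rules for non-XBRL filings), and the function names and sample values are hypothetical.

```python
from collections import Counter


def leading_digit(value):
    """Return the first non-zero digit of a number, or None for zero."""
    for ch in str(value):
        if ch in "123456789":
            return int(ch)
    return None


def first_digit_shares(numbers):
    """Share of numbers starting with each digit 1-9; repeats are counted."""
    digits = [leading_digit(n) for n in numbers]
    counts = Counter(d for d in digits if d is not None)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no non-zero numbers to analyze")
    return {d: counts[d] / total for d in range(1, 10)}


# The repeated 1200 is counted twice; the zero is skipped.
print(first_digit_shares([1200, 1200, 350, 0.047, -98, 0]))
```

Skipping zeros is a choice made for this sketch, since a zero has no leading digit from 1 to 9.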
The final thing we would like to say about our algorithm is to reiterate what we say across the website: our Benford's Law analysis is strictly for informational purposes and does not represent investment advice of any kind, and our software should be expected to make mistakes and contain errors. Therefore nothing you see on this website should be relied upon for any decision-making purpose without first repeating the analysis yourself independently, which we encourage users to do. In other words, check our work!
In fact, each page on the website presents, together with a score, the full data set of all the numbers our system detected. We also include links to every filing on sec.gov.
This way, you can open any actual filing on your own and compare our data set against what you yourself find. This can also be helpful for exploring any particular digit columns that caught your eye.
Thank you for taking the time to read our website introduction pages in order to understand exactly what our website is all about and how we are applying Benford's Law to stocks, with our in-house algorithm designed to give an indication of how normal or abnormal the first-digit distribution looks within a given company's filings, as compared to the expectation based on Benford's Law.
To explore our website further, you can jump to the top and enter any company name or ticker symbol into the search box in the main menu, or browse through the latest filings our system has processed. Click through on any company for the full details. Thanks for visiting, and we hope our site becomes one of your go-to tools for stock market research.