Intuition suggests that each digit from 1 to 9 has an equal chance of being the first digit of a number, but Benford's law shows that in many naturally occurring data sets lower digits are more likely to lead than higher ones. One way to see why: counting starts with low numbers and only gradually reaches higher ones. For example, among the numbers 1 to 25, 11 have a leading digit of 1, seven have a leading digit of 2, and each of the digits 3 through 9 leads only one number.
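The counting example can be checked with a few lines of Python. This is only an illustrative sketch; the function name `leading_digit_counts` is made up for this example.

```python
from collections import Counter

def leading_digit_counts(limit):
    """Count how often each leading digit occurs among the integers 1..limit."""
    return Counter(int(str(n)[0]) for n in range(1, limit + 1))

counts = leading_digit_counts(25)
for digit in range(1, 10):
    print(digit, counts[digit])
# Digit 1 leads 11 numbers (1 and 10-19), digit 2 leads 7 (2 and 20-25),
# and each of the digits 3 through 9 leads exactly one number.
```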
In general, a series of numerical records follows Benford's law when the records:
- Represent measured quantities such as populations, river flow rates or sizes of celestial bodies.
- Don’t have pre-determined upper or lower limits.
- Are not made up of numbers used as identifiers, such as identity or social security numbers, bank accounts or telephone numbers.
- Have a mean greater than the median (a positive skew), and data that are not concentrated around the mean.
Mathematically, Benford's law is expressed with base-10 logarithms: the probability that the leading digit of a number is n equals log10(1 + 1/n). Substituting 1 through 9 for n shows that the probability falls with each successive digit, from about 30.1% for 1 to about 4.6% for 9.
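As a quick illustration, the formula can be evaluated for every digit in Python. The function name `benford_probability` is chosen for this sketch only.

```python
import math

def benford_probability(n):
    """Expected probability that the leading digit is n (1-9) under Benford's law."""
    return math.log10(1 + 1 / n)

probabilities = {n: benford_probability(n) for n in range(1, 10)}
for n, p in probabilities.items():
    print(f"{n}: {p:.1%}")
# Prints 30.1%, 17.6%, 12.5%, 9.7%, 7.9%, 6.7%, 5.8%, 5.1% and 4.6%
# for the digits 1 through 9; the nine probabilities sum to 1.
```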
Computer-generated random numbers, when drawn uniformly from a fixed range such as 100 to 999, give each digit from 1 to 9 an equal probability of being the leading digit. Plotted against the curve from the formula above, such evenly weighted digits produce a flat line, which can potentially indicate fraud. Data typed on the horizontal number row of a keyboard might instead show a bell curve, on the reasonable assumption that the dominant index and middle fingers strike the keys in the middle of the row (4, 5, 6 and 7) more often.
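The contrast between a flat distribution and the Benford curve can be sketched in Python. Several things here are assumptions made for illustration rather than part of any standard procedure: the uniform sample is drawn from 100 to 999, the powers of 2 stand in for a data set that is known to track Benford's law closely, and the mean absolute deviation is just one simple way to measure the gap between observed and expected frequencies.

```python
import math
import random
from collections import Counter

# Expected leading-digit probabilities under Benford's law.
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def leading_digit_frequencies(values):
    """Share of positive integers in `values` that start with each digit 1-9."""
    counts = Counter(int(str(v)[0]) for v in values)
    total = sum(counts.values())
    return {d: counts[d] / total for d in range(1, 10)}

def mean_absolute_deviation(frequencies):
    """Average gap between observed frequencies and the Benford expectation."""
    return sum(abs(frequencies[d] - BENFORD[d]) for d in range(1, 10)) / 9

rng = random.Random(0)  # fixed seed so the sketch is repeatable
uniform_sample = [rng.randint(100, 999) for _ in range(10_000)]
powers_of_two = [2 ** k for k in range(1, 1001)]

uniform_mad = mean_absolute_deviation(leading_digit_frequencies(uniform_sample))
powers_mad = mean_absolute_deviation(leading_digit_frequencies(powers_of_two))

print(f"uniform sample deviation: {uniform_mad:.3f}")
print(f"powers of 2 deviation:    {powers_mad:.3f}")
```

The uniform sample should show a much larger deviation than the powers of 2. A large deviation is a prompt to look more closely at the data, not evidence of fraud in itself.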
Benford's law is one way to investigate fraudulent behavior and a valuable part of an anti-fraud toolkit, but it is not foolproof and should never be the sole basis of an investigation. Counting leading digits cannot prove that fraud has occurred. However, if the expected Benford curve does not appear in data that should follow it, further investigation is warranted.