Mega Millions Lottery Enhanced Algorithm
Have you ever wondered if creating an algorithm to win the lottery was possible?
I did. So I wrote a case study to help others understand if it was possible - or even profitable.
The short answer is: Not really.
Github Project Link.
Findings
I ran a test in which I generated 20,000 lottery numbers in batches of 4,000, each batch using a different generation algorithm.
Out of those 20,000, I found that 7 exactly matched a past winning number, Mega Ball included, when checked against every winning number drawn since 2002.
I know what you are thinking: that means roughly 1 in every 3,000 tickets generated this way will be a Mega Millions winner! That is far better than the official jackpot odds of 1 in 302,575,350!
Well, no. This is backtesting previous winning lottery numbers.
This approach is known as the Needle and the Haystack Approach,
where the previous winning numbers are the haystack and the needle is each guess.
In an actual drawing, the single winning number is the needle, and the guesses are the haystack. The backtest result doesn't necessarily hold when the roles are reversed.
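To put a number on the size of the haystack, here is a minimal sketch of the jackpot odds. It assumes the 2022 Mega Millions format of 5 white balls from 1 to 70 plus one Mega Ball from 1 to 25, which is not stated above; the function names are illustrative only.

```python
from math import comb

def jackpot_combinations(white_max=70, white_picks=5, mega_max=25):
    """Count distinct tickets: unordered white-ball sets times Mega Ball choices."""
    return comb(white_max, white_picks) * mega_max

total = jackpot_combinations()
print(f"1 in {total:,}")  # 1 in 302,575,350

def p_match_any(draws, combos=total):
    """Chance that one random ticket matches at least one of `draws` past results.

    Simplification: treats every past draw as if it used the same format.
    """
    return 1 - (1 - 1 / combos) ** draws
```

Backtesting compares each guess with every past draw, so the chance of a hit grows with the number of draws checked. A live drawing offers only one.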
Technicals
Now for the geeky stuff.
Chances of drawing a 10 or less: 58.06%
Chances of drawing 20 or less: 58.47%
Chances of drawing multiple 10 or less: 33.47%
Chances of drawing multiple 20 or less: 34.07%
Five Most Frequent By %
17: 1.94% (Count: 48)
31: 1.9% (Count: 47)
10: 1.9% (Count: 47)
14: 1.9% (Count: 47)
04: 1.77% (Count: 44)
Five Least Frequent By %
05: 1.05% (Count: 26)
65: 1.09% (Count: 27)
23: 1.09% (Count: 27)
67: 1.13% (Count: 28)
45: 1.13% (Count: 28)
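A minimal sketch of how counts and percentages like these could be computed, assuming each draw is available as a tuple of white-ball numbers. The function names are hypothetical and this is not necessarily how the project computes them.

```python
from collections import Counter

def number_frequencies(draws):
    """Return {number: (count, percent of all balls drawn)}.

    Numbers that never appear in `draws` are simply absent from the result.
    """
    counts = Counter(n for draw in draws for n in draw)
    total = sum(counts.values())
    return {n: (c, round(100 * c / total, 2)) for n, c in counts.items()}

def most_and_least(draws, k=5):
    """Return the k most frequent and k least frequent (number, count) pairs."""
    counts = Counter(n for draw in draws for n in draw)
    ranked = counts.most_common()          # sorted by count, highest first
    return ranked[:k], ranked[-k:][::-1]   # least frequent listed lowest first
```

Percentages here are relative to the total number of balls drawn, which is consistent with a count of 48 mapping to 1.94%.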
Coding Walkthrough
First things first, I had to locate the .csv of the lottery winning numbers.
I got the dataset from here.
I then split the dataset into the last 1 year and the last 5 years of winning numbers, since the Mega Millions format changed 5 years ago.
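The loading and date filtering could look roughly like the sketch below. The column names ("Draw Date", "Winning Numbers", "Mega Ball"), the date format, and the sample rows are all assumptions made for illustration; adjust them to match the actual .csv.

```python
import csv
import io
from datetime import date, datetime, timedelta

def load_draws(csv_file):
    """Parse rows into (draw_date, white_balls, mega_ball) tuples."""
    draws = []
    for row in csv.DictReader(csv_file):
        drawn = datetime.strptime(row["Draw Date"], "%m/%d/%Y").date()
        whites = tuple(int(n) for n in row["Winning Numbers"].split())
        draws.append((drawn, whites, int(row["Mega Ball"])))
    return draws

def within_years(draws, years, today):
    """Keep draws from the last `years` years (approximated as 365-day years)."""
    cutoff = today - timedelta(days=365 * years)
    return [d for d in draws if d[0] >= cutoff]

# Placeholder rows for illustration only, not real results.
sample = io.StringIO(
    "Draw Date,Winning Numbers,Mega Ball\n"
    "01/05/2016,01 02 03 04 05,06\n"
    "08/23/2022,07 08 09 10 11,12\n"
)
draws = load_draws(sample)
last_five = within_years(draws, 5, today=date(2022, 8, 26))
last_one = within_years(draws, 1, today=date(2022, 8, 26))
```

Passing `today` explicitly keeps the filter reproducible, so a rerun months later selects the same draws.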
From the data, I concluded the following:
Five Most Frequent Numbers | Frequency Count |
17 | 48 |
31 | 47 |
10 | 47 |
14 | 47 |
4 | 44 |
Five Least Frequent Numbers | Frequency Count |
50 | 26 |
35 | 25 |
55 | 25 |
49 | 25 |
51 | 21 |
But did that amount to anything useful?
Well, maybe. While we had clear evidence that some numbers appeared nearly twice as often as others,
could this help me build an algorithm that generated more accurate lottery numbers? Only tests and more analysis could tell.
After running many analyses, I found that the data is about as random as it can be.
There are some theories I had that could be used to generate future guesses, however.
To test them, I created 4 biased lottery number generators.
Bias Generation 1
Pick from the top 10s (both the 10 most frequent and the 10 least frequent numbers).
i. Generate how many will be picked from each side.
Bias Generation 2
Pick from the top 20s (both the 20 most frequent and the 20 least frequent numbers).
i. Generate how many will be picked from each side.
Bias Generation 3
Pick from both halves
i. Generate how many will be picked from each side.
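Generations 1 to 3 share the same shape: two pools of numbers and a random split between them. A minimal sketch follows, assuming the numbers are already ranked from most to least frequent and that "both halves" means the two halves of that ranking. `pools` and `biased_pick` are hypothetical names, and the Mega Ball is left out because its selection is not described here.

```python
import random

def pools(ranked, n):
    """Split a most-to-least-frequent ranking into top-n and bottom-n pools."""
    return ranked[:n], ranked[-n:]

def biased_pick(most_frequent, least_frequent, picks=5, rng=random):
    """Draw `picks` distinct numbers, split randomly between the two pools.

    Each pool must hold at least `picks` numbers.
    """
    from_most = rng.randint(0, picks)  # how many come from the "most" side
    chosen = set(rng.sample(most_frequent, from_most))
    # Guard against overlap between pools so the ticket has no duplicates.
    remaining = [n for n in least_frequent if n not in chosen]
    chosen.update(rng.sample(remaining, picks - from_most))
    return sorted(chosen)
```

With a 70-number ranking, `pools(ranked, 10)` would feed Generation 1, `pools(ranked, 20)` Generation 2, and `pools(ranked, 35)` Generation 3 under the halves assumption.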
Bias Generation 4
Algorithm driven
i. Pick at most 2 from under 20
ii. The rest have free rein
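One way to read the two rules above, sketched under assumptions: "under 20" is taken as 1 to 19, white balls run from 1 to 70, and the free picks are drawn from 20 and up so that the cap of 2 low numbers always holds. The function name and parameters are illustrative.

```python
import random

def generation_four(rng=random, white_max=70, picks=5, low_limit=20, max_low=2):
    """Build a ticket with at most `max_low` numbers below `low_limit`."""
    low_pool = range(1, low_limit)                # 1..19
    high_pool = range(low_limit, white_max + 1)   # 20..70
    low_count = rng.randint(0, max_low)           # 0, 1, or 2 low numbers
    ticket = rng.sample(low_pool, low_count)
    ticket += rng.sample(high_pool, picks - low_count)
    return sorted(ticket)
```

Drawing the remaining numbers from the full 1 to 70 range instead would let extra low numbers slip in, which is why the sketch restricts them to the high pool.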
The Tests
The most efficient way to test whether these biased generators work is to test them against a single winning lottery number.
To make the analysis easy, I created an algorithm that calculates how much money would be won if the winning number happened to be
02 05 29 64 69 18. The results are as follows:
Trial | Money Won | Money Spent |
Generation 1 | $2060 | $20000 |
Generation 2 | $2620 | $20000 |
Generation 3 | $2916 | $20000 |
Generation 4 | $3102 | $20000 |
As you can see, spending $20,000 on tickets would not be profitable.
Even though each generation performed slightly better than the previous one, based on tests with more combinations,
none were feasible or profitable.
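The money-won calculation can be sketched as below. The prize table is the standard Mega Millions tier list as of 2022 without the Megaplier, which is an assumption about what the test used; the $2 ticket price and the `jackpot` parameter are likewise assumptions, and the function names are hypothetical.

```python
# Fixed prizes keyed by (white balls matched, Mega Ball matched).
PRIZES = {
    (5, False): 1_000_000,
    (4, True): 10_000,
    (4, False): 500,
    (3, True): 200,
    (3, False): 10,
    (2, True): 10,
    (1, True): 4,
    (0, True): 2,
}

def prize(ticket, winning, jackpot=0):
    """Score one ticket; ticket and winning are (white_balls, mega_ball) pairs."""
    whites = len(set(ticket[0]) & set(winning[0]))
    mega = ticket[1] == winning[1]
    if whites == 5 and mega:
        return jackpot  # the jackpot varies per draw, so the caller supplies it
    return PRIZES.get((whites, mega), 0)

def backtest(tickets, winning, ticket_price=2, jackpot=0):
    """Return (money won, money spent) for a batch of tickets."""
    won = sum(prize(t, winning, jackpot) for t in tickets)
    return won, ticket_price * len(tickets)

# The winning number used in the test above.
WINNING = ((2, 5, 29, 64, 69), 18)
```

Calling `backtest(tickets, WINNING)` on each generator's batch would yield one row of the table, provided the tickets are stored as `(white_balls, mega_ball)` pairs.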
Conclusion
It may be possible, with more in-depth analysis, to build an algorithm that generates, say, 100 lottery numbers and
turns a profit by winning more in prizes than is spent on tickets, but so far my efforts have not produced one.
This is an ongoing project. Last update (08/26/2022 - Revision 1)