This is most definitely going to sound and feel like fallacy for those of you who have not heard this before. It is going to seem mathematically impractical, I must warn you. Come to truly think of it though, how could it be a paradox if it wasn’t at least that much? In this article, I shall introduce you to a paradox that has been fundamentally important in computer cryptography. Well, I believe some of you have already got the answer to this paradox, but do you really understand it? Let us find out.
Paradoxes come in several varieties and you might not distinguish them from dilemmas at face value. They have their ring and call. It is tempting to break down paradoxes in general but that shall defeat the purpose of this article. This article demystifies the all too popular Birthday Paradox. It is not fallacy but rather veridical, supported by mathematics at its core… well, *algebra.
Consider this statement; In a room with 23 people, just 23 people there is a 50% chance that two of them share a birthday. In a room with 75 people, there is a 99.9% chance that two of them share the same birthday. All this is in the scope of a year, 365 days. How absurd does that that sound? Now, this is to tell you that there is a 50% chance that you share your birthday with one of your former class mates and, a 99.9% chance that you share your birthday with quite a number of your former school mates. Before you are tempted to pick that calculator and crunch those numbers, have you considered to make a phone call?
Here is something to ponder on before we get to the numbers; in a group of 367 people, there is a 100% chance that two of them share a birthday – with the pigeon hole principle that is.
For the purposes of this explanation, we going to hold some factors constant because they have little effect on the analysis. These include the leap years, the birth of twins and variation in week days. One more thing; we do know that in real life, birth days are not really equally probable. Say, June 1st and December 29th are not equally likely to be named a birth date. That aside, let us get right into it.
I shall demystify this paradox by introducing your mind to a simpler problem before even attempting to bring in statistics. Consider a coin, a normal currency coin. Disregard all the unproven propaganda you have heard on this. What are the chances of getting a head once a coin is flipped the first time? 50%. We can all agree on that. What of the second time it is flipped.. just 50% divide by 2, you get 25%. What about ten times… divide 50% by 10 and there you have your answer, 5%. Right? Wrong! But what is wrong with these numbers… it is not just division? It is not.
What probability is based on is some mathematical concept of exponents. Let me say this clearly; the chance of getting all 10 heads when a coin is flipped ten times is not 5% or 0.05. It is 0.5 raised to the power of 10 or; (50 / 100)10. Does that sound confusing? Let’s push on and you shall see this is not that difficult and definitely the concept is not that evasive. As you can see, there is a very small probability that you shall get all heads, and you are welcome to toss your coin. This argument is carried into the Birthday Paradox, with a change in numbers and one more thing; what we shall be calculating for.
To simplify our coin problem, instead of trying to look at the probability of getting the head on the coin flip, you flip the problem and look at the probability of getting all tails. That does not seem very useful once you consider only 10 coin flips but, consider 300 of them. This is the mentality we carry forward to the Birthday paradox; you find the chance that everyone has a different birthday. From there, we going to introduce pairs. Consider our problem of 23 people, how many pairs do we have? Let us start with a smaller number. With 5 people, we have 10 possible pairs of people, without repetition. How did I get that number? (5 x 4) / 2. You are welcome to use a piece of paper through. With 6 people its is 15 pairs and with 23 people, it is 253 pairs.
Remember earlier assumptions on the year being 365 days and thus no leap years? Let us work with that. With 365 days, and one person having being born, there are only 364 days left for one other person to be born without that person having the same birthday as our earlier fellow. The chance of 2 people having the same birthday is going to be 50 – 50 in our group of 23 fellows. How exactly did we get here? Keep two things in mind; exponents, 253 pairs and 364 days. All that we have already discussed, then remember the coin probability and you shall end up with this: ( 364 / 364)253 which shall equal .4995 or 49.95%. We not done yet. To find the chance of getting a match we subtract .4995 from one and we get .5005 or 50.05% chance that two people in our group of 23 share the same birthday.
Our formula is not only restricted to 23 people, it can work with any number. Now, I know there are more complex explanations out there explaining this concept but, the purpose of this article is to make anyone understand. You can find more mathematics on Wikipedia if you like but, the mathematics there can make your head crack if you are not ready yet. Back to the formula I was about to give you.
To find the probability of a match in any number, you work with this formula: p(n) = 1 – (364 / 365)n(n-1)/2. Where p is probability, n is number of people being considered for a match and all the other numbers in the formula are regular numbers. And there you have it.
I know I have thrown in quite a number of numbers but how does all this fit in computers. We’ve seen babies being born and birthdays being mentioned but what about computers? This last segment covers all that. You have most definitely heard of the word cryptography be it in Bitcoin, movies or nightmares. Cryptography is basically the art of writing and solving code well, according to 19th century folks. Computer cryptography is much more. It is the scientific study of techniques for securing digital information, transactions and distributed computations. I rest my digging on crypto but, it is necessary to understand that concept in order to explain the usage of the birthday paradox in computing: the Birthday Attack.
Birthday Attacks as the names suggests utilizes this paradox we have discussed. They are use to find collisions, or rather matching values like our birthdays in something we call hash functions. That is a topic for another day for now, that information is quite a chunk to chew on. Now you know.
What do babies, birthdays and computers have in common? The Birthday Paradox.