Collecting coupons

Every time you buy a pizza from Pizza Hut, they give you a coupon selected at random from a set of ${n}$ different coupons. How many pizzas do you need to buy on average in order to collect all possible coupons? We know from kindergarten that the answer is ${\Omega(n\log n)}$. Here’s an intuitive explanation why.

First, let’s look at a cool trick.

Suppose a biased coin with probability of showing heads to be ${p}$ is being tossed repeatedly. How many times, on average, does it need to be tossed to get a heads for the first time? I promise this is related, and the relation will be clear soon.

Anyway, this is a standard geometric distribution and thus we can plug things into known formulas and get the value ${1/p}$ for our answer. Intuitively, it does look like that should be the answer, since if we rolled a die with ${1/p}$ faces repeatedly, we would expect to see face #1 after an average of ${1/p}$ rolls. We can also derive the formula ourselves by writing down an infinite series and using standard summation tricks. But here’s a nice trick that can be used to get to the result immediately.

Let ${E}$ be the expected number of tosses need. Then if the first toss shows a heads, we know that ${E}$ is 1, and if the first toss shows a tails, we know that ${E}$ is ${1+E}$. Writing this “in math” we get ${E = p + (1-p)\cdot E}$, which solves to ${E = p}$.

Anyway, back to coupon collection. Suppose we already have ${x}$ coupons. What’s the probability that if we buy a pizza, we get a coupon that we do not already have? Clearly, ${\frac{n-x}{n}}$. Thus the expected number of pizzas needed to get one new coupon, from the previous calculations, is ${\frac{n}{n-x}}$. This means the expected number of pizzas required to collect all coupons can just be written as a the sum ${\sum_{x=1}^{n-1} \frac{n}{n-x}}$, which solves to ${\Omega(n\log n)}$ after using the well-known approximation for the Harmonic number.