Sixteen bottles of wine riddle

August 11, 2025

Imagine the typical everyday scenario in which you have been imprisoned in the wine cellar of an evil combinatorialist. She promises to let you go, but only if you can solve her riddle.

“Is it the one where there are 1000 barrels of wine and one of them is poisoned?”

“Uh, maybe? Do you already know the answer to that one?”

“…”

You’re lead past a conspicuous amount of barrels and into a new room where you’re given a bottle of wine and four measuring devices. The combinatorialist explains that the wine is from one of sixteen possible years, however the label is missing. Each measuring device will output either a 0 or a 1 depending on what year the wine is from. Your goal is to identify the year while doing as few measurements as possible. The combinatorialist explains what each device outputs using a table:

Year Device 3 Device 2 Device 1 Device 0
0 0 0 0 0
1 0 0 0 1
2 0 0 1 0
3 0 0 1 1
4 0 1 0 0
5 0 1 0 1
6 0 1 1 0
7 0 1 1 1
8 1 0 0 0
9 1 0 0 1
10 1 0 1 0
11 1 0 1 1
12 1 1 0 0
13 1 1 0 1
14 1 1 1 0
15 1 1 1 1

You briefly wonder how she managed to procure wine from over 2000 years ago before recalling that the wine cellar was built deep inside of a hypothetical scenario. Looking closely, you notice that each device tells you one bit of information about the year. For example, 1101 in binary can be converted to 13 in decimal. You measure the wine with devices 3, 2, 1 and 0 which output 0, 1, 1 and 0 respectively. Converting 0110 from binary to a decimal you arrive at the answer of 6. The combinatorialist nods.

“So, can I leave now?”

“Hahaha – no. That was just an apéritif.”

The combinatorialist reveals 16 new unlabeled bottles and explains that each wine is from a different year from 0 to 15. Once again your task is to identify the year of each bottle of wine.

“Am I allowed to mix samples from different bottles to try to measure them simultaneously?”

“No, the measuring devices explode instantly if they’re used on any mix of wines.”

Before you have a chance to question how and why the devices would explode, your captor gives you one final condition: Identify all 16 of the wines in a total of 50 measurements or less. Is there a strategy you can follow that is guaranteed to succeed?

This is designed to be solvable without a computer (e.g. you don’t need to do any programmatic brute forcing). Spoilers for the solution are below the following image.

Photo of a wine glass on top of a 7x7x7 Rubik's cube and a wine bottle with a question mark on the label.


A solution

Since identifying one bottle of wine requires 4 measurements, it appears you would need \(4 \times 16 = 64\) measurements to identify all 16 bottles of wine. The trick here is that each wine is from a different year. Without the uniqueness constraint there would be \(16^{16}\) possible states (64 bits of entropy), while with the uniqueness constraint there are \(16!\) possible states (≈44.25 bits of entropy), i.e. the number of ways to permute 16 objects. If we can find a way to take advantage of the uniqueness constraint there’s a chance we can identify everything in 50 measurements or less.

Two bottle case

Year Device 0
0 0
1 1

To start it might help to think of a much simpler case: say we’re only given two wines and told that one is from the year 0 and the other is from year 1. Notice that if we can identify one bottle, we can immediately deduce the identity of the other bottle without having to take any additional measurements. In this case, we only need to do 1 measurement using device 0 to identify both bottles.

Four bottle case

Year Device 1 Device 0
0 0 0
1 0 1
2 1 0
3 1 1

Next, say we have four bottles from the years 0, 1, 2, and 3. For convenience let’s call the wines \(w_{00}, w_{01}, w_{10}\) and \(w_{11}\) respectively based on the binary representations of their years.

Let’s consider what would happen if we used device 1 to measure all four of the bottles. Since each of the four years appear exactly once, device 1 should output 0 twice (for \(w_{00}\) and \(w_{01}\)) and output 1 twice (for \(w_{10}\) and \(w_{11}\)). Now that we know the two possible positions of \(w_{00}\) and \(w_{01}\), we can use device 0 once to identify both like in the two bottle case from earlier. Likewise we can use device 0 once on the other pair to identify both \(w_{10}\) and \(w_{11}\).

This strategy would use 6 measurements: the 4 initial measurements with device 1 and the 2 measurements with device 0. Can we do better? Looking closely, we don’t actually need to measure every bottle with device 1. Regardless of what order we measure the bottles in, after three measurements with device 1 we will either observe:

  • Device 1 outputs 0 twice, and 1 once. By elimination measuring the fourth bottle would output 1.
  • Device 1 outputs 0 once, and 1 twice. By elimination measuring the fourth bottle would output 0.

In both cases we can infer what the fourth measurement would be without actually using the device. Using this strategy we are guaranteed to identify every bottle in 5 measurements: the 3 initial measurements with device 1 followed by 2 additional measurements with device 0.

Eight bottle case

Year Device 2 Device 1 Device 0
0 0 0 0
1 0 0 1
2 0 1 0
3 0 1 1
4 1 0 0
5 1 0 1
6 1 1 0
7 1 1 1

This time we have wine from the years 0, 1, 2, 3, 4, 5, 6 and 7 (let’s refer to them as \(w_{000}, w_{001}, w_{010}, w_{011}, w_{100}, w_{101}, w_{110}\) and \(w_{111}\)). Starting with device 2, we can reuse the trick from earlier and measure seven of the bottles. The two cases are:

  • Device 2 outputs 0 four times, and 1 three times. By elimination measuring the eighth bottle would output 1.
  • Device 2 outputs 0 three times, and 1 four times. By elimination measuring the eighth bottle would output 0.

Now that we know what device 2 would output for every bottle, we can subdivide the problem into two groups:

  • \(w_{000}, w_{001}, w_{010}\) and \(w_{011}\) comprise the bottles that device 2 would output 0 for.
  • \(w_{100}, w_{101}, w_{110}\) and \(w_{111}\) comprise the bottles that device 2 would output 1 for.

Notice that we can reuse our strategy from the four bottle case on each group to determine the identity of every bottle. In total we are guaranteed to identify every bottle in 17 measurements: 7 initial measurements from device 2, and 5 additional measurements for each of the two groups.

Sixteen bottle case

We can use device 3 on fifteen of the bottles and subdivide the problem into two groups of eight bottles based on device 3’s outputs. If we reuse the strategy from the eight bottle case on each group we get a total of 15 + 17 + 17 = 49 measurements, just below the evil combinatorialist’s threshold of 50. You’re free!

Visualization

One way to visualize this is to watch how the probabilities that a wine is in a certain position change as we make observations. Under the hood we can compute this by looping through every permutation and counting how many are still possible given the observations we’ve made so far. In the following grids, each column represents a bottle and the measurements we’ve performed on it. Each row represents what year the wine in those bottles might be. The cells in the grid grow darker or lighter as the probability that a year is in a given position increases or decreases.

You may have noticed earlier that we can save additional measurements if we get lucky on the order certain values are output. For example, in the eight bottle case we don’t always need to perform seven measurements with device 2, we can quit as soon as we see either four 1’s or four 0’s (the above animation reaches four 0’s on the sixth measurement). The following animation shows a “lucky” run where we skip as many measurements as possible:

Surprisingly, if we’re lucky it’s possible to identify every bottle in only 12 measurements despite there being \(8! = 40320\) possible states (≈15.3 bits of entropy). At first glance each measurement looks like it can resolve at most 1 bit of entropy since each one literally reveals 1 bit. However certain combinations of measurements and outputs are worth more than the sum of their parts. In the case of getting lucky on the eight bottle case and having the first four bottles all have the same highest order bit, we go from \(8!\) states to \(4!4! = 576\) states (≈9.17 bits of entropy), meaning each of the four measurements resolves an average of roughly 1.5 bits of entropy.

Optimality

The “divide and conquer” approach of forming subgroups and recursing on them has an elegant feel to it, but sadly it doesn’t appear to be optimal. I don’t actually know what the optimal strategy is, but I threw together a greedy approach that always performs the measurement with the higest expected information gain which outperforms the divide and conquer approach in the eight bottle case (computing the expected information gain with sixteen bottles is tricky due to the sheer size of the state space). On average it identifies all the wine bottles in 15.48 measurements, a slight improvement over divide and conquer’s 15.73 measurements.

Greedy information gain strategy.

Figuring out the actual optimal strategy (in terms of number of measurements needed across all possible permutations) in the eight and sixteen bottle case is left as an exercise to the reader ;)

Miscellaneous thoughts

  • I feel moderately guilty about releasing a new contrived interviewesque question into the atmosphere. The silver lining is I would find it really funny if I get asked this question in an interview some day.
  • While doing this writeup OpenAI released GPT-5, giving a nice opportunity to see how an LLM would do against a riddle that hasn’t been publicized anywhere. After thinking for 50 seconds it comes to the expected answer of 49 guesses, but then immediately fumbles it in step A by wasting 3 measurements and flounders to compensate for the wasted measurements in step C – maybe an artifact of the sequential nature in which LLMs generate output.
  • If you squint at the intended solution, it’s kinda like radix sort.
  • Is there a strategy that guarantees you identify every bottle in 48 or less guesses? This would require either:
    • Lowering the upper bound on the 8 bottle case.
    • Squeezing out better opportunities to gain information in the 16 bottle case.
    • Can we find/prove these kinds of results in the 32, 64, 128, etc… cases? I deliberately chose 16 wine bottles for this writeup since its just past the threshold where brute force searching through state space becomes intractable for the average mortal, and requires some pen and paper analysis.
  • A 16x16 grid visualization wasn’t possible because trying to calculate the density at each spot in the grid is, to my knowledge, intractable and boils down to computing the permanent of certain matrices. My best guess to approximate the values would be by throwing the grid into cvxpy and telling it to:
    • Maintain the constraints of a doubly stochastic matrix (non-negative entries, rows and columns each sum to 1)
    • Set entries that should be 0 based on direct observations to 0.
    • Maximize entropy under the prior constraints
  • If someone smarter than me knows of a better approximation it would be handy for a different problem I’ve been staring at. This puzzle was meant to be a toy example in a different writeup which involves a much larger, much less symmetric, slightly less contrived problem. (Hint: what if the 16 wine bottles were, say, ~2300 5-letter words?).
Written on August 11, 2025