TITLE: Solving a Fun Little Puzzle with Analysis and Simulation AUTHOR: Eugene Wallingford DATE: March 07, 2016 5:12 PM DESC: ----- BODY: I'm on a mailing list of sports fans who happen also to be geeks of various kinds, including programmers and puzzle nuts. Last Friday, one of my friends posted this link and puzzle to the list:
http://fivethirtyeight.com/features/can-you-win-this-hot-new-game-show/

Two players go on a hot new game show called "Higher Number Wins". The two go into separate booths, and each presses a button, and a random number between zero and one appears on a screen. (At this point, neither knows the other's number, but they do know the numbers are chosen from a standard uniform distribution.) They can choose to keep that first number, or to press the button again to discard the first number and get a second random number, which they must keep. Then, they come out of their booths and see the final number for each player on the wall. The lavish grand prize -- a case full of gold bouillon -- is awarded to the player who kept the higher number. Which number is the optimal cutoff for players to discard their first number and choose another? Put another way, within which range should they choose to keep the first number, and within which range should they reject it and try their luck with a second number?
From there, the conversation took off quiickly with a lot of intuition and analysis. There was immediate support for the intuitive threhold of 0.5, which a simple case analysis shows to give the maximum expected value for a player, 0.625. Some head-to-head analysis of various combinations, however, showed other values winning more often than 0.5, with values around 0.6 doing the best. What was up? One of these analyses was wrong, but we weren't sure which. One list member, who had built a quick model in Excel, said,
I think the optimum may be somewhere around .61, but I'm damned if I know why.
Another said,
I can't help thinking we're ignoring something game theoretical. I'm expecting we've all arrived at the most common wrong answer.
To confirm his suspicions, this person went off and wrote a C program -- a "terrible, awful, ugly, horrible C program" -- to compute all the expected values for all possible head-to-head cases. He announced,
We have a winner, according to my program. ... 61 wins.
He posted a snippet of the output from his program, which showed a steady rise in the win percentages for cutoffs up to a threshold of 0.61, which beat the 99 other cutoffs, with a steady decline for cutoffs thereafter. Before reading the conversation on the mailing list, I discussed the puzzle with my wife. We were partial to 0.5, too, but that seemed too simple... So I sat down and wrote a program of my own. My statistics skills are not as strong as many of my friends, and for this reason I like to write programs that simulate the situation at hand. My Racket program creates players who use all possible thresholds, plays matches of 1,000,000 games between each pair, and tallies up the results. Like the C program written by my buddy, my program is quick and dirty; it replays all hundred combinations on each pass, without taking advantage of the fact that the tournament matrix is symmetric. It's slower than it needs to be, but it gets the job done.
Player 57 defeats 95 opponents.
Player 58 defeats 96 opponents.
Player 59 defeats 99 opponents.
Player 60 defeats 100 opponents.
Player 61 defeats 98 opponents.
Player 62 defeats 96 opponents.
Player 63 defeats 93 opponents.
The results of my simulation mirrored the results of the brute-force case analysis. In simulation, 0.6 won, with 0.59 and 0.61 close behind. The two approaches gave similar enough results that it's highly likely there are bugs in neither program -- or both! Once my friends were confident that 0.5 was not the winner, they were able to diagnose the error in the reasoning that made us think it was the best we could do: Although a player's choice of strategies is independent of the other player's choice, we cannot treat the other player's value as a uniform distribution over [0..1]. That is true only when they choose a threshold of 0 or 1. In retrospect, this seems obvious, and maybe it was obvious to my mathematician friends right off the bat. But none of us on the mailing list is a professional statistician. I'm proud that we all stuck with the problem until we understood what was going on. I love how we can cross-check our intuitions about puzzles like this with analysis and simulation. There is a nice interplay between theory and empirical investigation here. A simple theory, even if incomplete or incorrect, gives us a first approximation. Then we run a test and use the results to go back and re-think our theory. The data helped us see the holes in our thinking. What works for puzzles also works for hairier problems out in the world, too. And we created the data we ended by writing a computer program. You know how much I like to do that. This the sort of situation we see when writing chess-playing programs and machine learning programs: We can write programs that are smarter than we are by starting from much simpler principles that we know and understand. This experience is also yet another reminder of why, if I ever go freelance as a consultant or developer, I plan to team up with someone who is a better mathematician than I am. Or at least find such a person to whom I can sub-contract a sanity check. -----