# Entropic Games

While trying to come up with mathematically interesting variants of the notion of a game, I’ve thought of this: What if each player was trying to maximize the entropy of their moves? You can imagine that each player has some message that she wants to transmit to some third party that can observe the move-history of the game but cannot receive any other information from the players. Then each player must balance the considerations of playing randomly to transmit more entropy at the moment versus playing moves that ensure that she’ll have a higher-entropy space of moves in the future.

For example, here are the rules for the entropic game Hot Potato: The game is played with one potato, which at each moment is held by one of the two players. On a player’s turn, if she holds the potato she may either keep it or pass it to the other player. If she does not have the potato, she is forced to do nothing.

One strategy is for a player to give or keep the potato with even odds every time she has it. Then each move when she has the potato has an entropy of 1 bit, the maximum possible entropy for that move. However, this seems like a suboptimal strategy, since it doesn’t account for how keeping the potato is better than passing it: if you keep the potato you are guaranteed a choice between keeping it and passing it on your next turn, but if you pass it you risk a zero-entropy move next turn if your opponent doesn’t pass it back. Thus you should keep the potato with better than even odds. However, even if keeping the potato is the better move, you shouldn’t keep it all the time: that is a zero-entropy strategy.
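To make the trade-off concrete, here is a rough numerical sketch of a two-turn version of the decision. Everything specific in it is an assumption for illustration: the opponent is assumed to pass the potato back with probability `q_return = 0.5`, the player's second move is assumed to be a final fair coin flip worth 1 bit whenever she holds the potato, and the function names are made up for this example.

```python
import math

def binary_entropy(p):
    """Entropy in bits of a two-way choice made with probabilities p and 1 - p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def two_turn_entropy(p_keep, q_return=0.5):
    """Expected entropy (bits) over this turn and the player's next turn.

    Assumptions: the opponent passes the potato back with probability
    q_return, and the next turn is a fair coin flip (1 bit) if the player
    holds the potato and a forced move (0 bits) otherwise.
    """
    entropy_now = binary_entropy(p_keep)
    # She holds the potato next turn if she kept it, or passed and got it back.
    p_hold_next = p_keep + (1 - p_keep) * q_return
    return entropy_now + p_hold_next * 1.0

# Brute-force search over keep probabilities on a grid of step 0.001.
best_p = max((i / 1000 for i in range(1001)), key=two_turn_entropy)

print("even odds:  ", two_turn_entropy(0.5))
print("always keep:", two_turn_entropy(1.0))
print("best p_keep:", best_p, "->", two_turn_entropy(best_p))
```

Under these assumptions the search settles on a keep probability a little above one half (about 0.586), beating both the even-odds strategy and the always-keep strategy.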

An ordinary perfect-information two-player game can be made into an equivalent entropic game by giving the winning player a large entropic reward. That is, the winner is given an extra turn in which she can recite a large amount of information, more than is needed to describe any game history up to that move. Then the entropic incentive to reach a winning state is stronger than the entropic incentive to randomize your moves before that, so players will mostly play as if they’re trying to win.

Here’s a formal model of entropic games: Suppose a player has $n$ possible moves $M_0, \dots, M_{n-1}$. The player has analyzed that after playing $M_i$ the expected entropy from all moves after that is $S_i$. Suppose she plays $M_i$ with probability $p_i$. Then the total expected entropy is

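$S = \sum_{k=0}^{n-1} p_k (-\log p_k + S_k)$

Here $-\sum_k p_k \log p_k$ is the entropy of the current move and $\sum_k p_k S_k$ is the expected entropy of everything after it. The logarithm is natural, so entropy is measured in nats rather than bits. Maximizing this value subject to $\sum_k p_k = 1$ with a Lagrange multiplier $\lambda$ gives $-\log p_k - 1 + S_k = \lambda$ for every $k$, so we get that

$p_k = \frac{1}{Z} e^{S_k}$
$Z = \sum_{k=0}^{n-1} e^{S_k}$

Moves that lead to more future entropy are played more often, with $-S_k$ playing the role of an energy at unit temperature. Substituting back, the optimal value is $S = \log Z$.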

This is a Boltzmann distribution, like the probability distribution for a system in thermal equilibrium. Now, the anticipated entropies $S_k$ can be calculated by taking expected values, assuming the opponent picks her moves from a distribution of the same form, and so on recursively.
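As a sketch of how this recursion might look in practice, here is a backward induction for Hot Potato. It rests on several assumptions that are not part of the rules above: the game is cut off after a fixed number of turns (otherwise the total entropy is unbounded), the players strictly alternate turns, entropy is measured in nats, and each mover weights a move by the exponential of her own anticipated entropy after it, so that moves with more future entropy are played more often. The names `solve`, `boltzmann_keep` and `keep_probability` are invented for this example.

```python
import math
from functools import lru_cache

def boltzmann_keep(s_keep, s_pass):
    """Return (p_keep, value) for a mover whose anticipated future
    entropies after keeping and passing are s_keep and s_pass (nats)."""
    w_keep, w_pass = math.exp(s_keep), math.exp(s_pass)
    z = w_keep + w_pass
    # log(z) is the optimal move entropy plus expected future entropy.
    return w_keep / z, math.log(z)

@lru_cache(maxsize=None)
def solve(turns_left, mover, holder):
    """Expected future entropy (nats) of players 0 and 1, as a tuple,
    when `mover` is to play, `holder` has the potato, and `turns_left`
    turns remain before the assumed cutoff."""
    if turns_left <= 0:
        return (0.0, 0.0)
    other = 1 - mover
    if holder != mover:
        # Forced move: no choice and no entropy; the turn passes on.
        return solve(turns_left - 1, other, holder)
    after_keep = solve(turns_left - 1, other, mover)
    after_pass = solve(turns_left - 1, other, other)
    p_keep, mover_value = boltzmann_keep(after_keep[mover], after_pass[mover])
    values = [0.0, 0.0]
    values[mover] = mover_value
    # The other player just takes an expectation over the mover's choice.
    values[other] = p_keep * after_keep[other] + (1 - p_keep) * after_pass[other]
    return tuple(values)

def keep_probability(turns_left):
    """Probability that player 0, to move and holding the potato, keeps it."""
    if turns_left < 1:
        raise ValueError("need at least one turn left")
    after_keep = solve(turns_left - 1, 1, 0)
    after_pass = solve(turns_left - 1, 1, 1)
    p_keep, _ = boltzmann_keep(after_keep[0], after_pass[0])
    return p_keep

for t in (1, 3, 5, 9):
    print(t, "turns left: keep with probability", round(keep_probability(t), 4))
```

With one turn left there is no future to protect, so the sketch plays even odds. With three turns left it keeps the potato with probability $2/(2+\sqrt{2}) \approx 0.586$, in line with the intuition that keeping should be favored but never certain.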

Unlike the theory of ordinary combinatorial games, this theory extends straightforwardly to more than two players. In ordinary game theory with three players there’s an indeterminacy in kingmaker positions, where a losing player must choose which of the two other players to make the winner. If a player faces two equally good positions in an entropic game, however, she will choose each of them with equal probability.