Human players do not always play the equilibrium strategy. Laboratory experiments reveal several factors that make players deviate from it, especially when matching pennies is played repeatedly:

• Humans are not good at randomizing. They may try to produce "random" sequences by switching between Heads and Tails, but they switch too often (due to the gambler's fallacy). This lets expert players predict their next action with more than a 50% chance of success, so a positive expected payoff might be attainable.
• Humans are trained to detect patterns. They try to detect patterns in the opponent's sequence, even when no such patterns exist, and adjust their strategy accordingly.
• Humans' behavior is affected by framing effects. When the Odd player is named "the misleader" and the Even player is named "the guesser", the former focuses on trying to randomize and the latter focuses on trying to detect a pattern, which increases the guesser's chances of success. Additionally, the fact that Even wins when there is a match gives Even an advantage, since people are better at matching than at mismatching (due to the stimulus-response compatibility effect).

Moreover, when the payoff matrix is asymmetric, other factors influence human behavior even when the game is not repeated:

• Players tend to increase the probability of playing an action that gives them a higher payoff; e.g., in the payoff matrix above, Even will tend to play Heads more often. This is intuitively understandable, but it is not a Nash equilibrium: as explained above, the mixing probability of a player should depend only on the other player's payoff, not on the player's own payoff. This deviation can be explained as a quantal response equilibrium. In a quantal response equilibrium, the best-response curves are not sharp as in a standard Nash equilibrium. Rather, each player's probability of choosing an action changes smoothly between 0 and 1. In other words, while in a Nash equilibrium a player chooses the best response with probability 1 and the worst response with probability 0, in a quantal response equilibrium the player chooses the best response with a high probability that is smaller than 1 and the worst response with a small probability that is greater than 0. The equilibrium point is the intersection of the smoothed curves of the two players, which differs from the Nash equilibrium point.
• The own-payoff effects are mitigated by risk aversion. Players tend to underestimate high gains and overestimate high losses; this moves the quantal response curves and changes the quantal response equilibrium point. This apparently contradicts theoretical results regarding the irrelevance of risk aversion in finitely repeated zero-sum games.

== Real-life data ==
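The quantal response equilibrium described in the preceding section can be illustrated with a small numerical sketch. It rests on assumptions that are not stated in the text: the smoothed best response is taken to be the logit (logistic) form with a precision parameter `lam`, Even is assumed to receive a payoff `a` (here 7) when both players show Heads, and all other payoffs are assumed to be +1 or -1 as in the standard game. The names `sigmoid`, `logit_qre`, `a` and `lam` are hypothetical and chosen only for this example.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def logit_qre(a=7.0, lam=1.0, iters=100):
    """Return (p, q): the probabilities that Even and Odd play Heads.

    Assumed payoffs (illustrative only): Even gets `a` on Heads/Heads,
    +1 on Tails/Tails and -1 on a mismatch; Odd gets +1 on a mismatch
    and -1 on a match.
    """

    def odd_heads(p):
        # Odd's expected advantage of Heads over Tails is 2 - 4p.
        return sigmoid(lam * (2.0 - 4.0 * p))

    def gap(p):
        # Even's expected advantage of Heads over Tails is (a + 3)q - 2.
        q = odd_heads(p)
        return sigmoid(lam * ((a + 3.0) * q - 2.0)) - p

    # gap is strictly decreasing in p with gap(0) > 0 > gap(1),
    # so bisection converges to the unique fixed point.
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if gap(mid) > 0:
            lo = mid
        else:
            hi = mid
    p = (lo + hi) / 2.0
    return p, odd_heads(p)
```

Under these assumptions, `logit_qre(a=7.0)` returns a probability of Heads for Even that is above 0.5, whereas the Nash mixing probability for Even stays at 0.5 regardless of `a`. This is the own-payoff effect in miniature. With `a=1.0` the game is symmetric and both probabilities come out at 0.5.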