Trapped in Choice: Unraveling the Prisoner's Dilemma

The Prisoner's Dilemma is one of the most famous games in game theory. It was devised in 1950 by Merrill Flood and Melvin Dresher of the RAND Corporation and later formalized and named by the Canadian-born mathematician Albert William Tucker. The Prisoner's Dilemma provides a framework for understanding the tension between cooperation and competition and is a useful tool for strategic decision-making [1]. It is therefore applied in many fields, from business, finance, economics and politics to philosophy, psychology, biology and sociology.



Let us start with the payoff matrix shown in the following table. Each "payoff" is the length of a prison sentence in years, written as a negative number, so a higher (less negative) figure is preferable. To "cooperate" means to stay loyal to the other suspect by remaining silent; to "defect" means to betray the other suspect by confessing.

                           Player 2
                      Cooperate    Defect
Player 1   Cooperate   -1, -1      -10, 0
           Defect       0, -10     -5, -5

(Each cell lists Player 1's payoff first, then Player 2's.)

In the classic formulation of the prisoner's dilemma, two suspects are apprehended and placed in separate interrogation rooms, eliminating any possibility of communication between them. Each suspect faces a crucial decision: to cooperate with their accomplice by remaining silent or to defect by betraying the other in exchange for a potentially lighter sentence. The choices made by each prisoner are concealed from the other, adding a layer of uncertainty to their decision-making process.

Analyze The Game

The outcome depends on the combination of their actions. If both suspects cooperate by remaining silent, they each receive a relatively light sentence (for example, 1 year each), as the authorities have limited evidence to convict them on the primary charges. If one suspect defects and the other cooperates, the defector is rewarded with freedom (0 years), often as part of a plea deal, while the cooperator receives the harshest possible sentence (10 years) for their loyalty. If both suspects defect, each attempting to secure their own release at the expense of the other, they both receive moderately severe sentences (5 years each), as their mutual betrayal provides sufficient testimony to convict them both.


  1. If Player 1 Cooperates:
    • If Player 2 Cooperates: Both Player 1 and Player 2 receive a payoff of -1. This outcome suggests a lighter sentence compared to most other outcomes, reflecting mutual cooperation.
    • If Player 2 Defects: Player 1 suffers significantly, receiving a payoff of -10, indicating a very harsh sentence, while Player 2 walks free with a payoff of 0.
  2. If Player 1 Defects:
    • If Player 2 Cooperates: Player 1 gains a payoff of 0 (indicating freedom), while Player 2 receives the harshest penalty of -10.
    • If Player 2 Defects: Both players end up with a payoff of -5. This outcome is worse than mutual cooperation but better than being the sole cooperator against a defector.
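The four cases above can be written down as a small lookup table. The following is a minimal sketch in Python; the names PAYOFFS and outcome and the labels "C" (cooperate) and "D" (defect) are illustrative choices made here, not part of the original formulation.

```python
# Payoffs as (Player 1, Player 2), in negative years of prison.
# PAYOFFS, outcome, "C" and "D" are illustrative names chosen for this sketch.
PAYOFFS = {
    ("C", "C"): (-1, -1),    # both stay silent
    ("C", "D"): (-10, 0),    # Player 1 silent, Player 2 confesses
    ("D", "C"): (0, -10),    # Player 1 confesses, Player 2 silent
    ("D", "D"): (-5, -5),    # both confess
}


def outcome(p1_action, p2_action):
    """Return the (Player 1, Player 2) payoffs for one pair of actions."""
    return PAYOFFS[(p1_action, p2_action)]


if __name__ == "__main__":
    for (a1, a2), (u1, u2) in PAYOFFS.items():
        print(f"P1={a1}, P2={a2} -> P1 gets {u1}, P2 gets {u2}")
```

Calling outcome("C", "D"), for example, returns (-10, 0): Player 1 serves ten years while Player 2 walks free.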

Dominant Strategies

From Player 1's perspective:

Defecting is the dominant strategy because it consistently results in better payoffs compared to cooperating. If Player 2 cooperates, defecting leads to freedom (0 vs. -1 if Player 1 cooperates). If Player 2 defects, defecting minimizes the sentence (-5 vs. -10 if Player 1 cooperates).

From Player 2's perspective:

Similarly, defecting remains the dominant strategy. Defecting either results in freedom (0 vs. -1 if Player 2 cooperates while Player 1 cooperates) or a lesser penalty (-5 vs. -10 if Player 2 cooperates while Player 1 defects).
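The comparison above can also be checked mechanically. The sketch below, again with illustrative names (PAYOFFS, payoff, strictly_dominant), looks for an action that gives a player a strictly higher payoff than any alternative, whatever the other player does.

```python
# Payoffs as (Player 1, Player 2); names here are illustrative, not standard.
PAYOFFS = {
    ("C", "C"): (-1, -1),
    ("C", "D"): (-10, 0),
    ("D", "C"): (0, -10),
    ("D", "D"): (-5, -5),
}
ACTIONS = ("C", "D")


def payoff(player, own, other):
    """Payoff to `player` (0 or 1) when they play `own` and the opponent plays `other`."""
    profile = (own, other) if player == 0 else (other, own)
    return PAYOFFS[profile][player]


def strictly_dominant(player):
    """Return the action that beats every alternative against every
    opponent action, or None if no such action exists."""
    for a in ACTIONS:
        if all(
            payoff(player, a, o) > payoff(player, b, o)
            for b in ACTIONS if b != a
            for o in ACTIONS
        ):
            return a
    return None


if __name__ == "__main__":
    print(strictly_dominant(0), strictly_dominant(1))  # prints: D D
```

With these payoffs, the check returns "D" for both players, matching the reasoning above.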


The analysis of the payoff matrix shows that, for both Player 1 and Player 2, defecting is the rational, dominant strategy regardless of the other player's action. Defecting protects against the harshest penalty and keeps the best outcome (freedom) within reach. The dilemma lies in the fact that if both players chose to cooperate (each receiving -1), they would both fare better than if both chose to defect (each receiving -5). This is the classic conflict between individual rationality and mutual benefit, and it is central to understanding strategic decision-making in many contexts.

In economic theory, the concepts of Nash equilibrium and Pareto efficiency are fundamental to understanding strategic decisions and market behavior. A Nash equilibrium, a key concept in game theory, occurs when no participant in a game can improve their outcome by changing strategy while the other players' strategies remain unchanged. A Nash equilibrium does not necessarily give the players the best possible outcome [2]. In the Prisoner's Dilemma, mutual defection is the only Nash equilibrium: neither suspect can shorten their own sentence by switching to silence alone. The concept is often used to describe the stability of strategic outcomes in games and competitive settings, including free markets. Free markets, driven by individual decisions aimed at personal gain, tend to evolve toward Nash equilibrium states. Each market participant adjusts their strategy to the prevailing market conditions and the actions of others, until no unilateral change would be beneficial. The result is an outcome in which each player is doing the best they can given the choices of the others.
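For a game as small as the Prisoner's Dilemma, the definition can be applied directly by checking every pair of actions. The sketch below assumes the payoffs used earlier in this post (-1, -10, 0 and -5); the names PAYOFFS, is_nash and nash_equilibria are illustrative.

```python
from itertools import product

# Payoffs as (Player 1, Player 2); names here are illustrative.
PAYOFFS = {
    ("C", "C"): (-1, -1),
    ("C", "D"): (-10, 0),
    ("D", "C"): (0, -10),
    ("D", "D"): (-5, -5),
}
ACTIONS = ("C", "D")


def is_nash(profile):
    """True if neither player gains by changing only their own action."""
    a1, a2 = profile
    u1, u2 = PAYOFFS[profile]
    p1_cannot_gain = all(PAYOFFS[(d, a2)][0] <= u1 for d in ACTIONS)
    p2_cannot_gain = all(PAYOFFS[(a1, d)][1] <= u2 for d in ACTIONS)
    return p1_cannot_gain and p2_cannot_gain


def nash_equilibria():
    """List all pure-strategy Nash equilibria of the 2x2 game."""
    return [p for p in product(ACTIONS, repeat=2) if is_nash(p)]


if __name__ == "__main__":
    print(nash_equilibria())  # prints: [('D', 'D')]
```

This brute-force check only considers pure strategies, which is enough here because each player has a strictly dominant action.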

However, this equilibrium does not necessarily coincide with Pareto efficiency, another pivotal concept in economics. An allocation is Pareto efficient when no reallocation can make someone better off without making someone else worse off [3]. The Prisoner's Dilemma is the textbook case of the gap between the two: mutual defection is a Nash equilibrium, yet both suspects would be better off under mutual cooperation. In many cases, the free market's Nash equilibria are likewise not Pareto efficient. The self-interested choices that drive market participants toward a Nash equilibrium can leave resources poorly allocated and potential improvements in total welfare unrealized.
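The same kind of check works for Pareto efficiency. The sketch below assumes the payoffs used earlier in this post and treats an outcome as Pareto efficient when no other outcome gives at least as much to both players and strictly more to at least one. The function names are illustrative.

```python
# Payoffs as (Player 1, Player 2); names here are illustrative.
PAYOFFS = {
    ("C", "C"): (-1, -1),
    ("C", "D"): (-10, 0),
    ("D", "C"): (0, -10),
    ("D", "D"): (-5, -5),
}


def pareto_dominates(u, v):
    """True if payoffs `u` are at least as good as `v` for everyone
    and strictly better for at least one player."""
    return all(x >= y for x, y in zip(u, v)) and any(x > y for x, y in zip(u, v))


def pareto_efficient():
    """List the action profiles that no other profile Pareto-dominates."""
    return [
        p for p in PAYOFFS
        if not any(pareto_dominates(PAYOFFS[q], PAYOFFS[p]) for q in PAYOFFS)
    ]


if __name__ == "__main__":
    print(pareto_efficient())  # prints: [('C', 'C'), ('C', 'D'), ('D', 'C')]
```

Mutual defection is the only outcome missing from the list, because mutual cooperation leaves both players better off. It is also the game's only Nash equilibrium, which is exactly the divergence described above.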

The divergence between Nash equilibrium and Pareto efficiency in free markets highlights a critical insight of economic theory: while markets are powerful mechanisms for coordinating individual activities, they do not always lead to socially optimal outcomes. Market failures are examples where individual rationality leads to collective irrationality, necessitating external interventions like government regulations to realign the economy closer to Pareto efficiency. 


[1] Kuhn, Steven, "Prisoner’s Dilemma", The Stanford Encyclopedia of Philosophy (Winter 2019 Edition), Edward N. Zalta (ed.), URL =

[2] Nash equilibrium. Oxford Reference. Retrieved 8 May. 2024, from

[3] James Walsh, K.H. et al. (2022) Kenneth Arrow and the promise of Behavioral Development Economics, Brookings. Available at: (Accessed: 08 May 2024).

Thank you for visiting my blog! I am Stefanos Stavrianos, and I have studied at premier global universities. I hold a Specialization in Quantitative Finance from the Higher School of Economics in Moscow, and a Python 3 Programming Specialization from the University of Michigan. My academic interests encompass microeconomics, macroeconomics and monetary economics, with a research focus on financial crises.
