Cooperation vs. competition
When multiple agents are acting in a shared environment, their interests might be aligned or misaligned. MARL allows exploring all the different alignments and how they affect the agents' behavior:
• In pure competition settings, the agents' rewards are exactly opposite to each other, and therefore they are playing against each other.
• Pure cooperation settings are the other extreme, in which agents get the exact same rewards, and therefore they are playing with each other.
• Mixed-sum settings cover all the games that combine elements of both cooperation and competition.
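The three categories above can be illustrated for the simplest case, a two-player matrix game in which each agent's reward is given by a payoff matrix. The following is a minimal sketch under that assumption; the function name `classify_two_player_game` and the example payoff values are hypothetical and chosen only for illustration.

```python
import numpy as np


def classify_two_player_game(payoff_a, payoff_b):
    """Classify a two-player matrix game by how the two rewards are aligned.

    payoff_a[i, j] and payoff_b[i, j] are the rewards of agents A and B when
    A plays action i and B plays action j (an illustrative convention).
    """
    a = np.asarray(payoff_a, dtype=float)
    b = np.asarray(payoff_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("payoff matrices must have the same shape")
    if np.allclose(a + b, 0.0):
        # Rewards are exactly opposite: one agent's gain is the other's loss.
        return "pure competition"
    if np.allclose(a, b):
        # Rewards are identical for every joint action.
        return "pure cooperation"
    return "mixed-sum"


# Matching pennies: rewards are exactly opposite.
PENNIES_A = np.array([[1, -1], [-1, 1]])
PENNIES_B = -PENNIES_A

# A simple coordination game: both agents receive the same reward.
COORDINATION = np.array([[1, 0], [0, 1]])

# Prisoner's dilemma (action 0 = cooperate, action 1 = defect).
DILEMMA_A = np.array([[-1, -3], [0, -2]])
DILEMMA_B = DILEMMA_A.T

if __name__ == "__main__":
    print(classify_two_player_game(PENNIES_A, PENNIES_B))
    print(classify_two_player_game(COORDINATION, COORDINATION))
    print(classify_two_player_game(DILEMMA_A, DILEMMA_B))
```

In this sketch the degenerate game in which every reward is zero satisfies both tests and is reported as pure competition; the order of the checks is an arbitrary choice.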
Pure competition settings
When two agents are playing a zero-sum game, they are in pure competition with each other. Many traditional games such as chess and Go fall under this category, as do two-player variants of video games like StarCraft. Because each agent can only win at the expense of the other agent, many complexities are stripped away. There is no prospect of communication or social dilemmas, as neither agent is incentivized to take actions that benefit its opponent.
The Deep Blue and AlphaGo projects demonstrate how to optimize the performance of agents in pure competition settings.
One complexity that is not stripped away in pure competition settings is autocurricula. As the agents' policies are improved using self-play, multiple layers of learning may occur.
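The idea of improving policies through self-play can be sketched on a small zero-sum game. The example below uses fictitious play, a classical procedure substituted here purely for illustration (it is not the method used by Deep Blue or AlphaGo): each agent repeatedly best-responds to the empirical action frequencies of its opponent. The function name `fictitious_self_play`, the choice of rock-paper-scissors, and the step count are assumptions.

```python
import numpy as np

# Row player's rewards for rock-paper-scissors (rows/columns: rock, paper,
# scissors). The column player's rewards are the negation, so the game is
# zero-sum.
RPS = np.array([[0, -1, 1],
                [1, 0, -1],
                [-1, 1, 0]], dtype=float)


def fictitious_self_play(payoff, steps=20000):
    """Run fictitious play between two agents on a zero-sum matrix game.

    Returns the empirical action frequencies of the row and column players.
    The counts start at one per action, which acts as a uniform prior.
    """
    n_rows, n_cols = payoff.shape
    row_counts = np.ones(n_rows)
    col_counts = np.ones(n_cols)
    for _ in range(steps):
        # Each agent models its opponent by the opponent's past frequencies.
        belief_about_col = col_counts / col_counts.sum()
        belief_about_row = row_counts / row_counts.sum()
        # Best responses: the row player maximizes its expected reward, the
        # column player maximizes the negated reward.
        row_action = np.argmax(payoff @ belief_about_col)
        col_action = np.argmax(-(belief_about_row @ payoff))
        row_counts[row_action] += 1
        col_counts[col_action] += 1
    return row_counts / row_counts.sum(), col_counts / col_counts.sum()


if __name__ == "__main__":
    row_freq, col_freq = fictitious_self_play(RPS)
    print("row frequencies:", row_freq)
    print("column frequencies:", col_freq)
```

In this game the actions chosen at each step keep cycling, because every best response is itself exploitable, while the empirical frequencies drift toward the uniform mixed strategy. Each agent's changing behavior continually poses a new problem for the other, which is a toy version of the layered learning described above.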
Pure cooperation settings
MARL is used to explore how separate agents with identical interests can communicate and work together. Pure cooperation settings are explored in recreational cooperative games such as Overcooked, as well as real-world scenarios in robotics.
In pure cooperation settings all the agents get identical rewards, which means that social dilemmas do not occur.
In pure cooperation settings, there can be an arbitrary number of coordination strategies, and agents converge to specific "conventions" when coordinating with each other. The notion of conventions has been studied in language and also alluded to in more general multi-agent collaborative tasks.
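The existence of several equally valid conventions can be made concrete with a shared-reward matrix game. The sketch below is a minimal illustration; the function name `pure_equilibria_common_payoff` and the "which side of the road to drive on" payoff values are hypothetical. It lists every joint action from which neither agent can improve the shared reward by deviating alone.

```python
import numpy as np


def pure_equilibria_common_payoff(shared_payoff):
    """Return all pure-strategy equilibria of a two-player common-payoff game.

    shared_payoff[i, j] is the reward both agents receive when the first
    agent plays action i and the second plays action j.
    """
    r = np.asarray(shared_payoff, dtype=float)
    equilibria = []
    for i in range(r.shape[0]):
        for j in range(r.shape[1]):
            # Neither agent can do better by changing only its own action.
            first_is_best = r[i, j] >= r[:, j].max()
            second_is_best = r[i, j] >= r[i, :].max()
            if first_is_best and second_is_best:
                equilibria.append((i, j))
    return equilibria


# Both drivers are rewarded only if they pick the same side
# (action 0 = left, action 1 = right).
DRIVING = np.array([[1, 0],
                    [0, 1]])

if __name__ == "__main__":
    print(pure_equilibria_common_payoff(DRIVING))  # [(0, 0), (1, 1)]
```

Both "everyone drives on the left" and "everyone drives on the right" are equilibria with the same reward, so the reward alone does not determine which one the agents settle on; the one they converge to is a convention in the sense used above.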
Mixed-sum settings
Most real-world scenarios involving multiple agents have elements of both cooperation and competition. For example, when multiple self-driving cars are planning their respective paths, each of them has interests that are diverging but not exclusive: each car is minimizing the amount of time it takes to reach its destination, but all cars have the shared interest of avoiding a traffic collision.
Zero-sum settings with three or more agents often exhibit similar properties to mixed-sum settings, since each pair of agents might have a non-zero utility sum between them.
Mixed-sum settings can be explored using classic matrix games such as prisoner's dilemma, more complex sequential social dilemmas, and recreational games such as Among Us, Diplomacy and StarCraft II. Mixed-sum settings can give rise to communication and social dilemmas.
== Social dilemmas ==