Before OpenAI Five, AI systems had already competed successfully against humans in other games, such as Watson in Jeopardy!, Deep Blue in chess, and AlphaGo in Go. Compared with those games, Dota 2 differs in the following ways:
Long time horizon: The bots run at 30 frames per second for an average match time of 45 minutes, which results in about 80,000 ticks per game. OpenAI Five observes every fourth frame, yielding about 20,000 moves. By comparison, chess usually ends before 40 moves, while Go ends before 150 moves.
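The rounded figures above follow from simple arithmetic. The short Python sketch below reproduces them as an illustration only; the constant and function names are hypothetical and do not come from OpenAI's code.

```python
# Back-of-the-envelope check of the tick and move counts quoted above.
# All names here are illustrative, not taken from OpenAI Five itself.
FRAMES_PER_SECOND = 30
AVERAGE_MATCH_MINUTES = 45
FRAME_SKIP = 4  # the agent observes every fourth frame


def ticks_per_game(fps: int, minutes: int) -> int:
    """Total number of game ticks in a match of the given length."""
    return fps * minutes * 60


def moves_per_game(ticks: int, frame_skip: int) -> int:
    """Number of decisions made when acting once every `frame_skip` ticks."""
    return ticks // frame_skip


ticks = ticks_per_game(FRAMES_PER_SECOND, AVERAGE_MATCH_MINUTES)
moves = moves_per_game(ticks, FRAME_SKIP)
print(ticks, moves)  # 81000 20250
```

The exact values, 81,000 ticks and 20,250 moves, round to the 80,000 and 20,000 cited in the text.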
Partially observed state of the game: Players and their allies can only see the map directly around them. The rest is covered in a fog of war, which hides enemy units and their movements. Thus, playing Dota 2 requires making inferences from incomplete data, as well as predicting what the opponent could be doing at the same time. By comparison, chess and Go are "full-information games", as they do not hide elements from the opposing player.
Continuous action space: Each playable character in a Dota 2 game, known as a hero, can take dozens of actions that target either another unit or a position. The OpenAI Five developers discretize the space into 170,000 possible actions per hero. Not counting the continuous aspects of the game, there are an average of about 1,000 valid actions each tick. By comparison, the average number of available actions is 35 in chess and 250 in Go.
Continuous observation space: Dota 2 is played on a large map with ten heroes, five on each team, along with dozens of buildings and non-player character (NPC) units. The OpenAI system observes the state of a game through the developers' bot API, as 20,000 numbers that constitute all the information a human is allowed to access. By comparison, a chess board is represented by about 70 enumeration values, whereas a Go board requires about 400.

== Reception ==