Go is considered much more difficult for computers to win than other games such as
chess, because its strategic and aesthetic nature makes it hard to directly construct an evaluation function, and its much larger
branching factor makes it prohibitively difficult to use traditional AI methods such as
alpha–beta pruning,
tree traversal and
heuristic search. Almost two decades after
IBM's computer
Deep Blue beat world chess champion
Garry Kasparov in the
1997 match, the strongest Go programs using
artificial intelligence techniques only reached about
amateur 5-dan level. In 2012, the software program
Zen, running on a four-PC cluster, beat
Masaki Takemiya (
9p) twice at five- and four-stone handicaps. In 2013,
Crazy Stone beat
Yoshio Ishida (9p) at a four-stone handicap. According to DeepMind's
David Silver, the AlphaGo research project was formed around 2014 to test how well a neural network using
deep learning can compete at Go. AlphaGo represents a significant improvement over previous Go programs. In 500 games against other available Go programs, including Crazy Stone and Zen, AlphaGo running on a single computer won all but one. In a similar matchup, AlphaGo running on multiple computers won all 500 games played against other Go programs, and 77% of games played against AlphaGo running on a single computer. The distributed version in October 2015 was using 1,202
CPUs and 176
GPUs. That month, the distributed version defeated the European Go champion Fan Hui, a
2-dan (out of 9 dan possible) professional, five to zero. This was the first time a computer Go program had beaten a professional human player on a full-sized board without handicap. The announcement of the news was delayed until 27 January 2016 to coincide with the publication of a paper in the journal
Nature describing the algorithms used. In March 2016, AlphaGo played a five-game match in Seoul against South Korean professional Lee Sedol, one of the strongest players in the world; the games were video-streamed live. AlphaGo won four of the five games; Lee won the fourth game, making him the only human player ever to beat AlphaGo in any of its 74 official games. AlphaGo ran on Google's cloud computing platform, with its servers located in the United States. The match used
Chinese rules with a 7.5-point
komi, and each side had two hours of thinking time plus three 60-second
byoyomi periods.
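For readers unfamiliar with area scoring, the role of komi can be sketched as follows (a simplified illustration with an invented helper name; real Chinese-rules scoring also handles dead stones and neutral points):

```python
def chinese_score_winner(black_area, white_area, komi=7.5):
    """Under Chinese (area) rules each side scores its stones on the board
    plus the empty points it surrounds; White adds komi as compensation
    for moving second.  Returns the result in the usual "B+x"/"W+x" form."""
    margin = black_area - (white_area + komi)
    return ("B+%.1f" % margin) if margin > 0 else ("W+%.1f" % -margin)

# On a 19x19 board the two areas sum to 361, so with a 7.5-point komi
# Black needs at least 185 points to win.
print(chinese_score_winner(184, 177))  # -> W+0.5
print(chinese_score_winner(185, 176))  # -> B+1.5
```

A half-integer komi such as 7.5 also guarantees that no game can end in a tie.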
The Economist reported that it used 1,920
CPUs and 280
GPUs. At the time of play, Lee Sedol had the second-highest number of Go international championship victories in the world after South Korean player
Lee Chang-ho, who held the world championship title for 16 years. Since there is no single official method of
ranking in international Go, rankings vary among sources: while some sources at times ranked Lee Sedol first, others placed him fourth-best in the world at the time. AlphaGo was not specifically trained to face Lee, nor was it designed to compete with any particular human player. AlphaGo won the first three games following resignations by Lee. However, Lee beat AlphaGo in the fourth game, winning by resignation at move 180. AlphaGo then took its fourth win, winning the fifth game by resignation. The prize was US$1 million; since AlphaGo won four out of five games and thus the series, the prize money was donated to charities, including
UNICEF. Lee Sedol received $150,000 for participating in all five games and an additional $20,000 for his win in Game 4. In June 2016, at a presentation held at a university in the Netherlands, Aja Huang, a member of the DeepMind team, revealed that they had patched the logical weakness that occurred during the fourth game of the match between AlphaGo and Lee, and that after move 78 (which was dubbed the "
divine move" by many professionals), the program would now play as intended and maintain Black's advantage. Before move 78, AlphaGo had been leading throughout the game, but Lee's move threw the program's computation off course. Huang explained that AlphaGo's policy network, which proposes the most promising move orders and continuations, did not guide AlphaGo to the correct continuation after move 78, since its value network had not rated Lee's 78th move as the most likely; when the move was made, AlphaGo therefore failed to adjust to the correct logical continuation.
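The interaction Huang describes can be sketched with the PUCT-style selection rule used in DeepMind's published search (the numbers and move names below are invented for illustration): a move with a low policy-network prior receives little exploration bonus, so the search keeps preferring moves the policy network considers likely even when the rare move evaluates slightly better.

```python
import math

def select_move(stats, c_puct=5.0):
    """PUCT-style selection: choose the move maximizing
    Q + c_puct * P * sqrt(total_visits) / (1 + visits),
    where P is the policy network's prior probability for the move and
    Q is the mean evaluation backed up from the value network.
    `stats` maps move -> (prior P, visit count N, mean value Q)."""
    total = sum(n for _, n, _ in stats.values())
    def score(move):
        p, n, q = stats[move]
        return q + c_puct * p * math.sqrt(total) / (1 + n)
    return max(stats, key=score)

# Invented numbers: a move the policy network considers very unlikely
# (cf. Lee's move 78) is still passed over despite a slightly better Q.
stats = {
    "likely":   (0.40, 100, 0.50),
    "unlikely": (0.01, 100, 0.52),
}
print(select_move(stats))  # -> likely
```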
==Sixty online games==
On 29 December 2016, a new account on the
Tygem server named "Magister" (shown as 'Magist' at the server's Chinese version) from South Korea began to play games with professional players. It changed its account name to "Master" on 30 December, then moved to the FoxGo server on 1 January 2017. On 4 January, DeepMind confirmed that the "Magister" and the "Master" were both played by an updated version of AlphaGo, called
AlphaGo Master. As of 5 January 2017, AlphaGo Master's online record was 60 wins and 0 losses, including three victories over Go's top-ranked player,
Ke Jie, who had been quietly briefed in advance that Master was a version of AlphaGo. The account subsequently changed its registered nationality from South Korea to the United Kingdom. After these games were completed, the co-founder of
DeepMind,
Demis Hassabis, said in a tweet, "we're looking forward to playing some official, full-length games later [2017] in collaboration with Go organizations and experts". Google DeepMind offered a US$1.5 million winner's prize for the three-game match between Ke Jie and Master, while the losing side received US$300,000. Master won all three games against Ke Jie, after which AlphaGo was awarded professional 9-dan rank by the Chinese Weiqi Association.
==AlphaGo Zero and AlphaZero==
AlphaGo's team published an article in the journal
Nature on 19 October 2017, introducing AlphaGo Zero, a version created without using data from human games, and stronger than any previous human-champion-defeating version. By playing games against itself, AlphaGo Zero surpassed the strength of
AlphaGo Lee in three days by winning 100 games to 0, reached the level of
AlphaGo Master in 21 days, and exceeded all previous versions in 40 days. In a paper released on
arXiv on 5 December 2017, DeepMind claimed that it generalized AlphaGo Zero's approach into a single AlphaZero algorithm, which achieved within 24 hours a superhuman level of play in the games of
chess,
shogi, and
Go by defeating world-champion programs,
Stockfish,
Elmo, and a three-day version of AlphaGo Zero, respectively.
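The core idea of learning with no human data can be illustrated with a toy sketch (this is our own minimal tabular example on a trivial take-away game, not DeepMind's method, which pairs a deep neural network with Monte Carlo tree search):

```python
import random

def self_play_train(pile=12, episodes=20000, alpha=0.1, epsilon=0.2, seed=0):
    """Toy illustration of learning from self-play alone, in the spirit
    of AlphaGo Zero.  Game: players alternately take 1-3 stones; whoever
    takes the last stone wins.  V[n] estimates the value of facing a
    pile of n stones, from the perspective of the player to move."""
    rng = random.Random(seed)
    V = [0.0] * (pile + 1)
    for _ in range(episodes):
        n, faced = pile, []
        while n > 0:
            moves = list(range(1, min(3, n) + 1))
            if rng.random() < epsilon:
                take = rng.choice(moves)                    # explore
            else:
                take = min(moves, key=lambda t: V[n - t])   # leave opponent the worst pile
            faced.append(n)
            n -= take
        result = 1.0                      # the last mover took the last stone and won
        for m in reversed(faced):         # credit alternates between the two sides
            V[m] += alpha * (result - V[m])
            result = -result
    return V

V = self_play_train()
# Piles divisible by 4 are theoretically lost for the player to move;
# pure self-play tends to discover this (negative values at 4, 8, 12).
print([round(v, 2) for v in V[1:]])
```

Both "players" here share one value table and improve together, which is the essence of self-play: the training opponent strengthens exactly as fast as the learner does.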
==Teaching tool==
On 11 December 2017, DeepMind released an AlphaGo teaching tool on its website to analyze winning rates of different
Go openings as calculated by
AlphaGo Master. The teaching tool provides analysis of 6,000 Go openings drawn from 230,000 human games, each analyzed with 10,000,000 simulations by AlphaGo Master. Many of the openings include human move suggestions.

==Versions==