The book is divided into three sections: Prophecy, Agency, and Normativity. Each section covers researchers and engineers working on different challenges in the alignment of artificial intelligence with human values.
===Prophecy===
In the first section, Christian interweaves the history of artificial intelligence research, particularly the machine learning approach of artificial neural networks such as the Perceptron and AlexNet, with examples of how AI systems can behave in unintended ways. He tells the story of Julia Angwin, a journalist whose ProPublica investigation of the COMPAS algorithm, a tool for predicting recidivism among criminal defendants, led to widespread criticism of its accuracy and of its bias against certain demographics. One of AI's main alignment challenges is its black box nature: inputs and outputs are identifiable, but the transformation process in between is opaque. This lack of transparency makes it difficult to know where the system is going right and where it is going wrong.
===Agency===
In the second section, Christian similarly interweaves the history of the psychological study of reward, including behaviorism and research on dopamine, with the computer science of reinforcement learning, in which AI systems need to develop a policy ("what to do") in the face of a value function ("what rewards or punishments to expect"). He calls the DeepMind AlphaGo and AlphaZero systems "perhaps the single most impressive achievement in automated curriculum design." He also highlights the importance of curiosity, in which reinforcement learners are intrinsically motivated to explore their environment rather than exclusively seeking external reward.
===Normativity===
The third section covers training AI through the imitation of human or machine behavior, as well as philosophical debates, such as that between possibilism and actualism, that imply different ideal behavior for AI systems. Of particular importance is inverse reinforcement learning, a broad approach for machines to learn the objective function of a human or another agent. Christian discusses the normative challenges associated with effective altruism and existential risk, including the work of philosophers Toby Ord and William MacAskill, who are trying to devise human and machine strategies for navigating the alignment problem as effectively as possible.

==Reception==