AlphaDev

AlphaDev is an artificial intelligence system developed by Google DeepMind to discover improved computer science algorithms using reinforcement learning. AlphaDev is based on AlphaZero, a system that mastered the games of chess, shogi and Go through self-play. AlphaDev applies the same approach to finding faster algorithms for fundamental tasks such as sorting and hashing.

Development

On June 7, 2023, Google DeepMind published a paper in Nature introducing AlphaDev, which discovered new algorithms that outperformed the state-of-the-art methods for small sort algorithms. An in-depth analysis of the discovered algorithms revealed two unique sequences of assembly instructions, called the AlphaDev swap and copy moves, each of which saves a single assembly instruction every time it is applied. The new sorting algorithms were incorporated into the libc++ implementation of the C++ Standard Library. This was the first change to the library's sorting algorithms in more than a decade and the first update to involve an algorithm discovered using AI.

Design

AlphaDev is built on top of AlphaZero, the reinforcement-learning model that DeepMind trained to master games such as Go and chess. The company's breakthrough was to treat the problem of finding a faster algorithm as a game and then train its AI to win it. AlphaDev plays a single-player game in which the objective is to iteratively build an assembly-language algorithm that is both fast and correct. AlphaDev uses a neural network to guide its search for optimal moves, and learns from its own experience and from synthetic demonstrations. AlphaDev showcases the potential of AI to advance the foundations of computing and to optimize code for different criteria. Google DeepMind hopes that AlphaDev will inspire further research on using AI to discover new algorithms and improve existing ones.

Algorithm

The primary learning algorithm in AlphaDev is an extension of AlphaZero.

Encoding assembly programming into a game

To apply AlphaZero to assembly programming, the authors created a Transformer-based vector representation of assembly programs designed to capture their underlying structure. This finite representation allows a neural network to play assembly programming like a game with finitely many possible moves (as in Go). The representation uses the following components:

• A Transformer network, which encodes the assembly program: opcodes are converted to one-hot encodings and concatenated to form the raw input sequence.
• A multilayer perceptron network, which encodes the "CPU state", that is, the state of each register and memory location for a given set of inputs.

Playing the game

The game state is the assembly program generated up to a given point. A game move is an extra instruction appended to the current assembly program. The game's reward is a function of the assembly program's correctness and latency.

To reduce cost, AlphaDev computes actual measured latency on fewer than 0.002% of generated programs, and it does not measure latency during the search process. Instead, it uses two functions that estimate correctness and latency, trained via supervised learning on the real measured correctness and latency values.
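The state, move and reward described above can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the three-opcode instruction set, the names `run`, `GameState` and `reward`, and the use of program length as a stand-in for latency. It is not AlphaDev's actual action space, reward function or search procedure.

```python
from dataclasses import dataclass
from itertools import permutations
from typing import List, Tuple

# Hypothetical toy instruction: (opcode, destination register, source register).
Instruction = Tuple[str, int, int]


def run(program: Tuple[Instruction, ...], registers: List[int]) -> List[int]:
    """Execute a toy program on a copy of the register file and return it."""
    regs = list(registers)
    for op, dst, src in program:
        if op == "mov":
            regs[dst] = regs[src]
        elif op == "min":
            regs[dst] = min(regs[dst], regs[src])
        elif op == "max":
            regs[dst] = max(regs[dst], regs[src])
        else:
            raise ValueError(f"unknown opcode: {op}")
    return regs


@dataclass(frozen=True)
class GameState:
    """The game state is the program built so far."""
    program: Tuple[Instruction, ...] = ()

    def play(self, move: Instruction) -> "GameState":
        """A move appends one instruction to the current program."""
        return GameState(self.program + (move,))


def reward(state: GameState, n_inputs: int = 2, length_penalty: float = 0.01) -> float:
    """Correctness on all test permutations minus a length-based latency proxy."""
    tests = list(permutations(range(1, n_inputs + 1)))
    passed = 0
    for test in tests:
        # One extra scratch register, initialised to zero.
        out = run(state.program, list(test) + [0])
        if out[:n_inputs] == sorted(test):
            passed += 1
    correctness = passed / len(tests)
    return correctness - length_penalty * len(state.program)


# Build a two-element sort one move at a time.
state = GameState()
for move in [("mov", 2, 0), ("min", 0, 1), ("max", 1, 2)]:
    state = state.play(move)
    print(len(state.program), round(reward(state), 2))
```

In this toy setting the reward only reaches its maximum once the program sorts every test input, and each extra instruction lowers it slightly, which mirrors the trade-off between correctness and speed that the game is built around.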

Result

Hashing

AlphaDev developed a hashing algorithm for inputs from 9 to 16 bytes, which was added to Abseil, an open-source collection of prewritten C++ algorithms.

LLVM standard sorting library

AlphaDev discovered new sorting algorithms, which led to improvements of up to 70% in the LLVM libc++ sorting library for shorter sequences and of about 1.7% for sequences exceeding 250,000 elements. These improvements apply to the uint32, uint64 and float data types on the ARMv8, Intel Skylake and AMD Zen 2 CPU architectures. AlphaDev's branchless conditional assembly and new swap move contributed to these performance improvements. The discovered algorithms were reverse-engineered from low-level assembly to C++ and have officially been included in the libc++ standard sorting library.

VarInt deserialization

AlphaDev also learned an optimized VarInt deserialization function for Protocol Buffers (protobuf), outperforming the human benchmark for single-valued inputs by approximately three times in terms of speed. In doing so, AlphaDev discovered a new VarInt assignment move, which combines two operations into a single instruction for latency savings.

Comparison with logical AI approach

AlphaDev's performance was compared to stochastic superoptimization, a logical AI approach. The stochastic superoptimization baseline, referred to as AlphaDev-S, was run with at least the same amount of resources and wall-clock time as AlphaDev. The results showed that AlphaDev-S requires a prohibitive amount of time to optimize directly for latency, as latency needs to be computed after every mutation. AlphaDev-S therefore optimizes for a latency proxy, specifically algorithm length, and at the end of training all correct programs it has generated are searched through.
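The branchless, fixed-length sorting routines mentioned in the sorting results can be illustrated with a textbook three-element sorting network. This is a minimal sketch and not the instruction sequence AlphaDev discovered. The function names `compare_exchange` and `sort3` are hypothetical, and `min`/`max` stand in for the conditional-move instructions that a branchless assembly implementation would use.

```python
from itertools import permutations


def compare_exchange(a, b):
    """Return the pair in ascending order without an explicit if/else branch."""
    return min(a, b), max(a, b)


def sort3(a, b, c):
    """Sort three values with a fixed sequence of three compare-exchange steps."""
    a, b = compare_exchange(a, b)
    b, c = compare_exchange(b, c)
    a, b = compare_exchange(a, b)
    return a, b, c


# Exhaustively check every ordering of three distinct values.
for perm in permutations([1, 2, 3]):
    assert sort3(*perm) == (1, 2, 3)
```

Because the sequence of operations is the same for every input, routines of this shape are naturally expressed as short, straight-line assembly programs, which is the kind of program where removing even a single instruction is measurable.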

Source: Wikipedia
