MuZero
MuZero is a computer program developed by artificial intelligence research company DeepMind to master games without knowing their rules. Its release in 2019 included benchmarks of its performance in Go, chess, shogi, and a standard suite of Atari games. The algorithm uses an approach similar to AlphaZero's. It matched AlphaZero's performance in chess and shogi, improved on its performance in Go, and improved on the state of the art in mastering a suite of 57 Atari games (the Arcade Learning Environment), a visually complex domain.
MuZero was trained via self-play, with no access to rules, opening books, or endgame tablebases. The trained algorithm used the same convolutional and residual architecture as AlphaZero, but with 20 percent fewer computation steps per node in the search tree.
MuZero really is discovering for itself how to build a model and understand it just from first principles.
— David Silver, DeepMind, Wired
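The "convolutional and residual architecture" shared with AlphaZero is built from stacked residual blocks. The sketch below is a minimal PyTorch illustration of one such block, not DeepMind's code; the channel width, kernel size, and normalization choices are assumptions.

```python
# Minimal sketch of a residual convolutional block of the kind stacked in
# AlphaZero- and MuZero-style networks. Illustrative only: channel width,
# kernel size, and normalization are assumptions, not DeepMind's settings.
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels: int = 256):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        # Skip connection: adding the input back is what makes the block "residual".
        return torch.relu(out + x)
```

Each block leaves the spatial shape unchanged, so a tower of them can be made deep; the skip connections keep gradients flowing through that tower during training.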
On November 19, 2019, the DeepMind team released a preprint introducing MuZero.
MuZero (MZ) is a combination of the high-performance planning of the AlphaZero (AZ) algorithm with approaches to model-free reinforcement learning. The combination allows for more efficient training in classical planning regimes, such as Go, while also handling domains with much more complex inputs at each stage, such as visual video games.
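Concretely, the MuZero paper describes three learned functions: a representation function h that encodes an observation into a latent state, a dynamics function g that predicts the next latent state and reward for a chosen action, and a prediction function f that outputs a policy and value for a latent state. Planning then unrolls entirely in this latent space. The sketch below follows that h/g/f structure, but the function bodies are placeholder stubs, not the trained networks, and the latent size and action space are made up for illustration.

```python
# Hedged sketch of MuZero-style latent-space planning. The h/g/f structure
# follows the MuZero paper; the bodies are illustrative stubs, not the real
# trained networks.
from typing import List, Tuple

def representation(observation) -> List[float]:
    """h: encode the raw observation into a latent state."""
    return [0.0] * 8  # placeholder latent vector

def dynamics(state: List[float], action: int) -> Tuple[List[float], float]:
    """g: predict the next latent state and immediate reward for an action."""
    next_state = [s + 0.01 * action for s in state]
    return next_state, 0.0

def prediction(state: List[float]) -> Tuple[List[float], float]:
    """f: predict a policy over actions and a value for a latent state."""
    return [0.25, 0.25, 0.25, 0.25], 0.0

def unroll(observation, actions: List[int]) -> float:
    """Score a candidate action sequence using only the learned model:
    after the initial observation, no game rules or simulator are consulted."""
    state = representation(observation)
    total_return = 0.0
    for action in actions:
        state, reward = dynamics(state, action)
        total_return += reward
    _, value = prediction(state)  # bootstrap the tail of the rollout with the value estimate
    return total_return + value

print(unroll(observation=None, actions=[0, 1, 2]))  # e.g. one branch of a search tree
```

In the full algorithm such rollouts are organized by the same Monte Carlo tree search used in AlphaZero; the difference is that each edge of the tree is expanded with the learned dynamics function rather than a rules-based simulator.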
MuZero was derived directly from AZ code, sharing its rules for setting hyperparameters. The approaches differ in two main ways. AZ plans with a simulator that knows the game's rules and must be explicitly programmed, whereas MZ has no access to the rules and instead learns a model of the environment with neural networks. AZ also uses a single network that maps a board state to policy and value predictions, whereas MZ uses separate networks for representing the current state, for the dynamics of states under actions, and for predicting the policy and value of a position.
The previous state-of-the-art technique for learning to play the suite of Atari games was R2D2, the Recurrent Replay Distributed DQN; MuZero surpassed its mean and median performance across the suite, though it did not do better in every game.