the first consumers of this book, and their feedback on all parts of the notation,
ideas, and manuscript has been invaluable.
We are very grateful for the excellent feedback – narrative and technical –
provided to us by Adam White and our anonymous reviewers, which allowed us
to make substantial improvements on the original draft. We thank Rich Sutton,
Andy Barto, Csaba Szepesvári, Kevin Murphy, Aaron Courville, Doina Precup,
Prakash Panangaden, David Silver, Joelle Pineau, and Dale Schuurmans for
discussions on book-writing and for serving as role models in taking on an effort
larger than anything else we had previously done. We appreciate the technical
and conceptual input of many of our colleagues at Google, DeepMind, Mila,
and beyond: Pierre-Luc Bacon, Hado van Hasselt, Thomas Degris, Tom Schaul,
Adam Oberman, Derek Nowrouzezahrai, Danny Tarlow, Bernardo Avila Pires,
Bilal Piot, Audrunas Gruslys, Volodymyr Mnih, Shie Mannor, Yoshua Bengio,
Sal Candido, Olivier Pietquin, Michael Bowling, and Jason Baldridge. We
further thank the many people who reviewed parts of this book and helped fill in
some of the gaps in our knowledge: Blake Richards, Chris Finlay, Yinlam Chow,
Erick Delage, Elliot Ludvig, Amir-massoud Farahmand, Jesse Farebrother,
Pierluca D’Oro, Simone Totaro, Tadashi Kozuno, Andrea Michi, Daniel Slater,
Tyler Kastner, Rylan Schaeffer, Karolis Ramanauskas, Jun Tian, Doug Eck, and
Hugo Larochelle. Finally, we thank Francis Bach, Elizabeth Swayze, and the
team at MIT Press for championing this work and making it a possibility.
Marc gives further thanks to Judy Loewen, Frédéric Lavoie, Jacqueline Smith,
Madeleine Fugère, Samantha Work, Damon MacLeod, and Andreas Fidjeland
for support along the scientific journey, and to Lauren Busheikin for being an
incredibly supportive partner for over a decade. Further thanks go to CIFAR
and the Mila academic community for providing the fertile scientific ground
from which the writing of this book began, and DeepMind and Google Brain
for providing support and inspiration to take on ever larger challenges.
Will wishes to additionally thank Zeb Kurth-Nelson and Matt Botvinick for
their patience and scientific rigor as we explored distributional RL in
neuroscience; Koray Kavukcuoglu and Demis Hassabis for their enthusiasm and
encouragement surrounding the project; Rémi Munos for supporting our pursuit
of random, risky research ideas; and Blair Lyonev for being a supportive partner,
providing both encouragement and advice surrounding the challenges of writing
a book.
Mark would like to thank Maciej Dunajski, Andrew Thomason, Adrian Weller,
Krzysztof Choromanski, Rich Turner, and John Aston for their supervision and
mentorship, and his family and Kristin Goffe for all their support.