Richard S. Sutton
Richard Stuart Sutton FRS FRSC (born 1957 or 1958) is a Canadian computer scientist. He is a professor of computing science at the University of Alberta, fellow & Chief Scientific Advisor at the Alberta Machine Intelligence Institute, and a research scientist at Keen Technologies. Sutton is considered one of the founders of modern computational reinforcement learning. In particular, he contributed to temporal difference learning and policy gradient methods. He received the 2024 Turing Award with Andrew Barto.
Richard Sutton was born in either 1957 or 1958 in Ohio, and grew up in Oak Brook, Illinois, a suburb of Chicago, United States.
Sutton received his Bachelor of Arts (BA) degree in psychology from Stanford University in 1978 before earning a Master of Science (1980) and a PhD (1984) in computer science from the University of Massachusetts Amherst, supervised by Andrew Barto. His doctoral dissertation introduced actor-critic architectures and temporal credit assignment.
He was influenced by Harry Klopf's work in the 1970s, which proposed that supervised learning is insufficient for AI or for explaining intelligent behavior, and that trial-and-error learning, driven by "hedonic aspects of behavior", is necessary. This focused his interest on reinforcement learning.
Sutton held a postdoctoral research position at the University of Massachusetts Amherst in 1984. He worked at GTE Laboratories in Waltham, Massachusetts as principal member of technical staff from 1985 to 1994, then returned to the University of Massachusetts Amherst as a senior research scientist. He joined AT&T Labs Shannon Laboratory in Florham Park, New Jersey as principal technical staff member from 1998 to 2002. He has been a professor of computing science at the University of Alberta since 2003, where he helped establish the Reinforcement Learning and Artificial Intelligence Laboratory. In 2017 he became a distinguished research scientist with Google DeepMind and helped launch DeepMind Alberta in Edmonton, a research office operated in close collaboration with the University of Alberta.
Sutton joined Andrew Barto at UMass in the early 1980s to explore the behavior of neurons in the human brain as the basis for human intelligence, a concept that had been advanced by computer scientist A. Harry Klopf. Sutton and Barto developed the concept mathematically and used it as a basis for artificial intelligence. This approach became known as reinforcement learning and went on to become a key part of artificial intelligence techniques.
Barto and Sutton used Markov decision processes (MDPs) as the mathematical foundation to explain how agents (algorithmic entities) make decisions in a stochastic (random) environment, receiving a reward after every action. Traditional MDP theory assumed that agents knew everything about the MDP when attempting to maximize their cumulative reward. Barto and Sutton's reinforcement learning techniques allowed both the environment and the rewards to be unknown, and thus allowed this category of algorithms to be applied to a wide array of problems.
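As a rough illustration of learning in an MDP whose dynamics and rewards are unknown to the agent, the sketch below runs tabular Q-learning on a small chain of states. Q-learning is due to Watkins and is used here only as a familiar example of the general idea; it is not presented as the specific method of Barto and Sutton. The chain environment, the slip probability, and all names and constants (`step`, `q_learning`, `SLIP`, `GAMMA`, `ALPHA`) are assumptions made for this example.

```python
import random

N_STATES = 5        # hypothetical chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)    # 0 = move left, 1 = move right
SLIP = 0.1          # assumed probability that the chosen action is reversed
GAMMA = 0.9         # discount factor
ALPHA = 0.05        # learning rate


def step(state, action, rng):
    """Environment dynamics. The learner never reads SLIP or the reward
    rule directly; it only observes the sampled (next_state, reward)."""
    if rng.random() < SLIP:
        action = 1 - action
    next_state = max(0, state - 1) if action == 0 else state + 1
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def q_learning(episodes=5000, seed=0):
    """Estimate action values from sampled transitions alone."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            # Behave uniformly at random; Q-learning is off-policy, so it
            # can still estimate the values of the greedy policy.
            action = rng.choice(ACTIONS)
            next_state, reward = step(state, action, rng)
            # Move the estimate toward the one-step bootstrapped target.
            target = reward + GAMMA * max(q[next_state])
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
    return q


q = q_learning()
# Greedy action in each non-terminal state under the learned values.
greedy_policy = [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
print(greedy_policy)
```

The update rule uses only observed transitions and rewards, which is the sense in which the environment can remain unknown to the agent; a planning method from traditional MDP theory would instead need the transition probabilities and reward function up front.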
Sutton returned to Canada in the 2000s and continued working on the topic, which continued to develop in academic circles. One of its first major real-world applications was Google's AlphaGo program, which was built on this concept and defeated the then-prevailing human champion. Barto and Sutton are widely credited as pioneers of modern reinforcement learning, a technique that has been foundational to the AI boom.