Deep reinforcement learning

Deep reinforcement learning (deep RL) is a subfield of machine learning that combines reinforcement learning (RL) and deep learning. RL considers the problem of a computational agent learning to make decisions by trial and error. Deep RL incorporates deep learning into the solution, allowing agents to make decisions from unstructured input data without manual engineering of the state space. Deep RL algorithms are able to take in very large inputs (e.g. every pixel rendered to the screen in a video game) and decide what actions to perform to optimize an objective (e.g. maximizing the game score). Deep reinforcement learning has been used for a diverse set of applications including but not limited to robotics, video games, natural language processing, computer vision, education, transportation, finance and healthcare.

Deep learning is a form of machine learning that transforms a set of inputs into a set of outputs via an artificial neural network. Deep learning methods, often using supervised learning with labeled datasets, have been shown to solve tasks that involve complex, high-dimensional raw input data (such as images) with less manual feature engineering than prior methods, enabling significant progress in several fields including computer vision and natural language processing. In the past decade, deep RL has achieved remarkable results on a range of problems, from single-player and multiplayer games such as Go, Atari games, and Dota 2 to robotics.

Reinforcement learning is a process in which an agent learns to make decisions through trial and error. This problem is often modeled mathematically as a Markov decision process (MDP), where at every timestep an agent is in a state $s$, takes an action $a$, receives a scalar reward, and transitions to the next state $s'$ according to the environment dynamics $p(s'|s,a)$. The agent attempts to learn a policy $\pi (a|s)$, a mapping from observations to actions, that maximizes its expected return (the expected sum of rewards). In reinforcement learning (as opposed to optimal control), the algorithm has access to the dynamics $p(s'|s,a)$ only through sampling.
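The interaction loop described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the two-state MDP in `P`, its rewards, and the helper names `step`, `rollout`, and `estimate_return` are all hypothetical. The agent never reads `P` directly; it only calls `step`, which mirrors the point that the dynamics are available through sampling alone.

```python
import random

# Hypothetical two-state, two-action MDP, invented for illustration.
# P[state][action] is a list of (probability, next_state, reward) triples.
P = {
    0: {0: [(1.0, 0, 0.0)], 1: [(0.8, 1, 1.0), (0.2, 0, 0.0)]},
    1: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 2.0)]},
}

def step(state, action, rng):
    """Sample (next_state, reward) from p(s'|s,a); the agent only sees samples."""
    u = rng.random()
    cumulative = 0.0
    for prob, next_state, reward in P[state][action]:
        cumulative += prob
        if u < cumulative:
            return next_state, reward
    # Guard against floating-point round-off: fall back to the last outcome.
    _, next_state, reward = P[state][action][-1]
    return next_state, reward

def rollout(policy, horizon, rng, start_state=0):
    """Run one episode and return the (undiscounted) sum of rewards."""
    state, total = start_state, 0.0
    for _ in range(horizon):
        action = policy(state, rng)
        state, reward = step(state, action, rng)
        total += reward
    return total

def random_policy(state, rng):
    """Pick either action with equal probability, ignoring the state."""
    return rng.choice([0, 1])

def always_one(state, rng):
    """Always take action 1."""
    return 1

def estimate_return(policy, horizon=10, episodes=2000, seed=0):
    """Monte Carlo estimate of the expected return of a policy."""
    rng = random.Random(seed)
    return sum(rollout(policy, horizon, rng) for _ in range(episodes)) / episodes
```

Calling `estimate_return(always_one)` and `estimate_return(random_policy)` compares two fixed policies by their sampled returns; a learning algorithm would instead adjust the policy to increase this quantity.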

In many practical decision-making problems, the states $s$ of the MDP are high-dimensional (e.g., images from a camera or the raw sensor stream from a robot), and the resulting problems cannot be solved by traditional RL algorithms. Deep reinforcement learning algorithms incorporate deep learning to solve such MDPs, often representing the policy $\pi (a|s)$ or other learned functions as a neural network and developing specialized algorithms that perform well in this setting.
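As a rough sketch of what representing $\pi (a|s)$ as a neural network can look like, the following pure-Python example maps an observation vector to a probability distribution over discrete actions. The one-hidden-layer architecture, the tanh activation, the layer sizes, and all function names are assumptions chosen for brevity; practical systems use a deep learning library and much larger networks, and the weights here are random rather than trained.

```python
import math
import random

def init_policy(obs_dim, hidden_dim, n_actions, seed=0):
    """Randomly initialise the weights of a one-hidden-layer policy network."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        scale = 1.0 / math.sqrt(cols)
        return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]

    return {
        "W1": matrix(hidden_dim, obs_dim), "b1": [0.0] * hidden_dim,
        "W2": matrix(n_actions, hidden_dim), "b2": [0.0] * n_actions,
    }

def affine(W, b, x):
    """Compute W @ x + b for a list-of-lists matrix W."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def action_probs(params, obs):
    """Return pi(a|s) for every action a, given the observation vector obs."""
    hidden = [math.tanh(h) for h in affine(params["W1"], params["b1"], obs)]
    return softmax(affine(params["W2"], params["b2"], hidden))

def sample_action(params, obs, rng):
    """Draw one action index according to pi(.|s)."""
    probs = action_probs(params, obs)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

For example, `sample_action(init_policy(8, 16, 3), [0.1] * 8, random.Random(0))` returns an action index between 0 and 2. Training would adjust `W1`, `b1`, `W2`, and `b2` so that actions leading to higher return become more probable.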

Along with rising interest in neural networks beginning in the mid-1980s, interest grew in deep reinforcement learning, where a neural network is used in reinforcement learning to represent policies or value functions. Because in such a system a single neural network handles the entire decision-making process, from sensors to motors in a robot or agent, the approach is also sometimes called end-to-end reinforcement learning. One of the first successful applications of reinforcement learning with neural networks was TD-Gammon, a computer program developed in 1992 for playing backgammon. Four inputs were used for the number of pieces of a given color at a given location on the board, for a total of 198 input signals. With zero knowledge built in, the network learned to play the game at an intermediate level through self-play and TD($\lambda$).
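To give a feel for the TD($\lambda$) update mentioned above, here is a minimal sketch on a toy problem. It is not TD-Gammon: it uses a table of values instead of a neural network, and a hypothetical seven-state random walk instead of backgammon. The step size `alpha`, the trace decay `lam`, and the function name are illustrative assumptions.

```python
import random

def td_lambda_random_walk(episodes=2000, alpha=0.05, lam=0.8, gamma=1.0, seed=0):
    """Tabular TD(lambda) with accumulating eligibility traces.

    Toy task (an assumption for illustration): states 0..6 on a line,
    states 0 and 6 are terminal, every episode starts in state 3, and the
    agent moves left or right with equal probability. Reaching state 6
    yields reward 1; every other transition yields reward 0.
    """
    rng = random.Random(seed)
    n_states = 7
    V = [0.0] * n_states  # terminal values stay at 0 throughout
    for _ in range(episodes):
        traces = [0.0] * n_states
        s = 3
        while s not in (0, 6):
            s_next = s + rng.choice((-1, 1))
            reward = 1.0 if s_next == 6 else 0.0
            # One-step temporal-difference error.
            delta = reward + gamma * V[s_next] - V[s]
            # Mark the current state as eligible for credit.
            traces[s] += 1.0
            for i in range(1, 6):
                # Every recently visited state shares in the TD error.
                V[i] += alpha * delta * traces[i]
                # Eligibility decays by gamma * lambda each step.
                traces[i] *= gamma * lam
            s = s_next
    return V
```

In this toy task the value of a state is the probability of finishing at the right-hand end, so the learned values should increase from state 1 to state 5, with state 3 near 0.5. In a neural-network setting such as the one described above, the table `V` would be replaced by the network's output and the traces would be kept per weight rather than per state.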

Seminal textbooks by Sutton and Barto on reinforcement learning, Bertsekas and Tsitsiklis on neuro-dynamic programming, and others advanced knowledge and interest in the field.

Katsunari Shibata's group showed that various functions emerge in this framework, including image recognition, color constancy, sensor motion (active recognition), hand-eye coordination and hand reaching movement, explanation of brain activities, knowledge transfer, memory, selective attention, prediction, and exploration.

Starting around 2012, the so-called deep learning revolution led to increased interest in using deep neural networks as function approximators across a variety of domains. This in turn renewed interest among researchers in using deep neural networks to learn the policy, value, and/or Q-functions present in existing reinforcement learning algorithms.
