Temporal Difference (TD) Learning

True Online Temporal-Difference Learning

... The temporal-difference methods TD(λ) and Sarsa(λ) form a core part of modern reinforcement ...the learning speed of the true online methods are often better, but never worse than that of the ...

40

An Emphatic Approach to the Problem of Off-policy Temporal-Difference Learning

... Parametric temporal-difference learning was first studied as the key “learning by generalization” algorithm in Samuel’s (1959) checker ...TD learning was convincingly demonstrated by ...

29

Investigating learning rates for evolution and temporal difference learning

... which is approximately 298 bits in the case of game size n = 64. Considering the way in which temporal difference learning is usually applied to game strategy learning i.e. to train a function ...

8

The application of temporal difference learning in optimal diet models

... aversive learning model of foraging behaviour in uncertain environments is ...of Temporal Difference learning motivated by growing evidence for neural correlates in natural reinforcement ...

18

γ-Models: Generative Temporal Difference Learning for Infinite-Horizon Prediction

... likelihood learning techniques are substantially better understood at this point in time and can be accompanied by rigorous guarantees even when incorporating function ...of temporal difference ...

12

Control of Multivariable Systems Based on Emotional Temporal Difference Learning Controller

... the learning module of Emotional temporal difference controllers have been dynamically increased for credit assignment by means of temporal difference ...

14

SSCC TD: A Serial and Simultaneous Configural-Cue Compound Stimuli Representation for Temporal Difference Learning

... in temporal difference learning considerably enhances the ability of the model to cope with a number of non-linear ...associative learning theory depend on stimuli presented sequentially ...

25

SSCC TD: a serial and simultaneous configural cue compound stimuli representation for temporal difference learning

... in temporal difference learning considerably enhances the ability of the model to cope with a number of non-linear ...associative learning theory depend on stimuli presented sequentially ...

24

Temporal Difference Learning in the Tetris Game

... Since all available free implementations of neural networks are designed for batch learning, we implemented our own feed-forward network. The input to the network is the raw encoding of the state which consists of ...

6

A Complementary Learning Systems approach to Temporal Difference Learning

... generalization. We take these findings as evidence that the SOM is encoding states that violate generalizations made by the DNN and that this is responsible for CTDL's improved performance. If the SOM does encode states ...

14

Proximal Gradient Temporal Difference Learning Algorithms

... reinforcement learning methods can be formally derived, not with respect to their original objective functions as previously attempted, but rather with respect to primal-dual saddle-point objective ...

5

On Generalized Bellman Equations and Temporal-Difference Learning

... off-policy temporal-difference (TD) learning in discounted Markov decision processes, where the goal is to evaluate a policy in a model-free way by using observations of a state process generated ...

49

Temporal difference Learning with Sampling Baseline for Image Captioning

... reinforcement learning method to train the image captioning ...the temporal-difference (TD) learning method, which takes the correlation between temporally successive actions into ...

8

Transfer Learning via Inter-Task Mappings for Temporal Difference Learning

... Temporal difference (TD) learning (Sutton and Barto, 1998) has become a popular reinforcement learning technique in recent ...generalize learning to novel situations, have had some ...

43

Sparse temporal difference learning via alternating direction method of multipliers

... squares fixed-point. As the name suggests, their LARS-TD algorithm is inspired by the Least Angle Regression (LARS) algorithm. However, as is shown, the algorithm only converges to the fixed-point under some strong ...

6

Weak Convergence Properties of Constrained Emphatic Temporal-difference Learning with Constant and Slowly Diminishing Stepsize

... off-policy learning algorithm proposed recently by Sutton, Mahmood, and White (2016): the emphatic temporal-difference (TD) learning algorithm, or ...

58

Temporal Difference Learning of Position Evaluation in the Game of Go

... We found self-play alone to be rather cumbersome for two reasons: firstly, the single-ply search used to evaluate all legal moves is computationally intensive — and although we are investigating faster ways to accomplish ...

8

Self Play and Using an Expert to Learn to Play Backgammon with Temporal Difference Learning

... reinforcement learning algorithms that can learn a game position evaluation ...1) Learning by self-play, 2) Learning by playing against an expert program, and 3) Learning from viewing ex- ...

12

Temporal uncertainty during overshadowing: A temporal difference account

... the temporal properties of the ...the temporal properties of a stimulus, and hence cannot easily make predictions about the effects such properties might have on the magnitude of ...these temporal ...

13
