Top PDF Temporal Difference

An Emphatic Approach to the Problem of Off-policy Temporal-Difference Learning

... Temporal-difference (TD) learning is perhaps the most important idea to come out of the field of reinforcement learning. The problem it solves is that of efficiently learning to make a sequence of long-term ...

29

Transfer Learning via Inter-Task Mappings for Temporal Difference Learning

... Temporal difference (TD) learning (Sutton and Barto, 1998) has become a popular reinforcement learning technique in recent years. TD methods, relying on function approximators to generalize learning to ...

43

Temporal difference Learning with Sampling Baseline for Image Captioning

... The existing methods for image captioning usually train the language model under the cross entropy loss, which results in the exposure bias and inconsistency of evaluation metric. Recent research has shown these two ...

8

The application of temporal difference learning in optimal diet models

... An experience-based aversive learning model of foraging behaviour in uncertain environments is presented. We use Q-learning as a model-free implementation of Temporal Difference learning motivated by ...

18

A Complementary Learning Systems approach to Temporal Difference Learning

... Reinforcement Learning (RL) (Sutton & Barto, 1998) represents a computational framework for modelling complex reward-driven behaviour in both artificial and biological agents. For cognitive scientists it is of ...

14

Control of Multivariable Systems Based on Emotional Temporal Difference Learning Controller

... Considering the achievements of emotional control, this paper seeks to answer more complex issues and fulfill more objectives. To do this, the capability of the learning module of Emotional temporal ...

14

Self Play and Using an Expert to Learn to Play Backgammon with Temporal Difference Learning

... that temporal difference learning and gradient descent is better for fine-tuning the evaluation function than a more randomized evolutionary search ...

12

SSCC TD: A Serial and Simultaneous Configural-Cue Compound Stimuli Representation for Temporal Difference Learning

... of Temporal Difference to accurately predict solution of complex discriminations, such as negative patterning, that rely on a stimulus compound structure, while retaining CSC TD temporal ...

25

Temporal uncertainty during overshadowing: A temporal difference account

... Standard associative theories cannot easily accommodate findings of this type, as by and large they fail to conceptualise the temporal properties of the stimulus. Hybrid models, which attempt to accommodate ...

13

Weak Convergence Properties of Constrained Emphatic Temporal-difference Learning with Constant and Slowly Diminishing Stepsize

... emphatic temporal-difference (TD) algorithm, ETD(λ), for learning the value functions of stationary policies in a discounted, finite state and action Markov decision ...

58

SSCC TD: a serial and simultaneous configural cue compound stimuli representation for temporal difference learning

... of Temporal Difference to accurately predict solution of complex discriminations, such as negative patterning, that rely on a stimulus compound structure, while retaining CSC TD temporal ...

24

An Empirical Investigation of the Emergence of Money: Contrasting Temporal Difference and Opportunity Cost Reinforcement Learning

... a temporal difference reinforcement learning (TD-RL) model, which allows the value to back-propagate from one state to previous ones while not assuming any knowledge about the structure of the ...

26

On Generalized Bellman Equations and Temporal-Difference Learning

... off-policy temporal-difference (TD) learning in discounted Markov decision processes, where the goal is to evaluate a policy in a model-free way by using observations of a state process generated without ...

49

True Online Temporal-Difference Learning

... Temporal-difference (TD) learning is a core learning technique in modern reinforcement learning (Sutton, 1988; Kaelbling et al., 1996; Sutton & Barto, 1998; Szepesvári, 2010). One of the main ...

40

A Convergent Online Single Time Scale Actor Critic Algorithm

... online temporal difference based actor-critic algorithm which is proved to converge to a neighborhood of a local maximum of the average ... the temporal difference signal, which is passed from ...

44

Exploring the geographical context for quality assessment of VGI in flood management domain

... resources areas from GeoSampa, a platform that is maintained by the city hall of the city of São Paulo. In the platform, it is possible to obtain hydrological data, topographic data, etc. Secondly, we retrieved ...

11

Evolutionary Function Approximation for Reinforcement Learning

... Temporal difference methods are theoretically grounded and empirically effective methods for addressing reinforcement learning problems. In most real-world reinforcement learning tasks, TD methods require ...

41

Ruminative Reinforcement Learning: Improve Intelligent Inventory Control by Ruminating on the Past

... Index Terms: Temporal difference learning, ruminative behavior, Markov decision problem, artificial intelligence, reinforcement learning, inventory control, approximate dynamic progra[r] ...

6

Spatial and Temporal Variation of Normalized Difference Vegetation Index (NDVI) and Rainfall in the North East Arid Zone of Nigeria

... Monthly rainfall data archived at The Nigerian Meteorological Agency, Oshodi, Lagos, which covers the time period of January 1981 through to December 1999, was used for analysis in this research. The NDVI data were ...

6

Does the Normalized Difference Vegetation Index explain spatial and temporal variability in sap velocity in temperate forest ecosystems?

... Abstract. Understanding the link between vegetation characteristics and tree transpiration is a critical need to facilitate satellite-based transpiration estimation. Many studies use the Normalized Difference ...

15
