Shaping the Reward Function

Deep Reward Shaping from Demonstrations

... potential reward function by training a deep supervised convolutional neural ...shaped function is added to the reward function used in deep-Q-learning (DQN) to perform off-policy ...

8

Combining Multiple Correlated Reward and Shaping Signals by Measuring Confidence

... non-normalized shaping functions, the proximity and separation shaping functions are of too large a magnitude compared to the base reward and drown it, resulting in very bad ... these ...

8

Reward Shaping with Recurrent Neural Networks for Speeding up On Line Policy Learning in Spoken Dialogue Systems

... for Reward Shaping 2.1 Reward Shaping Reward shaping provides the system with an extra reward signal F in addition to environmental reward R, making the system ...

5

The Optimal Reward Problem: Designing Effective Reward for Bounded Agents.

... the reward function to directly motivate behaviors which are auxiliary to an externally defined ...to shaping rewards, the reward functions in this section tend to be agent-centric, ...

136

Dynamic Potential-Based Reward Shaping

... It is common intuition that as reward shaping directs exploration it can be both beneficial and detrimental to an agent’s learning performance. If a good heuristic is used, common in previous published ...

9

Reward Shaping in Episodic Reinforcement Learning

... potential-based reward shaping can alter the set of equilibria in general-sum stochastic games, and a new equilibrium can be introduced when terminal states have non-zero ...into reward ...

9

Theory and Application of Reward Shaping in Reinforcement Learning

... overcome shaping function error. Therefore, for higher horizons, shaping error can dominate the learning time, indicating that the number of utility subgoals was too ...irreconcilable shaping ...

102

Theoretical Considerations of Potential-Based Reward Shaping for Multi-Agent Systems

... Unknown to the agent, the same local state-action pair will have a different transition function even though the global state-joint action pair has not changed. It has been shown in single-agent reinforcement ...

9

Potential-Based Reward Shaping for Knowledge-Based, Multi-Agent Reinforcement Learning

... Plan-Based Reward Shaping with Belief Revision In my experiments attempting to overcome the gap in performance between individual-plan-based agents and joint-plan-based agents, I concluded that an ...

112

Knowledge-Based Reward Shaping with Knowledge Revision in Reinforcement Learning

... using reward shaping and knowledge revision are visible especially at the start of the ...MDP reward shaping, mainly because there is constant updating of beliefs until the end of an ...

119

Reward circuitry function in autism spectrum disorders

... Imaging data analytic strategy Anticipation and outcome phases were analyzed separately. For both phases, the primary method of analysis was to evaluate clusters that revealed a significant effect of diagnostic status on ...

13

The function of the lateral hypothalamus with regard to gustatory and reward related processes

... particular function that the prefrontal cortex has been shown to be involved in is spatially based foraging behaviour on a radial arm maze (Seamans et al 1995; Floresco et ...

275

Reward in a downturn

... Impact deepening around the world While the downturn was slower to impact on high-growth economies in Asia, Eastern Europe and South America, the survey confirms that the downturn has hit these economies hard in the past ...

24

Modelling and control of the flame temperature distribution using probability density function shaping

... 1 Institute of Automation, Chinese Academy of Sciences, Beijing 100080, P.R. China 2 Control Systems Centre, University of Manchester, Manchester M60 1QD, U.K. This paper presents three control algorithms for the output ...

28

Shaping Early Reorganization of Neural Networks Promotes Motor Function after Stroke.

... of function early after ...motor function, cortical excitability, and resting-state fMRI were assessed 1 day prior to the first stimulation and 1 day after the last ...

13

A numerical method for the expected penalty–reward function in a Markov-modulated jump–diffusion process.

... penalty–reward function is studied in this context, including ruin probabilities (a first-passage problem) as a special ...the function of interest is ...

6

Transition to drug addiction: a negative reinforcement model based on an allostatic decrease in reward function

... the reward set point corresponds to the minimum level of reward system responsivity that relieves the preexisting ... of reward (Mason et ... brain reward circuits are unlikely to develop ...

18

Reward based Crowdfunding: Reward Characteristics and their influence

... giving. Reward-based crowdfunding as done by Kickstarter does not have a charitable aspiration however both charitable and non-charitable crowdfunding use rewards to attract ...the reward can be designed by ...

9
