discounted Markov decision processes

Simplex Algorithm for Countable state Discounted Markov Decision Processes

... We consider discounted Markov Decision Processes (MDPs) with countably-infinite state spaces, finite action spaces, and unbounded rewards. Typical examples of such MDPs are inventory ...

36

Strategy improvement algorithm for singularly perturbed discounted Markov decision processes

... perturbed Markov decision process with the discounted reward ...and discounted factor are perturbed ...irreducible processes. We introduce the limit Markov control problem which ...

7

Strategy iteration algorithms for games and Markov decision processes

... bound holds for games and for MDPs. For many years, people were unable to find examples upon which strategy improvement equipped with the greedy policy took significantly more than a linear number of iterations to ...

226

Variance Optimization for Continuous Time Markov Decision Processes

... the discounted MDP in infinite stage and the average reward problem in infinite stage [2] ...the decision maker’s expected reward is often assumed to be a constant, and then the investor chooses a policy ...

15

Randomized and Relaxed Strategies in Continuous-Time Markov Decision Processes

... The following remark explains the novelty of the current work and its connection to the previous results and the known methods. As was mentioned (see also section 5), the discounted cost is a special case of the ...

31

Policy-Gradient Algorithms for Partially Observable Markov Decision Processes

... around the correct phoneme, the more likely the agent is to receive a positive reward. The effect of past transitions on the current Viterbi probabilities become progressively smaller as time goes on, possibly at an ...

303

Some contributions to Markov decision processes

... standard discounted MDP model can be equivalently viewed as an undiscounted MDP ...general discounted MDP model with a state-action-dependent ...the discounted MDP model would immediately follow ...

160

Markov Decision Processes and Approximate Dynamic Programming Methods for Optimal Treatment Design.

... Maxwell et al. [67] use an ADP approach to determine the best strategy for dynamic repositioning of ambulances in metropolitan areas in order to maximize the number of calls reached within a designated length of time. ...

149

ON THE FIRST PASSAGE g-MEAN-VARIANCE OPTIMALITY FOR DISCOUNTED CONTINUOUS-TIME MARKOV DECISION PROCESSES

... on discounted continuous-time MDPs in a finite or countable state space and with a bounded reward ...risk-averse decision maker might prefer a policy with a reasonable mean performance g (not necessarily the ...

19

Discrete Time Hybrid Decision Processes: The Discounted Case

... a Markov-type hybrid process from stochastic kernel and credibilistic ker- ...hybrid processes in the near future, it is meaningful to consider the case where the behavior of hybrid processes given ...

5

What if the World Were Different? Gradient-Based Exploration for New Optimal Policies

... as Markov decision processes, this problem can be modeled as a constrained optimization problem, in which the agent balances the benefits arising from changing the world with the potential costs ...

14

Approximate Newton Methods for Policy Search in Markov Decision Processes

... An avenue of research that has received less attention is the application of Newton’s method to Markov decision processes. Although such an extension of the GPOMDP algorithm is provided in the work ...

51

A POMDP Framework to Find Optimal Inspection and Maintenance Policies via Availability and Profit Maximization for Manufacturing Systems

... to decision maker's ...Observable Markov Decision Process (POMDP) framework for a partially observable and stochastically deteriorating system in which inspection and maintenance optimal policies of ...

8

Continuous Observation Partially Observable Semi Markov Decision Processes for Machine Maintenance

... The emergence of technologically advanced data-collecting techniques, such as vibration monitoring, acoustics and physical condition monitoring, have been explored for improving reliability prediction and maintenance ...

20

Partially Observable Markov Decision Processes for Prostate Cancer Screening.

... referral decision depend on the patient’s age and PSA history? Surprisingly, there has been very little research on determining optimal decisions related to these ...observable Markov decision ...

166

Continuous-observation partially observable semi-Markov decision processes for machine maintenance

... Though the POSMDP model has existed for decades, there has been little effort in bridging the gap between POSMDP and machine maintenance. Moreover, the documented works on employing POSMDP in machine maintenance are all ...

20

A Bayesian Approach for Learning and Planning in Partially Observable Markov Decision Processes

... Despite the sustained interest in model-based BRL, the deployment to real-world applications is limited both by scalability and representation issues. In terms of representation, an important challenge for many ...

42

Adaptive Layer Approach For Power Management In Wireless Communication

... online decision making ...on Markov decision processes and reinforcement learning (RL), for simultaneously utilizing both PHY centric and system-level techniques to achieve the minimum ...

6

Robust Approximate Bilinear Programming for Value Function Approximation

... large Markov Decision Processes (MDPs) is a very useful, but computationally challenging problem addressed widely in the AI literature, particularly in the area of reinforcement ...

37

Optimal Control of Customers to the Service Facility with Two Types of Customers

... In this article we analyzed a discrete time MDP in service facility systems with two types of customers. We control the number of customers admitted to the system by observing two types of customers in the potential ...

7
