Strategy Improvement For Markov Decision Processes

Strategy improvement algorithm for singularly perturbed discounted Markov decision processes

... perturbed Markov decision process with the discounted reward ...irreducible processes. We introduce the limit Markov control problem which is the optimization problem that should be solved in ...

7

Strategy iteration algorithms for games and Markov decision processes

... the strategy improvement setting can be extended to apply to the Markov decision process ...greedy strategy improvement will make the same decisions at those vertices, and an ...

226

Markov Decision Processes

... • Need to define states, actions, rewards, and dynamics ... Example 3.3 Recycling Robot: A mobile robot has the job of collecting empty soda cans in an ...

15

Sufficient Markov Decision Processes.

... Sufficient Markov Decision ...data-driven decision making include autonomous vehicles, intelligent power grids, and precision medicine through mobile ...health. Markov decision ...

121

Configurable Markov Decision Processes

... both Markov decision processes with imprecise probabilities and non-stationary Markov decision processes do not admit the possibility to dynamically alter the environmental ...

10

Robust Markov Decision Processes

... Although transition sampling has theoretical appeal, it is often prohibitively costly or even infeasible in practice. To obtain independent samples for each state-action pair, one needs to repeatedly direct the MDP into ...

48

Risk-sensitive Markov Decision Processes

... 2.1 Introduction. In this chapter, we consider a risk-sensitive continuous-time Markov decision process over a finite time duration. From the results of chapter 5 about the PDMDP, it is natural to think ...

183

Some contributions to Markov decision processes

... $c_i(x, a)\,\mu(dx \times da)$ for every Markov chain $Q_{\varphi^{\mu}}(dy \mid x)$ with respect to which $\mu(dx \times A)$ is its i.p.m., where $\mu(dx \times da) = \varphi^{\mu}(da \mid x)\,\mu(dx \times A)$, and $i = 0, 1, \dots, M$. Assumption 5.4 is similar to the traditional ...

160

One-Counter Markov Decision Processes

... Recursive Markov Decision Processes (RMDPs) was studied in [10, ...a strategy that yields termination probability 1, or even approximating the maximum probability within any non-trivial ...

36

Scalable Verification of Markov Decision Processes

... Abstract Markov decision processes (MDP) are useful to model concurrent process optimisation problems, but verifying them with numerical methods is often ...

13

Compositional Reasoning for Markov Decision Processes

... The rest of this paper is organised as follows. In Section 2 we introduce the model of weighted MDPs, the notation of hyper-derivations and some important properties. Then we define a behavioural preorder based on ...

16

Hedging Bets in Markov Decision Processes

... of Markov decision processes with costs or rewards, while widely used to formalize optimal decision making, cannot capture scenarios where there are multiple objectives for the agent during ...

20

Markov decision processes with uncertain parameters

... The ANTG case study models a complex museum with a variety of collections. Due to the popularity of the museum, there are many visitors at the same time. Different visitors may have different preferences of arts. We ...

136

Augmenting Markov Decision Processes with Advising

... Advice-MDPs aim to enable operators to augment the planning model with advice, towards fitting the irregularities of the environment (like ASRL) with prescriptive a priori advising (like MIPSs). This way, Advice-MDPs ...

8

Multiple-Environment Markov Decision Processes

... single strategy that wins the objective with probability one (almost surely winning) in all the environments of the ...one strategy in the family that wins the objective with probability larger than 1 −  ...

13

Solving Hybrid Markov Decision Processes

... 5 Conclusions and Future Work. In this paper, a novel approach for solving continuous and hybrid MDPs is described. In the first phase we use an exploration strategy of the environment and a machine learning ...

11

Bounded-parameter Markov decision processes

... 8. Related work and conclusions Our definition for bounded-parameter MDPs is related to a number of other ideas appearing in the literature on Markov decision processes; in the following, we mention ...

39

Structural Results for Constrained Markov Decision Processes

... offers decision-makers the ability to further stratify customers into more refined groups, and allows for more flexibility in how service level constraints are placed upon these ...

159

Compositional reasoning for weighted Markov decision processes

... We have proposed a model of weighted Markov decision processes, wMDP, for compositional reasoning about the behaviour of systems with uncertainty. Amortised weighted simulation is coinductively ...

43

Hazard Avoidance Alerting With Markov Decision Processes

... Figure 1.2: Anticipating New Information and Decision Opportunities maybe unnecessary. Because future observations will reduce uncertainty about the mode, deferring the alert would clarify the need for an alert ...

141
