Top 20: Some contributions to Markov decision processes

Some contributions to Markov decision processes

... rational decision making, which can be overcome by optimizing with respect to the aggregated (iterated) ... the contributions in Chapter 4 is to allow cost functions being defined in a more relaxed ...

160

Investigation of Computational Reduction Strategies for Markov Decision Processes.

... Bellman [Bel57] first proposed the Markov decision process (MDP) problem. Howard [How60] presented the value iteration method and the policy iteration method to solve the MDP problem, which laid the ...

50

Multi-Objective Markov Decision Processes for Data-Driven Decision Support

... of deciding which actions are optimal is more complex, but we will still leverage the idea of a partial order. The problem of identifying Pareto-optimal policies is of significant interest in RL (Perny and Weng, 2010; ...

28

Randomized and Relaxed Strategies in Continuous-Time Markov Decision Processes

... The following remark explains the novelty of the current work and its connection to the previous results and the known methods. As was mentioned (see also section 5), the discounted cost is a special case of the ...

31

Simplex Algorithm for Countable state Discounted Markov Decision Processes

... Countable-state MDPs were studied by many researchers, including [4, 10, 12, 21, 22, 23], with predominant solution methods summarized as the three algorithms in [21, 22] and [23]. We will review these in Section 1.2 in ...

36

Strategy improvement algorithm for singularly perturbed discounted Markov decision processes

... A discrete Markovian decision process (MDP, for short) is observed at discrete time points t = 0, 1, 2, …. The state space is denoted by S = {1, 2, …, N}. With each state 𝑠 ∈ 𝑆 we associate a finite action set A(s) ...

7

Approximate Newton Methods for Policy Search in Markov Decision Processes

... (detailing some widely used controllers for which this condition holds) guaranteeing that the search direction is an ascent direction; We show that the method is invariant to affine transformations of the ...

51

Partially Observable Markov Decision Processes for Prostate Cancer Screening.

... referral decision depend on the patient’s age and PSA history? Surprisingly, there has been very little research on determining optimal decisions related to these ... observable Markov decision ...

166

Policy-Gradient Algorithms for Partially Observable Markov Decision Processes

... Adaptive history methods grow or shrink the number of past events that are needed to reveal the hidden state. Probabilistic Suffix Automata [Ron et al., 1994] model partially observable Markov decision ...

303

On the complexity of computing maximum entropy for Markovian Models

... On top of a purely probabilistic model like MCs, it is probably more interesting to consider probabilistic models with nondeterminism, typically Interval Markov chains (IMCs) and Markov Decision ...

14

Continuous Observation Partially Observable Semi Markov Decision Processes for Machine Maintenance

... To date, the POSMDP model has not been developed for the case of discrete states, discrete actions and continuous observations. In addition, many of the applications of the POSMDP have failed to capture the subtleties of ...

20

Markov Decision Processes and Approximate Dynamic Programming Methods for Optimal Treatment Design.

... of the environment (anything that cannot be changed by the agent), a policy that describes how the agent acts under certain conditions, rewards that indicate what is favorable in the short term, and a value function that ...

149

Continuous-observation partially observable semi-Markov decision processes for machine maintenance

... semi-Markov decision processes (POSMDPs) provide a rich framework for planning under both state transition uncertainty and observation ... maintenance decision process via a real industrial ...

20

Robust Approximate Bilinear Programming for Value Function Approximation

... large Markov Decision Processes (MDPs) is a very useful, but computationally challenging problem addressed widely in the AI literature, particularly in the area of reinforcement ...

37

What if the World Were Different? Gradient-Based Exploration for New Optimal Policies

... as Markov decision processes, this problem can be modeled as a constrained optimization problem, in which the agent balances the benefits arising from changing the world with the potential costs ...

14

Adaptive Layer Approach For Power Management In Wireless Communication

... online decision making ... on Markov decision processes and reinforcement learning (RL), for simultaneously utilizing both PHY centric and system-level techniques to achieve the minimum ...

6

A Bayesian Approach for Learning and Planning in Partially Observable Markov Decision Processes

... on some of the cells. Each rock has an unknown binary quality (good or bad). The goal of the robot is to gather samples of the good rocks. Sampling a good rock yields high reward (+10), in contrast to sampling a ...

42

Compositional reasoning for weighted Markov decision processes

... We have proposed a model of weighted Markov decision processes, wMDP, for compositional reasoning about the behaviour of systems with uncertainty. Amortised weighted simulation is coinductively ...

43

Performance Guarantees for Homomorphisms beyond Markov Decision Processes

... Continuous state-action space. The results in this paper easily extend to the continuous state-action space homomorphisms for measurable maps. The summations change to integrals and the measurability constraint make ...

8

On Discounted Dynamic Programming with Unbounded Returns

... present some modifications of the results in Rincón-Zapatero and Rodrigues-Palmero (2003) which are very useful to study Markov decision processes, in particular stochastic optimal growth ...

15