Top 20 Sufficient Markov Decision Processes.

Sufficient Markov Decision Processes.

... any decision process can be made into an MDP by concatenating data over multiple decision points (see Section ...a decision process into the MDP framework in this way can lead to high-dimensional ...

Strategy improvement algorithm for singularly perturbed discounted Markov decision processes

... Engineers typically refer to these models as “Markov control problems”, and in this paper we shall use these labels interchangeably. The early MDP models were studied by Howard [21] and Blackwell [9] and, ...

Markov Decision Processes and Approximate Dynamic Programming Methods for Optimal Treatment Design.

... In Chapter 3 we describe an initial MDP model for optimal control of medication initiation decisions. This model aims to answer the following question: When and in what order should medications be initiated to reduce the ...

Controlling Listening-oriented Dialogue using Partially Observable Markov Decision Processes

... To solve this problem, this paper aims to automatically build a dialogue control component of a listening agent using partially observable Markov decision processes (POMDPs). POMDPs, which make it ...

Continuous-Observation Partially Observable Semi-Markov Decision Processes for Machine Maintenance

... semi-Markov decision processes (POSMDPs) provide a rich framework for planning under both state transition uncertainty and observation ...maintenance decision process via a real industrial ...

Adaptive Layer Approach For Power Management In Wireless Communication

... online decision making ...on Markov decision processes and reinforcement learning (RL), for simultaneously utilizing both PHY centric and system-level techniques to achieve the minimum ...

A Bayesian Approach for Learning and Planning in Partially Observable Markov Decision Processes

... rameters of the underlying POMDP domain. We derive optimal algorithms for belief tracking and finite-horizon planning in this model. However, because the size of the state space in a BAPOMD can be countably infinite, ...

Simplex Algorithm for Countable-state Discounted Markov Decision Processes

... The class of Markov decision processes (MDPs) provides a popular framework which covers a wide variety of sequential decision-making problems. An MDP is classified by its criterion being ...

Partially Observable Markov Decision Processes for Prostate Cancer Screening.

... referral decision depend on the patient’s age and PSA history? Surprisingly, there has been very little research on determining optimal decisions related to these ...observable Markov decision ...

Approximate Newton Methods for Policy Search in Markov Decision Processes

... Many of these problems are not particular to Markov decision processes, but are general longstanding issues that plague Newton’s method. Various methods have been developed in the optimization ...

Conditional Value at Risk for Random Immediate Reward Variables in Markov Decision Processes

... [13] Y. Ohtsubo, “Optimal Threshold Probability in Discounted Markov Decision Processes with a Target Set,” Applied Mathematics and Computation, Vol. 149, No. 2, 2004, pp. 519-532. ...

Markov processes in blockchain systems

... block-structured Markov processes in the queueing study of blockchain systems, which can provide analysis both for the stationary performance measures and for the sojourn time of any transaction or ...mining ...

Comparative effectiveness research on patients with acute ischemic stroke using Markov decision processes

... Formulating an MDP model for the treatment of AIS. According to clinical experience and TCM theory, treatment decision-making depends on the current condition of patient, and the corresponding TCM/integrative ...

Proto-value Functions: A Laplacian Framework for Learning Representation and Control in Markov Decision Processes

... semi-Markov decision processes, where actions are temporally extended, and value functions are decomposed using the hierarchical structure of a ...

What if the World Were Different? Gradient-Based Exploration for New Optimal Policies

... as Markov decision processes, this problem can be modeled as a constrained optimization problem, in which the agent balances the benefits arising from changing the world with the potential costs ...

Robust Approximate Bilinear Programming for Value Function Approximation

... large Markov Decision Processes (MDPs) is a very useful, but computationally challenging problem addressed widely in the AI literature, particularly in the area of reinforcement ...

A linear programming approach to constrained nonstationary infinite horizon Markov decision processes

... One of the main objectives in this paper is to study extreme points of (P), the LP formulation of constrained nonstationary MDPs. The definition of an extreme point of a convex set is a point in the set that cannot be ...

Essays on semiparametric estimation of Markov decision processes

... We reflect on the computational effort required of the proposed method. It will be helpful to have in mind the methodology of Pesendorfer and Schmidt-Dengler (2008) as our methods coincide when the X is finite ...

Variance Optimization for Continuous Time Markov Decision Processes

... the decision maker’s expected reward is often assumed to be a constant, and then the investor chooses a policy with a given expected return to minimize this risk, we can see that the Markowitz mean-variance ...