Top 20 Performance Guarantees for Homomorphisms beyond Markov Decision Processes

Performance Guarantees for Homomorphisms beyond Markov Decision Processes

... It is important to highlight that homomorphism is not the only technique for abstracting actions. The options framework is a competing method for temporal action abstractions (Sutton, Precup, and Singh 1999). In the ...

8

Approximate Newton Methods for Policy Search in Markov Decision Processes

... important performance guarantees including guaranteed ascent directions, invariance to affine transformation of the parameter space, and convergence ...

51

Policy-Gradient Algorithms for Partially Observable Markov Decision Processes

... The poor performance of GAMP when β < 0.99 indicates that the mixing time of the Heaven/Hell scenario is at least hundreds of steps. Intuitively, this means it takes hundreds of steps for the effects of ...

303

Investigation of Computational Reduction Strategies for Markov Decision Processes.

... From the above data, we can conclude that using variable cheap iterations has better performance in almost all cases. For the fixed cheap iteration approach, 10 cheap iterations (C10) is almost always best. For ...

50

A Bayesian Approach for Learning and Planning in Partially Observable Markov Decision Processes

... Finally, we consider the case where the true underlying POMDP model is changed such that the sensor has a constant probability ε of making mistakes for all distances; the prior is sampled as for the results of Figure 3. ...

42

Markov Decision Processes and Approximate Dynamic Programming Methods for Optimal Treatment Design.

... each decision epoch, the person in charge of scheduling the patients must decide which appointment slots to allocate to patients waiting to be ... patients beyond their targeted time, diverting patients, and ...

149

Variance Optimization for Continuous Time Markov Decision Processes

... important performance metric of stochastic ... the decision maker’s expected reward is often assumed to be a constant, and then the investor chooses a policy with a given expected return to minimize this ...

15

Controlling Listening oriented Dialogue using Partially Observable Markov Decision Processes

... higher performance in subjective evaluations than other statistically motivated systems, such as an HMM-based one, that work by selecting the most likely subsequent action in the dialogue ...

9

Sufficient Markov Decision Processes.

... Human trafficking is a form of modern day slavery that victimizes millions of people. It has become a norm for sex traffickers to use escort websites to openly advertise the victims. We designed an ordinal regression ...

121

Some contributions to Markov decision processes

... Chapter 5 and 6 tackle MDPs with long-run expected average cost criterion. In Chapter 5, we consider a constrained MDP with possibly unbounded (from both above and below) cost functions. Under Lyapunov-like ...

160

Small sets and Markov transition densities

... for Markov chain Monte Carlo (see also the extended notion of pseudo-small sets described by Roberts and Rosenthal [20, 21]) and also (under the rubric of gamma-coupling) to produce effective Coupling from the ...

21

What if the World Were Different? Gradient-Based Exploration for New Optimal Policies

... The second scenario considered models the decision process of a robotic agent that was trained to pour water in cups located at a specific position θ̄. The state space X = {I, S, F, A} includes initial, success, ...

14

Adaptive Layer Approach For Power Management In Wireless Communication

... Simulation setup NS2 is an open-source event-driven simulator designed specifically for research in computer communication networks. NS2 has continuously gained tremendous interest from industry, academia, and ...

6

Compositional Reasoning for Markov Decision Processes

... The rest of this paper is organised as follows. In Section 2 we introduce the model of weighted MDPs, the notation of hyper-derivations and some important properties. Then we define a behavioural preorder based on ...

16

Strategy improvement algorithm for singularly perturbed discounted Markov decision processes

... action Markov decision process (MDPs for short) are dynamic, stochastic, systems controlled by controller, sometimes referred to as “decision ...

7

A POMDP Framework to Find Optimal Inspection and Maintenance Policies via Availability and Profit Maximization for Manufacturing Systems

... In Table 1, solving time of SDP and MDP is compared in small instances with maximum 20 stages, 2 monitors and 3 states (as new, middle and damaged) and some intuitive primary states probabilities as (1,0,0) for as new, ...

8

Optimal Control of Customers to the Service Facility with Two Types of Customers

... In this article, we considered a discrete-time service facility system, viewed as a Markov Decision Process (MDP). Decisions are taken at discrete time epochs to control admissions to the system. Here the ...

7

Continuous-observation partially observable semi-Markov decision processes for machine maintenance

... The emergence of technologically advanced data-collecting techniques, such as vibration monitoring, acoustics and physical condition monitoring, have been explored for improving reliability prediction and maintenance ...

20

Continuous Observation Partially Observable Semi Markov Decision Processes for Machine Maintenance

... semi-Markov decision processes (POSMDPs) provide a rich framework for planning under both state transition uncertainty and observation ... maintenance decision process via a real industrial ...

20

Robust Approximate Bilinear Programming for Value Function Approximation

... Figure 2 shows the Bellman residual attained by the methods. It clearly shows that the robust bilinear formulation most reliably minimizes the Bellman residual. The other two bilinear formulations are not much worse. ...

37