finite Markov decision processes

Sufficient Markov Decision Processes.

... any decision process can be made into an MDP by concatenating data over multiple decision points (see Section ...a decision process into the MDP framework in this way can lead to high-dimensional ...

121

Approximate Newton Methods for Policy Search in Markov Decision Processes

... In Tetris there exists a board, which is typically a 20 × 10 grid, which is empty at the beginning of a game. During each stage of the game a four block piece, called a tetrzoid, appears at the top of the board and ...

51

Strategy improvement algorithm for singularly perturbed discounted Markov decision processes

... A discrete Markovian decision process (MDP, for short) is observed at discrete time points t= 0, 1, 2, …. The state space is denoted by S= {1, 2, …, N}. With each state 𝑠 ∈ 𝑆 we associate a finite action ...

7

Variance Optimization for Continuous Time Markov Decision Processes

... in finite stage, the discounted MDP in infinite stage and the average reward problem in infinite stage [2] ...the decision maker’s expected reward is often assumed to be a constant, and then the investor ...

15

A Bayesian Approach for Learning and Planning in Partially Observable Markov Decision Processes

... offline finite POMDP solvers (Pineau et ...the finite POMDP representation presented in Section ...any finite planning horizon t, one can compute exactly the optimal value function, as only a ...

42

A hemimetric extension of simulation for semi-markov decision processes

... We have given an efficient algorithm to compute the simulation distance and identified a class of distributions for which the algorithm works on finite SMDPs. Furthermore, we have shown that, under mild conditions ...

17

Essays on semiparametric estimation of Markov decision processes

... their finite dimensional parameters and the infinite dimensional parameters in some sieve space and do not discuss estimation of the asymptotic ...e finite dimensional parameter is an essential feature ...

193

Performance Guarantees for Homomorphisms beyond Markov Decision Processes

... a finite state POMDP then our results provide the performance-loss guarantee by representing a belief-state based value function of the POMDP by a state-based value ...

8

Compositional reasoning for weighted Markov decision processes

... The rest of this paper is organised as follows. Section 2 is devoted to an exposition of our model, which we call weighted Markov Decision Processes, wMDPs. These correspond to the diagrams we have ...

43

Some contributions to Markov decision processes

... Chapter 5 and 6 tackle MDPs with long-run expected average cost criterion. In Chapter 5, we consider a constrained MDP with possibly unbounded (from both above and below) cost functions. Under Lyapunov-like ...

160

Simplex Algorithm for Countable state Discounted Markov Decision Processes

... However, classes of CILPs considered so far ([7, 8, 19]) have a special structure that each constraint has only a finite number of variables and each variable appears only in a finite number of constraints. ...

36

Multi-Objective Markov Decision Processes for Data-Driven Decision Support

... Multi-Objective Markov Decision Processes for developing sequential decision support systems from ...sequential decision-making data to provide support that is useful to many different ...

28

Policy-Gradient Algorithms for Partially Observable Markov Decision Processes

... One interesting exception to the poor performance of FSC methods is Glickman and Sycara [2001], where an Elman network — an RNN where all outputs are fed back to the hidden layer — parameterises an agent for the New York ...

303

ICTS AND ECONOMIC GROWTH IN AFRICAN COUNTRIES

... discrete Markov processes in pure birth-death processes with final number of states were ...the Markov pure birth-death processes can be distinguished by the value of the ...

5

A POMDP Framework to Find Optimal Inspection and Maintenance Policies via Availability and Profit Maximization for Manufacturing Systems

... both finite and infinite horizon the model is solved approximately by Perseus and exactly by the new method ...to Decision Maker's point of view is valuable work and brings the model closer to the ...

8

A new method for approximating vector autoregressive processes by finite state Markov chains

... autoregressive processes) by a finite-state Markov ...the Markov chain by targeting conditional moments of the underlying continuous process as in Rouwenhorst (1995), rather than directly ...

28

What if the World Were Different? Gradient-Based Exploration for New Optimal Policies

... a decision-theoretic framework, in which the task the agent must complete is described by a Markov decision process, and the different world configurations translate in different transition ...

14

Robust Approximate Bilinear Programming for Value Function Approximation

... large Markov Decision Processes (MDPs) is a very useful, but computationally challenging problem addressed widely in the AI literature, particularly in the area of reinforcement ...

37

Optimal Control of Customers to the Service Facility with Two Types of Customers

... In this article we analyzed a discrete time MDP in service facility systems with two types of customers. We control the number of customers admitted to the system by observing two types of customers in the potential ...

7

Adaptive Layer Approach For Power Management In Wireless Communication

... learning is that it is able to compare the expected utility of the available actions without requiring a model of the environment. Additionally, Q-learning can handle problems with stochastic transitions and rewards, ...

6
