multi-armed bandit problem

Algorithms for the multi-armed bandit problem

... stochastic multi-armed bandit problem is an important model for studying the exploration-exploitation tradeoff in reinforcement ...the problem are well-understood theoretically, ...

32

The Sample Complexity of Exploration in the Multi-Armed Bandit Problem

... Lower bounds for different variants of the multi-armed bandit have been studied by several authors. For the expected regret model, where the regret is defined as the difference between the ideal ...

26

The Non-stationary Stochastic Multi-armed Bandit Problem

... the multi-armed bandit problem that generalize the stationary stochastic, piecewise-stationary and adversarial bandit ...switching bandit problem with SER4 by adding a ...

21

Towards an Improved Strategy for Solving Multi Armed Bandit Problem

... the multi-armed bandit problem using ...because multi-armed bandit’s agent is better described by its fitness, measured by how much it has learnt about its space than its ...

5

Training a Quantum Neural Network to Solve the Contextual Multi Armed Bandit Problem

... contextual multi-armed bandit problem, a problem in the domain of reinforcement learning, which demonstrates that quantum reinforcement learning algorithms can be learned by a quantum ...

11

More specifically, we design a learning algorithm that connects active learning with the well-known multi-armed bandit problem

... well-known multi-armed bandit ...the multi-armed bandit learner, it is possible to estimate the performance of different strategies on the ...

7

Scalable Discrete Sampling as a Multi-Armed Bandit Problem

... 3. Algorithms for MABs with a Finite Population and Fixed Confidence The key difference of our problem from the regular MABs is that our rewards are generated from a finite population while regular MABs assume ...

17

Lower Bounds and Selectivity of Weak-Consistent Policies in Stochastic Multi-Armed Bandit Problem

... general problem of ...learning problem with perfect information, commonly addressed by prediction with expert advice (see, ...related problem is the one of regret against the best strategy from a ...

21

Monotone multi-armed bandit allocations

... Peter Auer, Nicolò Cesa-Bianchi, Yoav Freund, and Robert E. Schapire. The nonstochastic multi-armed bandit problem. SIAM J. Comput., 32(1):48–77, 2002b. Preliminary version in 36th IEEE ...

5

Multi-armed Bandit Problems with History

... Five different SVM classifiers were trained using a subset of twenty thousand examples. The different SVMs corresponded to different C parameter values (which trade-off between margin and slack variables in SVM). ...

11

A multi-armed bandit approach for exploring partially observed networks

... that bandit. In a multi-armed problem with a discrete set of available actions, choosing an action corresponds to playing an arm in a multi-armed bandit ...of bandit ...

18

Slow Fading Channel Selection: A Restless Multi-Armed Bandit Formulation

... a multi-access wireless network in which transmitters dynamically select a frequency band to communicate ...selection problem as a restless multi-armed bandit problem and we ...

5

Selecting Multiple Web Adverts - a Contextual Multi-armed Bandit with State Uncertainty

... Contextual Multi-armed Bandit with State Uncertainty Abstract We present a method to solve the problem of choosing a set of adverts to display to each of a sequence of web ...the ...

31

Action Elimination and Stopping Conditions for the Multi-Armed Bandit and Reinforcement Learning Problems

... classical multi-armed bandit problem, but rather than looking at the expected regret, we develop PAC style ...the multi-armed bandit problem, where the main aim is ...

27

A cost sensitive decision tree learning algorithm based on a multi armed bandit framework

... the multi-armed bandit problem, in which a player in a casino has to decide which slot machine (bandit) from a selection of slot machines is likely to pay out the ...this ...

38

A cost-sensitive decision tree learning algorithm based on a multi-armed bandit framework

... the multi-armed bandit problem, in which a player in a casino has to decide which slot machine (bandit) from a selection of slot machines is likely to pay out the ...this ...

38

Approximations of the Restless Bandit Problem

... the multi-armed bandit problem with strongly dependent pay-offs at its full generality is beyond the scope of this paper, we provide a complementary example for this ...the bandit arms ...

37

Mechanisms with learning for stochastic multi armed bandit problems

... • Section 4: Mechanism Design Overview: We provide an overview of classical mechanism design including key definitions and concepts. • Section 5: Stochastic MAB Mechanisms: First, we describe a mechanism design ...

44

Optimizing Adaptive Marketing Experiments with the Multi-Armed Bandit

... the Multi-Armed Bandit Abstract Sequential decision making is central to a range of marketing ...the multi-armed bandit, the conceptual and methodological backbone of this ...

148

muMAB. A multi-armed bandit model for wireless network selection

... the problem of adaptively switching between a conservative algorithm, like muUCB1, and a more aggressive one, like MLI, depending on the current estimate of the rewards provided by the different ...a ...

22
