Multi-armed bandit

Algorithms for the multi-armed bandit problem

... stochastic multi-armed bandit problem is an important model for studying the exploration-exploitation tradeoff in reinforcement ...popular multi-armed bandit ...of bandit ...

32

Monotone multi-armed bandit allocations

... Peter Auer, Nicolò Cesa-Bianchi, Yoav Freund, and Robert E. Schapire. The nonstochastic multi-armed bandit problem. SIAM J. Comput., 32(1):48–77, 2002b. Preliminary version in 36th IEEE FOCS, ...

5

Multi-armed Bandit Problems with History

... stochastic multi-armed bandit ...stochastic bandit settings, such as bandits with a continuum of arms, dueling bandits, ...for bandit problems with historic ...

11

Mechanisms with learning for stochastic multi armed bandit problems

... The multi-armed bandit (MAB) problem is a widely studied problem in machine learning literature in the context of online learning. In this article, our focus is on a specific class of problems ...

44

Optimizing Adaptive Marketing Experiments with the Multi-Armed Bandit

... the Multi-Armed Bandit Abstract Sequential decision making is central to a range of marketing ...the multi-armed bandit, the conceptual and methodological backbone of this ...

148

The Sample Complexity of Exploration in the Multi-Armed Bandit Problem

... Lower bounds for different variants of the multi-armed bandit have been studied by several authors. For the expected regret model, where the regret is defined as the difference between the ideal ...

26

The Non-stationary Stochastic Multi-armed Bandit Problem

... the multi-armed bandit problem that generalize the stationary stochastic, piecewise-stationary and adversarial bandit ...switching bandit problem with SER4 by adding a probability of ...

21

Towards an Improved Strategy for Solving Multi Armed Bandit Problem

... Abstract: Multi-Armed Bandit (MAB) problem is one of the classical reinforcements learning problems that describe the friction between the agent’s exploration and ...

5

A multi-armed bandit approach for exploring partially observed networks

... a multi-armed bandit based exploration algorithm for partially observed incomplete ...nonparametric multi-armed bandit algorithm iKNN-UCB with sublinear ...iKNN-UCB ...

18

muMAB. A multi-armed bandit model for wireless network selection

... In this work, a new model for Multi-Armed Bandit problems, referred to as muMAB, was proposed. The model introduces the presence of two different actions: to measure and to use. In addition, the ...

22

On the Complexity of Best-Arm Identification in Multi-Armed Bandit Models

... stochastic multi-armed bandit model is a simple abstraction that has proven useful in many different contexts in statistics and machine ...two armed-bandits, we derive refined lower bounds in ...

42

Estimation Bias in Multi-Armed Bandit Algorithms for Search Advertising

... In search advertising, the search engine needs to select the most profitable advertisements to display, which can be formulated as an instance of online learning with partial feedback, also known as the stochastic ...

9

Multi-Armed Bandit in Action: Optimizing Performance in Dynamic Hybrid Networks

... Multi-Armed Bandit in Action: Optimizing Performance in Dynamic Hybrid Networks. Sébastien Henri, Christina Vlachou, and Patrick Thiran, Fellow, IEEE. Abstract: Today’s home networks are often ...

14

A Multi-Armed Bandit Model Selection for Cold-Start User Recommendation

... Abstract: How can we effectively recommend items to a user about whom we have no information? This is the problem we focus on in this paper, known as the cold-start problem. In most existing works, ...

10

Slow Fading Channel Selection: A Restless Multi-Armed Bandit Formulation

... I. INTRODUCTION Next generation of wireless networks is expected to be characterized by a high decentralization/distribution of control functions among nodes to support self-organizing and self-healing capabilities. ...

5

Combinatorial Multi-Armed Bandit and Its Extension to Probabilistically Triggered Arms

... We define a general framework for a large class of combinatorial multi-armed bandit (CMAB) problems, where subsets of base arms with unknown distributions form super arms. In each round, a super arm ...

33

Cooperation control in Parallel SAT Solving: a Multi-armed Bandit Approach

... Abstract: In recent years, Parallel SAT solvers have leveraged with the so called Parallel Portfolio architecture. In this setting, a collection of independent Conflict-Directed Clause Learning (CDCL) algorithms compete ...

19

Partially Observable Multi-Sensor Sequential Change Detection: A Combinatorial Multi-Armed Bandit Approach

... Observable Multi-sensor Sequential Change Detection (POMSCD), where only a subset of sensors can be observed to monitor a target system for change-point detection at each online learning ...traditional ...

8

Customer Acquisition via Display Advertising Using Multi-Armed Bandit Experiments

... percentage of impressions to allocate to each ad? This paper answers that question, resolving the well-known “learn-and-earn” trade-off using multi-armed bandit (MAB) methods. The online advertiser’s ...

69

Training a Quantum Neural Network to Solve the Contextual Multi Armed Bandit Problem

... contextual multi-armed bandit problem, a problem in the domain of reinforcement learning, which demonstrates that quantum reinforcement learning algorithms can be learned by a quantum ...

11
