Top PDF Monte Carlo Tree Search and Its Applications

Monte Carlo Tree Search and Its Applications

In order to address problems with larger search spaces, we must turn to alternative methods. Monte Carlo tree search (MCTS) has had a lot of success in Go and in other applications [2] [1]. MCTS eschews the typical brute-force tree searching methods and uses statistical sampling instead. This makes MCTS a probabilistic algorithm. As such, it will not always choose the best action, but it still performs reasonably well given sufficient time and memory. MCTS performs lightweight simulations that randomly select actions. These simulations are used to selectively grow a game tree over a large number of iterations. Because these simulations do not take long to perform, MCTS can explore search spaces quickly. This is what gives MCTS the advantage over deterministic methods in large search spaces.
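
The loop described above (select, expand, simulate with random moves, back up the result) can be sketched as follows. This is only an illustrative sketch, not the excerpted paper's implementation: the toy game (a single pile of stones, each player removes 1 to 3, whoever takes the last stone wins), the UCT selection rule with constant c = 1.4, and all names such as `Node` and `mcts` are assumptions chosen for brevity.

```python
import math
import random


class Node:
    """One state in the search tree of the assumed toy pile game."""

    def __init__(self, stones, player, parent=None, move=None):
        self.stones = stones      # stones left in the pile
        self.player = player      # player to move at this node (0 or 1)
        self.parent = parent
        self.move = move          # move that led from the parent to this node
        self.children = []
        self.untried = [m for m in (1, 2, 3) if m <= stones]
        self.visits = 0
        self.wins = 0.0           # wins for the player who moved INTO this node


def uct_select(node, c=1.4):
    # Balance exploitation (win rate) against exploration (rarely tried children).
    return max(
        node.children,
        key=lambda ch: ch.wins / ch.visits
        + c * math.sqrt(math.log(node.visits) / ch.visits),
    )


def rollout(stones, player, rng):
    # Lightweight simulation: uniformly random moves until the pile is empty.
    while True:
        stones -= rng.randint(1, min(3, stones))
        if stones == 0:
            return player         # this player took the last stone and wins
        player = 1 - player


def mcts(stones, iterations=5000, seed=0):
    rng = random.Random(seed)
    root = Node(stones, player=0)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend while the node is fully expanded.
        while not node.untried and node.children:
            node = uct_select(node)
        # 2. Expansion: add one new child if any move is still untried.
        if node.untried:
            move = node.untried.pop()
            child = Node(node.stones - move, 1 - node.player, node, move)
            node.children.append(child)
            node = child
        # 3. Simulation: random playout from the new node.
        if node.stones == 0:
            winner = 1 - node.player   # the previous mover emptied the pile
        else:
            winner = rollout(node.stones, node.player, rng)
        # 4. Backpropagation: update statistics along the path to the root.
        while node is not None:
            node.visits += 1
            if node.parent is not None and winner == node.parent.player:
                node.wins += 1
            node = node.parent
    # Recommend the most visited move at the root.
    return max(root.children, key=lambda ch: ch.visits).move
```

Calling `mcts(5)` asks for a move from a pile of five stones; in this toy game the theoretically best reply is to leave the opponent a multiple of four. The tree grows asymmetrically because selection keeps returning to moves whose sampled win rate looks promising.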

Prediction Distortion in Monte Carlo Tree Search and an Improved Algorithm

Teaching computer programs to play games through machine learning has been an important way to achieve better artificial intelligence (AI) in a variety of real-world applications. Monte Carlo Tree Search (MCTS) is one of the key AI techniques developed recently that enabled AlphaGo to defeat a legendary professional Go player. What makes MCTS particularly attractive is that it only understands the basic rules of the game and does not rely on expert-level knowledge. Researchers thus expect that MCTS can be applied to other complex AI problems where domain-specific expert-level knowledge is not yet available. So far there are very few analytic studies in the literature. In this paper, our goal is to develop analytic studies of MCTS to build a more fundamental understanding of the algorithms and their applicability in complex AI problems. We start with a simple version of MCTS, called random playout search (RPS), to play Tic-Tac-Toe, and find that RPS may fail to discover the correct moves even in a very simple game position of Tic-Tac-Toe. Both the probability analysis and simulation have confirmed our discovery. We continue our studies with the full version of MCTS to play Gomoku and find that while MCTS has shown great success in playing more sophisticated games like Go, it is not effective at addressing the problem of sudden death/win. The main reason that MCTS often fails to detect sudden death/win lies in the random playout search nature of MCTS, which leads to prediction distortion. Therefore, although MCTS in theory converges to the optimal minimax search, with real-world computational resource constraints, MCTS has to rely on RPS as an important step in its search process, therefore suffering from the same fundamental prediction distortion problem as RPS does. By examining the detailed statistics of the scores in MCTS, we investigate a variety of scenarios where MCTS fails to detect sudden death/win. Finally, we propose an improved MCTS algorithm by incorporating minimax search to overcome prediction distortion. Our simulation has confirmed the effectiveness of the proposed algorithm. We provide an estimate of the additional computational
How to cite this paper: Li, W. (2018)
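
To make the idea of random playout search concrete, here is a minimal sketch for Tic-Tac-Toe. It is not the paper's code: the scoring convention (+1 win, 0 draw, -1 loss, averaged over playouts), the board encoding as a list of nine cells, and the names `playout` and `rps_scores` are all assumptions made for illustration.

```python
import random

# The eight winning lines on a 3x3 board indexed 0..8, row by row.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if a line is complete, otherwise None."""
    for a, b, c in LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None


def playout(board, to_move, rng):
    """Finish the game with uniformly random moves; return the winner or None for a draw."""
    board = board[:]
    while True:
        w = winner(board)
        if w is not None:
            return w
        empty = [i for i, v in enumerate(board) if v == ' ']
        if not empty:
            return None
        board[rng.choice(empty)] = to_move
        to_move = 'O' if to_move == 'X' else 'X'


def rps_scores(board, player, n_playouts=2000, seed=0):
    """Score every legal move for `player` by the average result of random playouts."""
    rng = random.Random(seed)
    opponent = 'O' if player == 'X' else 'X'
    scores = {}
    for move in [i for i, v in enumerate(board) if v == ' ']:
        nxt = board[:]
        nxt[move] = player
        total = 0
        for _ in range(n_playouts):
            w = playout(nxt, opponent, rng)
            total += 1 if w == player else -1 if w == opponent else 0
        scores[move] = total / n_playouts
    return scores
```

Because every continuation is sampled uniformly, a move's score reflects how it fares against random play rather than against the opponent's best reply. That gap between average-case and worst-case value is one plausible way to read the prediction distortion the abstract describes.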

Evaluation of Simulation Strategy on Single-Player Monte-Carlo Tree Search and its Discussion for a Practical Scheduling Problem

Bubble Breaker. From the results, it is thought that the exact number of nodes might depend on the characteristics of the problem. This paper also considers that combination with a pruning algorithm like beam search will be important to obtain more efficient results quickly. This paper considers that SP-MCTS might be a good match for practical scheduling problems, especially reentrant scheduling problems [8]. We have focused on the improvement of a printing process as a practical scheduling problem. In the printing process, dial plates used for car tachometers are printed with various colors and character plates. The production lead time can be shortened by grouping the products printed with the same type of color or character plate. When the type of color or character plate is switched to another type, the process requires a “setup operation” with production idle time. So the problem can be formulated by the mini-

Human like Natural Language Generation Using Monte Carlo Tree Search

However, this problem is intrinsically difficult because it is hard to encode what to say into a sentence while ensuring its syntactic correctness. We propose to use Monte Carlo tree search (MCTS) (Kocsis and Szepesvari, 2006; Browne et al., 2012), a stochastic search algorithm for decision processes, to find an optimal solution in the decision space. We build a search tree of possible syntactic trees to generate a sentence, by selecting proper rules through numerous random simulations of possible yields.

Monte Carlo Tree Search in Finding Feasible Solutions for Course Timetabling Problem

Abstract: We address the course timetabling problem in this work. In a university, students can select their favorite courses each semester. Thus, the general requirement is to allow them to attend lectures without clashing with other lectures. A feasible solution is a solution where this and other conditions are satisfied. Constructing reasonable solutions for the course timetabling problem is a hard task. Most of the existing methods failed to generate reasonable solutions for all cases. This is because the problem is heavily constrained and an effective method is required to explore and exploit the search space. We utilize Monte Carlo Tree Search (MCTS) in finding feasible solutions for the first time. In MCTS, we build a tree incrementally in an asymmetric manner by sampling the decision space. It is traversed in a best-first manner. We propose several enhancements to MCTS, such as simulation and tree pruning based on a heuristic. The performance of MCTS is compared with methods based on graph coloring heuristics and Tabu search. We test the solution methodologies on the three most studied publicly available datasets. Overall, MCTS performs better than the method based on graph coloring heuristics; however, it is inferior to the Tabu-based method. Experimental results are discussed.

Ensemble Determinization in Monte Carlo Tree Search for the Imperfect Information Card Game Magic: The Gathering

A recent study [44] has investigated why PIMC search gives strong results despite its theoretical limitations. By examining the particular qualities of imperfect information games and creating artificial test environments that highlighted these qualities, Long et al. [44] were able to show that the potential effectiveness of a PIMC approach was highly dependent on the presence or absence of certain features in the game. They identified three features: leaf correlation, bias, and disambiguation. Leaf correlation measures the probability that all sibling terminal nodes in a tree have the same payoff value; bias measures the probability that the game will favour one of the players; and disambiguation refers to how quickly hidden information is revealed in the course of the game. The study found that PIMC performs poorly in games where the leaf correlation is low, although it is arguable that most sample-based approaches will fail in this case. PIMC also performed poorly when disambiguation was either very high or very low. The effect of bias was small in the examples considered and largely dependent on the leaf correlation value. This correlates well with the observed performance in actual games, with PIMC performing well in trick-taking games such as Bridge [21] and Skat [18], where information is revealed progressively as each trick is played so that the disambiguation factor has a moderate value. The low likelihood of the outcome of the game hinging on the last trick also means that leaf correlation is fairly high.

Learning to Bootstrap for Entity Set Expansion

Bootstrapping for Entity Set Expansion (ESE) aims at iteratively acquiring new instances of a specific target category. Traditional bootstrapping methods often suffer from two problems: 1) delayed feedback, i.e., the pattern evaluation relies on both its direct extraction quality and the extraction quality in later iterations; 2) sparse supervision, i.e., only a few seed entities are used as the supervision. To address the above two problems, we propose a novel bootstrapping method combining the Monte Carlo Tree Search (MCTS) algorithm with a deep similarity network, which can efficiently estimate delayed feedback for pattern evaluation and adaptively score entities given sparse supervision signals. Experimental results confirm the effectiveness of the proposed method.

Confidence bands for Brownian motion and applications to Monte Carlo simulation

Carlo in March 2004, thus giving him the opportunity to collaborate with the first author, and to George Casella for discussions about a preliminary perspective on that problem that eventually led to Chapter 4 in Robert and Casella (2004). Feedback from the audience of the ISBA 2004 poster session in Viña del Mar, Chile, is also gratefully acknowledged. A conversation with Marc Yor in the premises of this research was equally helpful. Finally, comments from two anonymous referees were very helpful in improving the clarity of the paper.

A Monte Carlo Tree Search Player for Birds of a Feather Solitaire

Artificial intelligence in games serves as an excellent platform for facilitating collaborative research with undergraduates. This paper explores several aspects of a research challenge proposed for a newly-developed variant of a solitaire game. We present multiple classes of game states that can be identified as solvable or unsolvable. We present a heuristic for quickly finding goal states in a game state search tree. Finally, we introduce a Monte Carlo Tree Search-based player for the solitaire variant that can win almost any solvable starting deal efficiently.

Information Set Monte Carlo Tree Search

There is some asymmetry between the two players in terms of the relative strengths of the algorithms. For player 1, SO-ISMCTS and MO-ISMCTS are on a par while SO-ISMCTS + POM underperforms. For player 2, SO-ISMCTS is outperformed by SO-ISMCTS + POM which is in turn outperformed by MO-ISMCTS. The three algorithms differ mainly in the assumptions they make about future play. SO-ISMCTS assumes that all actions are fully observable, which is both optimistic (I can respond optimally to my opponent’s actions) and pessimistic (my opponent can respond optimally to my actions). SO-ISMCTS hence suffers from strategy fusion, since it is assumed the agent can act differently depending on information it cannot observe. In a phantom game, SO-ISMCTS + POM optimistically assumes that the opponent plays randomly. MO-ISMCTS’s opponent model is more realistic: each opponent action has its own statistics in the opponent tree and so the decision process is properly modeled, but whichever action is selected leads to the same node in the player’s own tree, thus preventing the player from tailoring its response to the selected action. This addresses the strategy fusion problem which affects SO-ISMCTS.

Machine Learning Based Heuristic Search Algorithms to Solve Birds of a Feather Card Game

moves to be organized into a decision tree, which can be traversed with various types of algorithms. Depth-first search (DFS) and breadth-first search (BFS) are two rudimentary approaches to tree traversal that are straightforward to implement and can solve the game if possible. However, there is still room left for improving their performance using auxiliary algorithms. As part of the challenge proposed by Neller, teams were asked to explore potential heuristics to guide the performance of these graph algorithms. This compelled our team to delve into more intelligent solutions such as heuristic-based traversal algorithms. There are several features that can be extracted from any state of a BoF game to be used directly as a heuristic to guide the traversal. This abundance of applicable features facilitated a machine learning approach: suitable features can be used as an input, and the output can be applied as a heuristic, which allows the traversal to be directed by multiple features rather than just one. The team compared the provided depth-first search to heuristic algorithms such as Monte Carlo tree search (MCTS), as well as a novel heuristic search algorithm guided by machine learning.

Exact Simulation of Jump Diffusion Processes with Monte Carlo Applications

6.3. Results. We have compared the Monte Carlo estimator generated by the Jump Exact Algorithm (say E1) with the estimators generated by the two Euler schemes (40) and (41) (say respectively E2 and E3) for the discrete average (or Asian option) functional and the maximum (or lookback option) functional. For the maximum, E3 employs the continuous version (42) of the Euler scheme. To make our comparison, we require that the Monte Carlo 95%-confidence intervals generated by E2 and E3 have non-empty intersection with the corresponding interval generated by E1. This seems a very reasonable (although somewhat arbitrary) accuracy criterion. Figures 3-4 and Table 1 summarize the relevant results from a Monte Carlo numerical test for a selection of values of the two parameters l and β affecting the jump component. In order to ensure negligible Monte Carlo error we have chosen a Monte Carlo sample size of 5 × 10^5 iterations. In Figures 3-4 we have plotted the Monte Carlo estimates of E1 and E3 vs. the number of discretisation steps (per time unit)
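
The acceptance criterion described above, overlapping 95% confidence intervals, can be illustrated with a short sketch. It assumes the usual normal-approximation interval (mean ± 1.96 standard errors); the excerpt does not say how the intervals were built, and the function names are hypothetical.

```python
import math
import statistics


def mc_confidence_interval(samples, z=1.96):
    """Normal-approximation confidence interval for the mean of Monte Carlo samples."""
    n = len(samples)
    mean = statistics.fmean(samples)
    half_width = z * statistics.stdev(samples) / math.sqrt(n)
    return mean - half_width, mean + half_width


def intervals_overlap(a, b):
    """True if closed intervals a = (lo, hi) and b = (lo, hi) have non-empty intersection."""
    return max(a[0], b[0]) <= min(a[1], b[1])
```

Under this reading, a discretised estimator would be judged accurate enough when `intervals_overlap(ci_exact, ci_euler)` is true, where the two arguments are the intervals computed from each estimator's samples.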

Chiral 2N and 3N interactions and quantum Monte Carlo applications

Abstract. Chiral Effective Field Theory (EFT) two- and three-nucleon forces are now widely employed. Since they were originally formulated in momentum space, these interactions were non-local, making them inaccessible to Quantum Monte Carlo (QMC) methods. We have recently derived a local version of chiral EFT nucleon-nucleon and three-nucleon interactions, which we also used in QMC calculations for neutron matter and light nuclei. In this contribution I go over the basics of local chiral EFT and then summarize recent results.

Search for Heavy Majorana Neutrinos at LHC Using Monte Carlo Simulation

Heavy neutrinos can be discovered at the LHC. Many extensions of the Standard Model predict the existence of a new neutrino which has a mass at high energies. The B-L model is one of them; it predicts the existence of three heavy (right-handed) neutrinos, one per generation, a new massive gauge boson and a new scalar Higgs boson which is different from the SM Higgs. In the present work we search for heavy neutrinos in 4 leptons + missing energy final state events which are produced in proton-proton collisions at the LHC, using data produced from Monte Carlo simulation of the B-L model at different center-of-mass energies. We predict that the heavy neutrinos pairs can be produced fr Z B-L new gauge

Bayesian Inference on Gravitational Waves

Let’s proceed with our example in which we take , and as our true signal parameters and . Data is generated for a duration of unit time (say one second) over discrete time points with an interval of . The signal, noise and signal plus noise (data) are plotted against time in Figure 5, which is self-explanatory. A Bayesian MCMC strategy was set up to extract the signal back from the data with prior information as , and . The results are given in Figure 6. We can see that the signal parameters are well spotted in just under 2000 iterations and thus the signal can be easily reconstructed. Now let’s tamper with the noise’s amplitude by increasing the value of from 1.0 to 2.5 and see the consequences. We keep the signal parameters unaltered. The resulting MCMC search history is given in Figure [TracePlotsK3]. It is easy to conclude that, for the same target signal, if the noise amplitude is increased the MCMC will still find the true signal, although it takes a bit longer. One can imagine the case when the noise amplitude is several orders of magnitude larger than that of the target signal, as it usually will be because of other sources of noise.

Monte Carlo methods

If the choice of q is not obvious, we recommend the use of an adaptive strategy, such as population Monte Carlo. A description of population MC and an application to model selection in cosmology can be found in [24]. Basically, first make a wild guess q^(0) for q, say a Gaussian with a large variance. Apply importance sampling a first time to obtain an estimate of π and fit a Gaussian q^(1) to this estimate of π. Now re-apply importance sampling with q^(1) as a proposal, and re-fit a new Gaussian q^(2) to π, etc. After T iterations, q^(T) should be a good proposal distribution for importance sampling. Of course, you can apply this procedure with candidate proposals other than Gaussians; indeed, you should choose a family of distributions among which you think you may find a good approximation of π. If you have reasons to believe that π is bimodal, for example, you should probably fit a mixture of two distributions as in [24] rather than a Gaussian, which is unimodal. Usually, with the right choice of family of distributions, a few iterations are enough to get a reasonable q, and you can stop when q^(t) does not change a lot with t.
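
The adaptive procedure just described can be sketched in one dimension as follows. This is a minimal illustration under stated assumptions, not the implementation from [24]: the target `log_target` (an unnormalized Gaussian with mean 3 and standard deviation 0.5) is invented for the example, the Gaussian is re-fitted by weighted moment matching, and the number of iterations is fixed rather than chosen by a stopping rule.

```python
import math
import random


def log_target(x):
    # Hypothetical unnormalized log-density of pi: Gaussian, mean 3, std 0.5.
    return -0.5 * ((x - 3.0) / 0.5) ** 2


def log_gauss(x, mu, sigma):
    # Log-density of the Gaussian proposal q with mean mu and std sigma.
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))


def population_monte_carlo(log_pi, mu=0.0, sigma=10.0, n=5000, iters=5, seed=0):
    """Start from a wide Gaussian q^(0) and repeatedly re-fit it to the weighted sample."""
    rng = random.Random(seed)
    for _ in range(iters):
        # Importance sampling with the current proposal q^(t).
        xs = [rng.gauss(mu, sigma) for _ in range(n)]
        logw = [log_pi(x) - log_gauss(x, mu, sigma) for x in xs]
        # Self-normalize the weights; subtracting the max avoids overflow.
        m = max(logw)
        w = [math.exp(lw - m) for lw in logw]
        total = sum(w)
        w = [wi / total for wi in w]
        # Fit the next Gaussian q^(t+1) by weighted mean and variance.
        mu = sum(wi * x for wi, x in zip(w, xs))
        var = sum(wi * (x - mu) ** 2 for wi, x in zip(w, xs))
        sigma = max(math.sqrt(var), 1e-6)  # guard against a collapsed proposal
    return mu, sigma
```

Calling `population_monte_carlo(log_target)` returns the mean and standard deviation of the final proposal. For a bimodal π, the moment-matching step would need to be replaced by fitting a mixture, as the text suggests.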

Criticality Analysis and Quality Appraisal of Innoson Injection Mould System

There are various methods of data collection, but for this work, data were personally obtained from the production and maintenance manager at Innoson Plastic Industries, Enugu State. Appendix 1 shows the raw data for reliability and failure rate; Table 2 shows downtime, while Table 3 displays defective production of the five major components in the injection moulding machine, for a period of ten (10) years. Reliable data is needed to build strong reliability, and injection moulding machines are no exception. In analyzing the reliability and corrective maintenance of the injection moulding machine, the Monte Carlo Normal Distribution Simulator and the Obudulu model were used for the work. These software tools employ tables, graphs, standard formulas and models as an exploratory method intended to discover what the data

Applications of the Variational Quantum Monte Carlo Method to the Two-Electron Atoms

S. B. Doma et al. [30] used the VMC method to calculate the ground state of the helium atom as well as the ground state of the hydrogen negative ion in the presence of a magnetic field between 0 a.u. and 10 a.u. They used two types of compact and accurate trial wave functions. The results are in good agreement with the exact values. Also, Doma and El-Gammal [31] presented a study of helium, Li+ and Be2+ ions under the compression effect of a spherical box. They used an optimized wave function with five variational parameters. Furthermore, they investigated the total energies of the excited states of the helium atom in a strong magnetic field, taking into account the point of transition from the ground state to the excited states [32]. Moreover, S. B. Doma et al. [33] used the VMC technique to study the lithium atom and its like ions up to = 10 in the presence of a magnetic field. The calculations for the ground and some excited states were performed for magnetic field strengths ranging from zero up to 100 a.u. Most of their work depended on the Jastrow correlation wave function to account for the electron-electron interaction.

Generalizations of the Multivariate Logistic Distribution with Applications to Monte Carlo Importance Sampling

These importance samplers should provide reasonable estimates of E[t]. For comparison, the MVN with c = 1 was also used as an importance sampler. This distribution should not perform as well as the other three since an evaluation of the weights it produced indicated that its tails were not as thick as those of the posterior distribution, and hence asymptotic normality of these estimates cannot be assumed. Table 4.6 displays the estimates of E[t] (the posterior means) and their corresponding standard errors. Since the posterior distribution was unnormalized, the estimation of the normalization constant has to be taken into account when calculating the standard errors. If T(i) is the i-th vector generated from the importance sampler, g,

Quantum Monte Carlo for Transition Metal Systems: Method Developments and Applications

Both LDA and GGA suffer from the self-interaction problem, where each electron interacts with itself. This is seen in the second term in Eqn 2.7, where the electron density interacts with itself. This is valid within classical electrostatics, but not in quantum mechanics. There have been methods to fix this, including LDA+U [30] and the self-interaction correction [31], although there is some arbitrariness in their application. A large collaboration [32] recently tested various corrections to LDA on the MnO solid, and found that they show rather large variation. It is thus difficult to know which of the corrections, if any, are appropriate. Another method to improve the DFT functional is to add some percentage of the Hartree-Fock energy to the functional. There is no theoretical way to know the correct percentage, however, so it is typically fitted to experiment, as in B3LYP, which uses the LYP functional in the 3-parameter fit by Becke [33]. While theoretically not a first-principles method, B3LYP is heavily used in the chemistry community as a very accurate semi-empirical model. Unfortunately, for solids, B3LYP often performs much worse than PBE. There are also the new meta-GGAs, which add a dependence on the second derivative of the density. Of these, TPSS [34, 35] and its semi-empirical hybrid [36] are claimed to be quite accurate, and they will be compared to QMC in Chapter 7.
