Monte Carlo Tree Search (MCTS) is a directed search technique that has gained prominence in recent years and has been used with success in several types of games, such as Go (Silver et al. 2016) and Kriegspiel (Ciancarini and Favini 2009). The basic algorithm iteratively constructs a search tree until some computational limit is reached (Browne et al. 2012). Four steps are performed during each iteration of MCTS: selection, expansion, simulation, and backpropagation. Figure 2 shows the structure of each MCTS phase and the associated search tree.
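The four phases can be sketched as a minimal UCT-style loop. The following is an illustrative sketch on a toy take-away game; the game, node layout, iteration budget, and exploration constant are our own assumptions for illustration, not details from the paper:

```python
import math
import random

class Node:
    """One state in the search tree."""
    def __init__(self, state, parent=None, move=None):
        self.state, self.parent, self.move = state, parent, move
        self.children, self.visits, self.wins = [], 0, 0.0

# Toy game: players alternately remove 1 or 2 tokens from a pile;
# whoever takes the last token wins. state = (tokens_left, player_to_move).
def moves(state):
    return [m for m in (1, 2) if m <= state[0]]

def apply_move(state, m):
    return (state[0] - m, 1 - state[1])

def mcts(root_state, iterations=2000, c=1.4):
    root = Node(root_state)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend via UCB1 while the node is fully expanded.
        while node.children and len(node.children) == len(moves(node.state)):
            node = max(node.children, key=lambda ch: ch.wins / ch.visits
                       + c * math.sqrt(math.log(node.visits) / ch.visits))
        # 2. Expansion: add one untried child if the state is non-terminal.
        untried = [m for m in moves(node.state)
                   if m not in [ch.move for ch in node.children]]
        if untried:
            m = random.choice(untried)
            child = Node(apply_move(node.state, m), node, m)
            node.children.append(child)
            node = child
        # 3. Simulation: random playout to a terminal state.
        state = node.state
        while moves(state):
            state = apply_move(state, random.choice(moves(state)))
        winner = 1 - state[1]  # the player who made the last move wins
        # 4. Backpropagation: update statistics along the path to the root.
        # A node's wins are counted for the player who moved into it.
        while node:
            node.visits += 1
            node.wins += 1.0 if winner != node.state[1] else 0.0
            node = node.parent
    # Recommend the most-visited root move.
    return max(root.children, key=lambda ch: ch.visits).move
```

With a pile of 4, taking 1 token leaves the opponent in a losing position (a multiple of 3), and the search converges on that move.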
The central theme of this paper is the use of multiple determinized trees as a means of dealing with imperfect information in an MCTS search. We have shown that this approach provides significant benefits in playing strength, becoming competitive with a sophisticated expert rules player with a simulation budget of less than one CPU second on standard hardware, despite having no access to expert knowledge. In addition, we have presented a wide variety of enhancements to the determinized trees and analysed the effect each enhancement has on playing strength; all of them yield further improvement. We investigated a modification of the decision tree structure to a binary tree, well suited to M:TG, where decisions amount to the choice of a subset of cards from a small set rather than an individual card. As well as providing significant improvements in playing strength, the binary tree representation substantially reduced CPU time per move. Dominated move pruning used limited domain knowledge, of a type applicable to a wide variety of games involving subset choice, to significantly reduce the branching factor within the tree. Another promising approach maintained pressure on the Monte Carlo Tree Search algorithm by choosing "interesting" determinizations that were balanced between the two players. An enhancement that used decaying reward to encourage delaying moves when behind had some positive effect, but was not as effective as the preceding three enhancements.
Teaching computer programs to play games through learning has been an important way to achieve better artificial intelligence (AI) in a variety of real-world applications. In 1997 a computer program called IBM Deep Blue defeated Garry Kasparov, the reigning world chess champion. The principal idea of Deep Blue is to generate a search tree and to evaluate game positions by applying expert knowledge of chess. A search tree consists of nodes representing game positions and directed links, each connecting a parent and a child node, where the game position changes from the parent node to the child node after a move is taken. The search tree lists the computer's moves, the opponent's responses, the computer's next-round responses, and so on. Superior computational power allows the computer to accumulate vast expert knowledge and build deep search trees, thereby greatly exceeding the calculation ability of any human being. However, the same principle cannot be easily extended to Go, an ancient game still very popular in East Asia, because the complexity of Go is much higher: the board is much larger, games usually last longer, and, more importantly, it is harder to evaluate game positions. In fact, research progress had been so slow that even an early 2016 Wikipedia article noted that "Thus, it is very unlikely that it will be possible to program a reasonably fast algorithm for playing the Go endgame flawlessly, let alone the whole Go game". Still, to many people's surprise, in March 2016 AlphaGo, a computer program created by Google's DeepMind, won a five-game competition 4-1 over Lee Sedol, a legendary professional Go player. An updated version of AlphaGo named "Master" defeated many of the world's top players with an astonishing record of 60 wins and 0 losses over seven days in December 2016. AlphaGo accomplished this milestone because it used a very different set of AI techniques from what Deep Blue had used.
Monte Carlo Tree Search (MCTS) is one of the two key techniques (the other is deep learning).
In recent years there has been much interest in the Monte Carlo Tree Search (MCTS) algorithm. Introduced in 2006, it is an adaptive, randomized optimization algorithm [Cou06, KS06]. In fields as diverse as Artificial Intelligence, Operations Research, and High Energy Physics, research has established that MCTS can find valuable approximate answers without domain-dependent heuristics [KPVvdH13]. The strength of the MCTS algorithm is that it provides answers with a random amount of error for any fixed computational budget [GBC16]. Much effort has been put into the development of parallel algorithms for MCTS to reduce its running time. These efforts span a broad spectrum of parallel systems, ranging from small shared-memory multi-core machines to large distributed-memory clusters. Most recently, parallel MCTS played a major role in the success of AI by defeating humans in the game of Go [SHM+16, HS17].
Abstract—Monte Carlo Tree Search (MCTS) is a best-first search in which pseudorandom simulations guide the solution of the problem. Recent improvements to MCTS have produced strong computer Go programs despite the game's large search space, and this success has made MCTS a hot topic for selecting the best move. So far, most reports about MCTS have concerned two-player games, and MCTS has rarely been applied to one-player games. MCTS does not need an admissible heuristic, so applying it to one-player games might be an interesting alternative. Additionally, one-player games whose situation changes only through the player's decisions, such as puzzles, can be described as network diagrams such as PERT, representing the interdependences between operations. Therefore, if MCTS for one-player games is developed as a metaheuristic algorithm, it could be used not only for many practical problems but also for combinatorial optimization problems. This paper investigated the application of Single-Player MCTS (SP-MCTS), introduced by Schadd et al., to a puzzle game called Bubble Breaker. This paper then showed the effectiveness of new simulation strategies for SP-MCTS through numerical experiments, and identified the differences between the search methods and their parameters. Based on the results, this paper discussed the potential of applying SP-MCTS to a practical scheduling problem.
However, this problem is intrinsically difficult because it is hard to encode what to say into a sentence while ensuring its syntactic correctness. We propose to use Monte Carlo Tree Search (MCTS) (Kocsis and Szepesvari, 2006; Browne et al., 2012), a stochastic search algorithm for decision processes, to find an optimal solution in the decision space. We build a search tree of possible syntactic trees to generate a sentence, selecting appropriate rules through numerous random simulations of possible yields.
In order to address problems with larger search spaces, we must turn to alternative methods. Monte Carlo Tree Search (MCTS) has had a lot of success in Go and in other applications. MCTS eschews typical brute-force tree-searching methods and utilizes statistical sampling instead. This makes MCTS a probabilistic algorithm. As such, it will not always choose the best action, but it still performs reasonably well given sufficient time and memory. MCTS performs lightweight simulations that randomly select actions. These simulations are used to selectively grow a game tree over a large number of iterations. Since these simulations are quick to perform, MCTS can explore search spaces rapidly. This is what gives MCTS the advantage over deterministic methods in large search spaces.
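The statistical-sampling idea on its own, before any tree is grown, can be illustrated by flat Monte Carlo evaluation: score each root move purely by averaging the outcomes of random playouts. The toy game and function names below are our own illustrative assumptions, not from this paper:

```python
import random

# Toy game: players alternately remove 1 or 2 tokens from a pile;
# whoever takes the last token wins.

def playout(tokens, player):
    """Play uniformly random moves until the pile is empty;
    return the winning player (the one who made the last move)."""
    while tokens > 0:
        tokens -= random.choice([m for m in (1, 2) if m <= tokens])
        player = 1 - player
    return 1 - player

def best_move(tokens, player, n_playouts=1000):
    """Flat Monte Carlo: estimate each root move's win rate by sampling."""
    scores = {}
    for move in (1, 2):
        if move > tokens:
            continue
        wins = sum(playout(tokens - move, 1 - player) == player
                   for _ in range(n_playouts))
        scores[move] = wins / n_playouts
    return max(scores, key=scores.get)
```

Each playout is cheap, so even a few thousand samples give usable win-rate estimates; MCTS improves on this scheme by reusing the statistics to grow a selective tree rather than sampling every move uniformly.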
Abstract—We address the course timetabling problem in this work. In a university, students can select their favorite courses each semester. Thus, the general requirement is to allow them to attend lectures without clashes between lectures. A feasible solution is one where this and other conditions are satisfied. Constructing reasonable solutions for the course timetabling problem is a hard task. Most of the existing methods fail to generate reasonable solutions for all cases. This is because the problem is heavily constrained, and an effective method is required to explore and exploit the search space. We are the first to utilize Monte Carlo Tree Search (MCTS) to find feasible solutions for this problem. In MCTS, we build a tree incrementally in an asymmetric manner by sampling the decision space, traversing it in a best-first manner. We propose several enhancements to MCTS, such as heuristic-based simulation and tree pruning. The performance of MCTS is compared with methods based on graph coloring heuristics and Tabu search. We test the solution methodologies on the three most studied publicly available datasets. Overall, MCTS performs better than the method based on graph coloring heuristics; however, it is inferior to the Tabu-based method. Experimental results are discussed.
Rock-Paper-Scissors and Poker are two examples where this is clearly the case, where mixing strategies is important to achieving a strong level of play. The algorithms described in this paper are not designed explicitly to seek mixed policies, but they often do so anyway, in the sense of choosing different actions when presented with the same state. This arises from the random nature of Monte Carlo simulation: the MCTS algorithm is not deterministic. Shafiei et al. demonstrate that the UCT algorithm finds a mixed policy for Rock-Paper-Scissors, and our preliminary investigations suggest that ISMCTS finds mixed policies for the small, solved game of Kuhn Poker. However, MCTS often fails to find optimal (Nash) policies for games of imperfect information. Ponsen et al. suggest that algorithms such as Monte Carlo counterfactual regret (MCCFR) are a better fit for approximating Nash equilibria in games whose trees contain millions of nodes, whereas the strength of an MCTS approach lies in finding a strong suboptimal policy in reasonable computation time for complex games with combinatorially large trees.
Bootstrapping for Entity Set Expansion (ESE) aims at iteratively acquiring new instances of a specific target category. Traditional bootstrapping methods often suffer from two problems: 1) delayed feedback, i.e., pattern evaluation relies on both the pattern's direct extraction quality and the extraction quality in later iterations; 2) sparse supervision, i.e., only a few seed entities are used as supervision. To address these two problems, we propose a novel bootstrapping method combining the Monte Carlo Tree Search (MCTS) algorithm with a deep similarity network, which can efficiently estimate delayed feedback for pattern evaluation and adaptively score entities given sparse supervision signals. Experimental results confirm the effectiveness of the proposed method.
In this work, we have shown that Monte Carlo Tree Search combined with deep neural network policies and an in-scope filter can be used to effectively perform chemical synthesis planning. One advantage of our approach is that it can provide full retrosynthetic pathways within seconds. In contrast to earlier approaches, our purely data-driven ansatz can be initially set up within a few days without the need for tedious and biased expert encoding or curation, and is equally applicable to both in-house and discipline-scale datasets. We have demonstrated that our 3N-MCTS approach has the best performance characteristics in terms of speed and number of solved problems compared with established search methods. In the past, retrosynthetic systems have been criticized for producing more noise than signal. Here, we have shown by double-blind AB testing that organic chemists did not generally prefer literature routes over routes found by 3N-MCTS. We observed that heuristic best-first search without neural network guidance led to many unreasonable steps being proposed in the routes, while the 3N-MCTS approach proposed more reasonable routes. This is supported by the double-blind AB experiments, in which the participating organic chemists showed a clear preference for 3N-MCTS over the traditional approach.
moves to be organized into a decision tree, which can be traversed with various types of algorithms. Depth-first search (DFS) and breadth-first search (BFS) are two rudimentary approaches to tree traversal that are straightforward to implement and can solve the game if a solution exists. However, there is still room for improving their performance using auxiliary algorithms. As part of the challenge proposed by Neller, teams were asked to explore potential heuristics to guide the performance of these graph algorithms. This compelled our team to delve into more intelligent solutions such as heuristic-based traversal algorithms. There are several features that can be extracted from any state of a BoF game to be used directly as a heuristic to guide the traversal. This abundance of applicable features facilitated a machine learning approach: suitable features can be used as input, and the output can be applied as a heuristic, which allows the traversal to be directed by multiple features rather than just one. The team compared the provided depth-first search to heuristic algorithms such as Monte Carlo Tree Search (MCTS), as well as a novel heuristic search algorithm guided by machine learning.
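Heuristic-guided traversal of this kind can be sketched generically: a best-first search that takes any scoring function as its guide, whether a single hand-crafted feature or the output of a learned model over many features. This is an illustrative sketch, not the team's actual BoF implementation:

```python
import heapq

def best_first_search(start, successors, is_goal, heuristic):
    """Expand the most promising state first. `heuristic` maps a state to a
    score (lower is better); it may be one extracted feature or a learned
    model's prediction combining several features."""
    frontier = [(heuristic(start), start)]
    seen = {start}
    while frontier:
        _, state = heapq.heappop(frontier)  # lowest heuristic value first
        if is_goal(state):
            return state
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(frontier, (heuristic(nxt), nxt))
    return None  # search space exhausted without reaching a goal
```

For example, counting down from 13 to 0 by steps of 1 or 3, with the distance to zero as the heuristic, the search heads straight for the goal instead of exploring the tree level by level as BFS would.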
The paper is organised as follows. In section 2, the rules of tarok and basic approaches to a tarok-playing program are presented. Section 3 gives an overview of Silicon Tarokist. Section 4 describes its game-tree search algorithm, an advanced version of alpha-beta search. In section 5, Monte Carlo sampling is described, which is used to deal with imperfect information. Section 6 discusses the performance of the search algorithm and presents the results of testing against human players. Section 7 concludes the paper and lists some ideas that might produce an even better tarok-playing program.
We presented a parser which analyzes new input by probabilistically combining fragments from LFG-annotated corpora into new analyses. We have seen that parse accuracy increased with increasing fragment size, and that LFG's functional structures contribute to significantly higher parse accuracy on tree structures. We tested two search techniques for the most probable analysis: Viterbi n-best and Monte Carlo. While these two techniques achieved about the same accuracy, Viterbi n-best was about 100 times faster than Monte Carlo.
The idea of updating a tree by adding leaves dates back to at least Felsenstein (1981), who describes, for maximum likelihood estimation, that an effective search strategy in tree space is to add species one by one. More recent work also makes use of the idea of adding sequences one at a time: ARGWeaver (Rasmussen et al. 2014) uses this approach to initialise MCMC on t + 1 sequences (in this case, on a space of graphs) using the output of MCMC on t sequences, and TreeMix (Pickrell and Pritchard 2012) uses a similar idea in a greedy algorithm. In work conducted simultaneously with our own, Dinh et al. (2018) also propose a sequential Monte Carlo approach to inferring phylogenies in which the sequence of distributions is given by introducing sequences one by one. However, their approach uses different proposal distributions for new sequences; does not infer the mutation rate simultaneously with the tree; does not exploit intermediate distributions to reduce the variance; and does not use adaptive MCMC moves. Further investigation of their approach can be found in Fourment et al. (2018), where different guided proposal distributions are explored but the aforementioned limitations remain.
Integrating the polarised differential cross-sections over cos θ_W gives the total polarised cross-section. The total polarised cross-section divided by the total cross-section gives the fraction of each polarisation state. Calculating these numbers from generator-level Monte Carlo and from the corrected polarised cross-sections gives a quantitative check of the detector correction. Figure 8.6 shows the calculated fractions from generator-level Monte Carlo and also fully simulated Monte Carlo, both before and after the detector correction. Shown are the results for a number of Standard Model Monte Carlo samples and also some non-Standard Model samples. It is obvious that the detector simulation has a large effect on the measured polarised fraction. In all cases, for both Standard and non-Standard Model Monte Carlo, the detector correction gives results that are within one standard deviation of the true polarised fractions. It is interesting to note that for all Monte Carlo samples the detector simulation has a similar effect: the fraction of TT pairs is decreased and thus the fraction of other polarised W-pairs is enhanced. This is because the detector simulation has the largest effect in the high cos θ_W region, which is where most of the TT W-pairs are found, as can be seen for example in figure 8.5.
optimal error estimates for the resulting semi-discrete scheme, which then provide corresponding error estimates for expectation values and Monte-Carlo approximations. The application of efficient solution techniques, such as adaptivity, multigrid methods, and multilevel Monte-Carlo techniques [3, 9, 10], is very promising but beyond the scope of this paper. In our numerical experiments we investigate a corresponding fully discrete scheme based on an implicit Euler method and observe optimal convergence rates.
Considerable work [3–9] has investigated and empirically modeled path loss in forests. These empirical models might be applicable to agricultural fruit orchards because orchards are also dominated by trees with a single trunk. However, the special characteristics of fruit orchards that differ from forest environments might limit the applicability of such models. These characteristics include the periodic, grid-like distribution of the trees in the orchard, as well as the leaf density, canopy size, and trunk size, which are similar across the trees since they are of the same age of growth. In forest environments, on the other hand, the tree distribution in space, leaf density, canopy size, and trunk size are random. Owing to this randomness, the same empirical model may be used to predict the path loss in any angular direction in a forest, but the same angular path loss model may not be applicable to the grid planting pattern of fruit orchards.
We think that jackknife standard errors should be routinely used in reporting Monte Carlo results. They are much easier to obtain than those from the standard delta-method approach, especially when using the Monte Carlo descriptive ratio statistics discussed in Section 3. The F-statistic studied in Section 3, based on individual estimators and their jackknife variance estimate in each sample, is a potentially useful method for testing equality of parameters from k populations.
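For concreteness, a jackknife standard error for a Monte Carlo estimate can be computed by recomputing the statistic on each leave-one-out sample. This is a generic sketch, not the paper's code; the uniform-squared example and sample size are our own assumptions:

```python
import math
import random

def jackknife_se(xs, stat):
    """Jackknife standard error of statistic `stat` over samples `xs`."""
    n = len(xs)
    # Leave-one-out replicates of the statistic.
    theta_i = [stat(xs[:i] + xs[i + 1:]) for i in range(n)]
    theta_bar = sum(theta_i) / n
    # Jackknife variance: (n-1)/n times the sum of squared deviations.
    var = (n - 1) / n * sum((t - theta_bar) ** 2 for t in theta_i)
    return math.sqrt(var)

# Monte Carlo estimate of E[X^2] for X ~ Uniform(0, 1) (true value 1/3),
# with a jackknife standard error attached to the estimate.
random.seed(42)
draws = [random.random() ** 2 for _ in range(500)]
mean = lambda v: sum(v) / len(v)
estimate, se = mean(draws), jackknife_se(draws, mean)
```

For the sample mean, the jackknife variance reduces algebraically to the familiar s²/n, so the two agree exactly; the advantage of the jackknife is that the same few lines work unchanged for ratio statistics and other nonlinear estimators where the delta method would require derivatives.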
This is particularly challenging for Bayesian statistics, since the posterior distribution can become too computationally expensive to evaluate or simulate from exactly, and this has led researchers to develop a range of new approximate algorithms; see [Angelino et al., 2016] for an overview. Bardenet et al. also offer a discussion of solutions in the Monte Carlo literature, whilst Hoffman et al. discuss this issue in the context of variational inference. Another direction of research in the tall-data setting has been to consider methods that summarise large datasets with a subset of representative weighted samples. Such a subset is called a coreset [Bachem et al., 2017; Huggins et al., 2016; Campbell and Broderick, 2017] and can be used instead of the entire dataset to reduce the computational cost associated with evaluating likelihoods. However, these methodologies are still in their infancy and further developments are required.