These measurements, averaged across all 5000 deals, are presented in Table II. It should be noted that these measurements are a function of the deal; the first measurement is exact for each deal, while the second depends on the sampled determinizations. These measurements were made only for the non-Landlord players, since the playing strength experiments in Section VII-B were conducted from the point of view of the Landlord player. This means the algorithms tested always had the same number of branches at nodes where the Landlord makes a move, since the Landlord can see his cards in hand. The first measurement is an indicator of the number of branches that may be expected at opponent nodes for the cheating UCT player as the Landlord. Similarly, the second measurement indicates the number of branches at opponent nodes with determinized UCT as the Landlord. Both of these measurements are upper bounds, since if an opponent has played any cards at all then the number of leading plays will be smaller. The third, fourth, and fifth measurements indicate how many expansions ISMCTS will make at opponent nodes after a certain number of visits, since a new determinization is used on each iteration. Again, this measurement is an upper bound, since only one move is actually added per iteration, and if there were moves unique to a determinization which were never seen again, only one of them would be added to the tree.


From medicines to materials, small organic molecules are indispensable for human well-being. To plan their syntheses, chemists employ a problem-solving technique called retrosynthesis. In retrosynthesis, target molecules are recursively transformed into increasingly simpler precursor compounds until a set of readily available starting materials is obtained. Computer-aided retrosynthesis would be a highly valuable tool; however, past approaches were slow and provided results of unsatisfactory quality. Here, we employ Monte Carlo Tree Search (MCTS) to efficiently discover retrosynthetic routes. MCTS was combined with an expansion policy network that guides the search, and an "in-scope" filter network to pre-select the most promising retrosynthetic steps. These deep neural networks were trained on 12 million reactions, which represents essentially all reactions ever published in organic chemistry. Our system solves almost twice as many molecules and is 30 times faster than the traditional search method based on extracted rules and hand-coded heuristics. Finally, after a 60-year history of computer-aided synthesis planning, chemists can no longer distinguish between routes generated by a computer system and real routes taken from the scientific literature. We anticipate that our method will accelerate drug and materials discovery by assisting chemists to plan better syntheses faster, and by enabling fully automated robot synthesis.


The paper is organised as follows. In Section 2, the rules of tarok and basic approaches to a tarok-playing program are presented. Section 3 gives an overview of Silicon Tarokist. Section 4 describes its game-tree search algorithm, i.e. an advanced version of alpha-beta search. Section 5 describes the Monte Carlo sampling used to deal with imperfect information. Section 6 discusses the performance of the search algorithm and presents the results of testing against human players. Section 7 concludes the paper and lists some ideas that might produce an even better tarok-playing program.

the dictionary, we prepared another dictionary from a two-word window before/after the given words on Wikipedia as the candidate words other than nouns or verbs (Wiki-2-Ws). During generation, we generate sentences using these dictionaries depending on the part-of-speech of each word to reduce the search space, and classify a sentence as "lose" when it contains words outside the Given words and Similar words. Wiki-5 is the statistics from a five-word window before/after the Given words and the Similar words on Wikipedia, used to compute AP or PP.

Bootstrapping for Entity Set Expansion (ESE) aims at iteratively acquiring new instances of a specific target category. Traditional bootstrapping methods often suffer from two problems: 1) delayed feedback, i.e., the pattern evaluation relies on both its direct extraction quality and the extraction quality in later iterations; 2) sparse supervision, i.e., only a few seed entities are used as the supervision. To address these two problems, we propose a novel bootstrapping method combining the Monte Carlo Tree Search (MCTS) algorithm with a deep similarity network, which can efficiently estimate delayed feedback for pattern evaluation and adaptively score entities given sparse supervision signals. Experimental results confirm the effectiveness of the proposed method.


In the graph coloring problem, GCH is considered a classical approach. Heuristics derived from the graph coloring problem are often utilized in CTP. Usually, difficult events are assigned first, with the hope that easier events can still be assigned later as the environment becomes more restricted. Algorithm 8 presents the GCH method. At every iteration, an event is selected and assigned to a selected time slot. It is a one-pass method. In our experiments, Dynamic Search Rearrangement (DSR) [19] is utilized. DSR is a heuristic often used in constraint satisfaction problems. It is dynamic in that the next selected event is determined at every iteration. In DSR, we select an event randomly from the set E = {events with the least number of suitable time slots}. Next, we select a timeSlot randomly from the set P = {time slots suitable for event that fit the least number of remaining events}.
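The two-step selection rule described above can be sketched as follows; the data structures (`suitable_slots`, `slot_fit_count`) are hypothetical names for the bookkeeping the heuristic needs, not the paper's own code:

```python
import random

def dsr_select(events, suitable_slots, slot_fit_count, rng=random.Random(0)):
    """One DSR iteration: pick the most constrained event, then the time
    slot that leaves the remaining events least constrained.

    events         -- unassigned events (hypothetical identifiers)
    suitable_slots -- dict: event -> set of time slots still suitable for it
    slot_fit_count -- dict: slot -> number of remaining events it still fits
    """
    # E = {events with the least number of suitable time slots}
    fewest = min(len(suitable_slots[e]) for e in events)
    E = [e for e in events if len(suitable_slots[e]) == fewest]
    event = rng.choice(E)

    # P = {slots suitable for `event` that fit the fewest remaining events}
    least = min(slot_fit_count[s] for s in suitable_slots[event])
    P = [s for s in suitable_slots[event] if slot_fit_count[s] == least]
    return event, rng.choice(P)
```

Because both selection criteria are recomputed from the current partial assignment, the ordering is dynamic: the next event depends on what has already been placed.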

The purpose of this paper is to evaluate SP-MCTS for a one-player perfect-information game, Bubble Breaker, whose rules are similar to those of SameGame, and to examine the potential of the approach as a meta-heuristic algorithm. One-player perfect-information games are so-called puzzle games, and most of them have closely-defined rules. Previous research has shown that most puzzle games are equivalent to optimization problems belonging to the class NP-Complete, for which it is difficult to find an optimal solution, so puzzle games can be regarded as typical combinatorial optimization problems. Therefore, if we can clarify the effectiveness of SP-MCTS for such problems, we will open up new possibilities of SP-MCTS for many practical problems that can be described as combinatorial optimization problems. This paper first applies SP-MCTS to Bubble Breaker, whose NP-Completeness has already been reported [9], and evaluates its effectiveness. This paper develops a software implementation of Bubble Breaker in ActionScript 3.0 to present the sequence of moves visually until a game is solved. This paper develops SP-MCTS based on the concept of Schadd et al., and proposes new heuristics. This paper shows the results of numerical experiments, and discusses the characteristics of SP-MCTS. Finally, based on these efforts, this paper examines the applicability of SP-MCTS to practical problems by generalizing the procedures of SP-MCTS for one-player games. This paper mentions a reentrant scheduling problem as an example of a practical problem [8].

The idea of updating a tree by adding leaves dates back to at least Felsenstein (1981), in which he describes, for maximum likelihood estimation, that an effective search strategy in tree space is to add species one by one. More recent work also makes use of the idea of adding sequences one at a time: ARGWeaver (Rasmussen et al. 2014) uses this approach (in this case, on a space of graphs) to initialise MCMC on t + 1 sequences using the output of MCMC on t sequences, and TreeMix (Pickrell and Pritchard 2012) uses a similar idea in a greedy algorithm. In work conducted simultaneously to our own, Dinh et al. (2018) also propose a sequential Monte Carlo approach to inferring phylogenies in which the sequence of distributions is given by introducing sequences one by one. However, their approach uses different proposal distributions for new sequences; does not infer the mutation rate simultaneously with the tree; does not exploit intermediate distributions to reduce the variance; and does not use adaptive MCMC moves. Further investigation of their approach can be found in Fourment et al. (2018), where different guided proposal distributions are explored but the aforementioned limitations remain.


of detection, since neither robot has perceived any of the landmarks yet (the initial value of the status variable is set to false). As we are dealing with global localization using MCL, all particles will be uniformly distributed over the state space to model the global uncertainty. If there is no observation of any landmark to help a robot update its knowledge about the given environment, the uncertainty will stay at a relatively high level. Thus the exchange of their beliefs will not benefit the localization process. Figure 3.3 is a visual depiction of this situation. Both robot A (red circle) and robot B (blue circle) are running in the same rectangular environment. Each black dot represents one cluster containing a certain number of particles. Figure 3.3(a) shows the current internal belief of robot A, while Figure 3.3(b) shows the internal belief of robot B at the same time. We can see that without the detection of any landmarks, the beliefs of both robots remain highly uncertain even after they have moved around for a while. At this level of uncertainty, the exchange of information would only increase the computational burden without helping the localization process. Therefore, we do not allow an exchange of information to happen in this situation.
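A minimal sketch of the idea, assuming a rectangular environment and a crude variance-based spread measure (both our own illustrative choices, not the thesis's implementation):

```python
import math, random

def init_global_particles(n, x_max, y_max):
    """Global localization: before any landmark is observed, spread the
    particles uniformly over the whole rectangular state space with equal
    weights, modelling maximal (global) uncertainty."""
    return [{"x": random.uniform(0.0, x_max),
             "y": random.uniform(0.0, y_max),
             "theta": random.uniform(-math.pi, math.pi),
             "w": 1.0 / n}            # uniform weights: no information yet
            for _ in range(n)]

def should_exchange(particles, spread_threshold):
    """Gate belief exchange on uncertainty: while the cloud is still spread
    over most of the space, sharing beliefs only adds computational cost.
    Spread is measured crudely here as the x-coordinate variance."""
    xs = [p["x"] for p in particles]
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return var < spread_threshold
```

Until landmark observations collapse the particle cloud into a few clusters, `should_exchange` stays false, matching the policy of suppressing belief exchange under high uncertainty.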


For sampling methods based on Markov chains that explore the space locally, like the RWM and MALA, it may be advantageous to instead impose a different metric structure on the space, X, so that some points are drawn closer together and others pushed further apart. Intuitively, one can picture distances in the space being redefined such that, if the current position in the chain is far from a region of X that is "likely to occur" under π(·), then the distance to this typical set is reduced. Similarly, once this region is reached, the space could be "stretched" or "warped" so that it is explored as efficiently as possible.
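One elementary way to realise this intuition is to rescale the proposal per coordinate, i.e. to measure distances in units of the target's scales. The sketch below is a generic preconditioned random-walk Metropolis, with `scales` as an assumed, user-supplied metric; it is not a specific method from the text:

```python
import math, random

def rwm_preconditioned(logpi, x0, scales, n_steps, rng=random.Random(0)):
    """Random-walk Metropolis whose proposal step in coordinate i has size
    scales[i]: equivalent to measuring distances in units of the target's
    per-coordinate scales, so 'stretched' directions are explored with
    steps of a matching size.

    logpi  -- log target density, callable on a list of floats
    scales -- assumed, user-supplied per-coordinate metric
    """
    x, samples = list(x0), []
    for _ in range(n_steps):
        prop = [xi + si * rng.gauss(0.0, 1.0) for xi, si in zip(x, scales)]
        # standard Metropolis accept/reject for a symmetric proposal
        if math.log(rng.random()) < logpi(prop) - logpi(x):
            x = prop
        samples.append(list(x))
    return samples
```

With `scales` matched to the target (e.g. posterior standard deviations), the chain mixes comparably well in every direction, which is the practical payoff of re-metrising the space.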


However, practical problems in statistics involve several parameters of interest, and conclusions will often be drawn on one or more parameters at a time. In particular, Bayesian analyses for complicated models can be carried out on a fixed data set relatively simply using Monte Carlo methods to simulate posterior distributions. Monte Carlo methods allow exploring a great variety of models with relative ease, and thus make it possible to pursue the scientific goal of matching models to data more effectively and with less algebraic digression. MCMC is essentially Monte Carlo integration using Markov chains, which provides enormous scope for realistic statistical modeling through a unifying framework within which many complex problems can be analyzed using generic software. The aim of inference and efficiency analysis of MCMC output is to improve monitoring and inference through more effective use of the information in the Markov chain simulation.
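As a toy illustration of simulating a posterior on a fixed data set rather than deriving its summaries algebraically (a plain Monte Carlo example of our own, not from the text):

```python
import random

def posterior_summary(n_draws, successes, failures, rng=random.Random(1)):
    """Summarise a Beta posterior (conjugate to Bernoulli data under a flat
    prior) by simulation rather than algebra: draw from Beta(a, b), then
    read the mean and a central 95% interval off the sorted draws."""
    a, b = successes + 1, failures + 1
    draws = sorted(rng.betavariate(a, b) for _ in range(n_draws))
    mean = sum(draws) / n_draws
    lo, hi = draws[int(0.025 * n_draws)], draws[int(0.975 * n_draws)]
    return mean, (lo, hi)
```

The same simulate-and-summarise pattern carries over unchanged to models whose posteriors have no closed form, which is where MCMC takes over from direct sampling.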

Our research showed that the task-level parallelization method combined with a lock-free data structure for the GSCPM algorithm achieved very good performance and scalability on multi-core and many-core processors (see Section 5.7). The GSCPM algorithm was designed based on iteration-level parallelism, which relies on the iteration pattern (see Section 4.4) that violates the iteration-level data dependencies (see Subsection 6.1.2). The result of this violation is search overhead. Therefore, scalability is only one issue, although it is an important one. The second issue is to handle the search overhead. Thus, we designed the 3PMCTS algorithm based on operation-level parallelism, which relies on the pipeline pattern (the answer to the first part of RQ4), to avoid violating the iteration-level data dependencies (see Section 6.2). Hence, we managed to control the search overhead using the flexibility of task decomposition (the answer to the second part of RQ4). Different pipeline constructions provide higher levels of flexibility, allowing fine-grained management of the execution of operations in MCTS (see Subsection 6.6.2).


In order to address problems with larger search spaces, we must turn to alternative methods. Monte Carlo tree search (MCTS) has had a lot of success in Go and in other applications [1], [2]. MCTS eschews typical brute-force tree searching methods and utilizes statistical sampling instead. This makes MCTS a probabilistic algorithm. As such, it will not always choose the best action, but it still performs reasonably well given sufficient time and memory. MCTS performs lightweight simulations that randomly select actions. These simulations are used to selectively grow a game tree over a large number of iterations. Since these simulations are quick to perform, MCTS can explore search spaces rapidly. This is what gives MCTS the advantage over deterministic methods in large search spaces.
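The cycle of simulation, selective tree growth, and statistics-driven action choice can be sketched as a minimal single-player UCT on a toy bit-choice game (an illustration of our own; the game, names, and reward are assumptions, not any paper's implementation):

```python
import math, random

def uct_search(depth, n_iter, c=1.4, rng=random.Random(0)):
    """Minimal single-player UCT on a toy game: choose a bit (0/1) at each
    of `depth` levels; the terminal reward is the fraction of 1s chosen.
    Nodes store visit count N, total reward W, and a dict of children."""
    root = {"N": 0, "W": 0.0, "children": {}}

    def rollout(path):
        # simulation: fill the remaining bits uniformly at random
        bits = list(path) + [rng.randint(0, 1) for _ in range(depth - len(path))]
        return sum(bits) / depth

    def ucb(parent, child):
        return (child["W"] / child["N"]
                + c * math.sqrt(math.log(parent["N"]) / child["N"]))

    for _ in range(n_iter):
        node, path = root, []
        # selection: descend while the current node is fully expanded
        while len(node["children"]) == 2 and len(path) < depth:
            a = max(node["children"], key=lambda a: ucb(node, node["children"][a]))
            path.append(a)
            node = node["children"][a]
        # expansion: add one untried action unless we are at a terminal node
        if len(path) < depth:
            a = 0 if 0 not in node["children"] else 1
            node["children"][a] = {"N": 0, "W": 0.0, "children": {}}
            path.append(a)
        reward = rollout(path)
        # backpropagation along the selected path
        n = root
        n["N"] += 1
        n["W"] += reward
        for a in path:
            n = n["children"][a]
            n["N"] += 1
            n["W"] += reward
    # recommend the most-visited root action
    return max(root["children"], key=lambda a: root["children"][a]["N"])
```

The rollouts are cheap, so the statistics concentrate visits on promising branches long before the tree is anywhere near fully enumerated.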

A common approach to finding solutions to games is to create a game tree representing the possible game states and moves and use a tree search algorithm to locate a goal state (Paul and Helmert 2016). Classic tree search algorithms, like Depth First Search (DFS), blindly expand nodes during the search process until either a goal node is found or all search options are exhausted (Chijindu 2012). This approach can be time consuming, especially if the search tree has a high branching factor; it is likely to search many unnecessary nodes, and the solution it produces may not be optimal. One possible improvement to an uninformed search is to use a heuristic function to help guide the order of node expansion. A heuristic function h takes a node n and returns a non-negative real number h(n) that is an estimate of the cost of the least-cost path from node n to a goal node. The function h is an admissible heuristic if h(n) is always less than or equal to the actual cost of a lowest-cost path from node n to a goal. With a well-crafted heuristic it is likely that a solution can be found with significantly fewer nodes expanded than with an uninformed search.
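A heuristic-guided search of this kind can be sketched as a generic A*-style loop; `neighbors` and `h` are assumed callables supplied by the caller:

```python
import heapq

def a_star(start, goal, neighbors, h):
    """Expand nodes in order of f(n) = g(n) + h(n); with an admissible h,
    the first time the goal is popped, g is the optimal path cost.
    `neighbors(n)` yields (successor, step_cost) pairs."""
    open_heap = [(h(start), 0, start)]
    best_g = {start: 0}
    while open_heap:
        f, g, node = heapq.heappop(open_heap)
        if node == goal:
            return g
        if g > best_g.get(node, float("inf")):
            continue  # stale entry superseded by a cheaper path
        for nxt, cost in neighbors(node):
            g2 = g + cost
            if g2 < best_g.get(nxt, float("inf")):
                best_g[nxt] = g2
                heapq.heappush(open_heap, (g2 + h(nxt), g2, nxt))
    return None  # goal unreachable
```

On a 4-connected grid, Manhattan distance is an admissible h, so the cost returned is guaranteed optimal.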

Teaching computer programs to play games through machine learning has been an important way to achieve better artificial intelligence (AI) in a variety of real-world applications. Monte Carlo Tree Search (MCTS) is one of the key AI techniques developed recently that enabled AlphaGo to defeat a legendary professional Go player. What makes MCTS particularly attractive is that it only understands the basic rules of the game and does not rely on expert-level knowledge. Researchers thus expect that MCTS can be applied to other complex AI problems where domain-specific expert-level knowledge is not yet available. So far there are very few analytic studies in the literature. In this paper, our goal is to develop analytic studies of MCTS to build a more fundamental understanding of the algorithms and their applicability in complex AI problems. We start with a simple version of MCTS, called random playout search (RPS), to play Tic-Tac-Toe, and find that RPS may fail to discover the correct moves even in a very simple game position of Tic-Tac-Toe. Both probability analysis and simulation have confirmed this discovery. We continue our studies with the full version of MCTS to play Gomoku and find that while MCTS has shown great success in playing more sophisticated games like Go, it is not effective at addressing the problem of sudden death/win. The main reason that MCTS often fails to detect sudden death/win lies in the random playout search nature of MCTS, which leads to prediction distortion. Therefore, although MCTS in theory converges to the optimal minimax search, under real-world computational resource constraints, MCTS has to rely on RPS as an important step in its search process, and therefore suffers from the same fundamental prediction distortion problem as RPS does. By examining the detailed statistics of the scores in MCTS, we investigate a variety of scenarios where MCTS fails to detect sudden death/win.
Finally, we propose an improved MCTS algorithm by incorporating minimax search to overcome prediction distortion. Our simulation has confirmed the effectiveness of the proposed algorithm. We provide an estimate of the additional computational …
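The RPS idea — score each legal move by the average result of uniformly random playouts — can be sketched for Tic-Tac-Toe as follows (our own minimal illustration, not the paper's code):

```python
import random

WINS = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
        (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

def winner(b):
    for i, j, k in WINS:
        if b[i] != " " and b[i] == b[j] == b[k]:
            return b[i]
    return None

def random_playout(board, to_move, perspective, rng):
    """Uniformly random play to the end; +1/-1/0 from `perspective`'s view."""
    b, turn = list(board), to_move
    while True:
        w = winner(b)
        if w is not None:
            return 1 if w == perspective else -1
        moves = [i for i, c in enumerate(b) if c == " "]
        if not moves:
            return 0  # draw
        b[rng.choice(moves)] = turn
        turn = "O" if turn == "X" else "X"

def rps_move(board, player, n_playouts=200, rng=random.Random(0)):
    """Random playout search: score each legal move by the mean playout
    result and return the highest-scoring move."""
    nxt = "O" if player == "X" else "X"
    scores = {}
    for m in [i for i, c in enumerate(board) if c == " "]:
        b = list(board)
        b[m] = player
        scores[m] = sum(
            random_playout(b, nxt, player, rng) for _ in range(n_playouts)
        ) / n_playouts
    return max(scores, key=scores.get)
```

Because the opponent plays randomly inside the playouts, a forced loss one move away can still average a deceptively high score, which is the prediction distortion the paper analyses.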


This research was conducted by an interdisciplinary team of two undergraduate students and a faculty member to explore solutions to the Birds of a Feather (BoF) Research Challenge. BoF is a newly-designed perfect-information solitaire-type game. The focus of the study was to design and implement different algorithms and evaluate their effectiveness. The team compared the provided depth-first search (DFS) to heuristic algorithms such as Monte Carlo tree search (MCTS), as well as a novel heuristic search algorithm guided by machine learning. Since all of the studied algorithms converge to a solution from a solvable deal, the effectiveness of each approach was measured by how quickly a solution was reached, and how many nodes were traversed until a solution was reached. The employed methods have the potential to provide artificial intelligence enthusiasts with a better understanding of BoF and novel ways to solve perfect-information games and puzzles in general. The results indicate that the proposed heuristic search algorithms guided by machine learning provide a significant improvement, in terms of the number of nodes traversed, over the provided DFS algorithm.

also been made in multi-player poker and Skat [18], which show promise towards challenging the best human players. Determinization, where all hidden and random information is assumed known by all players, allows recent advances in MCTS to be applied to games with incomplete information and randomness. The determinization approach is not perfect: as discussed by Frank and Basin [19], it does not handle situations where different (indistinguishable) game states suggest different optimal moves, nor situations where the opponent's influence makes certain game states more likely to occur than others. In spite of these problems, determinization has been applied successfully to several games. An MCTS-based AI agent which uses determinization has been developed that plays Klondike Solitaire [20], arguably one of the most popular computer games in the world. For the variant of the game considered, the performance of MCTS exceeds human performance by a substantial margin. A determinized Monte Carlo approach to Bridge [21], which uses Monte Carlo simulations with a tree of depth one, has also yielded strong play. The combination of MCTS and determinization is discussed in more detail in Section V.


is used in the computation of the absolute time until the selected reaction is next performed; this operation, along with carrying out the reaction, scales as O(1). The inclusion of the dependency graph permits the update of propensities following the reaction to also occur in constant time. The algorithm then proceeds by instituting changes to the indexed priority queue in response to the executed reaction. In this tree structure, the number of nodes corresponds to the number of reactions M; since the dependency graph allowed for the recalculation of only the minimal number of propensities, percolating changes through the tree takes at most log2(M) operations. The resulting structure contains the minimum τ as the top node, which enables the next reaction µ to be identified in constant time. While producing the same reaction as would result from Gillespie's algorithm, Gibson and Bruck's next reaction method improves the simulation of a single realization of the chemical process to logarithmic time.
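The indexed priority queue at the heart of this scheme can be sketched as a binary min-heap augmented with a position index, so that a single reaction's tentative time is updated in O(log M) while the minimum stays readable in O(1) (an illustrative sketch, not Gibson and Bruck's original code):

```python
class IndexedPriorityQueue:
    """Binary min-heap of (tau, reaction) pairs plus a reaction->slot index,
    so one reaction's tentative firing time can be changed in O(log M)
    while the minimum tau stays readable in O(1) at the top node."""

    def __init__(self, taus):
        self.heap = sorted((t, r) for r, t in enumerate(taus))
        self.pos = {r: i for i, (t, r) in enumerate(self.heap)}

    def _swap(self, i, j):
        self.heap[i], self.heap[j] = self.heap[j], self.heap[i]
        self.pos[self.heap[i][1]] = i
        self.pos[self.heap[j][1]] = j

    def _sift(self, i):
        # restore the heap property by moving entry i up, then down
        while i > 0 and self.heap[i] < self.heap[(i - 1) // 2]:
            self._swap(i, (i - 1) // 2)
            i = (i - 1) // 2
        n = len(self.heap)
        while True:
            c = min((j for j in (2 * i + 1, 2 * i + 2) if j < n),
                    key=lambda j: self.heap[j], default=None)
            if c is None or self.heap[i] <= self.heap[c]:
                return
            self._swap(i, c)
            i = c

    def min(self):
        """(tau, reaction) of the next reaction -- O(1)."""
        return self.heap[0]

    def update(self, reaction, tau):
        """Set a new tentative time for one reaction -- O(log M)."""
        i = self.pos[reaction]
        self.heap[i] = (tau, reaction)
        self._sift(i)
```

After a reaction fires, only the reactions touched by the dependency graph call `update`, so the per-step cost stays logarithmic in M.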


The structure and the contributions of the paper are as follows. In Section 2, we describe different Monte Carlo algorithms. In particular, we propose an algorithm that takes into account not only the information about the last visited page (as in [3, 7]), but about all pages visited during the simulation run. In Section 3, we analyze and compare the convergence of Monte Carlo algorithms in terms of confidence intervals. We show that the PageRank of relatively important pages can be determined with high accuracy even after the first iteration. In Section 4, we show that experiments with real data from the Web confirm our theoretical analysis. Finally, we summarize the results of the present work in Section 5. Technical proofs are placed in the Appendix.
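The "all visited pages" estimator mentioned above can be sketched as follows: walks restart from every page, terminate with probability 1 - alpha at each step, and every page along a walk contributes to the estimate (a generic Monte Carlo PageRank sketch under our own simplifying assumption about dangling pages):

```python
import random

def mc_pagerank(links, n_pages, walks_per_page, alpha=0.85,
                rng=random.Random(0)):
    """Monte Carlo PageRank: from every page, run random walks that stop
    with probability 1 - alpha at each step, and count visits to ALL pages
    along each walk, not only the walk's end point. Dangling pages are
    assumed to jump uniformly at random (our simplifying choice).

    links -- dict: page -> list of out-neighbours
    """
    visits, total = [0] * n_pages, 0
    for start in range(n_pages):
        for _ in range(walks_per_page):
            page = start
            while True:
                visits[page] += 1
                total += 1
                if rng.random() > alpha:
                    break  # walk terminates with probability 1 - alpha
                out = links.get(page)
                page = rng.choice(out) if out else rng.randrange(n_pages)
    return [v / total for v in visits]
```

Because highly-ranked pages are visited often even in short walks, their estimates converge quickly, consistent with the observation that important pages are pinned down accurately after little simulation.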
