Parallel Synchronous Execution - Profiling of Parallel Machine Learning Algorithms


6.1.1 Parallel Synchronous Execution

To allow for parallelization with a synchronous model update, infill criteria and techniques (constant liar, Kriging believer, qEI [GLC10], qLCB [HHL12], MOI-MBO [BWB+14]) have been suggested that propose multiple configurations in each iteration. Such multi-point proposals derive q configuration proposals x∗_1, . . . , x∗_q simultaneously from a surrogate model instead of proposing only one configuration x∗. As described in the fundamentals of Chapter 2, the surrogate model is used as a regression model, as it is comparably inexpensive to evaluate and therefore often used when function evaluations are very expensive [FSK08].

Hutter et al. [HHL12] introduced the qLCB criterion, an extension of the single-point LCB criterion (2.4) that uses an exponentially distributed random variable to generate q different candidate proposals by drawing random values λ_j ∼ Exp(λ) (j = 1, ..., q) from the exponential distribution:

$$\mathrm{qLCB}(x, \lambda_j) = \hat{\mu}(x) - \lambda_j\,\hat{s}(x) \quad \text{with } \lambda_j \sim \mathrm{Exp}(\lambda) \tag{6.1}$$

The λ variable guides the exploration-exploitation trade-off. Sampling multiple different λ_j might thus result in different “good” configurations by varying the impact of the standard deviation term.
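The sampling step behind (6.1) can be illustrated with a minimal Python sketch. It is not the implementation used in this thesis: `propose_qlcb`, `toy_mean`, `toy_sd`, and the candidate grid are hypothetical names and stand-ins for a fitted surrogate, and the sketch assumes that Exp(λ) is parameterized so that the sampled λ_j have mean λ.

```python
import random


def propose_qlcb(candidates, mean_fn, sd_fn, q, lam=1.0, rng=None):
    """Return q pairs (lambda_j, x_j), one proposal per sampled lambda_j.

    mean_fn and sd_fn stand in for the surrogate's posterior mean and
    standard deviation; candidates is a finite set of configurations.
    """
    rng = rng or random.Random()
    proposals = []
    for _ in range(q):
        # random.expovariate expects the rate, so 1/lam yields mean lam
        # (assumed parameterization of Exp(lambda)).
        lam_j = rng.expovariate(1.0 / lam)
        # Minimize the lower confidence bound for this particular lambda_j.
        x_j = min(candidates, key=lambda x: mean_fn(x) - lam_j * sd_fn(x))
        proposals.append((lam_j, x_j))
    return proposals


# Hypothetical surrogate: the predicted mean is lowest at x = 2, and the
# predicted uncertainty grows with the distance from x = 2.
def toy_mean(x):
    return (x - 2.0) ** 2


def toy_sd(x):
    return abs(x - 2.0)


candidates = [i / 10 for i in range(51)]  # grid on [0, 5]

if __name__ == "__main__":
    for lam_j, x_j in propose_qlcb(candidates, toy_mean, toy_sd, q=4,
                                   rng=random.Random(0)):
        print(f"lambda_j = {lam_j:.3f} -> proposed x = {x_j}")
```

Larger sampled values of λ_j give the standard deviation term more weight and therefore move the proposal away from the minimum of the predicted mean; with a standard deviation of zero everywhere, all q proposals coincide.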

Figure 6.2 presents two exemplary MBO iterations in which qLCB is used to propose two configurations per iteration for parallel execution. As in Figure 2.3 of Chapter 2, y (solid line) in the upper parts of the two MBO iteration figures denotes the output of the unknown black-box function f, while ŷ (dotted line in the upper parts) denotes the output of the surrogate regression model f̂ that tries to approximate the black-box function. Gray areas around ŷ represent the uncertainty of the surrogate model.

In the first iteration step, the initial set of configurations (red dots) has already been evaluated. In the lower part of the visualization of iteration one, two dotted lines represent the qLCB infill criterion for q = 2, namely qLCB(x, λ_1) and qLCB(x, λ_2), where the two different values of λ vary the impact of the standard deviation ŝ(x).

Based on these two different λ values, different minima of the infill criterion (lcb) are computed to propose two new configurations (blue triangles).

In the second iteration, f̂ is refitted with the evaluated configurations (green rectangles) and two new configurations are proposed (blue triangles). This process continues until the budget is exhausted. The qLCB criterion is comparably inexpensive for generating many independent candidate proposals and is thus used in the resource-aware scheduling strategies for parallel MBO [RKB+16; KRL+17], which are part of this thesis.

[Figure 6.2, panels: First MBO iteration; Second MBO iteration]

Figure 6.2: Visualization of two exemplary MBO iterations with qLCB as infill criterion and two parallel configuration evaluations per iteration.

Another popular multi-point infill criterion is the qEI criterion [GLC10], which directly optimizes the single-point EI criterion (2.3) over q points. As the computation of qEI relies on Monte Carlo sampling, it is quite expensive [CG13]. Therefore, a less expensive alternative, the Kriging believer approach [GLC10], is often chosen. Here, the first configuration is proposed based on the standard single-point EI criterion. Its posterior mean value is treated as a real value of f to refit the surrogate, which penalizes the surrounding region with a lower standard deviation for the next point proposal using EI again. This is repeated until q proposals are generated.
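The Kriging believer loop can be sketched in a few lines of Python. The sketch makes strong simplifying assumptions: instead of a Kriging model it uses a hypothetical toy surrogate (`toy_surrogate`) with an inverse-distance-weighted mean and the distance to the nearest evaluated point as standard deviation, and it assumes the usual closed form of EI for minimization, which may differ in notation from (2.3). All function names are illustrative.

```python
import math


def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def toy_surrogate(xs, ys):
    """Toy stand-in for a Kriging fit (assumption, not a real GP)."""
    def mean_fn(x):
        num = den = 0.0
        for xi, yi in zip(xs, ys):
            d = abs(x - xi)
            if d == 0.0:
                return yi  # interpolate evaluated (or believed) points
            w = 1.0 / (d * d)
            num += w * yi
            den += w
        return num / den

    def sd_fn(x):
        # Uncertainty vanishes at known points and grows with distance.
        return min(abs(x - xi) for xi in xs)

    return mean_fn, sd_fn


def expected_improvement(x, mean_fn, sd_fn, y_min):
    """Closed-form single-point EI for minimization (assumed form)."""
    s = sd_fn(x)
    if s == 0.0:
        return 0.0
    diff = y_min - mean_fn(x)
    z = diff / s
    return diff * normal_cdf(z) + s * normal_pdf(z)


def kriging_believer(xs, ys, candidates, q):
    """Propose q points by repeatedly 'believing' the posterior mean."""
    xs, ys = list(xs), list(ys)
    proposals = []
    for _ in range(q):
        mean_fn, sd_fn = toy_surrogate(xs, ys)  # refit on real + believed data
        y_min = min(ys)
        x_new = max(candidates,
                    key=lambda x: expected_improvement(x, mean_fn, sd_fn, y_min))
        proposals.append(x_new)
        # Treat the posterior mean as if it were a real evaluation of f.
        xs.append(x_new)
        ys.append(mean_fn(x_new))
    return proposals


if __name__ == "__main__":
    design_x = [0.0, 2.0, 4.0]
    design_y = [2.25, 0.25, 6.25]  # hypothetical evaluations of f
    grid = [i * 0.1 for i in range(41)]
    print(kriging_believer(design_x, design_y, grid, q=3))
```

Because the believed point is added to the data, the toy standard deviation (and with it EI) drops to zero there, so the next proposal is pushed to a different location without evaluating f in between.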

The above-mentioned multi-point infill criteria can cause inefficient resource utilization when the evaluations executed in parallel have heterogeneous resource demands, such as differing execution times. For the synchronous parallel execution of MBO, the number q of proposed configurations is usually chosen to equal the number of available CPUs.

Figure 6.3 shows an exemplary schedule for q = 4 jobs (evaluations) on 4 CPUs where the jobs have varying execution times. The vertical arrows indicate the start of an MBO iteration, where the multi-point criterion proposes 4 job configurations x_1, . . . , x_4 for the first iteration. The boxes represent the jobs' execution times. At the end of the iteration, the model is updated with the results of the configurations x_1, . . . , x_4. All CPUs have to wait for the slowest evaluation (e.g., configuration x_2 in iteration one) to finish before receiving new proposals. This can lead to idling CPUs that do not contribute to the optimization. Here, spaces between jobs and the vertical arrow (model update) indicate idle CPU time caused by heterogeneous execution times of the jobs executed within one MBO iteration. After the results of all jobs have been gathered, the model is updated synchronously and new proposals can be generated and executed for the second MBO iteration. The general goal is to use the underlying parallel architecture in a resource-efficient way to solve the optimization problem.
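The idle time caused by the synchronization barrier can be quantified with a small Python sketch. The job durations below are hypothetical and not taken from Figure 6.3; the function names are illustrative, and the sketch assumes one job per CPU and iteration while ignoring the time needed for the model update.

```python
def idle_time_per_iteration(job_times):
    """Total CPU idle time in one synchronous iteration.

    Every CPU waits until the slowest job of the iteration has finished.
    """
    makespan = max(job_times)
    return sum(makespan - t for t in job_times)


def synchronous_utilization(iterations):
    """Fraction of reserved CPU time spent on evaluations."""
    busy = sum(sum(times) for times in iterations)
    reserved = sum(max(times) * len(times) for times in iterations)
    return busy / reserved


if __name__ == "__main__":
    # Hypothetical execution times (arbitrary units) of q = 4 jobs on
    # 4 CPUs for two MBO iterations.
    schedule = [[3, 7, 4, 5], [6, 2, 5, 3]]
    for i, times in enumerate(schedule, start=1):
        print(f"iteration {i}: idle CPU time = {idle_time_per_iteration(times)}")
    print(f"utilization = {synchronous_utilization(schedule):.2f}")
```

The more the execution times within one iteration differ, the larger the gap between the slowest job and the others, and the lower the resulting utilization.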

Figure 6.3: Exemplary scheduling for synchronous parallel MBO with q = 4 executed evaluations (jobs) per MBO iteration, with varying execution times leading to idling CPUs.

Varying execution times of parallel evaluations have already been addressed by Snoek et al. [SLA12], who suggest modeling them with an additional surrogate, leading to an “expected improvement per second” criterion that favors less expensive configurations. The parallel MBO approaches developed in this thesis also use regression models to estimate resource requirements, but instead of adapting the infill criterion, they use these estimates to guide the scheduling of parallel evaluations. Here, the goal is to guide MBO to interesting regions in a faster and more resource-efficient way without directly favoring less expensive configurations.

Another approach that addresses heterogeneous execution times of configuration evaluations executed in parallel is the asynchronous execution of MBO, which is described in the next paragraph.

In document Methods for efficient resource utilization in statistical machine learning algorithms (Page 96-98)