Linescan validation - Tuning and validation

6.3 Tuning and validation

6.3.5 Linescan validation

Probably the most convincing argument that the Professor method works is to show that the interpolation behaves like the actual generator response. Since our parameter space is nine-dimensional, ordinary plotting is ruled out and only projections can be studied.

To do so, an arbitrary straight line can be defined in the parameter hypercube and a certain number of equidistant parameter points can be sampled from this line. For each of these points the generator is run with the corresponding set of parameters. The idea is to compare the global $\chi^2/N_\mathrm{df}$ of the true generator response to that obtained from the parameterization.
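The sampling step can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the function name `sample_scan_line` and the two-dimensional example points are hypothetical and not part of Professor.

```python
def sample_scan_line(start, end, n_points):
    """Return n_points equidistant parameter points on the segment start -> end."""
    if n_points < 2:
        raise ValueError("need at least two points to span the line")
    points = []
    for k in range(n_points):
        t = k / (n_points - 1)  # scaled line coordinate in [0, 1]
        points.append([s + t * (e - s) for s, e in zip(start, end)])
    return points


# Hypothetical two-dimensional example; the study itself scans nine parameters.
points = sample_scan_line([0.0, 1.0], [2.0, 3.0], n_points=5)
```

Each entry of `points` would then be handed to the generator as one parameter set, and the same points would be fed to the parameterization for comparison.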

This procedure will be referred to in the next paragraphs as line-scan. Since an arbitrary line will not be very useful, three sensible line-scans will be presented in the following. Illustrations of the samplings can be found in Figure 73 and Figure 75.

The parameterizations used to calculate the global $\chi^2/N_\mathrm{df}$ were also used for the minimizations. Furthermore, we do not use all available parameterizations but only those whose minimization result projects onto the corresponding line-scan axis.

Comparing with tune S0

This line-scan is meant to give an impression of how well the Professor tuning performs compared to a tuning that already delivers a quite good description of the data. Furthermore, this scan-line is the longest of the three discussed, with a relative length of 0.3 of the body diagonal of the initial parameter sampling hypercube. It therefore gives a very global overview of the parameterization. This line-scan is not intended to play down the performance of tune S0, which was tuned to partially different observables (charged particle multiplicities at the Tevatron and UA5) than those used in this study. It shows how the Professor tuning and tune S0 perform on the observables and weights listed in Table 8, which Pythia 6 was explicitly tuned to in this effort.

The scan-line is defined by the parameter point that is considered as the best (Professor) tune and the parameter point given by the original tune S0. To look “beyond” the parameter points the scan-line is stretched symmetrically by a fraction of about twenty percent of the length of the scan-line.
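A minimal sketch of the stretching step follows. It assumes that "symmetrically" means the total extension of about twenty percent is split evenly between both ends; the text does not spell this out, and the function name `stretch_scan_line` and the example points are hypothetical.

```python
def stretch_scan_line(point_a, point_b, fraction=0.2):
    """Lengthen the segment point_a -> point_b by `fraction` of its length.

    Assumption: half of the extension is added at each end.
    """
    half = 0.5 * fraction
    start = [a - half * (b - a) for a, b in zip(point_a, point_b)]
    end = [b + half * (b - a) for a, b in zip(point_a, point_b)]
    return start, end


# Hypothetical tune points in a two-dimensional parameter space.
start, end = stretch_scan_line([0.0, 0.0], [10.0, 5.0])
```

With this choice both tune points stay on the scan-line, each a little inside its end, so the behaviour just beyond them becomes visible.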

Figure 74 shows that the quadratic parameterizations in principle reproduce the true generator response (white dots) reasonably well and that the cubic interpolation does an even better job. In the region of the minimum there is no big difference between the true generator response and the predictions derived from quadratic or cubic parameterizations when considering only the goodness-of-fit measure.

There are some discrepancies, however. The quadratic parameterizations underestimate the true response over a wide range of the parameter space, although the agreement is very good in the minimum. The latter is very flat in this parameter space, meaning that there is quite a large region in parameter space where the generator yields equally good descriptions of the data. This can be explained by the presence of parameters that have no or only little influence on the distributions included in the tuning.

By looking at the direction of this scan-line in the parameter space, this can be investigated further. The values in Table 10 identify the direction as being dominated by PARP(71) and PARP(93), parameters that we are not very sensitive to, as can be seen in Figure 71, where the parameters in question show a very broad distribution. The distribution of the minimization results, projected onto the scan-line, is shown on top of the line-scan plots. Since we already know that we look in a direction where the minimum is not very sharp, it is not astonishing that the histogram of the projected minimization results is also very broad.

This dependence on the parameter sensitivity motivates the next line-scans (see Figure 75, Figure 76 and Figure 77), where the scan-line runs along the direction of either the largest or the smallest parameter uncertainty.

[Figure 73 (schematic): the two parameter axes; the legend shows the parameter sampling hypercube, the tuning parameter points (Tune 1, Tune 2), the scan-line, the parameter points sampled from the scan-line, and minimization results that do or do not project onto the scan-line.]

Figure 73: Scanning along a straight line in a two-dimensional “hypercube” that pierces two given tuning points in parameter space. This illustrates how the sampling from the scan-line is done and shows how minimization results can be categorized by checking whether they can be projected onto the scan-line or not.

[Figure 74 (plot): global $\chi^2/N_\mathrm{df}$ versus the scaled scan-line coordinate $\tilde{p}$; curves for the quadratic interpolation(s) ($N_\mathrm{runs} = 194$) and the cubic interpolation(s) ($N_\mathrm{runs} = 393$), scan MC data with $\min(\chi^2/N_\mathrm{df}) = 4.6$, and markers for the best tune and tune S0; the top panel shows $\Delta N / \Delta\tilde{p}$.]

Figure 74: Line-scan validation of the tuning result obtained with Professor along a straight line in parameter space that pierces the best tune and the parameter point belonging to tune S0. The histogram on top shows the distribution of the minimization results obtained with these parameterizations if projected onto the scan-line. The scan-line is parameterised by $\tilde{p}$, whose values are scaled to $\tilde{p} \in [0, 1]$.

Determining the size of the scan-line

If a scan along a line in parameter space is performed, it is useful to know which fraction of the sampling hypercube it represents. The length of the scan-line is calculated in Professor relative to the body diagonal of the initial parameter sampling hypercube, whose opposite corners are given by the vectors $\vec{a}$ and $\vec{b}$, while the endpoints of the scan-line are defined by the vectors $\vec{c}$ and $\vec{d}$.

The relative length of the $N$-dimensional scan-line, $\ell_\mathrm{rel}$, may then be calculated as

$$\ell_\mathrm{rel} = \left[ \frac{\sum_{i=1}^{N} (c_i - d_i)^2}{\sum_{i=1}^{N} (a_i - b_i)^2} \right]^{1/N} \qquad (21)$$

In the case of the line-scan between the proclaimed best tune and tune S0 the summations run only over those parameters $i$ for which the values of $c_i$ and $d_i$ differ. The value of $N$ is decreased accordingly.
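The calculation can be sketched as follows. This is not the Professor implementation: the function name, the `skip_equal` switch and the tolerance are assumptions made for illustration, and the exponent $1/N$ is taken as read from Eq. (21).

```python
def relative_scan_line_length(a, b, c, d, skip_equal=False, tol=1e-12):
    """Relative scan-line length following Eq. (21).

    a, b: opposite corners of the sampling hypercube
    c, d: endpoints of the scan-line
    skip_equal: if True, sum only over parameters where c and d differ
                and reduce N accordingly (the best tune / tune S0 case).
    """
    dims = list(range(len(c)))
    if skip_equal:
        dims = [i for i in dims if abs(c[i] - d[i]) > tol]
    n = len(dims)
    if n == 0:
        raise ValueError("scan-line endpoints coincide")
    numerator = sum((c[i] - d[i]) ** 2 for i in dims)
    denominator = sum((a[i] - b[i]) ** 2 for i in dims)
    return (numerator / denominator) ** (1.0 / n)


# Hypothetical unit square with a scan-line along half of its diagonal.
length = relative_scan_line_length([0.0, 0.0], [1.0, 1.0], [0.0, 0.0], [0.5, 0.5])
```

For two dimensions the exponent $1/N$ equals $1/2$, so the example reduces to an ordinary ratio of Euclidean lengths.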

The values of $\ell_\mathrm{rel}$ calculated for the three performed line-scans can be found in Table 9.

Determining the direction of the largest/smallest uncertainty

The extremal values of the parameter uncertainties should indicate those directions in which the $\chi^2/N_\mathrm{df}$ function is very shallow or very steep, respectively. These directions can be calculated using the covariance matrix $C$ of the best tune.

Line-scan               $\ell_\mathrm{rel}$
Best tune – tune S0     0.297
Smallest uncertainty    0.066
Largest uncertainty     0.142

Table 9: Lengths of the scan-lines relative to the length of the body diagonal of the initial parameter sampling hypercube.

Since we are interested in the extremal directions only, we can simplify this task by diagonalising $C$ with $T$, such that $\Sigma$ is diagonal:

$$C = T^{T} \Sigma T \qquad (22)$$

where $T$ is a rotation matrix, $T^{T}$ its transpose and $\Sigma$ is the matrix that has the eigenvalues on its diagonal and zeros everywhere else. This eigen-decomposition can be applied to any real, symmetric matrix, and the procedure we used is implemented in the linear algebra package of the scientific Python (SciPy) library.

The eigenvalues of $\Sigma$ are related to the axes of the rotated hyper-ellipsoid of $C$. The largest and the smallest (absolute) eigenvalues $\sigma_\mathrm{max,min}$ (largest and smallest ellipsoid axes) are picked. The corresponding eigenvectors, $\vec{\sigma}_\mathrm{max,min}$, are rotated back into the original system:

$$\vec{d}_\mathrm{max,min} = T \vec{\sigma}_\mathrm{max,min} \qquad (23)$$

The directions of $\vec{d}_\mathrm{max,min}$ are used to define the scan-lines; they can be found in Table 10.
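A hand-worked two-dimensional case may help to illustrate Eqs. (22) and (23). The matrix $C$ below is made up for this purpose and is not taken from the study. The orthogonal matrix $T$ is deliberately chosen to be symmetric, so that $T = T^{T}$ and the back-rotation gives the same result with either of them; for a general $T$ in the convention of Eq. (22), the eigenvectors of $C$ are the columns of $T^{T}$.

```latex
% Made-up covariance matrix and its decomposition
C = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, \qquad
T = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \qquad
\Sigma = \begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix}

% Check of Eq. (22)
T^{T} \Sigma T
  = \frac{1}{2}
    \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}
    \begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix}
    \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}
  = \frac{1}{2} \begin{pmatrix} 4 & 2 \\ 2 & 4 \end{pmatrix}
  = C

% Extremal directions
\sigma_\mathrm{max} = 3, \quad
\vec{\sigma}_\mathrm{max} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad
\vec{d}_\mathrm{max} = T \vec{\sigma}_\mathrm{max}
  = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix}

\sigma_\mathrm{min} = 1, \quad
\vec{\sigma}_\mathrm{min} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \quad
\vec{d}_\mathrm{min} = T \vec{\sigma}_\mathrm{min}
  = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -1 \end{pmatrix}
```

One can verify directly that $C \vec{d}_\mathrm{max} = 3\, \vec{d}_\mathrm{max}$ and $C \vec{d}_\mathrm{min} = \vec{d}_\mathrm{min}$, so the largest uncertainty lies along the diagonal of the two parameters and the smallest perpendicular to it.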

Scanning along the direction of largest uncertainty

Since we are now able to calculate the direction of the largest uncertainty, it is possible to define a straight line in our parameter space and to sample points from this line. We start at the parameter point of the best tune and move forward (and backward) along this direction to iteratively find the piercing points with the initial parameter sampling hypercube. Thereby we make sure that the scan-line stays inside the region of the interpolation. A schematic illustration for a two-dimensional case can be found in Figure 75.
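The piercing points can also be obtained in closed form rather than iteratively. The sketch below swaps in this closed-form calculation for an axis-aligned box; it is not the procedure used in Professor. It assumes that the starting point lies inside the box and that the direction is not the zero vector, and all names and numbers are hypothetical.

```python
def piercing_points(point, direction, lower, upper):
    """Return the two points where the line point + t * direction leaves the box.

    lower, upper: per-parameter bounds of the sampling hypercube.
    """
    t_min, t_max = float("-inf"), float("inf")
    for p, d, lo, hi in zip(point, direction, lower, upper):
        if d == 0.0:
            continue  # the line runs parallel to this pair of faces
        t_a, t_b = (lo - p) / d, (hi - p) / d
        # Keep the tightest interval of t that satisfies all bounds.
        t_min = max(t_min, min(t_a, t_b))
        t_max = min(t_max, max(t_a, t_b))
    backward = [p + t_min * d for p, d in zip(point, direction)]
    forward = [p + t_max * d for p, d in zip(point, direction)]
    return backward, forward


# Hypothetical best tune at the centre of a unit square.
backward, forward = piercing_points([0.5, 0.5], [1.0, 0.5], [0.0, 0.0], [1.0, 1.0])
```

The two returned points can serve as the endpoints of the scan-line, which then stays inside the sampled region by construction.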

From the set of available parameterizations we choose only those for which the projection of the corresponding minimum onto the scan-line lies inside the initial parameter sampling hypercube.
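One way to express this selection is sketched below. It assumes an orthogonal projection and a scan-line whose endpoints are the piercing points, so that a projection between the endpoints is also inside the hypercube. The function names and example points are hypothetical.

```python
def scan_line_coordinate(point, start, end):
    """Scaled coordinate of the orthogonal projection of point onto start -> end."""
    line = [e - s for s, e in zip(start, end)]
    offset = [p - s for p, s in zip(point, start)]
    return sum(o * l for o, l in zip(offset, line)) / sum(l * l for l in line)


def projects_onto_scan_line(point, start, end):
    """True if the projection falls between the two endpoints of the scan-line."""
    return 0.0 <= scan_line_coordinate(point, start, end) <= 1.0


# Hypothetical minimization results checked against a scan-line along one axis.
inside = projects_onto_scan_line([1.0, 3.0], [0.0, 0.0], [2.0, 0.0])
outside = projects_onto_scan_line([3.0, 1.0], [0.0, 0.0], [2.0, 0.0])
```

The scaled coordinate returned by `scan_line_coordinate` could also be used to fill a histogram of the projected minimization results along the scan-line.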

The results are presented in Figure 76. The interpretation is more transparent here than in the previous line-scan. Again, the predictions from the parameterizations and the actual generator response are in very good agreement, especially in the region of the minimum. Looking at the goodness-of-fit values only, almost no difference can be found. However, the quadratic interpolations seem to shift the line-scan a little to the left.

The error of the proclaimed best tune, if projected on the scan-line, is displayed as a gray band. It fits in perfectly and is large enough to cover the true minimum in this projection. We find again that the cubic interpolation describes the true generator response even better, especially in regions further away from the minimum.

Not surprisingly, the distribution of the minimization results obtained from the parameterizations is quite broad.

[Figure 75 (schematic): the two parameter axes; the legend shows the parameter sampling hypercube, the minimization result considered as best tune with its error ellipse, scan-line 1 along the direction of largest uncertainty, scan-line 2 along the direction of smallest uncertainty, the parameter points sampled from the scan-lines, and minimization results that project onto scan-line 2 only or onto both scan-lines.]

Figure 75: Scanning in a two-dimensional “hypercube” along a straight line that pierces the best tuning point. The scan is done along the direction of the largest/smallest uncertainty. The directions are calculated from the best tune’s covariance matrix.

Scanning along the direction of smallest uncertainty

From the line-scans investigated so far one would expect that a scan along the direction of the smallest uncertainty shows the best agreement between the predictions from the parameterizations and the true generator response. Furthermore, the distribution of the minima on the scan-line should be somewhat narrower.

Indeed, the line-scan in Figure 77 shows exactly this behaviour with a quite good description of the generator response by the cubic interpolation. The minima of all predictions and the true generator response are in perfect agreement.

Furthermore, it should be mentioned that the relative length of the scan-line in Figure 77 is only about half that of the scan-line in Figure 76 (see also Table 9), meaning that when comparing the two line-scans along the extremal directions, the line-scan axes of


In document Systematic Event Generator Tuning with Professor (Page 102-107)