Systematic Event Generator Tuning with Professor
Diploma thesis (Diplomarbeit)
submitted for the academic degree of Diplom-Physiker
by Holger Schulz
born 14 February 1983 in Lutherstadt Wittenberg
Humboldt-Universität zu Berlin
Faculty of Mathematics and Natural Sciences I, Department of Physics
First referee: Prof. Dr. Heiko Lacker (HU Berlin)
Second referee: Dr. Ulrich Husemann (DESY)
Abstract
This diploma thesis describes Professor, a new program for tuning the model parameters of Monte Carlo event generators to experimental data by parameterising the per-bin generator response and numerically optimising the parameterised behaviour against data. Simulated experimental analysis data is obtained using the Rivet analysis toolkit. The Professor procedure and its implementation are presented and illustrated by applying the method to a tuning of the Pythia 6 event generator to data from the Tevatron experiments. This tuning is a substantial improvement on existing standard parameter choices and is recommended as a base tuning for LHC experiments, to be improved systematically once early LHC data becomes available. It is offered as a standard tuning in Pythia 6 from version 6.4.20 onwards.
Contents
1 Introduction 11
1.1 Rivet . . . 12
1.2 AGILe . . . 13
1.3 Tuning methods . . . 14
2 The Professor method 16
2.1 The parameterised response function . . . 17
2.2 Determining the response function . . . 18
2.3 Goodness of fit function . . . 20
2.4 Maximising the total Goodness of Fit . . . 21
2.5 Final checks . . . 22
3 Implementation 23
3.1 A basic work-cycle . . . 23
3.1.1 Prerequisites . . . 23
3.1.2 Random sampling of parameter points . . . 23
3.1.3 Running the generator and storing histograms . . . 24
3.1.4 Reading the data and performing the tuning . . . 25
3.2 Tuning . . . 25
3.2.1 GoF optimisation . . . 26
3.2.2 GoF vs. parameter value . . . 28
3.2.3 GoF vs. weight combination . . . 28
3.2.4 Correlation display . . . 28
3.2.5 Sensitivities . . . 29
3.3 Validation . . . 31
3.3.2 Tune verification . . . 33
3.3.3 Tuning stability . . . 33
4 The Underlying Event 35
4.1 Probing the Proton structure . . . 35
4.2 Underlying event phenomenology . . . 36
4.2.1 Multiple parton interactions (MPI) . . . 37
4.2.2 Primordial k⊥-distribution . . . 38
4.2.3 Beam Remnants . . . 39
4.3 The Underlying Event at the Tevatron . . . 41
4.3.1 Studying the Underlying Event in leading jet events . . . 41
4.3.2 Studying the Underlying Event in Drell-Yan processes . . . 42
5 The setup S0 in Pythia 6 44
5.1 Model setup - switches . . . 44
5.1.1 MSTP(41) - choosing an evolution variable . . . 45
5.1.2 MSTP(51), MSTP(52) - choosing the parton density function (pdf) . . . 45
5.1.3 MSTP(70) - deciding on cut-off for initial state radiation (ISR) . . . 46
5.1.4 MSTP(72) - combining initial state and final state radiation . . . 47
5.1.5 MSTP(81) - how to model multiple interactions . . . 48
5.1.6 MSTP(82) - hadronic matter overlap and impact parameter . . . 49
5.1.7 MSTP(88) - modelling the transition from quark junctions to di-quarks/baryons . . . 49
5.1.8 MSTP(95) - choosing a colour-reconnection scenario . . . 50
5.2 Model setup - parameters . . . 51
5.2.1 Evaluating αs for Initial State Radiation . . . 51
5.2.2 Impact parameter . . . 51
5.2.4 Intrinsic k⊥ distribution . . . 55
5.2.5 Initial State Radiation (ISR) . . . 56
5.2.6 Merging matrix elements to final state parton showers . . . 56
5.2.7 Colour-reconnection . . . 57
5.2.8 Beam-remnant x-enhancement . . . 58
5.2.9 Suppressing the breakup of the beam remnant . . . 59
6 Tuning the underlying event 61
6.1 Monte Carlo data production . . . 61
6.2 Observables . . . 62
6.2.1 p⊥-distribution of Z-bosons . . . 63
6.2.2 CDF-Run-I multiplicity measurement using a min-bias trigger . . 64
6.2.3 CDF-Run-I underlying event in min-bias and leading jet events . . 65
6.2.4 CDF-Run-II underlying event in leading jet events . . . 71
6.2.5 Minimum bias: p⊥ vs. Nch . . . 76
6.2.6 CDF-Run-II: Studying the Underlying Event with Z-bosons in Drell-Yan events . . . 76
6.2.7 DØ-Run-II dijet angular correlations . . . 86
6.3 Tuning and validation . . . 91
6.3.1 Choosing weights . . . 91
6.3.2 Minimisation . . . 93
6.3.3 Parameter-wise distribution of minimization results . . . 94
6.3.4 Weight comparison . . . 96
6.3.5 Linescan validation . . . 100
6.4 χ2 comparison to different tunes. . . 105
6.5 Parameter-parameter correlations . . . 106
A Professor-usage 112
A.1 prof-scanparams . . . 112
A.2 prof-tune . . . 113
A.2.1 How to use several run-combinations . . . 113
A.2.2 Imposing parameter limits . . . 114
A.2.3 Fixing parameters . . . 114
A.2.4 Choosing the starting point method . . . 115
A.2.5 Choosing the parameterisation method . . . 115
A.3 Visualisation and validation . . . 115
A.3.1 Printing minimization results . . . 115
A.3.2 GoF-comparison of several tunes . . . 116
A.3.3 Merging minimisation results . . . 116
A.3.4 Plotting envelopes . . . 116
A.3.5 Sensitivity plots . . . 117
A.3.6 Pull-distributions . . . 118
A.3.7 Line-scan validation . . . 118
A.3.8 Plotting of minimisation results . . . 119
A.3.9 Correlation display . . . 119
List of Figures
1 Parameterisation of generator response, schematically . . . 16
2 Professor-Rivet work-cycle . . . 26
3 Illustration of sensitivity calculation . . . 30
4 Example pull distributions . . . 32
5 Deep inelastic scattering . . . 35
6 p-p scattering . . . 37
7 Gluon-gluon fusion . . . 37
8 Lepton-pair production at the Tevatron in a Drell-Yan process . . . 38
9 The Underlying Event, schematically . . . 38
10 Multiple parton interaction . . . 39
11 Colour-reconnection, schematically . . . 40
12 Underlying Event topologies . . . 42
13 Subdivision of the transverse region into “TransMin” and “TransMAX” . . . 43
14 Gluon emission and scale selection . . . 47
15 Smooth turn-off of cross-section . . . 48
16 Beam remnants and formation of composite objects . . . 50
17 The effect of a varying impact parameter . . . 52
18 Screening of colour charges . . . 54
19 q-q-q-junction and Lund-model hadronization . . . 55
20 Intrinsic k⊥-distribution of partons inside a hadron . . . 56
21 CDF Run-I Z-boson p⊥-distribution (peak region only) . . . 64
22 CDF Run-I, measuring the charged multiplicity at √s=630 GeV using a minimum bias trigger . . . 65
23 CDF Run-I, measuring the charged multiplicity at √s=1800 GeV using a minimum bias trigger . . . 66
24 CDF-Run-I, number of charged particles in the toward region (minimum bias data sample) . . . 68
25 CDF-Run-I, number of charged particles in the transverse region (minimum bias data sample) . . . 68
26 CDF-Run-I, number of charged particles in the away region (minimum bias data sample) . . . 68
27 CDF-Run-I, p⊥-sum in the toward region (minimum bias data sample) . . 69
28 CDF-Run-I, p⊥-sum in the transverse region (minimum bias data sample) 69
29 CDF-Run-I, p⊥-sum in the away region (minimum bias data sample) . . . 69
30 CDF-Run-I, p⊥-sum in the toward region (JET20 data sample) . . . 70
31 CDF-Run-I, p⊥-sum in the transverse region (JET20 data sample) . . . 70
32 CDF-Run-I, p⊥-sum in the away region (JET20 data sample) . . . 70
33 CDF Run-I, p⊥-distribution in the transverse region for events with p⊥,leading jet > 30 GeV . . . 71
34 CDF Run-II leading jets, transverse region charged particle density . . . 73
35 CDF Run-II leading jets, transMAX region charged particle density . . . 73
36 CDF Run-II leading jets, transMIN region charged particle density . . . 73
37 CDF Run-II leading jets, transDIF region charged particle density . . . 74
38 CDF Run-II leading jets, transverse region charged p⊥ sum density . . . 74
39 CDF Run-II leading jets, transMAX region charged p⊥ sum density . . . 74
40 CDF Run-II leading jets, transMIN region charged p⊥ sum density . . . 75
41 CDF Run-II leading jets, transDIF region charged p⊥ sum density . . . 75
42 CDF Run-II leading jets, transverse region charged p⊥ average . . . 75
43 CDF Run-II, measuring the mean track p⊥ vs. multiplicity using a minimum bias trigger . . . 76
44 The Underlying Event in Drell-Yan processes . . . 77
45 CDF Run-II Drell-Yan, toward region charged particle density . . . 78
47 CDF Run-II Drell-Yan, transMAX region charged particle density . . . 79
48 CDF Run-II Drell-Yan, transMIN region charged particle density . . . 79
49 CDF Run-II Drell-Yan, transDIF region charged particle density . . . 79
50 CDF Run-II Drell-Yan, away region charged particle density . . . 80
51 CDF Run-II Drell-Yan, toward region charged p⊥ sum density . . . 80
52 CDF Run-II Drell-Yan, transverse region charged p⊥ sum density . . . 80
53 CDF Run-II Drell-Yan, transMAX region charged p⊥ sum density . . . 81
54 CDF Run-II Drell-Yan, transMIN region charged p⊥ sum density . . . 81
55 CDF Run-II Drell-Yan, transDIF region charged p⊥ sum density . . . 81
56 CDF Run-II Drell-Yan, away region charged p⊥ sum density . . . 82
57 CDF Run-II Drell-Yan, toward region charged p⊥ average . . . 83
58 CDF Run-II Drell-Yan, transverse region charged p⊥ average . . . 83
59 CDF Run-II Drell-Yan, away region charged p⊥ average . . . 83
60 CDF Run-II Drell-Yan, toward region charged p⊥ maximum . . . 84
61 CDF Run-II Drell-Yan, transverse region charged p⊥ maximum . . . 84
62 CDF Run-II Drell-Yan, away region charged p⊥ maximum . . . 84
63 CDF Run-II Drell-Yan, average lepton pair p⊥ versus charged multiplicity . . . 85
64 CDF Run-II Drell-Yan, average charged p⊥ versus charged multiplicity . . . 86
65 CDF Run-II Drell-Yan, average charged p⊥ versus charged multiplicity, p⊥(Z) less than 10 GeV . . . 86
66 DØ Run-II, measuring the azimuthal angle between the two leading jets; p⊥max ∈ {75,100} GeV . . . 87
67 DØ Run-II, measuring the azimuthal angle between the two leading jets; p⊥max ∈ {100,130} GeV . . . 88
68 DØ Run-II, measuring the azimuthal angle between the two leading jets; p⊥max ∈ {130,180} GeV . . . 88
69 DØ Run-II, measuring the azimuthal angle between the two leading jets; p⊥max > 180 GeV . . . 88
70 How different weights influence the shape of observables . . . 94
71 Minimisation results, χ2/Ndf vs. parameter . . . 98
72 Minimisation results, parameter vs. weights . . . 99
73 Linescan between tuning points, schematic . . . 101
74 Line scan validation of the tuning result obtained with Professor, comparing to tune S0 . . . 102
75 Linescan along direction of largest/smallest uncertainty, schematic . . . . 104
76 Line scan validation of the tuning result obtained with Professor along the direction of largest uncertainty . . . 105
77 Line scan validation of the tuning result obtained with Professor along the direction of smallest uncertainty . . . 106
List of Tables
1 The number of polynomial coefficients $N_n^{(P)}$ as a function of the dimensionality . . . 18
2 Switches for the Underlying Event in an S0-like setup in Pythia 6 . . . 45
3 Tuned flavour parameters and their default values in Pythia 6 . . . 60
4 Tuned fragmentation parameters and their default values for the p⊥-ordered shower in Pythia 6 . . . 60
5 Parameter sampling boundaries used for the sampling of points in the nine-dimensional space of parameters relevant for the underlying event physics model in Pythia 6 . . . 62
6 Tuned parameters for the underlying event using the p⊥-ordered shower in an S0-like setup . . . 63
7 An overview of the observables and the parameters they are most sensitive to . . . 90
8 Observables and weights used for the tuning of the underlying event in an S0-like setup . . . 93
9 Relative length of scan-lines . . . 103
10 Vectors used for the definition of the scan-lines in the nine-dimensional parameter hypercube . . . 107
11 Comparing different tunes in terms of χ2 . . . 107
1 Introduction
It is an inevitable consequence of the physics approximations in Monte Carlo event generators that there will be a number of relatively free parameters which must be tweaked if the generator is to describe experimental data. Such parameters may be found in most aspects of generator codes, from choices of the perturbative parton cascade to the non-perturbative hadronisation process, and on the boundaries between such models.
With a view to the upcoming LHC experiments, the modelling of QCD (Quantum Chromodynamics) in general, and of its non-perturbative aspects in particular, will be crucial for understanding the data taken: alongside the expected and hoped-for “new physics”, what is definitely going to be measured at the LHC is QCD. Since non-perturbative physics models are by necessity deeply phenomenological, they typically account for the majority of generator parameters: typical hadronisation models require parameters to describe e.g. the kinematic distribution of p⊥ (transverse momentum¹) in hadron fragmentation, baryon-to-meson ratios, the suppression of strangeness and of η and η′ mesons, and the assignment of orbital angular momentum to final state particles [1–4].
Furthermore, the model implementation of the underlying event, which is expected to be the dominant background at the LHC, also depends on deeply phenomenological assumptions, be it the primordial distribution of partons inside the colliding hadrons or the mechanism of multiple parton interactions. The result is a proliferation of parameters, of which O(10–30) are of particular importance for collider physics simulations.
Apart from rough arguments about their typical scale, these parameters are freely floating: they must be matched to experimental data for the generator to perform well. Even parameters which appear fixed by experiment, such as ΛQCD, should be treated in generator tuning as having some degree of flexibility, since the generator (unlike nature) can only apply them in a fixed-order scheme with incomplete large-log resummation. It is also important that the experimental data to which parameters are tuned cover a wide range of physics, to ensure that in fitting one distribution well, others do not suffer unduly. Performing such a tune manually is slow, does not scale well, and cannot be easily adapted to incorporate new results or generator models. In addition, the results are always sub-optimal: a truly good tuning of a generator, which can highlight deficiencies in the physics model as well as provide improved simulations for experimentalists, requires a more systematic approach.
¹ Transverse momentum or p⊥ describes the momentum component of a particle or a compound […]
This thesis describes Professor (PROcedure For EStimating Systematic errORs), a new tuning system for Monte Carlo event generators. It eliminates the problems of manual and brute-force tunings by parameterising a generator’s response to parameter shifts on a bin-by-bin basis (see Figure 1). This parameterisation, unlike a brute-force method, is amenable to numerical minimisation on a timescale short enough to make explorations of tuning criteria possible. Adding new data or generator models to the system is also relatively simple. The Professor procedure is then applied to an optimisation of the Pythia 6 event generator against underlying event (UE) data from the Tevatron (Run-I and Run-II) experiments CDF and DØ, using data from leading jet, Drell-Yan and minimum bias events. The results of an earlier tuning of flavour and fragmentation parameters to e+e− event shape and flavour spectrum data [5], also obtained with the Professor system, are used as well.
The resulting tune is a substantial improvement on existing tunes, such as the Atlas ’08 tune [6], and demonstrates that the Professor system is an important tool for LHC event simulation, both before data taking and in response to early measurements at and above a centre-of-momentum energy of 10 TeV.
Systematic tunings of Monte Carlo event generators were first performed in 1995 by Hamacher et al. [7] on the fragmentation model in Pythia. The Professor system is based on the same ideas and can be seen as a direct successor. It is a joint project of Andy Buckley¹, Hendrik Hoeth², Frank Krauss¹, Heiko Lacker³, Holger Schulz³ and Jan Eike von Seggern³.
The Professor system is based on simulated experimental analysis data, which is provided by the Rivet analysis library. As Professor and Rivet development are closely linked, Rivet is summarised first.
1.1 Rivet
Rivet [8,9] is a tool, written in C++, designed for the comparison and validation of Monte Carlo event generators. The only reasonable way to validate a Monte Carlo event generator is to compare its predictions to real data. Rivet is generator independent since its only input is an abstract data format, the HepMC event record. This format stores all the relevant information about a generated event except which generator produced it. This makes it a preferred standard, since it avoids the temptation to tamper with the actual generator code. A further benefit of HepMC being the only input is that Rivet is general-purpose: any generator that has an interface to this data format can be used with Rivet and, as will be seen later, is in principle tunable with Professor.
¹ Durham University, UK
² Lund University, Sweden
In order to compare generator output to real data, Rivet draws on published and unpublished data stored in histograms, and fills the Monte Carlo “data” using exactly the same binning as the original data. The analyses included in Rivet were implemented in close cooperation with the authors of the measurements, especially where certain cuts or procedures are not clearly explained in the publications. The implementations are therefore faithful to a high level of detail.
The number and variety of analyses included in Rivet is large and steadily growing. At the moment it covers mostly the experiments of LEP (ALEPH, DELPHI, Jade, L3, OPAL), HERA (H1, ZEUS) and the Tevatron (CDF, DØ). In this diploma thesis I will focus on the latter two, since they are the most interesting for the prospect of tuning the underlying event in hadron-hadron collisions. The variety of data gave rise to a very large library of algorithms needed to reproduce the original data analyses. For instance, quite a number of jet algorithms (k⊥, Jade, Midpoint, SISCone, Durham, . . . ) and event shape variables, such as Thrust, Sphericity, Oblateness or the Parisi tensor, are predefined and therefore easily accessible to the user.
To use Rivet, the user only has to decide which analyses to run and to specify either a file or a pipe from which to read the HepMC data. Analyses are identified by a combination of the experiment name and the SPIRES ID [10]. For example, the following command reads HepMC data from a file called input.hepmc, uses all the observables found in the analysis CDF_2000_S4155203 and writes the resulting histograms to a file called histograms.aida:
rivet input.hepmc -a CDF_2000_S4155203 -H histograms.aida
By default, Rivet writes the AIDA-XML [11] histogram format, but ROOT [12] and “flat” (plain text) formats are also supported.
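For scripted use, the command line above can be assembled programmatically. The following is a minimal sketch in Python, not part of Rivet itself: the helper name rivet_command is hypothetical, and only the -a and -H options quoted in the text are assumed to exist.

```python
def rivet_command(hepmc_path, analysis, histo_path):
    """Build the argument list for the rivet invocation quoted in the text.

    rivet_command is a hypothetical helper; it only mirrors the documented
    form `rivet <input> -a <analysis> -H <histogram file>`.
    """
    return ["rivet", hepmc_path, "-a", analysis, "-H", histo_path]


cmd = rivet_command("input.hepmc", "CDF_2000_S4155203", "histograms.aida")
print(" ".join(cmd))
```

The resulting list can be handed to a process launcher such as subprocess.run on a machine where Rivet is installed.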
1.2 AGILe
While all the modern C++ generators like Sherpa, Herwig++ and Pythia 8 are equipped […] layer of abstraction, the older Fortran generators¹ do not. They require another interface, called AGILe (A Generator Interface Library), which steers the generators in a convenient way and produces the HepMC output required by the analyses implemented in Rivet.
1.3 Tuning methods
While Rivet provides a system for comparing Monte Carlo distributions for a given generator parameter set to a wide range of experimental data, it has no intrinsic mechanism for improving the quality of that parameter set. Historically, the usual methods of generator tuning have been the purely manual “by eye” method, and a brute-force scan of the parameter space.
Manual tunes Tuning any complex system by eye is evidently non-optimal, and
would barely be worth mentioning were it not the most widely used method until now! Manual methods require significant insight into the algorithmic response to parameter choices for even semi-reasonable results; and they are intrinsically slow since the procedure typically involves a lot of iterations of parameter choices and, even with unhappily low statistics, the turn-around time of a set of runs is a day or more. The scaling is also poor: few humans can cope with manual optimisation of more than five or so parameters, guided by a similar number of comparison plots. The responsiveness to new data or models is similarly deficient, since tuning a different generator essentially involves starting from scratch and, having done a tune once, few people are enthusiastic to repeat the exercise! The prevalence of manual tunings, despite their myriad shortcomings, is a major motivator for the development of Professor.
Brute force tunes This label includes any direct approach which involves running generators very many times. Naïvely, one can think of dividing the parameter space into a grid and sampling at the grid line intersections. It is readily seen that such an approach does not scale: a comprehensive scan of 5 parameters with 10 divisions in each parameter requires 10^5 = 100,000 generator runs, each perhaps making 10M events. And even then, the sampling granularity will be insufficient for meaningful results. Randomly sampling the space, looking for serendipitous best-χ2 values, has more merit, but is similarly bedevilled by scaling problems and by the lack of satisfying ways either to improve the “best” point systematically or to know whether the minimum that was stumbled into is local or global.
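The scaling argument can be made concrete with a few lines of arithmetic. The sketch below uses the numbers quoted above (5 parameters, 10 divisions, 10M events per run); the function name grid_scan_cost is illustrative only.

```python
def grid_scan_cost(n_params, divisions, events_per_run):
    """Return (number of generator runs, total events) for a full grid scan."""
    runs = divisions ** n_params          # one run per grid point
    return runs, runs * events_per_run


runs, events = grid_scan_cost(n_params=5, divisions=10, events_per_run=10_000_000)
print(runs)    # 100000
print(events)  # 1000000000000
```

Each additional parameter multiplies the number of runs by the number of divisions, which is why the grid approach breaks down long before the O(10–30) parameters of interest are reached.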
Finally, the approach of putting a generator code directly into a serial optimiser, be it a Markov Chain Monte Carlo (MCMC) sampler or a minimiser such as Minuit, may be summarily dismissed. While the approaches above have the benefit of being parallelisable, such methods are intrinsically serial: one must wait for the nth “function” evaluation to decide where the (n+1)th will be. Since generator runs take days, and even the burn-in periods of MCMC samplers may require thousands of samples, this approach is clearly intractable.
¹ AGILe supports, among others, these Fortran generators: Pythia 6, Herwig 6, AlpGen, Charybdis
Parameterisation-based tunes The final approach, which has a lengthy history [7,13], is to parameterise the generator behaviour. Since the fit function itself is expected to be complicated and not readily parameterisable, there is a layer of indirection: a polynomial is fitted to the generator response, $\mathrm{MC}_b$, of each observable bin $b$ to changes in the parameter vector $\vec{p} = (p_1, \ldots, p_P)$ of $P$ parameters.
Having determined, via means yet undetailed, a good parameterisation of the generator response to the steering parameters for each observable bin, it remains to construct a goodness of fit (GoF) function and minimise it. The result is an estimated parameter vector, $\vec{p}_\mathrm{tune}$, which should (modulo checks of the technique’s robustness) closely resemble the best description of the tune data that the generator can provide.
In parameterisation-based tuning, the run time is dominated by the time taken to run the generator and generate the reference data points. Assuming that sufficient CPU is available to run several hundred MC jobs in parallel, this is at most a few days; the time taken to convert this to a predicted set of best parameters is a few minutes (and can again be parallelised for different configurations as a safety check.) If the details left out above are tractable, then this technique offers the possibility of systematic tuning on a timescale compatible with rapid and exploratory re-tunings, ideal for responding to early LHC measurements.
As will have become obvious, parameterisation-based optimisation is the approach taken by the Professor system. The following sections document the details of the Professor method and its implementation and illustrate the basic work-cycle of a tuning with Professor, followed by an introduction to the phenomenology of the underlying event. The focus of this work is the tuning of a model setup of the underlying event in Pythia 6, called “tune S0”. Particular attention is paid to the validation of the obtained tuning, and comparisons to other standard choices are presented in several ways. Finally, a brief user guide to the Professor system is given.
[Figure 1 schematic; labels: observable, parameter, parameterisation]
Figure 1: The bin-wise parameterisation of the generator’s response to parameter shifts in parameterisation-based tuning systems such as Professor. This is a qualitative example for a one-dimensional parameter space. For each bin of the observable(s) in question, a unique parameterisation of the generator response is calculated as a function of the varied parameter(s) and can therefore be used to predict approximately the generator’s outcome for any parameter value. Given the simple nature of the parameterisation, using polynomials of second or third order, a global goodness of fit measure may be defined and easily minimised numerically.
2 The Professor method
To summarise, the rough formalism of systematic generator tuning is to define a goodness of fit (GoF) function between the generated and reference data, and then to minimise that function. The intrinsic problem is that the true fit function will certainly not be analytic, and any iterative approach to minimisation is doomed by the expense of evaluating the fit function at each new parameter-space point. What we require is an optimisation method designed for very computationally expensive functions whose form is not known a priori. Parameterisation-based optimisation meets these criteria by using numerical methods to mimic the behaviour of an expensive function with inexpensive ones, and by being amenable to parallelisation in the critical stages. The details to be described in this section are: the choice of general parameterisation function, the method for fitting the general function to the specific response of an MC event generator, the goodness of fit function to be used, and the method of maximising its quality.
2.1 The parameterised response function
As already described, the function to be parameterised is not the overall goodness of fit function between the simulation and the reference data, but the large set of observable bin values for every bin, $b$, in every distribution. Accordingly, the output of the first stage of Professor is a set of functions $f^{(b)}(\vec{p})$, which model the true MC response, $\mathrm{MC}_b$, of each observable bin to changes in the $P$-element parameter vector, $\vec{p}$.
This ensemble of parameterisations is useful in two ways: first (and most importantly), it provides safety against deviations from the chosen form of the parameterising function, since such deviations are not likely to be correlated between a majority of the bins in valid regions of parameter space. This incoherence of failure to describe the bin-wise generator response ensures that the aggregated measure of generator modelling is faithful to the true behaviour. Second, by breaking the problem down to a fine-grained level, it is possible to select particular regions of distributions as more interesting than
the rest — say, the peak of the Z-boson p⊥ spectrum, which is particularly sensitive to
QCD modelling.
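To account for lowest-order parameter correlations, a polynomial of at least second order is used as the basis for the bin parameterisation:
\begin{equation}
\mathrm{MC}_b(\vec{p}) \approx f^{(b)}(\vec{p}) = \alpha_0^{(b)} + \sum_i \beta_i^{(b)} p'_i + \sum_{i \le j} \gamma_{ij}^{(b)} p'_i p'_j , \tag{1}
\end{equation}
with $\vec{p}'$ being the shifted parameter vector, $\vec{p}' \equiv \vec{p} - \vec{p}_0$.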
The number of parameters and the order of the polynomial determine the number of coefficients to be determined. For a second order polynomial in $P$ parameters, the number of coefficients is
\begin{equation}
N_2^{(P)} = 1 + P + P(P+1)/2 , \tag{2}
\end{equation}
since only the independent components of the matrix term are to be counted. For a general polynomial of order $n$, the number of coefficients is
\begin{equation}
N_n^{(P)} = \sum_{i=0}^{n} \frac{1}{i!} \prod_{j=0}^{i-1} (P+j) . \tag{3}
\end{equation}
How the number of coefficients scales with $P$ for 2nd and 3rd order polynomials is shown in Table 1.
Num params, P    N_2^(P) (2nd order)    N_3^(P) (3rd order)
 1                 3                      4
 2                 6                     10
 4                15                     35
 6                28                     84
 8                45                    165
 9                55                    220
10                66                    286
Table 1: The number of polynomial coefficients $N_n^{(P)}$ as a function of the dimensionality (number of parameters) $P$, for polynomials of second order ($n=2$) and third order ($n=3$).
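As a cross-check of equation (3) against Table 1, the coefficient count can be evaluated directly. This is a minimal sketch; the function name n_coeffs is illustrative, and the comparison with the binomial coefficient uses the standard identity that a general polynomial of order n in P variables has C(P+n, n) independent coefficients.

```python
from math import comb, factorial


def n_coeffs(P, n):
    """Number of coefficients of a general order-n polynomial in P parameters,
    evaluated term by term as in equation (3)."""
    total = 0
    for i in range(n + 1):
        prod = 1
        for j in range(i):
            prod *= P + j                 # empty product (i = 0) stays 1
        total += prod // factorial(i)     # i consecutive integers are divisible by i!
    return total


for P in (1, 2, 4, 6, 8, 9, 10):
    # The closed form C(P+n, n) must agree with the explicit sum.
    assert n_coeffs(P, 2) == comb(P + 2, 2)
    assert n_coeffs(P, 3) == comb(P + 3, 3)
    print(P, n_coeffs(P, 2), n_coeffs(P, 3))
```

The printed rows reproduce the entries of Table 1, for example 55 and 220 coefficients for the nine-parameter case.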
A useful feature of using a polynomial for the fit function, other than its general-purpose robustness, is that the actual choice of $\vec{p}_0$ is irrelevant: a shift in the reference point simply redefines the $\{\alpha, \beta, \gamma\}$ coefficients, but the function remains the same. Hence we are free to choose a numerically stable value within each parameter’s chosen range without loss of generality: we use the centre of the hypercube $[\vec{p}_\mathrm{min}, \vec{p}_\mathrm{max}]$, as will be defined in the next section.
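The irrelevance of the reference point can be verified explicitly in one dimension. The following worked example is an illustration added for clarity; the symbols $q_0$, $\delta$ and the primed coefficients are introduced here and do not appear elsewhere in the text. Moving the reference point from $p_0$ to $q_0 = p_0 + \delta$ and substituting $p - p_0 = (p - q_0) + \delta$ gives

```latex
\begin{align*}
f(p) &= \alpha + \beta\,(p - p_0) + \gamma\,(p - p_0)^2 \\
     &= \underbrace{\left(\alpha + \beta\delta + \gamma\delta^2\right)}_{\alpha'}
      + \underbrace{\left(\beta + 2\gamma\delta\right)}_{\beta'}\,(p - q_0)
      + \underbrace{\gamma}_{\gamma'}\,(p - q_0)^2 .
\end{align*}
```

The function is again a second order polynomial in the shifted variable, with redefined coefficients $\alpha'$, $\beta'$ and $\gamma'$, so the fitted response does not depend on which reference point is used.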
2.2 Determining the response function
Given a general polynomial, one must now determine the coefficients $\alpha$, $\beta$, $\gamma$ for each bin so as to best mimic the true generator behaviour. This could be done by a Monte Carlo numerical minimisation method, but there would be a danger of finding sub-optimal local minima, and automatically determining convergence is a potential source of problems. Fortunately, this problem can be cast in such a way that a deterministic method can be applied.
One way to determine the polynomial coefficients would be to run the generator at as many parameter points, $N$, as there are coefficients to be determined. A square $N \times N$ matrix can then be constructed, mapping the appropriate combinations of parameters on to the coefficients to be determined; a normal matrix inversion can then be used to solve the system of simultaneous equations and thus determine the coefficients. Since there is no reason for the matrix to be singular, this method will always give an “exact” fit of the polynomial to the generator behaviour. However, this does not reflect the true complexity of the generator response: we have engineered the exact fit by restricting the number of samples on which our interpolation is based, and it is safe to assume that taking a larger number of samples would show deviations from what a polynomial can describe, both because of intrinsic complexity in the true response function and because of the statistical sampling error that comes from running the generator for a finite number of events. What we would like is to find a set of coefficients (for each bin) which average out these effects and are a least-squares best fit to the oversampled generator points. As it happens, there is a generalisation of matrix inversion to non-square matrices, the pseudoinverse [14], with exactly this property.
As suggested, the set of “anchor” points for each bin is determined by running the generator at N randomly sampled points in a P-dimensional parameter hypercube $[\vec{p}_\mathrm{min}, \vec{p}_\mathrm{max}]$ defined by the user. This definition requires physics input: each parameter $p_i$ should have its upper and lower sampling limits $p_i^\mathrm{min,max}$ chosen so as to encompass all reasonable values. We find that generosity in this definition is sensible, as Professor may suggest tunes which lie outside conservatively chosen ranges, forcing a repeat of the procedure. Each sampled point may actually consist of many generator runs, which are then merged into a single collection of simulation histograms. The simultaneous-equations solution described above is possible if the number of sampled points is the same as the number of coefficients between the P parameters, i.e. $N = N_\mathrm{min}^{(P)} = N_n^{(P)}$. The more robust pseudoinverse method applies when $N > N_\mathrm{min}^{(P)}$: we prefer to oversample by at least a factor of 2.
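The counting of coefficients can be made concrete with a short sketch. It assumes that $N_\mathrm{min}^{(P)}$ is the number of monomials of total degree up to the polynomial order in P variables, which is a binomial coefficient; for two parameters and a second-order polynomial this gives 6. The function names are illustrative and are not part of Professor.

```python
from math import comb


def n_coefficients(n_params: int, order: int = 2) -> int:
    """Number of independent coefficients of a polynomial of the given
    order in n_params variables (all monomials of total degree <= order)."""
    return comb(n_params + order, order)


def n_runs(n_params: int, order: int = 2, oversampling: float = 2.0) -> int:
    """Suggested number of sampled points for a given oversampling factor."""
    return int(round(oversampling * n_coefficients(n_params, order)))


# Columns: P, coefficients at second order, at third order, runs at 2x oversampling
for P in (2, 4, 6):
    print(P, n_coefficients(P, 2), n_coefficients(P, 3), n_runs(P))
```

The oversampling factor of 2 used as default here is the lower bound preferred in the text; larger factors simply scale the number of runs.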
The numerical implementation of the pseudoinverse uses a standard singular value decomposition (SVD) [15]. First, the polynomial is cast into the form of a scalar product,
$$\mathrm{MC}_b(\vec p) \approx f^{(b)}(\vec p) = \sum_{i=1}^{N_\mathrm{min}^{(P)}} c_i^{(b)} \tilde p_i, \tag{4}$$
where the $c_i^{(b)}$ coefficients are the independent components of $\alpha_0^{(b)}$, $\beta_i^{(b)}$, and $\gamma_{ij}^{(b)}$ in equation (1), and $\tilde{\vec p}$ is an extended parameter vector containing all the corresponding combinations of the parameter vector components:
$$\tilde{\vec p} = (\underbrace{1}_{\text{offset in (1)}},\ \underbrace{p_1, \cdots, p_P}_{\text{linear terms in (1)}},\ \underbrace{p_1 p_1, p_1 p_2, \cdots, p_1 p_P,\ p_2 p_2, \cdots, p_2 p_P,\ \cdots,\ p_P p_P}_{\text{quadratic terms in (1)}}) \tag{5}$$
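Evaluating this scalar product at each of the N sampled parameter points gives the matrix equation,
$$\vec v^{(b)} = \tilde P\, \vec c^{(b)}, \tag{6}$$
where $\vec v^{(b)}$ contains the generated bin values at the sample points, and the rows of $\tilde P$ are composed of extended parameter vectors like $\tilde{\vec p}$ in equation (5). The $c_i^{(b)}$ can then be determined by the pseudoinversion of $\tilde P$,
$$\vec c^{(b)} = \tilde{\mathcal I}[\tilde P]\, \vec v^{(b)}, \tag{7}$$
where $\tilde{\mathcal I}$ is the pseudoinverse operator.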
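For a two-parameter case, where the parameters $p_1$ and $p_2$ can take on the N values $x_1, \cdots, x_N$ and $y_1, \cdots, y_N$, the above may be explicitly written as
$$
\underbrace{\begin{pmatrix}
1 & x_1 & y_1 & x_1^2 & x_1 y_1 & y_1^2 \\
1 & x_2 & y_2 & x_2^2 & x_2 y_2 & y_2^2 \\
 & & & \vdots & & \\
1 & x_N & y_N & x_N^2 & x_N y_N & y_N^2
\end{pmatrix}}_{\tilde P\ \text{(sampled parameter sets)}}
\underbrace{\begin{pmatrix} \alpha_0 \\ \beta_x \\ \beta_y \\ \gamma_{xx} \\ \gamma_{xy} \\ \gamma_{yy} \end{pmatrix}}_{\vec c\ \text{(coeffs)}}
=
\underbrace{\begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_N \end{pmatrix}}_{\vec v\ \text{(values)}}
\tag{8}
$$
where the numerical subscripts indicate the N generator runs. Note that the columns of $\tilde P$ include all $N_\mathrm{min}^{(2)} = 6$ combinations of parameters in the polynomial, and that $\tilde P$ is square (i.e. minimally pseudoinvertible) when $N = N_\mathrm{min}^{(P)}$.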
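The two-parameter case can be illustrated with a minimal NumPy sketch. It builds the extended parameter vectors in the column order of equation (8) and obtains the coefficients with `numpy.linalg.pinv`, which computes the pseudoinverse via an SVD. The helper names, the toy coefficients and the toy "generator" (an exact quadratic) are assumptions made for illustration; this is not the Professor implementation.

```python
import numpy as np


def extended_vector(p):
    """Return (1, p_1..p_P, p_i*p_j for i<=j) for one parameter point."""
    p = np.asarray(p, dtype=float)
    quad = [p[i] * p[j] for i in range(len(p)) for j in range(i, len(p))]
    return np.concatenate(([1.0], p, quad))


def fit_bin(points, values):
    """Least-squares coefficients for one bin via the SVD-based pseudoinverse."""
    P_tilde = np.array([extended_vector(p) for p in points])
    return np.linalg.pinv(P_tilde) @ np.asarray(values, dtype=float)


# Toy "generator": a known quadratic in (x, y) with coefficient order
# (alpha0, beta_x, beta_y, gamma_xx, gamma_xy, gamma_yy).
true_coeffs = np.array([1.0, 0.5, -2.0, 0.3, 0.1, -0.4])
rng = np.random.default_rng(1)
points = rng.uniform(-1.0, 1.0, size=(12, 2))  # N = 2 * N_min = 12 anchor points
values = np.array([extended_vector(p) @ true_coeffs for p in points])

coeffs = fit_bin(points, values)
print(np.allclose(coeffs, true_coeffs))
```

With noise-free quadratic input the fitted coefficients reproduce the input ones; with real generator output the same call returns the least-squares best fit to the oversampled points.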
The order of the polynomial has no influence on the functioning of the parameterisation, except that higher orders demand more sample points, possibly more than can be computed in a reasonable time on the available batch facilities. Hence, the accuracy of the fitting function may be extended as required. In practice, a second-order polynomial suffices for almost every MC generator distribution studied to date: there is no correlated failure of the fitted description across a majority of bins in the vicinity of the best generator behaviour, i.e. in the region of parameter space where the generator describes the data well.
2.3 Goodness of fit function
As the goodness of fit (GoF), a heuristic $\chi^2$ function is chosen, but other GoF measures can certainly be used. Since the relative importance of various distributions in the tuning is subjective, a weight $w_O$ is assigned to each observable O in the $\chi^2$ definition. For example, in the previous tuning [5] of the fragmentation model in Pythia 6, the overall charged multiplicity was used, among other observables. This quantity is measured very precisely, but it is represented by a single bin only. The heuristic $\chi^2$ definition treats all bins equally, so without weighting the importance of the charged multiplicity would have been lost: the event shape distributions that were also included in the previous tuning consist of about 20 bins each and are not measured to a level of precision comparable to that of the multiplicity. Therefore, weighting up the multiplicity by a factor of at least 10 or so is sensible to maintain its relevance to the GoF measure, defined as:
$$\chi^2(\vec p) = \sum_O w_O \sum_{b \in O} \frac{(f_b(\vec p) - R_b)^2}{\Delta_b^2}, \tag{9}$$
where $R_b$ is the reference value for bin b and the total error $\Delta_b$ is the sum in quadrature of the reference error and the statistical generator error for bin b. In practice we attempt to generate sufficient events at each sampled parameter point that the statistical MC error is much smaller than the reference error for all bins. Furthermore, there are observables that generators are known to be unable to describe well, due to technical imperfections and limitations in the shower algorithms; to improve the description of those observables somewhat, they may also be weighted up. In computing the number of degrees of freedom, the weights again enter:
$$N_\mathrm{df} = \sum_O w_O\, |\{b \in O\}|. \tag{10}$$
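Equations (9) and (10) can be evaluated with a few lines of NumPy. The sketch below is a minimal illustration: the data layout (a dictionary of per-bin arrays), the function name and all numbers are invented for the example and do not correspond to Professor's internal data structures or to any measurement.

```python
import numpy as np


def weighted_chi2(observables, weights):
    """Evaluate equations (9) and (10).

    observables maps a name to a tuple (f, ref, ref_err, mc_err) of
    per-bin arrays; weights maps the same names to w_O.
    """
    chi2, ndf = 0.0, 0.0
    for name, (f, ref, ref_err, mc_err) in observables.items():
        f, ref = np.asarray(f, float), np.asarray(ref, float)
        # Total error: reference and MC statistical errors added in quadrature
        delta2 = np.asarray(ref_err, float) ** 2 + np.asarray(mc_err, float) ** 2
        chi2 += weights[name] * np.sum((f - ref) ** 2 / delta2)
        ndf += weights[name] * len(ref)
    return chi2, ndf


# Toy numbers: a single-bin observable weighted up by 10, and a 3-bin one
observables = {
    "multiplicity": ([21.0], [20.0], [0.5], [0.0]),
    "thrust": ([1.0, 2.0, 3.0], [1.0, 2.5, 3.0], [0.5, 0.5, 0.5], [0.0, 0.0, 0.0]),
}
chi2, ndf = weighted_chi2(observables, {"multiplicity": 10.0, "thrust": 1.0})
print(chi2, ndf)
```

In this toy case the single-bin observable dominates the total only because of its weight, which is the effect the weighting is meant to achieve.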
It should be noted that there is unavoidable subjectivity in the choice of these weights, and a choice of equal weights is no more sensible than a choice of uniform priors in a Bayesian analysis. Physicist input is necessary both in choosing the admixture of observable weights according to the criteria of the generator audience (a b-physics experiment may prioritise distributions that a general-purpose detector collaboration would have little interest in) and in ensuring that the end result is not overly sensitive to the choice of weights.
2.4 Maximising the total Goodness of Fit
The final stage of the Professor procedure is to minimise the parameterised χ2 function.
It is tempting to think that there is scope for an analytic global minimisation at this order of the polynomial, but not enough Hessian matrix elements may be calculated to constrain all the parameters and hence one must finally resort to a numerical minimisation. This is the numerically weakest point in the method, since the weighted
quadratic sum of hundreds of polynomials is a very complex function and there is scope for getting stuck in a non-global minimum. Hence the choice of minimiser is important. The output from the minimisation is a vector of parameter values which, if the parameterisation and minimisation stages are faithful, should be the optimal tune according to the (subjective) criterion defined by the choice of observable weights.
2.5 Final checks
On obtaining a best tune estimate from Professor, it is prudent to check the result by running the generator again at the “best” tune: this can be done directly with Rivet. It is also useful to verify that the generator behaves in the vicinity of the estimated best tune as predicted by the parameterisation, by scanning the generator along a line which passes through the “best” point and comparing with the Professor prediction of how the $\chi^2$ will change. This is also useful for explicitly comparing default/alternative parameter sets to Professor’s optimised tunes, by making the scan line intersect both points and plotting the slice of the GoF function along the line. Such a line scan can be seen in Figure 74.
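The geometry of such a line scan can be sketched as follows. The code only generates the scan points on the line through two parameter points and evaluates a GoF function along it; the function names and the toy GoF are assumptions for illustration. In a real check the returned points would also be used to run the generator, so that the true and the parameterised GoF can be compared.

```python
import numpy as np


def line_scan(gof, p_start, p_end, n_steps=21, t_min=-0.5, t_max=1.5):
    """Evaluate gof at points p(t) = p_start + t * (p_end - p_start).

    t = 0 corresponds to p_start and t = 1 to p_end; the default t range
    extends the line beyond both points.
    """
    p_start, p_end = np.asarray(p_start, float), np.asarray(p_end, float)
    ts = np.linspace(t_min, t_max, n_steps)
    points = p_start + ts[:, None] * (p_end - p_start)
    return ts, points, np.array([gof(p) for p in points])


# Toy parameterised GoF with its minimum at (1, 2); purely illustrative.
def toy_gof(p):
    return (p[0] - 1.0) ** 2 + 2.0 * (p[1] - 2.0) ** 2


default_point = [0.0, 0.0]   # stands in for a default parameter set
best_tune = [1.0, 2.0]       # stands in for the estimated best tune
ts, points, gofs = line_scan(toy_gof, default_point, best_tune)
print(ts[np.argmin(gofs)])   # position of the GoF minimum along the line
```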
A final important point of the procedure remains: so far the procedure has dealt entirely with a single set of reference runs entering the parameterisation and minimisation procedure. However, this is rather dangerous: it may be that an inappropriate set of runs is being picked, or that a subset of points is skewing the fit and the minimisation result away from the true generator behaviour. Even if this is not the case, the lack of any alternative to which one can compare means that there is little knowledge about the procedure’s systematic uncertainties. Hence, it has also been found useful to oversample by a considerable fraction, $N \gg N_\mathrm{min}^{(P)}$, and then to perform the parameterisation and $\chi^2$ minimisation for a large number of distinct run combinations using $N_\mathrm{tune}$ runs each, with $N_\mathrm{min}^{(P)} \ll N_\mathrm{tune} \le N$.
The set of different parameterised generator responses and corresponding tunes from all the different run combinations provides a systematic control. It is important that the tuning run combinations have a significant degree of independence from each other. For example, using run combinations with $N_\mathrm{tune} = N - 1$ runs will result in highly correlated parameterisations and therefore in an underestimation of systematic uncertainties. In practice, combinations that use a fraction of about two-thirds of all available runs are recommended. It should further be noted that using $N_\mathrm{tune} = N$ typically gives good
3 Implementation
In this section a detailed description of the implementation of the Professor-method is given, starting with a summarising basic work-cycle that covers the main aspects of the procedure, followed by explanations of how the tuning is performed and a presentation of the various methods available in Professor for its validation.
3.1 A basic work-cycle
The basic steps needed to perform a tuning of a Monte Carlo event generator with Professor are briefly described in the following paragraphs. An illustration of the basic work-cycle can be found in Figure 2.
3.1.1 Prerequisites
The tuning of a Monte Carlo generator, or more precisely of a certain built-in model, with Professor is summarised here. The first step is of course to decide which data the model parameters should be tuned to. In principle as much data as possible is preferred, and the more precisely measured, the better. On the other hand, the parameters in question need to have an impact on the shape of the selected distributions. Clearly, most observables will only be sensitive to one or another subgroup of parameters; it is very unlikely to find observables that are significantly sensitive to all parameters. Therefore, a reasonable combination of observables must be chosen. Moreover, it is wise to use more than one combination of observables in order to check the robustness of the later estimated best parameter point.
3.1.2 Random sampling of parameter points
Having decided which parameters to tune to which observables, the random parameter points need to be sampled. This is done using the prof-scanparams script, which only needs a file of parameter ranges and the desired number of parameter points as input. We use Python’s uniform random generator random.uniform to independently sample values in all dimensions of the hypercube defined by the input file. Currently only uniform sampling of parameter values is provided by Professor; more general sampling is not an intrinsic feature, but can be added easily by modifying the script. For example, there are parameters with a logarithmic scaling, for which a uniform random generator would lead to an inappropriate sampling of the parameter hypercube and therefore to an invalid parameterisation of the generator’s behaviour.
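The sampling step can be sketched in a few lines of plain Python. This is not the prof-scanparams script itself: the function name, the parameter names and the ranges are hypothetical, and the optional log-uniform branch is only one possible way of treating parameters with logarithmic scaling.

```python
import math
import random


def sample_points(ranges, n_points, log_params=(), seed=None):
    """Sample n_points parameter points from the hypercube given by ranges.

    ranges maps a parameter name to (low, high). Parameters listed in
    log_params are sampled uniformly in log(p) and need 0 < low < high.
    """
    rng = random.Random(seed)
    points = []
    for _ in range(n_points):
        point = {}
        for name, (low, high) in ranges.items():
            if name in log_params:
                point[name] = math.exp(rng.uniform(math.log(low), math.log(high)))
            else:
                point[name] = rng.uniform(low, high)
        points.append(point)
    return points


# Hypothetical parameter names and ranges, for illustration only.
ranges = {"par_a": (0.1, 1.0), "par_b": (0.5, 5.0)}
points = sample_points(ranges, n_points=12, log_params=("par_b",), seed=42)
for point in points[:3]:
    # Simple name-value pairs, one parameter point per line
    print(" ".join(f"{name} {value:.4f}" for name, value in point.items()))
```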
The default output format is a list of simple name–value pairs, suitable for use with the AGILe generator interfaces, but more complex templating can also be used for native use with generators such as Herwig++, which have a more complex configuration system.
The number of runs must be at least Nmin(P), calculable by equation (2) or equation (3), but usually several times this number is used, so that the parameterised function is not artificially anchored to the sampled values but may float away from them to exploit the least-squares property of the pseudoinverse.
It is worth noting that despite the power-law scaling of $N_\mathrm{min}^{(P)}$, the volume of the hypercube still scales exponentially with P, and that the number of samples must keep approximate pace with this scaling, especially in wide scans where many different generator behaviour regimes may be encountered. The power-law scaling of the polynomial does not obviate the responsibility to ensure that the fitting method sees a representative sample of the space to be fitted. It is also wise to ensure that the sampling ranges are chosen so as to include the default tune, at least in the first phase of a tuning: hopefully this would automatically be so, based on the rule that a first set of sampling ranges should at least include all “reasonable” values of each parameter.
3.1.3 Running the generator and storing histograms
The job of running the generator and Rivet (and of merging the output histograms from different kinematic regions, if required) is mainly left up to the user. This is because different system configurations, the variety of batching systems, the choice of contributing Rivet analyses, etc. effectively mandate some user customisation of run scripts. Attempting to automate this process would likely lead to disastrously algorithmic tuning efforts.
Apart from these details, the principle is always the same. The generator writes each generated event in a convenient, generator-independent format called HepMC. The ability to produce this kind of output is the only technical requirement a generator has to meet to be tunable by Professor. Rivet reads this format and runs its analyses on an event-by-event basis. The HepMC format also allows several different analyses to be applied to the same event at run time. After an event has been processed, the corresponding histograms are filled; they are written to a file once the event generation has finished.
For most of the Professor procedure, the analysis data is defined by a directory structure containing a reference directory and a set of run directories, each of which contains histogram files from Rivet. The same structure may be conveniently used to store output from different tunes. It is also possible for analysis programs other than Rivet to provide input for Professor tuning, provided that their output is in a format which can be used by Professor, or can be converted to such a format. The currently most-used data format is the “AIDA” XML format [11], as this is the main Rivet output format. When, as planned, Rivet’s data format is upgraded to the simpler “YODA” data files [16] (a plain-text encoding), Professor will also support this format.
3.1.4 Reading the data and performing the tuning
Loading of the data files is currently not very efficient: all data files are read in and stored in memory during processing. For large data sets, e.g. $\sim 1000$ sampled parameter points with distributions amounting to $\sim 10^4$ bins per point, this produces a lead time of $\sim 1$ minute on a typical workstation and a large memory footprint. For larger input sets, where this lead time may be less tolerable, loading on demand and deletion of unused bins from memory will be explored.
Once the data is loaded, the generator response is parameterised based on a combination of the available parameter points and the associated histograms. Finally, a parameterised goodness of fit is defined and numerically minimised. Further details are given in the next section.
3.2 Tuning
The main tuning stage is accessed via the prof-tune program. This performs the combination of parameterisation and optimisation against reference data for each of a set of MC run combinations, based on the runs found in the input directory. The run combinations can either be uniquely and randomly generated at run time by prof-tune, or can be supplied via a plain text file in which each line is a whitespace-separated list of run names. The latter method is most useful for parallelising the tuning for a large number of run combinations, as was done in this study for the validation of the tuning’s robustness against different choices of observable weights.
[Figure 2 diagram labels: random parameter sampling; beam parameters; number of events to generate; generator-specific parameters; AGILe (Sherpa, Herwig++, Pythia8, Pythia6, Herwig, ...); HepMC event record; Rivet analyses (hadron multiplicities, event shape variables, Z-boson pT distribution, ...); histograms; Professor (perform tuning)]
Figure 2: Illustration of the basic work cycle with Professor and Rivet. The starting point is the random sampling of N parameter points from the n-dimensional parameter hypercube. These parameter points, among others, are used to steer the generator. The common output is the HepMC event record format, which serves as the only input needed by Rivet to fill the histograms of the distributions the user specifies. In the end there are N histograms that will be used by Professor to perform the tuning.
Parameterisation and fitting
Professor currently supports second- and third-order polynomials for parameterisation — as previously discussed, these are robust against origin-translation in a way necessary for the pseudoinverse method to work, and our experience is that a second-order polynomial is sufficient for almost all purposes in generator tuning.
For the numerical evaluation of the pseudoinverse procedure, NumPy’s [17] implementation of the singular value decomposition is used.
3.2.1 GoF optimisation
Although the Professor system offers the calculation of interpolation errors, they are not included in the GoF-definition. This is because the interpolation error is calculated dynamically (see equation (28) in the appendix) and will hence drive the minimizer to regions of extrapolation where the interpolation error gets huge and the resulting
χ2/N
of a relative “theory” error, reflecting the degree of disbelief (≈10%) that the generator authors have in their own models, is going to be considered in the GoF definition.
To provide an intuitive way of excluding single bins from the GoF calculation, the implemented $\chi^2$ function differs slightly from equation (9) in that weights are applied not per observable but per bin. The $N_\mathrm{df}$ definition is changed accordingly. In this way, single bins can be left out of the $\chi^2$ calculation by setting their respective weights to zero. This is used to veto null bins and those bins with zero error, which would lead to a divergent $\chi^2$. From the remaining bins the $N_\mathrm{df}$ (equation (10)) is computed and a function $\chi^2(\vec p)$ is constructed, which is passed to the minimizer.
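The per-bin weighting and the veto can be sketched as follows. The function name and the toy numbers are assumptions, only the zero-error veto is shown, and taking $N_\mathrm{df}$ as the sum of the surviving per-bin weights is one reading of "changed accordingly"; the actual prof-tune code may differ in detail.

```python
import numpy as np


def per_bin_chi2(f, ref, err, weights):
    """Per-bin weighted chi2 and Ndf; bins with zero weight are skipped."""
    f, ref, err = (np.asarray(a, float) for a in (f, ref, err))
    weights = np.array(weights, dtype=float)  # copy, so the input is untouched
    weights[err <= 0.0] = 0.0                 # veto bins with zero error
    use = weights > 0.0
    chi2 = np.sum(weights[use] * (f[use] - ref[use]) ** 2 / err[use] ** 2)
    ndf = np.sum(weights[use])
    return chi2, ndf


# Toy bins: the third has zero error (vetoed), the fourth has zero weight
chi2, ndf = per_bin_chi2(
    f=[1.0, 2.0, 3.0, 4.0],
    ref=[1.0, 1.0, 3.0, 2.0],
    err=[0.5, 0.5, 0.0, 1.0],
    weights=[1.0, 2.0, 1.0, 0.0],
)
print(chi2, ndf)
```

Because vetoed bins are masked out before the division, a zero error never enters the sum.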
The optimisation of the heuristic $\chi^2/N_\mathrm{df}$ function is implemented using minimisers from SciPy [18] and also PyMinuit [19], a Python interface to the CERN Minuit package. The preferred and default choice is PyMinuit, since Minuit's MIGRAD algorithm is a variable-metric method, which copes with high-dimensional problems better than the SciPy Nelder-Mead simplex minimiser, and since it offers error estimates and covariance calculations.
Professor is also able to apply limits to each parameter in the minimisation, which helps to exclude unphysical results. The limits used in such cases should not just be the sampling limits, unless those were determined by physical restrictions, since a minimisation falling outside the sample limits is actually a useful result which should not be obscured.
By default the starting point for the minimizer is chosen to be the center of the parameter space defined by our parameter sampling ranges. It is also possible to specify a random starting point. Minuit evaluates the parameter uncertainties by calculating those parameter points at which the $\chi^2/N_\mathrm{df}$ value exceeds that of the minimum by 1: for a truly $\chi^2$-distributed test statistic this corresponds to a 1σ error estimate. However, in the tuning of the underlying event presented in this thesis (section 5 and section 6), the systematic errors introduced by using several different run combinations have been shown to be larger than the errors quoted by PyMinuit.
A successful minimisation will write out its details to a file, including the optimal parameters and their correlations, a file of histograms for all the observables included in the fit, based on the parameterisation at the tune point, and information about the number of parameters, optimal GoF value(s), etc. These can then be studied and plotted as described in the next section.
Tuning output and visualisation
The result of the tuning stage, i.e. of running the prof-tune program, is a file of tune points plus their GoF scores. If the tuning has been parallelised, there will be several such result files, which can be merged together if desired. The tunes can be visualised either textually or graphically, using another script.
Graphical visualisation is particularly useful, and comes in three different forms:
3.2.2 GoF vs. parameter value
Each tuning parameter produces a plot of GoF vs. parameter value, with parameter sampling boundaries indicated. Run combinations of different size are represented in different colours, and points which lie outside any of the sampling boundaries are indicated by lighter shades of their point colour: this makes it easy to see how tuning estimates fit into the high-dimensional sample space. Clearly, if a cluster of points falls outside one or more sampling boundaries, their GoF values are less than trustworthy, because outside the sampling boundaries the parameterisation is extrapolating the generator’s behaviour rather than interpolating it. In these cases a re-run of the generator sampling with expanded boundaries is recommended. Examples may be found in Figure 71.
3.2.3 GoF vs. weight combination
If several combinations of weights have been tested, they can be graphically displayed side by side to verify that the tune is robust against reasonable changes of the GoF definition arising from the choice of different weights. “Reasonable” here means excluding the possibility of “overtuning”, which may happen when an extreme weight is assigned to a certain badly described observable. This may force the minimiser to find a parameter set that is able to reproduce the data for that observable, but the description of other observables is then likely to be lost. Further discussion of overtuning can be found in section 6.3.1 and in Figure 70. Examples of the plots mentioned can be found in Figure 72.
3.2.4 Correlation display
For each minimisation result we also store the covariance matrix between parameters
The Professor-system provides the user with the possibility to calculate the coefficients of correlation $\rho_{ij}$ for each pair of parameters (i, j) from the (symmetric and real) P-dimensional covariance matrix C:
$$\rho_{ij} = \frac{C_{ij}}{\sqrt{C_{ii}\, C_{jj}}} \tag{11}$$
The resulting parameter-parameter correlations can be displayed as colour-map plots or as tables such as in Table 12.
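Equation (11) translates directly into NumPy. The sketch below uses an invented 2×2 covariance matrix; the function name is illustrative and not part of Professor.

```python
import numpy as np


def correlation_matrix(cov):
    """rho_ij = C_ij / sqrt(C_ii * C_jj) for a covariance matrix C."""
    cov = np.asarray(cov, dtype=float)
    sigma = np.sqrt(np.diag(cov))
    return cov / np.outer(sigma, sigma)


cov = np.array([[4.0, 1.0], [1.0, 1.0]])  # toy covariance matrix
rho = correlation_matrix(cov)
print(rho)
```

The diagonal of the result is 1 by construction, and the off-diagonal entries are the correlation coefficients that would be shown in a colour-map plot or table.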
3.2.5 Sensitivities
It is desirable to tune to those observables that are most sensitive to parameter changes. Clearly, if a parameter has no effect on an observable at all, the minimiser is very likely to yield useless results, or it might even fail to converge. Based on the parameterisation, the Professor package offers the calculation of a bin’s sensitivity to changes in the values of the tuning parameters in two different ways. For an overview of the whole parameter space, the range of each parameter is divided into 100 bins. For each of these parameter bins, the sensitivity is calculated as:
$$S_i^{(b)}(\vec p) \approx \frac{f^{(b)}(p_1, \cdots, p_{i-1}, p_i + \varepsilon_i, p_{i+1}, \cdots, p_P) - f^{(b)}(\vec p)}{f^{(b)}(\vec p)} \cdot \frac{p_i}{p_i + \varepsilon_i} \tag{12}$$
where the $\varepsilon_i$ are conveniently set to one percent of the initial parameter sampling interval $[p_i^\mathrm{min}, p_i^\mathrm{max}]$ and P is the dimension of the parameter space. An illustration of the sensitivity calculation can be found in Figure 3.
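A minimal sketch of this calculation is given below. It implements equation (12) as printed, for a generic callable standing in for the parameterisation of one bin, and scans one parameter over 100 bins. Evaluating at the bin centres, keeping the other parameters fixed, the function names and the toy response are all assumptions made for the example.

```python
import numpy as np


def sensitivity(f, p, i, eps):
    """Equation (12) for parameter i at point p, with step eps."""
    p = np.asarray(p, dtype=float)
    shifted = p.copy()
    shifted[i] += eps
    return (f(shifted) - f(p)) / f(p) * p[i] / (p[i] + eps)


def max_abs_sensitivity(f, p, i, p_min, p_max, n_bins=100):
    """Scan parameter i over n_bins bin centres and return max |S_i|."""
    eps = 0.01 * (p_max - p_min)  # one percent of the sampling interval
    edges = np.linspace(p_min, p_max, n_bins + 1)
    centres = 0.5 * (edges[:-1] + edges[1:])
    values = []
    for c in centres:
        point = np.array(p, dtype=float)
        point[i] = c
        values.append(sensitivity(f, point, i, eps))
    return np.max(np.abs(values))


# Toy bin response: depends on the first parameter only
def toy_response(p):
    return 2.0 + p[0] ** 2


s0 = max_abs_sensitivity(toy_response, [1.0, 1.0], 0, 0.0, 2.0)
s1 = max_abs_sensitivity(toy_response, [1.0, 1.0], 1, 0.0, 2.0)
print(s0, s1)
```

In this toy case the second parameter has no effect on the bin, so its sensitivity vanishes, which is the situation in which an observable would be a candidate for removal from a tuning.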
In this way, a side-by-side comparison of an observable’s sensitivity to all parameters included in the parameterisation can be displayed as colour-map plots. These plots may be used to identify and remove those observables that have little or no impact at all. They can also be considered as an a posteriori justification for the choice of observables included in a tuning.
In order to visualise the sensitivity information in a space-saving, i.e. two-dimensional, way for this thesis, a measure is needed that represents a bin’s sensitivity over the whole parameter range. Since averaging tends to conceal information and could result in misleading interpretations of the sensitivity, especially in cases where the bin’s parameterisation is symmetric in a way similar to that displayed in Figure 3, it has been chosen to calculate the sensitivities for all values of the binned parameter range and to use the extremal value, i.e. the maximum of the absolute values, as a measure of the sensitivity in the plots to be found in section 6.2. A further advantage of the extremum is that it should not be too dependent on the scale of ε.
[Figure 3 sketch: the parameterised response $f^{(b)}$ evaluated at $\tilde p$ and at $\tilde p + \varepsilon$.]
Figure 3: A simplified illustration showing the principle of the sensitivity calculation, exploiting the parameterisation of the generator response $f^{(b)}(\vec p)$ for a certain bin b in a one-dimensional parameter space. The parameter range is subdivided into 100 bins; the resulting (parameter-wise) sensitivities can either be displayed as colour-map plots or, by using the maximum of the absolute values of the sensitivities calculated for bin b, as 2D plots.
The visualisations of tunes are extremely useful, not least for iterating the choice of sample boundaries in the early stages of a tune. The boundaries must of course be wide enough to include the tuning estimates (if the estimate consistently falls outside the boundaries, it is probably indicative of a problem with the generator physics model), but also narrow enough that the sampling is representative of the parameter space. An initial scan may be too coarse to yield stable and final results. Also, when choosing tighter boundaries, care should be taken in the case of strong parameter–parameter correlations, since the sought-after minimum may then lie outside the sampled hypercube.
If there are cases where the polynomial does not sufficiently describe the generator, it may be worth using the third-order polynomial, if sufficient sample runs are available. The experience gained in the tuning with the Professor-system so far is that the higher-order function improves the description away from the optimum, but is not usually necessary in the region close to the minimum.
prof-tune also helpfully produces a directory of histogram files, one for each minimisation, which makes it possible to see how each distribution is predicted to behave at that point without running the generator and incurring the usual (typically multi-day) delay. This is particularly useful when choosing how to weight distributions to achieve the desired quality of the fits, a subjective prioritising of particular generator aspects which cannot be avoided and usually requires some iteration.
3.3 Validation
First, before parameterising real MC-generated data, the parameterisation algorithm had to be tested for robustness against the distribution of the anchor points and for its behaviour when dealing with data which do not perfectly fit the parameterising polynomial. Second, one is advised to check that the GoF returned from the parameterisation resembles the GoF returned directly from the MC-generated data.
The Professor GoF function can be influenced by several observable weight combinations, $w_O$, and also by the number of runs used for the parameterisation. This offers possibilities to check the minima for systematics due to improper parameterisation or overtuning to a specific set of observable weights.
In addition, the minimization results obtained from quadratic interpolations were compared to those obtained from cubic interpolations. So far, no significant difference has been found between the best tuning estimates, though the cubic interpolation describes the generator response better in regions that are far away from the minimum.
3.3.1 Robustness of the parameterisation algorithm
The basic functioning of the polynomial parameterisation was tested with input data generated from a second-order polynomial with random coefficients, and the known coefficients were compared to those of the resulting parameterisations. After this, the robustness of parameterising error-smeared data and data from non-second-order distributions was tested. For this, input data were generated using second- to fourth-order polynomials, especially polynomials of the form
$$f(\vec p) = (\vec p - \vec m_1)^2 (\vec p - \vec m_2)^2 + \vec a \cdot \vec p, \tag{13}$$
and were smeared using a Gaussian error. Then, the unsmeared original polynomial and the parameterisation were evaluated at 10000 randomly located points, and a simple $\chi^2/N_\mathrm{df}$ and pulls were calculated as GoF measures, where the pulls were calculated as follows:
$$p = \sum_{i=1}^{10000} \frac{f_\mathrm{unsmeared}(\vec x_i) - f_\mathrm{param}(\vec x_i)}{\sigma}$$
with the $\vec x_i$ being the test points, $f_\mathrm{unsmeared}$ and $f_\mathrm{param}$ the unsmeared polynomial and the parameterisation, respectively, and σ the width of the Gaussian error distribution. Finally, a Gaussian distribution was fitted to the pull histogram. This was done for different dimensions of parameter space n and different numbers of sample points $N = N_\mathrm{min}^{(n)}, N_\mathrm{min}^{(n)} + 2, \ldots$. Using $N_\mathrm{min}^{(n)}$ sample points resulted in observed $\chi^2/N_\mathrm{df}$ values covering several orders of magnitude and in broad, in the low-dimensional case even biased, pull distributions. Using additional sample points reduced all this unwanted behaviour: e.g. in the case of a 7-dimensional parameter space and data generated from a fourth-order polynomial, the average width of the pull distribution fell from 7.9 ($N_\mathrm{min}^{(7)}$) to 3.2 ($N_\mathrm{min}^{(7)} + 6$) and the range of observed $\chi^2/N_\mathrm{df}$ from $O(10$–$10^3)$ ($N_\mathrm{min}^{(7)}$) to $O(1$–$10)$ ($N_\mathrm{min}^{(7)} + 6$), consequently discouraging parameterisations based on the minimal number of sample points, as outlined in section 3.3.3. Examples of pull distributions are given in Figure 4.
Figure 4: Example pull distributions, (a) without oversampling and (b) with oversampling. Parameterisations of data generated with a smeared fourth-order polynomial, equation (13), in 7 dimensions were compared to the unsmeared polynomial. The parameterisations in (a) were created using the minimal number of anchor points $N_\mathrm{min}^{(7)} = 37$. Those in (b) use $N_\mathrm{min}^{(7)} + 6 = 43$ anchor points. One can clearly observe that the pull distribution narrows when using additional sample points.
Third, the influence of the distribution of the sample points in the parameter hypercube on the parameterisation quality was tested. A total of 5000 parameterisations based on error-smeared data were performed. $\chi^2/N_\mathrm{df}$ values were computed as above using four different measures of the distance between the anchor point distributions
• average and minimal cartesian distance between the anchor points
• average and minimal distance of the projections of the anchor points on the
These were filled into 2D histograms. For the low-dimensional cases, a dependence of the GoF on the averaged distances was found for anchor point samples that left larger regions of the parameter space uncovered. This problem could easily be solved by oversampling, whereas the more relevant, high-dimensional cases did not show this dependence. The dimension of the parameter space for this test ranged from P = 1 to 10 and the number of anchor points from $N = N_\mathrm{min}^{(P)}$ to $N_\mathrm{min}^{(P)} + 10$.
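The effect of oversampling on the pulls can be reproduced qualitatively with a one-dimensional toy study. The sketch below is not the validation code used for this thesis: it uses a one-parameter analogue of equation (13) with invented constants, computes one pull per test point, and takes the standard deviation of those pulls as the width instead of fitting a Gaussian to a pull histogram.

```python
import numpy as np

rng = np.random.default_rng(7)
SIGMA = 0.05  # width of the Gaussian smearing (toy value)


def truth(x):
    """One-dimensional analogue of equation (13)."""
    return (x - 0.3) ** 2 * (x + 0.4) ** 2 + 0.5 * x


def design(x):
    """Rows (1, x, x^2) of the extended parameter matrix."""
    return np.column_stack([np.ones_like(x), x, x ** 2])


def pull_width(n_anchor, n_test=1000):
    """Fit smeared data at n_anchor points; return the width of the pulls."""
    x = rng.uniform(-1.0, 1.0, n_anchor)
    y = truth(x) + rng.normal(0.0, SIGMA, n_anchor)
    coeffs = np.linalg.pinv(design(x)) @ y
    x_test = rng.uniform(-1.0, 1.0, n_test)
    pulls = (truth(x_test) - design(x_test) @ coeffs) / SIGMA
    return np.std(pulls)


n_min = 3  # coefficients of a quadratic in one parameter
width_min = np.median([pull_width(n_min) for _ in range(300)])
width_over = np.median([pull_width(10 * n_min) for _ in range(300)])
print(width_min, width_over)
```

The median over many repetitions is used because the exact fit through the minimal number of points occasionally produces very large deviations, which would dominate a mean.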
3.3.2 Tune verification
As mentioned in section 2.5, it is useful to visualise Professor tunes along lines in parameter space, in particular lines which intersect both the estimated best tune and an alternative or default configuration. Professor provides a program to do this scan. This is useful to verify that the GoF really behaves as parameterised, and to ensure that the chosen point really is close to a GoF optimum.
To reduce the risk that a minimum returned by the numerical minimisation is a local minimum, prof-tune can perform several minimisations with different starting points.
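The idea of repeating the minimisation from several starting points can be sketched with SciPy. The example uses `scipy.optimize.minimize` with the Nelder-Mead method purely for simplicity; it is not the prof-tune code and does not use PyMinuit. The function name, the toy GoF and the sampling box are assumptions.

```python
import numpy as np
from scipy.optimize import minimize


def multi_start_minimise(gof, p_min, p_max, n_starts=10, seed=0):
    """Minimise gof from the centre of the sampling box and from random
    starting points inside it; return the best result found."""
    p_min, p_max = np.asarray(p_min, float), np.asarray(p_max, float)
    rng = np.random.default_rng(seed)
    starts = [0.5 * (p_min + p_max)]
    starts += [rng.uniform(p_min, p_max) for _ in range(n_starts - 1)]
    results = [minimize(gof, x0, method="Nelder-Mead") for x0 in starts]
    return min(results, key=lambda r: r.fun)


# Toy GoF with a local minimum near p = -0.8 and the global one at p = 2.
def toy_gof(p):
    x = p[0]
    return (x + 1.0) ** 2 * (x - 2.0) ** 2 + 0.5 * (x - 2.0) ** 2


best = multi_start_minimise(toy_gof, [-3.0], [4.0])
print(best.x, best.fun)
```

Starting points that fall to the left of the barrier between the two minima end up in the local minimum; keeping the result with the lowest GoF value over all starts protects against reporting it as the tune.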
A tighter tune, either Professor or grid-scan based, could be performed based on the correspondence between the true and parameterised GoF in the tune region.
3.3.3 Tuning stability
The Professor system offers two different ways to get an estimate of the stability of the minimum found.
One can benefit from oversampling the parameter space, relative to the minimum needed for the SVD to be feasible, in that numerous run combinations¹ may be chosen for different parameterisations simply by omitting a fraction of all the available Monte Carlo runs. In order to reduce correlations between run combinations we usually choose this fraction to be about one-third. This is clearly a compromise between the quality of the parameterisation and the degree of correlation introduced by choosing several run combinations.
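Drawing such run combinations can be sketched with the Python standard library. The function below selects unique random subsets that keep about two-thirds of the runs and prints them as whitespace-separated lists of run names, one combination per line, which is the plain-text layout described for prof-tune in section 3.2. The function name and the run names are hypothetical.

```python
import random


def run_combinations(run_names, n_combinations, fraction=2.0 / 3.0, seed=None):
    """Draw unique random subsets containing about `fraction` of the runs."""
    rng = random.Random(seed)
    n_tune = max(1, round(fraction * len(run_names)))
    combos = set()
    max_attempts = 100 * n_combinations  # guard against an endless loop
    for _ in range(max_attempts):
        if len(combos) == n_combinations:
            break
        combos.add(tuple(sorted(rng.sample(run_names, n_tune))))
    return sorted(combos)


runs = [f"{i:03d}" for i in range(30)]  # hypothetical run names
combos = run_combinations(runs, n_combinations=5, seed=1)
for combo in combos:
    print(" ".join(combo))              # one combination per line
```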
The outcome of all minimizations can be displayed parameter-wise, such as in Figure 71. We observe that the minimization result derived from all available Monte Carlo runs always lies in the center of the $\chi^2/N_\mathrm{df}$ distribution, illustrating that certain interpolations fit the data better than others but that using all the information available gives a good description on average.
¹ Usually, minimisations are performed based on about 100 run combinations. The run combinations
Instead of varying the parameterisation, it is also possible to influence the GoF function. This is done by independently applying a weight to each observable included in the tuning. This more or less subjective approach is justified by two facts. Firstly, we certainly do not expect the generator's response function to be a simple polynomial, and secondly, we know that the models are incapable of reproducing certain observables at all. In Professor it is possible to investigate the stability of the tuning under a change of weights, again by comparing the outcome of the minimisations (Figure 72). So far no strong dependence on the observable weights has been found.
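The effect of per-observable weights can be sketched with a weighted χ²-style sum. The exact GoF definition is given in section 2.3 and is not reproduced here; the form below (each bin's squared pull multiplied by the weight of its observable) is an assumption, and the function name, observable names and numbers are hypothetical.

```python
def weighted_chi2(observables, weights):
    """Sum of weighted squared pulls over all bins of all observables.

    observables: dict mapping an observable name to a list of
                 (prediction, reference, uncertainty) tuples, one per bin.
    weights:     dict mapping an observable name to its weight
                 (observables without an entry get weight 1).
    """
    total = 0.0
    for name, bins in observables.items():
        w = weights.get(name, 1.0)
        for prediction, reference, uncertainty in bins:
            total += w * ((prediction - reference) / uncertainty) ** 2
    return total


# Hypothetical toy input: two observables with two bins and one bin.
observables = {
    "A": [(1.0, 1.5, 0.5), (2.0, 2.0, 0.5)],
    "B": [(3.0, 1.0, 1.0)],
}

unweighted = weighted_chi2(observables, {})
reweighted = weighted_chi2(observables, {"A": 2.0, "B": 0.0})
```

Setting a weight to zero removes an observable from the tune, which is one way to handle observables the model is known not to describe. Repeating the minimisation for several weight choices and comparing the results gives the stability check described above.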
4 The Underlying Event
In this section an introduction to the phenomenology of the underlying event is given. Furthermore, the strategies used by the CDF experiment at the Tevatron for the direct measurement of the underlying event characteristics are explained. A detailed description of the model setup to be tuned is presented in section 5, followed by an explanation (and justification) of the observables that went into the tuning procedure with Professor.
4.1 Probing the Proton structure
I will briefly summarise the experimental techniques used to probe the proton structure.
Deep inelastic scattering
To our current knowledge, free quarks do not exist. Due to the mechanism of confinement, they appear only in compound objects called hadrons. Numerous experiments have been conducted in order to unveil the proton structure. Deep inelastic scattering (DIS) experiments, such as those at HERA, use electrons or positrons as probes. Since these do not carry colour charge, they interact with the quarks inside the proton only by exchange of photons or Z⁰ bosons (see Figure 5). This makes them the preferred choice for probing the charge distribution of hadrons and therefore for the extraction of quark and also gluon density functions.
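[Figure 5: scattering diagram with exchanged boson γ carrying momentum q = k′ − k, lepton momenta k and k′, proton momenta p and p′]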