Reducible Background Estimation - Simplified Model Framework

7.6.2 Reducible Background Estimation

The reducible background comes from two sources: Z+jets/Z+γ events, in which a jet fakes a lepton or a lepton arises from a photon conversion, and top-like backgrounds due to semileptonic hadronic decays. The Z+jets/Z+γ background is estimated with the data-driven Fake Factor method, first introduced in Section 5.6.1. The top-like background is estimated using simulation normalized to data in different-flavor, opposite-sign (DFOS) events, as first described in Section 5.6.2. This section shows how these two techniques are applied in this analysis.

7.6.2.1 Z+jets/Z+γ Background

The Z+jets/Z+γ background enters the signal regions because a jet is misidentified as a lepton or because of photon conversions. This background cannot be accurately described by simulation; therefore, it is estimated using the Fake Factor method.

The Fake Factor method uses two lepton categories: “ID” leptons, whose criteria are identical to those of the signal leptons described in Table 7.2, and “anti-ID” leptons, whose criteria are defined in Table 7.14. The anti-ID selection is enriched in fake leptons by inverting or relaxing the identification and isolation criteria.

Electrons                              Muons
pT > 20 GeV                            pT > 20 GeV
|η| < 2.47                             |η| < 2.4
|∆z0 sinθ| < 1.0                       |∆z0 sinθ| < 1.0
Pass VeryLoose identification          Pass Medium identification
Pass OR requirements                   No OR requirements
(!Medium identification                (|d0 significance| > 3
 || |d0 significance| > 5               || !GradientLoose isolation)
 || !GradientLoose isolation)

Table 7.14: Definition of the anti-ID criteria for the Fake Factor measurement region.

The Fake Factor is the ratio of the number of ID leptons to the number of anti-ID leptons and is binned in lepton pT. Fake Factors are derived for electrons and muons separately. The derivation of the Fake Factor is described in Section 5.6.1. To calculate the Fake Factor properly, the contribution from backgrounds with three prompt leptons must be subtracted from the data, as shown in equation (5.26).
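The ratio described above can be sketched as follows. Equation (5.26) is not reproduced in this section, so the prompt-subtracted form used here, (data − prompt MC) for ID leptons divided by (data − prompt MC) for anti-ID leptons in each pT bin, is an assumption, as are the function name and all yields.

```python
def fake_factor_per_bin(n_id_data, n_id_prompt, n_anti_data, n_anti_prompt):
    """Return one Fake Factor per pT bin from ID and anti-ID yields.

    Assumed form: (data - prompt MC) for ID leptons divided by
    (data - prompt MC) for anti-ID leptons, bin by bin.
    """
    factors = []
    for id_d, id_p, anti_d, anti_p in zip(
        n_id_data, n_id_prompt, n_anti_data, n_anti_prompt
    ):
        denominator = anti_d - anti_p
        if denominator <= 0:
            raise ValueError("anti-ID yield must stay positive after subtraction")
        factors.append((id_d - id_p) / denominator)
    return factors


# Hypothetical yields in three pT bins (not taken from the analysis).
electron_ff = fake_factor_per_bin(
    n_id_data=[120.0, 60.0, 30.0],
    n_id_prompt=[20.0, 10.0, 10.0],
    n_anti_data=[1010.0, 405.0, 102.0],
    n_anti_prompt=[10.0, 5.0, 2.0],
)
```

In practice the same function would be called once for electrons and once for muons, since the two Fake Factors are derived separately.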

The Fake Factor is derived in a region orthogonal to the signal selection and enriched in Z+jets and Z+γ events by requiring E_T^miss < 40 GeV and m_T^min < 30 GeV (the Fake Factor measurement region). To further enrich this sample in Z+jets/Z+γ events, the Z leptons are also required to reconstruct an invariant mass within 10 GeV of the Z mass. The Fake Factor estimate of the Z+jets and Z+γ background is validated in a subset of the signal region containing events with 30 < m_T^min < 50 GeV and E_T^miss < 40 GeV, which is enriched in background processes. The selection for the Fake Factor measurement and validation regions is summarized in Table 7.15 and illustrated in Figure 7.23, which shows how these regions are orthogonal to the WZ control and validation regions. An additional fake VR is defined with third-lepton pT > 30 GeV, a dilepton invariant mass outside the 10 GeV Z boson mass window, and E_T^miss between 40 and 60 GeV. This region is summarized in Table 7.10. Figure 7.24 shows the good agreement between data and background in the E_T^miss distribution in this validation region.

[Figure 7.23: regions in the plane of m_T^min [GeV] and E_T^miss [GeV], each axis running from 0 to 100. Labeled regions: FF CR, Zjet VR, VR3Za-j0 and j1, WZ CRs.]

Figure 7.23: Cartoon demonstrating the selection used to obtain the Fake Factor measurement and validation regions. The WZ CRs and the VRs, VR3Za-j0 and VR3Za-j1, are also displayed.

                         E_T^miss [GeV]   m_T^min [GeV]   |m_ll − m_Z| [GeV]   b-jet veto
FF measurement region    <40              <30             <10                  n/a
FF validation region     <40              [30−50]         <10                  n/a
FF closure region        >60              >60             <10                  required

Table 7.15: Summary of measurement and validation regions used for the Z+jets/Z+γ estimates. The closure region is used only with MC events, to derive the MC closure systematic uncertainty on the Fake Factor.
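As an illustration of the region definitions in Table 7.15, the sketch below maps event kinematics to a region name. The function and argument names are hypothetical, the treatment of the bin boundaries (for example whether m_T^min = 30 GeV belongs to the validation region) is an assumption, and "b-jet veto required" is interpreted as zero b-jets.

```python
def classify_ff_region(met, mt_min, mll_minus_mz, n_bjets):
    """Map event kinematics (in GeV) to a Table 7.15 region name, or None."""
    if abs(mll_minus_mz) >= 10.0:
        return None  # all three regions require |m_ll - m_Z| < 10 GeV
    if met < 40.0 and mt_min < 30.0:
        return "FF measurement"
    if met < 40.0 and 30.0 <= mt_min <= 50.0:
        return "FF validation"  # boundary inclusivity is an assumption
    if met > 60.0 and mt_min > 60.0 and n_bjets == 0:
        return "FF closure"  # used with MC events only
    return None


# Hypothetical events: (E_T^miss, m_T^min, m_ll - m_Z, number of b-jets).
regions = [
    classify_ff_region(20.0, 10.0, 3.0, 0),
    classify_ff_region(20.0, 40.0, -5.0, 1),
    classify_ff_region(80.0, 70.0, 2.0, 0),
]
```

Because the three selections do not overlap in E_T^miss and m_T^min, the order of the checks does not affect the result.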

The Fake Factors are applied in the signal regions by requiring that events satisfy the signal region requirements defined in Table 7.8, except that one signal lepton is replaced by an anti-ID lepton. The Fake Factor appropriate to the anti-ID lepton's flavor and pT is then applied to that event. The estimate for the number of three-lepton events containing at least one fake lepton is shown in equation (5.25).
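A minimal sketch of this application step follows. Equation (5.25) is not reproduced in this section, so the form assumed here is a sum of pT-dependent Fake Factors over data events with one anti-ID lepton, minus the same sum over weighted prompt MC events. The bin edges, Fake Factors, and events are hypothetical.

```python
from bisect import bisect_right


def lookup_fake_factor(pt, pt_edges, fake_factors):
    """Return the Fake Factor of the pT bin containing pt (last bin open-ended)."""
    index = bisect_right(pt_edges, pt) - 1
    if index < 0:
        raise ValueError("pT below the first bin edge")
    return fake_factors[min(index, len(fake_factors) - 1)]


def reducible_estimate(data_anti_id_pts, prompt_mc_anti_id, pt_edges, fake_factors):
    """Sum FF-weighted data events and subtract FF-weighted prompt MC events."""
    total = sum(
        lookup_fake_factor(pt, pt_edges, fake_factors) for pt in data_anti_id_pts
    )
    total -= sum(
        weight * lookup_fake_factor(pt, pt_edges, fake_factors)
        for pt, weight in prompt_mc_anti_id
    )
    return total


# Hypothetical inputs: lower bin edges in GeV and one Fake Factor per bin.
pt_edges = [20.0, 30.0, 50.0]
fake_factors = [0.1, 0.125, 0.2]
estimate = reducible_estimate(
    data_anti_id_pts=[25.0, 35.0, 80.0],   # anti-ID lepton pT in data events
    prompt_mc_anti_id=[(35.0, 0.4)],       # (pT, MC event weight) pairs
    pt_edges=pt_edges,
    fake_factors=fake_factors,
)
```

Separate bin edges and Fake Factors would be supplied for electrons and muons, following the flavor of the anti-ID lepton.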

[Figure 7.24: ATLAS, √s = 13 TeV, 36.1 fb^-1, panel label VR3-Za. Upper panel: Events / 20 GeV (log scale) versus m_T^min [GeV], 0 to 200; lower panel: Data / SM. Legend: Data, Total SM, VV, VVV, ttV, Higgs, Reducible, and signal points m(χ1±/χ2^0, χ1^0) = (450,150) GeV and (1000,600) GeV.]

Figure 7.24: E_T^miss distribution in the fake validation region, VR3-offZa. Reducible corresponds to the data-driven fake factor estimate. The uncertainty band includes all statistical and systematic uncertainties.

There are several sources of uncertainty for the Fake Factor method. The first is the statistical uncertainty on the Fake Factor, which must be accounted for in the final Z+jets/Z+γ estimate.

Second, as MC samples are used to subtract the diboson contribution from the data, the uncertainty associated with this subtraction must be evaluated. To do so, the MC WZ and ZZ yields are scaled up and down by 15%, and the Fake Factor is recalculated. The largest difference with respect to the nominal Fake Factor is then taken as the Fake Factor's uncertainty on the diboson subtraction and assigned as a symmetric uncertainty.
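The diboson-subtraction variation can be sketched for a single pT bin as below. The prompt-subtracted ratio is the same assumed form as in the Fake Factor definition, and the function names and yields are hypothetical; only the 15% variation and the "largest difference, symmetrized" prescription come from the text.

```python
def fake_factor(n_id_data, n_id_diboson, n_anti_data, n_anti_diboson, mc_scale=1.0):
    """Fake Factor after subtracting diboson MC scaled by mc_scale (assumed form)."""
    numerator = n_id_data - mc_scale * n_id_diboson
    denominator = n_anti_data - mc_scale * n_anti_diboson
    return numerator / denominator


def diboson_subtraction_uncertainty(n_id_data, n_id_diboson, n_anti_data,
                                    n_anti_diboson, variation=0.15):
    """Return (nominal FF, symmetric uncertainty from scaling the MC up and down)."""
    yields = (n_id_data, n_id_diboson, n_anti_data, n_anti_diboson)
    nominal = fake_factor(*yields)
    shifts = [
        abs(fake_factor(*yields, mc_scale=1.0 + sign * variation) - nominal)
        for sign in (+1.0, -1.0)
    ]
    # The larger of the two shifts is assigned as a symmetric uncertainty.
    return nominal, max(shifts)


# Hypothetical yields for one pT bin (not taken from the analysis).
nominal_ff, diboson_unc = diboson_subtraction_uncertainty(120.0, 20.0, 1010.0, 10.0)
```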

Third, a closure systematic is assigned to cover kinematic and composition differences between the Fake Factor measurement region and the signal region. To do so, the MC Z+jets/Z+γ samples are used to compute an MC-based Fake Factor in each of these two kinematic regions. The difference between the two MC-based Fake Factors is taken as the MC closure uncertainty. The region used to derive the closure systematic requires three signal leptons, or two signal leptons and one anti-ID lepton; a b-jet veto; a same-flavor, opposite-charge pair with invariant mass within 10 GeV of the Z mass; m_T^min > 60 GeV; and E_T^miss > 60 GeV. The looser cut on m_T^min, compared with the signal region requirement of m_T^min > 110 GeV, is chosen to enhance the MC statistics in this region. The selection for this region is summarized in Table 7.15.

These systematic uncertainties are then added in quadrature to determine a total Fake Factor systematic uncertainty.
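The closure difference and the quadrature sum can be sketched as below. All numerical inputs are hypothetical, and treating the components as uncorrelated (which is what a quadrature sum implies) is stated here as an assumption.

```python
import math


def closure_uncertainty(ff_mc_measurement, ff_mc_closure):
    """Absolute difference between the two MC-based Fake Factors."""
    return abs(ff_mc_measurement - ff_mc_closure)


def add_in_quadrature(*components):
    """Combine uncertainty components assumed to be uncorrelated."""
    return math.sqrt(sum(c * c for c in components))


# Hypothetical absolute uncertainties on one Fake Factor bin.
closure = closure_uncertainty(ff_mc_measurement=0.16, ff_mc_closure=0.10)
total = add_in_quadrature(0.02, 0.03, closure)
```

With these illustrative inputs the closure term dominates the total, which is why the choice of closure region matters.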

7.6.2.2 Top-like Backgrounds

The top-like background contribution in the signal region is estimated using simulation normalized to data in a control region. The top control region is defined using different-flavor, opposite-charge events (e±e±µ∓ and µ±µ±e∓) to minimize the WZ contamination and increase the top purity. The fake lepton is one of the same-flavor leptons. The normalization factor methodology for the top background is further described in Section 5.6.2.

The top control region is defined at E_T^miss [...] applied on the invariant mass of the different-flavor, opposite-charge pair of leptons, and no b-jet veto is applied, to increase the statistics in this region. The signal regions are binned in the number of jets; however, a normalization factor inclusive in the number of jets is derived, since the scale factor is similar between both regions. The selection of the top control region is summarized in Table 7.16.

Events with either three signal leptons or two signal leptons and one anti-ID lepton
Only e±e±µ∓ and µ±µ±e∓ events
When measuring the normalization factors for events with an anti-ID lepton, the anti-ID lepton must be one of the same-flavor, same-sign leptons
E_T^miss > 60 GeV
m_T^min > 60 GeV
No requirement on N_{b-jets}^{20 GeV}

Table 7.16: Selection criteria used to define the control region for the top-like backgrounds.

A normalization factor (NF) is derived for electrons and muons separately. Moreover, the top NF is applied in two places: in the signal region, and in the prompt background subtraction of the Fake Factor estimation. The corresponding control regions are defined with the same kinematic cuts, except that the NF for the signal regions is measured using three signal leptons, while the one for the prompt background subtraction is measured using two signal leptons and one anti-ID lepton.
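A normalization factor of this kind can be sketched as below. Section 5.6.2 is not reproduced here, so the form assumed is NF = (N_data − N_other MC) / N_top MC in the control region, with a Poisson statistical uncertainty from the data count only. The function name and yields are hypothetical.

```python
import math


def top_normalization_factor(n_data, n_other_mc, n_top_mc):
    """Return (NF, statistical uncertainty) for one lepton flavor.

    Assumed form: data minus non-top MC, divided by top MC, with the
    uncertainty taken from the Poisson error on the data count alone.
    """
    if n_top_mc <= 0:
        raise ValueError("top MC yield must be positive")
    nf = (n_data - n_other_mc) / n_top_mc
    nf_stat = math.sqrt(n_data) / n_top_mc
    return nf, nf_stat


# Hypothetical control-region yields (not taken from the analysis).
nf, nf_stat = top_normalization_factor(n_data=100.0, n_other_mc=20.0, n_top_mc=80.0)
```

The same function would be evaluated four times: for electrons and muons, in both the three-signal-lepton region and the region with one anti-ID lepton.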

For the top NF associated with the Z+jets/Z+γ anti-ID control region, an electron NF of 1.04±0.09 is obtained, along with a muon NF of 1.05±0.03. The top NF associated with the three-signal-lepton top control region is 0.99±0.42 for electrons and 2.37±0.89 for muons.

[Figure 7.25: ATLAS, √s = 13 TeV, 36.1 fb^-1, panel label VR3-Zb. Upper panel: Events / 20 GeV (log scale) versus E_T^miss [GeV], 100 to 300; lower panel: Data / SM. Legend: Data, Total SM, VV, VVV, ttV, Higgs, Reducible, and signal points m(χ1±/χ2^0, χ1^0) = (450,150) GeV and (1000,600) GeV.]

Figure 7.25: E_T^miss distribution in the fake validation region, VR3-offZb. Reducible corresponds to the data-driven fake factor estimate and the top background with the NF applied. The uncertainty band includes all statistical and systematic uncertainties.

[...] Table 7.10. The validation region is defined with same-flavor, opposite-sign events, just like the signal regions, with third-lepton pT > 20 GeV and E_T^miss > 40 GeV. To minimize WZ contamination, this region vetoes events in which the invariant mass of the dilepton pair is within 10 GeV of the Z boson mass. To increase top background statistics, at least one b-jet is required. Figure 7.25 shows the modeling of the E_T^miss distribution in the top VR. There is good agreement between the observed data and the expected background.

The statistical uncertainty on the normalization factors is propagated to the final estimate, and is used as the systematic uncertainty on the top background.
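The propagation can be illustrated with the NFs quoted above for the three-signal-lepton control region. The simulated top yield of 2.0 events is hypothetical, and scaling the yield linearly with the NF is the only assumption made.

```python
def normalized_top_yield(n_top_mc, nf, nf_unc):
    """Scale the simulated top yield by the NF and propagate the NF uncertainty."""
    return nf * n_top_mc, nf_unc * n_top_mc


# NFs quoted in the text; the MC yield of 2.0 events is hypothetical.
electron_yield, electron_unc = normalized_top_yield(2.0, nf=0.99, nf_unc=0.42)
muon_yield, muon_unc = normalized_top_yield(2.0, nf=2.37, nf_unc=0.89)
```

Since the relative uncertainty on the yield equals that on the NF, the limited control-region statistics carry over directly to the top background estimate.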

In document Electroweak Physics At The Large Hadron Collider With The Atlas Detector: Standard Model Measurement, Supersymmetry Searches, Excesses, And Upgrade Electronics (Page 186-191)