Event Selection at Large Zenith Angle (LZA)

The shapes of shower images for electromagnetic and hadronic showers change with the observed zenith angle. The gradient of parameters such as MSW and MSL steepens increasingly as the zenith angle approaches 90°, changing the parameter distributions. Therefore, the classifiers and box cuts used for SZA will not be nearly as effective for a GC analysis, which results in decreased sensitivity. The two options for improving event selection in the GC analysis are to train BDTs or to optimize box cuts for LZA.

6.3.1 BDT Event Selection at Large Zenith Angle (LZA)

Because BDTs offered gains in sensitivity for SZA analyses with VEGAS, I trained and tested a BDT classifier for the event selection of the LZA analysis. I followed the same procedure detailed in Subsection 5.7.2, using training data appropriate for the GC. The simulations used for signal were the same as those used for the BDT energy regression described in Subsection 6.1.2, but with zenith bins of 55°, 60°, and 65°. Events from LZA observations of dark matter targets were used as background. Classification scores were optimized on LZA Crab observations, and an independent set of LZA Crab runs was used to validate the optimized set of BDTs.

Unfortunately, at the time of training, there were fewer than 15 hours of appropriate background observations. While VERITAS takes an abundance of Crab observations, fewer than 10 hours were taken below 30° elevation. Because of the steepness of the parameter gradient at GC elevations, data taken as high as 30° elevation are not ideal for optimization or validation. While the number of events provided by simulations was sufficient at SZA, the higher energy threshold at LZA drastically reduces the number of useful events (recall that the differential energy distribution is proportional to $E^{-2}$). The highest energy bin, which covered the range 10–50 TeV, contained only O(1000) events. This number was inadequate even for a single energy bin, and the bin could not be split into smaller energy ranges. In the case of GC spectra, the highest energy events are the rarest and most interesting, so classifying them correctly is extremely important.

The results of the validation did not show an increase in sensitivity, so this method was not used in my analysis. However, the method shows potential for improvement given more LZA observation data and more suitable simulation data. The additional LZA Crab data taken during 2017–2018 alone could have been sufficient, and more data will be taken for as long as VERITAS continues to operate. More LZA observations of DM targets were also taken in that time frame. Observations of the Sgr A Off position, which have increased from 20 to 38 hours since I performed this training, could provide additional background events. The number of useful events could be increased by producing simulations with an altered energy distribution, and they could be separated into finer energy bins. The large parameter gradient with respect to zenith could be addressed by producing simulations with primaries thrown at continuous, rather than discrete, zenith angles. With a continuous distribution of simulated zenith angles, it may even be possible to use zenith angle as a training parameter, avoiding the need for binning in zenith angle.
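The effect of the thrown spectrum on high-energy statistics can be estimated with a short calculation. The sketch below integrates a power law dN/dE ∝ E^(-index) to get the fraction of thrown events that land in a given energy bin. The simulated energy range (0.1–100 TeV) and the harder index of 1.5 are hypothetical values chosen only for illustration, and trigger and reconstruction efficiencies are ignored, so the numbers are indicative at best.

```python
import math


def power_law_fraction(e_lo, e_hi, e_min, e_max, index):
    """Fraction of events thrown in [e_min, e_max] that fall in [e_lo, e_hi]
    for a differential spectrum dN/dE proportional to E**(-index)."""
    p = 1.0 - index

    def integral(a, b):
        # Integral of E**(-index) dE from a to b.
        if p == 0.0:
            return math.log(b / a)
        return (b ** p - a ** p) / p

    return integral(e_lo, e_hi) / integral(e_min, e_max)


# Hypothetical thrown energy range in TeV (not taken from the text).
E_MIN_TEV, E_MAX_TEV = 0.1, 100.0

# Fraction of thrown events in the 10-50 TeV bin for an E^-2 spectrum.
frac_e2 = power_law_fraction(10.0, 50.0, E_MIN_TEV, E_MAX_TEV, index=2.0)

# Same bin for a hypothetical harder thrown spectrum (index 1.5).
frac_hard = power_law_fraction(10.0, 50.0, E_MIN_TEV, E_MAX_TEV, index=1.5)

print(f"E^-2.0: {frac_e2:.4f} of thrown events in 10-50 TeV")
print(f"E^-1.5: {frac_hard:.4f} of thrown events in 10-50 TeV")
```

Under these assumptions the harder spectrum places several times more thrown events in the highest bin. Events thrown with an altered index would presumably need per-event weights to recover the physical spectrum during training.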

6.3.2 Box Cut Optimization

Box cut optimization is a simple procedure for finding the cuts that maximize the sensitivity of an analysis. The primary step is to process Crab data through all of the VEGAS stages and perform an RBM analysis for a large number of combinations of box cut values. A coarse search with large variations in the variables is done first to reduce the computation time. Once a coarse maximum is found, a finer search is performed around it to find the optimal cut values more precisely. I created a dataframe with the Python module pandas (McKinney, 2010) to navigate through the large number of results. The module also helped me to ensure that the maximum found was indeed a smooth maximum. The quality and event selection cuts varied were MSW, MSL, size, distance, and shower maximum height.
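The coarse-then-fine search described above can be sketched as follows. Here `toy_sensitivity` is a hypothetical stand-in for the full VEGAS processing and RBM analysis, with its peak placed arbitrarily. The grid ranges and step sizes are likewise illustrative, and only three of the five cut variables are scanned. Plain Python records are used to keep the sketch self-contained; the same list of dictionaries could be loaded into a pandas DataFrame.

```python
import itertools
import math


def toy_sensitivity(msw, msl, size):
    """Hypothetical stand-in for a full analysis of Crab data with these cuts.
    A smooth bump with an arbitrary peak at (1.05, 1.3, 650)."""
    return 2.5 * math.exp(-((msw - 1.05) / 0.2) ** 2
                          - ((msl - 1.3) / 0.3) ** 2
                          - ((size - 650) / 400) ** 2)


def frange(center, half_width, step):
    """Evenly spaced values from center - half_width to center + half_width."""
    n = int(round(2 * half_width / step))
    return [round(center - half_width + i * step, 6) for i in range(n + 1)]


def grid_search(msw_vals, msl_vals, size_vals):
    """Evaluate every cut combination; return one record per combination."""
    return [
        {"msw": w, "msl": l, "size": s, "sensitivity": toy_sensitivity(w, l, s)}
        for w, l, s in itertools.product(msw_vals, msl_vals, size_vals)
    ]


def on_grid_edge(best, records, keys=("msw", "msl", "size")):
    """True if the best point lies on the boundary of the scanned range
    in any variable, which would suggest the true maximum is outside it."""
    return any(
        best[k] in (min(r[k] for r in records), max(r[k] for r in records))
        for k in keys
    )


# Coarse pass: wide ranges, large steps (illustrative values).
coarse = grid_search(frange(1.0, 0.4, 0.2),
                     frange(1.3, 0.6, 0.3),
                     frange(600, 400, 200))
best_coarse = max(coarse, key=lambda r: r["sensitivity"])

# Fine pass: narrow ranges centered on the coarse maximum.
fine = grid_search(frange(best_coarse["msw"], 0.1, 0.05),
                   frange(best_coarse["msl"], 0.15, 0.05),
                   frange(best_coarse["size"], 100, 50))
best_fine = max(fine, key=lambda r: r["sensitivity"])

print("coarse best:", best_coarse)
print("fine best:  ", best_fine)
print("fine maximum on grid edge:", on_grid_edge(best_fine, fine))
```

The edge check is only a basic sanity test. Confirming that the maximum is smooth would also involve inspecting the sensitivity at neighboring grid points.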

Cuts are best optimized on the Crab, a consistently strong source, because nearly all of the events within the ON region will be signal events rather than background events. Real events are preferable to simulated events (which are 100% gammas) because real showers provide more accurate parameter distributions. I put together two runlists of high-quality Crab data runs for the V5 and V6 epochs. The runs were chosen so that the zenith distribution roughly matched that of the full GC dataset. The Crab also has a spectral index similar to that of the GC source, about -2.5, which is considered medium hardness. The training runlist contained about 20 hours of LZA Crab observations, and the testing runlist contained about 25 hours of non-overlapping data. The sensitivity for each analysis must be calculated with the ON counts scaled by the source's fraction f of the Crab flux, where $f = F_{\mathrm{source}}/F_{\mathrm{Crab}}$

[Figure 6.9 image: heatmap panels with MSL and MSW axes and a sensitivity color scale]

Figure 6.9: Heatmap grid of sensitivity for the mean scaled parameters produced during optimization. Pictured left is the grid for the V5 array, with the optimal size cut of 500. Pictured right is the V6 array with the optimal size cut of 650. The color scale is in units of standard deviations per square root of time.

does not depend on the source strength. Accordingly, the ON counts $N_{\mathrm{on}}$ are reduced to

$$N_{\mathrm{on,reduced}} = (N_{\mathrm{on}} - N_{\mathrm{off}}) \cdot f, \qquad (6.1)$$

while the OFF counts $N_{\mathrm{off}}$ are unchanged, and the significance is subsequently calculated. A heatmap of rescaled sensitivity on a grid of MSL and MSW is shown in Figure 6.9. The shower size is held constant at the optimal value for V5 (left) and V6 (right). The maximum sensitivity occurs at an extremum rather than a saddle point. Similar tests on the other parameters in the dataframe showed that the sensitivity reached its maximum smoothly. The GC has a value of f between 5% and 10% of the Crab rate, and my optimization was stable whether I used 5, 7, or 10 percent for f.
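The rescaling step and the stability check over f can be illustrated with a small sketch. Several parts of it are assumptions rather than statements from the text: the significance is computed with Eq. 17 of Li & Ma (1983), an ON/OFF exposure ratio `alpha` is introduced, and the scaled excess of Eq. 6.1 is added back onto the background estimate `alpha * n_off` before the significance is evaluated. The counts for the two cut sets are invented for illustration.

```python
import math


def li_ma_significance(n_on, n_off, alpha):
    """Li & Ma (1983) Eq. 17 significance (assumed here, not stated in the text).
    Requires n_on > 0 and n_off > 0."""
    total = n_on + n_off
    term_on = n_on * math.log((1.0 + alpha) / alpha * n_on / total)
    term_off = n_off * math.log((1.0 + alpha) * n_off / total)
    sign = 1.0 if n_on >= alpha * n_off else -1.0
    return sign * math.sqrt(2.0 * max(term_on + term_off, 0.0))


def rescaled_sensitivity(n_on, n_off, alpha, f, hours):
    """Sensitivity in sigma per sqrt(hour) for a source at fraction f of the Crab.
    Assumption: the excess is scaled by f and the background estimate
    alpha * n_off is added back, while n_off itself is left unchanged."""
    excess = n_on - alpha * n_off
    n_on_reduced = alpha * n_off + f * excess
    return li_ma_significance(n_on_reduced, n_off, alpha) / math.sqrt(hours)


ALPHA = 1.0 / 6.0   # hypothetical ON/OFF exposure ratio
HOURS = 20.0        # roughly the size of the training runlist

# Invented (n_on, n_off) counts for two hypothetical cut sets.
cut_sets = {
    "loose": (3000, 12000),
    "tight": (1700, 4800),
}

# Check whether the preferred cut set changes with the assumed fraction f.
for f in (0.05, 0.07, 0.10):
    scores = {
        name: rescaled_sensitivity(n_on, n_off, ALPHA, f, HOURS)
        for name, (n_on, n_off) in cut_sets.items()
    }
    best = max(scores, key=scores.get)
    print(f"f = {f:.2f}: best = {best}, scores = {scores}")
```

With these invented counts the same cut set is preferred at every tested f, which is the kind of stability the paragraph describes.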

The optimal cuts found, as well as the standard values for comparison, are listed in Table 6.2. The optimized cuts increased the LZA sensitivity by about 10 %. The shower maximum height parameter proved to be ineffective for event selection at LZA, most likely because the showers that travel a greater distance will develop to their maximum length before they reach the surface of the Earth. Updating the cuts necessitated both custom effective areas and lookup tables.

Parameter            Optimized V5   Optimized V6   Legacy V5   Legacy V6
MSL                  1.3            1.3            1.3         1.3
MSW                  1.0            1.05           1.0         1.0
Size                 500            650            400         700
Distance             1.38           1.38           1.38        1.38
Shower Max Height    7              7              7           7

Table 6.2: Table of optimized event selection cuts (left two columns) for the two data sets, V5 and V6, compared to standard values (right two columns).

In document Very High Energy Emission from the Galactic Center with VERITAS (Page 172-176)