2.2 Detector Design and Construction
2.2.7 DAQ and Computing
After the front-end electronics located in the water read out and digitize data from individual photosensors, this data is sent to the data acquisition (DAQ) system in the dome above the water tank. The goal of the DAQ system is to combine data from all PMTs, identify and reconstruct events in the detector, and write them to storage for later analysis. This section will first discuss the basic design of the DAQ system and its operations, before describing the separate operations mode designed for supernova bursts. Finally, the multi-tiered computing system used for data analysis will be described briefly.
[Figure 2.14 block labels: front-end readout boards 1 to 5; readout buffer unit (buffering); trigger; trigger info building; event builder unit; to offline processing unit.]
Figure 2.14: Simplified block diagram of the DAQ system for Hyper-Kamiokande. Figure from reference [169].
2.2.7.1 Design of the DAQ System
The basic design of the DAQ system is shown in figure 2.14. The system will be implemented using the ToolDAQ framework [218], and consist of four main components which are built using off-the-shelf server hardware.
Due to the high expected data rates, the system is designed to be modular and highly parallelizable. The ToolDAQ framework uses messaging protocols, redundant connections and buffers along with dynamic service discovery to increase fault tolerance by detecting and dynamically replacing unresponsive computing nodes.
Readout Buffer Units (RBU)
The DAQ system will contain approximately 70 RBUs that are connected to the front-end electronics modules in the water via a gigabit network switch, allowing data to be rerouted to other RBUs if a failure occurs. Each RBU is responsible for reading out the digitized signals from about 30 front-end electronics modules in the water. It then buffers all data in active memory for about 100 seconds and temporarily saves older data to hard drives for about one hour.
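The rerouting described above can be illustrated with a minimal sketch. It assumes roughly 70 RBUs serving about 30 front-end modules each (2100 modules in total, a figure inferred from the approximate numbers in the text), and the function names `assign_modules` and `reassign_on_failure` are hypothetical, not part of ToolDAQ.

```python
def assign_modules(n_modules, n_rbus):
    """Distribute front-end modules over RBUs in round-robin order."""
    assignment = {rbu: [] for rbu in range(n_rbus)}
    for module in range(n_modules):
        assignment[module % n_rbus].append(module)
    return assignment


def reassign_on_failure(assignment, failed_rbu):
    """Return a new assignment with the failed RBU's modules moved elsewhere.

    Each orphaned module goes to the RBU that currently serves the fewest
    modules, which keeps the load roughly even (an assumed policy).
    """
    new = {rbu: list(mods) for rbu, mods in assignment.items() if rbu != failed_rbu}
    for module in assignment[failed_rbu]:
        target = min(new, key=lambda rbu: len(new[rbu]))
        new[target].append(module)
    return new
```

With these illustrative numbers, the loss of one RBU raises the load on 30 of the remaining 69 units from 30 to 31 modules.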
During normal operations, RBUs additionally reduce data by eliminating all PMT hits where the signal is below a threshold of 0.25 photoelectrons. This reduced data in a given time window is then provided to trigger processing and event building units upon request.
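As a rough illustration of this data reduction and of the time-window requests served by an RBU, consider the following sketch. The `Hit` record and both function names are assumptions made for this example; only the 0.25 photoelectron threshold is taken from the text.

```python
from dataclasses import dataclass

THRESHOLD_PE = 0.25  # hit threshold in photoelectrons, as quoted in the text


@dataclass(frozen=True)
class Hit:
    pmt_id: int
    time_ns: float
    charge_pe: float


def reduce_hits(hits, threshold_pe=THRESHOLD_PE):
    """Drop all hits whose charge is below the threshold."""
    return [h for h in hits if h.charge_pe >= threshold_pe]


def hits_in_window(hits, t_start_ns, t_end_ns):
    """Return the hits with t_start_ns <= time < t_end_ns."""
    return [h for h in hits if t_start_ns <= h.time_ns < t_end_ns]
```

A TPU or EBU request for a given time window would then correspond to calling `hits_in_window` on the reduced data.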
Trigger Processor Units (TPU)
Once all data is read out by the RBUs, the TPUs will analyse every time window in the data for possible signals. To do this, the TPU requests the reduced data
Chapter 2 The Hyper-Kamiokande Detector
for this time window from RBUs and then applies various trigger algorithms to look for events.
The simplest trigger is the “NDigits” or “Simple Majority” trigger, which applies a sliding time window to the data and triggers if the number of PMT hits in that window surpasses a given threshold. This trigger will be able to identify high-energy events but is not viable for low-energy events due to the high dark-noise rates of the PMTs. Instead, time windows that fail this NDigits trigger get passed on to a more sophisticated trigger optimized for low-energy events.
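A sliding-window majority trigger of this kind can be sketched in a few lines. This is an illustrative implementation only; the function names, the window length and the threshold are placeholders rather than the values used by the experiment.

```python
def max_hits_in_window(hit_times_ns, window_ns):
    """Largest number of hits found in any time window of length window_ns."""
    times = sorted(hit_times_ns)
    best = left = 0
    for right, t in enumerate(times):
        # Advance the left edge until the window [times[left], t] is short enough.
        while t - times[left] > window_ns:
            left += 1
        best = max(best, right - left + 1)
    return best


def ndigits_trigger(hit_times_ns, window_ns, threshold):
    """True if the hit count in some window surpasses the threshold."""
    return max_hits_in_window(hit_times_ns, window_ns) > threshold
```

Sorting dominates the cost, so the scan over the hits themselves is linear in the number of hits.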
This “Vertex Reconstruction” trigger relies on the fact that a low-energy lepton travels only a few centimetres in water before its energy falls below the Cherenkov threshold, so it is well approximated as a point source that emits all Cherenkov photons at the same time and position. The trigger uses a uniformly spaced three-dimensional grid of test vertices in the detector. For each vertex, it corrects the recorded hit times in all PMTs by the time-of-flight of a Cherenkov photon originating at that vertex and then applies an NDigits trigger to a shorter sliding time window of 20 ns. If the test vertex is close to the true vertex of an event, this time-of-flight correction leads to a narrow peak in the corrected arrival times, while dark noise hits remain randomly distributed.
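The time-of-flight correction can be made concrete with a simplified sketch. It assumes a cubic rather than cylindrical grid, a single photon speed derived from a refractive index of about 1.33, and hypothetical function names; none of these choices are taken from the actual trigger software.

```python
import itertools
import math

# Assumption: photon speed in water for a refractive index of about 1.33.
C_WATER_CM_PER_NS = 29.9792458 / 1.33


def make_grid(half_size_cm, spacing_cm):
    """Uniformly spaced cubic grid of test vertices centred on the origin."""
    n = int(half_size_cm // spacing_cm)
    axis = [i * spacing_cm for i in range(-n, n + 1)]
    return list(itertools.product(axis, repeat=3))


def max_hits_in_window(times_ns, window_ns):
    """Largest number of times that fit into any window of length window_ns."""
    times = sorted(times_ns)
    best = left = 0
    for right, t in enumerate(times):
        while t - times[left] > window_ns:
            left += 1
        best = max(best, right - left + 1)
    return best


def vertex_trigger(hits, grid, threshold, window_ns=20.0):
    """Search the grid for a vertex whose corrected hit times pile up.

    hits: list of (pmt_position_cm, hit_time_ns) pairs.
    Returns (vertex, n_hits) for the best test vertex if its hit count
    surpasses the threshold, otherwise None.
    """
    best_vertex, best_count = None, 0
    for vertex in grid:
        # Subtract the photon time of flight from this test vertex to each PMT.
        corrected = [t - math.dist(vertex, pos) / C_WATER_CM_PER_NS
                     for pos, t in hits]
        count = max_hits_in_window(corrected, window_ns)
        if count > best_count:
            best_vertex, best_count = vertex, count
    if best_count > threshold:
        return best_vertex, best_count
    return None
```

For hits that originate at a grid point, the corrected times coincide at that vertex, while at other test vertices they are spread out by the differences in path length.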
Furthermore, the TPUs will take into account external calibration triggers as well as GPS timestamps sent from the J-PARC accelerator to help identify events during beam spills.
Since all triggers are software-based, they can be updated if algorithmic improvements become available or increases in computing power enable a lower threshold.
Event Builder Units (EBU)
Once a TPU has identified an event, the timestamps of that event are sent to an EBU, which requests the data in that time window from the RBUs. The EBU then identifies hits within the trigger time window that are associated with the event and writes them to disk for permanent archival.
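The event-building step can be pictured as collecting the hits inside the trigger window from every RBU and merging them into one time-ordered list. The sketch below assumes that each RBU buffer is already sorted by time and that a hit is a plain `(time_ns, pmt_id, charge_pe)` tuple; both are simplifications made for this example.

```python
import heapq


def build_event(rbu_buffers, t_start_ns, t_end_ns):
    """Merge the hits inside [t_start_ns, t_end_ns) from all RBU buffers.

    rbu_buffers: list of per-RBU hit lists, each sorted by time.
    Returns one list of hits ordered by time.
    """
    selected = [
        [hit for hit in buffer if t_start_ns <= hit[0] < t_end_ns]
        for buffer in rbu_buffers
    ]
    # heapq.merge combines already sorted inputs without re-sorting everything.
    return list(heapq.merge(*selected))
```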
Brokers
A central broker is tasked with coordinating operations of the DAQ system.
To increase fault tolerance, two identical machines act as broker. The primary one handles all tasks during normal operations, while the secondary keeps track of all decisions of the primary and is ready to replace it at any time in case of a failure.
The broker distributes tasks to TPUs and EBUs. To reduce load, it does not transfer the data itself; instead, it tells a TPU or EBU which time window to analyse and the TPU or EBU will then request the data for that time window from the RBUs directly.
The broker handles failures of individual TPUs or EBUs by redistributing jobs to other available units, while failures of individual RBUs are handled by reassigning front-end electronics modules to other RBUs.
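The job redistribution can be sketched as a small scheduler that hands out time windows, never the data itself, and puts the window of a failed unit back at the front of the queue. The `Broker` class below and its method names are hypothetical and do not reflect the ToolDAQ interface.

```python
from collections import deque


class Broker:
    """Toy broker that assigns time windows to TPUs or EBUs."""

    def __init__(self, workers):
        self.idle = deque(workers)   # units waiting for work
        self.running = {}            # unit -> time window being processed
        self.pending = deque()       # time windows not yet assigned

    def _dispatch(self):
        while self.pending and self.idle:
            self.running[self.idle.popleft()] = self.pending.popleft()

    def submit(self, window):
        self.pending.append(window)
        self._dispatch()

    def complete(self, worker):
        del self.running[worker]
        self.idle.append(worker)
        self._dispatch()

    def fail(self, worker):
        """Remove an unresponsive unit and requeue its time window first."""
        window = self.running.pop(worker, None)
        if worker in self.idle:
            self.idle.remove(worker)
        if window is not None:
            self.pending.appendleft(window)
        self._dispatch()
```

Because only time windows pass through the broker, requeueing a job costs a few bytes regardless of how much detector data the window contains.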
2.2.7.2 Supernova Mode
The DAQ design also contains two dedicated supernova trigger machines that examine the event rate in each 1 ms time slice as well as in a sliding 20 s time window.
If a significant increase in a 20 s window is detected, which could be the signal of a distant supernova, all data from that time period is saved to long-term storage.
In the case of a galactic supernova, an increased event rate should be visible in a 1 ms time slice shortly after the start of the burst. In that case, an alert is sent out to all machines in the DAQ system to switch into a dedicated supernova operations mode. While in this mode, RBUs will temporarily stop the processor-intensive data reduction and stream all data to the EBUs for permanent storage as fast as the network connections allow.
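The two rate checks can be sketched as a monitor that is fed the event count of each 1 ms slice and keeps a running sum over the last 20 s (20 000 slices). The class name, the return values and both thresholds are placeholders chosen for this illustration; the text does not specify them.

```python
from collections import deque


class SupernovaMonitor:
    """Toy rate monitor for 1 ms slices and a sliding 20 s window."""

    def __init__(self, slice_threshold, window_threshold, window_slices=20_000):
        self.slice_threshold = slice_threshold
        self.window_threshold = window_threshold
        self.window_slices = window_slices
        self.window = deque()
        self.window_sum = 0

    def add_slice(self, n_events):
        """Process the event count of the next 1 ms slice.

        Returns "supernova_mode" for a burst visible in a single slice,
        "save_window" for an excess in the sliding window, else None.
        """
        self.window.append(n_events)
        self.window_sum += n_events
        if len(self.window) > self.window_slices:
            self.window_sum -= self.window.popleft()
        if n_events > self.slice_threshold:
            return "supernova_mode"
        if self.window_sum > self.window_threshold:
            return "save_window"
        return None
```

Keeping a running sum avoids re-adding 20 000 slice counts every millisecond.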
Buffer capacities and bandwidths throughout the DAQ system are designed to be able to handle a nearby supernova at a distance of 0.2 kpc with a peak event rate of about 10^8 Hz, corresponding to peak data rates of about 100 GB/s.
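As a quick consistency check, the two peak rates imply an average data volume per event. The short calculation below reads the peak event rate as 10^8 Hz and assumes 1 GB = 10^9 bytes; it is back-of-the-envelope arithmetic, not a design figure.

```python
PEAK_EVENT_RATE_HZ = 1e8        # peak event rate, read as 10^8 Hz
PEAK_DATA_RATE_B_PER_S = 100e9  # 100 GB/s, assuming 1 GB = 10^9 bytes

# Average number of bytes written per event at the peak of the burst.
bytes_per_event = PEAK_DATA_RATE_B_PER_S / PEAK_EVENT_RATE_HZ
print(f"{bytes_per_event:.0f} bytes per event")
```

The result is on the order of 1 kB per event.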
Like Super-Kamiokande, Hyper-Kamiokande is likely to participate in the SuperNova Early Warning System (SNEWS [219]) and would send out an alert to that system in the case of a supernova trigger.
2.2.7.3 Computing
As is common in modern high-energy physics experiments, Hyper-Kamiokande will adopt a multi-tiered computing system based on the Worldwide LHC Computing Grid. Kamioka, which hosts the Hyper-Kamiokande detector, as well as KEK, which hosts the neutrino beamline and near detector, will be Tier-0 sites storing all raw event data. Several Tier-1 sites hosted by major research facilities distributed around the world will store all reduced data, while individual institutions participating in the experiment will typically host Tier-2 sites that store subsets of the data as required.
The Hyper-Kamiokande software will be made available via the CernVM File System (CVMFS), a read-only file system optimized for software distribution, to ensure that all users have access to the most recent versions.