3.4 X-ray surveys
3.4.1 X-ray source detection and characterisation
As mentioned in Section 3.3, X-ray images contain very few photons, even for relatively long exposures. Moreover, some extended sources may contain only a few tens of photons spread over a large area. In addition, the PSF and vignetting, which vary across the FoV, as well as the particle and X-ray backgrounds, complicate the analysis of X-ray images. A source detection and characterisation algorithm must therefore be able to cope with these complications.
The main goals of X-ray surveys are the discovery and detection of as many objects as possible, especially if there is a research interest in a specific object population. There are two main steps in the source identification process for X-ray observations. The first step is to detect sources by identifying regions with a statistically significant overdensity of photons over the background. The second step is to determine to which kind of object the detected overdensity belongs. The observed X-ray objects can be divided into two categories: point-like and extended sources. The former are unresolved, compact astronomical objects whose size is smaller than the telescope PSF, such as AGN. The latter have extended X-ray emission that can exceed the size of the telescope PSF, such as galaxy groups and clusters (see Section 2.1.2). However, due to the complicated shape of the PSF in X-ray telescopes and other instrumental effects (like vignetting, see Section 3.3), point-like sources can appear extended. Determining whether a source is point-like or extended can therefore become very difficult.
The discrimination between X-ray point-like and extended sources, or between AGN and galaxy groups and clusters, used to be done through optical follow-up or by examining the spectrum of the sources. Both methods are time consuming, and with the arrival of large data sets from X-ray surveys such procedures became rather inefficient. Automatic and reliable methods to identify X-ray sources therefore became a necessity. Classification methods thus started to compare the measured source extent with that of the PSF in order to distinguish between point-like and extended sources. However, with the advent of better X-ray observatories, i.e. with improved PSF and better sensitivity, the extent-only criterion became obsolete. Nowadays, sophisticated algorithms, mostly maximum likelihood procedures, are used to fit point-like and extended models to the detected sources. In the following, some of the most common and successfully applied methods for X-ray source detection and characterisation are described.
Sliding cell
In this method, an X-ray image is scanned by a detection box in small steps. The signal-to-noise (SN) ratio is measured at each step and compared with a previously specified threshold value. The signal is measured from the pixel values within the cell, and the noise from nearby pixels that trace the local background. If the SN ratio exceeds the threshold, the box position is marked as a source, and the SN ratio serves as a first approximation of the object flux. These are the basic elements of the sliding cell method. Several improvements exist: successive runs with increasing cell size, an adaptive cell size as a function of off-axis angle, a matched-filter detection cell, or the addition of a maximum likelihood (ML) algorithm for further source analysis. The last improvement is discussed in more detail at the end of this section. The sliding cell method is generally robust in finding isolated point sources, but it can merge nearby sources. Moreover, it can fail to detect extended sources, especially if the cell size is smaller than the source extent. This method has been incorporated into some of the data analysis packages of X-ray missions, such as those of XMM-Newton and Chandra.
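The basic scan described above can be illustrated with a minimal Python sketch. It is not the implementation used in any mission software: the function name, the cell and background-frame sizes, the threshold, and the choice of estimating the noise as the Poisson fluctuation of the local background (floored at one count) are all assumptions made for illustration.

```python
import numpy as np

def sliding_cell_detect(image, cell=3, bkg_width=2, snr_threshold=3.0):
    """Return (row, col, snr) for cell positions whose SN ratio exceeds the threshold.

    Hypothetical helper: a square cell of side `cell` is moved pixel by pixel,
    and the background is estimated from a frame of width `bkg_width` around it.
    """
    half = cell // 2
    outer = half + bkg_width
    n_cell = cell * cell
    n_frame = (2 * outer + 1) ** 2 - n_cell
    detections = []
    rows, cols = image.shape
    for r in range(outer, rows - outer):
        for c in range(outer, cols - outer):
            inner_sum = image[r - half:r + half + 1, c - half:c + half + 1].sum()
            outer_sum = image[r - outer:r + outer + 1, c - outer:c + outer + 1].sum()
            # Local background per pixel, estimated from the frame around the cell.
            bkg_per_pixel = (outer_sum - inner_sum) / n_frame
            expected_bkg = bkg_per_pixel * n_cell
            signal = inner_sum - expected_bkg
            # Assumed noise model: Poisson fluctuation of the local background,
            # floored at 1 count to avoid dividing by zero in empty regions.
            noise = np.sqrt(max(expected_bkg, 1.0))
            snr = signal / noise
            if snr > snr_threshold:
                detections.append((r, c, snr))
    return detections

# Toy image: a single bright pixel on an empty background.
image = np.zeros((21, 21))
image[10, 10] = 30.0
detections = sliding_cell_detect(image)
```

In this toy case every cell position that contains the bright pixel is flagged, so a single source produces a small cluster of detections. A practical implementation would keep only the local SN maximum, which also hints at why nearby sources tend to be merged.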
Voronoi Tessellation and Percolation
In this method, each occupied pixel defines the centre of a polygon, the Voronoi cell, and the set of all cells forms the tessellation. The surface brightness of each cell is inversely proportional to the Voronoi cell area. In this sense, background pixels have larger Voronoi cells and low photon counts, whereas source pixels have small Voronoi cells and high photon counts. The flux of each cell is compared with the one expected from a random Poisson distribution. If the flux deviates significantly from this distribution, the cell is flagged and percolated with the neighbouring cells that also fulfil this requirement to form an object. This method can detect extended sources, even those with low surface brightness. However, it tends to merge or blend nearby sources. This method has been included in the data analysis package of Chandra.
Figure 3.5: Sketch of an image wavelet decomposition, where each level depicts different feature sizes corresponding to a given wavelet scale (from small, top, to large sizes, bottom). Contiguous wavelet images form an object if their features reside within a linking radius. Figure adapted from Starck & Murtagh (2006).
Wavelet transform
The wavelet technique is a multi-scale analysis of an image, in which the signal is decomposed through the wavelet transformation. This makes it possible to isolate sources of different sizes from the background signal. Wavelets are scalable, oscillatory functions that deviate from zero only within a limited spatial region. They also have zero normalization, and a full wavelet dictionary can be obtained from a mother wavelet using the simple dilation equation
$$ W(x, y) = \frac{1}{ab}\, W\!\left(\frac{x - c}{a}, \frac{y - d}{b}\right), \tag{3.1} $$
where a and b are the dilation parameters, and c and d, the translation parameters. The wavelet technique convolves an image, I(x, y), with a wavelet function, W:
$$ w_{ab}(x, y) = I(x, y) \otimes W\!\left(\frac{x}{a}, \frac{y}{b}\right). \tag{3.2} $$
Here $w_{ab}$ are the wavelet coefficient images corresponding to the wavelet scales a, b. By choosing a set of scales, the wavelet transform decomposes the original image into a corresponding number of wavelet coefficient images. In these images, the features with characteristic sizes close to the corresponding wavelet scale are amplified (see Fig. 3.5). The problem then lies in correctly identifying the features that are due to source signal rather than noise. Since Poisson noise dominates in X-ray images, the process of selecting significant features can become complicated. Different methods exist for removing the insignificant features. For example, Vikhlinin et al. (1997) assumed local Gaussian noise and defined a significance threshold value; Slezak et al. (1994) transformed an image with Poisson noise into an image with Gaussian noise through the Anscombe transformation; Damiani et al. (1997) used Monte Carlo simulations to find a convenient source detection threshold; Starck & Pierre (1998) used the wavelet function histogram method, which estimates the exact probability distribution function (PDF) of wavelet coefficients originating from Poisson data of locally constant mean.
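As an illustration of the convolution in Eq. (3.2), the following Python sketch computes one wavelet coefficient image. The text does not specify a mother wavelet, so an isotropic Mexican-hat wavelet (a = b) is assumed here, together with a periodic FFT convolution; the function names are hypothetical. The Anscombe transformation is included in its standard form, 2 sqrt(n + 3/8), as one possible way of making Poisson noise approximately Gaussian before thresholding.

```python
import numpy as np

def mexican_hat(scale):
    """Isotropic Mexican-hat kernel (assumed mother wavelet), scaled by 1/scale**2."""
    size = int(8 * scale) | 1          # odd kernel size covering about +-4 scales
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    r2 = (x ** 2 + y ** 2) / scale ** 2
    kernel = (2.0 - r2) * np.exp(-r2 / 2.0) / scale ** 2
    return kernel - kernel.mean()      # enforce zero normalization on the finite grid

def wavelet_coefficients(image, scale):
    """Convolve the image with the wavelet at one scale (periodic boundaries)."""
    kernel = mexican_hat(scale)
    kh, kw = kernel.shape
    padded = np.zeros(image.shape, dtype=float)
    padded[:kh, :kw] = kernel
    # Shift the kernel centre to pixel (0, 0) so the output is not displaced.
    padded = np.roll(padded, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(padded)))

def anscombe(counts):
    """Anscombe transformation: Poisson counts -> approximately unit-variance Gaussian."""
    return 2.0 * np.sqrt(np.asarray(counts, dtype=float) + 3.0 / 8.0)

# A flat background gives vanishing coefficients; a compact source gives a peak.
image = np.full((64, 64), 5.0)
flat = wavelet_coefficients(image, scale=2.0)
image[32, 32] += 100.0
coeffs = wavelet_coefficients(image, scale=2.0)
peak = np.unravel_index(np.argmax(coeffs), coeffs.shape)
```

Because the kernel sums to zero, a constant background contributes nothing to the coefficients, which is why no prior background estimate is needed at this stage. Repeating the call for several values of `scale` would yield the stack of coefficient images sketched in Fig. 3.5.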
Once the significant coefficients at each wavelet scale have been identified, the local maxima at all scales are collected and cross-identified to define objects in the data (see Fig. 3.5). This method can separate nearby sources and detect sources of different shapes and surface brightnesses, including low-surface-brightness ones. One of its most important features is that the wavelet transformation does not require previous knowledge of the image background to compute source parameters. However, it can be computationally expensive. This method has been included in the data analysis packages of XMM-Newton and Chandra. Furthermore, it has been extensively and successfully used in X-ray surveys whose main goal is the detection of galaxy groups and clusters (e.g. Rosati et al. 1995; Vikhlinin et al. 1998; Pacaud et al. 2006; Lloyd-Davies et al. 2011).
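The cross-identification of maxima across scales can be sketched as follows. This is only one plausible reading of the linking-radius criterion mentioned in the caption of Fig. 3.5: the function name, the data layout, and the rule that a maximum is linked only to an object's maximum at the immediately preceding scale are assumptions, not the procedure of any specific package.

```python
import math

def link_across_scales(maxima_per_scale, linking_radius):
    """Group local maxima from successive wavelet scales into objects.

    maxima_per_scale: list ordered from small to large scale; each entry is a
    list of (row, col) positions of significant local maxima at that scale.
    A maximum joins an object if it lies within `linking_radius` of that
    object's maximum at the previous scale; otherwise it starts a new object.
    """
    objects = []  # each object is a list of (scale_index, (row, col))
    for scale_index, maxima in enumerate(maxima_per_scale):
        for position in maxima:
            for obj in objects:
                last_scale, last_pos = obj[-1]
                if (last_scale == scale_index - 1
                        and math.dist(position, last_pos) <= linking_radius):
                    obj.append((scale_index, position))
                    break
            else:
                # No existing object is close enough: start a new one.
                objects.append([(scale_index, position)])
    return objects

# Toy input: three scales, with one feature persisting across all of them.
maxima = [[(10, 10), (30, 30)], [(11, 10)], [(11, 11), (50, 50)]]
objects = link_across_scales(maxima, linking_radius=2.0)
```

In the toy input, the feature near (10, 10) is followed through all three scales, while the other two maxima remain single-scale objects. The number of scales over which an object persists gives a rough indication of its extent.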
Combination with a maximum likelihood fitting
The above methods estimate various parameters of the detected sources, such as extent, counts and position. The selection of extended sources used to be based on their spatial extent. However, with the improved PSF of new X-ray observatories, the accuracy of the sizes estimated directly by the detection algorithms is often insufficient to classify sources reliably, especially if the aim is to detect fainter and higher-redshift objects. A major improvement in this selection has been the addition of a maximum likelihood technique for further analysis of the detected sources.
In this further step, the sources identified by the detection algorithms are analysed by maximum likelihood fitting. For each source, the fitting code determines the model that maximizes the probability of generating the observed spatial photon distribution. For a point-like source model, the spatial distribution of a detected source is compared with the telescope PSF at the off-axis position of the source. An extended source, on the other hand, can be modelled by the β-profile of galaxy groups and clusters (see Eq. 2.5 in Section 2.1.2). The modelling should be as realistic as possible; therefore, the models are convolved with the telescope PSF and include the background. The agreement between the observed and model distributions is quantified by means of a simplified version of the C-statistic, C (Cash 1979),
$$ C = -2 \ln P = -2 \sum_{i=1}^{N} \left( n_i \ln e_i - e_i \right), \tag{3.3} $$
where P is the probability, $n_i$ is the number of photons in pixel i, $e_i$ is the expected model value at pixel i, and N is the total number of pixels. The final model parameters are estimated by maximizing the likelihood, i.e. by minimizing C. Different methods exist to perform this calculation, all of which require a first guess of the parameters. These starting points are usually taken from the output of the detection methods. Finally, the maximum likelihood fitting provides a set of parameters that can be used to distinguish between point-like and extended sources.
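The use of Eq. (3.3) to compare a point-like and an extended model can be illustrated with a short Python sketch. Several simplifications are assumed: a circular Gaussian stands in for the telescope PSF, the β-profile is taken in its commonly used surface-brightness form [1 + (r/r_c)^2]^(-3β + 1/2) (assumed here to correspond to Eq. 2.5), the extended model is not convolved with the PSF, and the model parameters are fixed by hand rather than fitted. All function names and numerical values are hypothetical.

```python
import numpy as np

def cash_statistic(counts, model):
    """C = -2 * sum(n_i ln e_i - e_i), as in Eq. (3.3); lower C means a better fit."""
    return -2.0 * np.sum(counts * np.log(model) - model)

def radius_grid(shape, centre):
    """Distance of every pixel from the source centre (row, col)."""
    y, x = np.indices(shape)
    return np.hypot(x - centre[1], y - centre[0])

def point_model(r, total, psf_sigma, background):
    """Point source: Gaussian stand-in for the PSF plus a flat background."""
    psf = np.exp(-0.5 * (r / psf_sigma) ** 2)
    return total * psf / psf.sum() + background

def beta_model(r, total, core_radius, beta, background):
    """Extended source: beta-profile (not PSF-convolved here) plus a flat background."""
    profile = (1.0 + (r / core_radius) ** 2) ** (-3.0 * beta + 0.5)
    return total * profile / profile.sum() + background

shape, centre = (41, 41), (20, 20)
r = radius_grid(shape, centre)

# Simulate an extended source with Poisson noise (fixed seed for repeatability).
rng = np.random.default_rng(1)
truth = beta_model(r, total=500.0, core_radius=4.0, beta=2.0 / 3.0, background=0.05)
counts = rng.poisson(truth)

c_point = cash_statistic(counts, point_model(r, 500.0, 1.5, 0.05))
c_ext = cash_statistic(counts, beta_model(r, 500.0, 4.0, 2.0 / 3.0, 0.05))
delta_c = c_point - c_ext   # positive values favour the extended model
```

In a real analysis both models would be fitted by minimizing C over their free parameters, starting from the detection output, and the difference in C between the best-fitting models would feed into the point-like versus extended classification.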
This method has been successfully applied in different galaxy cluster surveys (e.g. Rosati et al. 1995; Vikhlinin et al. 1998; Pacaud et al. 2006; Lloyd-Davies et al. 2011; Pacaud et al. 2015), identifying hundreds of galaxy groups and clusters, which have been optically confirmed. The methodology varies across these different studies, because they are adapted to the features of the data, or simply because the methodology itself has evolved with time.