2.1 Simulated Chandra data
2.2.3 Evaluation
celldetect for images is presently limited to work on datasets of size 2048 or less. However, for events files, recursive blocking is employed so that larger areas can be handled.
celldetect can use the variable kernel procedure only if the following keywords are present in the file: TELESCOP, INSTRUME, DETNAM, GRATING, RA_NOM, and DEC_NOM. Only Chandra detectors are supported at present. If any of these keywords is missing, celldetect cannot calculate the appropriate cell size, and fixedcell must be used instead.
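The keyword requirement can be checked before running celldetect. The following is a minimal sketch, assuming the file header has already been read into a Python dictionary; the helper names missing_keywords and can_use_variable_kernel are hypothetical and are not part of celldetect.

```python
# Keywords that must be present for the variable kernel procedure.
REQUIRED_KEYWORDS = ("TELESCOP", "INSTRUME", "DETNAM", "GRATING", "RA_NOM", "DEC_NOM")


def missing_keywords(header):
    """Return the required keywords that are absent from a header mapping."""
    return [key for key in REQUIRED_KEYWORDS if key not in header]


def can_use_variable_kernel(header):
    """Hypothetical helper: True only when every required keyword is present."""
    return not missing_keywords(header)


# Illustrative header values only; they are not taken from a real observation.
header = {
    "TELESCOP": "CHANDRA",
    "INSTRUME": "ACIS",
    "DETNAM": "ACIS-S",
    "GRATING": "NONE",
    "RA_NOM": 0.0,
    "DEC_NOM": 0.0,
}
print(can_use_variable_kernel(header))
```

If the check fails, the list returned by missing_keywords shows which keywords would need to be supplied, or that fixedcell is the only option.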
A general problem of the sliding cell method arises in the wings of the PSF of strong sources. Statistical fluctuations of source counts sometimes produce a large number of spurious detections in this situation. However, if the cell size is a good match to the PSF, this effect should be mitigated.
Future planned enhancements include:
• Options for use of an exposure map
• Implementation of a generalized function in place of the sliding cell
2.3 Wavelet — Chicago: wavdetect
In its most general form, the wavelet detect method correlates the data with a wavelet function which has limited spatial extent and overall normalization zero. Each pixel’s correlation value is compared with the expected distribution of values (computable from the estimate of the background); if the value is an extreme outlier within this distribution, the pixel is assumed to be associated with a source. The implementation provided in wavdetect uses the so-called ‘Mexican Hat’ function (see section 6.5), which has a positively valued quasi-Gaussian core surrounded by a negatively valued annulus. This is a reasonable function for mirrors/detectors which are characterized by a quasi-Gaussian PSF, and will work effectively even for pathological PSFs.
Figure 2.3: The ACIS-S rendition of our simulated field with ellipses showing the celldetect results. The size of the ellipses has been increased by a factor of 8 for clarity: their actual size is somewhat smaller than the black core of the source distribution. As expected, the close pairs with separations of 1 and 0.5 arcsec are not resolved and the three sources of large extent are not detected.
There are two parts to the wavdetect code. The first, wtransform, convolves the data with the wavelet function for however many scales are chosen. The resulting correlation maps are used by the second part, wrecon, to construct a final source list and estimate various parameters for each source. For a complete description of the parameters and the data products, see sections 9.2.2 and 9.2.3.
Each part may be run separately, but the recommended procedure is to use the script wavdetect, which runs them sequentially. It is not possible to run wavdetect and then make one or more additional runs of wrecon, because the wavdetect script automatically deletes the intermediate files. For this reason, if you anticipate the need to make several runs of wrecon, you should start with wtransform, not wavdetect. A few parameters differ between wavdetect and wtransform/wrecon, e.g. xscales and yscales in place of scales, which provide the option to deviate from circular symmetry.
2.3.1 Key parameters to consider
Here we discuss the key parameters. A description of all parameters is given in section 9.2.2.
infile, outfile, scellfile, imagefile, defnbkgfile
Filenames are required for all of these parameters. outfile contains the detected source list while the last three are images for the source cells, a reconstructed image, and a normalized (i.e. flat-fielded) background image.
expfile
This is the name of the file containing the exposure map. If an exposure map which matches the image is available, it should be used. If there is no exposure map, a dummy array is substituted with each pixel set to one. Thus, to obtain true count rates, the user must divide the output values by the exposure time. The program should, in the future, read the exposure time from the FITS headers if exptime = 0. This does not eliminate the need for the parameter, as some flexibility is needed for older satellite datasets.
scales
The scales parameter determines how many (scaled) transforms will be computed. For a simple test you might enter "2.0 4.0" for this parameter. If that test is successful, try again with scales "1.0 2.0 4.0 8.0 16.0". For a more extensive run you might like to try the √2 series: "1.0 1.414 2.0 2.828 4.0 5.657 8.0 11.314 16.0". Remember to use quote marks at the beginning and end of a series of numbers which have embedded spaces.
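The scale series quoted above are geometric progressions, so they need not be typed by hand. The following is a minimal sketch of a generator for such strings; the function name scale_series and its arguments are hypothetical and serve only to illustrate the arithmetic.

```python
def scale_series(count, first=1.0, steps_per_octave=1):
    """Build a space-separated series of wavelet scales.

    steps_per_octave=1 gives a factor of 2 between successive scales;
    steps_per_octave=2 gives a factor of sqrt(2).
    """
    values = [round(first * 2.0 ** (k / steps_per_octave), 3) for k in range(count)]
    return " ".join(str(v) for v in values)


print(scale_series(5))                       # 1.0 2.0 4.0 8.0 16.0
print(scale_series(9, steps_per_octave=2))   # 1.0 1.414 2.0 2.828 4.0 5.657 8.0 11.314 16.0
```

The resulting string still has to be enclosed in quote marks when it is given as the value of scales.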
The primary concern is to match the first scale size to the PSF. Once that is done (e.g. by blocking an oversampled image), one decides how many scaled wavelets to use. To some extent, choices are based on the computing resources at hand. In practice, one cannot have both many pixels and many scale sizes. Data structures for a 512x512 image use up 36 Mb. A 2048x2048 image requires over 300 Mb. Datasets that do not fit in physical memory will page heavily to disk and processing will run very slowly. Scale sizes larger than 32 allocate excessive memory because it is necessary to pad the image with surrounding zeros, so specifying them can quickly bring even respectable computers to their knees. Common choices are 512x512 arrays and 5 to 9 scales, where each scale is a factor of 2 or √2 larger than the preceding one.
of the wavelet. The resulting ‘correlation maps’ are then examined for regions where the intensity is larger than some threshold, and the final source list is constructed from a comparison of the different scale runs. Note that the units for scales are pixels and the value of scales is the radius of the Mexican hat (the Mexican hat function crosses zero at √2 × radius).
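The properties of the Mexican hat mentioned above (positive core, negative annulus, zero crossing at √2 × radius, overall normalization zero) can be verified numerically. The following is a minimal sketch, assuming the common unnormalized two-dimensional form (2 − r²/σ²) exp(−r²/2σ²) with σ equal to the scale in pixels; the exact normalization used inside wavdetect is not given here, and the function names are hypothetical.

```python
import math


def mexican_hat(x, y, scale):
    """Unnormalized 2D Mexican hat; `scale` is the radius in pixels (assumed form)."""
    r2 = (x * x + y * y) / (scale * scale)
    return (2.0 - r2) * math.exp(-r2 / 2.0)


def kernel_sum(scale, half_width_in_scales=6):
    """Sum the wavelet over a square pixel grid extending several scales from the center."""
    half = int(half_width_in_scales * scale)
    return sum(
        mexican_hat(x, y, scale)
        for y in range(-half, half + 1)
        for x in range(-half, half + 1)
    )


scale = 4.0
zero_radius = math.sqrt(2.0) * scale
print(mexican_hat(0, 0, scale))            # positive core
print(mexican_hat(zero_radius, 0, scale))  # essentially zero at sqrt(2) * scale
print(mexican_hat(2 * scale, 0, scale))    # negative annulus
print(kernel_sum(scale))                   # close to zero: overall normalization zero
```

The grid in kernel_sum extends to six scales on each side, which illustrates why large scales require so much padding around the image.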
exptime
This is the length of time over which the field was observed, in seconds. If set to zero, the value will either be taken from the FITS header (if set), or, failing that, estimated by averaging over exposure map pixel values at the center of the field. Otherwise, enter the field value in seconds.
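The fallback order described for exptime can be sketched as follows. This is an illustration only, not the wavdetect code: the header keyword name EXPOSURE, the size of the central box, and the assumption that the exposure map is in seconds are all choices made for this example.

```python
def resolve_exptime(exptime, header=None, expmap=None, header_key="EXPOSURE", half_box=1):
    """Pick an exposure time in seconds, following the order described in the text.

    header_key and half_box are assumptions made for this sketch.
    """
    if exptime != 0:
        return float(exptime)                      # user-supplied value wins
    if header is not None and header.get(header_key):
        return float(header[header_key])           # value from the FITS header, if set
    if expmap:
        # Average the exposure map over a small box at the center of the field.
        cy, cx = len(expmap) // 2, len(expmap[0]) // 2
        rows = expmap[max(cy - half_box, 0):cy + half_box + 1]
        central = [v for row in rows for v in row[max(cx - half_box, 0):cx + half_box + 1]]
        return sum(central) / len(central)
    raise ValueError("no exposure time available")


# Converting detected counts to a count rate once the exposure time is known.
counts = 250.0
rate = counts / resolve_exptime(0, header={"EXPOSURE": 5000.0})
print(rate)  # counts per second
```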
mask
If a mask is desired, it may be specified here either as a circle (c x y r - all in pixels) or a rectangle (r x1 y1 x2 y2, where x1,y1 locate the lower left corner and x2,y2 define the upper right corner). The mask descriptors are left very simplistic (circle, rectangle) because DataModel should eventually be the preferred method of creating filtered datasets.
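The two mask descriptors can be interpreted with a few lines of code. The following is a minimal sketch, assuming the fields are separated by spaces and that the region boundaries are inclusive; parse_mask and inside are hypothetical helpers, and the sketch does not say whether the program keeps or discards the pixels inside the region.

```python
def parse_mask(descriptor):
    """Parse 'c x y r' or 'r x1 y1 x2 y2' (all values in pixels) into a tuple."""
    parts = descriptor.split()
    shape = parts[0].lower()
    numbers = [float(p) for p in parts[1:]]
    if shape == "c" and len(numbers) == 3:
        return ("circle", *numbers)
    if shape == "r" and len(numbers) == 4:
        return ("rectangle", *numbers)
    raise ValueError("unrecognized mask descriptor: " + descriptor)


def inside(mask, x, y):
    """Return True if pixel (x, y) lies within the mask region (boundary included)."""
    if mask[0] == "circle":
        _, cx, cy, r = mask
        return (x - cx) ** 2 + (y - cy) ** 2 <= r * r
    _, x1, y1, x2, y2 = mask  # lower left corner, then upper right corner
    return x1 <= x <= x2 and y1 <= y <= y2


circle = parse_mask("c 100 120 15")
rectangle = parse_mask("r 10 20 50 60")
print(inside(circle, 110, 120), inside(rectangle, 51, 60))
```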
maxiter
The minimum number of iterations for cleaning sources from the data (to estimate the background map) is 2. Increasing this number will increase the method’s source detection sensitivity, but may not increase it enough to justify the increased computation time. More iterations are generally needed for large wavelet scales (e.g. 16 pixels in a ROSAT 512x512 PSPC field).
sigthresh
This is the significance threshold for source detection. A good value to use is the inverse of the total number of pixels, e.g. ∼10⁻⁶ for a 1024x1024 field. This is equivalent to stating that the expected number of false sources per field is one. If larger arrays are used without decreasing sigthresh, you will start to increase the probability of detecting false sources. Likewise, if smaller arrays are used, this parameter may be increased.
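The rule of thumb above is simple arithmetic: the expected number of false sources is the threshold multiplied by the number of pixels. The following is a minimal sketch with hypothetical helper names, shown only to make the scaling with array size explicit.

```python
def suggested_sigthresh(nx, ny, expected_false=1.0):
    """Threshold giving `expected_false` false sources in an nx by ny field."""
    return expected_false / (nx * ny)


def expected_false_sources(sigthresh, nx, ny):
    """Expected number of false sources for a given threshold and field size."""
    return sigthresh * nx * ny


print(suggested_sigthresh(1024, 1024))           # 9.5367431640625e-07, i.e. about 1e-6
print(suggested_sigthresh(512, 512))             # a smaller array allows a larger threshold
print(expected_false_sources(1e-6, 2048, 2048))  # a larger array at the same threshold gives more false sources
```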
bkgsigthresh
This parameter specifies a significance for cleansing data from the image to compute the background map. As it does not affect source detection, this parameter may be set to a more liberal value than sigthresh (e.g. 10⁻² or 10⁻³); this will help reduce the effect of weak undetectable sources on the background map calculation.