
arXiv:0807.1816v2 [stat.AP] 15 May 2009

The Annals of Applied Statistics, 2009, Vol. 3, No. 1, 61–95. DOI: 10.1214/08-AOAS190

© Institute of Mathematical Statistics, 2009

STATISTICAL CHALLENGES IN THE ANALYSIS OF COSMIC MICROWAVE BACKGROUND RADIATION

By Paolo Cabella and Domenico Marinucci

University of Rome Tor Vergata and University of Rome Tor Vergata

An enormous number of observations on Cosmic Microwave Background radiation have been collected in the last decade, and much more data are expected in the near future from planned or operating satellite missions. These datasets are a goldmine of information for Cosmology and Theoretical Physics; their efficient exploitation poses several intriguing challenges from the statistical point of view. In this paper we review a number of open problems in CMB data analysis and we present applications to observations from the WMAP mission.

1. Introduction.

1.1. Cosmological background. Cosmology is now developing into a mature observational science, with a vast array of different experiments that yield datasets of astonishing magnitude and nearly as great challenges for theoretical and applied statisticians. Datasets are now available on a large variety of different phenomena, but the leading part in cosmological research has been played over the last 15 years by the analysis of Cosmic Microwave Background (CMB) radiation, an area which has already led to Nobel Prizes for Physics in 1978 and in 2006.

The nature of CMB can be loosely explained as follows [see, e.g., Dodelson (2003) for a textbook account]. According to the standard cosmological model, the Universe that we currently observe originated approximately 13.7 billion years ago in a very hot and dense state, in what is universally known as the Big Bang. Neglecting fundamental physics in the first fractions of a second, we can naively imagine a fluid state where matter was completely ionized, that is, the kinetic energy of electrons was much stronger than the electrical attraction of protons, so that no stable atoms could form. It is a consequence of quantum principles that a free electron has a much larger cross-section than when it is bound in an atom; loosely speaking, as a consequence, the probability of interactions between photons and electrons was so high that the mean free path of the former was very short and the Universe was consequently “opaque.” As the Universe expanded, the mean energy content decreased, that is, the fluid of matter and radiation cooled down; the mean kinetic energy of the electrons decreased as well until it reached a critical value where it was no longer sufficient to compensate the electromagnetic attraction of the protons; stable (and neutral) hydrogen atoms were then formed. This change of state occurred at the so-called “age of recombination,” which is currently reckoned to have taken place 3.7 × 10⁵ years after the Big Bang, that is, when the Universe had only 0.003% of its current age. At the age of recombination, the probability of interactions became so small that, as a first approximation, photons could start to travel freely. Neglecting second order effects, we can assume they had no further interaction up to the present epoch.

Received April 2008; revised July 2008.

Key words and phrases. Cosmic Microwave Background radiation, spherical random fields, angular power spectrum, bispectrum, local curvature, spherical wavelets.

This is an electronic reprint of the original article published by the Institute of Mathematical Statistics in The Annals of Applied Statistics, 2009, Vol. 3, No. 1, 61–95. This reprint differs from the original in pagination and typographic detail.

The remarkable consequence of this mechanism is that the Universe is embedded in a uniform radiation that provides pictures of its state nearly 1.37 × 10¹⁰ years ago; this is exactly the above-mentioned CMB radiation.

The existence of CMB was predicted by G. Gamow in a series of papers in the forties; it was later discovered fortuitously by Penzias and Wilson in 1965, and for this discovery they earned the Nobel Prize for Physics in 1978. For several years further experiments were only able to confirm the existence of the radiation, and to test its adherence to the Planckian curve of blackbody emission, as predicted by theorists. A major breakthrough occurred with the NASA satellite mission COBE, which was launched in 1989 and publicly released the first full-sky maps of the radiation in 1992; for these maps Smoot and Mather earned the Nobel Prize for Physics in 2006 [Smoot et al. (1992)].

The nature of these maps deserves further explanation. CMB is distributed in remarkably uniform fashion over the sky, with deviations of the order of 10⁻⁴ with respect to the mean value (corresponding to 2.731 K). The attempts to understand this uniformity have led to very important developments in cosmology, primarily the inflationary scenario which now dominates the theoretical landscape. Even more important, though, are the tiny fluctuations around this mean value, which provided the seeds for stars and galaxies to form out of gravitational instability. Measuring and understanding the nature of these fluctuations has therefore been the core of an enormous amount of experimental and theoretical research. In particular, their stochastic properties yield a goldmine of information on a variety of extremely important issues in astrophysics and cosmology, and on many problems at the frontier of fundamental physics.

To mention just a few of these problems, we recall the issues concerning the matter content of the Universe, its global geometry, the existence and nature of (nonbaryonic) dark matter, the existence and nature of dark energy, which is related to Einstein’s cosmological constant, and many others.

The next experimental landmark in CMB analysis followed in 2000, when two balloon-borne experiments, BOOMERANG and MAXIMA, yielded the first high-resolution observations on small patches of the sky (less than 10° squared). These observations led to the first constraints on the global geometry of the Universe, which was found to be (very close to) Euclidean.

Another major breakthrough followed with the 2003, 2007 and 2008 data releases from another NASA satellite experiment, WMAP (the data are publicly available on the web site http://lambda.gsfc.nasa.gov/). Such data releases yielded measurements of the correlation structure of the random field up to a resolution of about 0.22 degrees, that is, approximately 30 times better than COBE (7–10 degrees). Another major boost in data analysis is expected from the ESA satellite mission Planck, which is now scheduled to be launched on October 31, 2008; public data releases are expected in the following 3–5 years. Planck is planned to provide datasets of nearly 5 × 10¹⁰ observations, and this will make it possible to settle many open questions with CMB temperature data. New challenging questions are expected to arise at a faster and faster pace over the next decades; for instance, Planck will provide high-quality so-called polarization data, which will set the agenda for the experiments to come. Polarization data can be viewed as tensor-valued, rather than scalar, observations; that is, what we observe are not measurements of a scalar quantity such as the temperature, but random quadratic forms. As such, this entails an entirely new field of statistical research, which is still in its infancy and will not be discussed in the present paper.

Our aim here is to provide a review of statistical issues arising in CMB data analysis, with many examples of applications of statistical procedures to real data from the WMAP experiment. Some of the empirical results we provide are new, as detailed below. The plan of the paper is as follows: in Section 2 we review very briefly some background material on map-making, component separation and spectral representations for CMB datasets. For brevity’s sake, we do not provide many details other than the material which is essential for our following discussion. In Section 3 we are concerned with angular power spectrum estimation, and we discuss procedures to deal with relevant practical questions such as the presence of observational noise and/or missing observations. In Section 4 we present some tools to test for Gaussianity and/or isotropy of CMB radiation: we focus, in particular, on harmonic methods such as the bispectrum, techniques based on differential geometry such as the local curvature, and spherical wavelets (with the so-called Spherical Mexican Hat approach). Concerning the latter, we stress that many other possible approaches to wavelets on the sphere exist, which have been successfully applied to various parts of cosmological and astrophysical research: nevertheless, the field is still extremely active and very much open for research (in particular, the derivation of the stochastic properties of wavelet procedures is still at the very beginning). Finally, we collect in the Appendix some background mathematical material which we considered necessary for a better understanding of our proposals.

2. Some preliminary issues.

2.1. Map-making and component separation. To understand more precisely the nature of the statistical issues involved, we need to introduce some more formalization. As explained above, CMB can be viewed as the single realization of a random field on the surface of the sphere, that is, for each x ∈ S², T(x) is a random variable on a probability space. Observations are provided by means of electromagnetic detectors (so-called radiometers and/or bolometers) which measure fluxes of incoming radiation (i.e., photons) on a range of different frequencies. For instance, the above-mentioned WMAP experiment is based upon 16 detectors, centered at frequencies 40.7, 60.8 and 93.5 GHz, which are labeled the Q, V and W band, respectively. The forthcoming ESA mission Planck will be based upon 70 channels ranging from 30 GHz to 857 GHz. As the satellites scan the sky, observations are collected as a vector time series, the number of observations being of the order of 10⁹ for WMAP and 5 × 10¹⁰ for Planck. A first issue then relates to the construction of spherical maps starting from the Time Ordered Data (TOD) provided by the satellite; this is the so-called map-making challenge; see, for instance, Keihanen, Kurki-Suonio and Poutanen (2005) and De Gasperis et al. (2005). For brevity’s sake, we shall provide only the basic framework, and refer to the literature for more details. In short, we can assume that in each of the p channels we actually observe

O_i(x) = T (x) + F_i(x) + N_i(x), i = 1, . . . , p, x ∈ S²;

here, T(·) denotes the CMB signal, F_i(x) denotes the so-called foreground emissions by galactic and extragalactic sources of noncosmological nature (for instance, galaxies, quasars, intergalactic dust and others), and N_i(x) denotes instrumental noise. The crucial point to be understood is that the dependence of CMB emission across the different frequency channels is known, and it is different from the pattern followed by other sources: this key property makes component separation possible and allows the construction of filtered maps [see, e.g., Patanchon et al. (2005) and the references therein]. More precisely, a clear prediction from theoretical physics, confirmed to amazing accuracy from the very first experiments [Smoot et al. (1992)], is that CMB radiation should follow the Planckian curve of blackbody radiation, that is, radiation is distributed across frequencies ν_i, i = 1, . . . , p, according to the function

$$R(\nu; x) = \frac{8\pi h \nu^{3}}{c^{3}}\,\frac{1}{e^{h\nu/(k_B T(x))} - 1}, \tag{1}$$

where R(ν; x) denotes the emission at frequency ν for the corresponding temperature T(x) (measured in kelvin), c is the speed of light in the vacuum (= 2.99792 × 10⁸ m/s), h is Planck’s constant (= 6.6261 × 10⁻²⁷ erg s), and k_B is Boltzmann’s constant (= 1.3807 × 10⁻¹⁶ erg/K). In other words, the determination of T(x) is made possible by the inversion of (1): the blackbody pattern can be estimated due to the presence of multiple detectors and the fact that astrophysical emissions of noncosmological nature are characterized by a different pattern of dependence across frequencies. In some regions, however, foreground emissions are so strong that component separation is still a difficult statistical problem; several groups of cosmologists are active in this field and a unique consensus solution has not been delivered yet. Moreover, in some areas of the sky (e.g., the Galactic plane, i.e., the line of sight of the Milky Way) the problem is considered to be largely unsolvable, so that there are missing observations in CMB maps (these unobserved regions are becoming, however, smaller and smaller with more refined experiments). In Figure 1 we report a CMB map constructed from (the Q band of) WMAP data; the missing region around the galactic plane is immediately evident.

Fig. 1. CMB radiation from WMAP data.
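As a minimal sketch of what "inversion of (1)" means at a single frequency, the blackbody law can be solved for T in closed form: T = hν / (k_B log(1 + 8πhν³/(c³R))). The function names below are hypothetical, and the constants are expressed in CGS units throughout (so c is taken in cm/s, an assumption of this sketch made to match the units quoted for h and k_B). Real component separation works with several channels and foregrounds and is far more involved than this.

```python
import math

# CGS constants (assumed for this sketch; c converted to cm/s)
C_LIGHT = 2.99792e10    # speed of light, cm/s
H_PLANCK = 6.6261e-27   # Planck's constant, erg s
K_B = 1.3807e-16        # Boltzmann's constant, erg/K


def planck_emission(nu, temperature):
    """Blackbody law (1): emission at frequency nu [Hz] for temperature [K]."""
    x = H_PLANCK * nu / (K_B * temperature)
    prefactor = 8.0 * math.pi * H_PLANCK * nu ** 3 / C_LIGHT ** 3
    return prefactor / math.expm1(x)


def invert_planck(nu, emission):
    """Solve (1) for the temperature given the emission at one frequency."""
    prefactor = 8.0 * math.pi * H_PLANCK * nu ** 3 / C_LIGHT ** 3
    # e^{h nu / (k_B T)} = 1 + prefactor / emission
    return H_PLANCK * nu / (K_B * math.log1p(prefactor / emission))
```

For instance, `invert_planck(60.8e9, planck_emission(60.8e9, 2.725))` recovers the input temperature up to floating-point rounding.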

Full-sky maps can be constructed by weighted linear interpolation across different channels, but they are not considered fully reliable for data analysis, especially at high frequencies; we report this so-called ILC (Internal Linear Combination) map in Figure 2; see Bennett et al. (2003) for more details on its construction.

Fig. 2. The so-called Internal Linear Combination map from WMAP data.

There are several other statistically interesting issues involved with the reconstruction of the scalar value T(x) from the vector-valued observations {O_1(x), . . . , O_p(x)}; actually the real experimental set-up is more complicated (and interesting) than this, because each location is observed unevenly, that is, the scanning strategy is such that some regions are more accurately measured than others. Also, the contaminating noise can have a time-dependent structure [there is indeed strong evidence for long memory behavior, see, e.g., Natoli et al. (2002)]; the possible existence of noise correlation across different channels will be discussed below. These experimental features have sparked in the cosmological literature a very lively statistical debate on filtering and image reconstruction. We shall come back to some of these points later.

2.2. Isotropy and spectral representation. In the idealized case of no experimental noise and perfect map-making, we can focus on the random field {T(x)}, assuming that it is exactly observed at each location on the unit sphere S². A crucial assumption on CMB radiation is its isotropic nature, that is, $T(\cdot) \stackrel{d}{=} T \circ g(\cdot)$, where $\stackrel{d}{=}$ denotes equality in distribution (in the sense of random fields) and g ∈ SO(3) is any element of the group of rotations in R³. More explicitly, the joint law of CMB radiation is assumed invariant to any change of coordinates; the condition is viewed by physicists as a realization of the so-called Einstein Cosmological Principle, that is, the statement that the Universe should “look the same” to an observer in any arbitrary location. In other words, we could impose isotropy by requiring that the stochastic laws of CMB radiation are invariant with respect to the choice of coordinates. There is some (quite inconclusive) evidence from WMAP data that isotropy may fail, that is, some authors have suggested that data on CMB radiation may show some asymmetries which would be inconsistent with isotropy [see, e.g., Park (2004), Hansen et al. (2004)]. The existence of these asymmetries remains highly disputed, though, and it actually provides yet another intriguing area for statistical research. It is in fact hotly debated whether these asymmetries should be ascribed to experimental features or truly cosmological causes. From the theoretical point of view, cosmological models that would produce asymmetries do indeed exist, but they are highly nonstandard, ranging from global rotating solutions of Einstein’s field equations to unconventional topological structures for the whole Universe. Much more methodological and applied research is needed in this area, but the question will most probably remain unsolved at least until the first releases of Planck data are available in a few years’ time. For now, it is fair to say that the vast majority of cosmologists still adhere to the isotropy assumption, and this is what we shall do in the present paper. Some of the procedures we shall consider in Section 4 for testing non-Gaussianity, however, are known to have power also against nonisotropic behavior; see, for instance, the local curvature approach below.

We shall hence focus on the statistical analysis of isotropic random fields. Throughout this paper we shall assume that the CMB random field is mean-square continuous, as is always done in the CMB literature. Under the previous assumptions, the following spectral representation holds, in the mean square sense:

$$T(x) = \sum_{l=0}^{\infty}\sum_{m=-l}^{l} a_{lm}Y_{lm}(x), \tag{2}$$

where

$$a_{lm} = \int_{S^2} T(x)\,\overline{Y_{lm}(x)}\,dx. \tag{3}$$

Here, the bar denotes complex conjugation and {Y_lm(·)} the spherical harmonics, which form an orthonormal system for L² functions on the sphere. Some explicit expressions for the spherical harmonics can be found in the Appendix; a much more complete treatment can be found elsewhere; see Varshalovich, Moskalev and Khersonskii (1988). For l = m = 0, since Y₀₀ ≡ (4π)^{−1/2}, we have $a_{00} = (4\pi)^{-1/2}\int_{S^2} T(x)\,dx$, that is, the first coefficient is $\sqrt{4\pi}$ times the sample mean of the random field. This value can be subtracted from T(x), whence we can take the expansion to start from l = 1; indeed, in practice, in the cosmological literature the coefficients corresponding to l = 1 are also discarded (the so-called dipole terms), as they have no cosmological meaning: they simply reflect the motion of the Earth with respect to the frame of reference in which CMB radiation is at rest. For l ≥ 2, the triangular array {a_lm} represents zero-mean, complex-valued random coefficients, with variance E|a_lm|² = C_l > 0, the angular power spectrum of the random field. The coefficients are uncorrelated, $E a_{l_1m_1}\overline{a_{l_2m_2}} = C_{l_1}\delta_{l_1}^{l_2}\delta_{m_1}^{m_2}$, and, hence, in the Gaussian case they are independent [note, however, that $\overline{a_{lm}} = (-1)^{m}a_{l,-m}$]. We have the identity

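$$E\Big\{\sum_{l=2}^{\infty}\sum_{m=-l}^{l} a_{lm}Y_{lm}(x)\Big\}^{2} = \sum_{l=2}^{\infty}\sum_{m=-l}^{l} E|a_{lm}|^{2}\,|Y_{lm}(x)|^{2} = \sum_{l=2}^{\infty} C_l \sum_{m=-l}^{l} |Y_{lm}(x)|^{2} = \sum_{l=2}^{\infty} C_l\,\frac{2l+1}{4\pi},$$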

in view of a standard summation formula for spherical harmonics [Varshalovich, Moskalev and Khersonskii (1988)]. It follows immediately that C_l(2l + 1) must be summable to ensure finite variance. The angular power spectrum in the Gaussian case provides a complete characterization of the dependence structure of the random field; to its estimation from CMB data we now turn our attention.

3. Angular power spectrum estimation.

3.1. Power spectrum estimation under idealized circumstances. As noted before, having observed the random field T(x), the coefficients {a_lm} can be recovered by means of the inverse Fourier transform (3). In practice, with real data the integral is replaced by finite sums by means of (exact or approximate) cubature formulae, which are implemented in standard packages for CMB data analysis such as HealPix or GLESP [see Gorski et al. (2005), Doroshkevich et al. (2005)]. The angular power spectrum can then be estimated by

$$\widehat{C}_l = \frac{1}{2l+1}\sum_{m=-l}^{l} |a_{lm}|^{2}. \tag{4}$$

This simple estimator highlights a very important issue when dealing with CMB data. It is indeed readily seen that the estimator is consistent in the Gaussian case, as l → ∞; more precisely,

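$$E\widehat{C}_l = C_l, \qquad E\Big\{\frac{\widehat{C}_l}{C_l} - 1\Big\}^{2} = \frac{1}{(2l+1)^{2}}\,E\Big[\Big\{\frac{a_{l0}^{2}}{C_l} - 1\Big\} + 2\sum_{m=1}^{l}\Big\{\frac{|a_{lm}|^{2}}{C_l} - 1\Big\}\Big]^{2} = \frac{2}{2l+1} = o(1),$$

because $a_{l0}^{2}/C_l \sim \chi^{2}_{1}$ and, for m = 1, . . . , l, $2|a_{lm}|^{2}/C_l \sim$ i.i.d. $\chi^{2}_{2}$, where $\chi^{2}_{n}$ denotes a standard chi-square random variable with n degrees of freedom.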
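The relative mean square error 2/(2l + 1) of the estimator (4) can be checked by simulation. The following is a minimal sketch under the Gaussian assumption, using only the Python standard library; the function names are hypothetical. It draws a_l0 as a real N(0, C_l) variable and a_lm, m = 1, . . . , l, as complex Gaussians whose real and imaginary parts have variance C_l/2, which is the standard way of encoding the constraint that T is real-valued.

```python
import random


def simulate_alm(l, cl, rng):
    """Draw (a_l0, [a_l1, ..., a_ll]) for a real Gaussian isotropic field."""
    a0 = rng.gauss(0.0, cl ** 0.5)
    s = (cl / 2.0) ** 0.5
    am = [complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) for _ in range(l)]
    return a0, am


def cl_hat(a0, am):
    """Estimator (4); the m < 0 terms equal the m > 0 terms in modulus."""
    l = len(am)
    return (a0 * a0 + 2.0 * sum(abs(a) ** 2 for a in am)) / (2 * l + 1)


def relative_mse(l, cl, n_sims, seed=0):
    """Monte Carlo estimate of E{(C_hat_l / C_l - 1)^2}."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_sims):
        a0, am = simulate_alm(l, cl, rng)
        total += (cl_hat(a0, am) / cl - 1.0) ** 2
    return total / n_sims
```

With l = 10 the simulated value should be close to 2/21, whatever the value chosen for C_l.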

In the Gaussian case with fully observed maps, the issue of angular power spectrum estimation can thus be considered trivial, and indeed, the previous expressions not only ensure consistency but also provide exact confidence intervals: it is immediate to see that

$$\sum_{m=-l}^{l} |a_{lm}|^{2} = \Big\{|a_{l0}|^{2} + \sum_{m=1}^{l} 2|a_{lm}|^{2}\Big\} \stackrel{d}{=} C_l \times \chi^{2}_{2l+1}.$$
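Since (2l + 1)Ĉ_l / C_l is chi-square with 2l + 1 degrees of freedom, a confidence interval for C_l follows by dividing (2l + 1)Ĉ_l by the upper and lower chi-square quantiles. The sketch below illustrates this; to stay within the Python standard library it replaces the exact quantiles by the Wilson–Hilferty approximation, which is an assumption of this example and not part of the text (the function names are hypothetical as well). With exact quantiles the interval would be exact in the Gaussian, full-sky case.

```python
from statistics import NormalDist


def chi2_quantile(k, prob):
    """Wilson-Hilferty approximation to the chi-square quantile, k d.o.f."""
    z = NormalDist().inv_cdf(prob)
    c = 2.0 / (9.0 * k)
    return k * (1.0 - c + z * c ** 0.5) ** 3


def cl_confidence_interval(cl_hat, l, level=0.95):
    """Interval for C_l based on (2l+1) * cl_hat / C_l ~ chi-square(2l+1)."""
    k = 2 * l + 1
    alpha = 1.0 - level
    upper_q = chi2_quantile(k, 1.0 - alpha / 2.0)
    lower_q = chi2_quantile(k, alpha / 2.0)
    return k * cl_hat / upper_q, k * cl_hat / lower_q
```

Note how wide the interval is at low multipoles: for l = 10 the upper end is roughly twice the point estimate, which is the well-known cosmic variance limitation.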

However, we must stress that these results rely heavily on the Gaussian assumption. Indeed, Baldi and Marinucci (2007) and Baldi, Marinucci and Varadarajan (2007) have shown that under isotropy the coefficients a_lm can only be independent in the Gaussian case, even though they are always uncorrelated by construction: in other words, sampling independent, non-Gaussian random coefficients to generate maps according to (2) will always yield an anisotropic random field. The dependence structure of the coefficients {a_lm} is in general quite complicated, although it can be very nicely characterized in terms of group representation properties for SO(3) [Marinucci and Peccati (2007)]. In view of this, deriving any asymptotic result for $\widehat{C}_l$ under non-Gaussianity is by no means trivial; indeed, even the possible consistency (as l → ∞) of the estimator (4) in non-Gaussian circumstances is still an open issue for research.

3.2. Dealing with instrumental noise. We shall now try to make our analysis more realistic by considering the effect of noise and missing observations. Starting from the former, we shall consider the case where we observe O(x) := T (x) + N (x), N (x) denoting instrumental noise; for simplicity, we shall follow the cosmological literature, assuming N (x) to be also a zero mean, mean square continuous and isotropic random field on the sphere.

Whereas the assumptions of zero-mean and mean square continuity are basically immaterial, isotropy of the noise may need to be relaxed if the sky is unevenly observed. We shall also assume that T (x) and N (x) are independent. Performing the spherical harmonic transform, we obtain, in an obvious notation,

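$$a_{lm} = \int_{S^2}\{T(x) + N(x)\}\,\overline{Y_{lm}(x)}\,dx =: a^{T}_{lm} + a^{N}_{lm},$$

which leads to

$$\widehat{C}_l = \frac{1}{2l+1}\Big[\sum_{m=-l}^{l} |a^{T}_{lm}|^{2} + \sum_{m=-l}^{l} |a^{N}_{lm}|^{2} + 2\,\mathrm{Re}\Big\{\sum_{m=-l}^{l} a^{T}_{lm}\,\overline{a^{N}_{lm}}\Big\}\Big].$$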

It is immediate to see that the resulting estimator is biased, $E\widehat{C}_l = C_l^{T} + C_l^{N}$; the variance is easily seen to be given by

$$\mathrm{Var}\{\widehat{C}_l\} = \frac{2\{C_l^{T} + C_l^{N}\}^{2}}{2l+1}. \tag{5}$$
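The bias and the variance (5) of the noisy estimator can be illustrated with a small Monte Carlo experiment. The sketch below assumes Gaussian, isotropic and independent signal and noise, as in the text; the function names and the numerical values of the spectra are hypothetical.

```python
import random


def draw_coeffs(l, power, rng):
    """Coefficients a_l0, ..., a_ll of a real Gaussian field with C_l = power."""
    s = (power / 2.0) ** 0.5
    first = complex(rng.gauss(0.0, power ** 0.5), 0.0)
    rest = [complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) for _ in range(l)]
    return [first] + rest


def spectrum(coeffs):
    """Estimator (4) computed from the m >= 0 coefficients."""
    l = len(coeffs) - 1
    total = abs(coeffs[0]) ** 2 + 2.0 * sum(abs(a) ** 2 for a in coeffs[1:])
    return total / (2 * l + 1)


def noisy_cl_moments(l, cl_t, cl_n, n_sims, seed=0):
    """Sample mean and variance of C_hat_l computed on signal plus noise."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_sims):
        signal = draw_coeffs(l, cl_t, rng)
        noise = draw_coeffs(l, cl_n, rng)
        values.append(spectrum([a + b for a, b in zip(signal, noise)]))
    mean = sum(values) / n_sims
    var = sum((v - mean) ** 2 for v in values) / (n_sims - 1)
    return mean, var
```

With C_l^T = 1 and C_l^N = 0.5 at l = 10, the sample mean should be close to 1.5 (signal plus noise) and the sample variance close to 2(1.5)²/21.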

In the cosmological literature, the standard procedure to address this bias is to assume that the noise correlation structure can be derived by Monte Carlo simulations or instrumental calibration; under this assumption, it is possible to subtract the bias from $\widehat{C}_l$ and obtain an unbiased estimator with variance (5). An obvious question is then whether the assumption that $C_l^{N}$ is known introduces some spurious effect into the analysis (namely, some unaccounted bias). A proposal to test this was put forward by Polenta et al. (2005). To understand this idea, we must get back to the multi-channel setting, where we observe

$$O_i(x) := T(x) + N_i(x), \qquad i = 1, \ldots, p,$$

which in the harmonic domain leads to

$$a_{i;lm} := a^{T}_{lm} + a^{N_i}_{lm}.$$

Note that the temperature component of the random spherical harmonics coefficients does not depend on the observing channel. We assume that the noise is independent over channels, which is believed to be consistent with the actual experimental set-ups of current datasets. Testing noise correlation across different channels is yet another open challenge for research. For a given noise structure, an obvious estimator for C_l is

$$\widetilde{C}_l^{A} := \frac{1}{p}\sum_{i=1}^{p}\{\widehat{C}_{il} - C_l^{N_i}\}, \qquad \widehat{C}_{il} := \frac{1}{2l+1}\sum_{m=-l}^{l} |a_{i;lm}|^{2}. \tag{6}$$

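The estimator $\widetilde{C}_l^{A}$ is known in the literature as the auto-power spectrum. Simple computations yield [Polenta et al. (2005)]

$$E\widetilde{C}_l^{A} = C_l, \qquad \mathrm{Var}\{\widetilde{C}_l^{A}\} = \frac{2}{2l+1}\Big\{C_l^{2} + \frac{2C_l}{p^{2}}\sum_{i=1}^{p} C_l^{N_i} + \frac{1}{p^{4}}\sum_{i,j=1}^{p} C_l^{N_i}C_l^{N_j}\Big\}.$$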

Of course, the natural question that arises at this stage is the possible existence of misspecification, that is, some errors in the bias-correction term $C_l^{N_i}$. A solution for this issue was proposed by Polenta et al. (2005). The idea is to focus on the cross-power spectrum estimator

$$\widetilde{C}_l^{CP} = \frac{2}{p(p-1)}\sum_{i=1}^{p-1}\sum_{j=i+1}^{p}\Big\{\frac{1}{2l+1}\sum_{m=-l}^{l} a_{i;lm}\,\overline{a_{j;lm}}\Big\}.$$

The underlying rationale for $\widetilde{C}_l^{CP}$ is easy to gather: under the assumption that noise is independent across different channels, the estimator is unbiased, regardless of the value of the $C_l^{N_i}$. More precisely,

$$E\widetilde{C}_l^{CP} = \frac{2}{p(p-1)}\sum_{i=1}^{p-1}\sum_{j=i+1}^{p}\Big\{\frac{1}{2l+1}\sum_{m=-l}^{l} E\,(a^{T}_{lm} + a^{N_i}_{lm})\,\overline{(a^{T}_{lm} + a^{N_j}_{lm})}\Big\} = \frac{2}{p(p-1)}\sum_{i=1}^{p-1}\sum_{j=i+1}^{p} C_l = C_l.$$

Similar manipulations yield

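$$\mathrm{Var}\{\widetilde{C}_l^{CP}\} = \frac{2}{2l+1}\Big\{C_l^{2} + \frac{2C_l}{p^{2}}\sum_{i=1}^{p} C_l^{N_i} + \frac{2}{p^{2}(p-1)^{2}}\sum_{i=1}^{p-1}\sum_{j=i+1}^{p} C_l^{N_i}C_l^{N_j}\Big\}.$$

Merely for notational simplicity, we now assume that the noise variance is constant across detectors, that is, $C_l^{N_i} = C_l^{N}$ for all i. It is then readily seen that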

$$\mathrm{Var}\{\widetilde{C}_l^{CP}\} - \mathrm{Var}\{\widetilde{C}_l^{A}\} = \frac{2}{2l+1}\Big\{\frac{1}{p^{2}(p-1)}\,(C_l^{N})^{2}\Big\}.$$

More explicitly, the auto-power spectrum estimator is more efficient than the cross-power spectrum; however, the latter is robust to noise misspecification.
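The robustness of the cross-power spectrum can be illustrated by simulation. The sketch below generates p noisy channels sharing the same Gaussian signal, computes the auto-power spectrum as in (6) with a possibly wrong noise level, and computes the cross-power spectrum from all channel pairs. Function names and numerical values are hypothetical; noise is assumed Gaussian and independent across channels, as in the text.

```python
import random


def draw_coeffs(l, power, rng):
    """Coefficients a_l0, ..., a_ll of a real Gaussian field with C_l = power."""
    s = (power / 2.0) ** 0.5
    first = complex(rng.gauss(0.0, power ** 0.5), 0.0)
    return [first] + [complex(rng.gauss(0.0, s), rng.gauss(0.0, s))
                      for _ in range(l)]


def cross_spectrum(a, b):
    """(2l+1)^{-1} * sum over m = -l..l of a_lm conj(b_lm), from m >= 0 terms."""
    l = len(a) - 1
    total = (a[0] * b[0].conjugate()).real
    total += 2.0 * sum((x * y.conjugate()).real for x, y in zip(a[1:], b[1:]))
    return total / (2 * l + 1)


def auto_and_cross(l, cl, noise_true, noise_assumed, rng):
    """One realization of the auto-power (6) and cross-power estimators."""
    p = len(noise_true)
    signal = draw_coeffs(l, cl, rng)
    chans = [[s + n for s, n in zip(signal, draw_coeffs(l, nv, rng))]
             for nv in noise_true]
    auto = sum(cross_spectrum(c, c) - na
               for c, na in zip(chans, noise_assumed)) / p
    pairs = [(i, j) for i in range(p - 1) for j in range(i + 1, p)]
    cross = 2.0 / (p * (p - 1)) * sum(
        cross_spectrum(chans[i], chans[j]) for i, j in pairs)
    return auto, cross


def mean_estimates(n_sims, l, cl, noise_true, noise_assumed, seed=0):
    """Monte Carlo means of the two estimators."""
    rng = random.Random(seed)
    sum_auto = sum_cross = 0.0
    for _ in range(n_sims):
        auto, cross = auto_and_cross(l, cl, noise_true, noise_assumed, rng)
        sum_auto += auto
        sum_cross += cross
    return sum_auto / n_sims, sum_cross / n_sims
```

If one of three channels has true noise power 0.8 while 0.5 is assumed, the auto-power spectrum is biased upward by 0.3/3 = 0.1, whereas the cross-power spectrum stays centered on C_l.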

The trade-off between an efficient estimator and a robust one is the classical setting which makes the implementation of a Hausman-type test for misspecification feasible [Hausman (1978)]. Indeed, it is possible to consider the statistic

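$$H_l = [\mathrm{Var}\{\widetilde{C}_l^{CP} - \widetilde{C}_l^{A}\}]^{-1/2}\,\{\widetilde{C}_l^{CP} - \widetilde{C}_l^{A}\},$$

$$\mathrm{Var}\{\widetilde{C}_l^{CP} - \widetilde{C}_l^{A}\} = \frac{2}{2l+1}\Big\{\frac{1}{p^{4}}\sum_{i=1}^{p}\{C_l^{N_i}\}^{2} + \frac{2}{p^{4}(p-1)^{2}}\sum_{i=1}^{p-1}\sum_{j=i+1}^{p} C_l^{N_i}C_l^{N_j}\Big\}.$$

Under the null of exact bias correction, it is readily seen that $H_l \to_d N(0, 1)$, as l → ∞. On the other hand, in the presence of misspecification, that is, when the actual noise variance is equal to $C_l^{N_i} + \delta$ for some i, δ > 0, then we expect $EH_l$ to diverge with rate $\sqrt{l}\,\delta$ as l → ∞.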
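A minimal simulation sketch of the Hausman-type statistic follows. Two assumptions of this example should be stressed. First, the 2l + 1 real components (a_l0, √2 Re a_lm, √2 Im a_lm) are simulated directly as i.i.d. Gaussians, which is equivalent to working with the complex coefficients in the Gaussian case. Second, the auto-power spectrum is computed on the channel-averaged coefficients, with bias correction equal to the sum of the assumed noise spectra divided by p²; this is the variant for which the normalizing variance used below was checked, and it may differ from other conventions. Function names and numerical values are hypothetical.

```python
import math
import random


def hausman_variance(l, noise):
    """Var of (cross - auto) under the null, for noise spectra C_l^{N_i}."""
    p = len(noise)
    diag = sum(n * n for n in noise) / p ** 4
    off = sum(noise[i] * noise[j]
              for i in range(p - 1) for j in range(i + 1, p))
    return 2.0 / (2 * l + 1) * (diag + 2.0 * off / (p ** 4 * (p - 1) ** 2))


def hausman_statistic(l, cl, noise_true, noise_assumed, rng):
    """One realization of H_l at multipole l."""
    p = len(noise_true)
    k = 2 * l + 1
    signal = [rng.gauss(0.0, math.sqrt(cl)) for _ in range(k)]
    chans = [[s + rng.gauss(0.0, math.sqrt(nv)) for s in signal]
             for nv in noise_true]
    # Auto-power spectrum of the channel-averaged map, bias-corrected
    mean_map = [sum(c[m] for c in chans) / p for m in range(k)]
    auto = sum(x * x for x in mean_map) / k - sum(noise_assumed) / p ** 2
    # Cross-power spectrum averaged over all channel pairs
    cross = 2.0 / (p * (p - 1)) * sum(
        sum(a * b for a, b in zip(chans[i], chans[j])) / k
        for i in range(p - 1) for j in range(i + 1, p))
    return (cross - auto) / math.sqrt(hausman_variance(l, noise_assumed))
```

Under correct noise specification the simulated values of H_l should have mean close to 0 and variance close to 1; when one channel is noisier than assumed, the auto-power spectrum is biased upward and H_l shifts markedly away from zero.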

It is also possible to consider a functional form of the same test, focusing on

$$B_L(r) := \frac{1}{\sqrt{L}}\sum_{l=1}^{[Lr]} H_l, \qquad r \in [0, 1].$$

It is standard to show that B_L(r) converges weakly to a standard Brownian motion, as L → ∞. A test for noise misspecification can then be constructed along the lines of standard Kolmogorov–Smirnov or Cramér–von Mises statistics. We refer again to Polenta et al. (2005) for a much more detailed discussion and an extensive simulation study.
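The functional form of the test is simple to compute once the statistics H_l are available. The sketch below evaluates B_L(r) on the grid r = k/L and derives Kolmogorov–Smirnov-type and Cramér–von Mises-type functionals of the path; the function names are hypothetical, and critical values (those of the corresponding functionals of Brownian motion) are not included.

```python
import math


def partial_sum_process(h):
    """Return [B_L(k/L) for k = 0, ..., L], where h = [H_1, ..., H_L]."""
    scale = math.sqrt(len(h))
    path = [0.0]
    running = 0.0
    for value in h:
        running += value
        path.append(running / scale)
    return path


def ks_type_statistic(h):
    """Supremum of |B_L(r)| over the grid r = k/L."""
    return max(abs(b) for b in partial_sum_process(h))


def cvm_type_statistic(h):
    """Riemann-sum approximation of the integral of B_L(r)^2 over [0, 1]."""
    path = partial_sum_process(h)
    return sum(b * b for b in path[1:]) / len(h)
```

For example, `ks_type_statistic([1.0, -1.0, 1.0, 1.0])` evaluates the path (0, 0.5, 0, 0.5, 1) and returns 1.0.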

The methods discussed above rely on a basic identification assumption, that is, the condition that instrumental noise be independent across different channels. This is an assumption which is commonly entertained in the cosmological literature; suitable statistical procedures to test its validity are still lacking and represent an open issue for research. A more challenging research task was mentioned before: the previous discussion was entirely carried out under the assumption that the CMB field (and thus the corresponding spherical harmonic coefficients) is Gaussian. It is very important to stress that relaxing this assumption has much deeper consequences here than is usually the case in statistical inference. Indeed, it follows from results in Baldi and Marinucci (2007) that if the field is isotropic, the coefficients {a_lm} cannot be independent unless they are Gaussian. It follows that even the simple consistency (as l → ∞) of the estimator $\widetilde{C}_l$ remains an open issue to address, in general non-Gaussian circumstances. We shall not go further into this issue here, but we rather focus on another important feature of realistic datasets: the presence of unobserved regions, which make the exact evaluation of the inverse Fourier transform (3) unfeasible.

3.3. Missing observations. The presence of missing observations, that is, regions of the sky where the CMB is deeply contaminated by astrophysical foregrounds, poses serious challenges to angular power spectrum estimation. The first consequence is that the sample spherical harmonic coefficients

$$a^{M}_{lm} = \int_{S^2\setminus M} T(x)\,\overline{Y_{lm}(x)}\,dx$$

lose their uncorrelation properties (here, M denotes the unobserved region and, for notational simplicity, we come back to the case of a single detector with no instrumental noise). Indeed, we have

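$$\begin{aligned}
E\,a^{M}_{l_1m_1}\overline{a^{M}_{l_2m_2}} &= E\Big\{\int_{S^2\setminus M} T(x)\,\overline{Y_{l_1m_1}(x)}\,dx \int_{S^2\setminus M} T(y)\,Y_{l_2m_2}(y)\,dy\Big\} \\
&= \sum_{lm}\sum_{l'm'} E\,a_{lm}\overline{a_{l'm'}} \int_{S^2\setminus M} Y_{lm}(x)\,\overline{Y_{l_1m_1}(x)}\,dx \times \int_{S^2\setminus M} \overline{Y_{l'm'}(y)}\,Y_{l_2m_2}(y)\,dy \\
&= \sum_{lm} C_l\,W_{lml_1m_1}\overline{W_{lml_2m_2}},
\end{aligned} \tag{7}$$

where

$$W_{lml_1m_1} := \int_{S^2\setminus M} Y_{lm}(x)\,\overline{Y_{l_1m_1}(x)}\,dx.$$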

In case the spherical random field is fully observed, then M = ∅ (the empty set) and by standard orthonormality properties of the spherical harmonics Y_lm, we obtain $W_{lml_1m_1} = \delta_{l}^{l_1}\delta_{m}^{m_1}$ and, therefore, $E\,a_{l_1m_1}\overline{a_{l_2m_2}} = C_{l_1}\delta_{l_1}^{l_2}\delta_{m_1}^{m_2}$. In the presence of missing observations, the random coefficients are no longer uncorrelated, either over l or over m. In the physical literature the values of $\{W_{lml_1m_1}\}$ are computed numerically, exploiting the a priori knowledge of the geometry of the unobserved regions; the resulting coupling matrices can then be used to deconvolve the estimated values $\widehat{C}_l$, a procedure which has become extremely popular under the name of MASTER [see Hivon et al. (2002) for details]. In practice, it is not possible to identify by this method the value of the angular power spectrum at every single multipole l; it is then customary to proceed with binning techniques, where the values of C_l at nearby frequencies are averaged and only these smoothed values are actually estimated. Plots of the estimates of the C_l derived along these lines can be found, for instance, on the web site of WMAP; a comparison with angular power spectrum estimates from several other experiments (based upon smaller patches of the observed sky) is also provided there.
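The binning step mentioned above can be sketched as follows. This example uses flat, unweighted averages over contiguous multipole ranges, which is an assumption made for simplicity; actual pipelines may weight multipoles differently, and the deconvolution by the coupling matrix is not shown. The function name is hypothetical.

```python
def bin_spectrum(cl, edges):
    """Average cl[l] over the bins edges[i] <= l < edges[i+1].

    cl is a list indexed by the multipole l; the result is a list of
    (bin center, binned value) pairs.
    """
    binned = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        values = cl[lo:hi]
        center = 0.5 * (lo + hi - 1)
        binned.append((center, sum(values) / len(values)))
    return binned
```

For instance, with bin edges (2, 5, 8) the multipoles 2 to 4 and 5 to 7 are averaged separately, so that only two smoothed values are estimated instead of six.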

The previous procedures can be computationally extremely demanding and we would like here to introduce an alternative strategy, which was basically put forward in Baldi et al. (2006). The idea is to implement power spectrum estimation by means of new kinds of spherical wavelets, called needlets [see also Narcowich, Petrushev and Ward (2006a), Narcowich, Petrushev and Ward (2006b), Marinucci et al. (2008) and Baldi et al. (2007)]. Needlets can be described as a convolution of the spherical harmonics basis by means of a suitable kernel function b(·); more precisely, the general element of the needlet frame can be written down as

$$\psi_{jk}(x) = \sqrt{\lambda_{jk}}\sum_{l=B^{j-1}}^{B^{j+1}} b\Big(\frac{l}{B^{j}}\Big)\sum_{m=-l}^{l}\overline{Y_{lm}(x)}\,Y_{lm}(\xi_{jk}),$$

where {ξ_jk} denotes a set of grid points on the sphere, B > 1 is a bandwidth parameter, and b(·) is a compactly supported, infinitely differentiable function which satisfies the partition-of-unity property, that is,

$$\sum_{j} b^{2}\Big(\frac{l}{B^{j}}\Big) \equiv 1 \quad \text{for all } l \geq 1, \tag{8}$$

and {λ_jk, ξ_jk} (the cubature weights and cubature points) can be chosen in such a way that

$$\sum_{k} Y_{l_1m_1}(\xi_{jk})\,\overline{Y_{l_2m_2}(\xi_{jk})}\,\lambda_{jk} = \int_{S^2} Y_{l_1m_1}(x)\,\overline{Y_{l_2m_2}(x)}\,dx = \delta_{l_1}^{l_2}\delta_{m_1}^{m_2}.$$
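To make the partition-of-unity property (8) concrete, the sketch below builds one possible window b(·). The text does not specify b; the recipe used here (a normalized primitive ψ of the bump function exp(−1/(1 − t²)), a profile φ equal to 1 on [0, 1/B] and decreasing smoothly to 0 at 1, and b²(ξ) = φ(ξ/B) − φ(ξ)) is a commonly used construction and should be read as an assumption of this example, as should all function names. The primitive is tabulated on a fixed grid and interpolated linearly, so the numerical window is only an approximation of the smooth one, but the telescoping sum in (8) holds regardless of that approximation.

```python
import math

_N = 2000            # number of cells used to tabulate the primitive
_H = 2.0 / _N


def _bump(t):
    """Smooth bump supported on (-1, 1)."""
    return math.exp(-1.0 / (1.0 - t * t)) if abs(t) < 1.0 else 0.0


# Cumulative midpoint-rule integral of the bump on [-1, 1]
_CUM = [0.0]
for _i in range(_N):
    _CUM.append(_CUM[-1] + _bump(-1.0 + (_i + 0.5) * _H) * _H)


def _psi(u):
    """Normalized primitive of the bump: 0 for u <= -1, 1 for u >= 1."""
    if u <= -1.0:
        return 0.0
    if u >= 1.0:
        return 1.0
    pos = (u + 1.0) / _H
    i = min(int(pos), _N - 1)
    frac = pos - i
    return (_CUM[i] + frac * (_CUM[i + 1] - _CUM[i])) / _CUM[-1]


def _phi(t, B):
    """Equal to 1 on [0, 1/B], decreasing to 0 on [1/B, 1], 0 beyond 1."""
    if t <= 1.0 / B:
        return 1.0
    if t >= 1.0:
        return 0.0
    return _psi(1.0 - 2.0 * B / (B - 1.0) * (t - 1.0 / B))


def b_squared(xi, B):
    """Window b^2, supported on (1/B, B)."""
    return max(0.0, _phi(xi / B, B) - _phi(xi, B))


def partition_sum(l, B, jmax):
    """Left-hand side of (8), summed over j = 0, ..., jmax."""
    return sum(b_squared(l / B ** j, B) for j in range(jmax + 1))
```

Provided jmax is large enough that B^jmax ≥ l, `partition_sum(l, B, jmax)` equals 1 up to rounding for every l ≥ 1, and `b_squared` vanishes outside (1/B, B), which reflects the compact support in the harmonic domain used in the sums above.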

More details on this construction and its underlying rationale can be found in Baldi et al. (2006) and are not reported here for brevity’s sake; see also Kerkyacharian et al. (2007) and Guilloux, Fay and Cardoso (2007) for further work in this area. The corresponding random needlet coefficients are provided by the analysis formula

$$\widehat{\beta}_{jk} = \int_{S^2} T(x)\,\psi_{jk}(x)\,dx = \sqrt{\lambda_{jk}}\sum_{l=B^{j-1}}^{B^{j+1}}\sum_{m=-l}^{l} b\Big(\frac{l}{B^{j}}\Big)\,a_{lm}Y_{lm}(\xi_{jk}),$$

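whereas the synthesis expression is given as

$$\begin{aligned}
\sum_{j,k}\beta_{jk}\psi_{jk}(x) &= \sum_{j}\sum_{l_1=B^{j-1}}^{B^{j+1}}\sum_{m_1=-l_1}^{l_1}\sum_{l_2=B^{j-1}}^{B^{j+1}}\sum_{m_2=-l_2}^{l_2} b\Big(\frac{l_1}{B^{j}}\Big)b\Big(\frac{l_2}{B^{j}}\Big)\,a_{l_1m_1}Y_{l_2m_2}(x)\sum_{k} Y_{l_1m_1}(\xi_{jk})\,\overline{Y_{l_2m_2}(\xi_{jk})}\,\lambda_{jk} \\
&= \sum_{j}\sum_{l_1=B^{j-1}}^{B^{j+1}}\sum_{m_1=-l_1}^{l_1}\sum_{l_2=B^{j-1}}^{B^{j+1}}\sum_{m_2=-l_2}^{l_2} b\Big(\frac{l_1}{B^{j}}\Big)b\Big(\frac{l_2}{B^{j}}\Big)\,a_{l_1m_1}Y_{l_2m_2}(x)\,\delta_{l_1}^{l_2}\delta_{m_1}^{m_2} \\
&= \sum_{j}\sum_{l=B^{j-1}}^{B^{j+1}} b^{2}\Big(\frac{l}{B^{j}}\Big)\sum_{m=-l}^{l} a_{lm}Y_{lm}(x) \\
&= \sum_{l=1}^{\infty}\sum_{m=-l}^{l} a_{lm}Y_{lm}(x) = T(x),
\end{aligned}$$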

using (8). For our purposes, it is sufficient to recall the main properties of the needlets construction:

• needlets enjoy excellent localization properties in the real domain, each ψ_jk(x) being quasi-exponentially localized around its center ξ_jk. As such, needlets coefficients have been shown to be minimally influenced by the presence of missing observations.

• the needlets system is compactly supported in the harmonic domain; as such, the random needlets coefficients are uncorrelated for j −j^′≥ 2. Much more surprisingly, the random needlets coefficients are asymptotically uncorrelated for any fixed angular distance, as the frequency j diverges to infinity. This property implies that (in the Gaussian case) it is possible to derive a growing array of asymptotically i.i.d. observations out of a single realization of an isotropic random field. This opens the way to a plethora of statistical procedures.


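In particular, it is possible to suggest the estimator

\[
\begin{aligned}
\widehat{\Gamma}_{j} := \sum_{k} \widehat{\beta}_{jk}^{2}
&= \sum_{k} \left( \sum_{l=B^{j-1}}^{B^{j+1}} \sum_{m=-l}^{l} \sqrt{\lambda_{jk}}\, b\!\left(\frac{l}{B^{j}}\right) a_{lm}\, Y_{lm}(\xi_{jk}) \right)^{2} \\
&= \sum_{l_1, l_2 = B^{j-1}}^{B^{j+1}} \sum_{m_1, m_2} b\!\left(\frac{l_1}{B^{j}}\right) b\!\left(\frac{l_2}{B^{j}}\right) a_{l_1 m_1}\, \overline{a}_{l_2 m_2}
\left( \sum_{k} \lambda_{jk}\, Y_{l_1 m_1}(\xi_{jk})\, \overline{Y}_{l_2 m_2}(\xi_{jk}) \right) \\
&= \sum_{l=B^{j-1}}^{B^{j+1}} b^{2}\!\left(\frac{l}{B^{j}}\right) \widehat{C}_{l}\, (2l+1),
\end{aligned}
\]

for which it is simple to show that

\[
E\,\widehat{\Gamma}_{j} = \sum_{l=B^{j-1}}^{B^{j+1}} b^{2}\!\left(\frac{l}{B^{j}}\right) C_{l}\, (2l+1). \tag{9}
\]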

Equation (9) shows that $\widehat{\Gamma}_j$ provides an unbiased estimator for a smoothed version of the angular power spectrum; the advantage with respect to the standard procedure is that not only unbiasedness, but even uncorrelatedness over different scales is asymptotically preserved in the presence of missing observations, making the implementation of confidence intervals and testing procedures viable [see again Baldi et al. (2006) for details]. Also, even in the presence of a masked region, the summands $\{\widehat{\beta}_{jk}\}$ are still asymptotically independent (over k) as j → ∞, whereas we have seen in (7) that this is not the case for the random coefficients $\{a_{lm}\}$. The price for such robustness properties is clearly connected to the smoothing: in the presence of missing observations it turns out to be infeasible to estimate each angular power spectrum mode C_l by itself, and one must settle for a slightly less ambitious goal, namely the estimation of joint values averaged over some subset of frequencies (chosen by the data analyst). There is, of course, a standard trade-off in the choice of the bandwidth parameter B: values closer to unity entail a much better resolution in the harmonic domain, but this brings about worse localization properties on the sphere and therefore a possibly higher contamination from spurious observations; on the other hand, higher values of B yield more robust, but less informative estimates.
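The last expression for the estimator, a window-weighted sum of $(2l+1)\widehat{C}_l$ over the band $B^{j-1} \le l \le B^{j+1}$, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the triangular window `hat_window_sq` is a hypothetical stand-in for b², chosen because it satisfies (8) with a one-line formula (it is not infinitely differentiable, unlike a genuine needlet window), and the input spectrum is a made-up example.

```python
import math

def hat_window_sq(xi, B):
    """Hypothetical stand-in for b^2: a triangular hat in log_B(xi) on (1/B, B).

    It satisfies the partition of unity (8) but is not smooth.
    """
    if xi <= 0.0:
        return 0.0
    return max(0.0, 1.0 - abs(math.log(xi) / math.log(B)))

def gamma_hat(c_hat, j, B, window_sq=hat_window_sq):
    """Band power sum_l b^2(l / B^j) * C_hat_l * (2l + 1) at needlet frequency j.

    c_hat[l] holds the estimate for multipole l; c_hat[0] is ignored.
    """
    l_min = max(1, math.ceil(B ** (j - 1)))
    l_max = min(len(c_hat) - 1, math.floor(B ** (j + 1)))
    return sum(
        window_sq(l / B ** j, B) * c_hat[l] * (2 * l + 1)
        for l in range(l_min, l_max + 1)
    )

if __name__ == "__main__":
    B, l_top = 2.0, 32
    # Made-up input spectrum, used only to exercise the function.
    c_hat = [0.0] + [1.0 / (l * (l + 1)) for l in range(1, l_top + 1)]
    bands = [gamma_hat(c_hat, j, B) for j in range(0, 7)]
    print(bands)
    # By the partition of unity, the bands add up to sum_l (2l + 1) C_hat_l.
    print(sum(bands), sum((2 * l + 1) * c_hat[l] for l in range(1, l_top + 1)))
```

Reducing B narrows each band, which gives finer resolution in l; increasing B pools more multipoles into each band, in line with the trade-off described above.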

Spherical wavelets in general, and needlets in particular, allow for many statistical applications, which go well beyond angular power spectrum estimation. One example is the analysis of cross-correlation between CMB and Large Scale Structure (LSS) maps; this is a key prediction of many cosmological models entailing some form of dark energy and has been implemented on real data by Pietrobon, Balbi and Marinucci (2006). Other applications may include testing for non-Gaussianity and isotropy, bootstrap/subsampling evaluation of confidence intervals for CMB statistics [Baldi et al. (2007)], component separation and many others. Given such a wide array of applications, we stress the need for a more careful analysis of their theoretical underpinnings, with special reference to the effect of the Gaussianity assumption on our conclusions. This and many other related issues are left as topics for further research.

3.4. Parameter estimation. In this paper we shall neglect almost completely another crucial issue in CMB data analysis, which is very tightly coupled to the estimation of the angular power spectrum, that is, cosmological parameter estimation. More precisely, the theoretical angular power spectrum can be written as a function of a number of cosmological parameters, such as the baryon, matter and dark energy densities Ω_b, Ω_m, Ω_Λ, the optical depth τ, the spectral index n_s, the Hubble constant H₀ and others; of course, the number of parameters to be estimated varies across different cosmological models, typically ranging from 6 to 16; see again Dodelson (2003) for more details. There are no known closed-form expressions yielding the theoretical angular power spectrum C_l as an explicit function of these parameters (which we write for brevity as ϑ); however, there are very fast numerical routines which solve the associated partial differential equations and provide C_l as an output once a specific value of ϑ has been supplied [see Seljak and Zaldarriaga (1996)].

Once the set of estimated values $\widehat{C}_l$ has also been derived, there are basically two approaches that have been implemented to obtain estimates for the set of parameters: some form of minimum distance estimator, where the parameters are calibrated to minimize a weighted distance between $C_l(\vartheta)$ and $\widehat{C}_l$, and approximate maximum likelihood methods, where suitable approximations for the likelihood function are derived and then maximized. In practice, both methods are implemented by means of a heavy use of numerical techniques (especially MCMC), and a lively debate is growing on the construction of the most efficient algorithms. Likewise, an extensive discussion is growing on the construction of confidence intervals for the parameters, where fundamental issues such as the differences between Bayesian and frequentist viewpoints are often called upon (the distinction between these two approaches is not perceived in the cosmological community in the same manner as in the statistical one; just to give an example, maximum likelihood estimates are nearly unanimously labeled a Bayesian procedure in the CMB literature).
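To make the minimum distance idea concrete, here is a deliberately small Python sketch. Everything specific in it is an assumption made for illustration: `toy_spectrum` is a hypothetical two-parameter stand-in for the numerically computed $C_l(\vartheta)$ (it is not a physical model), the weights are the inverse of the full-sky Gaussian variance $2C_l^2/(2l+1)$ with $C_l$ replaced by its estimate, and a brute-force grid search replaces the MCMC machinery used in practice.

```python
def toy_spectrum(l, amplitude, tilt):
    """Hypothetical stand-in for C_l(theta): amplitude * l^(tilt - 1) / (l (l + 1))."""
    return amplitude * l ** (tilt - 1.0) / (l * (l + 1.0))

def weighted_distance(c_hat, theta, l_values):
    """Sum over l of (2l + 1) / (2 C_hat_l^2) * (C_hat_l - C_l(theta))^2."""
    amplitude, tilt = theta
    return sum(
        (2 * l + 1) / (2.0 * c_hat[l] ** 2)
        * (c_hat[l] - toy_spectrum(l, amplitude, tilt)) ** 2
        for l in l_values
    )

def minimum_distance_fit(c_hat, l_values, amplitudes, tilts):
    """Return the grid point (amplitude, tilt) minimising the weighted distance."""
    return min(
        ((a, n) for a in amplitudes for n in tilts),
        key=lambda theta: weighted_distance(c_hat, theta, l_values),
    )

if __name__ == "__main__":
    amplitudes = [0.5 + 0.1 * i for i in range(11)]
    tilts = [0.90 + 0.01 * i for i in range(11)]
    truth = (amplitudes[5], tilts[6])
    l_values = range(2, 51)
    # Noise-free "estimates" generated from the toy model itself.
    c_hat = {l: toy_spectrum(l, *truth) for l in l_values}
    print(minimum_distance_fit(c_hat, l_values, amplitudes, tilts), truth)
```

With realistic parameter counts (6 to 16, as noted above) a grid becomes impractical, which is one reason sampling-based methods are preferred.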

For brevity’s sake, we are unable to go deeper into these issues, which are still quite far from a satisfactory solution. We refer, for instance, to Hamann and Wong (2008) and the references therein for more discussion and recent proposals in this area.


4. Testing for non-Gaussianity. Among the several statistical issues which arise in connection with the analysis of Cosmic Microwave Background radiation, a lot of attention has been drawn by non-Gaussianity tests. These tests have several motivations. The first is connected to the need for a statistical validation of the predictions of the so-called inflationary scenario, which is currently the leading candidate as a standard model for the Big Bang dynamics; see Dodelson (2003) for discussions and explanations. Under this label there exists an enormous variety of different physical models, the vast majority of which lead to expressions such as

\[
T(x) = T_G(x) + f_{NL}\,\{ T_G^{2}(x) - E\,T_G^{2}(x) \}, \tag{10}
\]

where T(x) denotes as before CMB, $T_G(x)$ is an underlying Gaussian field, $f_{NL}$ is a nonlinearity parameter and the units of measurement are such that the non-Gaussian part $T_G^2(x) - E\,T_G^2(x)$ is of the order of 10⁻⁴ to 10⁻⁵ times $T_G(x)$. Equation (10) should be viewed as a strong simplification, for several reasons: in particular, we are considering exclusively the primordial dynamics, thus neglecting later interactions through the gravitational potential; also, we are ruling out more complicated models, where higher order terms or multiple subordinating fields may be present; and, of course, we are neglecting a whole plethora of observational issues, where possible non-Gaussianities may be formed by secondary effects, such as the interactions of incoming photons at more recent epochs. Despite all these simplifying conditions, (10) does provide an extremely good guidance for features to be expected and, indeed, it makes up a benchmark model against which many procedures have been tested in the last few years. In particular, considerable attention has been drawn by the possibility of constraining the value of $f_{NL}$, as this depends on constants from fundamental physics [Bartolo et al. (2004)] and as such it allows one to probe many features of cosmological models.
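A minimal Python sketch of model (10) may help to fix ideas. It draws the Gaussian part independently at each pixel, which is a simplifying assumption (a real CMB map is spatially correlated), and it uses a nonlinearity far larger than the realistic range quoted above so that the effect is visible in a small sample. To first order in $f_{NL}$, the skewness of T under these assumptions is $6 f_{NL} \sigma$, where σ is the standard deviation of $T_G$; the function names are illustrative.

```python
import random

def simulate_local_model(n_pixels, f_nl, sigma=1.0, seed=0):
    """Draw T = T_G + f_nl * (T_G^2 - E T_G^2) with i.i.d. T_G ~ N(0, sigma^2).

    Independent pixels are an assumption made for simplicity only.
    """
    rng = random.Random(seed)
    return [
        g + f_nl * (g * g - sigma ** 2)
        for g in (rng.gauss(0.0, sigma) for _ in range(n_pixels))
    ]

def sample_skewness(values):
    """Third central sample moment divided by the 3/2 power of the second."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

if __name__ == "__main__":
    for f_nl in (0.0, 0.01, 0.05):
        sample = simulate_local_model(200_000, f_nl, seed=1)
        # Compare the sample skewness with the first-order value 6 * f_nl * sigma.
        print(f_nl, sample_skewness(sample), 6.0 * f_nl)
```

Since the sampling error of the skewness decays only like the inverse square root of the number of independent observations, a nonlinear term as small as the one described above is very hard to detect from pixel-level skewness alone.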

Among several statistical procedures which have been proposed in the literature, we shall focus on three main families, namely, tests based upon the bispectrum, tests based upon geometric features of Gaussian random fields (local curvature) and tests based upon spherical wavelets (in this case, so-called Spherical Mexican Hat Wavelets).

4.1. The angular bispectrum. It is obvious that, under Gaussianity, the sequence $\{a_{lm}\}$, m = 0, ..., l, makes up an array of independent Gaussian random variables (complex-valued for m ≠ 0), so that a natural first option for a test of Gaussianity is to consider their sample skewness $a_{l_1 m_1} a_{l_2 m_2} a_{l_3 m_3}$ and check whether it is significantly different from zero. This simple idea is made much more sophisticated by the necessity to impose rotational invariance on the sample coefficients. Such invariance can be imposed by demanding that the probability law of the CMB field be invariant with respect to the action of the rotation group. More formally, let g ∈ SO(3) be any element of the rotation group in R³; the assumption of isotropy can then be written as

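\[
T(x) \overset{d}{=} T(gx) \quad \text{for all } x \in S^{2},
\]

whereas in terms of the spectral representation, we have

\[
\sum_{l=0}^{\infty} \sum_{m=-l}^{l} a_{lm}\, Y_{lm}(x) \overset{d}{=} \sum_{l=0}^{\infty} \sum_{m=-l}^{l} a_{lm}\, Y_{lm}(gx). \tag{11}
\]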

As explained, for instance, in Hu (2001) and Marinucci and Peccati (2007), from (11) it follows that the bispectrum of a rotationally invariant random field must take the form

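\[
E\, a_{l_1 m_1} a_{l_2 m_2} a_{l_3 m_3} =
\begin{pmatrix} l_1 & l_2 & l_3 \\ 0 & 0 & 0 \end{pmatrix}
\begin{pmatrix} l_1 & l_2 & l_3 \\ m_1 & m_2 & m_3 \end{pmatrix}
b_{l_1 l_2 l_3},
\]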

where $b_{l_1 l_2 l_3}$ (the reduced bispectrum) conveys the physical information and does not depend on $m_1, m_2, m_3$. The Wigner 3j symbols appearing on the right-hand side are discussed in the Appendix; many more details can be found, for instance, in Varshalovich, Moskalev and Khersonskii (1988), Marinucci (2006) and Marinucci (2008), whereas generalizations to higher order cumulant spectra are described in Marinucci and Peccati (2008). A feasible, rotationally invariant estimator for the (normalized) bispectrum is provided by

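\[
I_{l_1 l_2 l_3} = (-1)^{(l_1 + l_2 + l_3)/2} \sum_{m_1 m_2 m_3}
\begin{pmatrix} l_1 & l_2 & l_3 \\ m_1 & m_2 & m_3 \end{pmatrix}
\frac{a_{l_1 m_1} a_{l_2 m_2} a_{l_3 m_3}}{\sqrt{C_{l_1} C_{l_2} C_{l_3}}}
\]

and its studentized version is of course

\[
\widehat{I}_{l_1 l_2 l_3} = (-1)^{(l_1 + l_2 + l_3)/2} \sum_{m_1 m_2 m_3}
\begin{pmatrix} l_1 & l_2 & l_3 \\ m_1 & m_2 & m_3 \end{pmatrix}
\frac{a_{l_1 m_1} a_{l_2 m_2} a_{l_3 m_3}}{\sqrt{\widehat{C}_{l_1} \widehat{C}_{l_2} \widehat{C}_{l_3}}}.
\]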
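The normalized bispectrum estimator can be sketched in plain Python for small multipoles. The Wigner 3j symbols are evaluated here with the Racah sum formula for integer arguments, which is a standard closed form but not necessarily the route followed in the references; the dictionary layout `alm[(l, m)]`, the function names and the simulated Gaussian coefficients are assumptions made for this illustration. Exact factorials keep the code short but limit it to modest values of l.

```python
import math
import random

def wigner_3j(j1, j2, j3, m1, m2, m3):
    """Wigner 3j symbol for integer arguments, from the Racah sum formula."""
    if m1 + m2 + m3 != 0 or abs(m1) > j1 or abs(m2) > j2 or abs(m3) > j3:
        return 0.0
    if not abs(j1 - j2) <= j3 <= j1 + j2:
        return 0.0
    f = math.factorial
    triangle = (f(j1 + j2 - j3) * f(j1 - j2 + j3) * f(-j1 + j2 + j3)
                / f(j1 + j2 + j3 + 1))
    norm = (f(j1 + m1) * f(j1 - m1) * f(j2 + m2) * f(j2 - m2)
            * f(j3 + m3) * f(j3 - m3))
    t_min = max(0, j2 - j3 - m1, j1 - j3 + m2)
    t_max = min(j1 + j2 - j3, j1 - m1, j2 + m2)
    racah = 0.0
    for t in range(t_min, t_max + 1):
        denom = (f(t) * f(j3 - j2 + t + m1) * f(j3 - j1 + t - m2)
                 * f(j1 + j2 - j3 - t) * f(j1 - t - m1) * f(j2 - t + m2))
        racah += (-1) ** t / denom
    sign = -1.0 if (j1 - j2 - m3) % 2 else 1.0
    return sign * math.sqrt(triangle * norm) * racah

def gaussian_alm(c, l_max, seed=0):
    """Simulate a_lm of a real Gaussian isotropic field with spectrum c[l]."""
    rng = random.Random(seed)
    alm = {}
    for l in range(l_max + 1):
        alm[(l, 0)] = complex(rng.gauss(0.0, math.sqrt(c[l])), 0.0)
        for m in range(1, l + 1):
            s = math.sqrt(c[l] / 2.0)
            a = complex(rng.gauss(0.0, s), rng.gauss(0.0, s))
            alm[(l, m)] = a
            alm[(l, -m)] = (-1) ** m * a.conjugate()  # reality of the field
    return alm

def sample_bispectrum(alm, c, l1, l2, l3):
    """Normalized bispectrum I_{l1 l2 l3}; c[l] is C_l or its estimate."""
    if (l1 + l2 + l3) % 2:
        raise ValueError("l1 + l2 + l3 must be even")
    phase = (-1) ** ((l1 + l2 + l3) // 2)
    total = 0j
    for m1 in range(-l1, l1 + 1):
        for m2 in range(-l2, l2 + 1):
            m3 = -m1 - m2
            if abs(m3) <= l3:
                total += (wigner_3j(l1, l2, l3, m1, m2, m3)
                          * alm[(l1, m1)] * alm[(l2, m2)] * alm[(l3, m3)])
    # For a real field and an even sum l1 + l2 + l3 the total is real.
    return phase * total.real / math.sqrt(c[l1] * c[l2] * c[l3])

if __name__ == "__main__":
    c = [1.0] * 7
    alm = gaussian_alm(c, 6, seed=2)
    print(sample_bispectrum(alm, c, 2, 4, 6))
```

Passing the true $C_l$ gives the normalized statistic, while passing estimated values $\widehat{C}_l$ in `c` gives the studentized version.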

The sample bispectrum is discussed, for instance, by Hu (2001); asymptotic properties are provided by Marinucci (2006) and Marinucci (2008), where the phase factor $(-1)^{(l_1+l_2+l_3)/2}$ is also introduced. In particular, it can be shown that the sequences $\{I_{l_1 l_2 l_3}\}$ and $\{\widehat{I}_{l_1 l_2 l_3}\}$ converge to Gaussian independent random variables in the high frequency limit where $\min(l_1, l_2, l_3) \uparrow \infty$. The limiting behavior of the bispectrum ordinates, however, is perhaps not the most significant instrument for the implementation of statistical procedures.

More precisely, it seems more promising to combine the different ordinates into a single statistic, by means of the integrated bispectra

\[
J_{1L}(r) = \sum_{l_2=1}^{[Lr]} \left( \frac{1}{\sqrt{K}} \sum_{k=1}^{K} \widehat{I}_{l_1+k,\, l_2,\, l_2} \right),
\qquad
J_{2L}(r) = \sum_{l=1}^{[Lr]} \widehat{I}_{lll}.
\]


Convergence to Brownian motion for both these statistics was established in Marinucci (2006). The underlying rationale can be briefly motivated as follows: in both cases we combine several different ordinates into a single functional statistic, capable of keeping track of the frequency location for possible deviations from Gaussianity. The different combinations of multipoles in $J_{1L}(\cdot)$ and $J_{2L}(\cdot)$ correspond to the two well-known classes of squeezed and equilateral configurations, as discussed again by Marinucci (2006), Babich, Creminelli and Zaldarriaga (2004) and many others. It is also possible to provide some results on the asymptotic behavior of these statistics under non-Gaussian circumstances; in particular, results in Marinucci (2006) suggest that $J_{1L}$ will provide consistent testing procedures (as L → ∞) under model (10), whereas tests based upon $J_{2L}$ will have asymptotically negligible power, for all values of $f_{NL}$. These theoretical findings have been validated by Monte Carlo simulations in Cabella et al. (2006); the integrated bispectrum has also been shown to compare favorably with alternative statistical procedures in some internal statistical challenges within the Planck collaboration.
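The two integrated bispectra are simple partial sums of the studentized ordinates, as the following Python sketch shows. It takes the ordinates as a callable `i_hat(l1, l2, l3)`, which is an interface chosen here for illustration; treating $l_1$ as a fixed base multipole supplied by the user is also an assumption, since the displayed formula does not say how it is selected.

```python
import math

def j1_statistic(i_hat, L, r, l1, K):
    """J_1L(r): sum over l2 = 1..[L r] of K^(-1/2) * sum_{k=1}^{K} I_hat(l1 + k, l2, l2)."""
    top = math.floor(L * r)
    return sum(
        sum(i_hat(l1 + k, l2, l2) for k in range(1, K + 1)) / math.sqrt(K)
        for l2 in range(1, top + 1)
    )

def j2_statistic(i_hat, L, r):
    """J_2L(r): sum over l = 1..[L r] of the diagonal ordinates I_hat(l, l, l)."""
    top = math.floor(L * r)
    return sum(i_hat(l, l, l) for l in range(1, top + 1))

if __name__ == "__main__":
    # Placeholder ordinates, used only to show the call pattern.
    def constant_ordinate(l1, l2, l3):
        return 1.0

    for r in (0.25, 0.5, 1.0):
        print(r,
              j1_statistic(constant_ordinate, 8, r, l1=0, K=4),
              j2_statistic(constant_ordinate, 8, r))
```

Evaluating either function on a grid of r values in [0, 1] yields the path whose deviations from zero indicate the multipole range at which a departure from Gaussianity occurs.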

In Figure 3 we report the results obtained by implementing $J_{1L}(r)$ on the data from the 2003 and 2007 WMAP data releases. We stress that the simulations are calibrated in a realistic experimental setting, that is, they do take into account features such as the presence of noise and missing observations. More precisely, we used 1000 simulated maps of CMB signal plus noise; we took into account the modulation of noise on the maps given by the WMAP scanning strategy, the presence of a masked region to avoid the emission from the Milky Way and point sources, and we considered the optical transfer function of the telescopes. To comply with the cosmological literature, the shaded region represents the 68% confidence interval (1σ) as evaluated by means of Monte Carlo simulations for various values of r ∈ [0, 1]; we fixed L = 500 because WMAP data allow a reliable coverage up to this multipole; see Bennett et al. (2003) and Hinshaw et al. (2008) for more technical details on the experiment (it should be noted that on the x-axis we report rL). The dotted and dashed lines represent Monte Carlo expected values for our statistics with $f_{NL} = \pm 100, \ldots, 500$, respectively. It is possible to check that the boundary value of $f_{NL}$ to ensure detection is of the order of 200 or larger, that is, with a signal to noise ratio of the order of a few percentage points. This is indeed confirmed by a more detailed study in Cabella et al. (2006). Finally, triangles (2003 dataset) and squares (2007 dataset) represent the evaluation of the statistic on real data, on the basis of the previously mentioned WMAP releases. It is clear that the evidence for non-Gaussianity is rather weak, and, indeed, the statistics get closer to zero as the observations increase. We must stress, however, that the level of non-Gaussianity favored by theorists is well below 100, and this is still consistent with observations at the current resolution. Note that the signal to noise ratio for the non-Gaussian signal is of the order of $f_{NL}/10^{4}$, so that these values are really difficult to detect.

Fig. 3. The behavior of $J_{1L}(r)$ on WMAP data.
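A pointwise Monte Carlo band of the kind shown as the shaded region can be computed along the following lines. This is a sketch under assumptions: the text does not say whether the 68% region was obtained from empirical quantiles or from the simulated standard deviation, so the quantile rule below is one possible choice, and the data layout and function names are illustrative.

```python
def monte_carlo_band(simulated_paths, level=0.68):
    """Pointwise central band from simulated paths of a statistic.

    simulated_paths[s][i] is the statistic of simulation s at the i-th value
    of r. Returns (lower, upper): lists with the empirical (1 - level)/2 and
    (1 + level)/2 quantiles at each grid point (nearest-rank rule).
    """
    n_sims = len(simulated_paths)
    n_grid = len(simulated_paths[0])
    lo_idx = int(round((1.0 - level) / 2.0 * (n_sims - 1)))
    hi_idx = int(round((1.0 + level) / 2.0 * (n_sims - 1)))
    lower, upper = [], []
    for i in range(n_grid):
        column = sorted(path[i] for path in simulated_paths)
        lower.append(column[lo_idx])
        upper.append(column[hi_idx])
    return lower, upper

def outside_band(observed, lower, upper):
    """Indices of grid points where the observed path leaves the band."""
    return [i for i, v in enumerate(observed) if not lower[i] <= v <= upper[i]]

if __name__ == "__main__":
    # Artificial paths on a two-point grid, used only to show the call pattern.
    sims = [[float(s), 2.0 * s] for s in range(101)]
    lower, upper = monte_carlo_band(sims)
    print(lower, upper, outside_band([50.0, 200.0], lower, upper))
```

Because the band is pointwise, its coverage refers to each value of r separately and not to the whole path at once.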

Very recently, Yadav and Wandelt (2007) claimed a detection of a nonzero $f_{NL}$ (≃ 80) by means of a modified bispectrum estimator, which is constructed to take into account the presence of noise and missing observations, while keeping computational costs at a feasible level. This proposal is indeed very interesting; the results, however, are quite close to the boundary level and as such they should probably not be considered conclusive. The general consensus in the community seems to be that new releases of data from more sophisticated experiments such as Planck, and possibly more efficient statistical procedures yet to be devised, will indeed be necessary to settle the question of the possible existence of non-Gaussianity in CMB. It should be stressed, in particular, that the bispectrum requires the evaluation of the inverse Fourier transform (3), and as such it is known to be severely affected by the presence of missing observations (there is some evidence that the detection level could reach $f_{NL} \simeq 10$ or lower for fully observed sky maps). Improving the performance of the bispectrum for partial sky coverage is a priority of current research in view of the forthcoming satellite data: for instance, in Lan and Marinucci (2008) the bispectrum approach is combined with the needlets construction described in the previous section. Rather than considering these further developments in the bispectrum literature, we move to other methods which have a local nature, and are thus expected to be more robust in the presence of missing data.