### arXiv:0807.1816v2 [stat.AP] 15 May 2009

2009, Vol. 3, No. 1, 61–95 DOI:10.1214/08-AOAS190

© Institute of Mathematical Statistics, 2009

STATISTICAL CHALLENGES IN THE ANALYSIS OF COSMIC MICROWAVE BACKGROUND RADIATION

By Paolo Cabella and Domenico Marinucci

University of Rome Tor Vergata and University of Rome Tor Vergata

An enormous amount of observations on Cosmic Microwave Background radiation has been collected in the last decade, and much more data are expected in the near future from planned or operating satellite missions. These datasets are a goldmine of information for Cosmology and Theoretical Physics; their efficient exploitation poses several intriguing challenges from the statistical point of view. In this paper we review a number of open problems in CMB data analysis and we present applications to observations from the WMAP mission.

1. Introduction.

1.1. Cosmological background. Cosmology is now developing into a mature observational science, with a vast array of different experiments that yield datasets of astonishing magnitude and nearly as great challenges for theoretical and applied statisticians. Datasets are now available on a large variety of different phenomena, but the leading part in cosmological research has been played over the last 15 years by the analysis of Cosmic Microwave Background (CMB) radiation, an area which has already led to Nobel Prizes for Physics in 1978 and in 2006.

The nature of CMB can be loosely explained as follows [see, e.g., Dodelson (2003) for a textbook account]. According to the standard cosmological model, the Universe that we currently observe originated approximately 13.7 billion years ago in a very hot and dense state, in what is universally known as the Big Bang. Neglecting fundamental physics in the first fractions of a second, we can naively imagine a fluid state where matter was completely ionized, that is, the kinetic energy of electrons was much stronger than the electrical attraction of protons, so that no stable atomic nuclei could form. It is a consequence of quantum principles that a free electron has a much larger cross-section than when it is bound in a nucleus; loosely speaking, as a consequence, the probability of interactions between photons and electrons was so high that the mean free path of the former was very short and the Universe was consequently “opaque.” As the Universe expands, the mean energy content decreases, that is, the fluid of matter and radiation cools down; the mean kinetic energy of the electrons decreases as well until it reaches a critical value where it is no longer sufficient to compensate the electromagnetic attraction of the protons; stable (and neutral) hydrogen atoms are then formed. This change of state occurs at the so-called “age of recombination,” which is currently reckoned to have taken place $3.7 \times 10^{5}$ years after the Big Bang, that is, when the Universe was only about 0.003% of its current age. At the age of recombination, the probability of interactions became so small that, as a first approximation, photons could start to travel freely. Neglecting second order effects, we can assume they had no further interaction up to the present epoch.

Received April 2008; revised July 2008.

Key words and phrases. Cosmic Microwave Background radiation, spherical random fields, angular power spectrum, bispectrum, local curvature, spherical wavelets.

This is an electronic reprint of the original article published by the Institute of Mathematical Statistics in The Annals of Applied Statistics, 2009, Vol. 3, No. 1, 61–95. This reprint differs from the original in pagination and typographic detail.

The remarkable consequence of this mechanism is that the Universe is
embedded in a uniform radiation that provides pictures of its state nearly
1.37 × 10^{10} years ago; this is exactly the above-mentioned CMB radiation.

The existence of CMB was predicted by G. Gamow in a series of papers in the forties; it was later discovered fortuitously by Penzias and Wilson in 1965, and for this discovery they earned the Nobel Prize for Physics in 1978. For several years further experiments were only able to confirm the existence of the radiation, and to test its adherence to the Planckian curve of blackbody emission, as predicted by theorists. A major breakthrough occurred with the NASA satellite mission COBE, which was launched in 1989 and publicly released the first full-sky maps of radiation in 1992; for these maps Smoot and Mather earned the Nobel Prize for Physics in 2006 [Smoot et al. (1992)].

The nature of these maps deserves further explanation. CMB is distributed in remarkably uniform fashion over the sky, with deviations in the order of $10^{-4}$ with respect to the mean value (corresponding to 2.731 Kelvin degrees). The attempts to understand this uniformity have led to very important developments in cosmology, primarily the inflationary scenario which now dominates the theoretical landscape. Even more important, though, are the tiny fluctuations around this mean value, which provided the seeds for stars and galaxies to form out of gravitational instability. Measuring and understanding the nature of these fluctuations has then been the core of an enormous amount of experimental and theoretical research. In particular, their stochastic properties yield a goldmine of information on a variety of extremely important issues on astrophysics and cosmology, and on many problems at the frontier of fundamental physics.

To mention just a few of these problems, we recall the issues concerning the matter content of the Universe, its global geometry, the existence and nature of (nonbaryonic) dark matter, the existence and nature of dark energy, which is related to Einstein’s cosmological constant, and many others.

The next experimental landmark in CMB analysis followed in 2000, when two balloon-borne experiments, BOOMERANG and MAXIMA, yielded the first high-resolution observations on small patches of the sky (less than $10^{\circ}$ squared). These observations led to the first constraints on the global geometry of the Universe, which was found to be (very close to) Euclidean.

Another major breakthrough followed with the 2003, 2007 and 2008 data releases from another NASA satellite experiment, WMAP (the data are publicly available on the web site http://lambda.gsfc.nasa.gov/). Such data releases yielded measurements of the correlation structure of the random field up to a resolution of about 0.22 degrees, that is, approximately 30 times better than COBE (7–10 degrees). Another major boost in data analysis is expected from the ESA satellite mission Planck, which is now scheduled to be launched on October 31, 2008; data releases for the public are expected in the following 3–5 years. Planck is planned to provide datasets of nearly $5 \times 10^{10}$ observations, and this will allow many open questions to be settled with CMB temperature data. New challenging questions are expected to arise at a faster and faster pace over the next decades; for instance, Planck will provide high-quality measurements of so-called polarization data, which will set the agenda for the experiments to come. Polarization data can be viewed as tensor-valued, rather than scalar, observations; that is, what we observe are not measurements of a scalar quantity such as the temperature, but random quadratic forms. As such, this entails an entirely new field of statistical research, which is still in its infancy and will not be discussed in the present paper.

Our aim here is to provide a review of statistical issues arising in CMB data analysis, with many examples of applications of statistical procedures to real data from the WMAP experiment. Some of the empirical results we provide are new, as detailed below. The plan of the paper is as follows: in Section 2 we review very briefly some background material on map-making, component separation and spectral representations for the CMB data sets. For brevity’s sake, we do not provide many details other than the material which is essential for our following discussion. In Section 3 we are concerned with angular power spectrum estimation, and we discuss procedures to deal with relevant practical questions such as the presence of observational noise and/or missing observations. In Section 4 we present some tools to test for Gaussianity and/or isotropy of CMB radiation: we focus, in particular, on harmonic methods such as the bispectrum, techniques based on differential geometry such as the local curvature, and spherical wavelets (with the so-called Spherical Mexican Hat approach). Concerning the latter, we stress that many other possible approaches to wavelets on the sphere exist, which have been successfully applied to various parts of cosmological and astrophysical research: nevertheless, the field is still extremely active and very much open for research (in particular, the derivation of the stochastic properties of wavelet procedures is still at the very beginning). Finally, we collect in the Appendix some background mathematical material which we considered necessary for a better understanding of our proposals.

2. Some preliminary issues.

2.1. Map-making and component separation. To understand more precisely the nature of the statistical issues involved, we need to introduce some more formalization. As explained above, CMB can be viewed as the single realization of a random field on the surface of the sphere, that is, for each $x \in S^{2}$, $T(x)$ is a random variable on a probability space. Observations are provided by means of electromagnetic detectors (so-called radiometers and/or bolometers) which measure fluxes of incoming radiation (i.e., photons) on a range of different frequencies. For instance, the above mentioned WMAP experiment is based upon 16 detectors, centered at frequencies 40.7, 60.8 and 93.5 GHz, which are labeled the Q, V and W band, respectively. The forthcoming ESA mission Planck will be based upon 70 channels ranging from 30 GHz to 857 GHz. As the satellites scan the sky, observations are collected as a vector time series, the number of observations being in the order of $10^{9}$ for WMAP and $5 \times 10^{10}$ for Planck. A first issue then relates to the construction of spherical maps starting from the Time Ordered Data (TOD) provided by the satellite; this is the so-called map-making challenge; see, for instance, Keihanen, Kurki-Suonio and Poutanen (2005) and De Gasperis et al. (2005). For brevity’s sake, we shall provide only the basic framework, and refer to the literature for more details. In short, we can assume that in each of the $p$ channels we actually observe

\[
O_{i}(x) = T(x) + F_{i}(x) + N_{i}(x), \qquad i = 1, \ldots, p, \quad x \in S^{2};
\]

here, $T(\cdot)$ denotes the CMB signal, $F_{i}(x)$ denotes the so-called foreground emissions by galactic and extragalactic sources of noncosmological nature (for instance, galaxies, quasars, intergalactic dusts and others), and $N_{i}(x)$ denotes instrumental noise. The crucial point to be understood is that the dependence across the different frequency channels of CMB emission is known, and it is different from the pattern followed by other sources: this capital property makes component separation possible and allows the construction of filtered maps [see, e.g., Patanchon et al. (2005) and the references therein]. More precisely, a clear prediction from theoretical physics, confirmed to amazing accuracy from the very first experiments [Smoot et al. (1992)], is that CMB radiation should follow the Planckian curve of blackbody radiation, that is, radiation is distributed across frequencies $\nu_{i}$, $i = 1, \ldots, p$, according to the function

\[
R(\nu; x) = \frac{8\pi h \nu^{3}}{c^{3}} \, \frac{1}{e^{h\nu/k_{B}T(x)} - 1},
\tag{1}
\]

Fig. 1. CMB radiation from WMAP data.

where $R(\nu; x)$ denotes the emission at frequency $\nu$ for the corresponding temperature $T(x)$ (measured in Kelvin degrees), $c$ is the speed of light in the vacuum ($= 2.99792 \times 10^{8}$ m/s), $h$ is Planck’s constant ($= 6.6261 \times 10^{-27}$ erg·s), and $k_{B}$ is Boltzmann’s constant ($= 1.3807 \times 10^{-16}$ erg/K). In other words, the determination of $T(x)$ is made possible by the inversion of (1): the blackbody pattern can be estimated due to the presence of multiple detectors and the fact that astrophysical emissions of noncosmological nature are characterized by a different pattern of dependence across frequencies. In some regions, however, foreground emissions are so strong that component separation is still a difficult statistical problem; several groups of cosmologists are active in this field and a unique consensus solution has not been delivered yet. Moreover, in some areas of the sky (e.g., the Galactic plane, i.e., the line of sight of the Milky Way) the problem is considered to be largely unsolvable, so that there are missing observations in CMB maps (these unobserved regions are becoming, however, smaller and smaller with more refined experiments). In Figure 1 we report a CMB map constructed from (the Q band of) WMAP data; the missing region around the galactic plane is immediately evident.
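The inversion of (1) mentioned above can be illustrated at a single frequency. The following is only a minimal sketch under idealized assumptions (no foregrounds, no noise, the standard blackbody form with a positive exponent); the function names are hypothetical and the constants are those quoted in the text, with $c$ converted to cm/s for consistency with the CGS units of $h$ and $k_{B}$.

```python
import math

# CGS constants as quoted in the text (c converted from m/s to cm/s).
H_PLANCK = 6.6261e-27   # erg s
K_BOLTZ = 1.3807e-16    # erg / K
C_LIGHT = 2.99792e10    # cm / s


def planck_emission(nu_hz, temp_k):
    """Blackbody law (1): 8*pi*h*nu^3 / c^3 / (exp(h*nu / (k_B*T)) - 1)."""
    x = H_PLANCK * nu_hz / (K_BOLTZ * temp_k)
    return 8.0 * math.pi * H_PLANCK * nu_hz**3 / C_LIGHT**3 / math.expm1(x)


def temperature_from_emission(nu_hz, emission):
    """Invert (1) for T at one frequency (noise- and foreground-free case)."""
    prefactor = 8.0 * math.pi * H_PLANCK * nu_hz**3 / C_LIGHT**3
    return H_PLANCK * nu_hz / (K_BOLTZ * math.log1p(prefactor / emission))


# Round trip at the three WMAP band centres mentioned in the text.
for nu in (40.7e9, 60.8e9, 93.5e9):
    r = planck_emission(nu, 2.731)
    print(nu, temperature_from_emission(nu, r))
```

With real data the inversion is of course not done channel by channel: the point of multiple detectors is that the same $T(x)$ must explain all frequencies at once, while foregrounds follow a different frequency pattern.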

Full-sky maps can be constructed by weighted linear interpolation across different channels, but they are not considered fully reliable for data analysis, especially at high frequencies; we report this so-called ILC (Internal Linear Combination) map in Figure 2; see Bennett et al. (2003) for more details on its construction.

Fig. 2. The so-called Internal Linear Combination map from WMAP data.

There are several other statistically interesting issues involved with the reconstruction of the scalar value $T(x)$ from the vector-valued observations $\{O_{1}(x), \ldots, O_{p}(x)\}$; actually the real experimental set-up is more complicated (and interesting) than this, because each location is observed unevenly, that is, the scanning strategy is such that some regions are more accurately measured than others. Also, the contaminating noise can have a time-dependent structure [there is indeed strong evidence for long memory behavior, see, e.g., Natoli et al. (2002)]; the possible existence of noise correlation across different channels will be discussed below. These experimental features have sparked in the cosmological literature a very lively statistical debate on filtering and image reconstruction. We shall come back to some of these points later.

2.2. Isotropy and spectral representation. In the idealistic case of no experimental noise and perfect map-making, we can focus on the random field $\{T(x)\}$, assuming that it is exactly observed at each location on the unit sphere $S^{2}$. A crucial assumption on CMB radiation is its isotropic nature, that is, $T(\cdot) \overset{d}{=} T \circ g(\cdot)$, where $\overset{d}{=}$ denotes equality in distribution (in the sense of random fields) and $g \in SO(3)$ is any element of the group of rotations in $\mathbb{R}^{3}$. More explicitly, the joint law of CMB radiation is assumed invariant to any change of coordinates; the condition is viewed by physicists as a realization of the so-called Einstein’s Cosmological Principle, that is, the statement that the Universe should “look the same” to an observer in any arbitrary location. In other words, we could impose isotropy by requiring that the stochastic laws of CMB radiation are invariant with respect to the choice of coordinates. There is some (quite inconclusive) evidence from WMAP data that isotropy may fail, that is, some authors have suggested that data on CMB radiation may show some asymmetries which would be inconsistent with isotropy [see, e.g., Park (2004), Hansen et al. (2004)]. The existence of these asymmetries remains highly disputed, though, and it actually provides yet another intriguing area for statistical research. It is in fact hotly debated whether these asymmetries should be ascribed to experimental features or truly cosmological causes. From the theoretical point of view, cosmological models that would produce asymmetries do indeed exist, but they are highly nonstandard, ranging from global rotating solutions of Einstein’s field equations to unconventional topological structures for the whole Universe. Much more methodological and applied research is needed in this area, but the question will most probably remain unsolved at least until the first releases of Planck data are available in a few years’ time. For now, it is fair to say that a vast majority of cosmologists still adhere to the isotropy assumption, and this is what we shall do in the present paper. Some of the procedures we shall consider in Section 4 for testing non-Gaussianity, however, are known to have also power against nonisotropic behavior; see, for instance, the local curvature approach below.

We shall hence focus on the statistical analysis of isotropic random fields. Throughout this paper we shall assume that the CMB random field is mean-square continuous, as it is always done in the CMB literature. Under the previous assumptions, the following spectral representation holds, in the mean square sense,

\[
T(x) = \sum_{l=0}^{\infty} \sum_{m=-l}^{l} a_{lm} Y_{lm}(x),
\tag{2}
\]

where

\[
a_{lm} = \int_{S^{2}} T(x) \overline{Y}_{lm}(x) \, dx.
\tag{3}
\]

Here, the bar denotes complex conjugation and $\{Y_{lm}(\cdot)\}$ the spherical harmonics, which form an orthonormal system for $L^{2}$ functions on the sphere. Some explicit expressions for the spherical harmonics can be found in the Appendix; a much more complete treatment can be found elsewhere; see Varshalovich, Moskalev and Khersonskii (1988). For $l = m = 0$, we have $a_{00} = \int_{S^{2}} T(x) \, dx$, that is, the first coefficient is $4\pi$ times the sample mean of the random field. This value can be subtracted from $T(x)$, whence we can take the expansion to start from $l = 1$; indeed, in practice, in the cosmological literature also the coefficients corresponding to $l = 1$ are discarded (the so-called dipole terms), as they have no cosmological meaning, but they simply reflect the absolute motion of the Earth with respect to the frame of reference in which CMB radiation is at rest. For $l \geq 2$, the triangular array $\{a_{lm}\}$ represents zero-mean, complex-valued random coefficients, with variance $E|a_{lm}|^{2} = C_{l} > 0$, the angular power spectrum of the random field. The coefficients are uncorrelated, $E a_{l_{1}m_{1}} \overline{a}_{l_{2}m_{2}} = C_{l_{1}} \delta_{l_{1}}^{l_{2}} \delta_{m_{1}}^{m_{2}}$, and, hence, in the Gaussian case they are independent [note, however, that $\overline{a}_{lm} = (-1)^{m} a_{l,-m}$]. We have the identity

\[
E\Biggl\{ \sum_{l=2}^{\infty} \sum_{m=-l}^{l} a_{lm} Y_{lm}(x) \Biggr\}^{2}
= \sum_{l=2}^{\infty} \sum_{m=-l}^{l} E|a_{lm}|^{2} \, |Y_{lm}(x)|^{2}
= \sum_{l=2}^{\infty} C_{l} \sum_{m=-l}^{l} |Y_{lm}(x)|^{2}
= \sum_{l=2}^{\infty} C_{l} \, \frac{2l+1}{4\pi},
\]

in view of a standard summation formula for spherical harmonics [Varshalovich, Moskalev and Khersonskii (1988)]. It follows immediately that $C_{l}(2l + 1)$ must be summable to ensure finite variance. The angular power spectrum in the Gaussian case provides a complete characterization of the dependence structure of the random field; to its estimation from CMB data we now turn our attention.

3. Angular power spectrum estimation.

3.1. Power spectrum estimation under idealistic circumstances. As noted before, having observed the random field $T(x)$, the coefficients $\{a_{lm}\}$ can be recovered by means of the inverse Fourier transform (3). In practice, with real data the integral is replaced by finite sums by means of (exact or approximate) cubature formulae, which are implemented in standard packages for CMB data analysis such as HealPix or GLESP [see Gorski et al. (2005), Doroshkevich et al. (2005)]. The angular power spectrum can then be estimated by

\[
\widehat{C}_{l} = \frac{1}{2l+1} \sum_{m=-l}^{l} |a_{lm}|^{2}.
\tag{4}
\]

This simple estimator highlights a very important issue when dealing with CMB data. It is indeed readily seen that the estimator is consistent in the Gaussian case, as $l \to \infty$; more precisely,

\[
E\widehat{C}_{l} = C_{l}, \qquad
E\Biggl( \frac{\widehat{C}_{l}}{C_{l}} - 1 \Biggr)^{2}
= \frac{1}{(2l+1)^{2}} \,
E\Biggl[ \Biggl( \frac{a_{l0}^{2}}{C_{l}} - 1 \Biggr) + 2 \sum_{m=1}^{l} \Biggl( \frac{|a_{lm}|^{2}}{C_{l}} - 1 \Biggr) \Biggr]^{2}
= \frac{2}{2l+1} = o(1),
\]

because $a_{l0}^{2}/C_{l} \overset{d}{\sim} \chi^{2}_{1}$ and, for $m = 1, \ldots, l$, $2|a_{lm}|^{2}/C_{l} \overset{d}{\sim}$ i.i.d. $\chi^{2}_{2}$, where $\chi^{2}_{n}$ denotes a standard chi-square random variable with $n$ degrees of freedom.
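The two moments above can be checked by simulation. The sketch below is illustrative only: the function names, the replication count and the seed are arbitrary choices, and it relies on the Gaussian assumption of this subsection (real $a_{l0}$ with variance $C_{l}$; complex $a_{lm}$, $m \geq 1$, with independent real and imaginary parts of variance $C_{l}/2$).

```python
import random


def sample_cl_hat(l, c_l, rng):
    """One draw of the estimator (4) for a Gaussian field at multipole l."""
    # a_{l0} is real with variance C_l.
    total = rng.gauss(0.0, c_l ** 0.5) ** 2
    # For m = 1..l, a_{lm} is complex with E|a_{lm}|^2 = C_l; the terms for
    # -m repeat those for +m because of the conjugate symmetry of a_{lm}.
    sigma = (c_l / 2.0) ** 0.5
    for _ in range(l):
        re, im = rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)
        total += 2.0 * (re * re + im * im)
    return total / (2 * l + 1)


def monte_carlo_moments(l, c_l, n_rep=5000, seed=0):
    """Sample mean of C_l-hat and sample variance of C_l-hat / C_l."""
    rng = random.Random(seed)
    ratios = [sample_cl_hat(l, c_l, rng) / c_l for _ in range(n_rep)]
    mean = sum(ratios) / n_rep
    var = sum((r - mean) ** 2 for r in ratios) / (n_rep - 1)
    return mean * c_l, var


mean_hat, var_ratio = monte_carlo_moments(l=10, c_l=2.0)
print(mean_hat, var_ratio, 2.0 / (2 * 10 + 1))
```

The printed variance should be close to $2/(2l+1)$, which makes the "cosmic variance" floor explicit: at low multipoles no amount of instrumental precision reduces it, since only $2l+1$ coefficients are available.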

In the Gaussian case with fully observed maps, the issue of angular power spectrum estimation can thus be considered trivial, and indeed, the previous expressions not only ensure consistency but they also provide exact confidence intervals: it is immediate to see that

\[
\sum_{m=-l}^{l} |a_{lm}|^{2}
= \Biggl\{ |a_{l0}|^{2} + \sum_{m=1}^{l} 2|a_{lm}|^{2} \Biggr\}
\overset{d}{\sim} C_{l} \times \chi^{2}_{2l+1}.
\]
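As a worked consequence, the pivotal relation can be inverted into an interval for $C_{l}$. The notation $\chi^{2}_{\nu;q}$ for the $q$-quantile of a chi-square variable with $\nu$ degrees of freedom is introduced here for illustration and does not appear in the original text; the statement holds only under Gaussianity and full-sky observation.

```latex
\[
(2l+1)\,\frac{\widehat{C}_{l}}{C_{l}} \overset{d}{=} \chi^{2}_{2l+1}
\quad\Longrightarrow\quad
P\left\{
\frac{(2l+1)\,\widehat{C}_{l}}{\chi^{2}_{2l+1;\,1-\alpha/2}}
\;\le\; C_{l} \;\le\;
\frac{(2l+1)\,\widehat{C}_{l}}{\chi^{2}_{2l+1;\,\alpha/2}}
\right\} = 1-\alpha .
\]
```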

However, we must stress that these results rely heavily on the Gaussian assumption. Indeed, Baldi and Marinucci (2007) and Baldi, Marinucci and Varadarajan (2007) have shown that under isotropy the coefficients $a_{lm}$ can only be independent in the Gaussian case, despite the fact that they are always uncorrelated by construction: in other words, sampling independent, non-Gaussian random coefficients to generate maps according to (2) will always yield an anisotropic random field. The correlation structure of the coefficients $\{a_{lm}\}$ is in general quite complicated, despite the fact that it can be very nicely characterized in terms of group representation properties for $SO(3)$ [Marinucci and Peccati (2007)]. In view of this, to derive any asymptotic result for $\widehat{C}_{l}$ under non-Gaussianity is by no means trivial; indeed, even the possible consistency (as $l \to \infty$) of the estimator (4) in non-Gaussian circumstances is still an open issue for research.

3.2. Dealing with instrumental noise. We shall now try to make our analysis more realistic by considering the effect of noise and missing observations. Starting from the former, we shall consider the case where we observe $O(x) := T(x) + N(x)$, $N(x)$ denoting instrumental noise; for simplicity, we shall follow the cosmological literature, assuming $N(x)$ to be also a zero mean, mean square continuous and isotropic random field on the sphere. Whereas the assumptions of zero-mean and mean square continuity are basically immaterial, isotropy of the noise may need to be relaxed if the sky is unevenly observed. We shall also assume that $T(x)$ and $N(x)$ are independent. Performing the spherical harmonic transform, we obtain, in an obvious notation,

\[
a_{lm} = \int_{S^{2}} \{T(x) + N(x)\} \overline{Y}_{lm}(x) \, dx =: a^{T}_{lm} + a^{N}_{lm},
\]

which leads to

\[
\widehat{C}_{l} = \frac{1}{2l+1}
\Biggl[ \sum_{m=-l}^{l} |a^{T}_{lm}|^{2} + \sum_{m=-l}^{l} |a^{N}_{lm}|^{2}
+ 2 \operatorname{Re} \Biggl\{ \sum_{m=-l}^{l} a^{T}_{lm} \overline{a}^{N}_{lm} \Biggr\} \Biggr].
\]

It is immediate to see that the resulting estimator is biased, $E\widehat{C}_{l} = C_{l}^{T} + C_{l}^{N}$; the variance is easily seen to be given by

\[
\operatorname{Var}\{\widehat{C}_{l}\} = \frac{2\{C_{l}^{T} + C_{l}^{N}\}^{2}}{2l+1}.
\tag{5}
\]

In the cosmological literature, the standard procedure to address this bias is to assume that the noise correlation structure can be derived by Monte Carlo simulations or instrumental calibration; under this assumption, it is possible to subtract the bias from $\widehat{C}_{l}$ and obtain an unbiased estimator with variance (5). An obvious question is then to test whether the assumption that $C_{l}^{N}$ is known introduces some spurious effect into the analysis (namely, some unaccounted bias). A proposal in this direction was put forward by Polenta et al. (2005). To understand this idea, we must get back to the multi-channel setting, where we observe

\[
O_{i}(x) := T(x) + N_{i}(x), \qquad i = 1, \ldots, p,
\]

which in the harmonic domain leads to

\[
a_{i;lm} := a^{T}_{lm} + a^{N_{i}}_{lm}.
\]

Note that the temperature component of the random spherical harmonics coefficients does not depend on the observing channel. We assume that the noise is independent over channels, which is believed to be consistent with the actual experimental set-ups of current datasets. Testing noise correlation across different channels is yet another open challenge for research. For a given noise structure, an obvious estimator for $C_{l}$ is

\[
\widetilde{C}_{l}^{A} := \frac{1}{p} \sum_{i=1}^{p} \{\widehat{C}_{il} - C_{l}^{N_{i}}\},
\qquad
\widehat{C}_{il} := \frac{1}{2l+1} \sum_{m=-l}^{l} |a_{i;lm}|^{2}.
\tag{6}
\]

The estimator $\widetilde{C}_{l}^{A}$ is known in the literature as the auto-power spectrum. Simple computations yield [Polenta et al. (2005)]

\[
E\widetilde{C}_{l}^{A} = C_{l}, \qquad
\operatorname{Var}\{\widetilde{C}_{l}^{A}\} = \frac{2}{2l+1}
\Biggl\{ C_{l}^{2} + \frac{2C_{l}}{p^{2}} \sum_{i=1}^{p} C_{l}^{N_{i}}
+ \frac{1}{p^{4}} \sum_{i,j=1}^{p} C_{l}^{N_{i}} C_{l}^{N_{j}} \Biggr\}.
\]

Of course, the natural question that arises at this stage is the possible existence of misspecification, that is, some errors in the bias-correction term $C_{l}^{N_{i}}$. A solution for this issue was proposed by Polenta et al. (2005). The idea is to focus on the cross-power spectrum estimator

\[
\widetilde{C}_{l}^{CP} = \frac{2}{p(p-1)} \sum_{i=1}^{p-1} \sum_{j=i+1}^{p}
\Biggl\{ \frac{1}{2l+1} \sum_{m=-l}^{l} a_{i;lm} \overline{a}_{j;lm} \Biggr\}.
\]

The underlying rationale for $\widetilde{C}_{l}^{CP}$ is easy to gather: under the assumption that noise is independent across different channels, the estimator is unbiased, regardless of the value of the $C_{l}^{N_{i}}$. More precisely,

\[
E\widetilde{C}_{l}^{CP}
= \frac{2}{p(p-1)} \sum_{i=1}^{p-1} \sum_{j=i+1}^{p}
\Biggl\{ \frac{1}{2l+1} \sum_{m=-l}^{l}
E (a^{T}_{lm} + a^{N_{i}}_{lm}) \overline{(a^{T}_{lm} + a^{N_{j}}_{lm})} \Biggr\}
= \frac{2}{p(p-1)} \sum_{i=1}^{p-1} \sum_{j=i+1}^{p} C_{l} = C_{l}.
\]

Similar manipulations yield

\[
\operatorname{Var}\{\widetilde{C}_{l}^{CP}\} = \frac{2}{2l+1}
\Biggl\{ C_{l}^{2} + \frac{2C_{l}}{p^{2}} \sum_{i=1}^{p} C_{l}^{N_{i}}
+ \frac{1}{p^{2}(p-1)^{2}} \sum_{i=1}^{p-1} \sum_{j=i+1}^{p} C_{l}^{N_{i}} C_{l}^{N_{j}} \Biggr\}.
\]

Merely for notational simplicity, we also assume that the noise variance is constant across detectors. It is then readily seen that

\[
\operatorname{Var}\{\widetilde{C}_{l}^{CP}\} - \operatorname{Var}\{\widetilde{C}_{l}^{A}\}
= \frac{2}{2l+1} \Biggl\{ \frac{1}{p^{2}(p-1)} (C_{l}^{N})^{2} \Biggr\}.
\]

More explicitly, the auto-power spectrum estimator is more efficient than the cross-power spectrum; however, the latter is robust to noise misspecification. This is the classical setting which makes the implementation of a Hausman-type test for misspecification feasible [Hausman (1978)]. Indeed, it is possible to consider the statistic

\[
H_{l} = [\operatorname{Var}\{\widetilde{C}_{l}^{CP} - \widetilde{C}_{l}^{A}\}]^{-1/2}
\{\widetilde{C}_{l}^{CP} - \widetilde{C}_{l}^{A}\},
\]

\[
\operatorname{Var}\{\widetilde{C}_{l}^{CP} - \widetilde{C}_{l}^{A}\} = \frac{2}{2l+1}
\Biggl\{ \frac{1}{p^{4}} \sum_{i=1}^{p} \{C_{l}^{N_{i}}\}^{2}
+ \frac{2}{(p-1)^{2}} \sum_{i=1}^{p-1} \sum_{j=i+1}^{p} C_{l}^{N_{i}} C_{l}^{N_{j}} \Biggr\}.
\]
Under the null of exact bias correction, it is readily seen that $H_{l} \to_{d} N(0, 1)$, as $l \to \infty$. On the other hand, in the presence of misspecification, that is, when the actual noise variance is equal to $C_{l}^{N_{i}} + \delta$ for some $i$, $\delta > 0$, then we expect $EH_{l}$ to diverge with rate $\sqrt{l}\,\delta$ as $l \to \infty$.
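The contrast that drives the test, namely an auto-power estimator that inherits any error in the assumed noise level and a cross-power estimator that does not, can be seen in a small simulation. This is only a sketch under the assumptions of this subsection (Gaussian signal, Gaussian noise independent across channels); the function names, the sample sizes and the seed are hypothetical choices, and the variance formulas quoted above are deliberately not checked here.

```python
import random


def draw_alm(l, power, rng):
    """Gaussian coefficients a_{l0}, ..., a_{ll} with E|a_{lm}|^2 = power."""
    sigma = (power / 2.0) ** 0.5
    coeffs = [complex(rng.gauss(0.0, power ** 0.5), 0.0)]
    coeffs += [complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
               for _ in range(l)]
    return coeffs


def spectrum(a, b, l):
    """(2l+1)^{-1} sum_m a_{lm} conj(b_{lm}); the -m terms are folded into +m."""
    total = (a[0] * b[0].conjugate()).real
    total += 2.0 * sum((x * y.conjugate()).real for x, y in zip(a[1:], b[1:]))
    return total / (2 * l + 1)


def mean_auto_and_cross(l, c_l, true_noise, assumed_noise, n_rep=2000, seed=1):
    """Monte Carlo means of the auto- and cross-power estimators at multipole l."""
    rng = random.Random(seed)
    p = len(true_noise)
    pairs = [(i, j) for i in range(p) for j in range(i + 1, p)]
    auto_sum = cross_sum = 0.0
    for _ in range(n_rep):
        signal = draw_alm(l, c_l, rng)  # common to all channels
        chans = []
        for n_i in true_noise:
            noise = draw_alm(l, n_i, rng)  # independent across channels
            chans.append([s + n for s, n in zip(signal, noise)])
        # Auto-power: per-channel spectrum minus the *assumed* noise level.
        auto_sum += sum(spectrum(ch, ch, l) - n_hat
                        for ch, n_hat in zip(chans, assumed_noise)) / p
        # Cross-power: average over distinct channel pairs, no noise correction.
        cross_sum += sum(spectrum(chans[i], chans[j], l)
                         for i, j in pairs) / len(pairs)
    return auto_sum / n_rep, cross_sum / n_rep


# Noise understated by 0.3 in every channel: only the auto-power mean shifts.
print(mean_auto_and_cross(20, 1.0, [0.8, 0.8, 0.8], [0.5, 0.5, 0.5]))
```

A systematic gap between the two estimators is therefore evidence against the assumed noise level, which is exactly what $H_{l}$ standardizes.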

It is also possible to consider a functional form of the same test, focusing on

\[
B_{L}(r) := \frac{1}{\sqrt{L}} \sum_{l=1}^{[Lr]} H_{l}, \qquad r \in [0, 1].
\]

It is standard to show that $B_{L}(r)$ converges weakly to a standard Brownian motion, as $L \to \infty$. A test for noise misspecification can then be constructed along the lines of standard Kolmogorov–Smirnov or Cramér–Von Mises statistics. We refer again to Polenta et al. (2005) for a much more detailed discussion and an extensive simulation study.
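Given a sequence $H_{1}, \ldots, H_{L}$, the functional statistics are simple to compute. The sketch below is an illustration with hypothetical function names; the Cramér–Von Mises functional is approximated by a Riemann sum on the grid $r = 1/L, \ldots, 1$, and critical values (which come from the corresponding functionals of Brownian motion) are not included.

```python
import math


def partial_sum_process(h_values):
    """Values of B_L(r) at r = 1/L, 2/L, ..., 1 for H_1, ..., H_L."""
    big_l = len(h_values)
    path, running = [], 0.0
    for h in h_values:
        running += h
        path.append(running / math.sqrt(big_l))
    return path


def sup_statistic(h_values):
    """Kolmogorov-Smirnov-type functional sup_r |B_L(r)|."""
    return max(abs(b) for b in partial_sum_process(h_values))


def cramer_von_mises_statistic(h_values):
    """Riemann-sum approximation of the integral of B_L(r)^2 over [0, 1]."""
    path = partial_sum_process(h_values)
    return sum(b * b for b in path) / len(path)


h = [1.0, -1.0, 2.0, 0.0]
print(partial_sum_process(h), sup_statistic(h), cramer_von_mises_statistic(h))
```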

The methods discussed above rely on a basic identification assumption, that is, the condition that instrumental noise be independent across different channels. This is an assumption which is commonly entertained in the cosmological literature; suitable statistical procedures to test its validity are still lacking and represent an open issue for research. A more challenging research task was mentioned before: the previous discussion was conducted entirely under the assumption that the CMB field (and thus the corresponding spherical harmonics coefficients) is Gaussian. It is very important to stress that relaxing this assumption has much deeper consequences here than is usually the case in statistical inference. Indeed, it follows from results in Baldi and Marinucci (2007) that if the field is isotropic, the coefficients $(a_{lm})$ cannot be independent unless they are Gaussian. It follows that even the simple consistency (as $l \to \infty$) of the estimator $\widetilde{C}_{l}$ remains an open issue to address, in general non-Gaussian circumstances. We shall not go further into this issue here, but we rather focus on another important feature of realistic datasets: the presence of unobserved regions, which make the exact evaluation of the inverse Fourier transform (3) unfeasible.

3.3. Missing observations. The presence of missing observations, that is, regions of the sky where the CMB is deeply contaminated by astrophysical foregrounds, poses serious challenges to angular power spectrum estimation. The first consequence is that the sample spherical harmonics coefficients

\[
a^{M}_{lm} = \int_{S^{2} \setminus M} T(x) \overline{Y}_{lm}(x) \, dx
\]

are no longer uncorrelated (here, $M$ denotes the unobserved region and, for notational simplicity, we return to the case of a single detector with no instrumental noise). Indeed, we have

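\[
\begin{aligned}
E a^{M}_{l_{1}m_{1}} \overline{a}^{M}_{l_{2}m_{2}}
&= E \Biggl\{ \int_{S^{2} \setminus M} T(x) \overline{Y}_{l_{1}m_{1}}(x) \, dx \;
\overline{\int_{S^{2} \setminus M} T(y) \overline{Y}_{l_{2}m_{2}}(y) \, dy} \Biggr\} \\
&= \sum_{lm} \sum_{l'm'} E a_{lm} \overline{a}_{l'm'}
\int_{S^{2} \setminus M} Y_{lm}(x) \overline{Y}_{l_{1}m_{1}}(x) \, dx
\times \int_{S^{2} \setminus M} \overline{Y}_{l'm'}(y) Y_{l_{2}m_{2}}(y) \, dy \\
&= \sum_{lm} C_{l} \, W_{lml_{1}m_{1}} \overline{W}_{lml_{2}m_{2}},
\end{aligned}
\tag{7}
\]

where

\[
W_{lml_{1}m_{1}} := \int_{S^{2} \setminus M} Y_{lm}(x) \overline{Y}_{l_{1}m_{1}}(x) \, dx.
\]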

In case the spherical random field is fully observed, then $M = \emptyset$ (the empty set) and by standard orthonormality properties of the spherical harmonics $Y_{lm}$, we obtain $W_{lml_{1}m_{1}} = \delta_{l_{1}}^{l} \delta_{m_{1}}^{m}$ and, therefore, $E a_{l_{1}m_{1}} \overline{a}_{l_{2}m_{2}} = C_{l_{1}} \delta_{l_{1}}^{l_{2}} \delta_{m_{1}}^{m_{2}}$. In the presence of missing observations, the random coefficients are no longer uncorrelated either over $l$ or over $m$. In the physical literature the values of $\{W_{lml_{1}m_{1}}\}$ are computed numerically, exploiting the a priori knowledge on the geometry of the unobserved regions; the resulting coupling matrices can then be used to deconvolve the estimated values $\widehat{C}_{l}$, a procedure which has become extremely popular under the name of MASTER [see Hivon et al. (2002) for details]. In practice, it is not possible to identify by this method the value of the angular power spectrum at every single multipole $l$; it is then customary to proceed with binning techniques, where the values of $C_{l}$ at nearby frequencies are averaged and only these smoothed values are actually estimated. Plots for the estimates of the $C_{l}$ derived along these lines can be found, for instance, on the web site of WMAP; a comparison with angular power spectrum estimates from several other experiments (based upon smaller patches of the observed sky) is also entertained.
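The binning step can be sketched as follows. The text does not specify a binning scheme, so the equal-width bands and the $2l+1$ mode-count weights used here are assumptions made purely for illustration, as is the function name; the deconvolution by the coupling matrices is not reproduced.

```python
def bin_spectrum(cl_hat, l_min, width):
    """Average estimates cl_hat[l] over consecutive bands of `width` multipoles.

    `cl_hat` is indexed by multipole l. Returns (band centre, weighted mean)
    pairs, each multipole weighted by its number of modes 2l + 1
    (an illustrative choice, not taken from the text).
    """
    bins = []
    for start in range(l_min, len(cl_hat), width):
        ells = range(start, min(start + width, len(cl_hat)))
        weight = sum(2 * l + 1 for l in ells)
        mean = sum((2 * l + 1) * cl_hat[l] for l in ells) / weight
        center = sum(ells) / len(ells)
        bins.append((center, mean))
    return bins


# Toy input: entries for l = 0, 1 are placeholders and are skipped.
print(bin_spectrum([0.0, 0.0, 2.0, 4.0, 1.0, 1.0], l_min=2, width=2))
```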

The previous procedures can be computationally extremely demanding and we would like here to introduce an alternative strategy, which was basically put forward in Baldi et al. (2006). The idea is to implement power spectrum estimation by means of new kinds of spherical wavelets, called needlets [see also Narcowich, Petrushev and Ward (2006a), Narcowich, Petrushev and Ward (2006b), Marinucci et al. (2008) and Baldi et al. (2007)]. Needlets can be described as a convolution of the spherical harmonics basis by means of a suitable kernel function $b(\cdot)$; more precisely, the general element of the needlet frame can be written down as

\[
\psi_{jk}(x) = \sqrt{\lambda_{jk}} \sum_{l=B^{j-1}}^{B^{j+1}} b\Bigl(\frac{l}{B^{j}}\Bigr)
\sum_{m=-l}^{l} \overline{Y}_{lm}(x) Y_{lm}(\xi_{jk}),
\]

where $\{\xi_{jk}\}$ denotes a set of grid points on the sphere, $B > 1$ is a bandwidth parameter, $b(\cdot)$ is a compactly supported and infinitely differentiable function which satisfies the partition-of-unity property, that is,

\[
\sum_{j} b^{2}\Bigl(\frac{l}{B^{j}}\Bigr) \equiv 1 \qquad \text{for all } l > 1,
\tag{8}
\]

and $\{\lambda_{jk}, \xi_{jk}\}$ (the cubature weights and cubature points) can be chosen in such a way that

\[
\sum_{k} \overline{Y}_{l_{1}m_{1}}(\xi_{jk}) Y_{l_{2}m_{2}}(\xi_{jk}) \lambda_{jk}
= \int_{S^{2}} \overline{Y}_{l_{1}m_{1}}(x) Y_{l_{2}m_{2}}(x) \, dx
= \delta_{l_{1}}^{l_{2}} \delta_{m_{1}}^{m_{2}}.
\]
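To make the partition-of-unity requirement (8) concrete, the sketch below builds one possible window $b(\cdot)$. The text only states the properties $b(\cdot)$ must have; the specific recipe (a normalized integral of the bump $\exp\{-1/(1-t^{2})\}$, rescaled so that $b^{2}(\xi) = \phi(\xi/B) - \phi(\xi)$) is a construction commonly seen in the needlet literature and is an assumption here, as are all function names and the quadrature settings.

```python
import math


def _bump(t):
    """Smooth bump exp(-1 / (1 - t^2)) on (-1, 1), zero elsewhere."""
    return math.exp(-1.0 / (1.0 - t * t)) if abs(t) < 1.0 else 0.0


def _integral(a, b, n=400):
    """Composite Simpson rule for the bump on [a, b] (n must be even)."""
    if b <= a:
        return 0.0
    step = (b - a) / n
    total = _bump(a) + _bump(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * _bump(a + i * step)
    return total * step / 3.0


_NORM = _integral(-1.0, 1.0)


def phi(t, big_b):
    """Smooth step: 1 on [0, 1/B], decreasing to 0 on [1/B, 1], 0 beyond 1."""
    if t <= 1.0 / big_b:
        return 1.0
    if t >= 1.0:
        return 0.0
    u = 1.0 - 2.0 * big_b / (big_b - 1.0) * (t - 1.0 / big_b)
    return _integral(-1.0, u) / _NORM


def b_squared(xi, big_b):
    """Window b^2(xi) = phi(xi / B) - phi(xi), supported on (1/B, B)."""
    return max(0.0, phi(xi / big_b, big_b) - phi(xi, big_b))


def partition_sum(l, big_b, j_max=60):
    """Sum over j = 0..j_max of b^2(l / B^j); telescopes to 1 for l >= 1."""
    return sum(b_squared(l / big_b**j, big_b) for j in range(j_max + 1))


for ell in (1, 2, 3, 7, 50, 1000):
    print(ell, partition_sum(ell, 2.0))
```

The sum telescopes, which is why the identity holds to rounding error regardless of the quadrature accuracy; the support $(1/B, B)$ of $b(\cdot)$ is what restricts the needlet sums to $B^{j-1} \le l \le B^{j+1}$.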

More details on this construction and its underlying rationale can be found in Baldi et al. (2006) and are not reported here for brevity’s sake; see also Kerkyacharian et al. (2007) and Guilloux, Fay and Cardoso (2007) for further work in this area. The corresponding random needlets coefficients are provided by the analysis formula

\[
\widehat{\beta}_{jk} = \int_{S^{2}} T(x) \psi_{jk}(x) \, dx
= \sqrt{\lambda_{jk}} \sum_{l=B^{j-1}}^{B^{j+1}} \sum_{m=-l}^{l}
b\Bigl(\frac{l}{B^{j}}\Bigr) a_{lm} Y_{lm}(\xi_{jk}),
\]

whereas the synthesis expression is given as

\sum_{j,k} β_{jk} ψ_{jk}(x) = \sum_{j} \sum_{l_{1}=B^{j-1}}^{B^{j+1}} \sum_{m_{1}=-l_{1}}^{l_{1}} b\left(\frac{l_{1}}{B^{j}}\right) b\left(\frac{l_{2}}{B^{j}}\right) a_{l_{1}m_{1}} Y_{l_{1}m_{1}}(x) \times \sum_{l_{2}=B^{j-1}}^{B^{j+1}} \sum_{m_{2}=-l_{2}}^{l_{2}} \sum_{k} Y_{l_{1}m_{1}}(ξ_{jk}) Y_{l_{2}m_{2}}(ξ_{jk}) λ_{jk}

= \sum_{j} \sum_{l_{1}=B^{j-1}}^{B^{j+1}} \sum_{m_{1}=-l_{1}}^{l_{1}} b\left(\frac{l_{1}}{B^{j}}\right) b\left(\frac{l_{2}}{B^{j}}\right) a_{l_{1}m_{1}} Y_{l_{1}m_{1}}(x) \times \sum_{l_{2}=B^{j-1}}^{B^{j+1}} \sum_{m_{2}=-l_{2}}^{l_{2}} δ_{l_{1}}^{l_{2}} δ_{m_{1}}^{m_{2}}

= \sum_{l=1}^{\infty} \sum_{m=-l}^{l} a_{lm} Y_{lm}(x) = T(x),

using (8). For our purposes, it is sufficient to recall the main properties of the needlets construction:

• needlets enjoy excellent localization properties in the real domain, each
ψ_{jk}(x) being quasi-exponentially localized around its center ξ_{jk}. As such,
needlets coefficients have been shown to be minimally influenced by the
presence of missing observations.

• the needlets system is compactly supported in the harmonic domain; as
such, the random needlets coefficients are uncorrelated for |j − j′| ≥ 2. Much
more surprisingly, the random needlets coefficients are asymptotically
uncorrelated for any fixed angular distance, as the frequency j diverges to
infinity. This property implies that (in the Gaussian case) it is possible to
derive a growing array of asymptotically i.i.d. observations out of a single
realization of an isotropic random field. This opens the way to a plethora
of statistical procedures.

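In particular, it is possible to suggest the estimator

\hat{Γ}_{j} := \sum_{k} \hat{β}_{jk}^{2} = \sum_{k} \left\{ \sum_{l=B^{j-1}}^{B^{j+1}} \sum_{m=-l}^{l} \sqrt{λ_{jk}} b\left(\frac{l}{B^{j}}\right) a_{lm} Y_{lm}(ξ_{jk}) \right\}^{2}

= \sum_{l_{1},l_{2}=B^{j-1}}^{B^{j+1}} \sum_{m_{1},m_{2}} b\left(\frac{l_{1}}{B^{j}}\right) b\left(\frac{l_{2}}{B^{j}}\right) a_{l_{1}m_{1}} a_{l_{2}m_{2}} \left\{ \sum_{k} λ_{jk} Y_{l_{1}m_{1}}(ξ_{jk}) Y_{l_{2}m_{2}}(ξ_{jk}) \right\}

= \sum_{l=B^{j-1}}^{B^{j+1}} b^{2}\left(\frac{l}{B^{j}}\right) \hat{C}_{l} (2l + 1),

for which it is simple to show that

E\hat{Γ}_{j} = \sum_{l=B^{j-1}}^{B^{j+1}} b^{2}\left(\frac{l}{B^{j}}\right) C_{l} (2l + 1). (9)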
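The unbiasedness statement (9) can be illustrated by a small Monte Carlo experiment in the idealized full-sky case. The sketch below is only indicative: it works directly in harmonic space through the last expression for the estimator (thus assuming exact cubature), it draws Gaussian a_{lm} with E|a_{lm}|^{2} = C_{l}, and it uses a placeholder spectrum C_{l} = 1/(l(l + 1)) and a tent-shaped stand-in for b^{2}(l/B^{j}); none of these choices come from the references, and all names are hypothetical.

```python
import math
import random

def sample_spectrum(cl, rng):
    """Simulate Gaussian a_lm with E|a_lm|^2 = C_l and return the sample
    spectrum Chat_l = (2l + 1)^{-1} * sum_m |a_lm|^2 for l = 0..len(cl) - 1."""
    chat = []
    for l, c in enumerate(cl):
        total = c * rng.gauss(0.0, 1.0) ** 2           # m = 0 term (real)
        for _ in range(l):                              # pairs (m, -m), m > 0
            re, im = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
            total += c * (re * re + im * im)            # |a_lm|^2 + |a_l,-m|^2
        chat.append(total / (2 * l + 1))
    return chat

def gamma_hat(spectrum, weights):
    """Weighted sum of (2l + 1) * spectrum[l]; weights[l] plays the role of
    b^2(l / B^j). With the true C_l this is the right-hand side of (9)."""
    return sum(w * (2 * l + 1) * c
               for l, (w, c) in enumerate(zip(weights, spectrum)))

def placeholder_weights(lmax, centre=12.0):
    """Tent-shaped stand-in for b^2(l / B^j) with B = 2 and B^j = centre
    (an assumption made only to keep the example short)."""
    return [0.0] + [max(0.0, 1.0 - abs(math.log2(l / centre)))
                    for l in range(1, lmax + 1)]

def monte_carlo_check(n_sims=1000, lmax=24, seed=0):
    """Return (Monte Carlo mean of the estimator, its expectation from (9))."""
    rng = random.Random(seed)
    cl = [0.0] + [1.0 / (l * (l + 1)) for l in range(1, lmax + 1)]
    w = placeholder_weights(lmax)
    expected = gamma_hat(cl, w)
    mean = sum(gamma_hat(sample_spectrum(cl, rng), w)
               for _ in range(n_sims)) / n_sims
    return mean, expected

if __name__ == "__main__":
    print(monte_carlo_check())
```

The two returned numbers should agree up to Monte Carlo error. The sketch does not address masked skies, which is where the needlet approach is argued to be most useful.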

Equation (9) shows that \hat{Γ}_{j} provides an unbiased estimator for a smoothed version of the angular power spectrum; the advantage with respect to the standard procedure is that not only unbiasedness, but even uncorrelation over different scales is asymptotically preserved in the presence of missing observations, which makes the implementation of confidence intervals and testing procedures viable [see again Baldi et al. (2006) for details]. Also, even in the presence of a masked region, the summands {\hat{β}_{jk}} are still asymptotically independent (over k) as j → ∞, whereas we have seen in (7) that this is not the case for the random coefficients {a_{lm}}. The price for such robustness is the smoothing: in the presence of missing observations it turns out to be unfeasible to estimate each angular power spectrum mode C_{l} by itself, and one must settle for a slightly less ambitious goal, namely the estimation of joint values averaged over some subset of frequencies (chosen by the data analyst). There is, of course, a standard trade-off in the choice of the bandwidth parameter B: values closer to unity entail a much better resolution, but this brings about worse localization properties on the sphere and therefore a possibly higher contamination from spurious observations; on the other hand, higher values of B yield more robust, but less informative estimates.

Spherical wavelets in general, and needlets in particular, allow for many statistical applications, which go much beyond angular power spectrum estimation. One example is the analysis of cross-correlation between CMB and Large Scale Structure (LSS) maps; this is a key prediction of many cosmological models entailing some form of dark energy and has been implemented on real data by Pietrobon, Balbi and Marinucci (2006). Other applications may include testing for non-Gaussianity and isotropy, bootstrap/subsampling evaluation of confidence intervals for CMB statistics [Baldi et al. (2007)], component separation and many others. Given such a wide array of applications, we stress the need for a more careful analysis of their theoretical underpinnings, with special reference to the effect of the Gaussianity assumption on our conclusions. This and many other related issues are left as topics for further research.

3.4. Parameter estimation. In this paper we shall neglect almost completely another crucial issue in CMB data analysis, which is very tightly coupled to the estimation of the angular power spectrum, that is, cosmological parameter estimation. More precisely, the theoretical angular power spectrum can be written as a function of a number of cosmological parameters, such as the baryon, matter and dark energy densities Ω_{b}, Ω_{m}, Ω_{Λ}, the optical depth τ, the spectral index n_{s}, the Hubble constant H_{0} and others; of course, the number of parameters to be estimated varies across different cosmological models, typically ranging from 6 to 16; see again Dodelson (2003) for more details. There are no known closed-form expressions yielding the theoretical angular power spectrum C_{l} as an explicit function of these parameters (which we write for brevity as ϑ); however, there are very fast numerical routines which solve the associated partial differential equations and provide C_{l} as an output once a specific value of ϑ has been supplied [see Seljak and Zaldarriaga (1996)].

Once the set of estimated values \hat{C}_{l} has also been derived, there are basically two approaches that have been implemented to obtain estimates for the set of parameters: some form of minimum distance estimation, where the parameters are calibrated to minimize a weighted distance between C_{l}(ϑ) and \hat{C}_{l}, and approximate maximum likelihood methods, where suitable approximations to the likelihood function are derived and then maximized. In practice, both methods rely heavily on numerical techniques (especially MCMC), and a lively debate is growing on the construction of the most efficient algorithms. Likewise, an extensive discussion is growing on the construction of confidence intervals for the parameters, where fundamental issues such as the differences between Bayesian and frequentist viewpoints are often called upon (the distinction between these two approaches is not perceived in the cosmological community in the same manner as in the statistical one; just to give an example, maximum likelihood estimates are nearly unanimously labeled a Bayesian procedure in the CMB literature).

For brevity’s sake, we are unable to go deeper into these issues, which are still quite far from a satisfactory solution. We refer, for instance, to Hamann and Wong (2008) and the references therein for more discussion and recent proposals in this area.
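To fix ideas on the minimum distance approach mentioned above, the following toy sketch may help. It is not one of the procedures used in the CMB literature: the one-parameter model C_{l}(ϑ) = ϑ/(l(l + 1)) is a purely hypothetical stand-in for the output of a numerical Boltzmann code, the weights are the inverse of the full-sky Gaussian variance 2C_{l}^{2}/(2l + 1) evaluated at an arbitrary reference value of ϑ, and the minimization is a plain golden-section search; all names are illustrative.

```python
import math

def model_spectrum(theta, lmax):
    """Hypothetical model C_l(theta) = theta / (l (l + 1)) for l = 2..lmax."""
    return {l: theta / (l * (l + 1)) for l in range(2, lmax + 1)}

def cosmic_variance_weights(lmax, theta_ref=1.0):
    """Inverse of Var(Chat_l) = 2 C_l^2 / (2l + 1), at a reference theta."""
    ref = model_spectrum(theta_ref, lmax)
    return {l: (2 * l + 1) / (2.0 * c * c) for l, c in ref.items()}

def weighted_distance(theta, chat, weights):
    """Weighted squared distance between C_l(theta) and the estimates Chat_l."""
    model = model_spectrum(theta, max(chat))
    return sum(weights[l] * (chat[l] - model[l]) ** 2 for l in chat)

def golden_section_min(f, lo, hi, tol=1e-10):
    """Minimize a unimodal function f on [lo, hi] by golden-section search."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - g * (b - a)
        d = a + g * (b - a)
        if f(c) < f(d):
            b = d          # minimum lies in [a, d]
        else:
            a = c          # minimum lies in [c, b]
    return 0.5 * (a + b)

def minimum_distance_estimate(chat, weights, lo=0.0, hi=10.0):
    """Value of theta in [lo, hi] minimizing the weighted distance."""
    return golden_section_min(
        lambda th: weighted_distance(th, chat, weights), lo, hi)

if __name__ == "__main__":
    chat = model_spectrum(2.5, 30)          # noise-free "data" for illustration
    w = cosmic_variance_weights(30)
    print(minimum_distance_estimate(chat, w))
```

In a realistic setting ϑ is multidimensional and each evaluation of C_{l}(ϑ) is costly, which is one reason why sampling-based techniques such as MCMC are preferred to a direct search of this kind.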

4. Testing for non-Gaussianity. Among the several statistical issues which arise in connection with the analysis of Cosmic Microwave Background radiation, a lot of attention has been drawn by non-Gaussianity tests. These tests have several motivations. The first is connected to the need for a statistical validation of the predictions of the so-called inflationary scenario, which is currently the leading candidate as a standard model for the Big Bang dynamics; see Dodelson (2003) for discussions and explanations. Under this label, there exists an enormous variety of different physical models, which in a vast majority of circumstances lead to expressions such as

T(x) = T_{G}(x) + f_{NL}{T_{G}^{2}(x) − ET_{G}^{2}(x)}, (10)

where T(x) denotes as before the CMB field, T_{G}(x) is an underlying Gaussian field, f_{NL} is a nonlinearity parameter and the units of measurement are such that the non-Gaussian part T_{G}^{2}(x) − ET_{G}^{2}(x) is of order 10^{−4} to 10^{−5} relative to T_{G}(x). Equation (10) should be viewed as a strong simplification, for several reasons: in particular, we are considering exclusively the primordial dynamics, thus neglecting later interactions through the gravitational potential; also, we are ruling out more complicated models, where higher order terms or multiple subordinating fields may be present; and, of course, we are neglecting a whole plethora of observational issues, where possible non-Gaussianities may be formed by secondary effects, such as the interactions of incoming photons at more recent epochs. Despite all these simplifying conditions, (10) does provide an extremely good guidance for features to be expected and, indeed, it makes up a benchmark model against which many procedures have been tested in the last few years. In particular, considerable attention has been drawn by the possibility to constrain the value of f_{NL}, as this depends on constants from fundamental physics [Bartolo et al. (2004)] and as such it allows one to probe many features of cosmological models.
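The size of the effect implied by (10) can be appreciated from a pointwise simulation. The sketch below is a simplification under stated assumptions: it ignores spatial correlation and treats T_{G} at a single location as a N(0, σ^{2}) variable, for which a direct moment computation gives skewness (6ε + 8ε^{3})/(1 + 2ε^{2})^{3/2} with ε = f_{NL}σ. The value ε = 0.05 used in the demonstration is deliberately exaggerated so that the effect is visible with a moderate sample size; function names are hypothetical.

```python
import math
import random

def simulate_local_model(n, sigma, f_nl, rng):
    """Draw n independent values of T = T_G + f_NL * (T_G^2 - E T_G^2),
    with T_G ~ N(0, sigma^2), as in model (10) at a single location."""
    out = []
    for _ in range(n):
        g = rng.gauss(0.0, sigma)
        out.append(g + f_nl * (g * g - sigma * sigma))
    return out

def sample_skewness(x):
    """Third central sample moment divided by the 3/2 power of the second."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    return m3 / m2 ** 1.5

def theoretical_skewness(sigma, f_nl):
    """Exact skewness of T under the pointwise Gaussian assumption."""
    eps = f_nl * sigma
    return (6.0 * eps + 8.0 * eps ** 3) / (1.0 + 2.0 * eps ** 2) ** 1.5

if __name__ == "__main__":
    rng = random.Random(1)
    sample = simulate_local_model(200000, sigma=1.0, f_nl=0.05, rng=rng)
    print(sample_skewness(sample), theoretical_skewness(1.0, 0.05))
```

For small ε the skewness is approximately 6ε, so the sampling error of the skewness (roughly of order n^{−1/2} for n independent values) has to be pushed well below that level before the non-Gaussian term can be detected; this is consistent with the difficulty of constraining f_{NL} discussed in the text.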

Among several statistical procedures which have been proposed in the literature, we shall focus on three main families, namely, tests based upon the bispectrum, tests based upon geometric features of Gaussian random fields (local curvature) and tests based upon spherical wavelets (in this case, so-called Spherical Mexican Hat Wavelets).

4.1. The angular bispectrum. It is obvious that, under Gaussianity, the sequence {a_{lm}}, m = 0, . . . , l, makes up an array of independent Gaussian random variables (complex-valued for m ≠ 0), so that a natural first option for a test of Gaussianity is to consider their sample skewness a_{l_{1}m_{1}}a_{l_{2}m_{2}}a_{l_{3}m_{3}} and check whether it is significantly different from zero. This simple idea becomes considerably more involved because of the need to impose rotational invariance on the sample coefficients. Such invariance can be imposed by demanding that the probability law of the CMB field be invariant with respect to the action of the rotation group. More formally, let g ∈ SO(3) be any element of the rotation group in R^{3}; the assumption of isotropy can then be written as

T(x) \stackrel{d}{=} T(gx) for all x ∈ S^{2},

whereas in terms of the spectral representation, we have

\sum_{l=0}^{\infty} \sum_{m=-l}^{l} a_{lm} Y_{lm}(x) \stackrel{d}{=} \sum_{l=0}^{\infty} \sum_{m=-l}^{l} a_{lm} Y_{lm}(gx). (11)

As explained, for instance, in Hu (2001) and Marinucci and Peccati (2007), from (11) it follows that the bispectrum of a rotationally invariant random field must take the form

Ea_{l_{1}m_{1}}a_{l_{2}m_{2}}a_{l_{3}m_{3}} = \begin{pmatrix} l_{1} & l_{2} & l_{3} \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} l_{1} & l_{2} & l_{3} \\ m_{1} & m_{2} & m_{3} \end{pmatrix} b_{l_{1}l_{2}l_{3}},

where b_{l_{1}l_{2}l_{3}} (the reduced bispectrum) conveys the physical information and does not depend on m_{1}, m_{2}, m_{3}. The Wigner 3j symbols appearing on the right-hand side are discussed in the Appendix; many more details can be found, for instance, in Varshalovich, Moskalev and Khersonskii (1988), Marinucci (2006) and Marinucci (2008), whereas generalizations to higher order cumulant spectra are described in Marinucci and Peccati (2008). A feasible, rotationally invariant estimator for the (normalized) bispectrum is provided by

I_{l_{1}l_{2}l_{3}} = (-1)^{(l_{1}+l_{2}+l_{3})/2} \sum_{m_{1}m_{2}m_{3}} \begin{pmatrix} l_{1} & l_{2} & l_{3} \\ m_{1} & m_{2} & m_{3} \end{pmatrix} \frac{a_{l_{1}m_{1}}a_{l_{2}m_{2}}a_{l_{3}m_{3}}}{\sqrt{C_{l_{1}}C_{l_{2}}C_{l_{3}}}}

and its studentized version is of course

\hat{I}_{l_{1}l_{2}l_{3}} = (-1)^{(l_{1}+l_{2}+l_{3})/2} \sum_{m_{1}m_{2}m_{3}} \begin{pmatrix} l_{1} & l_{2} & l_{3} \\ m_{1} & m_{2} & m_{3} \end{pmatrix} \frac{a_{l_{1}m_{1}}a_{l_{2}m_{2}}a_{l_{3}m_{3}}}{\sqrt{\hat{C}_{l_{1}}\hat{C}_{l_{2}}\hat{C}_{l_{3}}}}.
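A direct, unoptimized evaluation of the normalized bispectrum at low multipoles can be sketched as follows. The code is illustrative only: the Wigner 3j symbols are computed with the Racah formula in floating point (adequate for small l, not for the multipoles used on real data), the a_{lm} are simulated for a real Gaussian field with a_{l,−m} = (−1)^{m} times the conjugate of a_{lm}, and l_{1} + l_{2} + l_{3} is assumed even so that the phase factor is ±1 (for odd sums the 3j symbol with zero lower row vanishes). All names are hypothetical.

```python
import math
import random

def wigner_3j(l1, l2, l3, m1, m2, m3):
    """Wigner 3j symbol for integer arguments via the Racah formula."""
    if m1 + m2 + m3 != 0 or abs(m1) > l1 or abs(m2) > l2 or abs(m3) > l3:
        return 0.0
    if l3 < abs(l1 - l2) or l3 > l1 + l2:
        return 0.0
    f = math.factorial
    delta = (f(l1 + l2 - l3) * f(l1 - l2 + l3) * f(-l1 + l2 + l3)
             / f(l1 + l2 + l3 + 1))
    norm = math.sqrt(delta * f(l1 + m1) * f(l1 - m1) * f(l2 + m2)
                     * f(l2 - m2) * f(l3 + m3) * f(l3 - m3))
    t_min = max(0, l2 - l3 - m1, l1 - l3 + m2)
    t_max = min(l1 + l2 - l3, l1 - m1, l2 + m2)
    total = 0.0
    for t in range(t_min, t_max + 1):
        total += (-1) ** t / (f(t) * f(l3 - l2 + t + m1) * f(l3 - l1 + t - m2)
                              * f(l1 + l2 - l3 - t) * f(l1 - t - m1)
                              * f(l2 - t + m2))
    return (-1) ** (l1 - l2 - m3) * norm * total

def simulate_alm(l, c_l, rng):
    """Gaussian a_lm, m = -l..l, for a real field with E|a_lm|^2 = c_l."""
    a = {0: complex(rng.gauss(0.0, math.sqrt(c_l)), 0.0)}
    s = math.sqrt(c_l / 2.0)
    for m in range(1, l + 1):
        a[m] = complex(rng.gauss(0.0, s), rng.gauss(0.0, s))
        a[-m] = (-1) ** m * a[m].conjugate()
    return a

def bispectrum_estimate(alm1, alm2, alm3, l1, l2, l3, c1, c2, c3):
    """Normalized bispectrum; pass the true C_l for I, sample values for its
    studentized version. Use the same dict when multipoles coincide."""
    if (l1 + l2 + l3) % 2:
        raise ValueError("this sketch assumes l1 + l2 + l3 even")
    total = 0j
    for m1 in range(-l1, l1 + 1):
        for m2 in range(-l2, l2 + 1):
            m3 = -m1 - m2                    # 3j symbol vanishes otherwise
            if abs(m3) <= l3:
                total += (wigner_3j(l1, l2, l3, m1, m2, m3)
                          * alm1[m1] * alm2[m2] * alm3[m3])
    phase = (-1) ** ((l1 + l2 + l3) // 2)
    # The sum is real for a real field and even l1 + l2 + l3.
    return phase * total.real / math.sqrt(c1 * c2 * c3)

if __name__ == "__main__":
    rng = random.Random(0)
    a2, a4, a6 = (simulate_alm(l, 1.0, rng) for l in (2, 4, 6))
    print(bispectrum_estimate(a2, a4, a6, 2, 4, 6, 1.0, 1.0, 1.0))
```

For three distinct multipoles and Gaussian coefficients the statistic has mean zero and unit variance, by the orthonormality of the 3j symbols; a triple loop of this kind becomes impractical at high l, where more efficient evaluation schemes are needed.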

The sample bispectrum is discussed, for instance, by Hu (2001); asymptotic properties are provided by Marinucci (2006) and Marinucci (2008), where the phase factor (−1)^{(l_{1}+l_{2}+l_{3})/2} is also introduced. In particular, it can be shown that the sequences {I_{l_{1}l_{2}l_{3}}} and {\hat{I}_{l_{1}l_{2}l_{3}}} converge to Gaussian independent random variables in the high frequency limit where min(l_{1}, l_{2}, l_{3}) ↑ ∞. The limiting behavior of the bispectrum ordinates, however, is perhaps not the most significant instrument for the implementation of statistical procedures.

More precisely, it seems more promising to combine the different ordinates into a single statistic, by means of the integrated bispectra

J_{1L}(r) = \sum_{l_{2}=1}^{[Lr]} \left\{ \frac{1}{\sqrt{K}} \sum_{k=1}^{K} \hat{I}_{l_{1}+k, l_{2}, l_{2}} \right\}, J_{2L}(r) = \sum_{l=1}^{[Lr]} \hat{I}_{lll}.

Convergence to Brownian motion for both these statistics was established in Marinucci (2006). The underlying rationale can be briefly motivated as follows: in both cases we combine several different ordinates into a single functional statistic, capable of keeping track of the frequency location for possible deviations from Gaussianity. The different combinations of multipoles in J_{1L}(·), J_{2L}(·) correspond to the two well-known classes of squeezed and equilateral configurations, as discussed again by Marinucci (2006), Babich, Creminelli and Zaldarriaga (2004) and many others. It is also possible to provide some results on the asymptotic behavior of these statistics under non-Gaussian circumstances; in particular, results in Marinucci (2006) suggest that J_{1L} will provide consistent testing procedures (as L → ∞) under model (10), whereas tests based upon J_{2L} will have asymptotically negligible power, for all values of f_{NL}. These theoretical findings have been validated by Monte Carlo simulations in Cabella et al. (2006); the integrated bispectrum has also been shown to compare favorably with alternative statistical procedures in some internal statistical challenges within the Planck collaboration.

In Figure 3 we report the results obtained by implementing J_{1L}(r) on the data from the 2003 and 2007 WMAP data releases. We stress that the simulations are calibrated in a realistic experimental setting, that is, they take into account features such as the presence of noise and missing observations. More precisely, we used 1000 simulated maps of CMB signal plus noise; we took into account the modulation of noise on the maps given by the WMAP scanning strategy, the presence of a masked region to avoid the emission from the Milky Way and point sources, and the optical transfer function of the telescopes. To comply with the cosmological literature, the shaded region represents the 68% confidence interval (1σ) as evaluated by means of Monte Carlo simulations for various values of r ∈ [0, 1]; we fixed L = 500 because WMAP data allow a reliable coverage up to this multipole; see Bennett et al. (2003) and Hinshaw et al. (2008) for more technical details on the experiment (note that the x-axis reports rL). The dotted and dashed lines represent Monte Carlo expected values for our statistics with f_{NL} = ±100, . . . , 500, respectively. It is possible to check that the boundary value of f_{NL} to ensure detection is of the order of 200 or larger, that is, with a signal to noise ratio of the order of a few percentage points. This is indeed confirmed by a more detailed study in Cabella et al. (2006). Finally, triangles (2003 dataset) and squares (2007 dataset) represent the evaluation of the statistic on real data, on the basis of the previously mentioned WMAP releases. It is clear that the evidence for non-Gaussianity is rather weak and, indeed, the statistics get closer to zero as the observations increase. We must stress, however, that the level of non-Gaussianity favored by theorists is well below 100, and this is still consistent with observations at the current resolution. Note that the signal to noise ratio for the non-Gaussian signal is of the order of f_{NL}/10^{4}, so that these values are really difficult to detect.

Fig. 3. The behavior of J_{1L}(r) on WMAP data.

Very recently, Yadav and Wandelt (2007) claimed a detection of a nonzero f_{NL} (≃ 80) by means of a modified bispectrum estimator, which is constructed to take into account the presence of noise and missing observations while keeping computational costs at a feasible level. This proposal is indeed very interesting; the results, however, are quite close to the boundary level and as such they should probably be considered inconclusive. The general consensus in the community seems to be that new releases of data from more sophisticated experiments such as Planck, and possibly more efficient statistical procedures yet to be devised, will indeed be necessary to settle the question of the possible existence of non-Gaussianity in CMB. It should be stressed, in particular, that the bispectrum requires the evaluation of the inverse Fourier transform (3), and as such it is known to be severely affected by the presence of missing observations (there is some evidence that the detection level could reach f_{NL} ≃ 10 or lower for fully observed sky maps). Improving the performance of the bispectrum for partial sky coverage is a priority of current research in view of the forthcoming satellite data: for instance, in Lan and Marinucci (2008) the bispectrum approach is combined with the needlets construction described in the previous section. Rather than considering these further developments in the bispectrum literature, we move to other methods which have a local nature, and are thus expected to be more robust in the presence of missing data.