

Jingxiu Wang

National Astronomical Observatories, Chinese Academy of Sciences, Beijing, China

Received 2005 December 7; accepted 2006 June 27

ABSTRACT

The properties of the coronal mass ejections (CMEs) observed by the Large Angle and Spectrometric Coronagraph Experiment (LASCO) on the Solar and Heliospheric Observatory (SOHO) spacecraft are kept in an online catalog that has been widely used in a series of analyses of individual events and in statistical overviews of CME properties. A subject of some of these studies has been the problem of CME acceleration, in particular the differences between accelerated and decelerated events. The impact of measurement errors in the acceleration is an issue that has been mostly overlooked in such studies. We show in this paper how to obtain error estimates for the height measurements given in the catalog of CMEs observed by LASCO. We find that the error in CME leading-edge position measurements grows rather quickly in the first few solar radii, roughly with the square of the distance from Sun center, but becomes reasonably flat above 5 R☉, varying approximately as the square root of the distance. Above 5 R☉ the typical errors in acceleration are of the same order or larger than the accelerations computed from catalog data. Kinematic quantities computed above 5 R☉ for individual events will mostly reflect the distribution of the errors in the acceleration, and not the true acceleration experienced by the events.

Subject headings: Sun: activity — Sun: corona — Sun: coronal mass ejections (CMEs)

1. INTRODUCTION

The Large Angle and Spectrometric Coronagraph Experiment (LASCO; Brueckner et al. 1995) on board the Solar and Heliospheric Observatory (SOHO) space mission (Domingo et al. 1995) has observed many thousands of coronal mass ejections (CMEs) since it started its observations in 1996. The measured properties of all these events are kept in an online catalog, which we shall hereafter refer to as the LASCO catalog. The primary measurements made on each CME are the apparent central position angle, the angular width in the sky plane, and the height (distance from Sun center) as a function of time. The catalog also includes average apparent speeds and accelerations obtained through least-squares fitting of the height versus time measurements to first- and second-order polynomials. We refer to Yashiro et al. (2004) for detailed descriptions of the LASCO catalog, and for a description of CME statistical properties based on the catalog. The catalog has been widely used in analysis of many aspects of the nature of CMEs (Vourlidas et al. 2002; Moon et al. 2002; Gopalswamy et al. 2003; Yurchyshyn et al. 2005).

One question that has been mostly overlooked in studies with LASCO catalog data is the impact of measurement errors on the estimated quantities. The aim of this paper is to provide a way to estimate errors from the data in the CME catalog. The focus will be on the behavior of errors and their impact on accelerations at heights above 5 R☉, that is, the height ranges sampled by the C3 coronagraph on LASCO. Nonetheless, we also look into the error function for the LASCO C2 coronagraph. During the period that we consider here, from 1996 to 2004 July, there were more than 8000 height-time profiles available in the CME catalog, and we use information from more than 7000 of those events to infer the errors in CME height measurements.

In what follows we always assume constant acceleration. Although this approximation is likely not valid over the large distance range considered, in this paper we are essentially concerned with the detection of deviations from constant speed, and not with the exact form of the acceleration. Thus we adopt the simplest approach.

2. THE NEED FOR ERROR ESTIMATES IN CME HEIGHT MEASUREMENTS

Although errors in height of a few tenths of the solar radius are mostly irrelevant for things like average velocities, the impact on accelerations can be tremendous, even for simple issues like deciding whether an event is accelerated or decelerated. This happens because the contribution of the acceleration to the distance traveled scales with the square of the propagation time. A CME with average velocity V will thus have

$$\Delta R = \frac{a}{2}(\Delta t)^2 \approx \frac{a}{2}\left(\frac{L}{V}\right)^2, \qquad (1)$$

where ΔR is the contribution from the acceleration, a is the average acceleration, Δt is the time interval in which the CME is seen, and L is the length of the LASCO field of view (about 30 R☉).

An acceleration can be reliably measured if its contribution to the distance traveled by the CME exceeds the errors in the measurement of the CME position near the edge of the LASCO field of view. Let us suppose that height measurements at 30 R☉ are accurate at the 1% level, that is, the error is about 0.3 R☉.


Assuming this value as a representative error, one can then estimate the minimum acceleration that can be determined for a particular average velocity. The typical 3σ interval, which defines when a value can be considered positive or negative, will be around 11 m s⁻² for a 2000 km s⁻¹ CME and about 0.5 m s⁻² for CMEs with velocities of about 400 km s⁻¹. This represents a sort of best-case scenario, since many events are not seen above 20 R☉. Although this is a relatively simple estimate, it shows that errors in the acceleration values can be of the same order as the accelerations typically measured. The real situation will be more complicated, since the acceleration will likely not be constant, but may vary with height.
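The estimate above can be reproduced with a few lines of Python. This is only a sketch under stated assumptions: the function name, the nominal solar radius of 6.957 × 10⁵ km, and the default arguments are ours; the 30 R☉ field of view, the 0.3 R☉ height error, and the 3σ criterion are taken from the text.

```python
R_SUN_KM = 6.957e5  # nominal solar radius in km (assumed value for this sketch)


def min_detectable_acceleration(v_km_s, fov_rsun=30.0,
                                height_err_rsun=0.3, n_sigma=3.0):
    """n-sigma acceleration threshold in m/s^2 implied by equation (1).

    Setting Delta R equal to the height error and Delta t = L / V gives
    sigma_acc = 2 * height_error / (L / V)**2.
    """
    crossing_time_s = fov_rsun * R_SUN_KM / v_km_s       # Delta t = L / V
    height_err_m = height_err_rsun * R_SUN_KM * 1.0e3    # km -> m
    sigma_acc = 2.0 * height_err_m / crossing_time_s**2  # 1-sigma, m/s^2
    return n_sigma * sigma_acc


for v in (2000.0, 400.0):
    a_min = min_detectable_acceleration(v)
    print(f"V = {v:6.0f} km/s: 3-sigma threshold ~ {a_min:.2f} m/s^2")
```

Because the threshold scales as V², a fivefold change in velocity changes the minimum detectable acceleration by a factor of 25, which is the ratio between the two values quoted above.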

3. INTRINSIC CME PROPERTIES, INSTRUMENTAL EFFECTS, AND HUMAN FACTORS

In LASCO images the CME signature is mixed with K-corona structures, stars, planets, comets, and a stable background from stray light and the F corona. The contrast of the images near the edges of the field of view and near the occulter mask is also affected by effects such as vignetting. Added to all this we have subjective human factors that are difficult to quantify. In what follows we briefly describe some of the factors that affect the measurements.

3.1. Pixel Size

One thing that needs to be taken into consideration is that the measurements reported in the LASCO catalog are from two different coronagraphs, an inner C2 coronagraph and an outer C3 coronagraph. We will treat them separately, for they have different instrumental resolutions and correspond to different height ranges (with an overlap in the 4.3–6.5 R☉ range). It is important to note that the data used in the LASCO catalog are obtained from movies converted to half the original resolution, and that the pixel size in the C3 movies used in the LASCO catalog is typically 0.11 R☉. For C2 the pixel effects are about 5 times smaller. In what follows we will always refer to catalog pixels to emphasize that we are talking about rebinned data, which differs in resolution from the original data by a factor of 2. We note that the pixel size implies a systematic effect because of the fact that different observers may define the leading edge as starting in the pixel just before or just after where one sees the transition from the background corona. For a given event all positions can always be offset by an amount of ±1 catalog pixel.

This bias is an important factor when comparing C2 and C3 measurements for the same event. Besides the offset bias, which is likely the same for all measurements in a particular event, the position of the CME front in a given image cannot be determined with an accuracy better than half a pixel. This is a scatter contribution that marks the lower limit for the error in the measurements.

3.2. Image Cadence

One issue of obvious importance for error assessment is the number of measurements in each of the C2 and C3 coronagraphs.

Figure 1 shows the distribution of the number of measurements in the C2 field of view for slow CMEs, with velocities in the interval from 200 to 500 km s⁻¹, and for fast CMEs, with velocities in excess of 1000 km s⁻¹. As expected, the average number of measurements for the slow CMEs is about 3 times higher than the number of measurements for the fast CMEs, a value that corresponds roughly to the ratio of the average velocities in each group. Things are quite different in LASCO C3. Figure 2 shows the distribution of the number of measurements in the C3 field of view. There are two striking facts: (1) many slow CMEs are not seen in C3, and (2) the average number of images is about the same for fast and slow CMEs. The second fact is an indication that on average the slow CMEs only reach a height of about one-third of the LASCO C3 field of view.

3.3. CME Properties

Figure 3 shows the distribution of the heights corresponding to the last position reported in each event for four different velocity groups. As can be seen, the curves differ strikingly from the lowest to the highest velocity group. The average value for the 400–500 km s⁻¹ CMEs is 8 R☉ lower than the average value for the group of CMEs with velocities higher than 1000 km s⁻¹. Although we expect that some events might not reach the limit of LASCO C3 due to data gaps or, for example, the excess noise that is seen when a particle event reaches the spacecraft, for the majority of events the maximum height does indeed mean that the CME can no longer be detected in subsequent images.

There may be several causes for the maximum height versus velocity behavior of the CMEs in the catalog, one of them being the intrinsic brightness of the events. A detailed study of the phenomenon is outside the scope of this paper, which aims at establishing the error function using only the quantities available in the LASCO catalog. We note that the height of "disappearance" of CMEs from the field of view varies strongly with CME properties. We plot in Figure 4 the distribution of CME angular widths, as reported in the LASCO catalog, for different height ranges. As can be seen, the distribution of CME widths changes considerably with height. Narrow CMEs reach much lower heights on average. The broadest CMEs are much less affected. This analysis illustrates an important fact: CME average properties change considerably with the velocity and height ranges being considered.

Fig. 1. Number of measurements in the C2 field of view for events in the 200–500 km s⁻¹ velocity range and for events with velocities exceeding 1000 km s⁻¹. The differences between the two groups of CMEs are what is expected for similar time cadences.

3.4. Human Factors

The positions of the leading CME features are determined by human observers using sequences of running difference images.

In general, the transition from CME to background is sharp enough and there is not much ambiguity in determining the position. We decided to duplicate the measurements for a certain number of events to see if we could determine any systematic differences between the observers that collected the positions.

We tested the measurements of one of the observers (Yashiro) for a few events. In the outer C3 coronagraph we tested 30 measurements at about 10 R☉ against catalog values and found no significant systematic effect, the differences being randomly distributed. The procedure was repeated at 20 R☉ and again we found the same average values with systematic differences below the pixel size. In the inner C2 coronagraph, below 3 R☉, also for 30 measurements, we found a small systematic difference, at about the pixel size. Systematic differences at the level of a pixel are to be expected; the effect is more evident in C2 because the CME leading edge low in the corona is very sharp for many events and the uncertainty in deriving its position is quite small.

Another human effect present in the catalog data is what we will call oversampling. The leading-edge positions are measured using running-difference movies. For a human observer this means that there is a hidden interpolation effect, in the sense that consecutive measurements are not really independent. When the events become more diffuse and difficult to track there may be an involuntary tendency for the observer to place the estimate between what seems to be the position of the leading edge in the other images. The effect will be particularly important when the average separation between consecutive positions is close to the average errors in the measurements. This effect needs to be taken into consideration for very slow CMEs with particularly high temporal cadence.

Fig. 2. Number of measurements in the C3 field of view for events in the 200–500 km s⁻¹ velocity range and for events with velocities exceeding 1000 km s⁻¹. The two groups show very similar average values, a fact that suggests that many of the slow events are detected over a much lower height range than the fast CMEs.

Fig. 3. Height of the last point for each CME for four distinct velocity groups. In each plot the average of the height distribution is given. Note the strikingly different distributions for the low- and high-velocity CME groups. CMEs with velocities in excess of 1000 km s⁻¹ are, on average, seen up to a height that exceeds by 8 R☉ the maximum height at which CMEs with velocities in the 400–500 km s⁻¹ range are seen.

3.5. LASCO Signal-to-Noise Ratio and Height Measurement Errors

The quantity that ultimately limits the accuracy of identifying the position of the CME leading edge is the signal-to-noise ratio in LASCO images. This is a function of the CME intrinsic brightness and how that brightness evolves through the LASCO field of view, and also of the background corona and of the instrumental response. We do not try to infer the error function for CME measurements from this quantity; we simply show qualitatively how the signal-to-noise ratio varies throughout the field of view of LASCO C3. The brightness in the LASCO cameras reflects the coronal contribution from the K and F coronas but also includes components that arise from instrumental aspects, such as stray light. The most insidious effect is vignetting, which scales the brightness according to the distance from the occulter mask used to block the Sun. A detailed account of the effects of stray light and vignetting in LASCO C2 can be found in Llebaria et al. (2004). These effects are also important in C3, but there is a region between 10 and 25 R☉ where they are not very pronounced. We illustrate the variations in noise fluctuations and CME intrinsic brightness variation for an event in this region.

Figure 5 shows the radial variation of the background intensity, background pixel fluctuations, and maximum in CME front- edge brightness for an event in the late hours of 1999 February 2.

The background and background fluctuations were computed along the equator at the west limb from a set of raw images (so-called level 0.5 data). From each image a median was computed from a strip of 5 pixels above and 5 pixels below the equator. The same region was used to compute the rms, after subtraction of a previous image. The values were then averaged over the set of five image/pre-event image pairs. Note that we used 10 images in total to ensure that the quantities were independent.

The CME had a typical three-part structure with a well-defined front edge, and we were able to determine rather accurately the leading-edge positions and the position of the maximum brightness at the same position angle. As seen in Figure 5, from about 7 to 23 R☉ from Sun center both the CME and the background noise follow a roughly power-law-like behavior. The CME brightness varies with distance from Sun center R as $R^{-2.5}$ and the background pixel noise as $R^{-1.2}$.
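As a rough illustration of what these two indices imply, the short sketch below combines them into a brightness-to-noise ratio. This step is our own arithmetic and is not part of the measurements: it assumes that both power laws hold exactly between the two heights, and the function name is hypothetical.

```python
def snr_drop_factor(r_in, r_out, cme_index=-2.5, noise_index=-1.2):
    """Factor by which the CME brightness-to-noise ratio falls
    between heights r_in and r_out (any common unit).

    With brightness ~ R**cme_index and noise ~ R**noise_index the ratio
    scales as R**(cme_index - noise_index), i.e. R**-1.3 by default.
    """
    return (r_out / r_in) ** (noise_index - cme_index)


# Heights in solar radii, spanning the range quoted for this event.
print(f"ratio falls by a factor of ~{snr_drop_factor(7.0, 23.0):.1f}")
```

For the quoted indices the ratio falls by a factor of between 4 and 5 from 7 to 23 R☉, which is qualitatively consistent with the CME brightness approaching the pixel noise near the outer edge of the C3 field of view.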

The CME front-edge width, defined as the distance between the maximum in intensity and the point where the CME reaches background values (the leading edge), varies in a self-similar way throughout this particular CME outward movement. When the leading edge is at about 7 R☉, the ratio between the two

Fig. 4. Distribution of CME angular widths. The gray area corresponds to CMEs that reach at least 12 R☉, while the black region corresponds to CMEs that reach at least 24 R☉. Top: Distribution for CMEs with velocities in the interval from 400 to 500 km s⁻¹. A large fraction of the narrower events disappears even before 12 R☉. Bottom: Distribution for CMEs with velocities in excess of 1000 km s⁻¹. Note that the highest velocity CMEs are still detected at 12 R☉, and that a large fraction of events is seen at heights greater than 24 R☉.

Fig. 5. Variation in brightness as a function of distance for different features in LASCO C3 images. The emission is dominated by the background due to stray light and F corona, with the CME showing a very strong decrease as the distance from Sun center increases. As the CME approaches the limit of the field of view of LASCO C3 the event brightness approaches the background pixel noise.

noise ratios. The human brain performs a series of filtering, integration, and clustering operations that need to be taken into consideration. A human observer is extremely efficient in sorting features out of extreme noise conditions, and the robustness of human inspection is very hard to duplicate in automated procedures. To clearly evaluate the biases resulting from human observations it would be necessary to provide human observers with realistic synthetic events that adequately reproduced CME variation in size and brightness and also the signal-to-noise behavior. The approach of Llebaria et al. (2004) was mostly intended to sort out the contributions of the K and F coronas, but would also be important for constructing reliable synthetic events for LASCO C3.

4. ERRORS AND LEAST-SQUARES FITTING

A usual way to estimate errors in applications using least-squares techniques is to make use of the residuals, that is, the differences between the original values and the values resulting from the fitting procedure. The rationale behind this approach is relatively straightforward, and the method is well known.

We briefly discuss it here, since there are some aspects of it that are important limiting factors in using the LASCO catalog data.

As an example, let us suppose that we have a set of n measurements of CME heights $r_i$ at different instants $t_i$, with errors following Gaussian distributions with standard deviations $\sigma_i$. Let us further suppose that we adjust those values to a certain height-time profile, depending on a number of parameters $n_p$, thus obtaining estimated positions $\hat{r}_i$ at the instants $t_i$.

The quantity involved in the minimization leading to the well-known least-squares method is what we shall call, following common usage, $\chi^2$:

$$\chi^2 = \sum_{i=1}^{n} \frac{(r_i - \hat{r}_i)^2}{\sigma_i^2}. \qquad (2)$$

For a sufficiently large number of points this quantity approaches the number of degrees of freedom, $n_f = n - n_p$. Hence, to evaluate the goodness of fit, it is usual to use the normalized quantity

$$S = \sqrt{\chi^2 / n_f}. \qquad (3)$$

We refer to this quantity as the scale factor. A good fit generally means having a scale factor close to 1. A scale factor much higher than 1 means either a poor fit or that the errors were somehow underestimated and should be corrected by a factor S (hence the name "scale factor").

In principle, one can use the quadratic deviations from the least-squares fit to infer the error values. The simplest case is

function for the scale factor has an average of 1 and a standard deviation of $1/(2 n_f)^{1/2}$. If we consider the typical significance levels of 2.5% to each side of the distribution in scale factors, the effect of the number of events becomes obvious. For 50 degrees of freedom, the scale factor range defined by this high-confidence interval is rather narrow, between 0.80 and 1.20, and as such, an error value inferred from the residuals will be rather accurate. On the other hand, at 8 degrees of freedom the range is between 0.5 and 1.5, meaning that one is prone to underestimate the errors by a factor of 2 or overestimate them by 50%. Although the overestimation limit is within acceptable usage, underestimating the errors is in general worse than overestimating, and 8 degrees of freedom (a factor of 2) probably marks the lower limit of what can be considered enough data points.
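The two intervals quoted above follow directly from a scale-factor standard deviation of $1/(2 n_f)^{1/2}$. The sketch below is a minimal illustration under stated assumptions: the function name is ours, and the 2.5% tails are approximated by two standard deviations of a Gaussian.

```python
import math


def scale_factor_interval(n_dof, n_sigma=2.0):
    """Approximate range of the scale factor S for n_dof degrees of freedom.

    Uses mean 1 and standard deviation 1 / sqrt(2 * n_dof); n_sigma = 2
    approximates the 2.5% tails on each side (an assumption of this sketch).
    """
    half_width = n_sigma / math.sqrt(2.0 * n_dof)
    return 1.0 - half_width, 1.0 + half_width


for n_dof in (50, 8):
    low, high = scale_factor_interval(n_dof)
    print(f"n_f = {n_dof:2d}: S between {low:.2f} and {high:.2f}")
```

A fit with 11 points and a second-order polynomial has 11 − 3 = 8 degrees of freedom, which is the case where the lower bound reaches 0.5.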

The relevance of this for the LASCO catalog is obvious when one considers the distribution in the number of measurements in the C2 and C3 fields of view. As shown in Figure 2, for the CMEs with projected velocities over 1000 km s⁻¹ there is only one event with more than 11 points in C3, which corresponds to 8 degrees of freedom for a second-order fit. The situation is not much better at lower velocities. Only about 10% of the events have more than 11 measurements in the C3 field of view. What the typical number of measurements means is that, in general, we cannot infer error values from individual events. Even for the best 10% of the events, the error estimates can easily be underestimated by a factor of 2. Hence, trying to infer the error distribution in the CME catalog measurements must be done in a statistical fashion.

5. STATISTICAL APPROACH

As discussed, most events have a rather small number of leading-edge measurements, and the residuals from linear or second-order fits cannot be used on a general basis. This is particularly true if we wish to split data in different distance ranges, since we expect an increase in the error values with distance as CMEs become fainter and more diffuse as they propagate outward. Hence, we combine information from a large fraction of the events in the LASCO catalog by using a three-point method, which relies on extrapolations or interpolations using positions and times of a pair of reference points to the time of a third target point. In what follows we refer to the target point as having time and distance coordinates $(t_a, r_a)$, and to the pair of reference points as having coordinates $(t_0, r_0)$ and $(t_1, r_1)$.

From the measurements at the two reference points we can compute the estimated position $\hat{r}_a$ at the time $t_a$ of the target point by using a simple linear interpolation,

$$\hat{r}_a = r_1\,\frac{t_a - t_0}{t_1 - t_0} + r_0\,\frac{t_1 - t_a}{t_1 - t_0}. \qquad (5)$$


For each group of three points we then define a quantity $\Delta r$, which we shall call the residual, as

$$\Delta r = r_a - \hat{r}_a. \qquad (6)$$

Assuming that the three points are relatively close, the standard deviations that characterize their errors will be close to the value $\sigma_a$ at the reference point, and as such the standard deviation $\sigma_{\Delta r}$ for $\Delta r$ will be given by

$$\sigma_{\Delta r} = \sigma_a \sqrt{1 + \left(\frac{t_a - t_0}{t_1 - t_0}\right)^2 + \left(\frac{t_1 - t_a}{t_1 - t_0}\right)^2}. \qquad (7)$$

We note that if we have evenly spaced points in time, then the error distribution for $\Delta r$ will have a standard deviation of $\sigma_{\Delta r} = \sigma_a\sqrt{3/2}$ for an interpolation, and $\sigma_{\Delta r} = \sigma_a\sqrt{6}$ for an extrapolation. To put interpolations and extrapolations at the same level, and also to compensate for the fact that the data points are not evenly distributed, it is then better to use the weighted residuals $\delta_r$ given by

$$\delta_r = \Delta r \Bigg/ \sqrt{1 + \left(\frac{t_a - t_0}{t_1 - t_0}\right)^2 + \left(\frac{t_1 - t_a}{t_1 - t_0}\right)^2}. \qquad (8)$$

The weighted residuals will have a distribution with a standard deviation equal to $\sigma_a$, both for interpolations and for extrapolations.
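The three-point method of equations (5) to (8) can be sketched in Python as follows. The function names are ours and the sample values are invented for illustration; only the formulas come from the text.

```python
import math


def _time_weights(t0, t1, ta):
    """Weights of r1 and r0 in the linear estimate of equation (5)."""
    w1 = (ta - t0) / (t1 - t0)
    w0 = (t1 - ta) / (t1 - t0)
    return w1, w0


def estimated_height(t0, r0, t1, r1, ta):
    """Height at time ta from the two reference points, equation (5)."""
    w1, w0 = _time_weights(t0, t1, ta)
    return r1 * w1 + r0 * w0


def residual_weight(t0, t1, ta):
    """Square-root factor shared by equations (7) and (8)."""
    w1, w0 = _time_weights(t0, t1, ta)
    return math.sqrt(1.0 + w1**2 + w0**2)


def weighted_residual(t0, r0, t1, r1, ta, ra):
    """Residual of equation (6) divided by the weight of equation (8)."""
    residual = ra - estimated_height(t0, r0, t1, r1, ta)
    return residual / residual_weight(t0, t1, ta)


# Evenly spaced times: target in the middle, then target after both references.
print(residual_weight(0.0, 2.0, 1.0) ** 2)  # interpolation: 3/2
print(residual_weight(0.0, 1.0, 2.0) ** 2)  # extrapolation: 6
# Hypothetical heights in solar radii at times in hours.
print(weighted_residual(0.0, 9.0, 2.0, 11.0, 1.0, 10.3))
```

The same functions handle interpolation and extrapolation, since the weights depend only on where the target time falls relative to the two reference times.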

5.1. Error Distribution for the Fast Events

We start by considering all the CMEs in the catalog with reported velocity higher than 1000 km s⁻¹ that are not "halo" events. From the height-time profile of each individual CME we extracted the positions and times of the measurements closest to 5, 10, and 20 R☉. We imposed a proximity limit of 1, 2, and 3 R☉, respectively. We then selected the points just before and after that particular measurement as the reference points.

As a quantitative indicator of the deviations in $\delta_r$ we use a robust estimator that is obtained by (1) ordering the values of $\delta_r$, (2) selecting the $\delta_r$ values in the ordered sample in positions corresponding to 16% and 84% of the total of events, and (3) dividing this interval by 2. For a nearly Gaussian distribution of values in $\delta_r$ this procedure will return a quantity that is a good proxy for the standard deviation. Although slightly less efficient than the usual formula for standard deviation, this estimator, which relies only on ordering, is very robust and is not affected by the magnitude of the outliers, only by their number.
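A minimal sketch of this order-based estimator is given below. The function name and the nearest-rank rule used to pick the 16% and 84% positions are our assumptions; the text does not specify how fractional positions are handled.

```python
def robust_sigma(values):
    """Half the distance between the 16% and 84% points of the ordered sample.

    The positions are chosen with a nearest-rank rule (an assumption of this
    sketch). For Gaussian data the result approximates the standard deviation.
    """
    ordered = sorted(values)
    last = len(ordered) - 1
    low = ordered[round(0.16 * last)]
    high = ordered[round(0.84 * last)]
    return 0.5 * (high - low)


# Invented sample: 0, 1, ..., 100.
clean = list(range(101))
# Same sample with the largest value replaced by an extreme outlier.
contaminated = clean[:-1] + [1.0e9]

print(robust_sigma(clean), robust_sigma(contaminated))
```

Both calls return the same value, which illustrates the point made above: the size of an outlier does not matter, only how many outliers there are.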

As shown in Figure 6, $\sigma_a$ is about 0.16 R☉ at a distance from Sun center of 5 R☉, increasing to about 0.30 R☉ at a distance from Sun center of 20 R☉. We note that (1) the average and median values for the residuals are essentially zero, (2) the $\sigma_a$ values are substantially larger than the 0.06 R☉ expected from the catalog pixel size for C3, and (3) the distributions are nearly symmetric. The $\sigma_a$ values vary by a factor of 2 for a variation of a factor of 4 in the distance, indicating that the behavior can be approximated as being proportional to the square root of the distance.
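The power-law index implied by the two end values quoted above can be checked with a one-line calculation. This is a two-point estimate made for illustration only, not a fit to the full distributions, and the function name is ours.

```python
import math


def power_law_index(r1, sigma1, r2, sigma2):
    """Index p such that sigma is proportional to r**p through two points."""
    return math.log(sigma2 / sigma1) / math.log(r2 / r1)


# Values quoted in the text: 0.16 R_sun at 5 R_sun and 0.30 R_sun at 20 R_sun.
index = power_law_index(5.0, 0.16, 20.0, 0.30)
print(f"two-point index ~ {index:.2f}")
```

The result lies a little below 0.5, so a square-root dependence is a reasonable approximation, while an exact doubling over a factor of 4 in distance would give an index of exactly 0.5.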

We note that although acceleration contributes to the differences, there are some aspects that suggest that acceleration is not an important factor at the short time intervals considered here. If the majority of the points shared a positive or negative acceleration, that would mean that the two-point interpolations would preferentially lie above or below the measurement of the target point. Since the residual distribution does not have a noticeable negative or positive tail, that is not the case. There could be a symmetrical distribution of accelerations, but if that were the dominating factor the acceleration would have to substantially increase in absolute value with the distance. Hence, the most likely explanation is that we are indeed seeing the distribution of the uncertainties in height measurements.

We will thus make the assumption that the error in the estimation of a CME leading edge is proportional to the square root of the distance from Sun center:

$$\frac{\sigma_i}{R_\odot} = \epsilon \sqrt{\frac{r_i}{R_\odot}}. \qquad (9)$$

Fig. 6. Distribution of weighted residuals as a function of distance. The values reported on top of the histogram are the sizes of the ±1σ intervals. Note the growth of the standard deviations by a factor very close to 2 for an increase of a factor of 4 in the distance.

A correction taking into consideration this evolution with distance, a more refined approximation than the one used to derive equation (8), can be made by using rather simple error propagation formulas, which leads to a rather useful and simple expression for $\epsilon$,

$$\epsilon = \Delta r\, R_\odot^{-1/2} \Bigg/ \sqrt{r_a + r_1\left(\frac{t_a - t_0}{t_1 - t_0}\right)^2 + r_0\left(\frac{t_1 - t_a}{t_1 - t_0}\right)^2}. \qquad (10)$$

That is, $\epsilon$ is the standard deviation that we would expect if we divide the value of $\sigma_{\Delta r}$ by the same term that is dividing $\Delta r$ in equation (10). We will refer to this quantity as the distance-normalized residuals.
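The distance-normalized residual and the square-root error law can be sketched as follows. This is an illustration under stated assumptions: heights are expressed in solar radii so that the solar-radius factor equals 1, the function names and the parameter name `eps` are ours, and the default value 0.072 is the one obtained in this section for the fast events.

```python
import math


def distance_normalized_residual(t0, r0, t1, r1, ta, ra):
    """Residual divided by the height-dependent term of equation (10).

    Heights must be in solar radii, so the R_sun**(-1/2) factor is 1.
    """
    w1 = (ta - t0) / (t1 - t0)
    w0 = (t1 - ta) / (t1 - t0)
    residual = ra - (r1 * w1 + r0 * w0)
    return residual / math.sqrt(ra + r1 * w1**2 + r0 * w0**2)


def height_error(r_rsun, eps=0.072):
    """Expected height error in solar radii under the square-root law."""
    return eps * math.sqrt(r_rsun)


for r in (5.0, 10.0, 20.0, 30.0):
    print(f"r = {r:4.0f} R_sun: expected error ~ {height_error(r):.2f} R_sun")
```

With `eps = 0.072` the expected error is about 0.16 R☉ at 5 R☉ and about 0.32 R☉ at 20 R☉, close to the values read from the weighted residuals.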

We note that the exact shape of the error distribution as a function of distance is not really well constrained; a linear behavior could also characterize it. The square root adopted has the advantage that a single parameter can easily describe the radial dependence seen in the errors. Also, as discussed in § 3.5, the noise and CME brightness both likely follow a power-law-like behavior in C3, and those are the factors that ultimately determine the accuracy of leading-edge determination. Hence we adopt the square root behavior as a convenient parameterization.

Figure 7 shows the distribution of distance-normalized residuals for the same sample of events used in Figure 6. The distributions at 5, 10, and 20 R☉ are essentially the same. They are nearly symmetric, with relatively few outliers, and imply that $\epsilon = 0.072 \pm 0.0023$ adequately describes the error behavior throughout most of the C3 field of view.

In Figure 8 we show the results that are obtained by using the last three measurements for each event, not for interpolation, but extrapolating at the time of the last measurement, that is, using the last measurement as the target point. The process is illustrated not only for CMEs with velocities in excess of 1000 km s⁻¹, but also for CMEs with velocities in the 600–700 and 800–900 km s⁻¹ ranges. For all velocity groups we retrieve essentially the same value of $\epsilon$. Averaging the values from the three velocity ranges we obtain $\epsilon = 0.074 \pm 0.0020$, in good agreement with the result from the interpolation, and meaning that the last reported position for a given CME is not different from the others in terms of expected error behavior.

Fig. 7. Distribution of CME distance-normalized residuals in CME position interpolations for CMEs with velocities in excess of 1000 km s⁻¹ as a function of height. The distributions are essentially the same at the three reported distances. See text for details.

Fig. 8. Distribution of CME distance-normalized residuals as a function of velocity for extrapolations involving the last three measurements. The distributions are essentially the same at the three reported velocities. See text for details.

5.2. Halo Events

Earth and anti-Earth directed CMEs, the so-called halo events, have rather diffuse edges. Since they account for a rather large fraction of the fast events, we have decided to treat them separately. There were 287 halo events from 1996 to 2004 July. The results for halos are basically the same as the results obtained for the limb events, and we show only the results of the extrapolations to the time of the last measurement. Figure 9 shows the distribution of the weighted residuals for the last three measurements in the C3 field of view for each halo. The value $\epsilon = 0.072 \pm 0.0042$ is clearly in agreement with the results of the fast events in the C3 field of view.

5.3. The Slow Events

Most of the events in the CME catalog are relatively slow events. As of 2004 July there were 2453 nonhalo events with average velocities in the 200–500 km s⁻¹ range. The results are very similar to the ones obtained for the fast events. Figure 10 (top) shows the distribution of distance-normalized residuals for three-point extrapolations, where we use the last three measurements in the C3 field of view for each event. The value $\epsilon = 0.060 \pm 0.0011$ is slightly smaller than the value obtained for the faster events, but the difference is relatively minor if one thinks in terms of C3 catalog pixel size. The difference between the error estimates at 30 R☉ for the fast and slow CME samples is about 0.5 C3 catalog pixels. In fact, all possible subplots for the sample of slow events, such as narrow versus broad events, or measurements in the vicinity of 10 R☉ versus points in the vicinity of 20 R☉, do not deviate from the behavior found for the fast events by more than 0.5 C3 catalog pixels. Overall, the entire catalog seems to follow the same error distribution, something remarkable given the very disparate morphologies and height ranges of the CMEs in the LASCO catalog.

5.4. Comparing C2 with C3

Figure 10 (bottom) shows the results obtained by selecting only the events for which there is a C3 measurement between two measurements in the C2 coronagraph and using the two C2 points as the reference points and the C3 point as the target point. For an interpolation the error value depends mostly on the error in the target point, and the fact that C2 points likely do not follow equation (9) will not affect the expected results much.

The distribution of distance-normalized residuals is nearly symmetric and averages zero, and the value $\epsilon = 0.062 \pm 0.0010$ agrees with the one obtained for the C3 extrapolations to the last measurement. Hence we verify that there is no significant offset bias in going from C2 to C3, and that the last and the first measurements in C3 for slow events follow a radial behavior that can be satisfactorily approximated by a variation proportional to the square root of the distance from Sun center.

5.5. Errors in the C2 Coronagraph

Many of the slow CME events have at least three measurements in the C2 field of view, and as such this group of events can be used to determine the error function in the C2 field of view.

The procedure for C2 is somewhat more uncertain than the one for C3, since there is some overlap in the measurements of the reference points in the different intervals. The goal in this section is not to obtain an accurate description of the error behavior in C2 but to illustrate the large differences in accuracy in the height ranges sampled by C2 and C3.

Fig. 9. Distribution of halo CME distance-normalized residuals in extrapolations involving the three last measurements. See text for details.

Fig. 10. Distribution of CME distance-normalized residuals for CMEs with velocities in the 200–500 km s$^{-1}$ interval. Top: Distribution for extrapolations involving the three last measurements. Bottom: Distribution for C3 measurements that lie between two C2 data points. See text for details.


The number of slow events is large enough to allow binning of the results according to the height of the target point. Thus, we considered the first three points in C2, used two-point interpolations, and then binned the results according to the value of the target point. The residuals corresponding to measured-minus-estimated values were corrected by the factor given by equation (7), since we do not know if the lower C2 measurements will follow the same error behavior as C3 points. This correction assumes that the error is roughly the same for the three points we are considering, which is a safe approximation for an interpolation.
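As a rough illustration of the two-point interpolation and residual correction described above, the sketch below predicts the target height from its two reference points, assuming constant speed between them, and rescales the measured-minus-estimated residual. Equation (7) is not reproduced in this section, so the correction used here is an assumption: standard propagation of equal, independent errors on the three points. All function names are hypothetical.

```python
import math

def interpolation_residual(t_ref1, h_ref1, t_ref2, h_ref2, t_target, h_target):
    """Return (measured - estimated height, fractional position of the target).

    The estimate assumes constant speed between the two reference points.
    """
    frac = (t_target - t_ref1) / (t_ref2 - t_ref1)
    h_estimated = (1.0 - frac) * h_ref1 + frac * h_ref2
    return h_target - h_estimated, frac

def correction_factor(frac):
    """Assumed correction: if all three points carry the same independent
    error sigma, the residual has standard deviation
    sigma * sqrt(1 + (1 - frac)^2 + frac^2)."""
    return math.sqrt(1.0 + (1.0 - frac) ** 2 + frac ** 2)

def corrected_residual(t_ref1, h_ref1, t_ref2, h_ref2, t_target, h_target):
    """Residual rescaled so that its scatter estimates the single-point error."""
    residual, frac = interpolation_residual(
        t_ref1, h_ref1, t_ref2, h_ref2, t_target, h_target
    )
    return residual / correction_factor(frac)

# Synthetic example (times in hours, heights in solar radii), not catalog data.
example = corrected_residual(0.0, 2.5, 1.0, 4.5, 0.5, 3.6)
```

For a target point midway between the reference points the assumed factor is the square root of 1.5, so the raw residual overestimates the single-point error by about 22%.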

We then computed the standard deviation of the corrected residuals, $\sigma_r$, in each distance bin. Figure 11 shows the distribution of errors for C2 and compares it with a similar distribution for C3 points as a function of distance. As can be seen, the uncertainty around 3.5 $R_\odot$ is very low, but it increases very quickly as one gets closer to the end of the C2 field of view, where the deviations approach those seen in C3. Unfortunately, the same procedure cannot be applied to faster events, since both the number of events and the number of measurements are too low to yield meaningful quantities.
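The binning step can be sketched as follows. This is a minimal illustration, not the procedure actually used for Figure 11: the bin width of 0.5 solar radii and the function name are assumptions, and bins with fewer than two residuals are skipped because a standard deviation cannot be computed for them.

```python
import math
import statistics
from collections import defaultdict

def error_by_distance_bin(target_heights, corrected_residuals, bin_width=0.5):
    """Standard deviation of corrected residuals per distance bin.

    Heights and residuals are in solar radii. Returns a dict mapping
    the bin center to the sample standard deviation in that bin.
    """
    bins = defaultdict(list)
    for height, residual in zip(target_heights, corrected_residuals):
        bins[math.floor(height / bin_width)].append(residual)

    errors = {}
    for index in sorted(bins):
        if len(bins[index]) >= 2:  # stdev needs at least two values
            center = (index + 0.5) * bin_width
            errors[center] = statistics.stdev(bins[index])
    return errors

# Synthetic residuals, not catalog data.
heights = [3.1, 3.2, 3.4, 5.6, 5.7, 5.9, 4.2]
residuals = [0.01, -0.01, 0.0, 0.1, -0.1, 0.0, 0.05]
errors = error_by_distance_bin(heights, residuals)
```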

5.6. Summary of the Three-Point Approach

The error in the C3 field of view depends on the distance from the Sun center. The variation is relatively flat and can be approximated as a square-root law. In C2 the variation is much more pronounced, with a behavior that can be approximated by a power law with an index of 2. For both fast and slow events the following is an adequate description of the errors for the measurements in the CME catalog:

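$$\frac{\sigma(r)}{R_\odot} = 0.065\left(\frac{r}{R_\odot}\right)^{2} \quad \text{for C2}, \tag{11}$$

$$\frac{\sigma(r)}{R_\odot} = 0.07\sqrt{\frac{r}{R_\odot}} \quad \text{for C3}. \tag{12}$$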
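As a quick numerical check of how flat the square-root law quoted for C3 is, one can evaluate it at the two ends of the 5–30 $R_\odot$ range. Here $\sigma(r)$ denotes the height error, the coefficient 0.07 is the one quoted for C3, and the rounded values are our own arithmetic rather than numbers taken from the catalog:

```latex
\begin{align*}
\sigma(5\,R_\odot)  &= 0.07\sqrt{5}\;R_\odot  \approx 0.16\,R_\odot, \\
\sigma(30\,R_\odot) &= 0.07\sqrt{30}\;R_\odot \approx 0.38\,R_\odot, \\
\frac{\sigma(30\,R_\odot)}{\sigma(5\,R_\odot)} &= \sqrt{6} \approx 2.45 .
\end{align*}
```

A sixfold increase in distance therefore raises the estimated error by a factor of only about 2.45.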

6. DISCUSSION

Having quantitative estimates for the errors in LASCO measurements, we are in a position to evaluate the prevalence of acceleration for events where such determinations are possible. We have computed the accelerations using a least-squares method and weighting using the errors in the measurements given by equation (12). We also computed an "average" velocity by doing a first-order fit, with no weighting of the points. In this section we highlight only one of the issues linked to acceleration determination; a detailed study of CME acceleration will be made in a separate paper.
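A minimal sketch of such a weighted second-order fit is given below. It is an illustration under stated assumptions rather than the code used for this paper: heights are in solar radii, times in hours, each point is weighted by the inverse square of the C3 error from equation (12), and the acceleration error is taken from the parameter covariance of the weighted normal equations. The function names, the nominal solar radius value, and the synthetic event are all assumptions.

```python
import math

R_SUN_M = 6.957e8  # nominal solar radius in meters (assumed conversion constant)

def sigma_c3(height_rsun):
    """Height error in solar radii from the square-root law of equation (12)."""
    return 0.07 * math.sqrt(height_rsun)

def invert_3x3(m):
    """Inverse of a 3x3 matrix via the adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]

def fit_acceleration(times_hr, heights_rsun):
    """Weighted fit of h = h0 + v*t + a*t^2/2; returns (a, sigma_a) in m s^-2."""
    t_mid = sum(times_hr) / len(times_hr)  # center times for better conditioning
    rows = [(1.0, t - t_mid, 0.5 * (t - t_mid) ** 2) for t in times_hr]
    weights = [1.0 / sigma_c3(h) ** 2 for h in heights_rsun]
    normal = [[sum(w * r[j] * r[k] for w, r in zip(weights, rows))
               for k in range(3)] for j in range(3)]
    rhs = [sum(w * r[j] * h for w, r, h in zip(weights, rows, heights_rsun))
           for j in range(3)]
    cov = invert_3x3(normal)  # parameter covariance for known measurement errors
    accel = sum(cov[2][k] * rhs[k] for k in range(3))
    accel_err = math.sqrt(cov[2][2])
    to_ms2 = R_SUN_M / 3600.0 ** 2  # solar radii per hour^2 -> m s^-2
    return accel * to_ms2, accel_err * to_ms2

# Synthetic constant-speed event (2 solar radii per hour), not catalog data.
times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
heights = [6.0 + 2.0 * t for t in times]
accel, accel_err = fit_acceleration(times, heights)
quality = abs(accel) / accel_err  # acceleration divided by its error
```

Because the measurement errors are treated as known, the covariance is not rescaled by the fit residuals; this is a design choice of the sketch and may differ from the treatment used for the catalog values.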

From the 7304 events in the LASCO catalog with velocity exceeding 200 km s$^{-1}$, we selected all events with at least four measurements in C3, and computed accelerations and errors in the accelerations using only C3 data. Only 59% of the events had the required four data points. In order to assess the quality of the measured accelerations we define a quality indicator as the absolute value of the acceleration divided by its error. A quality indicator of 3 corresponds to an event where the signal of the acceleration is highly significant. Figure 12 shows the quality of the measured accelerations as a function of velocity for a randomly chosen subsample of 25% of the events. This smaller sample is representative of the full sample and makes the scatter plot easier to analyze. The figure is quite clear: only a very small fraction of the events is above the 3 $\sigma$ threshold. Only 675 events have a quality indicator above 3. This corresponds to only 9% of the events with average velocity above 200 km s$^{-1}$, and to only 16% of the events with at least four measurements in C3.

Fig. 11. Error in CME height measurements as a function of distance for CMEs in the 200–500 km s$^{-1}$ velocity interval. Open circles correspond to LASCO C2 measurements, filled circles to LASCO C3 measurements. Note the much faster increase of the error with distance in LASCO C2.

Fig. 12. Acceleration vs. quality indicator, defined as the acceleration divided by its error, for a sample representative of the LASCO CME catalog. The dashed line marks the position of the events corresponding to a 3 $\sigma$ limit. Only for the events above this line is it possible to reliably estimate whether the event is accelerated or decelerated. The number of individual events in those conditions is very low, only about 16% of the sample shown here.
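The quality indicator lends itself to a simple classification rule, sketched below. The function name and the returned labels are hypothetical; the threshold of 3 follows the definition given above, and the acceleration and its error only need to share the same units.

```python
def classify_event(accel, accel_err, threshold=3.0):
    """Label an event from its fitted acceleration and acceleration error.

    Returns (label, quality), where quality = |accel| / accel_err.
    Events at or below the threshold are left undetermined.
    """
    quality = abs(accel) / accel_err
    if quality <= threshold:
        return "undetermined", quality
    label = "accelerated" if accel > 0 else "decelerated"
    return label, quality

# Illustrative values in m s^-2, not catalog data.
print(classify_event(12.0, 3.0))
print(classify_event(-2.0, 4.0))
```

An event labeled "undetermined" is not necessarily moving at constant speed; its acceleration is simply not distinguishable from zero given the error.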

For most individual CMEs one cannot say whether they are accelerated or decelerated, although the events with low quality indicators are still quite useful. The value of the error can be used to place limits on the absolute value of the acceleration, which for the overwhelming majority of CMEs above 5 $R_\odot$ will be below 10 m s$^{-2}$. On the other hand, a very low quality indicator also means that for practical purposes that particular event can be considered as having a constant velocity. The importance of the quality indicator relates more to studies aimed at splitting the CMEs into accelerated and decelerated subgroups. For most CMEs we simply cannot assign them to such categories. Nonetheless, average properties of CMEs can still be determined, since the acceleration errors due to measurement errors will average out for a sufficiently large sample.

One problem that generally cannot be solved with LASCO data is how the acceleration varies with distance above 5 $R_\odot$ for a given event. The number of events for which one could split the distance interval in two and still get a meaningful acceleration is very small: only 41 out of a sample of more than 7000 events. Even for those 41 events what we get is essentially only the sign of the acceleration in each distance interval.

7. CONCLUSION

Using a three-point approach, through the analysis of expected and observed positions in consecutive measurements of CME positions, we estimated the error function for the measurements in the LASCO catalog. The nature of human pattern recognition and effects such as oversampling mean that true errors are probably somewhat higher, but the subjective human factors cannot be quantified solely through an analysis of the data in the LASCO catalog. Equation (12) is in fact a consistency check of the measurements in the LASCO catalog. It reproduces the observed behavior of a sample of about 7000 CMEs and can be used as a proxy for the uncertainty in CME height measurements. A better approximation to the real error behavior in CME front measurements could be obtained, in principle, by knowing the signal-to-noise behavior for the C2 and C3 coronagraphs, with an approach similar to Llebaria et al. (2004), and the intrinsic brightness variations of a sample of representative events. In this situation reliable artificial events could be studied and human factors could be evaluated.

Even with the above limitations, the error function that we found, which takes into consideration the variation with distance from Sun center, allows one to estimate errors for CME accelerations. We verify that for the vast majority of CMEs observed by LASCO one cannot infer whether a given CME is accelerated or decelerated above 5 $R_\odot$. Only for less than 10% of all observed CMEs can one extract the sign of the acceleration in the 5–30 $R_\odot$ height range. The apparent smoothness found in the distribution of velocities for accelerated and decelerated CMEs (Yurchyshyn et al. 2005) can thus be explained, at least partially, by the fact that most of the information in the acceleration is obliterated by the large errors. Our results imply that any discussion on acceleration properties of CMEs above 5 $R_\odot$ needs to address the effects of the errors in the acceleration and include discussions on the biases associated with the different height ranges spanned by the CMEs as a function of width and distance.

Our results do not invalidate studies based on averages of a large sample of events. The negative correlation between speed and median acceleration found, for example, by Moon et al. (2002) is still valid. This is so because the errors in individual events should average to zero. Our approach would nonetheless allow for a refinement in the values of those studies by allowing computation of error bars in those estimates, and the weighting of each value according to its expected error. Since a detailed study of the changes in CME properties with height would also be of importance in such studies, we will not address it here but in a separate paper. We would also like to emphasize that this paper was only concerned with the acceleration above 4–5 $R_\odot$. Lower in the corona it is well known that large accelerations can occur (Zhang et al. 2004). Hence there is no necessary contradiction between our results and the ones from Moon et al. (2004), for example, since at least a few of their events show essentially all the relevant acceleration below 5 $R_\odot$.
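One way to weight each value according to its expected error is an inverse-variance weighted mean, sketched below. This is a swapped-in illustration, not the statistic used by Moon et al. (2002), who worked with medians; the function name is hypothetical and the per-event errors are assumed to be independent.

```python
import math

def weighted_mean_acceleration(accels, accel_errs):
    """Inverse-variance weighted mean of accelerations and its error.

    accels and accel_errs must share the same units (e.g., m s^-2).
    """
    weights = [1.0 / err ** 2 for err in accel_errs]
    total = sum(weights)
    mean = sum(w * a for w, a in zip(weights, accels)) / total
    return mean, math.sqrt(1.0 / total)

# Illustrative values, not catalog data: the poorly constrained
# second event contributes little to the average.
mean, mean_err = weighted_mean_acceleration([0.0, 10.0], [1.0, 3.0])
```

In this example the mean is pulled toward the well-measured event, and the error bar on the average is smaller than either individual error.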

Automated CME recognition procedures have been developed recently (Robbrecht & Berghmans 2004). These are based on the Hough transform, which detects ridges assuming constant velocity. At present one cannot use these procedures for estimating CME accelerations, but it is expected that new versions will include some measure of acceleration. As in the case of human observers, the automated procedures should be fully tested with realistic artificial events, for errors in height measurements are needed in order to reliably characterize the CME kinematic behavior.

The CME catalog is generated and maintained at the CDAW Data Center by NASA and the Catholic University of America in cooperation with the Naval Research Laboratory. SOHO is a project of international cooperation between ESA and NASA.

This work is supported by the National Basic Research Program of China (2006CB806300) and the National Science Foundation of China (10573025 and 40674081). D. Maia was partially supported by Fundação para a Ciência e Tecnologia, through the program POCTI/FNU/43776/2001 and through grant SFRH/BPD/5521/2001.

REFERENCES

Brueckner, G. E., et al. 1995, Sol. Phys., 162, 357

Domingo, V., Fleck, B., & Poland, A. I. 1995, Space Sci. Rev., 72, 81

Gopalswamy, N., Shimojo, M., Lu, W., Yashiro, S., Shibasaki, K., & Howard, R. A. 2003, ApJ, 586, 562

Llebaria, A., Lamy, P. L., & Bout, M. V. 2004, Proc. SPIE, 5171, 26

Moon, Y.-J., Cho, K. S., Smith, Z., Fry, C. D., Dryer, M., & Park, Y. D. 2004, ApJ, 615, 1011

Moon, Y.-J., Choe, G. S., Wang, H., Park, Y. D., Gopalswamy, N., Yang, G., & Yashiro, S. 2002, ApJ, 581, 694

Robbrecht, E., & Berghmans, D. 2004, A&A, 425, 1097

Vourlidas, A., Buzasi, D., Howard, R. A., & Esfandiari, E. 2002, in Proc. 10th European Solar Physics Meeting (ESA SP-506; Noordwijk: ESA), 91

Yashiro, S., Gopalswamy, N., Michalek, G., St. Cyr, O. C., Plunkett, S. P., Rich, N. B., & Howard, R. A. 2004, J. Geophys. Res., 109, A07105

Yurchyshyn, V., Yashiro, S., Abramenko, V., Wang, H., & Gopalswamy, N. 2005, ApJ, 619, 599

Zhang, J., Dere, K. P., Howard, R. A., & Vourlidas, A. 2004, ApJ, 604, 420