4.2 Data Selection, Organisation
4.2.2 Data Organisation
This section discusses how the raw data are organised to create the initial raw data files, and considers alternative binning methods to conventional mean-averaging. Each binning technique is explored in enough detail to establish which is best suited to maximising the amount of data available for analysis while still producing reliable and meaningful results.
4.2.2.1 Initial Raw Data File Creation
The initial datasets used in the research discussed in this thesis are the CIS moment and auxiliary (aux) datasets. The moment dataset provides the ion velocity vectors and a time-stamp, while the auxiliary dataset provides the position of each Cluster spacecraft at any given time in GSE coordinates, together with its own time-stamp. The two datasets are sampled at different time resolutions: the moment data at spin resolution (four seconds) and the aux data at one-minute resolution. To ensure that no data are lost at this early stage, an interpolation is carried out between consecutive position data-points, creating intermediate position values which can then be assigned to the appropriate velocity vectors. This preserves the four-second resolution rather than reducing it to one minute.
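The interpolation step can be illustrated with a minimal sketch. It assumes linear interpolation between consecutive position samples and time-stamps expressed in seconds; neither detail is stated above, and the function and argument names are hypothetical.

```python
import numpy as np

def interpolate_positions(aux_time, aux_pos, moment_time):
    """Interpolate one-minute GSE positions onto spin-resolution time-stamps.

    aux_time    : (N,) increasing aux time-stamps in seconds (assumed unit)
    aux_pos     : (N, 3) GSE x, y, z positions
    moment_time : (M,) moment time-stamps in seconds
    Returns an (M, 3) array with one position per velocity vector.
    """
    aux_time = np.asarray(aux_time, dtype=float)
    aux_pos = np.asarray(aux_pos, dtype=float)
    moment_time = np.asarray(moment_time, dtype=float)
    # Linear interpolation (an assumption), applied to each component separately.
    # np.interp holds the end values for times outside the aux range.
    return np.column_stack(
        [np.interp(moment_time, aux_time, aux_pos[:, k]) for k in range(3)]
    )

# Two aux samples one minute apart, three moment time-stamps between them.
pos = interpolate_positions(
    [0.0, 60.0],
    [[-10.0, 0.0, 0.0], [-10.6, 0.3, 0.0]],
    [0.0, 20.0, 60.0],
)
```

Each velocity vector is then paired with the row of `pos` that shares its time-stamp.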
Once all six years of data have been interpolated and the new position vectors generated, merged data files are created to store not only the velocity and position vectors, but also other useful information such as temperature and density values as well as the various data quality flags.
4.2.2.2 Spatial Binning
The raw data are averaged into discrete 3 RE × 3 RE × 3 RE spatial bins. The main reason for this bin size is that it was used in the study by Kissinger et al. (2012) of plasma flow in the magnetotail during SMC and substorm times. Using the same bin size allows direct comparison between that investigation and the results presented in this thesis. It also gives visual clarity when the data are plotted, as few vectors overlap. This binning criterion was used throughout the research presented in this thesis.
The data were also experimentally organised into 1 RE × 1 RE × 1 RE bins in an attempt to improve the spatial resolution. This was quickly decided against, as it was difficult to present the data in a clean and organised manner with so many vectors and counts printed on the various figures. In addition, using smaller bins drastically reduced the amount of data available per bin, roughly reducing the counts to one-ninth of those in the larger bins. As a result, the counts often fell below the nominated cutoff value of 100 counts.
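A minimal sketch of the spatial binning follows. It assumes that bin edges lie on integer multiples of the bin size and that bins holding fewer than 100 vectors are discarded; the function name and the returned layout are hypothetical.

```python
import numpy as np
from collections import defaultdict

BIN_SIZE_RE = 3.0  # bin edge length in Earth radii (from the text)
MIN_COUNT = 100    # count cutoff (from the text)

def spatial_bin_means(positions, velocities,
                      bin_size=BIN_SIZE_RE, min_count=MIN_COUNT):
    """Mean-average velocity vectors inside cubic spatial bins.

    Returns {(ix, iy, iz): (mean_velocity, count)} for bins that reach
    the count cutoff. Bin (ix, iy, iz) spans [ix*bin_size, (ix+1)*bin_size)
    along x, and likewise along y and z (an assumed edge alignment).
    """
    positions = np.asarray(positions, dtype=float)
    velocities = np.asarray(velocities, dtype=float)
    # floor keeps the binning consistent for negative coordinates
    idx = np.floor(positions / bin_size).astype(int)
    groups = defaultdict(list)
    for key, v in zip(map(tuple, idx.tolist()), velocities):
        groups[key].append(v)
    return {
        key: (np.mean(vs, axis=0), len(vs))
        for key, vs in groups.items()
        if len(vs) >= min_count
    }

# 100 vectors in one bin (half earthward, half tailward), 5 in another.
positions = [[-10.0, 1.0, 1.0]] * 100 + [[-2.0, 1.0, 1.0]] * 5
velocities = ([[100.0, 0.0, 0.0]] * 50 + [[-100.0, 0.0, 0.0]] * 50
              + [[10.0, 0.0, 0.0]] * 5)
binned = spatial_bin_means(positions, velocities)
```

In this example the sparsely sampled bin is dropped by the cutoff, and the well-sampled bin averages to zero because its earthward and tailward vectors cancel. This cancellation is the weakness of plain mean-averaging that the alternative binning methods try to address.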
This method of binning, like any other, has its drawbacks. For example, it does not show precisely when the data were collected during the experimental window, whether over multiple years or on a single orbit. It is also not dwell-time normalised, so it does not show how much useful data was actually collected relative to the amount of time the spacecraft spent in the corresponding bin. While these parameters are useful in determining the fine details of when the data were collected and how they correspond to specific events, this investigation is ultimately concerned with the average flow characteristics, and it was therefore decided that this level of detail was not necessary at this time.
4.2.2.3 Angle Average Binning
For this binning method, the velocity vectors contained within each spatial bin are subdivided into eight discrete angle bins which together cover 360°, as seen in Figure 4.5. Each angle bin covers 45°, with the zeroth bin centred on the positive x-direction and each subsequent bin numbered in a clockwise direction such that the positive y-direction lies in bin two. Each velocity vector is placed in one of the eight angle bins according to the angle it makes with the positive x-axis.
This method essentially reduces the amount of data cancellation that occurs when averaging vectors with components orientated in opposite directions. By using discrete angle bins, the method preserves the magnitude of the plasma flow velocity in each of the 45° bins and avoids the possibility of it cancelling down to zero, which would not give a good representation of the average plasma flow behaviour.
Figure 4.5: Angle averaging bin orientation with positive x directed towards the left of the page.
The main limitation of the angle averaging method is that it restricts the angular resolution to 45°. This can be improved by introducing a larger number of narrower angular bins, but doing so reduces the amount of data available in each bin and can skew the results if an anomalously large velocity vector is present and the data count per bin is too low. The other difficulty with this method is selecting which of the eight angular bins best represents the spread of data in each spatial bin. The options were to select the bin with the greatest velocity or the bin with the largest number of counts. The latter was chosen to avoid the skewing mentioned above; selecting by count picks the most common direction of plasma flow, which gives the best representation of the overall behaviour of the system. This method was not used in the final study. The binning method used throughout the research conducted in this thesis is described in section 4.2.2.5.
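The angle-bin assignment and the count-based selection described in this section can be sketched as follows. The sketch assumes that angles are measured in the x-y plane from the positive x-axis towards the positive y-axis (which places positive y in bin two) and that the flow magnitude in a bin is the mean speed of its vectors in that plane; both are illustrative assumptions, and the function names are hypothetical.

```python
import numpy as np

N_ANGLE_BINS = 8
BIN_WIDTH_DEG = 360.0 / N_ANGLE_BINS  # 45 degrees per bin

def angle_bin_index(vx, vy):
    """Angle-bin number (0-7); bin 0 is centred on +x, bin 2 on +y."""
    vx = np.asarray(vx, dtype=float)
    vy = np.asarray(vy, dtype=float)
    angle = np.degrees(np.arctan2(vy, vx)) % 360.0
    # Adding 0.5 before flooring centres each bin on a multiple of 45 degrees.
    return np.floor(angle / BIN_WIDTH_DEG + 0.5).astype(int) % N_ANGLE_BINS

def dominant_angle_bin(velocities):
    """Return (bin number, count, mean speed) for the most populated angle bin."""
    v = np.asarray(velocities, dtype=float)
    bins = angle_bin_index(v[:, 0], v[:, 1])
    counts = np.bincount(bins, minlength=N_ANGLE_BINS)
    best = int(np.argmax(counts))  # ties resolve to the lowest bin number
    speeds = np.hypot(v[:, 0], v[:, 1])
    return best, int(counts[best]), float(speeds[bins == best].mean())

# Three modest earthward vectors and one anomalously fast tailward vector.
sample = [[100.0, 0.0, 0.0], [120.0, 5.0, 0.0],
          [80.0, -5.0, 0.0], [-500.0, 0.0, 0.0]]
best_bin, best_count, best_speed = dominant_angle_bin(sample)
```

Selecting by count returns bin zero for this sample, so the single fast tailward vector does not dominate the result as it would under selection by greatest velocity.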
4.2.2.4 Hemispherical Data Overlap
The offset of the Earth’s rotation axis from the z-axis of the GSE coordinate system gives rise to some effects which need to be assessed. As the Earth rotates, an apparent rocking motion of the Earth’s magnetotail occurs as a result of fixing the x-axis of the GSE and GSM coordinate systems along the Earth-Sun line. This motion can be seen in Figure 4.6. The red dipole axis and magnetic field lines illustrate how the magnetotail could be orientated about the x-axis of either coordinate system. The black dipole axis and magnetic field lines illustrate how the system could be orientated 12 hours later. The difference between the two orientations can clearly be substantial.
Figure 4.6: Effect on the Earth’s magnetotail orientation caused by its daily rotation. The red and black dipole axes and magnetic field lines show an example of how the magnetotail could appear at 12-hour intervals. Adapted from two images, timeanddate.com (1995-2017) and Narinder (2016).
Figure 4.7: Seasonal effect on the Earth’s magnetotail caused by the tilt of its polar axis and its position relative to the Sun. Adapted from two images, timeanddate.com (1995-2017) and Narinder (2016).
A similar effect occurs on a seasonal timescale as the Earth orbits the Sun, as shown in Figure 4.7. The figure shows that, depending on where the Earth is relative to the Sun, the magnetotail can appear to be situated on either side of the Earth-Sun line.
The apparent rocking motion of the magnetotail can contribute to cross-contamination of plasma flows in the northern and southern hemispheres if the selection criteria are based solely on GSE/GSM coordinates. This could lead to skewed average flow directions if vectors are placed in the incorrect bins on account of this effect. Another method of binning the data is therefore to organise it according to the Bx value associated with each moment vector. If a vector’s Bx value is negative, it is situated in the southern hemisphere, and if Bx is positive, it is in the northern hemisphere. Combining this binning technique with the angle averaging technique would both maintain the magnitude of the velocity and reduce the hemispherical contamination of the vectors, allowing a clearer representation of the average plasma flow in the Earth’s magnetotail. This method was also not used in the final study because a better averaging method was found, as described below.
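As a brief illustration, the hemispherical selection amounts to a sign test on Bx. The sketch below assumes that each velocity vector already has an associated Bx value and that vectors with Bx exactly zero are left unassigned, a case the text does not cover; the function name is hypothetical.

```python
import numpy as np

def split_by_hemisphere(bx, velocities):
    """Split velocity vectors into (northern, southern) sets by the sign of Bx."""
    bx = np.asarray(bx, dtype=float)
    v = np.asarray(velocities, dtype=float)
    north = v[bx > 0]  # Bx positive: northern hemisphere
    south = v[bx < 0]  # Bx negative: southern hemisphere
    return north, south

bx_values = [5.0, -3.0, 0.0, 2.0]
vectors = [[100.0, 0.0, 0.0], [-50.0, 10.0, 0.0],
           [20.0, 0.0, 0.0], [80.0, -5.0, 0.0]]
north, south = split_by_hemisphere(bx_values, vectors)
```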
4.2.2.5 Earthward and Tailward Plasma Flow
An alternative method of data binning is to bin the data according to the prevailing x-direction of the plasma flow. Vectors are classed as earthward if they have a positive Vx value and tailward if they have a negative Vx value. The dataset is organised according to the sign of each vector’s Vx within each spatial bin, and the vectors in each group are then mean-averaged together. The mean-averaging is carried out by summing the vector components in each of the x, y and z-directions separately, and then dividing each sum by the number of data-points contributing to it. Finally, the three averaged values are combined trigonometrically to provide the prevailing vector magnitude and direction for both earthward and tailward flows in every spatial bin. This method does not require the use of average-angle-binning (section 4.2.2.3), as there is no risk of the moment vectors cancelling to zero.
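The averaging procedure for a single spatial bin can be sketched as follows. The sketch assumes that vectors with Vx exactly zero are left out of both groups and that the direction is reported as an angle in the x-y plane; neither detail is specified in the text, and the function name and dictionary keys are hypothetical.

```python
import numpy as np

def mean_flow_by_vx_sign(velocities):
    """Mean-average the earthward (Vx > 0) and tailward (Vx < 0) vectors separately."""
    v = np.asarray(velocities, dtype=float)
    result = {}
    for label, mask in (("earthward", v[:, 0] > 0), ("tailward", v[:, 0] < 0)):
        n = int(mask.sum())
        if n == 0:
            result[label] = None  # no vectors of this class in the bin
            continue
        # Sum each component separately, then divide by the number of points.
        mean_vec = v[mask].sum(axis=0) / n
        result[label] = {
            "mean": mean_vec,
            "speed": float(np.linalg.norm(mean_vec)),
            "angle_xy_deg": float(np.degrees(np.arctan2(mean_vec[1], mean_vec[0]))),
            "count": n,
        }
    return result

bin_vectors = [[300.0, 40.0, 0.0], [100.0, 0.0, 0.0],
               [-200.0, 0.0, 30.0], [-100.0, 0.0, -30.0]]
flows = mean_flow_by_vx_sign(bin_vectors)
```

For this sample the earthward mean is (200, 20, 0) and the tailward mean is (-150, 0, 0). A single mean over all four vectors would instead give (25, 10, 0), which represents neither flow.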
Binning the moment data according to the sign of Vx was chosen as the main averaging technique throughout the research presented in this thesis because it avoids the cancellation of oppositely directed earthward and tailward vectors. Because the vectors are not placed into discrete angular bins, the method also offers much higher directional precision than the average-angle-binning technique. The main drawback of the Vx method is that it is not used in conjunction with the hemispherical binning method. While it is possible to combine the two, doing so is impractical, as it would double the number of plots created and reduce the amount of data available in many of the bins. The reduction in data per bin would be most significant when analysing substorm data, for which relatively little data is available to begin with. As a result, while there will be some cross-contamination of moment vectors between the northern and southern hemispheres, it was decided that preserving the number of data points within each spatial bin was the higher priority. Overall, it was decided that binning the data according to earthward or tailward plasma flow direction would yield the best representation of the moment datasets.