Mining the SuperDARN Geospace Facility Database Using Spatio-Temporal Process Discovery Algorithms
Joseph B.H. Baker and J. Michael Ruohoniemi
Bradley Department of Electrical and Computer Engineering, Virginia Tech
Naren Ramakrishnan
Department of Computer Science, Virginia Tech
EarthCube Scientific Discovery
Scientists are not interested in data for data’s sake, but rather in extracting deeper
physical content from the measurements (e.g. wave phenomena) and examining
how content in one dataset relates to content residing in other datasets.
How do we steer EarthCube towards the goal of making data-based research
easier to conduct and more productive? Can we develop broadly applicable,
interactive tools to help users directly access the physical content of geoscience
datasets and incorporate this functionality into the structure of EarthCube? How
do we test the tools and EarthCube innovations generally?
Geoscientists need to know what is possible using advanced data mining
techniques, and computer scientists need to know more about what the
geoscientists are looking for in their data.
A Possible Data Mining Goal: To develop an interactive EarthCube data mining
interface that will lower the threshold for gaining access to the deepest physical
content residing in all geosciences datasets and thereby enhance scientific
productivity and provide new opportunities for outreach and education.
Use Case: SuperDARN
The Super Dual Auroral Radar Network (SuperDARN) is an international network of high-frequency (HF) radars for researching the Earth’s upper atmosphere, ionosphere and connection into geospace.
SuperDARN Radar Targets
[Figure labels: Ionospheric plasma irregularities; F-Region Ionosphere]
• HF radar signals are refracted in the ionosphere as they traverse gradients in electron density.
• The signals transmitted by SuperDARN radars can be backscattered by:
1) Ionospheric plasma irregularities (“half-hop” mode)
2) The ground (“one-hop” mode)
3) Meteor trails at mesosphere altitudes (near-range echoes)
The Geospace System
SuperDARN data provides a bottom-side view of the Geospace system: ionospheric electric fields, plasma structuring, winds, gravity waves, planetary waves, pulsations, etc.
SuperDARN Data Views
[Figure panels: radar field-of-view Doppler map (2-minute scan); single-beam range-time series (Power, Doppler, Spectral Width); multi-radar convection map; multi-radar key parameter index (i.e. cross-polar potential)]
SuperDARN Observables
SuperDARN data is widely used by geospace scientists to specify ionospheric electric fields over large spatial scales (item-1); however, other significant capabilities (items 2-13) are under-utilized because extensive interaction with a SuperDARN scientist is required.
Example: Gravity Waves
Gravity waves are buoyancy waves which propagate in the atmosphere and have similar characteristics to ocean waves. They are large-scale features that appear in multiple datasets.
At ionospheric altitudes, gravity waves produce periodic structures in ionospheric plasma density that appear in SuperDARN radar data as quasi-periodic enhancements in backscatter from the ground.
Gravity waves are ubiquitous features in SuperDARN data, but they tend to be overlapped with and obscured by other more dominant features.
Extracting gravity wave signatures from the SuperDARN data stream and quantifying their large-scale impact on the structure and dynamics of the ionosphere-thermosphere is a computational challenge.
[Figure: gravity wave signature in range-time data. Axes: Time, Range]
Mining SuperDARN Data
Feature Extraction using Spatial Aggregates
Interpolation by Data-Driven Surrogates
AGW Detection Algorithm
BASIC APPROACH: Use temporal “Motif Mining” to identify gravity waves as
recurrent events (i.e. “motifs”) with broadly similar (but not identical) features.
COMPLICATION: Motif mining is generally applied to simple time series data
(either univariate or multivariate) and so the basic framework had to be extended
to work on sparse two-dimensional time-series data.
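For orientation, classical motif discovery on a univariate series can be sketched as a brute-force search for the closest pair of z-normalised, non-overlapping windows. This is a generic textbook illustration, not the extended two-dimensional method described in this presentation; the function names, the window length `m`, and the synthetic series are all assumptions made for the example.

```python
import numpy as np

def znorm(x):
    """Z-normalise a window so that offset and amplitude are ignored."""
    s = x.std()
    return (x - x.mean()) / s if s > 0 else x - x.mean()

def best_motif(series, m):
    """Return (i, j, dist) for the closest pair of non-overlapping length-m windows."""
    n = len(series) - m + 1
    wins = np.array([znorm(series[i:i + m]) for i in range(n)])
    best = (None, None, np.inf)
    for i in range(n):
        for j in range(i + m, n):  # skip overlapping (trivial) matches
            d = np.linalg.norm(wins[i] - wins[j])
            if d < best[2]:
                best = (i, j, d)
    return best

# Synthetic series (assumption): noise with one pattern planted twice,
# the second copy at a different amplitude ("similar but not identical").
rng = np.random.default_rng(0)
series = rng.standard_normal(120)
pattern = 5.0 * np.sin(np.linspace(0.0, 2.0 * np.pi, 15))
series[20:35] = pattern
series[70:85] = 1.2 * pattern

i, j, d = best_motif(series, m=15)
print(i, j, d)
```

The search is quadratic in the series length, which is acceptable for a sketch but not for a continuous radar data stream.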
ALGORITHMIC STEPS:
Thresholding detects enhanced backscatter at each radar range
Cluster analysis identifies temporal hot-spots of backscatter at each range
Edge detection ties together the backscatter clusters over multiple ranges
Motif mining identifies transitions in the clustered data (i.e. wave fronts)
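The four steps above can be sketched on synthetic range-time data as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the per-gate median threshold, the gap-based clustering, the nearest-neighbour linking across adjacent gates (a simple stand-in for edge detection), and the length filter with front spacing (a stand-in for motif mining) are all hypothetical choices, as are every function name and parameter value.

```python
import numpy as np

def threshold(power, k_db=5.0):
    """Step 1: flag cells exceeding their gate's median power by k_db (NaN = no echo)."""
    med = np.nanmedian(power, axis=0, keepdims=True)
    with np.errstate(invalid="ignore"):
        return power > med + k_db  # comparisons with NaN are False

def cluster_times(mask_col, max_gap=2):
    """Step 2: group detection times in one gate; return the cluster centres."""
    t = np.flatnonzero(mask_col)
    if t.size == 0:
        return []
    splits = np.flatnonzero(np.diff(t) > max_gap) + 1
    return [float(c.mean()) for c in np.split(t, splits)]

def link_fronts(centers_by_gate, tol=3.0):
    """Step 3 (stand-in for edge detection): chain centres across adjacent gates."""
    fronts = []  # each front is a list of (gate, time) pairs
    for gate, centers in enumerate(centers_by_gate):
        for c in centers:
            for f in fronts:
                g_last, t_last = f[-1]
                if g_last == gate - 1 and abs(t_last - c) <= tol:
                    f.append((gate, c))
                    break
            else:
                fronts.append([(gate, c)])
    return fronts

def recurrent_fronts(fronts, min_gates=5):
    """Step 4 (stand-in for motif mining): keep long fronts and report their spacing."""
    kept = [f for f in fronts if len(f) >= min_gates]
    starts = np.sort([f[0][1] for f in kept])
    return kept, np.diff(starts)

# Synthetic, sparse range-time array (assumption): 120 time steps by 20 range gates,
# with a 10 dB enhancement that drifts one gate per time step and repeats every 40 steps.
rng = np.random.default_rng(0)
n_t, n_g = 120, 20
t, g = np.meshgrid(np.arange(n_t), np.arange(n_g), indexing="ij")
power = 10.0 + 0.5 * rng.standard_normal((n_t, n_g))
power[(t - g) % 40 < 4] += 10.0
power[(t + 2 * g) % 5 == 0] = np.nan  # missing echoes make the data sparse

mask = threshold(power)
centers_by_gate = [cluster_times(mask[:, j]) for j in range(n_g)]
fronts = link_fronts(centers_by_gate)
kept, spacing = recurrent_fronts(fronts)
print(len(kept), spacing)
```

On this synthetic input the sketch recovers three fronts, each spanning all 20 gates, with a spacing of 40 time steps between consecutive fronts. Real radar data would need tuned thresholds and a more robust linking step.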
[Figure: Raw Range-Time Data. Labels: Universal Time, Range Gate, Power (dB)]
[Figure: Gravity Wave Detections, with gravity wave fronts marked. Labels: Universal Time, Range Gate]
Summary (Wish List)
Many geosciences datasets are currently under-utilized because the custodian scientists do not have the expertise in advanced computer-aided data analysis required to extract complex spatiotemporal features.
The EarthCube initiative provides the opportunity to produce transformative change not only in geosciences but also in data analytics.
An EarthCube data mining interface should incorporate a “plug-and-play” compositional capability that allows users to interactively identify known events or features, extract quantitative information about them, and search for recurring instances.
Ideally, the interface should also allow the user to analyze multiple datasets at significantly higher levels of abstraction from the individual sensor streams and interactively define new classes of patterns from simpler building blocks and identify as-yet unknown recurrent patterns and features in individual datasets and across multiple datasets.
A comprehensive EarthCube data mining interface will lower the threshold for gaining access to the deepest physical content residing in the various datasets and increase scientific productivity.
Such a system will also make geosciences data immediately accessible to students of all ages, allowing them to become “citizen scientists”.