
GPS Sensor Web Time Series Analysis Using SensorGrid Technology

Robert Granat¹, Galip Aydin², Zhigang Qi², Marlon Pierce²

¹ Science Data Understanding Group, Jet Propulsion Laboratory

² Community Grids Laboratory, Indiana University

National Aeronautics and Space Administration

(2)

Introduction

Modern earth sensor networks are producing large volumes of data.

This demands three things:

Automated methods to search, analyze, and mine the data.

Infrastructure to connect sensors collecting data with users and methods.

Interfaces through which users can access data and employ methods.

Here we address these demands in a GPS sensor web context, but most of this work can be generalized to other contexts.

We use RDAHMM, a hidden Markov model-based time series analysis method, and SensorGrid, a web infrastructure

(3)

Hidden Markov Models

Statistical models for time series data.

Can be used with continuous or discrete valued data.

Fitting an HMM allows us to ascribe discrete modes of behavior to the system.

Can be trained with labeled examples (supervised learning) or without labeled examples (unsupervised learning).

Successful in many fields (e.g., speech processing,

(4)

Hidden Markov Model Mechanics

[Figure: state sequence Q1, Q2, Q3, ..., QT and observations O1, O2, O3, ..., OT]

The HMM is a stochastic state machine: the state at each point in time is a probabilistic function of the previous state; likewise the observed output at that time is a
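The generative view of an HMM can be illustrated with a minimal sampling sketch. It assumes Gaussian output distributions and a two-state model; the function name `sample_hmm` and all parameter values are hypothetical and are not taken from the slides.

```python
import random

def sample_hmm(pi, A, means, stds, T, seed=0):
    """Draw states Q_1..Q_T and observations O_1..O_T from a Gaussian-output HMM."""
    rng = random.Random(seed)
    state_ids = list(range(len(pi)))
    states, observations = [], []
    for t in range(T):
        # The first state is drawn from the initial probabilities; every
        # later state is drawn from the transition row of the previous state.
        weights = pi if t == 0 else A[states[-1]]
        q = rng.choices(state_ids, weights=weights)[0]
        states.append(q)
        # The observation depends only on the current state.
        observations.append(rng.gauss(means[q], stds[q]))
    return states, observations

# Hypothetical two-state model: a quiet mode and an active mode.
pi = [0.5, 0.5]
A = [[0.95, 0.05],
     [0.10, 0.90]]
states, observations = sample_hmm(pi, A, means=[0.0, 5.0], stds=[1.0, 1.0], T=200)
```

Only `observations` would be visible to an analyst; recovering `states` from them is the classification problem discussed in the following slides.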

(5)

Hidden Markov Models for Geophysical

Sensor Webs

Classification of the observation into system/operational modes is the goal.

Fitting an HMM automatically provides classification; the solution inherently implies an underlying sequence of discrete states. Observations are classified according to the state to which they belong.
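One common way to assign each observation to a state is Viterbi decoding, which finds the single most likely state sequence under a fitted model. The slides do not say which decoding rule RDAHMM uses, so the sketch below is an assumption; it also assumes scalar Gaussian outputs, and the model parameters are illustrative.

```python
import math

def gaussian_logpdf(x, mean, std):
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)

def safe_log(p):
    return math.log(p) if p > 0 else float("-inf")

def viterbi(obs, pi, A, means, stds):
    """Return the most likely state sequence for scalar observations."""
    n = len(pi)
    score = [safe_log(pi[i]) + gaussian_logpdf(obs[0], means[i], stds[i])
             for i in range(n)]
    backpointers = []
    for x in obs[1:]:
        prev, score, ptr = score, [], []
        for j in range(n):
            # Best predecessor of state j at this time step.
            best = max(range(n), key=lambda i: prev[i] + safe_log(A[i][j]))
            ptr.append(best)
            score.append(prev[best] + safe_log(A[best][j])
                         + gaussian_logpdf(x, means[j], stds[j]))
        backpointers.append(ptr)
    # Trace back from the best final state.
    state = max(range(n), key=lambda i: score[i])
    path = [state]
    for ptr in reversed(backpointers):
        state = ptr[state]
        path.append(state)
    return path[::-1]

labels = viterbi([0.1, -0.2, 0.0, 5.1, 4.9, 5.2],
                 pi=[0.5, 0.5],
                 A=[[0.9, 0.1], [0.1, 0.9]],
                 means=[0.0, 5.0], stds=[1.0, 1.0])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Each entry of `labels` is the class assigned to the corresponding observation.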

(6)

Example of HMM Classification

Seismograph data collected at 1 Hz from a station in Pasadena, California.

HMM states are color-coded.

(7)

Challenges of Geophysical Data

Large volumes of data collected by sensor webs (e.g., GPS/seismic networks, ocean buoys).

Little or no labeled training data, so we are almost always in an unsupervised learning mode.

A priori system information is often unavailable or unreliable.

Data is complicated enough to induce large numbers of local maxima.

Standard Expectation-Maximization fitting method is vulnerable

(8)

Regularized Deterministic Annealing

Expectation-Maximization

RDAEM is a method for overcoming the problems inherent in basic EM.

Deterministic annealing modifies the objective function based on a computational temperature that flattens or accentuates features.

The annealing method greatly reduces the sensitivity of the method to initial conditions, but gets stuck in certain structural local maxima with duplicate states.

We overcome this problem by adding regularization terms that
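The effect of a computational temperature can be illustrated on state posteriors. In a common formulation of deterministic annealing, log scores are divided by the temperature before normalization, so a high temperature flattens the distribution and a temperature of 1 recovers the ordinary posterior. This is a minimal sketch of that idea only; the exact objective used by RDAEM is not given in the slides, and the scores below are made up.

```python
import math

def tempered_posterior(log_scores, temperature):
    """Normalize exp(log_score / temperature) over the candidate states."""
    scaled = [s / temperature for s in log_scores]
    peak = max(scaled)  # subtract the maximum for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical unnormalized log scores for three states.
log_scores = [-1.0, -3.0, -6.0]

# An annealing schedule moves from high temperature (flat) to 1 (unmodified).
for temperature in (100.0, 10.0, 1.0):
    posterior = tempered_posterior(log_scores, temperature)
    print(temperature, [round(p, 3) for p in posterior])
```

At high temperature all states look nearly equally plausible, which is what makes the early iterations insensitive to the initial guess.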

(9)

Comparison of EM and RDAEM

We compare the methods with two metrics:

The log likelihood of the solutions: Quality.

The number of maxima found in repeated tests: Stability.
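Both metrics can be computed from the final log likelihoods of repeated fitting runs. The sketch below is one possible way to do so; the tolerance used to decide whether two runs reached the same maximum is an assumption, and the example values are invented for illustration.

```python
def summarize_runs(log_likelihoods, tol=1e-6):
    """Summarize repeated fits: solution quality and number of distinct maxima."""
    ordered = sorted(log_likelihoods, reverse=True)
    distinct = []
    for value in ordered:
        # Runs whose final log likelihoods differ by less than tol are
        # treated as having converged to the same maximum.
        if not distinct or abs(distinct[-1] - value) > tol:
            distinct.append(value)
    return {
        "best": ordered[0],
        "mean": sum(ordered) / len(ordered),
        "n_maxima": len(distinct),
    }

summary = summarize_runs([-10.0, -10.0, -12.5, -10.0])
print(summary)  # {'best': -10.0, 'mean': -10.625, 'n_maxima': 2}
```

A stable method yields a small `n_maxima` across random initializations; a high `best` or `mean` indicates better solutions.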

(10)

SensorGrid Architecture

Major components:

Real-Time filters

Grid Messaging Substrate

Information Service

Filters can be run as Web Services to create workflows.

Filter Chains can be deployed for complex processing.

Streaming messaging provides high-performance transfer options.

NaradaBrokering provides a robust message-passing

(11)

Real-Time Filters

Real-time data processing is supported by employing filters around a publish/subscribe messaging system.

The filters are extended from a generic class to inherit publish and subscribe capabilities.

They can be connected in parallel or serial as
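The filter pattern described above can be sketched as a generic base class whose subclasses only implement a processing step. The in-memory `Broker` below is a hypothetical stand-in for the messaging system and does not reproduce the NaradaBrokering API; the class names, topic names, message format, and station codes are all assumptions made for illustration.

```python
from collections import defaultdict

class Broker:
    """Minimal in-memory publish/subscribe hub (stand-in for a real broker)."""
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        for callback in list(self._subscribers[topic]):
            callback(message)

class Filter:
    """Generic filter: subscribes to one topic, publishes results to another."""
    def __init__(self, broker, in_topic, out_topic):
        self.broker = broker
        self.out_topic = out_topic
        broker.subscribe(in_topic, self._on_message)

    def _on_message(self, message):
        result = self.process(message)
        if result is not None:  # returning None drops the message
            self.broker.publish(self.out_topic, result)

    def process(self, message):
        raise NotImplementedError

class ParseFilter(Filter):
    """Format conversion: 'STATION,value' text to a dictionary."""
    def process(self, message):
        station, value = message.split(",")
        return {"station": station, "value": float(value)}

class StationFilter(Filter):
    """Station separation: pass only messages from one station."""
    def __init__(self, broker, in_topic, out_topic, station):
        super().__init__(broker, in_topic, out_topic)
        self.station = station

    def process(self, message):
        return message if message["station"] == self.station else None

# Serial chain: raw -> parsed -> station/ABCD
broker = Broker()
ParseFilter(broker, "raw", "parsed")
StationFilter(broker, "parsed", "station/ABCD", "ABCD")
received = []
broker.subscribe("station/ABCD", received.append)
broker.publish("raw", "ABCD,1.5")
broker.publish("raw", "WXYZ,2.0")
print(received)  # [{'station': 'ABCD', 'value': 1.5}]
```

Adding a second `StationFilter` on the `parsed` topic would give a parallel branch without changing the existing filters.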

(12)

SOPAC GPS Network

8 networks for 80 stations produce 1 Hz high-resolution data.

Socket-based real-time binary RYO format access is available, but not utilized!

We developed filters to provide multiple format (RYO,

(13)

Integration with SCIGN and SOPAC GPS

Step 1: Raw GPS data (1 Hz) is converted to RYO format and made available through a data server.

Step 2: Data is passed through a series of filters that perform format conversion and station separation. Message passing is handled through NaradaBrokering.

In this context, analysis applications, such as RDAHMM, are viewed as just another filter.

(14)

RDAHMM GPS Results via SensorGrid

A Google Maps interface allows a user to select GPS stations.

Models are fit to a large initial body of data from each station (assumes body of data is representative).

Trained models are applied to incoming data from each station.

Currently data are held in 10-minute buffers, analyzed, and then presented to the user (near-real time; the 10-minute buffer time is arbitrarily chosen).

Additional interfaces exist for exploration of archived data.

Segmented time series can be used to perform exploratory
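The 10-minute buffering of incoming data can be sketched as a generator that groups a stream into fixed-size blocks before analysis. The function `label_block` is a hypothetical stand-in for applying a trained model, and the synthetic stream is invented for illustration; only the 1 Hz rate and the 10-minute buffer length come from the slides.

```python
def buffered(samples, size):
    """Yield consecutive lists of `size` samples; a trailing partial buffer is held back."""
    buffer = []
    for sample in samples:
        buffer.append(sample)
        if len(buffer) == size:
            yield buffer
            buffer = []

def label_block(block, threshold=1.0):
    # Hypothetical stand-in for a trained model: one label per sample.
    return [int(abs(x) > threshold) for x in block]

SAMPLES_PER_BUFFER = 10 * 60 * 1  # 10 minutes of 1 Hz data

# Synthetic stream: 1500 seconds of data with a level change at t = 900 s.
stream = (0.0 if t < 900 else 2.0 for t in range(1500))
results = [label_block(block) for block in buffered(stream, SAMPLES_PER_BUFFER)]
```

With 1500 samples, two full buffers are analyzed and the remaining 300 samples wait until the next buffer fills, which is the source of the near-real-time delay.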

(15)
(16)
(17)

Recording and Replaying Sensor Streams

Filters can be used to record and replay scenarios, such as earthquakes in the GPS case.

We developed RYO Recorder and RYO Publisher filters.

The RYO Recorder creates daily archives of the GPS streams.

The RYO Publisher can be used to replay whole days or selected segments of the records.

We replayed the 2004 Southern California Earthquake
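The record-and-replay idea can be sketched with one archive file per UTC day and a replay function that optionally selects a time segment. This is an illustration only: the real filters handle binary RYO messages, whereas the sketch stores JSON lines, and the function names `record` and `replay` are hypothetical.

```python
import json
import os
import tempfile
from datetime import datetime, timezone

def record(directory, timestamp, message):
    """Append a message to the archive file for its UTC day."""
    day = datetime.fromtimestamp(timestamp, tz=timezone.utc).strftime("%Y-%m-%d")
    with open(os.path.join(directory, day + ".jsonl"), "a") as archive:
        archive.write(json.dumps({"t": timestamp, "msg": message}) + "\n")

def replay(directory, day, start=None, end=None):
    """Yield archived messages for one day, optionally limited to [start, end)."""
    with open(os.path.join(directory, day + ".jsonl")) as archive:
        for line in archive:
            entry = json.loads(line)
            if start is not None and entry["t"] < start:
                continue
            if end is not None and entry["t"] >= end:
                continue
            yield entry["msg"]

archive_dir = tempfile.mkdtemp()
for t in (0, 1, 2, 86400):  # three samples on one day, one on the next
    record(archive_dir, t, f"sample-{t}")

full_day = list(replay(archive_dir, "1970-01-01"))
segment = list(replay(archive_dir, "1970-01-01", start=1, end=2))
```

A publisher built on `replay` would feed the yielded messages back into the messaging system, so downstream filters cannot tell a replayed stream from a live one.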

(18)

Conclusions

We have developed analysis and infrastructure methods for

GPS sensor web data.

These methods are not network or data specific and can be

extended to other sensor networks and data types.

A hidden Markov model-based time series analysis method

provides robust segmentation and classification results that can

be applied in near-real time

(next step: full real time).

SensorGrid infrastructure allows robust and flexible connections

between data sources, applications, and users.

Demo of the user interface (with Scripps collaborators) at Tue.

(19)

Hidden Markov Model Parameters

Initial probabilities

State-to-state transition probabilities

Output distributions
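In the notation commonly used for HMMs (assumed here; the symbols are not taken from the slide), the three parameter sets for a model with N states can be written as:

```latex
\begin{aligned}
\pi_i  &= P(Q_1 = i)                      && \text{initial probabilities} \\
a_{ij} &= P(Q_{t+1} = j \mid Q_t = i)     && \text{state-to-state transition probabilities} \\
b_i(o) &= p(O_t = o \mid Q_t = i)         && \text{output distributions}
\end{aligned}
\qquad 1 \le i, j \le N,
\qquad \sum_{i=1}^{N} \pi_i = 1,
\qquad \sum_{j=1}^{N} a_{ij} = 1 .
```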

Where

A hidden Markov model with

(20)

Hidden Markov Model

Expectation-Maximization

EM is the standard method for fitting HMMs to data.

Iterative, starts with an initial model guess.

“E”-step: Calculate the expectation of the log likelihood of the model given an estimate of the unknown parameters.

“M”-step: Maximize the expected value of the log likelihood in the unknown parameters.

The so-called Q-function optimized in the “M”-step is
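A common textbook form of the Q-function, given here as an assumption rather than as the slide's own expression, takes the expectation of the complete-data log likelihood over hidden state sequences under the current parameter estimate. Writing the full state sequence as Q = (Q_1, ..., Q_T), the observations as O = (O_1, ..., O_T), the current estimate as λ′, and the parameters being optimized as λ:

```latex
\mathcal{Q}(\lambda, \lambda')
  = \mathbb{E}_{\mathbf{Q} \mid \mathbf{O}, \lambda'}
      \bigl[ \log P(\mathbf{O}, \mathbf{Q} \mid \lambda) \bigr]
  = \sum_{\mathbf{Q}} P(\mathbf{Q} \mid \mathbf{O}, \lambda')
      \, \log P(\mathbf{O}, \mathbf{Q} \mid \lambda),
\qquad
\lambda^{\text{new}} = \arg\max_{\lambda} \, \mathcal{Q}(\lambda, \lambda') .
```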

(21)

Regularization Terms:

Gaussian Output Distributions

We modify the likelihood objective function with the following improper prior:

This prior is smallest when the means are identical. It manifests as a regularization term added to the Q-function:

(22)
(23)

National Aeronautics and Space Administration

Jet Propulsion Laboratory - California Institute of Technology
