
How To Create A Signal Data System


(1)

Biomedical Big Data for Clinical Research and Patient Care: Role of Semantic Computing

Satya S. Sahoo

Division of Medical Informatics, Case Western Reserve University

(2)

Signal Big Data: Research and Patient Care

•  Velocity: Rapid rate of signal data collection in epilepsy centers

o  24-hour recordings: 8-10 GB per patient

o  Typical patient admissions span 5 days

o  100-150 patients per year

•  Volume: 11 TB in 3 years and 18 TB by end of May 2014

•  Variety: Data collected using different study protocols and equipment
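The velocity figures above can be combined into a rough yearly estimate. The sketch below is only a back-of-envelope calculation: it assumes every patient is recorded continuously for the full 5-day admission and that 1 TB = 1000 GB, neither of which is stated on the slide. The function and constant names are illustrative.

```python
# Back-of-envelope estimate of yearly signal data volume from the slide's figures.
GB_PER_DAY = (8, 10)           # size of a 24-hour recording per patient (low, high)
DAYS_PER_ADMISSION = 5         # typical admission length
PATIENTS_PER_YEAR = (100, 150) # yearly admissions (low, high)


def yearly_volume_tb(gb_per_day, days, patients):
    """Return the yearly data volume in TB (assumes 1 TB = 1000 GB)."""
    return gb_per_day * days * patients / 1000


low = yearly_volume_tb(GB_PER_DAY[0], DAYS_PER_ADMISSION, PATIENTS_PER_YEAR[0])
high = yearly_volume_tb(GB_PER_DAY[1], DAYS_PER_ADMISSION, PATIENTS_PER_YEAR[1])
print(f"Estimated volume: {low:.1f} to {high:.1f} TB per year")
```

Under these assumptions the estimate is 4.0 to 7.5 TB per year, the same order of magnitude as the reported 11 TB in 3 years. The continuous-recording assumption probably overstates the true rate.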

[Figure: Growth in Electrophysiological Signal Data, in two panels labeled (a) and (b), both plotted by month from Jan '11 to May '13. One panel shows size of data (MB): cumulative patient data in EMU and cumulative PRISM-specific patient data. The other shows cumulative number of patients: patients admitted to EMU and patients enrolled in PRISM.]

(3)

Background: Electrophysiological Signal Data

Electrophysiological signal data

o  Electroencephalogram (EEG): intracranial or scalp electrodes

o  Electrocardiogram (ECG)

o  Polysomnogram (PSG)

Signal data plays a critical role in clinical research and patient care

o  Pre-surgical evaluations to identify eloquent cortex

o  Identify seizure onset zone in epilepsy

o  Correlation between seizure events and other physiological

(4)

Background: Neurosciences Research

Multi-center project to study Sudden Unexpected Death in Epilepsy (SUDEP)

Low rate of reported incidents

o  Multi-center collaboration for viable cohort size

Expect to enroll 1200 patients from Epilepsy Monitoring Units

o  Case Western-University Hospital, Cleveland

o  University of California, Los Angeles

o  Northwestern University, Chicago

o  National Hospital for Neurology and Neurosurgery, London,

(5)

Computational Challenges: Signal “Big Data”

Scalable storage for large volumes of data

o  Data partitioning and storage on distributed file systems

High-performance data processing pipeline to cope with the rapid rate of data generation

o  High level of parallelization for both speedup (to cope with velocity) and scale-out (to cope with volume)

Efficient query execution and data retrieval

o  Optimal data partitioning for parallelizing data retrieval

o  Optimal co-location for minimizing remote data transfer

Interactive signal visualization

o  Minimize network transfer latency

(6)

Cloudwave Architecture

Cloudwave aims to support:

o  Web-based interface for signal analysis and visualization

o  Multi-center collaborative studies

o  Efficient signal processing and analysis

Three components:

o  MapReduce data processing pipeline

o  Data modeling and optimal data partitioning

o  Ontology-driven query and

(7)
(8)

Data Processing: Results of Comparative Evaluation

Performance evaluated over two variables

o  Extracting data for increasing number of channels

o  Extracting data for increasing number of patient studies

•  An order of magnitude improvement in time performance

[Figure: An EDF file, whose records (rec 1 ... rec n) each interleave channels Ch1 ... Ch k, is split by a MapReduce program into channel-specific files on a distributed file system.]

[Figure: Average EDF processing time for an increasing number of signals (10 to 40). Y-axis: execution time (min), 0 to 140. Series: Standalone (desktop computer) and Cloudwave (MapReduce).]

[Figure: Average EDF processing time for an increasing number of studies (5 to 25). Y-axis: execution time (min), 0 to 140. Series: Standalone (desktop computer) and Cloudwave (MapReduce).]
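The EDF-to-channel-file split can be sketched as a map step and a reduce step. The code below is a minimal in-process simulation, not the Cloudwave implementation: it assumes each data record is already parsed into a dictionary of per-channel samples, and the function names (`map_record`, `reduce_channel`, `split_by_channel`) are hypothetical.

```python
from collections import defaultdict


def map_record(record_index, record):
    """Map step: emit one (channel, (record_index, samples)) pair per channel."""
    for channel, samples in record.items():
        yield channel, (record_index, samples)


def reduce_channel(channel, values):
    """Reduce step: order a channel's pieces by record index and concatenate."""
    ordered = sorted(values, key=lambda pair: pair[0])
    return channel, [s for _, samples in ordered for s in samples]


def split_by_channel(records):
    """Run map, shuffle, and reduce in-process over a list of data records."""
    shuffled = defaultdict(list)
    for index, record in enumerate(records):
        for channel, value in map_record(index, record):
            shuffled[channel].append(value)
    return dict(reduce_channel(ch, vals) for ch, vals in shuffled.items())


# Toy input: two data records, each holding samples for two channels.
records = [
    {"Ch1": [1, 2], "Ch2": [10, 20]},
    {"Ch1": [3, 4], "Ch2": [30, 40]},
]
channel_files = split_by_channel(records)
print(channel_files)
```

In a real MapReduce framework the shuffle is performed by the runtime and the reducers run in parallel, one group of channels per node. Here the shuffle is a plain dictionary so the data flow stays visible.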

(9)

Signal Modeling: Cloudwave Signal Format

[Figure: Structure of the Cloudwave Signal Format (CSF), shown alongside the Epilepsy and Seizure Ontology. Components:]

o  Metadata: Study Patient Details

o  Metadata: Signal Collection Protocol

o  Segmented Signal Data
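The three CSF components listed above can be pictured as one document with two metadata blocks and a list of signal segments. The sketch below is purely illustrative: the field names, the JSON serialization, and the identifiers are assumptions, since the slides do not give the actual CSF schema.

```python
import json


def build_csf_fragment(study, protocol, samples, sampling_rate_hz, segment_seconds):
    """Assemble a hypothetical CSF-like document: two metadata blocks plus segments."""
    per_segment = sampling_rate_hz * segment_seconds
    segments = [
        {"index": start // per_segment, "samples": samples[start:start + per_segment]}
        for start in range(0, len(samples), per_segment)
    ]
    return {
        "studyMetadata": study,        # study and patient details
        "protocolMetadata": protocol,  # signal collection protocol
        "segments": segments,          # segmented signal data
    }


fragment = build_csf_fragment(
    study={"studyId": "S-001", "patientId": "P-001"},   # made-up identifiers
    protocol={"channel": "Ch1", "samplingRateHz": 4},   # made-up values
    samples=list(range(10)),
    sampling_rate_hz=4,
    segment_seconds=1,
)
print(json.dumps(fragment, indent=2))
```

Keeping the metadata next to the segments means that any fragment can be interpreted on its own, which matters once fragments are spread across a distributed file system.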

(10)

Signal Query and Visualization

Patient cohort queries: Using the VISAGE interface

Electrophysiological signals for selected patient visualized in

(11)

Features of the Signal Query and Visualization Module

•  Query using the Epilepsy and Seizure Ontology (EpSO)* classes

•  Reconciling semantic heterogeneity and subsumption reasoning

* Sahoo et al. JAMIA 2013
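Subsumption reasoning means that a query for a class also matches data annotated with any of its subclasses. The sketch below shows the idea on a toy hierarchy. The class names are illustrative and are not taken from EpSO, and a production system would use an OWL reasoner rather than this hand-written closure.

```python
# Toy class hierarchy: child -> parent. Class names are illustrative, not from EpSO.
PARENT = {
    "FocalSeizure": "Seizure",
    "GeneralizedSeizure": "Seizure",
    "FocalMotorSeizure": "FocalSeizure",
}


def subsumed_classes(query_class, parent=PARENT):
    """Return query_class plus every direct or indirect subclass of it."""
    result = {query_class}
    changed = True
    while changed:
        changed = False
        for child, par in parent.items():
            if par in result and child not in result:
                result.add(child)
                changed = True
    return result


def query_patients(annotations, query_class):
    """Select patients whose annotation falls under query_class by subsumption."""
    wanted = subsumed_classes(query_class)
    return sorted(p for p, cls in annotations.items() if cls in wanted)


annotations = {"P1": "FocalMotorSeizure", "P2": "GeneralizedSeizure", "P3": "FocalSeizure"}
print(query_patients(annotations, "FocalSeizure"))
```

A query for the hypothetical class `FocalSeizure` returns both P1 and P3, even though P1 was annotated with a more specific term.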

(12)

Data Partitioning: Efficient Network Transfer and

Visualization

Performance evaluated for transferring segments of CSF files corresponding to a “signal epoch” (e.g. a 30-second epoch)

Consistently faster than the naïve signal channel-based approach for six standard “signal montages”

[Figure: Load and render time in milliseconds (0 to 60,000) for channel-based versus epoch-based partitioning across six montages (M1 to M6). Series: CSF Epoch Load, CSF Epoch Render, Channel Data Segment Load, Channel Data Segment Render.]
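One plausible reason for the difference is that a montage combines several channels, so a channel-based layout needs one read per channel while an epoch-based layout needs a single read. The sketch below illustrates that reasoning only. The functions and the fetch-count model are assumptions, not measurements or code from Cloudwave.

```python
def partition_by_epoch(channels, sampling_rate_hz, epoch_seconds=30):
    """Cut every channel at the same epoch boundaries; one unit holds all channels."""
    size = sampling_rate_hz * epoch_seconds
    length = max(len(samples) for samples in channels.values())
    return [
        {name: samples[start:start + size] for name, samples in channels.items()}
        for start in range(0, length, size)
    ]


def fetches_needed(montage_channels, layout):
    """Count storage units read to display one epoch of a montage."""
    return 1 if layout == "epoch" else len(montage_channels)


# Synthetic data: three channels, 120 samples each, sampled at 2 Hz.
channels = {"Ch1": list(range(120)), "Ch2": list(range(120)), "Ch3": list(range(120))}
epochs = partition_by_epoch(channels, sampling_rate_hz=2, epoch_seconds=30)
print(len(epochs), len(epochs[0]["Ch1"]))
print(fetches_needed(["Ch1", "Ch2", "Ch3"], "epoch"),
      fetches_needed(["Ch1", "Ch2", "Ch3"], "channel"))
```

With three channels in the montage, the epoch layout reads one unit per screen and the channel layout reads three. The gap grows with the number of channels in the montage.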

(13)

Data Partitioning: Efficient Network Transfer and Visualization

Performance evaluated for transferring CSF format files with signal data as an array of integers

Consistently faster than the traditional binary signal data format for six standard “signal montages”

[Figure: Load and render time in milliseconds (0 to 60,000) for the binary format versus CSF across six montages (M1 to M6). Series: Binary Format Load, CSF Load, Binary Format Render, CSF Render.]

(14)

Semantics: Epilepsy and Seizure Ontology

EpSO models the four-dimensional epilepsy and seizure

(15)

Signal Processing: ECG Data

Use MapReduce algorithms to identify QRS complexes in ECG data
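A segment-parallel detector can be sketched as follows. The code is a simplified stand-in, not the algorithm used in Cloudwave: it treats any local maximum above a fixed threshold as an R peak, runs on a synthetic trace, and uses hypothetical function names. The map step works on one segment at a time and the reduce step merges the results.

```python
def detect_r_peaks(samples, threshold):
    """Return indices of local maxima above threshold (a crude R-peak stand-in)."""
    return [
        i for i in range(1, len(samples) - 1)
        if samples[i] > threshold and samples[i - 1] < samples[i] >= samples[i + 1]
    ]


def map_segment(offset, samples, threshold):
    """Map step: detect peaks in one segment and emit global sample indices."""
    return [offset + i for i in detect_r_peaks(samples, threshold)]


def reduce_peaks(per_segment_peaks):
    """Reduce step: merge per-segment results into one sorted list."""
    return sorted(p for peaks in per_segment_peaks for p in peaks)


ecg = [0, 1, 9, 1, 0, 0, 1, 8, 1, 0, 0, 2, 9, 2, 0]  # synthetic trace
segment_len = 5
segments = [(start, ecg[start:start + segment_len])
            for start in range(0, len(ecg), segment_len)]
peaks = reduce_peaks(map_segment(off, seg, threshold=5) for off, seg in segments)
print(peaks)
```

A peak that falls exactly on a segment boundary is missed by this sketch. A practical pipeline would overlap adjacent segments by a few samples and deduplicate in the reduce step.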

(16)
(17)

Signal Processing: ECG Data

(18)

Take Home Points

Cloudwave represents a new approach for managing massive amounts of electrophysiological signal data

o  Potential role in clinical research and patient care

o  Brain studies

o  Sleep medicine

Cloudwave: Domain semantics (modeled in ontology) + Distributed Storage + Parallel Computation

Domain ontology support:

o  Optimal data partitioning scheme

o  Complex ad-hoc queries

(19)

Thank you!

•  Funding: The PRISM (Prevention and Risk Identification of SUDEP Mortality) Project (1-P20-NS076965-01)

•  Acknowledgements: PRISM PI: Dr. Samden Lhatoo, Co-I Dr. GQ Zhang, Catherine Jayapandian, Aman Dabir, Chien-Hung Chen, Licong Cui
