
© 2011 IBM Corporation

From Data to Foresight:

Laura Haas, IBM Fellow

IBM Research - Almaden


The road from data to foresight is long



• Must acquire, integrate, enhance and align
• Must deal with missing and incomplete data
• Must store, protect, and manage
• Must create models and other analytics and test them
• Must run these analyses efficiently over large data volumes
• Must understand and share results
• Requires significant (and expensive) EXPERTISE in data management, systems, analytics, and the domain
• Takes TIME

[Figure callouts: question "How can I reduce my" (truncated in the source); "Consumer Reports" label]

[Figure: dependency diagram of model computations. Node labels: Rainfall Error, Saturation & Surface Runoff, Overland Routing, Percolation, Interflow, Base Flow, Upper Layer Evaporation, Lower Layer Evaporation, Miscellaneous Fluxes, Solve State Equations, Update State. Output: Instantaneous Runoff, Routed Runoff, Total Water (Upper Layer, Lower Layer). Legend: flux computations, state computations.]

Note: in addition to the dependencies shown, most flux calculations depend on the values of state variables at the previous timestep.


The 4 V’s of data

• Volume (Data at Rest): Terabytes to exabytes of existing data to process
• Velocity (Data in Motion): Streaming data, milliseconds to seconds to respond
• Variety (Data in Many Forms): Structured, unstructured, text, multimedia
• Veracity* (Data in Doubt): Uncertainty due to data inconsistency & incompleteness, ambiguities, latency, deception, model approximations


Valuable new insights are hidden in this wealth of data!

• Identify criminals and threats from disparate video, audio, and data feeds
• Make risk decisions based on real-time transactional data
• Predict weather patterns to plan optimal wind turbine usage, and optimize capital expenditure on asset placement
• Detect life-threatening conditions at hospitals in time to intervene


Fortunately, new platforms can unlock the value of data

[Figure: IBM Big Data Platform]

Analytic Applications: BI / Reporting, Exploration / Visualization, Functional App, Industry App, Predictive Analytics, Content Analytics

IBM Big Data Platform components:
• Systems Management
• Application Development
• Visualization & Discovery
• Accelerators
• Hadoop System
• Stream Computing
• Data Warehouse
• Information Integration & Governance

New analytic applications drive the requirements for a big data platform

• Integrate and manage the full variety, velocity and volume of data
• Apply advanced analytics to information in its native form
• Visualize all available data for ad-hoc analysis
• Develop new analytic applications
• Optimize and control scheduling of many simultaneous analyses


Outcome-based medicine vision: Leverage public and private content, rich analytics to improve treatment outcomes

Research & Development and Intellectual Property
• Target Identification and Validation
• Lead Discovery and Optimization
• Safety and Efficacy
• Genomics, Proteomics, Metabolomics
• Chemical and Biological Extraction, Profiling, Analytics, and Reasoning

Clinical Decision Support
• Patient Similarity and Segmentation
• Patient Cohorts for Clinical Support
• Clinical Genomics Analysis
• Comparative Effectiveness Research
• Predictive Modeling of Outcome
• Disease Progression Analysis
• Treatment Cost Analysis
• Temporal Analysis

Patient experience and social community support
• Patient first hand experiences
• Social community development and support

[Figure labels: Target Selection, Candidate Selection, Development Selection, Target Identification, Lead Discovery, Preclinical Development, Clinical I, II, III, Patient Experience, Launch, Patient Outcome, Medical Care]


An Example: Leveraging data to accelerate life sciences R&D

► R&D

Find white space and gain insight into complex chemical and biological patents; gain early insights into a given target-compound match from past patents for better research target and target-compound selection decisions

► Legal

Detect IP infringement earlier and increase the quality of patent filings

► Corporate Strategy / Business Dev

Identify collaboration and acquisition targets for greater research value and effectiveness, and find patent in- and out-licensing candidates for efficient management and monetization of IP

The Benefits

► Valuable insights into competitive landscape, white space, and IP portfolio

► High quality chemical extractions available hours after patents are available from patent authorities

► Previously unobtainable insights at the scientists’ fingertips with the touch of a button

► Fast and easy search and analysis, drastically reducing search time from weeks and months to just minutes

The Situation

Highly volatile, increasingly complex environment

Traditional R&D is not delivering

New approaches are needed

• Collaborative R&D models: The new normal, requiring open platforms, clear boundaries and protection
• Agile responses: Vital to drive fast adaptation to the changing competitive IP landscape, including adjustments to strategy, portfolio investments and partnerships
• Effective IP portfolio management: Delivering key value for out-licensing and monetizing of non-core IP
• Strategic ecosystem development: Growth and competitive differentiation through aggressive collaboration, early identification of acquisition and recruitment targets

IBM BAO Strategic IP Insight Platform (SIIP)

A unique and powerful data and analytics offering

• Aggregates and processes 30M+ patents and scientific literature from around the globe
• Automatically extracts chemical and biological entities – 200M+ chemical compound instances to date
• Generates chemical and biological entity profiles
• Searches and analyzes using natural language-based inputs for key relationship discovery and IP insights
• Reasons about causality of drugs, diseases, targets, efficacy, and side effects
• Integrates and enhances existing data and applications


A Smart Entity Profiling, Analytics and Reasoning Methodology

[Figure: entity profiles centered on Medicine, Disease, and Patients]

Profile attributes:
• IP: Legal status, Assignee, Foreign filings, Expiration Date, . . .
• Drug: Activity, Half life, Protein Binding, . . .
• Physical: Computational, Molecular Weight, MF, Bp, Mp, . . .
• Spectral: IR, NMR, Mass Spectra, . . .
• Toxicity: Clinical Trials, Pre-Clinical, . . .
• Pathways: Metabolic, Genetic, Environmental, Cellular, Organism, . . .
• Screening: Activity, . . .
• Genetic: . . .
• Organisms: Organism, Organ, Cell, Tissue, . . .
• Life styles: . . .
• Reactions: Enzymes, . . .
• Medical History: . . .

Data sources: Patents, Literature, Experimental, HTS, Medical Records, Clinical, Business, Social

• An integrated framework leveraging a broad set of data and many types of analytics:
– Hypothesis generation
– Entity extraction and profiling
– Relationship discovery and analytics
– Summarization
– Reasoning
– Scoring and ranking
– Predictive modeling

• Key steps:
– Extract key entities
– Combine information from multiple sources
– Discover relationships
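
The three key steps can be illustrated with a minimal Python sketch. It is not the method used by the platform described in these slides: the gazetteer, the sample documents, and the function names are all hypothetical, entity extraction is reduced to naive substring matching, and relationship discovery is reduced to counting co-mentions within a document.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical gazetteer (surface form -> entity type); illustrative only.
GAZETTEER = {
    "aspirin": "drug",
    "ibuprofen": "drug",
    "cox-1": "target",
    "cox-2": "target",
    "inflammation": "disease",
}

# Hypothetical documents standing in for patents and literature.
DOCUMENTS = [
    {"source": "patent", "text": "Aspirin inhibits COX-1 and reduces inflammation."},
    {"source": "literature", "text": "Ibuprofen targets COX-2 in inflammation."},
    {"source": "literature", "text": "Low-dose aspirin was studied for chronic inflammation."},
]


def extract_entities(text):
    """Step 1: return the gazetteer entities mentioned in the text (substring match)."""
    lowered = text.lower()
    return {name for name in GAZETTEER if name in lowered}


def build_profiles(documents):
    """Step 2: combine mentions from all sources into one profile per entity."""
    profiles = defaultdict(lambda: {"type": None, "sources": Counter()})
    for doc in documents:
        for name in extract_entities(doc["text"]):
            profiles[name]["type"] = GAZETTEER[name]
            profiles[name]["sources"][doc["source"]] += 1
    return dict(profiles)


def discover_relationships(documents):
    """Step 3: count how often two entities are mentioned in the same document."""
    pairs = Counter()
    for doc in documents:
        for a, b in combinations(sorted(extract_entities(doc["text"])), 2):
            pairs[(a, b)] += 1
    return pairs


profiles = build_profiles(DOCUMENTS)
relationships = discover_relationships(DOCUMENTS)

# Rank candidate relationships by co-mention count, strongest first.
for (a, b), count in relationships.most_common(3):
    print(f"{a} <-> {b}: {count}")
```

Co-mention counts are only a crude stand-in for the scoring, ranking, and reasoning analytics listed above; a realistic system would need proper entity recognition and normalization before any such counting is meaningful.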



Information and Governance for Big Data


Summary

• There is much to be gained from leveraging available data and content
– Accelerate discovery
– Avoid repeating work

• Unlocking the value buried in that data is difficult
– 4 V’s: Volume, Velocity, Variety, Veracity
– A long process requiring many types of expertise

• There are powerful platforms and tools that can help
– Aid development of type-specific analytics
– Enable fast and timely processing of large diverse data sets

• Sharing, with appropriate data governance, can accelerate discovery
– Controls for the entire data lifecycle
