
Chris Greer. Big Data and the Internet of Things


Academic year: 2021


(1)

Big Data and the Internet of Things

(2)

Big Data and the Internet of Things

• Overview of NIST
• Internet of Things
• Big Data

(3)
(4)

NIST  Bird’s eye view

Courtesy HDR Architecture, Inc./Steve Hall © Hedrich Blessing

The National Institute of Standards and Technology (NIST) is where Nobel Prize-winning science meets real-world engineering.

With an extremely broad research portfolio, world-class facilities, national networks, and an international reach, NIST works to support industry

(5)

NIST: Basic Stats and Facts

• Major assets
– ~3,000 employees
– ~2,800 associates and facilities users
– ~1,300 field staff in partner organizations
– Two main locations: Gaithersburg, Md., and Boulder, Colo.
– Nobel Prize winners: 1997, 2001, 2005, 2007, 2013

© R. Rath

(6)

Working with NIST on Cyber Physical Systems

• Overview of NIST
• Internet of Things
• Big Data

(7)

If we had computers that knew

everything there was to know about

things—using data they gathered

without any help from us—we would

be able to track and count everything,

and greatly reduce waste, loss and

cost.

—Kevin Ashton, That 'Internet of Things' Thing, RFID Journal, July 22, 2009

(8)

Internet of Things

What are the defining characteristics of the “Internet of Things”?

Scale

Capability

(9)

Internet of Things - Scale

Devices connected to the Web:
• 1970 = 13
• 1980 = 188
• 1990 = 313,000
• 2000 = 93,000,000
• 2010 = 5,000,000,000
• 2020 = 31,000,000,000

Source: Intel
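To make the scale of these figures easier to compare, the per-decade counts can be turned into implied annual growth rates. The sketch below is only an illustration: it assumes steady compound growth within each decade, which the slide does not claim, and the names `devices` and `annual_growth` are introduced here for the example.

```python
# Device counts per decade as listed on the slide (source: Intel).
devices = {
    1970: 13,
    1980: 188,
    1990: 313_000,
    2000: 93_000_000,
    2010: 5_000_000_000,
    2020: 31_000_000_000,
}


def annual_growth(start_count, end_count, years):
    """Compound annual growth rate implied by two counts `years` apart.

    ASSUMPTION: growth is treated as steady over the interval.
    """
    return (end_count / start_count) ** (1 / years) - 1


years = sorted(devices)
growth_by_decade = {
    f"{a}-{b}": annual_growth(devices[a], devices[b], b - a)
    for a, b in zip(years, years[1:])
}

for span, rate in growth_by_decade.items():
    print(f"{span}: about {rate:.0%} per year")
```

Under this assumption the fastest relative growth falls in the 1980s, while the largest absolute increase is the projected jump from 2010 to 2020.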

(10)

Internet of Things - Capability

Intel Edison:

"It's a full Pentium-class PC in the form factor of an SD card,"

(11)

Internet of Things – Reach

Physical

(12)

Internet

Physical

(13)

Internet of Things

Physical

(14)
(15)

Working with NIST on

Cyber Physical Systems

• Overview of NIST

• Internet of Things

• Big Data

(16)

IBM Smarter Planet

“For five years, IBMers have been working with companies, cities, and communities around the world to build a Smarter Planet. We’ve seen enormous advances, as leaders have begun using the vast supply of Big Data to transform their enterprises and institutions …”

Source: http://www.ibm.com/smarterplanet/global/files/us__en_us__overview__win_in_the_era_of_smart_op_ad_03_2013.pdf

(17)

GE Industrial Internet

New GE technology merges big iron with big data to create brilliant machines. This convergence of machine and intelligent data is known as the Industrial Internet, and it's changing the way we work.

Source: http://www.ge.com/stories/industrial-internet

(18)

Big Data - Volume

Source: IDC Corporation, http://idcdocserv.com/1678, sponsored by EMC

From 2013 to 2020, the digital universe will grow by a factor of 10 – from 4.4 trillion gigabytes to 44 trillion. It more than doubles every two years.

In 2014, the digital universe will equal 1.7 megabytes a minute for every person on Earth.

Data from embedded systems will grow from 2% of the digital universe in 2013 to 10% in 2020.

In 2013, the available storage capacity could hold just 33% of the digital universe. By 2020, it will be able to store less than 15%.
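The growth and storage figures above can be cross-checked with a little arithmetic. The sketch below assumes constant exponential growth between 2013 and 2020 (an assumption made here, not a statement from IDC) and treats the "less than 15%" figure as an upper bound; all variable names are introduced for this example.

```python
import math

# Figures quoted on the slide (IDC). One trillion gigabytes is one zettabyte.
size_2013_zb = 4.4
size_2020_zb = 44.0
years = 2020 - 2013

growth_factor = size_2020_zb / size_2013_zb

# ASSUMPTION: constant exponential growth over the whole interval.
doubling_time_years = years * math.log(2) / math.log(growth_factor)

# Share of the digital universe that available storage could hold.
storable_2013 = 0.33
storable_2020 = 0.15  # upper bound: the slide says "less than 15%"

# Data for which no storage exists, in zettabytes.
unstorable_2013_zb = size_2013_zb * (1 - storable_2013)
unstorable_2020_zb = size_2020_zb * (1 - storable_2020)  # lower bound

print(f"Implied doubling time: {doubling_time_years:.2f} years")
print(f"Unstorable data 2013: {unstorable_2013_zb:.2f} ZB")
print(f"Unstorable data 2020: at least {unstorable_2020_zb:.1f} ZB")
```

A tenfold increase over seven years implies a doubling time of roughly 2.1 years, close to the slide's two-year figure, and the volume of data that cannot be stored grows from about 3 ZB to more than 37 ZB.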

(19)

Big Data - Volume

[Chart: information created versus available storage, worldwide, in petabytes, 2005 to 2010]

Source: John Gantz, IDC Corporation, The Expanding Digital Universe

(20)

Big Data - Velocity

Sloan Digital Sky Survey

• 140 Terabytes, year 2000 to present

LSST – Large Synoptic Survey Telescope

• Expect 140 Terabytes every 5 days

Square Kilometer Array

• Expect 140 Terabytes every 3 sec
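The three projects above use the same 140-terabyte yardstick, so they can be compared as sustained data rates. This is a rough sketch under stated assumptions: "year 2000 to present" is taken as 14 years (2000 to 2014, the apparent year of the talk), a year is 365.25 days, and units are decimal (1 TB = 1,000 GB). The names `seconds_to_collect` and `rate_gb_per_s` are introduced here.

```python
SECONDS_PER_DAY = 24 * 60 * 60
VOLUME_GB = 140 * 1000  # 140 terabytes expressed in gigabytes (decimal units)

# Time each project needs to produce 140 TB, in seconds.
# ASSUMPTION: "2000 to present" is taken as 14 years of 365.25 days each.
seconds_to_collect = {
    "Sloan Digital Sky Survey": 14 * 365.25 * SECONDS_PER_DAY,
    "LSST": 5 * SECONDS_PER_DAY,
    "Square Kilometer Array": 3,
}

# Average sustained rate in gigabytes per second.
rate_gb_per_s = {
    name: VOLUME_GB / secs for name, secs in seconds_to_collect.items()
}

for name, rate in rate_gb_per_s.items():
    print(f"{name}: {rate:,.4f} GB/s")
```

On these assumptions each step is a jump of several orders of magnitude: LSST is roughly a thousand times faster than Sloan, and the Square Kilometer Array is 144,000 times faster than LSST.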

LSST:

“Suspended between its vast mirrors will be a three billion-pixel sensor array, which on a clear winter night will produce 30 terabytes of data. In less than a week this remarkable telescope will map the whole night sky …. And then the next week it will do the same again … building up a database of billions of objects and millions of billions of bytes.”

(21)

Big Data - Variety

(22)

Big Data – Potential Value

“Using detailed survey data on the business practices and information technology investments of 179 large publicly traded firms, we find that firms that adopt DDD [data driven decision making] have output and productivity that is 5-6% higher than what would be expected given their other investments and information technology usage.”

Brynjolfsson, Erik and Hitt, Lorin M. and Kim, Heekyung Hellen, Strength in Numbers: How Does Data-Driven Decision making Affect Firm Performance? (April 22, 2011).

(23)

Big Data - Limitations

Good Data Won't Guarantee Good Decisions

Shvetank Shah, Andrew Horne, and Jaime Capellá, Harvard Business Review, April 2012

“At this very moment, there’s an odds-on chance that someone in your organization is making a poor decision on the basis of information that was enormously expensive to collect.”

• Analytical skills are concentrated in too few employees

• IT needs to spend more time on the “I” and less on the “T”

(24)

Working with NIST on

Cyber Physical Systems

• Overview of NIST

• Internet of Things

• Big Data

(25)

Big Data - Questions

• What are the attributes that define Big Data solutions?

• How is Big Data different from traditional data environments and related applications?

• What are the essential characteristics of Big Data environments?

• How do these environments integrate with currently deployed architectures?

• What are the central scientific, technological, and standardization challenges that must be addressed to accelerate the deployment of robust Big Data solutions?

(26)

NIST Big Data Public Working Group & Standardization Activities

Wo Chang, NIST

Robert Marcus, ET-Strategies

Chaitanya Baru, UC San Diego

(27)

Big Data PWG - Charter

The focus of the NIST Big Data Public Working Group (NBD-PWG) is to form a community of interest from industry, academia, and government, with the goal of developing consensus definitions, taxonomies, secure reference architectures, and a technology roadmap. The aim is to create vendor-neutral, technology-neutral, and infrastructure-agnostic deliverables that enable big data stakeholders to select the best analytics tools for their processing and visualization requirements, while enabling value-add from big data service providers and data flow among stakeholders in a cohesive and secure manner.

(28)

Big Data PWG - Deliverables

1. Definitions
2. Taxonomies
3. Requirements & Use Cases
4. Security & Privacy Requirements
5. Architectures Survey
6. Reference Architecture
7. Security & Privacy Architecture
8. Technology Roadmap

(29)

SUBGROUPS

NBD-PWG
• Requirements and Use Cases
• Definitions & Taxonomies
• Security and Privacy
• Reference Architecture
• Technology Roadmap

(30)

Requirements & Use Cases

Geoffrey Fox, U. Indiana
Joe Paiva, VA
Tsegereda Beyene, Cisco

Scope (M0020)

The focus is to form a community of interest from industry, academia, and government, with the goal of developing a consensus list of Big Data requirements across all stakeholders. This includes gathering and understanding various use cases from diversified application domains.

Tasks

• Gather input from all stakeholders regarding Big Data requirements.

• Analyze/prioritize a list of challenging general requirements that may delay or prevent adoption of Big Data deployment

• Develop a comprehensive list of Big Data requirements

(31)

Requirements and Use Case Subgroup

1. Government Operations (4): National Archives & Records Administration, Census Bureau

2. Commercial (8): Finance in Cloud, Cloud Backup, Mendeley (Citations), Netflix, Web Search, Digital Materials, Cargo shipping (e.g. UPS)

3. Defense (3): Sensors, Image Surveillance, Situation Assessment

4. Healthcare & Life Sciences (10): Medical Records, Graph & Probabilistic Analysis, Pathology, Bio-imaging, Genomics, Epidemiology, People Activity Models, Biodiversity

5. Deep Learning & Social Media (6): Driving Car, Geolocate Images, Twitter, Crowd Sourcing, Network Science, NIST Benchmark Datasets

6. Astronomy & Physics (5): Sky Surveys, Large Hadron Collider at CERN, Belle Accelerator II (Japan)

7. Earth, Environmental & Polar Science (10): Ice Sheet Scattering, Earthquake, Ocean, Earth Radar Mapping, Climate Simulation, Atmospheric Turbulence, Subsurface Biogeochemistry, AmeriFlux & FLUXNET gas sensors

8. Energy (1): Smart Grid


(32)

Definitions & Taxonomies

Nancy Grady, SAIC
Natasha Balac, SDSC
Eugene Luster, R2AD

Scope (M0018)

It is important to develop a consensus-based common language and vocabulary for the terms used in Big Data across stakeholders from industry, academia, and government. In addition, it is critical to identify essential actors with roles and responsibilities…

Tasks

• For Definitions: Compile terms used from all stakeholders regarding the meaning of Big Data from various standard bodies, domain applications, and diversified operational environments.

• For Taxonomies: Identify key actors with their roles and responsibilities from all stakeholders, categorize them into components and subcomponents based on their similarities and differences

• Develop Big Data Definitions and taxonomies

(33)

Big Data consists of extensive datasets, primarily in the characteristics of volume, velocity and/or variety, that require a scalable architecture for efficient storage, manipulation, and analysis.

Data Scientist is a practitioner who has sufficient knowledge of the overlapping regimes of expertise in business needs, domain knowledge, analytical skills and programming expertise to manage the end-to-end scientific method process through each stage in the Big Data lifecycle.

(34)

Reference Architecture

Orit Levin, Microsoft
James Ketner, AT&T
Don Krapohl, Augmented Intelligence

Scope (M0021)

The goal is to enable Big Data stakeholders to pick-and-choose technology-agnostic analytics tools for processing and visualization in any computing platform and cluster while allowing added value from Big Data service providers and the flow of data between the stakeholders in a cohesive and secure manner.

Tasks

• Gather and study available Big Data architectures representing various stakeholders, different data types, and use cases, and document the architectures using the Big Data taxonomies model based upon the identified actors with their roles and responsibilities.

• Ensure that the developed Big Data reference architecture and the Security and Privacy Reference Architecture correspond and complement each other.

(35)

Reference Architecture Subgroup

Key documents:

• M0151 – White Paper
• M0123 – Working Draft
• M0039 – Data Processing Flow
• M0017 – Data Transformation Flow

(36)

Security & Privacy

Arnab Roy, CSA/Fujitsu
Nancy Landreville, U. MD
Akhil Manchanda, GE

Scope (M0019)

The focus is to form a community of interest from industry, academia, and government, with the goal of developing a consensus secure reference architecture to handle security and privacy issues across all stakeholders. This includes gaining an understanding of what standards are available or under development, as well as identifying which key organizations are working on these standards.

Tasks

• Gather input from all stakeholders regarding security and privacy concerns in Big Data processing, storage, and services.

• Analyze/prioritize a list of challenging security and privacy requirements that may delay or prevent adoption of Big Data deployment

• Develop a Security and Privacy Reference Architecture that supplements the general Big Data Reference Architecture

(37)

Security and Privacy Subgroup

Requirements Scope
• Infrastructure Security
• Data Privacy
• Data Management
• Integrity & Reactive Security

Requirements – Use Cases Studied
• Retail (consumer)
• Healthcare
• Media
• Government
• Marketing

Architecture & Taxonomies
• Privacy
• Provenance
• System Health

(38)

Technology Roadmap

Carl Buffington, USDA/Vistronix
Dan McClary, Oracle
David Boyd, Data Tactic

Scope (M0022)

The goal is to develop a consensus vision with recommendations on how Big Data should move forward by performing a good gap analysis through the materials gathered from all other NBD subgroups. This includes setting standardization and adoption priorities through an understanding of what standards are available or under development as part of the recommendations.

Tasks

• Gather input from NBD subgroups and study the taxonomies for the actors’ roles and responsibility, use cases and requirements, and secure reference architecture.

• Gain understanding of what standards are available or under development for Big Data

• Perform a thorough gap analysis and document the findings

• Identify what possible barriers may delay or prevent adoption of Big Data

• Document vision and recommendations

(39)

Technology Roadmap Subgroup

Key document: M0087 – Working Draft

• Inputs from other subgroups

• Potential Standards Group with Big Data-related activities (M0035)

• Capabilities & Technology Readiness

• Decision Framework

• Mapping & Gap Analysis

• Big Data Strategies

[Diagram: inputs from the Definitions & Taxonomies, Requirements & Use Cases, Security & Privacy, and Reference Architecture subgroups]

(40)

Subgroups Working Draft Outline

Contact: [email protected]

Website: http://bigdatawg.nist.gov

Join NBD-PWG: http://bigdatawg.nist.gov/newuser.php

Documents: http://bigdatawg.nist.gov/show_InputDoc.php

Working Drafts (under editing)
• Big Data Definitions & Taxonomies (M0142)
• Big Data Requirements (M0245)
• Big Data Security & Privacy Requirements (M0110)
• Big Data Architectures White Paper Survey (M0151)
• Big Data Reference Architectures (M0226)
• Big Data Security & Privacy Reference Architecture (M0110)
• Big Data Technology Roadmap (M0087)

NIST Big Data Workshop Slides: http://bigdatawg.nist.gov/workshop.php

(41)

The SmartAmerica Challenge

The “Arpanet” for CPS Innovation

Build an integrated Cyber-Physical Systems Framework that allows interconnection of test beds and interoperation through shared data and associated data analytics for easy integration and accelerated adoption of CPS applications.

(42)

• Industry
– GE, IBM, Qualcomm, Intel, Schneider Electric, Philips, AT&T, UTRC, Boeing…

• Research/Educational Institutions
– MIT, Harvard, UC Berkeley, Vanderbilt, U Penn, UCLA, Internet2, US Ignite, Massachusetts General Hospital…

• Government
– NIST, NSF, DoT, DoD, DHS, Montgomery County…

(43)

June 11, 2014

Washington, DC

See:

http://www.nist.gov/el/smartamerica.cfm

(44)

Thank you!

Web Sites:
bigdatawg.nist.gov
www.nist.gov/el/smartamerica.cfm

Contact:
[email protected]
