Trends and Research Opportunities in Spatial Big Data Analytics and Cloud Computing NCSU GeoSpatial Forum

(1)

Copyright © 2014, Oracle and/or its affiliates. All rights reserved.

Trends and Research Opportunities in

Spatial Big Data Analytics and Cloud Computing

NCSU GeoSpatial Forum

Siva Ravada

Senior Director of Development, Oracle Spatial and MapViewer

(2)

(3)

Evolving Technology Platforms

• Compass, telescope, sextant, paper maps

• Mainframe computers

• Workstations, GIS applications

• IT revolution, spatial databases

• GeoEnabled Infrastructure: LiDAR, Mobile, Stream Processing, Sensors, Cloud Computing

Geographic Information Systems rely on the technology of the era

(4)

Disappearing line between Geospatial Technologies and Information Technologies

[Figure labels: Mapping, Digital data file, Spatial Information Technology, SOA, Geographic Information Systems]

(5)

Latest Technology Trends

Big Data Technology

Hadoop, MapReduce, Hadoop Distributed File System (HDFS), Apache Spark

Cloud Computing

(6)

Big Data Technology Defined

Big Data: Techniques and Technologies that Enable Enterprises to Effectively and Economically Analyze All of their Data

(7)

Big Data Definition

The 3Vs

Volume (amount of data)

Velocity (speed of data in and out)

Variety (range of data types and sources)

- 2001 Meta Group (now Gartner) definition of Big Data

4th V - Veracity (Uncertainty of Data)

- 2012 IBM added a 4th V

Current Viewpoint:

Big Data = Hadoop

Emerging Viewpoint:

Big Data = Hadoop + Relational + NoSQL…

- 2013 Facebook, 2014 Gartner

Is Big Spatial Data different from Big Data?

How does Big Spatial data fit into GIS?

(8)

Big Data Architecture

• BIG DATA MANAGEMENT: Simplify Access To All Data

• BIG DATA ANALYTICS: Discover And Predict, Fast

• BIG DATA APPLICATIONS: Accelerate Data-Driven Action

• BIG DATA INTEGRATION: Connect And Govern Any Data

• DATA CAPITAL

(9)

Key Factors

• Simplify access to all data

• Discover and predict, fast

• Govern and secure all data

(10)

Big Data + Advanced Analytics

• Profile: Easily add data and see it automatically and continuously cataloged, enriched and related

• Find: Use familiar guided search across massive amounts of diverse data

• Understand: Know what’s important from diagnostic analysis of millions of data characteristics

• Transform: Powerful tools to quickly clean up and wrangle dirty data so it’s ready to go

• Discover: Uncover valuable new insights

• Collaborate: Publish, share and evolve as you learn more

• Predict: Use new insights to define and refine predictive models

(11)

Cloud Computing

• Cloud computing enables customers to consume compute resources as a utility

Just like electricity

No need to build and maintain computing infrastructures in-house

• Relies on large data centers operated by cloud providers

• Public Cloud and Private Cloud

• Infrastructure as a service (IaaS): Amazon AWS storage

• Platform as a service (PaaS): IBM, Oracle, MS Azure

• Software as a service (SaaS): AWS web services, Oracle, IBM

(12)

Elastically Scalable

• Consumers can scale up as needs increase and then scale down again as demands decrease

• Elasticity is ideally dynamic and transparent, but scaling can also be an explicit action

What matters most is that scaling is possible

• Applies to storage, infrastructure, and software

• Elasticity also implies fault tolerance built into the system

Seamlessly transfer the state of the application to a backup if the primary fails

• Virtualization Software is very important to achieve this goal

(13)

Self-Service Operations

• End users can spin up computing resources for almost any type of workload on-demand

This applies to storage and compute resources

• Customers should be able to perform all application- and system-related operations through self-service, without filing a service request with either the support or the cloud operations teams

• This involves:

Managing space (e.g. their block store or object store space)

Being able to access and analyze diagnostic logs

Being able to migrate data and metadata from one environment to another

(14)

Pay Per-Use

• Computing resources are measured at a granular level

This allows users to pay only for the resources and workloads they use

• This is one of the most important aspects for the growth of the cloud

• Consumers can now access a very large pool of computing resources when required without worrying about the cost or management of these hardware resources
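As a minimal sketch of granular metering, the following totals metered usage per resource and prices it. The resource names and unit rates are invented for illustration and do not reflect any provider's pricing.

```python
from collections import defaultdict

# Hypothetical unit rates (currency units per metered unit); not real prices.
RATES = {
    "compute_hours": 0.10,
    "storage_gb_month": 0.02,
}

def bill(usage_records):
    """Sum metered usage per resource and price it at the assumed rates."""
    totals = defaultdict(float)
    for resource, quantity in usage_records:
        totals[resource] += quantity
    return {r: round(q * RATES[r], 2) for r, q in totals.items()}

# Made-up usage records: (resource, metered quantity).
usage = [("compute_hours", 3.5), ("compute_hours", 1.5), ("storage_gb_month", 200)]
charges = bill(usage)
```

The consumer is charged only for the five compute hours and the storage actually recorded, not for the capacity of the underlying cluster.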

(15)

Big Data vs Cloud Computing

• Is Big Data the same as cloud computing?

• Not really, but they are tightly related

• Big Data by itself is not affordable for all consumers

Large infrastructure cost to build cluster computing resources

Human cost to find trained IT staff to manage them

• But Cloud service providers can afford to manage these large computing resources and make slices of them available to consumers

• Cloud computing has many technologies

Hadoop, MapReduce, Relational DBs, middleware technology

(16)

Spatial Big Data and Cloud Computing Challenges

(17)

Spatial Big Data Challenges

• Geo-tagging in the context of partial or indirect reference

• Minimize the time it takes to make the data available for analysis

• Discover Spatial and Temporal correlations between different data points

• Data loading time should be minimal to make the data available for use

• Load the data for immediate use, but create spatial indexes over time

• How to leverage the code from spatial database applications developed over the years

• Predictive Analytics for various applications
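One way to read "load the data for immediate use, but create spatial indexes over time" is sketched below. It is only an illustration under simple assumptions: points in a plane, a uniform grid as the index, and a hypothetical `LazyGridStore` class that is not part of any product. Queries scan the not-yet-indexed tail of the data, so results are correct at every stage of index construction.

```python
from collections import defaultdict

class LazyGridStore:
    """Points are queryable as soon as they are loaded; a grid index is built incrementally."""

    def __init__(self, cell_size=1.0):
        self.cell_size = cell_size
        self.points = []             # all loaded points, in load order
        self.indexed_upto = 0        # points[:indexed_upto] are already in the grid
        self.grid = defaultdict(list)

    def _cell(self, x, y):
        return (int(x // self.cell_size), int(y // self.cell_size))

    def load(self, pts):
        """Loading is a cheap append; no index work happens here."""
        self.points.extend(pts)

    def index_some(self, budget):
        """Index at most `budget` more points, e.g. during idle time."""
        end = min(self.indexed_upto + budget, len(self.points))
        for i in range(self.indexed_upto, end):
            x, y = self.points[i]
            self.grid[self._cell(x, y)].append(i)
        self.indexed_upto = end

    def query(self, xmin, ymin, xmax, ymax):
        """Return ids of points inside the box, using the index where it exists."""
        def inside(i):
            x, y = self.points[i]
            return xmin <= x <= xmax and ymin <= y <= ymax

        hits = []
        cx0, cy0 = self._cell(xmin, ymin)
        cx1, cy1 = self._cell(xmax, ymax)
        for cx in range(cx0, cx1 + 1):
            for cy in range(cy0, cy1 + 1):
                hits.extend(i for i in self.grid.get((cx, cy), ()) if inside(i))
        # Linear scan over the tail that has not been indexed yet.
        hits.extend(i for i in range(self.indexed_upto, len(self.points)) if inside(i))
        return sorted(hits)

store = LazyGridStore(cell_size=10.0)
store.load([(1, 1), (15, 15), (25, 5), (3, 4)])
before = store.query(0, 0, 10, 10)   # answered by scanning, no index yet
store.index_some(budget=2)           # index part of the data
after = store.query(0, 0, 10, 10)    # mixed: grid lookup plus tail scan
```

The trade-off is that early queries pay for a scan, while load time stays minimal and queries speed up as indexing catches up.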

(18)

Location Infused Technology

Java, Databases, Applications, Cloud

(19)

GeoSpatial Big Data Sources

• Traditional Data sources

Raster (satellite imagery, elevation models, images)

Vector (road networks, admin boundaries)

• Machine generated

Internet of things

Social media

Sensors

In-vehicle navigation systems (trajectories, traffic information)

Mobile phones

(20)

Extend Spatial Analytics with Cloud and Big Data

(21)

Predictive Analysis based on tweets

• How to infer potential trouble based on tweets?

• Data may have more than one spatial location

Tweets are generated from a location, but the tweet might be referring to events at a different location

• Find the trend

“meet at NYC city center at 4PM”

“protest against climate change”

• Take action to deploy law enforcement to stop any potential crowd trouble

• Needs new algorithms to find spatial-temporal correlations and predict future events
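The simplest form of a spatial-temporal correlation is a count of messages that refer to the same place in the same time window. The sketch below shows only that baseline, not a predictive algorithm. The record fields (`origin_place`, `mentioned_place`, `time`) and the sample tweets are hypothetical, and it assumes the referenced place has already been extracted from the text.

```python
from collections import Counter
from datetime import datetime

def hotspots(tweets, min_count=3):
    """Count tweets per (referenced place, hour) and keep buckets at or above min_count."""
    counts = Counter()
    for t in tweets:
        # Prefer the place the tweet talks about; fall back to where it was sent from.
        place = t.get("mentioned_place") or t["origin_place"]
        hour = t["time"].replace(minute=0, second=0, microsecond=0)
        counts[(place, hour)] += 1
    return {key: n for key, n in counts.items() if n >= min_count}

# Made-up sample records for illustration only.
tweets = [
    {"origin_place": "Brooklyn", "mentioned_place": "NYC city center",
     "time": datetime(2014, 1, 1, 16, 5)},
    {"origin_place": "Queens", "mentioned_place": "NYC city center",
     "time": datetime(2014, 1, 1, 16, 20)},
    {"origin_place": "Newark", "mentioned_place": "NYC city center",
     "time": datetime(2014, 1, 1, 16, 45)},
    {"origin_place": "Boston", "mentioned_place": None,
     "time": datetime(2014, 1, 1, 16, 30)},
]
flagged = hotspots(tweets)
```

Grouping by the mentioned place rather than the origin reflects the point that a tweet can carry more than one spatial location.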

(22)

Spatial Cloud Services

• System Developers (CS background)

Focus on big data, database, Spark, enabling data sets, etc.

• Application Developers (GIS background)

Think about solving bigger problems

More analysis frameworks and data sets are available now

No barriers to entry

Predictive analytics

(23)

Precision Farming Example

• Goal: Build Predictive Analytical Model to increase the crop yield

• Minimize water resources

• Minimize fertilizer

• Minimize the human capital cost

• Use all available sensor based data sources

Satellite imagery, ground based sensors, etc.

(24)

How to build a precision farming application

• Acquire satellite data as required

• Acquire disk storage for storing the data

• Set up a cluster of machines to do the computations

• Find a scientist to build the models required to do the analytics using the raster data

• Expensive due to hardware, data acquisition and software costs

(25)

Need to develop new Spatial Algorithms?

• MapReduce uses data partitioning to achieve high performance

• Can we use divide and conquer algorithms without modifications?

Depends on how the data is stored

• Need new algorithms for new use cases

• Data Scientist’s focus should be on analysis of data

Storage and data management should be done by the system

This should be done via a model driven architecture
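To see why an unmodified divide-and-conquer step can give wrong answers on partitioned spatial data, consider counting pairs of points within a distance d. The sketch below emulates the map and reduce steps in plain Python with a uniform grid as the partitioning scheme. The sample points, the cell size, and the function names are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations
from math import dist

def close_pairs(points, d):
    """Count unordered pairs of points no farther apart than d (brute force)."""
    return sum(1 for p, q in combinations(points, 2) if dist(p, q) <= d)

def partition_by_cell(points, cell_size):
    """Map step: key each point by the grid cell that contains it."""
    cells = defaultdict(list)
    for x, y in points:
        cells[(int(x // cell_size), int(y // cell_size))].append((x, y))
    return cells

# Made-up points; the last two sit on either side of the cell boundary at x = 10.
points = [(1.0, 1.0), (1.5, 1.0), (9.8, 5.0), (10.2, 5.0)]
cells = partition_by_cell(points, cell_size=10.0)

# Reduce step: solve each partition independently, then add the partial results.
per_partition = sum(close_pairs(pts, d=1.0) for pts in cells.values())

# Reference answer computed over the whole data set.
exact = close_pairs(points, d=1.0)
```

Here `per_partition` misses the pair that straddles the boundary, so it is lower than `exact`. A partition-aware algorithm has to handle such pairs, for example by also sending points near a cell edge to the neighboring cell.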

(26)

Spatial Cloud Services Development

(27)

Data storage and indexing

• Systems should support a few data storage and indexing models

Vector data

Raster data

Sensor data

Enable spatial and temporal search

Applications can choose one of the provided storage models based on the data and query requirements

• Provide alternate ways to acquire data as required

Web services, buy as needed, use from existing sources

• Provide reference data and models

• Free up the data scientist to do actual data analysis instead of designing data storage and layout models

(28)

Geo-Spatial Big Data Management

• Use once

Data is loaded into the data store and analyzed once

Extract summary or intelligence once and use it in other places

• Use many times

Query the data to answer different types of questions

Produce new data products

(29)

Data Analytics Challenge

Separate silos of information to analyze

Database

(30)

Data Analytics Challenge

Separate data access interfaces

Database

(31)

What does simplification mean for Spatial Big Data analytics?

[Figure: the "Before" panel shows Data Science PhD; the "After" panel shows Anyone and Web service APIs; a "???" label also appears]

(32)

Spatial Cloud Services Application Development

(33)

Advanced Analytics

Bring the Analytics to the Data

• Understand the data

• Decipher the data to uncover hidden patterns that can be used for better decisions

• Understand hidden correlations and use these relationships to solve business problems

• Predict future outcomes based on observed data before they happen

• Use predictive analytics, machine learning, and data mining techniques on big data

(34)

Two types of Analytical Approaches

• Reactive

Collect large volumes of data from event logs, web logs, etc.

Process, analyze and extract summaries from the data

Feed the summary data into a traditional DW system

• Proactive

Process the data as it comes in to find the correlations

Find out if the patterns in the new data mean something

Initiate actions based on perceived patterns
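The difference between the two approaches can be sketched in a few lines. This is an illustrative toy, not a description of any particular system. Events are assumed to be `(timestamp, kind)` tuples, the "pattern" is simply a burst of one kind of event inside a sliding time window, and `ProactiveMonitor` is a hypothetical name.

```python
from collections import Counter, deque

def reactive_summary(events):
    """Reactive: summarize a stored log after the fact (counts per event kind)."""
    return Counter(kind for _, kind in events)

class ProactiveMonitor:
    """Proactive: inspect each event on arrival and react when a pattern appears."""

    def __init__(self, kind, window, threshold):
        self.kind, self.window, self.threshold = kind, window, threshold
        self.times = deque()   # timestamps of recent matching events
        self.alerts = []       # stand-in for "initiate an action"

    def on_event(self, t, kind):
        if kind != self.kind:
            return
        self.times.append(t)
        # Drop occurrences that have fallen out of the sliding time window.
        while self.times and self.times[0] <= t - self.window:
            self.times.popleft()
        if len(self.times) >= self.threshold:
            self.alerts.append(t)

# Made-up event stream for illustration.
events = [(0, "error"), (1, "ok"), (2, "error"), (3, "error"), (20, "error")]

summary = reactive_summary(events)          # available only after collection
monitor = ProactiveMonitor(kind="error", window=5, threshold=3)
for t, kind in events:
    monitor.on_event(t, kind)               # decision made as each event arrives
```

The reactive summary would be loaded into a warehouse for later reporting, whereas the monitor raises its alert at the moment the third error arrives inside the window.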

(35)

Precision Farming Example

• Multi-band raster data

RGB

Thermal Vegetation

• Analyze thermal band for vegetation properties

• Compute NDVI models

• Results can be used to model

Water requirements for different parts of the farm

Growth indicators

Fertilizer schedules

Identify under growth (caused by pests)
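NDVI (Normalized Difference Vegetation Index) is conventionally computed per pixel from the near-infrared and red bands as

NDVI = (NIR - Red) / (NIR + Red)

The sketch below applies that formula to two small grids in plain Python. The band values are made up, and treating a pixel with zero reflectance in both bands as 0.0 is an assumption of this example.

```python
def ndvi(nir, red):
    """Per-pixel NDVI = (NIR - Red) / (NIR + Red); 0.0 where both bands are zero."""
    result = []
    for nir_row, red_row in zip(nir, red):
        row = []
        for n, r in zip(nir_row, red_row):
            total = n + r
            row.append((n - r) / total if total else 0.0)
        result.append(row)
    return result

# Made-up 2x2 band values (e.g. raw digital numbers) for illustration.
nir = [[80, 60], [30, 0]]
red = [[20, 20], [30, 0]]
index = ndvi(nir, red)
```

Values close to 1 usually indicate dense, healthy vegetation, while values near 0 or below point to sparse or stressed vegetation, bare soil, or water. A production pipeline would typically run the same arithmetic over whole raster tiles with an array library.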

(36)

Application Development in Spatial Cloud

[Figure: the "Before" panel shows Application Development and Database; the "After" panel shows Application Development, Spatial Cloud APIs, and Database]

(37)

Breaking Barriers with Cloud Computing

• Cloud computing is changing the way systems are built

• No more proprietary data silos

Better result than what OGC/ISO standards have achieved in this respect

• No more closed systems

• Traditional software development paradigms are changing

• On-premise cloud will replace most of the on-premise proprietary systems
