Academic year: 2021

(1)

Cloud Computing and Big Data

Karl Benedict

Earth Data Analysis Center, University Libraries, Department of Geography

University of New Mexico

[email protected]

(2)

An Architecture Designed for Scalability

Definitions of Cloud Computing and Big Data

Development of an extensible Services Oriented Architecture for Research Data management, discovery & access

Use of cloud-compatible software components and

(6)

Cloud Computing?

Mell P & Grance T (2011) The NIST Definition of Cloud Computing - Recommendations of the National Institute of Standards and Technology. (National Institute of Standards and Technology, Computer Security Division, Information Technology Laboratory, Gaithersburg, MD), p 7.

Essential characteristics

On-demand self-service

Broad network access

Resource pooling

Rapid elasticity

Measured service

Service models

Software as a Service (SaaS)

Platform as a Service (PaaS)

Infrastructure as a Service (IaaS)

Deployment models

Private Cloud

Community Cloud

Public Cloud

Hybrid Cloud

(7)

Cloud Computing?

Future Characteristics and Capabilities

Current Characteristics and Capabilities

On-demand self-service

Broad network Access

Resource Pooling

Software as a Service (SaaS)

Private Cloud

Rapid Elasticity

Measured Service

Hybrid Cloud

(8)

Big Data

Many Problems and Solutions

* http://www.opentracker.net/article/25-definitions-big-data

NASA - Goddard Space Flight Center

(9)

Big Data

Sample Solutions

Horizontal vs. vertical scaling

Unstructured and semi-structured data models

Parallelism

Linked analytics

Machine learning
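The horizontal scaling and parallelism items above can be illustrated with a minimal sketch. It is not taken from the presentation: the record layout, the `shard_for` and `total_points` names, and the shard count are all assumptions chosen for illustration. Records are hash-partitioned across a fixed number of shards, and each shard is then processed independently, with threads standing in for separate worker nodes.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index using a stable hash.

    hashlib is used because Python's built-in hash() for strings is
    salted per process, so it would not route consistently across runs.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def partition(records, shard_count):
    """Distribute records across shards by hashing their (hypothetical) 'id' field."""
    shards = [[] for _ in range(shard_count)]
    for record in records:
        shards[shard_for(record["id"], shard_count)].append(record)
    return shards


def count_points(shard):
    """Work done independently on one shard: count its data points."""
    return sum(len(record["points"]) for record in shard)


def total_points(records, shard_count=4):
    """Scale out: one worker per shard, then combine the partial results."""
    shards = partition(records, shard_count)
    with ThreadPoolExecutor(max_workers=shard_count) as pool:
        return sum(pool.map(count_points, shards))


# Hypothetical sample data: dataset i carries i data points.
records = [{"id": f"dataset-{i}", "points": list(range(i))} for i in range(10)]
print(total_points(records))  # 45
```

Adding capacity here means raising `shard_count` (more nodes) rather than moving to a larger single machine, which is the distinction between horizontal and vertical scaling. In CPython, threads do not give CPU parallelism for pure-Python work; a real deployment would place each shard on its own process or host.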

(10)

Big Data

An Aligned Definition

“An easily scalable system of unstructured data with accompanying tools that can efficiently pull structured datasets.” *

* Definition provided by John Denver as a comment on the FCW blog post of 4/15/2013 entitled “Sketching the big picture on big data”

(12)

EDAC’s Data Challenge

A Snapshot

RGIS
• 120,031 datasets
• 12,171 files
• 11,808 vectors
• 96,052 rasters

NM EPSCoR
• 281,315 datasets
• 9,054 files
• 169,981 vectors
• 102,280 rasters

10.5 TB on disk

140 GB MongoDB (22 GB Postgres)

114 million documents

1.115 billion data points

1.5 million discrete, downloadable data objects
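As a quick consistency check on the figures above, the following sketch sums the per-type counts for each collection and compares them with the stated dataset totals. It assumes that files, vectors, and rasters are meant to partition each collection's datasets; the slide does not say so explicitly, but the numbers agree with that reading. The dictionary layout and the `type_total` helper are illustrative only.

```python
# Counts transcribed from the snapshot slide.
collections = {
    "RGIS": {"datasets": 120_031, "files": 12_171, "vectors": 11_808, "rasters": 96_052},
    "NM EPSCoR": {"datasets": 281_315, "files": 9_054, "vectors": 169_981, "rasters": 102_280},
}


def type_total(counts):
    """Sum the per-type counts (files + vectors + rasters) for one collection."""
    return counts["files"] + counts["vectors"] + counts["rasters"]


# Assumption being checked: the three types add up to the dataset count.
for name, counts in collections.items():
    assert type_total(counts) == counts["datasets"], name

total_datasets = sum(counts["datasets"] for counts in collections.values())
print(total_datasets)  # 401346
```

Both collections pass the check, giving 401,346 datasets across RGIS and NM EPSCoR combined.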

(13)

EDAC’s Data Challenge

A Snapshot

Metadata              Vectors   Rasters   Files      Services
FGDC                  SHP       GeoTIFF   ZIP        WMS
FGDC-RSE              KML       IMG       HTML       WFS
ISO 19115-2 / 19139   GML       SID       PDF        WCS
ISO 19119             GeoJSON   ECW       DOC/DOCX
ISO 19110             JSON      DEM       GZ
                      CSV       ASCII     XLS/XLSX
                      XLS                 PPT/PPTX

(14)

EDAC’s Data Challenge

Our Solution

[Architecture diagram; recoverable labels, grouped by tier]

Client Applications
HTTP/HTTPS
OAI-PMH / OGC-CSW / z39.50 / HTTP

Services
REST Services (load-balanced application cluster)
OGC Services: WMS, WFS, WCS
Search, Download Data, Stream Data, Admin, DataONE API, CUAHSI API
Geoportal Server, Geoportal Server, WAF

Data Management
PostgreSQL/PostGIS (load-balanced PG-Pool): dataset metadata, spatial data
MongoDB (sharded replica set): vector attribute data
File System: rasters, compressed datasets, dataset cache
Metadata
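The split of content across the three data stores in this slide can be sketched as a simple lookup. The mapping itself follows the slide (PostgreSQL/PostGIS for dataset metadata and spatial data, MongoDB for vector attribute data, the file system for rasters, compressed datasets, and the dataset cache). Everything else is an assumption for illustration: the `Store` enum, the content-kind keys, and the `store_for` and `group_by_store` helpers are hypothetical names, not part of EDAC's code.

```python
from enum import Enum


class Store(Enum):
    POSTGIS = "PostgreSQL/PostGIS"
    MONGODB = "MongoDB"
    FILE_SYSTEM = "File system"


# Content kind -> store, as listed on the slide. Key names are hypothetical.
STORE_FOR_CONTENT = {
    "dataset_metadata": Store.POSTGIS,
    "spatial_data": Store.POSTGIS,
    "vector_attribute_data": Store.MONGODB,
    "raster": Store.FILE_SYSTEM,
    "compressed_dataset": Store.FILE_SYSTEM,
    "dataset_cache": Store.FILE_SYSTEM,
}


def store_for(content_kind: str) -> Store:
    """Return the store responsible for a content kind, or fail loudly."""
    try:
        return STORE_FOR_CONTENT[content_kind]
    except KeyError:
        raise ValueError(f"no store configured for {content_kind!r}") from None


def group_by_store(content_kinds):
    """Group the content kinds touched by one request by the store that holds them."""
    plan = {}
    for kind in content_kinds:
        plan.setdefault(store_for(kind), []).append(kind)
    return plan


# A vector dataset, under this reading, touches two stores.
plan = group_by_store(["dataset_metadata", "spatial_data", "vector_attribute_data"])
for store, kinds in plan.items():
    print(store.value, kinds)
```

Keeping the routing in one table means a service tier can scale each store independently, for example by adding MongoDB shards without touching the PostGIS pool.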

(15)

Cloud Computing & Big Data? (Now)

(17)

Cloud Computing & Big Data? (Future)

(18)

In Conclusion

(19)

Acknowledgements

Support

New Mexico Legislature - NM RGIS Program

National Science Foundation EPSCoR Program - Awards 0814449, 0918635 (in collaboration with ID and NV EPSCoR programs)

NASA - ACCESS Award NNX12AF52A (in collaboration with UTEP, KU & PNNL)

Lead Developer: Soren Scott

Contributions by: Renzo Sanchez-Silva, William Hudspeth
