• No results found

A Quick Introduction to Google's Cloud Technologies

N/A
N/A
Protected

Academic year: 2021

Share "A Quick Introduction to Google's Cloud Technologies"

Copied!
44
0
0

Loading.... (view fulltext now)

Full text

(1)

A Quick Introduction to Google's Cloud

Technologies

Chris Schalk Developer Advocate

@cschalk

Anatoli Babenia

(2)

Agenda

● Introduction

● Introduction to Google's Cloud Technologies ● App Engine Review

● Google's new Cloud Technologies ○ Google Storage

○ Prediction API ○ BigQuery

(3)

Google's Cloud Technologies

Google BigQuery Google Prediction API Google Storage

(4)

Specialized Services

Blobstore

Images

Mail XMPP Task Queue

Memcache Datastore URL Fetch

User Service But, is that it?

(5)

New Google Cloud Technologies

● Google Storage

○ Store your data in Google's cloud

● Prediction API

○ Google's machine learning tech in an API

● BigQuery

○ Hi-speed data analysis on massive scale

(6)

Google Storage for Developers

(7)

What Is Google Storage?

● Store your data in Google's cloud

○ any format, any amount, any time

● You control access to your data

○ private, shared, or public

(8)

Sample Use Cases

Static content hosting

e.g. static html, images, music, video

Backup and recovery

e.g. personal data, business records

Sharing

e.g. share data with your customers

Data storage for applications

e.g. used as storage backend for Android, AppEngine, Cloud based apps

Storage for Computation

(9)

Google Storage Benefits

High Performance and Scalability

Backed by Google infrastructure

Strong Security and Privacy

Control access to your data

Easy to Use

(10)

Google Storage Technical Details

● RESTful API

○ Verbs: GET, PUT, POST, HEAD, DELETE ○ Resources: identified by URI

○ Compatible with S3 ● Buckets ○ Flat containers ● Objects ○ Any type ○ Size: 100 GB / object

● Access Control for Google Accounts ○ For individuals and groups

● Two Ways to Authenticate Requests ○ Sign request using access keys ○ Web browser login

(11)

Demo

● Tools:

○ GSUtil

○ GS Manager

(12)

Google Storage usage within Google

Haiti Relief Imagery USPTO data

Partner Reporting Google BigQuery Google Prediction API Partner Reporting

(13)
(14)

Google Storage - Pricing

○ Free trial quota until Dec 31, 2011

■ For first project ■ 5 GB of storage

■ 25 GB download/upload data

■ 20 GB to Americas/EMEA, 5GB APAC ■ 25K GET, HEAD requests

■ 2,5K PUT, POST, LIST* requests

○ Production Storage

■ $0.17/GB/Month (Location US, EU) ■ Upload - $0.10/GB

■ Download

■ $0.15/GB Americas / EMEA ■ $0.30/GB APAC

■ Requests

■ PUT, POST, LIST - $0.01 / 1000 Requests ■ GET, HEAD - $0.01 / 10,000 Requests

(15)

Google Storage Summary

● Store any kind of data using Google's cloud infrastructure ● Easy to Use APIs

● Many available tools and libraries

○ gsutil, GS Manager ○ 3rd party:

(16)

Google Prediction API

(17)

Google Prediction API as a simple example

(18)

How does it work?

"english" The quick brown fox jumped over the

lazy dog.

"english" To err is human, but to really foul things up you need a computer.

"spanish" No hay mal que por bien no venga.

"spanish" La tercera es la vencida.

? To be or not to be, that is the

question.

? La fe mueve montañas.

The Prediction API finds relevant

features in the

sample data during training.

The Prediction API later searches for those features during prediction.

(19)

A virtually endless number of applications...

Customer Sentiment Transaction Risk Species Identification Message Routing Legal Docket Classification Suspicious Activity Work Roster Assignment Recommend Products Political Bias Uplift Marketing Email Filtering Diagnostics Inappropriate Content Career Counselling Churn Prediction

(20)

Using the Prediction API

1. Upload

2. Train

Upload your training data to

Google Storage

Build a model from your data

Make new predictions 3. Predict

(21)

Step 1: Upload

Upload your training data to Google Storage

● Training data: outputs and input features

● Data format: comma separated value format (CSV)

"english","To err is human, but to really ..."

"spanish","No hay mal que por bien no venga."

...

Upload to Google Storage

(22)

Step 2: Train

Create a new model by training on data

To train a model:

POST prediction/v1.3/training

{"id":"mybucket/mydata"}

Training runs asynchronously. To see if it has finished:

GET prediction/v1.3/training/mybucket%2Fmydata

{"kind": "prediction#training",... ,"training status": "DONE"}

(23)

Step 3: Predict

Apply the trained model to make predictions on new data POST

prediction/v1.3/training/mybucket%2Fmydata/predict

{ "data":{

"input": { "text" : [

(24)

Step 3: Predict

Apply the trained model to make predictions on new data POST prediction/v1.3/training/bucket%2Fdata/predict

{ "data":{

"input": { "text" : [

"J'aime X! C'est le meilleur" ]}}}

{ data : { "kind" : "prediction#output", "outputLabel":"French", "outputMulti" :[ {"label":"French", "score": x.xx} {"label":"English", "score": x.xx} {"label":"Spanish", "score": x.xx}]}}

(25)

Step 3: Predict

Apply the trained model to make predictions on new data

import httplib

header = {"Content-Type" : "application/json"}#...put new data in JSON format in params variable

conn = httplib.HTTPConnection("www.googleapis.com")conn.request ("POST",

"/prediction/v1.3/query/bucket%2Fdata/predict", params, header)print conn.getresponse()

(26)

Google Prediction API Demo!

● Command line Demos ○ Training a model

○ Checking training status ○ Making predictions

● A complete Web application using the JavaScript API for Prediction

(27)

Prediction API Capabilities

Data

● Input Features: numeric or unstructured text ● Output: up to hundreds of discrete categories Training

● Many machine learning techniques ● Automatically selected

● Performed asynchronously Access from many platforms:

● Web app from Google App Engine

● Apps Script (e.g. from Google Spreadsheet) ● Desktop app

(28)

Prediction API - key features

● Multi-category prediction

○ Tag entry with multiple labels

● Continuous Output

○ Finer grained prediction rankings based on multiple labels

● Mixed Inputs

○ Both numeric and text inputs are now supported

(29)

Google BigQuery

(30)

Introducing Google BigQuery

● Google's large data adhoc analysis technology ○ Analyze massive amounts of data in seconds ● Simple SQL-like query language

● Flexible access

(31)

Why BigQuery?

(32)

Many Use Cases ...

Spam Trends Detection Web Dashboards Network Optimization Interactive Tools

(33)

Key Capabilities of BigQuery

● Scalable: Billions of rows ● Fast: Response in seconds ● Simple: Queries in SQL

● Web Service ○ REST

○ JSON-RPC

(34)

Using BigQuery

1. Upload

2. Import

Upload your raw data to

Google Storage

Import raw data into BigQuery table

Perform SQL queries on table

3. Query

(35)

Writing Queries

Compact subset of SQL ○ SELECT ... FROM ... WHERE ... GROUP BY ... ORDER BY ... LIMIT ...; Common functions

○ Math, String, Time, ... Statistical approximations

○ TOP

(36)

BigQuery via REST

GET /bigquery/v1/tables/{table name}

GET /bigquery/v1/query?q={query}

Sample JSON Reply:

{

"results": { "fields": { [

{"id":"COUNT(*)","type":"uint64"}, ... ] }, "rows": [ {"f":[{"v":"2949"}, ...]}, {"f":[{"v":"5387"}, ...]}, ... ] } }

(37)

Security and Privacy

Standard Google Authentication

● Client Login ● OAuth

● AuthSub

HTTPS support

● protects your credentials ● protects your data

(38)

Large Data Analysis Example

Wikimedia Revision history data from: http://download.wikimedia. org/enwiki/latest/enwiki-latest-pages-meta-history.xml.7z

(39)

Using BigQuery Shell

Python DB API 2.0 + B. Clapper's sqlcmd

(40)
(41)
(42)

Recap

● Google App Engine

○ Application development platform for the cloud

● Google Storage

○ High speed cloud data storage on Google's infrastructure

● Prediction API

○ Google's machine learning technology able to predict outcomes based on sample data

● BigQuery

○ Interactive analysis of very large data sets ○ Simple SQL query language access

(43)

Further info available at:

● Google App Engine

○ http://code.google.com/appengine

● Google Storage for Developers

○ http://code.google.com/apis/storage ● Prediction API ○ http://code.google.com/apis/predict ● BigQuery ○ http://code.google.com/apis/bigquery

(44)

Thank you!

Questions?

● @cschalk

References

Related documents

Boost Productivity using Google Documents, Google Sites, Google Cloud Connect for Microsoft Office and Google Video for Business.. Google Docs - Online Documents with

• Function that enables the user to sign in to the administrator setting function of the application using the administrator password registered on the MFP.. • Function that

Now it’s time to configure the Android SDK for Development Tools and the Google APIs.. Configure the Android SDK for Development Tools and

AduHidTest is a USB Device test program used to test the connection of ADU data acquisition devices to a USB port.. The program is also a useful tool to allow programmers to

A related diversification strategy is when the firm’s value chain exhibits competitively important cross business relationships. An unrelated diversification strategy occurs when

Solution Overview.. Figure 4 shows how you can use the MapReduce infrastructure to distribute data transfer from Google Cloud Storage to MapR-FS across Google Compute

This section describes the system requirements for customers who plan to use Google Cloud Platform (GCP) as their installation environment for Platfora, and Google Cloud Storage

Google — Has a private cloud that it uses Google — Has a private cloud that it uses for delivering many different services to its users, including email access,