Building Life Sciences & Genomics Data Workflows with AWS Storage Gateway

Academic year: 2021


(1)

Building Life Sciences & Genomics Data Workflows with AWS Storage Gateway

Stephen Litster, HPC HCLS Lead GTM, AWS

Michael Leonard, File Gateway Product Manager, AWS

(2)

Agenda

• Healthcare & Life Sciences Industry trends & workloads

• How Healthcare & Life Sciences customers are using AWS

• AWS Storage Gateway & clinical workflows

(3)

Life Sciences, Genomics, and Drug Discovery

Key Workloads

• Genomics

• Computational chemistry/M&S

• Informatics and ML

• Imaging

Key Industry Trends/Challenges

• Exponential data growth

• Secure, global collaboration platforms

• Inform research efforts with real world data

• Scientific reproducibility

Clinical development

Manufacturing & supply chain

(4)

Drug Discovery Pipeline

Challenges we’re hearing from life sciences executives

(5)

High volume lab workflows – typical challenges

Workflow stages:

• Data transfer from Lab to NAS
• HPC/HPDA Job Scheduling
• Enterprise Services
• Archive (Object Based)

Challenges:

• Resource constraints
• Skill set
• Visualization latency
• Application integration
• Database/web operation
• Latency
• Policies
• Functionality
• Local I/O and/or network

(6)

Planning for disruptive technologies

• X-Ray Crystallography, 1998
• Genomic Sequencing, 2009
• Digital Pathology, 2015
• Light Sheet Microscopy, 2017
• Cryo-Electron Microscopy, 2018

Experiment Management

Data Analysis

Reporting

Inference

• Value
• Veracity
• Variability
• Velocity
• Volume

(7)
(8)

The hybrid cloud model:

You have on-premises data and applications that want to use storage and services in the cloud

(9)

AWS Storage Gateway overview

File Gateway

Store and access objects in Amazon S3 from file-based applications with local caching

Volume Gateway

Block storage on-premises backed by cloud storage with local caching, Amazon EBS snapshots, and clones, integrated with AWS Backup

Tape Gateway

Drop-in replacement for physical tape infrastructure backed by cloud storage with local caching

File-based applications

(10)

Hybrid cloud storage use cases

For all stages of your cloud adoption journey

• Migration
• Modernization
• Continuous Reinvention

• Backup and archive data to AWS
• Low latency access for on-premises applications to cloud data

(11)

File Gateway

Store and access objects in Amazon S3 from file-based applications with local caching

Use cases

• Backup and archive data to AWS
• On-premises file shares backed by cloud storage
• Provide on-premises applications low-latency access to in-cloud data

Diagram: On premises (NFS/SMB clients, File Gateway), HTTPS, AWS Cloud (Amazon S3, lifecycle, any S3 storage class)

(12)

Move database and file backups into the cloud and free up on-premises storage capacity

Features

• NFS/SMB protocol support, mount shares directly on database and application servers
• Files stored durably in Amazon S3, lifecycle to any S3 storage class
• Local cache up to 64 TB for accessing recent backups
• Windows ACL support to control access to backup files

Benefits

• Reduce on-premises storage for backups
• Easily integrates with SAP, SQL Server, Oracle, HDFS and other applications
• Restore backups on-premises or in the cloud on EC2 or RDS

Diagram: On premises (Database / Application Server, NFS/SMB, File Gateway), HTTPS, AWS Cloud (Amazon S3, lifecycle, any S3 storage class)

(13)

Access virtually unlimited, highly durable cloud storage using common file protocols

Features

• Supports NFS and SMB protocols – no application changes required
• Files stored durably in Amazon S3, lifecycle to any S3 storage class
• SMB shares up to 64 TB integrate with Active Directory
• Amazon CloudWatch Events for automated workflows

Benefits

• Reduce costs by moving storage to Amazon S3 while still accessing from on-premises
• Virtually unlimited cloud storage – no more running out of capacity
• Eliminate expensive hardware refresh cycles
• Files stored as native S3 objects for further processing in AWS

Diagram: On premises (Sequencers, NAS storage, NFS/SMB, File Gateway), HTTPS, AWS Cloud (Amazon S3, lifecycle, any S3 storage class)
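
The "lifecycle to any S3 storage class" feature can be illustrated with a small sketch. The function below only builds a lifecycle configuration in the dictionary shape that the S3 `PutBucketLifecycleConfiguration` API accepts; it does not call AWS. The prefix `raw/`, the rule ID, and the 30/90-day thresholds are hypothetical values chosen for illustration and do not come from the slides.

```python
def build_lifecycle_configuration(prefix, ia_after_days=30, glacier_after_days=90):
    """Build an S3 lifecycle configuration dict (illustrative values only).

    Objects under `prefix` move to STANDARD_IA and later to GLACIER.
    The day counts are assumptions, not recommendations from the deck.
    """
    if glacier_after_days <= ia_after_days:
        raise ValueError("glacier_after_days must be greater than ia_after_days")
    return {
        "Rules": [
            {
                "ID": "archive-gateway-uploads",  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }


config = build_lifecycle_configuration("raw/")
```

With boto3 installed and credentials configured, a dictionary like this could be passed as the `LifecycleConfiguration` argument of `put_bucket_lifecycle_configuration` for the bucket behind the file share.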

(14)

Provide on-premises apps low-latency access to in-cloud data

Access files quickly from distributed locations and scale capacity as needed

Features

• Generate data in-cloud or ingest from on-premises using AWS DataSync or AWS Snowball
• Up to 64 TB local cache per gateway
• Fully-managed gateway cache provides low-latency

Benefits

• Access cloud storage from any on-premises location
• Process data in the cloud and refresh gateway cache for up-to-date results

Diagram: Sequencers, two on-premises File Gateways (NFS/SMB, cache refresh, HTTPS), AWS Cloud (AWS DataSync, AWS Snowball, in-cloud processing)
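
The "refresh gateway cache" step can be sketched as follows. This is a minimal example, assuming a boto3 Storage Gateway client is passed in; `refresh_cache` with `FileShareARN`, `FolderList`, and `Recursive` is the boto3 operation for this, while the helper name `refresh_share_cache` and the folder default are hypothetical.

```python
def refresh_share_cache(client, file_share_arn, folders=("/",), recursive=True):
    """Ask a File Gateway share to re-read objects changed directly in S3.

    `client` is expected to behave like boto3.client("storagegateway").
    Returns the NotificationId from the response, if the service sent one.
    """
    response = client.refresh_cache(
        FileShareARN=file_share_arn,
        FolderList=list(folders),
        Recursive=recursive,
    )
    return response.get("NotificationId")


# Usage (requires boto3 and AWS credentials; the ARN is a placeholder):
#   import boto3
#   client = boto3.client("storagegateway")
#   refresh_share_cache(client, "<file share ARN>", folders=["/results"])
```

Passing the client in keeps the helper easy to exercise with a stub. The call only starts the refresh, so a workflow that depends on fresh results would still need to wait for completion.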

(15)

File shares backed by cloud storage

“AWS Storage Gateway plays a vital role in many important applications at Bristol Myers Squibb, especially where data transfers from local labs to cloud… Storage Gateway allows us to continue to use existing applications in new cloud platforms, with zero changes.”

Oleg Moiseyenko, Senior Cloud Architect

Problem

• Made a strategic decision to move more computing capabilities to the cloud
• Needed to reduce overall IT costs
• Accelerate move to cloud while closing a primary data center

Solution

• Multiple Storage Gateway appliances (virtual & physical) copy new data to Amazon S3
• Seamlessly access 100s of TBs of data in Amazon S3

Outcome

• Use legacy apps on-premises without changes to apps
• On-prem apps get low-latency access to cloud storage
• Lower cost and simplicity
• Automation of data management

(16)

Bristol Myers Squibb data flow

1. Instruments write raw data into File Gateway file share
2. File Gateway transfers files to S3 buckets
3. Data Management system scans S3 buckets regularly
4. Applications request data via Data Management system meta catalog
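
Steps 3 and 4 of this flow can be sketched in a few lines. The slides do not describe how the Data Management system is implemented, so everything here is an assumption: the catalog is a plain dictionary, the listing entries mimic the `Contents` records returned by the S3 `ListObjectsV2` API (`Key`, `Size`, `LastModified`), and the first path component is treated as a hypothetical instrument name.

```python
from datetime import datetime, timezone


def scan_bucket_listing(bucket, contents, catalog):
    """Register objects not yet in the catalog; return the new catalog IDs.

    `contents` is an iterable of dicts shaped like S3 ListObjectsV2
    `Contents` entries. `catalog` is a dict that is updated in place.
    """
    new_ids = []
    for obj in contents:
        key = obj["Key"]
        if key.endswith("/"):
            continue  # skip zero-byte folder placeholder objects
        entry_id = f"s3://{bucket}/{key}"
        if entry_id in catalog:
            continue  # already catalogued on an earlier scan
        catalog[entry_id] = {
            "bucket": bucket,
            "key": key,
            "size": obj["Size"],
            "last_modified": obj["LastModified"],
            # Assumed layout: <instrument>/<run>/<file>
            "instrument": key.split("/", 1)[0],
        }
        new_ids.append(entry_id)
    return new_ids


catalog = {}
listing = [
    {"Key": "sequencer-01/", "Size": 0,
     "LastModified": datetime(2021, 1, 1, tzinfo=timezone.utc)},
    {"Key": "sequencer-01/run42/reads.fastq.gz", "Size": 1024,
     "LastModified": datetime(2021, 1, 2, tzinfo=timezone.utc)},
]
new_ids = scan_bucket_listing("lab-raw-data", listing, catalog)
```

Running the scan again over the same listing registers nothing new, which is what lets a regular scan be repeated safely. Applications would then look up data through the catalog rather than listing the bucket themselves.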

(17)

Customer Case Study: Gritstone Oncology

(18)

Gritstone Oncology AWS Storage Gateway Case Study

Enabling automation and scalability for GxP laboratory instrumentation

Challenge

Gritstone Oncology is a pharmaceutical company with a GxP-compliant laboratory that needs a secure and scalable storage solution to accommodate its ever-growing dataset

Solution

Gritstone Oncology replaced its on-premises storage with AWS Storage Gateway, providing a scalable storage solution with Amazon S3

Benefits

With AWS Storage Gateway, Gritstone Oncology was able to reduce operational overhead and leverage

(19)

AWS Cloud

(20)

Genomics Data Transfer, Data Access Patterns, Storage, and Archival

1. Genomics Data Sources
2. Process Automation
3. Storage and Archival
4. File-Based Data Access to S3

Diagram labels: Data Sources (Genomic, Mass Spectrometer, Technician, File Gateway, Local Cache); Process Automation (Object Put, S3 Event, Lambda); Storage and Archival (Amazon S3 IA, Lifecycle policy, Amazon S3 Glacier); File-Based Data Access to S3 (File Gateway, Local Cache, Research tools, Research Scientists, Bioinformatics Scientists, Data Scientists); AWS Cloud
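
The Process Automation path (Object Put, S3 Event, Lambda) can be sketched as a small AWS Lambda handler in Python. This is a minimal sketch under assumptions: it reads the standard S3 event notification fields (`Records`, `eventName`, `s3.bucket.name`, `s3.object.key`), and what happens to each new object afterwards is left open because the slide does not say.

```python
from urllib.parse import unquote_plus


def lambda_handler(event, context):
    """Collect the objects named in an S3 "object created" notification.

    Downstream processing (for example, starting an analysis job) is not
    shown on the slide and is deliberately left out here.
    """
    processed = []
    for record in event.get("Records", []):
        if not record.get("eventName", "").startswith("ObjectCreated:"):
            continue  # ignore deletes and other event types
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded, with spaces encoded as "+".
        key = unquote_plus(record["s3"]["object"]["key"])
        size = record["s3"]["object"].get("size", 0)
        processed.append({"bucket": bucket, "key": key, "size": size})
    return {"processed": processed}


sample_event = {
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "lab-raw-data"},
                "object": {"key": "run1/sample+A.fastq.gz", "size": 2048},
            },
        }
    ]
}
result = lambda_handler(sample_event, None)
```

The bucket and key in `sample_event` are made up for the example. Because File Gateway stores files as native S3 objects, a file written to the share by an instrument would reach a handler like this as an ordinary object-created event.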

(21)
(22)

Resources

Blogs

Cloud storage in minutes with AWS Storage Gateway

How Bristol Myers Squibb uses Amazon S3 and AWS Storage Gateway to manage scientific data

Bristol Myers Squibb increases performance and cost savings using AWS Storage Gateway

Videos

Introduction to AWS Storage Gateway

Cloud storage in minutes with AWS Storage Gateway

Migrating file data to AWS – demo & technical guidance

Web Pages

File Gateway Product Page

Reach out to your Account team for more information and

(23)
