• No results found

Internet Storage Sync Problem Statement

N/A
N/A
Protected

Academic year: 2021

Share "Internet Storage Sync Problem Statement"

Copied!
28
0
0

Loading.... (view fulltext now)

Full text

(1)

Internet Storage Sync

Problem Statement

draft-cui-iss-problem Zeqi  Lai  

(2)

Outline

• 

Background  

• 

Problem  Statement  

•  Service  Usability  

•  Protocol  Capabili?es  

• 

Our  Explora?on  on  Protocol  Capabili?es  

• 

Summary  

2 15/11/3

(3)
(4)

Internet Storage Sync Services

• 

New  data  entrance  of  the  Internet  

•  Basic  func?on:  storing,  sharing  and   synchronizing  data  

•  Large  user  base:  Dropbox  has  more  than  400   million  users    

•  Significant  traffic:  Dropbox  accounts  for  

approximately  4%  of  the  total  Internet  traffic   [IMC  2012]  

4 15/11/3

(5)

Internet Storage Sync Services

• 

New  data  entrance  of  the  Internet  

•  Major  players:  Dropbox,  Google  Drive,  One   Drive,  Box.com,  Apple  …  

•  Combining  other  services  via  APIs:  photo   sharing,  email  aZachment,  social  apps  

(6)

Typical Architecture & Flow

• 

Typical  architecture  of  ISS  services  

•  Control  flow:  exchanging  metadata     •  Storage  flow:  exchanging  contents  

•  Sync  process  with  your  mul?ple  clients  

6 15/11/3

(7)

Capabilities in Sync Protocol

• 

Key  storage  capabili?es  [IMC’  13]  

•  Chunking:  spli_ng  a  large  file  into  mul?ple  units  

•  Bundling:  mul?ple  small  chunks  as  a  single  one  

•  Deduplica2on:  avoiding  the  retransmission  of  

content  already  available  in  the  server  

(8)

Outline

• 

Background  

Problem  Statement  (draa-­‐cui-­‐iss-­‐problem)  

•  Service  Usability  

•  Protocol  Capabili?es  

• 

Our  Explora?on  on  Protocol  Capabili?es  

• 

Summary  

8 15/11/3

(9)

Outline

• 

Background  

• 

Problem  Statement  

•  Service  Usability  

•  Protocol  Capabili?es  

• 

Our  Explora?on  on  Protocol  Capabili?es  

(10)

Using Multiple ISS Services

• 

Users  may  use  mul?ple  services  

•  Performance  or  func?onality  diversity  

•  Dropbox  works  beZer  for  synchronizing  docs  

•  Google  Drive  connects  to  Gmail  and  Google  Doc   •  BaiDu  cloud  provides  2TB  free  space  

10 15/11/3

(11)

However that is not easy …

• 

For  users  

•  Users  may  install  mul?ple  similar  clients  

•  It  is  unable  to  synchronize  data  across  services   (e.g.  sync  between  a  Dropbox  user  and  a  Google   Drive  user)  

• 

For  applica?on  developers  

•  A  developer  has  to  deal  with  many  different   APIs  in  order  to  connect  his  app  with  mul?ple   sync  services  

(12)

Using a Private ISS Service

• 

Enterprise  may  want  their  own  storage  

•  Public  ISS  services  may  not  be  trusted   •  Like  what  email  is  doing  

• 

It  is  difficult  to  build  and  use  a  private  ISS  

service  

•  There  is  no  standard  sync  protocol   •  Need  to  start  from  scratch

(13)

Outline

• 

Background  

• 

Problem  Statement  

•  Service  Usability  

•  Protocol  Capabili?es  

• 

Our  Explora?on  on  Protocol  Capabili?es  

(14)

Rethinking about Capabilities

• 

Ideal  

•  With  these  capabili?es,  sync  services  can   efficiently  synchronize  our  data  

• 

Reality  

•  The  sync  2me  is  s2ll  much  longer  than  

expected  with  various  network  condi?ons!  

• 

Measurement  study  

•  We  measured  several  sync  services  to  iden?fy   and  analyze  the  sync  inefficiency  problem  

14 15/11/3

(15)

Impact of Missing Capabilities

• 

Bandwidth  inefficiency  

•  Sync  is  not  efficient  for  large  #  of  small  files  in     high  RTT  condi?ons  because  the  client  waits  for   an  app-­‐level  ACK  before  sending  next  chunk  

(16)

• 

Deduplica?on  is  NOT  always  efficient  

•  More  effec?ve  dedup  does  not  work  well  in   good  network  condi?ons  because  of  its  high   computa?on  overhead  

•  Network-aware  dedup  may  be  important

16

DER:  the  ra?o  of   the  deduplicated   file  size  to  the   original  file  size    

15/11/3

(17)

Impact of Misusing Capabilities

• 

Delta-encoding  fails  with  fixed-­‐size  chunking  

•  3  basic  file  opera?ons  (flip  bits,  insert,  delete)   •  Changing  2MB  of  a  10MB  file  leads  to  more  

than  6MB  sync  traffic  

TUO:   Traffic   data  /   modified   data    

(18)

Impact of Misusing Capabilities

• 

Why  the  delta-­‐encoding  fails?  

•  A  large  file  is  split  into  mul?ple  chunks  

•  Delta-­‐encoding  is  performed  between  chunks   •  But  modifica?ons  will  move  cut  points!  

18 15/11/3

(19)

Measurement Conclusion

• 

Missing  or  Misusing  these  key  capabili?es    

leads  to  the  sync  inefficiency  problem  

• 

Challenges  of  improving  sync  efficiency  

•  Are  these  capabili?es  enough?    

•  Should  we  combine  these  storage  techniques   with  network  parameters  (e.g.  delay,  loss  and   etc.)?    

(20)

Outline

• 

Background  

• 

Problem  Statement  

•  Service  Usability  

•  Protocol  Capabili?es  

Our  Explora?on  on  Protocol  Capabili?es  

• 

Summary  

20 15/11/3

(21)

Exploration on Capabilities

• 

QuickSync  [MobiCom15]  with  3  techniques  

•  Propose  network-­‐aware  content-­‐defined   chunker  to  iden?fy  redundant  data    

•  Design  improved  incremental  sync  approach  

that  correctly  performs  delta-­‐encoding  between   similar  chunks  to  reduce  sync  traffic  

(22)

QuickSync Implementation

• 

Implementa?on  over  Dropbox  

•  Unable  to  directly  modify  Dropbox,  so  we  design   a  proxy-­‐based  architecture  built  on  Amazon  EC2  

• 

Implementa?on  over  Seafile  

•  The  proxy-­‐based  architecture  adds  overhead   •  Full  implementa?on  with  Seafile  (open  source)

22 15/11/3

(23)

Impact of Network-aware Chunker

• 

Network-­‐aware  Chunker  

•  Larger  chunks  in  good  network  condi?ons, make  aggressive  chunking  in  slow  networks  

• 

Performance  results  

•  200GB  backup;  up  to  31%  speed  improvement   •  Network-­‐aware  chunker  works  well  

(24)

Integrated System Performance

• 

Setup  

•  Prac?cal  sync  workloads  on  Windows  /  Android   •  Performance  results  (Win  /  Android)  

•  Traffic  size  reduc?on:  up  to  80.3%  /  63.4%  

•  Sync  ?me  reduc?on:  up  to  51.8%  /  52.9%  

24 15/11/3

(25)

Outline

• 

Background  

• 

Problem  Statement  

•  Service  Usability  

•  Protocol  Capabili?es  

• 

Our  Explora?on  on  Protocol  Capabili?es  

(26)

Related Work

• 

WebDAV  [RFC  4918],  Git  

•  These  efforts  focus  on  authoring  and  versioning   •  Can  not  well  support  large  files  

• 

Rsync  

•  Delta-­‐encoding  algorithm  only  works  well  in  file   granularity  

• 

Different  from  ISS  

•  ISS  focuses  on  the  sync  opera?on  

•  Other  important  capabili?es  are  closely  related   and  required  (e.g.  chunking,  deduplica?on)  

26 15/11/3

(27)

Future Work

• 

Goal:  usability  &  capabili?es  

•  Easier  to  use  mul?ple  storage  sync  services     •  Easier  to  build  a  private  sync  service  

•  Achieve  interoperability  

•  Reasonably  configure  capabili?es    

• 

Possible  solu?on:  standard  sync  protocol  

•  Standardize  the  sync  process  and  capabili?es   •  Want  to  apply  IETF  Transport  and  Security  

(28)

References

•  Problem  Statement:   hZp://datatracker.iet.org/doc/draa-­‐cui-­‐iss-­‐ problem/   •  Wiki:   hZps://github.com/iss-­‐iet/iss/wiki/Internet-­‐ Storage-­‐Sync   •  QuickSync  [MobiCom2015]:   hZp://www.4over6.edu.cn/cuiyong/cindex.html   •  A  First  Look  at  Mobile  Cloud  Storage  Services  [IEEE  

Network  Magazine]:  

hZp://www.4over6.edu.cn/cuiyong/cindex.html  

References

Related documents

suppliers of raw materials concerning possible price fluctuations for raw materials and components, readiness for flexible action in their supply, the possibility

But for today there are considerable problems with satisfaction of new requirements of sustainable development in single-industry towns of Ukraine where the basis of

Conducted research and critical analysis of the functioning of the large industrial enterprises of the old industrial regions in Ukraine (Dnipropetrovsk, Donetsk,

(1) Government funding of sports programmes will not be a significant impediment to sports sponsorship in Nigeria.. (2) Economic environment will not be a significant impediment to

Abstract: The main object of the present paper is to obtain the large deflection and bending stresses for a clamped circular plate under non-uniform load by using Berger’s

This paper seeks to explore the acquisition modes of collections in the libraries of Tamale and Bagabaga Colleges of Education in Ghana. The population consisted of all

This paper first discusses why the sentence-level drills still being used extensively in the teaching of grammar to second language learners have not been successful. What follows

We hope that our proposed framework will provide researchers with a template and a foundation for developing a solid and clear problem statement in research projects