• No results found

Retaining globally distributed high availability Art van Scheppingen Head of Database Engineering

N/A
N/A
Protected

Academic year: 2021

Share "Retaining globally distributed high availability Art van Scheppingen Head of Database Engineering"

Copied!
47
0
0

Loading.... (view fulltext now)

Full text

(1)

Retaining globally

distributed high

availability

Art van Scheppingen

(2)

1.  Who  is  Spil  Games?   2.  Theory  

3.  Spil  Storage  Pla9orm  

4.  Ques=ons?  

(3)

Who are we?

Who  is  Spil  Games?    

(4)

•  Company  founded  in  2001  

•  350+  employees  world  wide  

•  180M+  unique  visitors  per  month  

•  45  portals  in  19  languages   •  Casual  games  

•  Social  games  

•  Real  =me  mul=player  games  

•  Mobile  games  

•  35+  MySQL  clusters  

•  60k  queries  per  second  (3.5  billion  qpd)  

(5)

Geographic Reach

180  Million  Monthly  Ac=ve  Users(*)  

 

•  Over  45  localized  portals  in  19  languages  

•  Mul=  pla9orm:  web,  mobile,  tablet  

•  Focus  on  casual  and  social  games  

(6)

Girls,  Teens  and  Family    

spielen.com  

juegos.com  

gamesgames.com  

games.co.uk

 

Brands

(7)

Foundations

The  exci2ng  theory    

(8)

•  What  does  it  exactly  mean?  

(9)

Wikipedia:  

High  availability  is  a  system  design  approach  and  

associated  service  implementa=on  that  ensures  a  

prearranged  level  of  opera=onal  performance  will  be   met  during  a  contractual  measurement  period.  

 

Oracle:  

•  Availability  of  resources  in  a  computer  system    

(10)

•  Master  with  (many)  slave(s)  

How do we reach HA with MySQL?

Master

(11)

•  Master  with  (many)  slave(s)  

•  Mul=  Master  

How do we reach HA with MySQL?

Master

Slave

Master

(12)

•  Master  with  (many)  slave(s)  

•  Mul=  Master  

•  Clustering  

How do we reach HA with MySQL?

Mysqld Mysqld ndbd ndbd ndbd ndbd ndbd mgmt

(13)

•  Master  with  (many)  slave(s)  

•  Mul=  Master  

•  Clustering  

•  Geographical  redundancy    

How do we reach HA with MySQL?

Master local DC Slave local

DC

(14)

•  Scale  up   •  Ver=cal  

•  Faster  CPU/Memory/disks  

•  Expensive  

•  Costs  mul=ply  in  same  rate  as  #  of  nodes  

•  Scale  out  

•  Horizontal  

•  More  (small)  machines  

•  Inexpensive  

•  Par==oning/federa=ng  (sharding)  

(15)

•  Func=onal  

•  Shard  your  database  func=onally  

•  Reads  

•  Add  more  slaves  (keep  them  coming!)  

•  Writes  

•  More  disks  

•  Horizontal  par==oning  

•  Federated  par==ons  

(16)

•  Breaking  up  tables  in  small  parts  on  the  same  host  

•  Par==oned  on  a  column  

•  Infinite  growth  (as  long  as  you  add  diskspace)  

•  Less  used  data  to  slower  (cheaper)  disks  

•  No  stored  procedures,  func=ons,  etc  

•  Uneven  usage  of  par==ons  (hash  par==on  may  help)  

•  Once  wrihen,  data  remains  on  the  par==on  

Horizontal partitioning

(17)

•  Breaking  up  your  table  in  parts  on  mul=ple  hosts  

•  Par==oned  on  a  column  

•  Infinite  growth  (as  long  as  you  add  hosts)  

•  Less  used  data  on  slower  hosts  

•  Not  supported  in  (standard)  MySQL  

•  Par==oning  on  applica=on  level  (or  proxy)  

•  Alterna=vely:  NDB  

•  Uneven  usage  of  par==ons  

•  Once  wrihen  data  (mostly)  remains  on  the  par==on  

•  Parallel  queries  to  retrieve  data  from  all  shards  

Federated partitions (sharding)

(18)

•  Parallel  execu=on  of  sequen=al  jobs  

•  Limited  by  the  weakest  link  

•  As  fast  as  the  slowest  node  

•  Fix:  nonsequen=al  (asynchronous)  execu=on  

(19)

Typical LAMP stack

Client   Webserver   PHP   MySQL   Memcache   Webserver   PHP   Loadbalancer  

(20)

A-typical LAMP stack

Client   Webserver   PHP   MySQL   Memcache   Webserver   PHP   Loadbalancer   MQ   Jobs  

(21)

Spil Storage Platform

Abstrac2ng  the  storage  layer    

(22)

•  Dependent  on  one  storage  pla9orm  

•  No  more  pla9orm-­‐specific  query  language  

•  Differen=ate  writes    

•  Op=mis=c  (asynchronous)  

•  Pessimis=c  (synchronous)  

•  Shard  data  beher  

•  Par==on  on  user  and  func=on  

•  Cluster  informa=on  by  users,  not  by  func=on  

•  Global  expansion  

•  Par==on  on  geographic  loca=on  

•  Solve  uneven  usage  of  data  storage  

•  Move  data  from  shard  to  shard  

•  Anything  may/could/will  fail  eventually  

•  Not  designed  for  the  “happy”  flow  

(23)
(24)
(25)

New architecture overview

Server API Application Model Storage platform Client-side API Presentation layer

(26)

•  Everything  wrihen  in  Erlang  

•  Piqi  as  protocol   •  binary  

•  JSON  

•  XML  

•  SSP  u=lizes  local  caching  (memcache)  

•  Flexible  (persistent)  storage  layer   •  MySQL  (various  flavors)  

•  Membase/Couchbase  

•  Could  be  any  other  storage  product  

•  MQs  (DWH  updates)  

(27)

•  Predictable  

•  Reliable  

•  Decent  performance  

•  Easy  to  comprehend  

•  Excellent  eco  system   •  Libraries  

•  Monitoring  tools  

•  Knowledge  

(28)

•  Func=onal  language  

•  High  availability:  designed  for  telecom  solu=ons  

•  Excels  at  concurrency,  distribu=on,  fault  tolerance  

•  Do  more  with  less!  

•  Other  companies  using  Erlang:  

(29)

•  What  is  the  bucket  model?  

•  Each  record  has  one  unique  owner  ahribute  (GID)  

•  GID  (Global  IDen=fier)  iden=fying  different  types  

•  Bucket(s)  per  func=onality  

•  Bucket  is  structured  data  

•  Ahributes  contain  data  of  records  

•  Ahributes  do  not  have  to  correspond  to  schema  

(30)

$  curl  -­‐X  POST  -­‐H  'Accept:  applica=on/json'  -­‐H  \  

'Content-­‐Type:  applica=on/json'  -­‐-­‐data-­‐binary  "{\"gid\":  \  

288511851128422401}"  hhp://127.0.0.1:8777/demobucket/get   {      "records":  [          {              "gid":  288511851128422401,              "given_name":  "g",              "registered_on":  1,              "email":  "mail1",              "gender":  "m",  

           "birthdate":  {  "year":  1963,  "month":  6,  "day":  21  }          }  

   ],  

   "meta_info":  {  "total_ct":  1  }   }  

(31)

CREATE  TABLE  demobucket  (  

   gid  bigint(20)  unsigned  not  null,      given_name  varchar(64)  not  null,  

   registered_on  =nyint(3)  unsigned  default  0,      email  varchar(255)  not  null,  

   gender  enum(‘m’,  ‘f’,  ‘u’)  not  null  default  ‘m’,      birthdate  date  not  null,  

   PRIMARY  KEY(gid)   );  

(32)

CREATE  TABLE  demobucket  (  

   gid  bigint(20)  unsigned  not  null,      user_name  varchar(64)  not  null,  

   user_register  =mestamp  on  update   CURRENT_TIMESTAMP(),  

   user_emailaddress  varchar(255)  not  null,      user_gender  char(1)  not  null  default  ‘m’,      user_dob  varchar(10)  not  null,  

   PRIMARY  KEY(gid)   );  

(33)

CREATE  COLUMNFAMILY  demobucket  (      gid  int  PRIMARY  KEY,  

   given_name  varchar,      registered_on  =mestamp,      email  varchar,      gender  varchar,      birth_date  varchar   );  

(34)

demobucket:get(  #demobucket_get_input{  gid=12345,  filters=  [  

                         #filter{  ahr=  <<"gender">>        ,  op=  <<"=">>        ,  parms=  {string,  <<"f">>}},                            #filter{  ahr=  <<"registered_on">>,  op=  <<"sort">>,  parms=asc  },  

                         #filter{  ahr=  <<"gid">>,  op=  <<"limit">>,    parms={int,  10  }}                    ]}  )  

(35)
(36)

•  Nearest  datacenter  (DC)  to  the  end  user  

•  Satellite  DC  

•  Processing  and  caching  

•  Do  not  own/store  data  

•  Storage  DC    

•  Processing,  caching  and  persistent  storage  

•  Store  all  same  user  data  in  same  DC  

•  Par==on  on  user  globally  

•  Global  IDen=fier  per  user  

(37)

•  Contains  GIDs  and  their  master  DC  

•  GIDs  master  DC  predefined  

•  Migrated  GIDs  get  updated  

(38)

•  Globally  sharded  on  GID  

•  (local)  GID  Lookup  

How does this work?

GID lookup

Shard 1 Shard 2

Persistent storage

(39)
(40)

•  Spread  data  even  on  shards  

•  Migra=on  of  buckets  between  shards  

•  GID  migra=on  between  DCs  

•  Crea=ng  a  new  storage  DC  needs  data  migra=on  

•  Users  will  automa=cally  be  migrated  a‚er  visi=ng  

another  DC  many  =mes  

(41)

•  Versioning  on  bucket  defini=ons    

•  GIDs  are  assigned  to  a  bucket  version  

•  Data  in  old  bucket  versions  remain  (read  only)  

•  New  data  only  gets  wrihen  to  new  bucket  version  

•  Updates  migrate  data  to  new  bucket  version  

•  Migrates  can  be  triggered  

(42)

Seamless schema upgrades

Demobucket  v1   GID   1234   1235   1236   1237   1238   1239   name   Roy   Moss   Jen   Douglas   Denholm   Richmond   Demobucket  v2   GID               name               gender               GID   1241             name   Patricia             gender   f             GID   1241   1235           name   Patricia   Moss           gender   f   m           GID   1234     1236   1237   1238   1239   name   Roy     Jen   Douglas   Denholm   Richmond   GID   1234       1237   1238   1239   name   Roy       Douglas   Denholm   Richmond   GID   1241   1235   1236         name   Patricia   Moss   Jen         gender   f   m   f        

(43)

•  Every  cluster  (two  masters)  will  contain  two  shards   •  Data  wrihen  interleaved  

•  HA  for  both  shards  

•  No  warmup  needed  

•  Both  masters  ac=ve  and  “warmed  up”  

•  Slaves  added  (other  DC)  for  HA  and  backup  

Multi Master writes

SSP   Shard  1                                   Shard  2                                  

(44)

•  SPAPI  is  in  place  

•  SSP  is  (mostly)  running  in  shadow  mode  

•  GID  buckets  running  in  produc=on  

•  Ac=vity  feed  system  first  to  produc=on  

•  Satellite  DC  in  early  2013!  

(45)
(46)
(47)

•  Presenta=on  can  be  found  at:  

hhp://spil.com/perconalondon2012  

•  If  you  wish  to  contact  me:   [email protected]  

•  Don’t  forget  to  rate  my  talk!  

References

Related documents