
Mining the Cloud: Building and Supporting a Virtual Research Infrastructure


Academic year: 2021


(1)

Gary Wroblewski

Application Coordinator/Mgr. Technical Services

The Herbert H. and Grace A. Dow College of Health Professions
Central Michigan University

(2)

How to build a research infrastructure…?

• Get a big grant.

• Build a high-speed core network.

– Attach that high-speed network to the buildings where your researchers are.

• Attach systems and storage.

• Buy researchers high-powered desktops.

(3)

Fundamental Questions

• Can I build on existing CMU activities like VDI?

• Standard HPC does not work for my faculty…

• Realization that we should bring the researchers to the data, not move the data to the researchers

– Better support for homogeneous systems

– We can leverage shared investments from different units

(4)

How to build an ICEBOX

• Buy dense, fast, capable servers

• Buy good, fast disk arrays

– …and fast interconnects/networking

• Virtualize the servers and build a gold image of the desktop

– Install software on the gold image

– Clone it

• Use ACLs to allow researchers to remotely use VMs to access secure high-speed systems.
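
The ACL step above can be illustrated with a minimal Python sketch. It is not the mechanism used by Research ICE, which the slides do not describe; the class, the researcher IDs, and the VM names are all hypothetical. It only shows the default-deny idea behind an access control list that maps each virtual machine to the researchers allowed to use it.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class AccessControlList:
    """Hypothetical ACL: maps each VM name to the researchers allowed to use it."""
    entries: Dict[str, Set[str]] = field(default_factory=dict)

    def grant(self, researcher: str, vm: str) -> None:
        # Add the researcher to the VM's allow list, creating the list if needed.
        self.entries.setdefault(vm, set()).add(researcher)

    def revoke(self, researcher: str, vm: str) -> None:
        # Removing an entry that does not exist is a no-op.
        self.entries.get(vm, set()).discard(researcher)

    def is_allowed(self, researcher: str, vm: str) -> bool:
        # Default deny: unknown VMs and unlisted researchers get no access.
        return researcher in self.entries.get(vm, set())


acl = AccessControlList()
acl.grant("researcher_a", "research-ws-1")      # hypothetical names
print(acl.is_allowed("researcher_a", "research-ws-1"))  # True
print(acl.is_allowed("researcher_b", "research-ws-1"))  # False
```

In practice the allow lists would live in the hypervisor, firewall, or directory service rather than in application code; the sketch is only meant to make the grant, revoke, and check cycle concrete.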

(5)

Research ICE

What is Research ICE?

• Cyberinfrastructure designed to support health analysis activities.

Who is Research ICE for?

• Investigators at or collaborating with The Herbert H. and Grace A. Dow College of Health Professions

What will Research ICE do?

• Support interdisciplinary collaboration and research.

• Securely provide access to protected health information and proactively enforce data use agreements.

(6)

Shared Governance: Research ICE

Faculty Prioritization Committee

Charter:

This committee shall be responsible for prioritizing data staging and developing the appropriate policies and operational procedures to ensure that the available research data, access control, and collaboration environment fulfill expectations as critical research infrastructure. Initially the committee will focus on specific targeted projects but will rapidly expand to support other research groups at CMED and CHP. The Health Technologies Group (HTG) shall be responsible for ongoing operations of these resources under the guidance of the faculty and the Dean’s Advisory Council.

(7)

Acquisition of Private Data

• Establish a repeatable process for timely acquisition and staging of third-party data for analysis.

• Standardized Data Use Agreements (DUA)

– Define expectations for privacy, permissible use or disclosure, and limitations on commingling with other data sets.

– Outline security mechanisms, access control methods employed, and specific regulatory obligations.

– Clarify the requirements upon termination of the relationship.

(8)

What is in the ICE “BOX”?

• An integrated infrastructure for conducting shared analysis.

• Commonly reusable data sources that are professionally maintained and integrated.

• Robust security, access control, and audit tracking mechanisms.

• Access to common analytical and data visualization tools.

• Modestly large storage capacity, 50-100 TB

• Adequate computational resources to readily manipulate and analyze 100 M record data sets.
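
To put the storage and record-count figures above side by side, here is a back-of-the-envelope sketch in Python. The slides give only the 50-100 TB capacity and the 100 M record target; the bytes-per-record value is an assumption chosen purely for illustration, and real data sets will vary widely.

```python
TB = 10**12  # decimal terabyte, in bytes


def dataset_bytes(records: int, bytes_per_record: int) -> int:
    """Raw size of a data set, ignoring indexes, compression, and working copies."""
    return records * bytes_per_record


def copies_that_fit(capacity_tb: int, size_bytes: int) -> int:
    """How many data sets of the given size fit in the stated raw capacity."""
    return (capacity_tb * TB) // size_bytes


RECORDS = 100_000_000        # from the slide: 100 M records
BYTES_PER_RECORD = 1_000     # assumption: roughly 1 KB per record

size = dataset_bytes(RECORDS, BYTES_PER_RECORD)
print(size)                       # 100000000000 bytes, i.e. 0.1 TB
print(copies_that_fit(50, size))  # 500
print(copies_that_fit(100, size)) # 1000
```

Under that assumption a single 100 M record data set occupies about 0.1 TB, so the stated capacity leaves room for many staged sources, derived tables, and backups. Changing `BYTES_PER_RECORD` shows how quickly the headroom shrinks for wider records.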

(9)

Analytics Workbench Tools

Quantitative Resources

• JMP

• SPSS

• R

• SAS Enterprise Guide

• Simulation software

• Data visualization tools

– Tableau

Dedicated Virtual Workstations

• Faculty & Principal Investigators

• Graduate students, undergraduates, and GSAs

• Ability to work both remotely and on campus

• Adequate computational performance

• Large data sets

(10)

Standardized Virtual Configurations

[Diagram: two groups of virtual machines running on a shared virtualization server fabric]

• Virtual Student Computing Lab/s: Virtual Lab 1, Virtual Lab 2, Virtual Class Server

• Research Group Virtual Analytics Environment: Virtual Research Workstation 1, Virtual Research Workstation 2, Virtual Research Team Server

• Each virtual machine: applications, guest OS, VM-aware drivers, virtual CPU & memory, virtual disk

• Virtualization server fabric: native drivers, network, storage & disk, two firewalls

(11)

Secure Environment: A Logical View

[Diagram: logical view of the secure environment]

• Networks: management network, CMU network/Internet, virtual desktop network, isolated storage network

• Components: virtual desktop broker, VMware host #1, VMware host #2, fiber switch, SAN storage, DMZ, access control list or firewall, remote researcher laptops

Processing Cluster

• Dell PowerEdge R715

• 2x AMD 8-core processors per server, 32 cores total at 2.4 GHz

• 96 GB RAM per host (192 GB total)

• VMware ESXi 4.1

Storage Cluster

• Xiotech ISE storage blade

• 12,000 IOPS

• 19.2 TB raw capacity

• 4 Gb FC storage fabric
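
The processing cluster figures above can be checked, and turned into a rough sizing estimate, with a few lines of Python. The host, socket, core, and RAM counts come from the slide; the per-workstation sizes (4 vCPUs, 16 GB) and the no-overcommit rule are assumptions made only for this sketch, and memory reserved by the hypervisor is ignored.

```python
HOSTS = 2               # VMware host #1 and #2
SOCKETS_PER_HOST = 2    # 2x AMD processors per server
CORES_PER_SOCKET = 8    # 8-core processors
RAM_GB_PER_HOST = 96    # 96 GB RAM per host


def cluster_totals():
    """Return (total physical cores, total RAM in GB) for the cluster."""
    cores = HOSTS * SOCKETS_PER_HOST * CORES_PER_SOCKET
    ram_gb = HOSTS * RAM_GB_PER_HOST
    return cores, ram_gb


def max_workstations(vcpus_per_vm: int, ram_gb_per_vm: int) -> int:
    """Upper bound on identical VMs, assuming one vCPU per physical core."""
    cores, ram_gb = cluster_totals()
    by_cpu = cores // vcpus_per_vm
    by_ram = ram_gb // ram_gb_per_vm
    # The scarcer resource sets the limit.
    return min(by_cpu, by_ram)


print(cluster_totals())          # (32, 192), matching the slide
print(max_workstations(4, 16))   # 8 (CPU-bound under these assumed sizes)
```

With these assumed sizes the cluster is CPU-bound before it is memory-bound; a deployment that overcommits vCPUs would reverse that and hit the RAM limit first.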

(12)

Performance Considerations

• Does virtualization limit our performance?

– ~10% overhead…

• How does this compare to a standard desktop that runs SAS?

• Data storage space?
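
One way to frame the desktop comparison is a small Python estimate. The ~10% figure is from the slide; treating it as 10% extra wall-clock time is an interpretation, and the server speedup factor is a hypothetical input that would have to be measured, not a result reported here.

```python
def vm_runtime_hours(desktop_hours: float,
                     server_speedup: float,
                     overhead: float = 0.10) -> float:
    """Estimated runtime on a virtual workstation.

    desktop_hours:  measured runtime of the job on a standard desktop
    server_speedup: assumed ratio of bare-metal server speed to desktop speed
    overhead:       virtualization overhead as a fraction of runtime (~10%)
    """
    return desktop_hours / server_speedup * (1 + overhead)


def vm_is_faster(server_speedup: float, overhead: float = 0.10) -> bool:
    # The VM wins whenever the server's raw advantage exceeds the overhead.
    return server_speedup > 1 + overhead


# Hypothetical example: a 48-hour desktop job on a server assumed to be 2x faster.
print(round(vm_runtime_hours(48, 2.0), 1))  # 26.4
print(vm_is_faster(2.0))                    # True
print(vm_is_faster(1.05))                   # False
```

The break-even point is simply a speedup of 1 + overhead, so under this interpretation the virtual workstation only loses to the desktop when the underlying servers and storage are barely faster than the desktop itself.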

(13)

Security

• Shared tenancy of data.

– Are ACLs enough security?

– NDAs for confidential data sources.

– Backups?

• Develop process for onboarding/disposal of GIDs.

• Probably much better control than having

(14)

Support

• Schedule maintenance time.

• Long-running jobs… days/weeks.

• Researcher schedule… nights and weekends, when support is at home!

• Training requirements for researchers.

(15)

Support…

• Software still maturing

• Commitment from administration

• Develop oversight committee

– Faculty and staff involved in research

• Virtual infrastructure is easy to overtax!

(16)

What’s Next

• Continue to expand systems as more researchers get involved.

• Build more performance in…

• Extend reach to other campus units

• Foundation for grants.
