Data center modeling and energy-efficient server management
National Institute of Advanced Industrial Science and Technology (AIST) Satoshi Itoh
Contents
• Virtualization
• Energy-saving scenario
• Data Center and Cloud computing
• Key technologies for Green Cloud
– Virtual Cluster system
– AIST 1 sec Live Migration
– Storage Live Migration
– Server Modeling
• Summary
• Two major trends related to data centers
– Green Data Center: energy-saving, low carbon
– Cloud: service, utility
• Virtualization and Grid are essential technologies to realize Cloud, Green Data Center, and Green Cloud
[Figure: Green Data Center and Cloud combine into Green Cloud, built on Virtualization / Grid]
Energy-saving scenario
• Pack the application (Service) into fewer physical
servers
• Power off the unused servers
[Figure: Service A, Service B, and Service C, initially spread over Server 1, Server 2, and Server 3, are packed onto fewer servers]
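The packing step can be illustrated with a first-fit decreasing heuristic. This is only a sketch of one possible packing rule, not the method used at AIST; the function name `pack_services`, the load values, and the single normalized capacity per server are assumptions made for illustration.

```python
def pack_services(loads, capacity=1.0):
    """First-fit decreasing: place each service on the first server with room.

    loads maps a service name to its (hypothetical) normalized load.
    Returns a list of servers, each a list of (name, load) pairs.
    """
    servers = []
    # Consider the heaviest services first.
    for name, load in sorted(loads.items(), key=lambda kv: kv[1], reverse=True):
        for server in servers:
            if sum(l for _, l in server) + load <= capacity:
                server.append((name, load))
                break
        else:
            # No powered-on server has room, so one more server is needed.
            servers.append([(name, load)])
    return servers


# Hypothetical loads: three lightly loaded services fit on one server.
loads = {"Service A": 0.3, "Service B": 0.2, "Service C": 0.4}
placement = pack_services(loads)
print(len(placement), "server(s) stay powered on")
```

Servers that do not appear in the returned placement are the candidates for power-off.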
Opportunities at coarser grain
• Find opportunities at the Module, Room, and Data Center levels
• Power off air conditioners and power supplies
• Contributes significant energy savings (doubles)
[Figure: load spread over Room 1 and Room 2 is consolidated into one room; load spread over Data center 1 and Data center 2 is consolidated into one data center]
• MAPE loop: general concept of optimal management
– It is used to optimize utilization of IT equipment
– It can be applied to power-aware computing and optimal energy management
• Keyword
– Energy-saving by virtualization
[Figure: MAPE loop for energy: Monitor → Analyze → Plan → Execute; alert of energy; counter action]
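A MAPE loop for energy can be sketched as four small functions chained into one step. Everything below is a hypothetical illustration: the `Server` class, the idle threshold, and the single "power_off" action are assumptions, not part of the AIST system.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    cpu_load: float          # fraction of capacity, 0.0 to 1.0
    powered_on: bool = True


def monitor(servers):
    """Monitor: collect the CPU load of every powered-on server."""
    return {s.name: s.cpu_load for s in servers if s.powered_on}


def analyze(readings, idle_threshold=0.05):
    """Analyze: find servers whose load is at or below the idle threshold."""
    return [name for name, load in readings.items() if load <= idle_threshold]


def plan(idle_names):
    """Plan: propose powering off every idle server."""
    return [("power_off", name) for name in idle_names]


def execute(actions, servers):
    """Execute: apply the planned actions to the server objects."""
    by_name = {s.name: s for s in servers}
    for action, name in actions:
        if action == "power_off":
            by_name[name].powered_on = False


def mape_step(servers):
    actions = plan(analyze(monitor(servers)))
    execute(actions, servers)
    return actions


servers = [Server("Server1", 0.6), Server("Server2", 0.0), Server("Server3", 0.4)]
actions = mape_step(servers)
print(actions)
```

In a real system the Monitor stage would also read power meters, and the Plan stage would weigh migration cost against the expected saving.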
Technologies to support Green Cloud
• Cluster of virtual servers (Xen, KVM, VLAN)
• Multi-site Cluster (VPN)
• On demand storage (iSCSI)
[Figure: applications on Server1 to Server4 across Data Center 1 and Data Center 2 under the Green Cloud management system; labeled component: Virtual Cluster System]
Technologies to support Green Cloud
• Monitor not only CPU load but also power consumption
[Figure: applications on Server1 to Server4 across Data Center 1 and Data Center 2 under the Green Cloud management system; labeled component: Monitoring of CPU load, power consumption]
Technologies to support Green Cloud
• Estimate power consumption using the models of servers and services.
• Choose the plan with the lower power consumption
[Figure: applications on Server1 to Server4 across Data Center 1 and Data Center 2 under the Green Cloud management system; labeled components: Modeling of server and services, Optimal assignment planning]
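Choosing the plan with the lower estimated power can be sketched as an exhaustive search over assignments. This is a toy illustration under stated assumptions: the function names, the idle and dynamic power figures in the example, and the absence of capacity constraints are all hypothetical, and brute force is only practical for a handful of services.

```python
from itertools import product


def plan_power(assignment, idle_w, dynamic_w):
    """Estimated power of a plan: idle power of every used server plus the
    dynamic power of each service on its server. Unused servers are assumed
    to be powered off and contribute nothing."""
    used = set(assignment.values())
    return (sum(idle_w[srv] for srv in used)
            + sum(dynamic_w[(svc, srv)] for svc, srv in assignment.items()))


def best_plan(services, servers, idle_w, dynamic_w):
    """Enumerate every service-to-server assignment and keep the cheapest."""
    plans = (dict(zip(services, combo))
             for combo in product(servers, repeat=len(services)))
    return min(plans, key=lambda p: plan_power(p, idle_w, dynamic_w))


# Hypothetical model values in watts.
idle_w = {"1U": 76, "Blade": 124}
dynamic_w = {("Mail", "1U"): 20, ("Mail", "Blade"): 10,
             ("DB", "1U"): 42, ("DB", "Blade"): 25}

best = best_plan(["Mail", "DB"], ["1U", "Blade"], idle_w, dynamic_w)
print(best, plan_power(best, idle_w, dynamic_w), "W")
```

With these numbers, running both services on the 1U server wins because it avoids paying the idle power of a second machine.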
Technologies to support Green Cloud
• Some of the services finish
• Plan the optimal assignment again
• Migrate services without stopping them
[Figure: applications on Server1 to Server4 across Data Center 1 and Data Center 2 under the Green Cloud management system; labeled components: Optimal assignment planning, Live migration]
Technologies to support Green Cloud
• Shut down unused servers
• Reduce power consumption
• Turn them on when needed
[Figure: applications on Server1 to Server4 across Data Center 1 and Data Center 2 under the Green Cloud management system; labeled components: Optimal assignment planning, Remote power control]
Technologies to support Green Cloud
• Some of the services finish again
• Migrate services to another site without stopping them
• Storage data is also copied
• Shut down the unused room/DC
• Greatly reduce power consumption
[Figure: applications on Server1 to Server4 across Data Center 1 and Data Center 2 under the Green Cloud management system; labeled components: Optimal assignment planning, Storage Live Migration, Cooperation of IT equipment and facilities]
Technologies to support Green Cloud
• Virtual Cluster System
• Monitoring of CPU load, power consumption
• Modeling of server and services
• Optimal assignment planning
• Live migration
• Remote power control
• Storage Live Migration
• Cooperation of IT equipment and facilities
• AIST Research target: Optimal assignment planning, Storage Live Migration, Live migration, Modeling of server and services, Virtual Cluster System
[Figure: applications on Server1 to Server4 across Data Center 1 and Data Center 2 under the Green Cloud management system]
Virtual cluster system
• Reserve the required resources for a virtual cluster via a portal
• The OS and required software are automatically installed at the reserved time
– Based on NPACI Rocks (http://www.rocksclusters.org/)
• A virtual cluster is produced using three types of virtualization technologies
– Server: VMware Server / Xen / KVM
– Network: VLAN and VPN
– Storage: iSCSI and LVM
• A prototype system is available
[Figure: access from the Internet; software and data provisioning]
Live Migration
• The movement of a service from one physical machine to another while it keeps running.
• Some production software (VMware, Xen, KVM) can do this if the machines share a disk in a single domain
– At least 10 seconds are needed to switch the host with 1GB of memory (much longer if the application updates memory pages frequently)
1. Copy all memory pages to the destination
2. Copy again the memory pages updated during the previous copy
3. Repeat step 2 until the remaining memory pages are few enough
4. Stop the VM
5. Copy CPU registers, device states, and the remaining memory pages
6. Resume the VM at the destination
• VM states must be copied faster than they are updated
[Figure: a VM running a service moves from Server1 to Server2; both servers use a shared disk]
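The six steps above converge only when pages are copied faster than they are dirtied. The following toy simulation illustrates this; it is not taken from any hypervisor, and the function name, the 4KB page size, and the copy and dirty rates are assumptions chosen for illustration.

```python
def simulate_precopy(total_pages, copy_rate, dirty_rate, threshold, max_rounds=30):
    """Simulate iterative pre-copy.

    copy_rate and dirty_rate are in pages per second. Each round sends the
    pages that are pending; meanwhile the running VM dirties new pages.
    Returns (rounds, pages_left_for_stop_and_copy, converged).
    """
    to_send = total_pages                      # step 1: all memory pages
    for rounds in range(1, max_rounds + 1):
        seconds = to_send / copy_rate          # duration of this copy round
        # Steps 2-3: pages dirtied during the round must be sent again.
        to_send = min(total_pages, int(dirty_rate * seconds))
        if to_send <= threshold:               # few enough: steps 4-6 can run
            return rounds, to_send, True
    return max_rounds, to_send, False          # updates outpace the copy


# 1GB of 4KB pages; about 100MB/s of copy bandwidth (assumed).
pages = 262144
print(simulate_precopy(pages, copy_rate=25000, dirty_rate=2500, threshold=1000))
print(simulate_precopy(pages, copy_rate=25000, dirty_rate=30000, threshold=1000))
```

With these assumed rates the first round alone takes about 10 seconds, and when the dirty rate exceeds the copy rate the loop never reaches the stop-and-copy stage.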
AIST 1sec Live Migration
• Switches the execution host in only 1 second!
• Copies VM memory after relocation
– Trivial modification to the VMM: only 200 lines added to KVM
– Transparent: no special drivers or programs in the VM; supports any guest operating system
– Simple and stable
1. Stop the VM
2. Copy CPU and device states to the destination
3. Resume the VM at the destination
4. Copy memory pages on demand (accessed memory pages are copied)
[Figure: the VM running a service stops on Server1 and resumes on Server2; both servers use a shared disk, and accessed memory pages are copied afterwards]
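The on-demand copy in step 4 can be pictured with a small model in which the destination fetches a page from the source the first time it is touched. This is a conceptual sketch only, not the KVM modification described above; the class `PostCopyVM` and its fields are hypothetical, and real post-copy works through page-fault handling inside the VMM.

```python
class PostCopyVM:
    """Destination-side view of a VM whose memory still lives on the source."""

    def __init__(self, source_pages, cpu_state):
        self.source_pages = source_pages   # pages still held by the source host
        self.cpu_state = cpu_state         # copied before resume (steps 1 to 3)
        self.local_pages = {}              # pages already at the destination
        self.fetches = 0                   # number of on-demand transfers

    def _ensure_local(self, page_no):
        # First access to a page triggers a transfer from the source (step 4).
        if page_no not in self.local_pages:
            self.local_pages[page_no] = self.source_pages[page_no]
            self.fetches += 1

    def read(self, page_no):
        self._ensure_local(page_no)
        return self.local_pages[page_no]

    def write(self, page_no, data):
        # Fetch first, since a real write may modify only part of the page.
        self._ensure_local(page_no)
        self.local_pages[page_no] = data


source = {0: b"kernel", 1: b"httpd", 2: b"cache"}
vm = PostCopyVM(source, cpu_state={"pc": 0})
vm.read(1)
vm.read(1)          # second access is served locally
vm.write(2, b"new")
print(vm.fetches, sorted(vm.local_pages))
```

Only the small CPU and device state has to cross the network before the VM resumes, which is why the host switch itself can be short.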
Verification of the efficiency
• Relocate the Web server of SPECWeb2005 (Banking)
• Existing method (Pre-copy)
– Migration was never completed
– Memory pages are updated faster than data is transferred
• Proposed method (Post-copy)
– Host is switched in 1 sec
– Response dropped for a while, but it recovered in about 10 seconds
• Environment: Intel Core2 Duo E6305, 4GB RAM; VM (httpd): 1 VCPU, 1GB RAM; shared storage; GbE; SPECWeb Client and SPECWeb Back End
[Figure: network throughput (Mbytes/s) and number of responses during live migration, for pre-copy and post-copy]
Multi-site storage live migration
• The movement of a service from one physical machine to another at a different site while it keeps running.
• Copy memory and disk images
– Copy the memory image and activate the virtual server at the remote site
– The service accesses the disk at the original site and writes the data to the local disk
– All of the data is copied to the remote site, and finally the service runs entirely at the remote site.
[Figure: the VM moves from Server 1 at the original site to Server 2 at the remote site; it accesses the remote disk over the WAN and copies the data to the local disk]
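The disk side of this scheme can be sketched as copy-on-read plus a background copy. The code below is a conceptual model only; the class `MigratingDisk`, its method names, and the dictionary-based block store are hypothetical and do not describe the AIST implementation.

```python
class MigratingDisk:
    """Disk as seen by a VM that has already moved to the new site."""

    def __init__(self, origin_blocks):
        self.origin = origin_blocks   # blocks still stored at the original site
        self.local = {}               # blocks already on the local disk

    def read(self, block_no):
        # Copy-on-read: fetch over the WAN once, then keep a local copy.
        if block_no not in self.local:
            self.local[block_no] = self.origin[block_no]
        return self.local[block_no]

    def write(self, block_no, data):
        # Writes go to the local disk only (whole-block writes assumed).
        self.local[block_no] = data

    def background_copy(self, max_blocks):
        """Copy up to max_blocks blocks that are not local yet."""
        missing = [b for b in self.origin if b not in self.local][:max_blocks]
        for block_no in missing:
            self.local[block_no] = self.origin[block_no]
        return len(missing)

    def done(self):
        # Once every block is local, the original site is no longer needed.
        return all(b in self.local for b in self.origin)


disk = MigratingDisk({0: b"x", 1: b"y", 2: b"z"})
disk.write(1, b"Y")                 # newer data must not be overwritten later
disk.read(0)
copied = disk.background_copy(10)
print(copied, disk.done())
```

Skipping blocks that are already local is what keeps the background copy from overwriting data written after the move.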
Combination of server and service
• A server consists of several components and has its own characteristics
– CPU, HDD, power supply, fan, …
– High density blade, low-power HDD/processor, water cooling, SSD, …
• Different applications create different workloads on these components
– Mail server, Web server, database server, …
• A server's energy consumption changes according to the application running on it
[Figure: which pairing is better? Mail Server on low-power HDD-PC with DB Server on Blade Server, or DB Server on low-power HDD-PC with Mail Server on Blade Server]
Modeling of server and service
• Modeling of Service (Software)
– Decompose into elemental processing
– Mail Server: CPU load ~ 30%, Disk write ~ 70%
– DB Server: CPU load ~ 70%, Disk write ~ 80%
• Modeling of Server (Hardware)
– Decompose into power consumption of components
– CPU load → power in CPU
– Store and access → power in Disk
– Data send and receive → power in NIC and CPU
[Figure: Service (Software) elements (computing, CPU load, write disk, read disk, data transfer) mapped to Server (Hardware) components (CPU, Disk, PSU, fan, circuit board)]
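Combining the two models can be sketched as a linear estimate: static idle power plus a per-component term scaled by the service's load on that component. The linear form, the class and field names, and the choice of 56W as the CPU's full-load increment (the gap between the ~132W LINPACK and ~76W idle figures of the 1U server, assuming LINPACK loads the CPU fully) are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class ServerModel:
    idle_w: float       # static power when idle
    cpu_max_w: float    # extra power at 100% CPU load (assumed linear)
    disk_max_w: float   # extra power at 100% disk activity (assumed linear)


@dataclass
class ServiceModel:
    cpu_load: float     # 0.0 to 1.0
    disk_write: float   # 0.0 to 1.0


def estimate_power(server, service):
    """Estimated power in watts of one service running alone on one server."""
    return (server.idle_w
            + server.cpu_max_w * service.cpu_load
            + server.disk_max_w * service.disk_write)


one_u = ServerModel(idle_w=76.0, cpu_max_w=56.0, disk_max_w=4.0)
mail = ServiceModel(cpu_load=0.3, disk_write=0.7)
db = ServiceModel(cpu_load=0.7, disk_write=0.8)

print(round(estimate_power(one_u, mail), 1), "W for the mail server")
print(round(estimate_power(one_u, db), 1), "W for the DB server")
```

An estimate of this kind is what an assignment planner would compare across candidate servers before placing a service.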
Power consumption in 1U server
• DELL PowerEdge R300
– CPU: Dual Core Intel Core 2 Duo E6305
– Clock: 1.60-1.86GHz
– Memory: 9GB
– HDD: SATA II 80GB (7200RPM) x 1
• Power consumption
– Idle state: ~76W
– Disk access: ~4W
– LINPACK: ~132W (12GFlops)
– Fan (high room temperature): ~13W
[Figure (measured at AIST): power breakdown; idle state (static: fan normal, HDD no access, memory, PSU loss, etc.) ~76W; fan (high room temp.) ~13W; HDD access ~4W; CPU in use and floating point: ~64W, ~9W]
• Energy per action unit
– Network: 64.8Ws/GB (Sender), 106.4Ws/GB (Receiver)
– Disk access: 50Ws/GB (Read), 55Ws/GB (Write)
– Memory access: 21.4Ws/GB (Read), ??Ws/GB (Write)
– Processor: ??W/GFlops or Ws/GFlop
– Idle status: 73.6W
[Figure: idle-state breakdown; labels: board etc., fan (normal), HDD, memory (3W per module); values: ~15W, ~37W, ~6W, ~18W]
Power consumption in blade server
• DELL PowerEdge M1000e (Enclosure)
– 16 Blades per chassis
– 6 Fans (12V 5.6A)
– 6 PSUs
– Power consumption (idle): ~213W
• DELL PowerEdge M600 (Blade)
– CPU: Quad Core Intel Xeon E5420 x 2
– Clock: 2.5GHz
– Memory: 8GB
– HDD: SATA II 80GB (7200RPM) (2.5")
• Power consumption
– Idle state: ~124W
– Disk access: ~4W
– LINPACK: ~288W (70GFlops)
[Figure: power breakdown. Enclosure part ~213W (at idle: fan at normal speed, PSU loss, etc.). Blade part: idle state (static; HDD no access, etc.) ~124W, with board etc. 51W, HDD 1W, memory 12W (4W per module), CPU 60W; CPU in use (CPU active portion) ~144W; floating point ~20W (1 blade); ~288W per blade and ~4600W for 16 blades. Other labels: 2197W, 2624W (FP 320W), ??W]
Power consumption in HDD
• Capability of storage: Store and Access
• HDD consumes power for
– Disk spin → Store capability
– Read and write processing → Access capability
– Head seek → Access capability
[Figure: power consumption over time; data store capability: power consumption with no data access ~8W; data access capability: power consumption during data access ~4W]
• Metrics: power consumption / disk size (W/GB); power consumption / access speed (W/Gbps); energy consumption / access size (Ws/GB)
Power consumption in storage
• DELL EMC AX4-5F Disk array enclosure (DAE)
– 12 HDD in an enclosure
• Power consumption
– 390 VA, 360 W (maximum)
– Idle state with 12 HDD: 300W
– 8W / HDD → 12×8W=96W
– 204W for enclosure
[Figure (measured at AIST): enclosure part ~204W; store (idle state) 8-12W / HDD, ~300W in total; access 4W / HDD]
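The figures above add up as a simple additive model: enclosure power plus a per-HDD idle term plus a per-HDD access term. The sketch below restates that arithmetic; the function name `dae_power` and the assumption that the terms simply add are illustrative.

```python
ENCLOSURE_W = 204.0    # enclosure part
HDD_IDLE_W = 8.0       # per HDD, spinning with no access
HDD_ACCESS_W = 4.0     # additional power per HDD being accessed


def dae_power(n_hdd, n_accessing=0):
    """Estimated power in watts of the disk array enclosure."""
    if not 0 <= n_accessing <= n_hdd:
        raise ValueError("n_accessing must be between 0 and n_hdd")
    return ENCLOSURE_W + n_hdd * HDD_IDLE_W + n_accessing * HDD_ACCESS_W


print(dae_power(12))        # idle with 12 HDD
print(dae_power(12, 12))    # all 12 HDD being accessed
```

Under this model even full access on all 12 disks stays below the 360 W maximum, and the enclosure itself accounts for about two thirds of the idle power.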
Power consumption in Switch
• Force10 C300
– Size: 13U
– Capacity: 1.536 Tbps
– Line card: 8 slots; 48 GbE ports or 8 10GbE ports
• Power consumption
– Enclosure: 305W, with RPM (Routing Processor Module) x 2, Fan x 6, PSU (Max 1.4KW) x 2
– 48 GbE Line Card: 103W / slot
– 48 ports wire-rate traffic: +4W
– Full install (48GbE x 8 = 384 GbE) → 1100W
– 48 ports: 103W, ~2W / port
[Figure: enclosure part 305W; line card part 103~107W; idle state (no traffic) 1129W; traffic part 32W (maximum 3%)]
• Measurements
– Power off: 17.4W
– RPM x1: 241.7W
– RPM x2: 304.8W
– RPM x2 + IFB: 367.1W
– Shutdown ports: 359.7W
– Link up 48 ports: 407.5W
– Full short traffic 48 ports: 411.2W
– Full long traffic 48 ports: 409.9W
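The switch figures can be checked with the same kind of additive model: a fixed enclosure term plus a per-line-card term, with a small increment under wire-rate traffic. The function name and the additive form are assumptions; the constants are the values quoted above.

```python
ENCLOSURE_W = 305      # chassis with RPM x 2, fans, and PSUs
LINE_CARD_W = 103      # 48-port GbE line card, no traffic
TRAFFIC_W = 4          # extra power per line card at wire-rate traffic


def switch_power(line_cards, full_traffic=False):
    """Estimated switch power in watts for a number of installed line cards."""
    per_card = LINE_CARD_W + (TRAFFIC_W if full_traffic else 0)
    return ENCLOSURE_W + line_cards * per_card


idle = switch_power(8)
busy = switch_power(8, full_traffic=True)
traffic_share = (busy - idle) / busy
print(idle, busy, f"{traffic_share:.1%}")
```

Traffic changes the total by only about 3%, so switch power is dominated by what is installed and powered on rather than by load.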
Model and Metric
• Power consumption changes with the combination of service and server
• Estimate power consumption before assigning
• Metric for optimal assignment in terms of power consumption
– Power (Energy) consumption / performance
• Simple case: LINPACK using 1U server and Blade server
[Figure: Mail Server and DB Server assigned to a low-power HDD-PC or a Blade Server, each server modeled by CPU, Disk, and PSU]
– 1U Server: LINPACK 12GFlops; power consumption 132W; power / performance (energy / work) 11 W/GFlops (Ws/GFlop)
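The same metric can be worked out for the blade server from the LINPACK figures quoted in the blade measurements (~288W, 70GFlops). The sketch below is illustrative: the function name is hypothetical, and charging each blade one sixteenth of the ~213W enclosure idle power is an assumption about how to share the enclosure overhead.

```python
def watts_per_gflops(power_w, gflops):
    """Power / performance in W/GFlops, equivalently energy / work in Ws/GFlop."""
    return power_w / gflops


one_u = watts_per_gflops(132, 12)
blade = watts_per_gflops(288, 70)

# Assumption: each of the 16 blades carries an equal share of the enclosure.
enclosure_share_w = 213 / 16
blade_with_enclosure = watts_per_gflops(288 + enclosure_share_w, 70)

print(round(one_u, 1), round(blade, 1), round(blade_with_enclosure, 1))
```

On this metric the blade does the same floating point work for well under half the energy of the 1U server, even after the enclosure share is included.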
User's responsibility (Green SLA)
• Responsibility for a low-carbon society extends from the provider side to the user side.
– Users have to be aware of the power they consume
– They pay for it and use the data for their carbon footprint
• Cloud
– Pay for service (the power cost can be included in it)
– Power consumption of physical servers can be monitored; modeling of VMs for power consumption is necessary
• Green SLA
– Contribute to energy-saving even if the performance of the service drops
– The SLA needs to include Green items (along with performance, reliability, security, etc.)
Topics for standard
• Necessity of new metrics
– How much does the use of SaaS, ASP, and Cloud contribute to energy-saving?
• energy consumption / service performance
– How much greener is your data center than competitors or last year's record?
• energy consumption / performance of data center
– How much can service migration reduce energy consumption?
• energy consumption / performance of server
• Green SLA
– Contribute to energy-saving even if the performance of the service drops
Summary
• Virtualization technologies can be used to reduce TCO and energy in data centers.
• Energy-saving scenarios exist at multiple levels of the data center: server, rack, room, whole data center
• AIST is developing technology and middleware for power-aware management in Green Cloud
– Virtual Cluster system
– Live (Storage) migration
– Optimal assignment
– Modeling of server and data center
– Green SLA