Increasing XenServer’s VM density
Jonathan Davies,
XenServer System Performance Lead
XenServer Engineering, Citrix
Cambridge, UK
Outline
1 Scalability expectations
2 Hard limits
3 Soft limits
Scalability expectations
XenServer’s VM density scalability
(Figure: VM density scale, with these labels)
XS 6.1 (and earlier): hard density limit :-(
practical density limit (depending on nature of VMs)
hard density limit :-)
hardware's theoretical capacity
Hard limits
Hard limits Enumerated causes of limitations
Hard limit 1: dom0 event channels
Cause of limitation
XenServer uses a 32-bit dom0
This means 1,024 dom0 event channels
#define MAX_EVTCHNS(d) \
    (BITS_PER_EVTCHN_WORD(d) * BITS_PER_EVTCHN_WORD(d))
Various VM functions use a dom0 event channel
VM density hard limit
225 VMs per host (PV with 1 vCPU, 1 VIF, 1 VBD)
150 VMs per host (HVM with 1 vCPU, 1 VIF, 3 VBDs)
Mitigation for XS 6.2
Hack for dom0 to enjoy 4,096 event channels
→ 800 VMs per host (PV with 1 vCPU, 1 VIF, 1 VBD)
→ 570 VMs per host (HVM with 1 vCPU, 1 VIF, 3 VBDs)
Mitigation for future
Change the ABI to provide unlimited event channels
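The 1,024 and 4,096 figures both fit the MAX_EVTCHNS macro: the number of bits in an event-channel word, squared. The sketch below reproduces that arithmetic in Python. The function name is illustrative, and reading the 4,096 figure as the 64-bit word size is an assumption, since the slide only calls it a hack.

```python
def max_evtchns(bits_per_evtchn_word: int) -> int:
    """Mirror MAX_EVTCHNS(d): the word size in bits, squared."""
    return bits_per_evtchn_word * bits_per_evtchn_word


limit_32bit = max_evtchns(32)  # 32-bit dom0, as in XS 6.1
limit_64bit = max_evtchns(64)  # assumption: how the 4,096 figure arises

print(limit_32bit, limit_64bit)

# Channels-per-VM budget implied by the quoted densities. This is only an
# upper bound, because dom0 also needs some event channels for itself.
for guest, vms_61, vms_62 in [("PV", 225, 800), ("HVM", 150, 570)]:
    print(guest, round(limit_32bit / vms_61, 2), round(limit_64bit / vms_62, 2))
```

The HVM configuration has the smaller budget per channel because each extra VBD consumes further dom0 event channels.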
Hard limit 2: blktap2 device minor numbers
Cause of limitation
blktap2 only supports up to 1,024 minor numbers
(despite the kernel allowing up to 1,048,576)
#define MAX_BLKTAP_DEVICE 1024
Each virtual block device requires one device
VM density hard limit
341 VMs per host (with 3 disks per VM)
Mitigation for XS 6.2
Double this constant to 2,048
→ 682 VMs per host (with 3 disks per VM)
Mitigation for future
Move away from blktap2 altogether?
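The density figures for this limit are the device budget divided by the number of disks per VM, rounded down. A minimal sketch of that calculation (the function name is illustrative, not from blktap2):

```python
def max_vms_by_minors(max_blktap_devices: int, disks_per_vm: int) -> int:
    """Each virtual block device needs one blktap2 minor number."""
    return max_blktap_devices // disks_per_vm


xs61 = max_vms_by_minors(1024, disks_per_vm=3)  # MAX_BLKTAP_DEVICE in XS 6.1
xs62 = max_vms_by_minors(2048, disks_per_vm=3)  # constant doubled in XS 6.2

print(xs61, xs62)
```

With a single disk per VM the same function gives 1,024 and 2,048, which matches the "blktap minor numbers" rows of the 1-VBD summary tables.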
Hard limit 3: number of aio requests
Cause of limitation
Each blktap2 instance creates an asynchronous I/O context for receiving 402 events.
Default system-wide number of aio requests was 444,416 in XS 6.1.
VM density hard limit
368 VMs per host (with 3 disks per VM)
Mitigation for XS 6.2
Set fs.aio-max-nr to 1,048,576
→ 869 VMs per host (with 3 disks per VM)
Mitigation for future
Increase fs.aio-max-nr further or use storage driver domains
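The aio limit can be checked the same way: every virtual disk reserves 402 events out of the system-wide budget. A short sketch using only the numbers quoted above (the function and constant names are illustrative):

```python
EVENTS_PER_BLKTAP_INSTANCE = 402  # aio events reserved per virtual disk


def max_vms_by_aio(aio_max_nr: int, disks_per_vm: int) -> int:
    """How many VMs fit before the system-wide aio budget is exhausted."""
    return aio_max_nr // (EVENTS_PER_BLKTAP_INSTANCE * disks_per_vm)


for release, aio_max_nr in [("XS 6.1", 444_416), ("XS 6.2", 1_048_576)]:
    print(release,
          max_vms_by_aio(aio_max_nr, disks_per_vm=1),
          max_vms_by_aio(aio_max_nr, disks_per_vm=3))
```

On a Linux host the current budget can be read from /proc/sys/fs/aio-max-nr, and the amount already reserved from /proc/sys/fs/aio-nr.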
Hard limit 4: dom0 grant references
Cause of limitation
Windows VMs use receive-side copy (RSC) by default in XS 6.1.
netback allocates (at least) 22 grant-table entries per virtual interface for RSC.
dom0 had a total of 8,192 grant-table entries in XS 6.1.
VM density hard limit
372 VMs per host (with 1 interface per VM)
Mitigation for XS 6.2
Don’t use RSC in Windows VMs anyway
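As a quick check of the 372 figure, the grant-table budget can be divided by the per-interface cost. Because netback allocates at least 22 entries per interface, the result is an upper bound. The names below are illustrative.

```python
DOM0_GRANT_ENTRIES = 8192     # total dom0 grant-table entries in XS 6.1
ENTRIES_PER_VIF_WITH_RSC = 22  # minimum allocated by netback per interface


def max_vms_by_grants(vifs_per_vm: int) -> int:
    """Upper bound on VMs before dom0 runs out of grant-table entries."""
    return DOM0_GRANT_ENTRIES // (ENTRIES_PER_VIF_WITH_RSC * vifs_per_vm)


print(max_vms_by_grants(vifs_per_vm=1))
```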
Hard limit 5: connections to xenstored
Cause of limitation
xenstored uses select(2), which can only listen on 1,024 file descriptors.
#define __FD_SETSIZE 1024
qemu opens 3 file descriptors to xenstored.
VM density hard limit
333 VMs per host (HVM)
Mitigation for XS 6.2
Make two qemu watches share a connection
→ 500 VMs per host (HVM)
Mitigation for future
Upstream qemu doesn’t connect to xenstored
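The quoted limits sit slightly below 1,024 divided by the number of descriptors per qemu. The sketch below counts the descriptors consumed at each quoted density. It assumes that sharing one connection between two watches reduces the count from 3 to 2 per qemu, and that the remaining descriptors are needed by xenstored for other purposes; the slides state neither explicitly.

```python
FD_SETSIZE = 1024  # select(2) limit quoted on the slide


def fds_used(vms: int, fds_per_qemu: int) -> int:
    """Descriptors consumed in xenstored by the qemu processes alone."""
    return vms * fds_per_qemu


used_xs61 = fds_used(333, fds_per_qemu=3)
used_xs62 = fds_used(500, fds_per_qemu=2)  # assumption: 3 -> 2 after the fix

for release, used in [("XS 6.1", used_xs61), ("XS 6.2", used_xs62)]:
    print(release, "uses", used, "leaving", FD_SETSIZE - used, "spare")
```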
Hard limit 6: connections to consoled
Cause of limitation
Similarly, consoled uses select(2)
Each PV domain opens 3 fds to consoled
VM density hard limit
341 VMs per host (PV)
Mitigation for XS 6.2
Use poll(2) rather than select(2) in consoled
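The 341 figure is the select(2) descriptor limit divided by three descriptors per PV domain. The sketch below shows that arithmetic, followed by a small illustration of the poll(2) interface through Python's select.poll, which takes a list of descriptors instead of a fixed-size bitmap. It illustrates the system call only and is not consoled's code; it assumes a Unix-like host.

```python
import os
import select

FD_SETSIZE = 1024       # select(2) limit
FDS_PER_PV_DOMAIN = 3   # descriptors each PV domain opens to consoled

max_pv_vms = FD_SETSIZE // FDS_PER_PV_DOMAIN
print(max_pv_vms)

# poll(2) is handed an explicit list of descriptors, so its capacity is
# not tied to FD_SETSIZE.
read_fd, write_fd = os.pipe()
poller = select.poll()
poller.register(read_fd, select.POLLIN)

os.write(write_fd, b"x")           # make the read end readable
events = poller.poll(1000)         # timeout in milliseconds
ready_fds = [fd for fd, _mask in events]
print(read_fd in ready_fds)

os.close(read_fd)
os.close(write_fd)
```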
Hard limit 7: dom0 low memory
Cause of limitation
Each running VM eats about 1 MB of dom0 lowmem
VM density hard limit
around 650 VMs per host
Mitigation for future
Use a 64-bit dom0
Hard limits Summary of hard limits
Summary of hard limits
Limits on number of HVM guests with 1 vCPU, 1 VBD, 1 VIF (with PV drivers)
Limitation               XS 6.1    XS 6.2    Future
dom0 event channels      225       800       no limit
blktap minor numbers     1024      2048      no limit
aio requests             1105      2608      no limit
dom0 grant references    372       no limit  no limit
xenstored connections    333       500       no limit
consoled connections     no limit  no limit  no limit
dom0 low memory          650       650       no limit
Overall limit            225       500       very high

Limits on number of HVM guests with 1 vCPU, 3 VBDs, 1 VIF (with PV drivers)
Limitation               XS 6.1    XS 6.2    Future
dom0 event channels      150       570       no limit
blktap minor numbers     341       682       no limit
aio requests             368       869       no limit
dom0 grant references    372       no limit  no limit
xenstored connections    333       500       no limit
consoled connections     no limit  no limit  no limit
dom0 low memory          650       650       no limit
Overall limit            150       500       very high

Limits on number of PV guests with 1 vCPU, 1 VBD, 1 VIF
Limitation               XS 6.1    XS 6.2    Future
dom0 event channels      225       1000      no limit
blktap minor numbers     1024      2048      no limit
aio requests             368       869       no limit
dom0 grant references    no limit  no limit  no limit
xenstored connections    no limit  no limit  no limit
consoled connections     341       no limit  no limit
dom0 low memory          650       650       no limit
Overall limit            225       650       very high
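The "Overall limit" row in each summary is simply the smallest of the individual limits, and the row that supplies it is the bottleneck for that release. A minimal sketch using the figures for HVM guests with 3 VBDs (the data layout and function name are illustrative):

```python
NO_LIMIT = float("inf")

# (XS 6.1, XS 6.2) limits for HVM guests with 1 vCPU, 3 VBDs, 1 VIF
hvm_3vbd = {
    "dom0 event channels": (150, 570),
    "blktap minor numbers": (341, 682),
    "aio requests": (368, 869),
    "dom0 grant references": (372, NO_LIMIT),
    "xenstored connections": (333, 500),
    "consoled connections": (NO_LIMIT, NO_LIMIT),
    "dom0 low memory": (650, 650),
}


def bottleneck(limits, release_index):
    """Return (limitation, value) for the tightest constraint."""
    name = min(limits, key=lambda key: limits[key][release_index])
    return name, limits[name][release_index]


print("XS 6.1:", bottleneck(hvm_3vbd, 0))
print("XS 6.2:", bottleneck(hvm_3vbd, 1))
```

In XS 6.1 the binding constraint is dom0 event channels; once that is lifted in XS 6.2, xenstored connections become the tightest limit.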
Soft limits
Soft limits xenstored
High dom0 CPU utilisation by xenstored
top - 16:29:33 up 36 min,  1 user,  load average: 0.80, 0.56, 0.47
Tasks: 132 total,   1 running, 131 sleeping,   0 stopped,   0 zombie
Cpu(s): 40.1%us, 40.0%sy,  0.0%ni, 17.6%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   4186504k total,   443480k used,  3743024k free,    23696k buffers
Swap:   524280k total,        0k used,   524280k free,   132504k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 7339 root      20   0  6732 2240  840 S 80.2  0.1   0:10.22 xenstored
 6665 root      20   0  4344 2636  584 S  0.4  0.1   0:04.03 fe
 7225 root      20   0 48892 5356 1736 S  0.3  0.1   0:03.35 xcp-rrdd
 7269 root      20   0 23704 3684 1308 S  0.3  0.1   0:03.47 xcp-rrdd-iostat
 7413 root      20   0  195m  21m 8932 S  0.3  0.5   0:10.28 xapi
 7283 root      20   0  7492 4860 1200 S  0.3  0.1   0:08.65 xcp-rrdd-xenpm
10938 root      20   0 29808 1856  956 S  0.3  0.0   0:00.40 v6d
16403 root      20   0  2428 1104  824 R  0.3  0.0   0:02.31 top
    1 root      20   0  2164  656  564 S  0.0  0.0   0:00.83 init
    2 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kthreadd
(Figure labels: dom0 vCPUs, xenstored's dom0 vCPU, domU vCPUs)
Cause of limitation
xenstored CPU utilisation bottleneck
Mitigation for XS 6.2
Reduce xenstore use by XenServer’s toolstack: remove some spurious writes
Soft limits qemu
High dom0 CPU utilisation due to qemu
top - 16:40:27 up  2:07,  1 user,  load average: 89.62, 87.22, 76.90
Tasks: 1015 total,  65 running, 950 sleeping,   0 stopped,   0 zombie
Cpu(s): 23.4%us, 55.5%sy,  0.0%ni,  4.8%id,  0.0%wa,  0.0%hi, 15.4%si,  0.5%st
Mem:   4180480k total,  1615840k used,  2564640k free,     3804k buffers
Swap:   524280k total,        0k used,   524280k free,   122852k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 7143 root      20   0     0    0    0 R 33.9  0.0  17:21.63 rpciod/0
 6653 root      10 -10 12264 7796 1152 R 31.8  0.2  36:14.34 ovs-vswitchd
16496 tcpdump   20   0  5508 2132 1248 R 10.5  0.1   5:35.12 tcpdump
16970 root      20   0  2952 1552  736 R  6.3  0.0   0:00.11 top
  997 65583     20   0 24696 4732 1572 S  3.1  0.1   0:56.30 qemu-dm
 3195 65684     20   0 24632 4736 1572 S  3.1  0.1   0:27.34 qemu-dm
 3497 65656     20   0 24760 4740 1576 R  3.1  0.1   0:28.65 qemu-dm
 3562 65685     20   0 24696 4732 1572 S  3.1  0.1   0:26.97 qemu-dm
 3993 65546     20   0 24888 4744 1580 S  3.1  0.1   0:53.19 qemu-dm
 7597 65659     20   0 24632 4736 1576 S  3.1  0.1   0:28.86 qemu-dm
 8150 65550     20   0 24760 4740 1580 R  3.1  0.1   0:51.71 qemu-dm
 8679 65627     20   0 24632 4740 1576 R  3.1  0.1   0:31.18 qemu-dm
 8974 65661     20   0 24568 4736 1572 S  3.1  0.1   0:27.97 qemu-dm
11937 root      20   0     0    0    0 S  3.1  0.0   1:12.92 nfsiod
12545 65556     20   0 24824 4748 1584 S  3.1  0.1   0:58.46 qemu-dm
qemu burning dom0 CPU
200 idle Windows guests, each qemu utilising 3% of a CPU, means 200 × 3% = 600% of a CPU: six CPUs' worth of dom0 time
What is qemu busy doing?
Emulated device             qemu events per VM per second
USB                         221
CD-ROM                      38
Buffered I/O & RTC timer    13
Parallel port               1
Serial port                 1
VNC                         1
qemu monitor                1
Mitigation for XS 6.2
Use an event-channel for buffered I/O notifications
Provide options to disable all emulated devices
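To put the per-device event rates in perspective, the sketch below totals them per VM and scales to a whole host. The host size of 200 VMs is borrowed from the idle-guest example earlier in this section and is an assumption here; the slides do not say at what density the rates were measured.

```python
# qemu events per VM per second, by emulated device
events_per_vm = {
    "USB": 221,
    "CD-ROM": 38,
    "Buffered I/O & RTC timer": 13,
    "Parallel port": 1,
    "Serial port": 1,
    "VNC": 1,
    "qemu monitor": 1,
}

VMS_ON_HOST = 200  # assumption: same host size as the idle-guest example

total_per_vm = sum(events_per_vm.values())
total_per_host = total_per_vm * VMS_ON_HOST
usb_share = events_per_vm["USB"] / total_per_vm

print(total_per_vm, "events per VM per second")
print(total_per_host, "events per second across the host")
print(f"USB accounts for {usb_share:.0%} of them")
```

Emulated USB dominates the total, which is why an option to disable emulated devices pays off for idle guests.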
Benchmarks
Benchmarks Bootstorm
Booting 90 Win7 VMs
XS 6.2 is 60% faster
(Figure: Time to fully boot 90 VMs (25 at a time). x-axis: VM index; y-axis: Elapsed time (s); legend: Tampa (XS 6.1))
Booting 120 Win7 VMs
XS 6.2 is 75% faster
(Figure: Time to fully boot 120 VMs (25 at a time). y-axis: Elapsed time (s); legend: Tampa (XS 6.1))
Booting 200 Win7 VMs
XS 6.1 can't even get 200 VMs running!
It took XS 6.2 just 13 minutes to boot 200 VMs (on this hardware)
(Figure: Time to fully boot 200 VMs (25 at a time). x-axis: VM index; y-axis: Elapsed time (s); legend: Tampa (XS 6.1), Clearwater (XS 6.2))
Benchmarks LoginVSI
LoginVSI: number of usable Windows VMs
(Figure: number of VMs performing acceptably, XS 6.2 compared with XS 6.1)
Questions