Wi-Fi / WLAN
Performance Management and Optimization
Veli-Pekka Ketonen, CTO, 7signal Solutions
Topics
1. The Wi-Fi Performance Challenge
2. Factors Impacting Performance
3. The Wi-Fi Performance Cycle
4. 10-step performance optimization flow
5. Selected example data
Wi-Fi Networks are Everywhere!
But they are transitioning from “nice to have” to “must have”
Challenges with Mission Critical Wi-Fi Networks:
Connection issues with new devices & machines
Bottlenecks from increasing data traffic
Dropped or noisy voice calls
Challenging physical environments
Dependable Wi-Fi is Costly and Complex
[Figure: cost needed to achieve reliability ($) versus complexity of network (number of access points, clients, applications). Complexity drivers shown: Voice over Wi-Fi, BYOD, Guest Networks, Mobile Computing, Virtual Desktop, Video Apps. Curve: reactive focus based on complaints]
Improper Antenna Selection / Placement
Antenna gain pattern
Antenna gain direction
Behind metal grid?
Near a conductive or “dense” surface?
For common ceiling-mounted APs, a sideways, down-tilted pattern is most useful
[Figure: down-tilted antenna pattern with attenuation upwards and maximum gain sideways; 180 Mbit/s]
RF power level is not that simple
RF power isn’t always what your datasheet and settings tell you
Impact of:
– AP/device model
– Rate/MCS
– HT 20/40/80
– Assumed MIMO gain
– Assumed diversity/STBC gain
– Antenna gain
– Channel #, regulation
– Passing the Type Approval
– Back annotation reliability
Lower the output power and use antenna gain to reach further with higher rates
[Figure: transmit power budget. Contributions shown: radio output (no antenna), HT40, highest MCS; antenna gain, +3 dB; HT40 -> HT20, +2 dB; no high MCS/rates, +3 dB; MIMO/TX diversity gain, +3 dB. Levels shown: +8 dBm, +11 dBm, +14 dBm, +17 dBm, +20 dBm. Rate labels: 300 Mbit/s, 300 Mbit/s]
WLAN Transmit Power Control (TPC) can create issues
A common implementation measures neighbor AP levels and keeps them below a fixed value
Power levels may drift to the end of the allowed range
Clients commonly use +10 to +15 dBm power; running APs at much lower levels unbalances the link budget. Both uplink and
[Figure: floor plan of rooms. A high received neighbor AP level may drive AP power down and cause a lack of coverage here]
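The link budget imbalance mentioned above can be put into numbers. The sketch below assumes that path loss and antenna gains are the same in both directions, so the difference between downlink and uplink received levels comes down to the difference in transmit power. The function names and the default values (80 dB path loss, 3 dB AP antenna gain) are illustrative assumptions, not figures from any product.

```python
def received_dbm(tx_dbm: float, tx_gain_db: float, rx_gain_db: float,
                 path_loss_db: float) -> float:
    """Received signal level for a simple link budget (all values in dB/dBm)."""
    return tx_dbm + tx_gain_db + rx_gain_db - path_loss_db


def link_imbalance_db(ap_tx_dbm: float, client_tx_dbm: float,
                      ap_gain_db: float = 3.0, client_gain_db: float = 0.0,
                      path_loss_db: float = 80.0) -> float:
    """Downlink level minus uplink level.

    Assumption: path loss and antenna gains are reciprocal, so they
    cancel out and only the transmit power difference remains.
    """
    downlink = received_dbm(ap_tx_dbm, ap_gain_db, client_gain_db, path_loss_db)
    uplink = received_dbm(client_tx_dbm, client_gain_db, ap_gain_db, path_loss_db)
    return downlink - uplink


# An AP driven down to +5 dBm by TPC while the client transmits at +15 dBm.
print(link_imbalance_db(ap_tx_dbm=5.0, client_tx_dbm=15.0))
```

A negative result means the client is heard at the AP more strongly than the AP is heard at the client, so the downlink becomes the weak direction.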
Channel & Utilization Issues
Channel overlap
APs outside channel grid
HT conflicts
Number of APs/SSIDs
Allocate channels properly
Use all spectrum you have
The most important way to increase capacity -- avoid interference and lower utilization!
Some devices do not support all 5 GHz channels, but…try really hard to use all available channels
Channel automation parameters may help it converge towards a better channel plan
If not, use a manual channel plan
[Figure: AP channel maps using channels 1, 6 and 11. Annotation: Without a very good reason this should not]
Sometimes channel automation is not working well and needs help
Continuous channel switching
More stable operation
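When automation does not settle, a manual plan on the 1/6/11 grid can start from a simple rotation. The sketch below is one naive way to do that, assuming the APs of each floor are listed in their physical order along the floor. The function name and the input layout are hypothetical, and the result is only a starting point to verify against measurements.

```python
# Non-overlapping 2.4 GHz channel grid.
CHANNEL_GRID = (1, 6, 11)


def rotate_channels(floors, grid=CHANNEL_GRID):
    """Return {ap_name: channel} for a list of floors.

    floors: list of lists of AP names, one inner list per floor,
            ordered by physical position (assumption).
    Channels rotate evenly along each floor, and every floor starts
    one step further along the grid than the floor below it.
    """
    plan = {}
    for floor_index, ap_names in enumerate(floors):
        for position, ap_name in enumerate(ap_names):
            plan[ap_name] = grid[(position + floor_index) % len(grid)]
    return plan


plan = rotate_channels([
    ["f0-a", "f0-b", "f0-c"],   # ground floor
    ["f1-a", "f1-b", "f1-c"],   # first floor
])
print(plan)
```

The offset between floors keeps APs that sit directly above each other on different channels.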
Too high rates cause high retries
WLAN AP rate control often uses rates that are too high
This causes a high number of retries, which have a negative impact on performance
What can rates and retries tell you?
Data rates/MCS = HIGH, Retries = LOW: good coverage, reliable operation, high speed and capacity
Data rates/MCS = HIGH, Retries = HIGH: unstable, high jitter, packet loss, limited capacity
Data rates/MCS = LOW, Retries = LOW: speed limited, working OK
Data rates/MCS = LOW, Retries = HIGH: very slow, at the coverage boundary
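The four rate/retry combinations can be turned into a small classifier. This is a sketch of one reading of the quadrants. The thresholds (20% retries, half of the maximum rate) are hypothetical values picked for illustration and would need tuning per network.

```python
def diagnose_link(retry_pct: float, rate_fraction: float,
                  retry_high_pct: float = 20.0,
                  rate_high_fraction: float = 0.5) -> str:
    """Classify a link from its retry rate and relative data rate.

    retry_pct:     share of frames that were retries, in percent
    rate_fraction: average data rate divided by the maximum supported rate
    Both thresholds are assumptions, not standardized values.
    """
    high_retries = retry_pct >= retry_high_pct
    high_rates = rate_fraction >= rate_high_fraction
    if high_rates and not high_retries:
        return "good coverage, reliable operation, high speed and capacity"
    if high_rates and high_retries:
        return "unstable, high jitter, packet loss, limited capacity"
    if not high_rates and not high_retries:
        return "speed limited, working ok"
    return "very slow, at the coverage boundary"


print(diagnose_link(retry_pct=35.0, rate_fraction=0.8))
```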
Non Wi-Fi Interference
Bluetooth
Microwave
Video cameras
Legacy mode drives speed down
The largest impact is from 802.11b protection
When an AP detects an associated 802.11b client, the AP turns on protection mode (in beacons and probe responses). The AP may also turn this on when it detects another AP using protection mode.
When protection mode is on, all clients need to start using either RTS/CTS or CTS-to-self protection to avoid collisions
This introduces significant overhead that usually limits throughput and capacity considerably
If 802.11b support is off, it is useful to try to remove 802.11b devices completely. Otherwise they keep probing with 802.11b rates
TCP does not like lost packets or delay
TCP uses a mechanism called slow start
If a packet loss occurs, TCP assumes that it is due to network congestion and takes steps to rapidly reduce the offered load to the network
With slow start, TCP starts increasing rate again when consecutive acknowledgements are received properly
Slow start may perform poorly with wireless networks
Retries at different layers using TCP
User Application (Layer 5-7), user data: varies; the user may lose patience in 4-10 s
TCP (Layer 4): not ACK’d within 2x RTT? -> Resend w/ slow start
WLAN (Layer 1-2): not ACK’d? -> Resend, 7-25 times
Desktop virtualization (sometimes used to help with layer 1-4 problems)
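The effect of a single loss on TCP's sending rate can be seen in a toy model. The sketch below is a simplified, Tahoe-style congestion window simulation and not any real stack's implementation. The function name, the initial threshold of 64 segments and the one-step-per-round-trip granularity are assumptions made for illustration.

```python
def simulate_cwnd(loss_rounds, rounds, ssthresh=64):
    """Return the congestion window (in segments) at the start of each round trip.

    loss_rounds: set of round indexes in which a loss is detected
    Simplified model: the window doubles per round trip during slow start,
    grows by one segment per round trip above ssthresh, and drops back to
    one segment after a loss.
    """
    cwnd = 1
    history = []
    for round_index in range(rounds):
        history.append(cwnd)
        if round_index in loss_rounds:
            ssthresh = max(cwnd // 2, 2)      # remember half of the window
            cwnd = 1                          # restart from slow start
        elif cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)    # slow start: exponential growth
        else:
            cwnd += 1                         # congestion avoidance: linear growth
    return history


# One loss in round 4 sends the window from 16 segments back to 1.
print(simulate_cwnd(loss_rounds={4}, rounds=8))
```

On a radio link a loss is often caused by interference rather than congestion, yet the window collapses all the same, which is why lost packets are so costly for TCP throughput.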
Retries at different layers using UDP
User Application (Layer 5-7): VoIP call, etc.
UDP (Layer 4): UDP does not retransmit, permanently lost packet
WLAN (Layer 1-2): not ACK’d?
Layer 2 packet fragmentation makes radio more robust
Fragmenting packets increases robustness, but also increases overhead
Aggregating (e.g. Block ACK) reduces robustness, but increases efficiency
The fragmentation threshold default value is usually 2346 B (>1500 B, so no fragmenting)
[Figure: frame exchanges. Without fragmentation: #1, 1500 B, ACK; #2, 1500 B, ACK. With fragmentation: #1, 750 B, ACK; #2, 750 B, ACK; #3, 750 B; #4, 750 B, ACK. With an error: #1, 1500 B, no ACK (lost or any error); #1, retry 1, 1500 B]
If an error is detected, the content of the whole 1500 B packet is lost and needs to be retransmitted
The probability of errors in a smaller packet is lower, and transmitting it took less time in the first place
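The trade-off between robustness and overhead can be estimated with a simple error model. The sketch below assumes independent bit errors, unlimited retries and a fixed per-frame overhead of 50 bytes. All three are simplifying assumptions, and the function names are hypothetical.

```python
def frame_success_probability(frame_bytes: int, bit_error_rate: float) -> float:
    """Probability that every bit of a frame arrives intact (independent errors)."""
    return (1.0 - bit_error_rate) ** (8 * frame_bytes)


def expected_bytes_on_air(payload_bytes: int, fragment_bytes: int,
                          bit_error_rate: float, overhead_bytes: int = 50) -> float:
    """Expected bytes transmitted, retries included, to deliver one payload.

    Each fragment carries overhead_bytes of extra cost (assumed value) and is
    retried until it succeeds, so it is sent 1/p times on average.
    """
    if fragment_bytes <= 0:
        raise ValueError("fragment_bytes must be positive")
    total = 0.0
    remaining = payload_bytes
    while remaining > 0:
        frame = min(fragment_bytes, remaining) + overhead_bytes
        total += frame / frame_success_probability(frame, bit_error_rate)
        remaining -= fragment_bytes
    return total


for ber in (1e-6, 1e-4):
    whole = expected_bytes_on_air(1500, 1500, ber)
    halves = expected_bytes_on_air(1500, 750, ber)
    print(f"BER {ber:g}: unfragmented {whole:.0f} B, two fragments {halves:.0f} B")
```

Under these assumptions fragmentation costs more on a clean link and pays off once the error rate is high, which matches the robustness versus overhead trade-off described above.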
Higher QoS helps prioritize data
Voice (VO), Video (VI), Best Effort (BE) and Background (BK) classes
Answering the Wi-Fi Challenge
Problem -> Solution
Wait for complaints -> Proactive measurements
Limited view of network -> Check end-to-end performance
Little historical data -> Analyze historical trends
Guess at service levels -> Use metrics based reporting
Remote issues costly to resolve -> Centralize diagnosis of
Bending the Cost Curve
[Figure: cost needed to achieve reliability ($) versus complexity of network (number of access points, clients, applications). Complexity drivers shown: Voice over Wi-Fi, BYOD, Guest Networks, Mobile Computing, Virtual Desktop, Video Apps, Location Svcs. Curves: reactive focus based on complaints; proactive focus based on continuous]
Performance Management with a Systematic Approach
Listen to AP / Client Traffic (Passive Tests)
Simulate Client Traffic (Active Tests)
[Figure: Access Point(s), Sensor, Mgmt Station]
The Eye’s Capabilities
Synthetic Tests
• End-to-end view at the application layer
• Data and voice quality measurements (throughput, packet loss, latency, jitter)
Traffic Analysis
• Radio frame header analysis for traffic flow between clients and APs
• KPIs for each client, SSID, AP, band and antenna beam
RF Analysis
• AP settings, capabilities, signal levels, channels and noise levels
• KPIs for each AP, channel and antenna beam
Spectrum Analysis
• High resolution (280 kHz) for ISM band
• Interference source analysis with compass directional data on beams
Full Packet Capture
• Capture remotely
The Wi-Fi Performance Cycle
“If you can’t measure it, you can’t manage it!” - Peter Drucker
Measure -> Analyze -> Optimize -> Verify -> Assure
4. Optimization flow
The most important KPIs
Connection Success
Throughput
Packet Loss
Data rates
Retry rates
Utilization
Traffic volume
Channels
Signal level
Spectrum data
Latency
Jitter
Voice quality (MOS)
End user metrics (active tests)
Layer 2 / Layer 1 metrics (passive tests)
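Several of the end user KPIs listed above can be derived from a single active test run. The sketch below assumes a list of round-trip times in milliseconds with `None` marking a lost probe. It uses a simplified jitter definition (mean absolute difference between consecutive samples), which is an assumption and not the only definition in use. The function name is hypothetical.

```python
from statistics import mean


def summarize_probes(rtts_ms):
    """Return packet loss (%), mean latency (ms) and jitter (ms) for one test run.

    rtts_ms: round-trip times in milliseconds, None for a lost probe.
    Jitter here is the mean absolute difference between consecutive
    received samples (simplified definition, assumption).
    """
    if not rtts_ms:
        raise ValueError("no samples")
    received = [r for r in rtts_ms if r is not None]
    loss_pct = 100.0 * (len(rtts_ms) - len(received)) / len(rtts_ms)
    latency_ms = mean(received) if received else None
    if len(received) > 1:
        jitter_ms = mean(abs(b - a) for a, b in zip(received, received[1:]))
    else:
        jitter_ms = 0.0
    return {"loss_pct": loss_pct, "latency_ms": latency_ms, "jitter_ms": jitter_ms}


print(summarize_probes([10.0, 12.0, None, 11.0]))
```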
Optimization flow at a glance
1. Preparations and baseline
• Ensure that APs and antennas are positioned correctly
• Collect baseline data for a few days, check the WLAN SW release, upgrade
2. Channel plan
• Maximize available spectrum, organize channels for max capacity potential
• Use a manual channel plan in dense areas
3. Minimize utilization
• Minimize utilization due to unnecessary 802.11 traffic
• Number of SSIDs, standards, beaconing, probing, data rates, protection, etc.
4. Adjust power levels
• Adjust AP power levels & TPC settings for improved SNR at both ends
5. Reduce non-WLAN interference
• Remove non-WLAN interference as much as possible
• There is always interference; understand whether it has a significant impact
6. Improve radio robustness
• Make the radio more robust towards remaining interference/noise
• Increased power, dropping max MCS, fragmentation, directional antennas
7. Prioritize and balance traffic
• QoS categories, AP power levels, load balancing, SSID strategy, roaming
8. LAN/WAN capabilities
• Ensure sufficient LAN/WAN capacity and performance are present
9. Improve client operation
• Drivers, location, models, settings
10. Physical changes to network
• If performance is not sufficient, consider HW changes
#1. Understand the baseline
Collect and review all radio parameter settings
Verify AP type, antenna performance and placement
Collect baseline performance data for 3-5 days
– Understand peaks and valleys in performance
– Nighttime data is extremely useful: if an empty network can’t provide good throughput, it won’t do so under load either!
Analyze and find likely bottlenecks
Draft a plan for optimization steps
#2. Plan the channels carefully
Understand the number of APs per channel in the whole area
Use the maximum amount of radio spectrum and channels
Align all APs to a common channel grid (1, 6, 11, etc.)
Fix the HT bonding side, HT40+ or HT40-
Do not let a bonded channel overlap a main channel
If automation does not provide a balanced plan, assign channels manually
Rotate channels evenly within a floor
Rotate with an offset between floors
#3. Minimize utilization
Reduce the number of SSIDs per AP to 3-4 at most
– Note: every SSID sends its own beacons, day and night
– It’s common for networks to run at high utilization without clients!
Remove 802.11b rates (1, 2, 5.5, 11) and their support
Remove low MCS rates and their spatial-stream (SS) multiples
Increase the beacon interval from 100 ms to 300 ms
– Note: some devices do not allow this, e.g. Vocera badges, older VoIP phones and older equipment in general
Increase the CCA threshold (RX SOP, or similar term)
#4. Adjust power levels
Define a limited range for TPC algorithms instead of the default
Also observe power level changes from metrics. Do they correlate with settings?
Assign a 3-5 dB higher power range for 5 GHz than for 2.4 GHz
Use manual power levels if TPC does not yield good results
If possible, do not exceed the power level that still supports all data rates/MCSs. Consider
#5. Reduce non-Wi-Fi interference
Interference is always present! Understand the level of impact
– How are end user metrics impacted?
– Correlate spectrum data with metrics
Analyze the spectrum: where does the noise come from?
Bluetooth is the most common non-WLAN source
– Keyboard, mouse, headset, handheld readers
– Many other potential sources, especially in the 2.4 GHz band
Remove sources when possible
Observe the impact on throughput and other end user metrics when changes are made
#6. Improve WLAN robustness
Remove highest rates/MCS (most sensitive)
Run voice SSIDs in 802.11g/802.11a mode only, without 802.11n
Use radio packet fragmentation
#7. Prioritize and balance traffic
Separate SSIDs (but keep quantity to minimum)
Assign QoS classes with WMM (Wi-Fi Multimedia, formerly Wireless Multimedia Extensions)
Adjust relative AP power levels to move clients
Consider using load balancing, band steering/select and admission control features
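How a packet ends up in one of the four WMM classes can be sketched as a lookup. The user priority to access category table below follows the standard WMM grouping. The DSCP to user priority step is an assumption: it uses the three most significant DSCP bits, which is a common default, but equipment may apply a different mapping. The function names are hypothetical.

```python
# 802.1D user priority (0-7) -> WMM access category.
UP_TO_AC = {
    1: "BK", 2: "BK",   # Background
    0: "BE", 3: "BE",   # Best Effort
    4: "VI", 5: "VI",   # Video
    6: "VO", 7: "VO",   # Voice
}


def access_category(user_priority: int) -> str:
    """Return the WMM access category for an 802.1D user priority."""
    return UP_TO_AC[user_priority]


def dscp_to_user_priority(dscp: int) -> int:
    """Map a 6-bit DSCP value to a user priority.

    Assumption: take the three most significant bits of the DSCP.
    Real equipment may be configured with a different table.
    """
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in 0..63")
    return dscp >> 3


for name, dscp in (("default", 0), ("CS1", 8), ("AF41", 34), ("EF", 46), ("CS6", 48)):
    print(name, dscp, access_category(dscp_to_user_priority(dscp)))
```

With this default, voice traffic marked EF (46) lands in the Video class rather than Voice, so it is worth checking which mapping the network actually applies before relying on the QoS classes.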
#8. Ensure sufficient LAN/WAN capacity
Observe utilization at the switch/router interfaces
Observe packet loss metrics
Internet connection speed may be a bottleneck at remote sites
Always routing data packets to the controller may impact performance
Understand what is sufficient throughput for end
#9. Improve client operation
Review all client devices and understand where their antennas are
Ensure that antennas are not hidden within metal enclosures and have space to operate properly
Upgrade WLAN drivers
Set roaming aggressiveness to medium or low
Adjust client power level
#10. Physical changes to network
Move APs
Add APs
Upgrade APs
Use good-quality external antennas of the right type
Every network can be made to perform well!
Uplink throughput
Average improved from ~11 to ~14 Mbit/s (27%)
The worst APs improved from ~4 to
Downlink throughput
[Chart annotations: antenna change ready, channel change, core LAN upgrade, power level change, codec changes]
The worst APs improved from 7 to 15 Mbit/s (110%)
Average improved from 13 to 17 Mbit/s (30%)
Packet loss
[Chart annotations: antenna, channel, power level, codec, core LAN]
From ~2.5% to
[Chart markers: 1st to 7th]
Downlink throughput (daily)
Downlink throughput daily averages have improved 50%
[Chart markers: 1st to 7th]
Downlink throughput (hour)
Minimum values increase up to ~10x
1st) Disabling power saving
2nd) Disabling b-data rates, area 1
3rd) Disabling b-data rates in other locations
5th) New TxPwr settings in XXX and channel plan in YYY
TCP downlink throughput
900% improvement in 1st floor
100% improvement in ground floor
[Chart annotations: AP power levels, more channels, beacon 300 ms, HT40; markers 1 to 5]
HTTP downlink throughput
90%/50% improvements
Voice Quality (MOS), downlink, hourly
+0.25 MOS in ground
Network latency (RTT)
50% improvement in 1st floor
Performance Dashboard
[Figure: dashboard before analysis and optimization; dashboard after analysis and optimization]
Summary
Wi-Fi is very sensitive to its surroundings and network parameters, even though it somehow works almost no matter where you put it
Performance can often be improved significantly by adjusting the network parameters