Protocol Verification in 4G/3G Networks
SIGCOMM’14: Control-Plane Inter-Protocol Interactions
Poll: How many of you experience …
Ø Dropped call
Ø Unknown missed calls
Ø No mobile broadband
Ø Out of service
Ø Slow speed
Ø …
Chunyi Peng @ OSU 3
Sometimes, you are aware/unaware of …
Some might be rare or transient …
Some errors/failures are even unnoticeable
Our concern:
Why do they occur?
Can they be avoided?
The Question:
Are there any design defects or
bugs in cellular networks?
Do control-plane protocols
function correctly?
A Critical Infrastructure
Ø Offer data + voice for anyone, anytime, anywhere
Ø Control plane: essential to cellular networks
§ Voice and data sessions
§ Universal access (mobility)
§ Radio resource
§ Access control
§ Security
§ …
Cellular Network Architecture
[Figure: 3G (PS + CS): base stations, gateways for Packet Switching (PS), and the Mobile Switching Center for Circuit Switching (CS). 4G (PS only): Mobility Management Entity (control node).]
Control Plane in Cellular Network
Ø Major control utilities
§ Radio resource control
§ Mobility management
§ Connectivity management
Ø Layered protocol stack
[Figure: layered control-plane protocol stack: Radio Resource Control (RRC), Mobility Management (MM), Connectivity Management (CM). In 3G, MM and CM exist separately in the CS domain and the PS domain, on top of RRC.]
Ø Domains separated for voice (CS) and data (PS)
Ø Hybrid 3G/4G systems
[Figure: hybrid 3G/4G control plane. 3G: RRC with CS-domain MM/CM and PS-domain MM/CM. 4G: RRC with PS-domain MM/CM.]
Problem: Complex Interaction Verification
Ø Protocols must work together to offer vital control utilities
§ Rich patterns along three dimensions: cross-layer, cross-domain, cross-system
[Figure: 3G and 4G LTE protocol stacks on the phone and in the network (base stations, gateways, MSC, MME, HSS), with three interaction dimensions marked: (1) cross-layer, (2) cross-domain, (3) cross-system. Control-plane protocols shown: 3G CS domain: Call Control (CM/CC), Mobility Management (MM); 3G PS domain: Session Management (SM), Mobility Management (GMM); Radio Resource Control (3G-RRC); 4G PS domain: Session Management (ESM), Mobility Management (EMM); Radio Resource Control (4G-RRC).]
Verification of control-plane protocol interaction
Each individual protocol may be well designed.
How about protocol interactions?
Case study: Dialing a Voice Call
[Figure: CM, MM, and RRC state machines while dialing a voice call, steps 1 to 14. States: IDLE, CONN-ING, CONN-ED, WAIT-FOR-RR, WAIT-FOR-MM, MM-ACTIVE, WAIT-FOR-NET-CMD. Primitives: MM_EST_REQ, RR_EST_REQ, RR_EST_CNF, MM_EST_CNF, RR_DATA_REQ/IND, MM_DATA_REQ/IND, MM_REL_IND, RR_REL_IND. Events: RRC Conn setup, RRC Conn Est., MM Session Est., Call Setup, Call Release, MM Session Released, RRC Conn Released.]
Legend: CNF = confirmation, CONN = connection, EST. = established, IND = indication, REL = release, REQ = request
Aborted call due to location update
[Figure: MM states during a location update: Idle, WAIT-FOR-RR-LU, LU-INITIATED, WAIT-FOR-NET-CMD, Idle. Events: RRC Conn Est. Req, RRC Conn Establishment, Location Update Accept, RRC Conn Released.]
06:39:56.863 EVENT_UMTS_CALLS Call statistic: Mobile initiated normal voice
06:39:56.863 EVENT_MM_STATE MM State: MM_WAIT_FOR_RR_CONNECTION_MM,..
06:39:58.992 EVENT_MM_STATE MM State: MM_IDLE,..Substate: MM_NORMAL_SERVICE
06:39:58.995 EVENT_MM_STATE MM State: MM_IDLE,.Sub:MM_ATTEMPTING_TO_UPDATE
06:39:58.995 EVENT_MM_STATE MM State: MM_WAIT_FOR_RR_CONNECTION_LU
06:39:59.002 EVENT_UMTS_CALLS_ Call statistic: MO call - mobile aborted
06:39:59.725 EVENT_MM_STATE MM State: MM_LOCATION_UPDATE_INITIATED
06:39:59.975 EVENT_MM_STATE MM State: MM_LOCATION_UPDATE_INITIATED
06:40:00.945 EVENT_MM_STATE MM State: MM_LOCATION_UPDATE_INITIATED
06:40:00.946 EVENT_MM_STATE MM State: MM_WAIT_FOR_NETWORK_COMMAND
06:40:01.253 EVENT_MM_STATE MM State: MM_IDLE, ...
Annotations, in time order: Dialing; MM for call starts; MM for call stops; MM for LU starts; Dialing aborted; Location Update
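The pattern in the trace above (MM leaves call setup, falls back to idle, then starts a location update) can be detected mechanically. The following is a minimal sketch, not the authors' tooling: the state names are copied from the trace, while the function name and the detection rule are assumptions made for illustration.

```python
# (timestamp, MM state) pairs copied from the trace above.
MM_STATES = [
    ("06:39:56.863", "MM_WAIT_FOR_RR_CONNECTION_MM"),
    ("06:39:58.992", "MM_IDLE"),
    ("06:39:58.995", "MM_IDLE"),
    ("06:39:58.995", "MM_WAIT_FOR_RR_CONNECTION_LU"),
    ("06:39:59.725", "MM_LOCATION_UPDATE_INITIATED"),
    ("06:40:00.946", "MM_WAIT_FOR_NETWORK_COMMAND"),
    ("06:40:01.253", "MM_IDLE"),
]


def call_preempted_by_lu(states):
    """Hypothetical rule: True if MM dropped from call setup to idle and
    then started a location update before the MM connection became active."""
    in_call_setup = False
    dropped_to_idle = False
    for _, state in states:
        if state == "MM_WAIT_FOR_RR_CONNECTION_MM":
            in_call_setup, dropped_to_idle = True, False
        elif state == "MM_CONNECTION_ACTIVE":
            in_call_setup = False  # call setup succeeded at the MM layer
        elif state == "MM_IDLE" and in_call_setup:
            dropped_to_idle = True
        elif state == "MM_WAIT_FOR_RR_CONNECTION_LU" and dropped_to_idle:
            return True
    return False


if __name__ == "__main__":
    print(call_preempted_by_lu(MM_STATES))
```

A trace in which MM reaches MM_CONNECTION_ACTIVE before going idle is not flagged by this rule.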
Delayed call due to location update
Time format: hh:mm:ss.ms
06:16:41.418 AutoCaller Calling out
06:16:41.912 EVENT_UMTS_CALLS_STAT Call statistic: Mobile initiated normal voice
06:16:41.913 EVENT_MM_STATE MM State: MM_WAIT_FOR_RR_CONNECTION_MM, M
06:16:42.037 EVENT_MM_STATE MM State: MM_WAIT_FOR_OUTGOING_MM_CONNE
06:16:42.238 EVENT_MM_STATE MM State: MM_CONNECTION_ACTIVE, MM Update
06:16:48.886 EVENT_CM_CALL_STATE Conversation
06:16:49.440 EVENT_UMTS_CALLS_STAT Call statistic: MO call - ended
06:16:49.441 EVENT_MM_STATE MM State: MM_WAIT_FOR_NETWORK_COMMAND, M
06:16:49.894 AutoCaller Calling out
06:16:50.595 EVENT_MM_STATE MM State: MM_IDLE, MM Update Status: MM_UPDATE
06:16:54.632 EVENT_UMTS_CALLS_STAT Call statistic: Mobile initiated normal voice
06:16:54.635 EVENT_MM_STATE MM State: MM_WAIT_FOR_RR_CONNECTION_MM, M
Annotations, in time order: 1st call; Dialing 2nd call; MM starts for 2nd call
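The dialing delay can be read directly off the timestamps in this trace. The sketch below uses only the four relevant entries; the pairing rule (each AutoCaller "Calling out" is matched with the next "Mobile initiated normal voice" event) and the function names are assumptions for illustration.

```python
from datetime import datetime

# Relevant entries of the trace above: (timestamp, source, message).
TRACE = [
    ("06:16:41.418", "AutoCaller", "Calling out"),
    ("06:16:41.912", "EVENT_UMTS_CALLS_STAT", "Mobile initiated normal voice"),
    ("06:16:49.894", "AutoCaller", "Calling out"),
    ("06:16:54.632", "EVENT_UMTS_CALLS_STAT", "Mobile initiated normal voice"),
]


def parse(ts):
    """Parse an hh:mm:ss.ms timestamp."""
    return datetime.strptime(ts, "%H:%M:%S.%f")


def dial_delays(trace):
    """Seconds between each dial request and the start of call setup."""
    delays = []
    pending = None  # time of the dial request still waiting for call setup
    for ts, source, message in trace:
        if source == "AutoCaller" and message == "Calling out":
            pending = parse(ts)
        elif message.startswith("Mobile initiated") and pending is not None:
            delays.append((parse(ts) - pending).total_seconds())
            pending = None
    return delays


if __name__ == "__main__":
    for i, delay in enumerate(dial_delays(TRACE), start=1):
        print(f"call {i}: {delay:.3f} s from dial to call setup")
```

On this trace the first call starts about half a second after dialing, while the second one waits several seconds.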
Challenges
Ø Complex interactions in common scenarios
§ Inevitable interplay between radio, mobility, data/voice
§ Concurrent voice and data sessions
§ 3G/4G switch due to hybrid deployment, mobility, voice
Ø Two causes of problematic interactions
§ Design defects
§ Operation/implementation slips
Ø Nearly closed network
§ No full-stack source code
§ No access to states at the core
Diagnosis over a single layer/domain/system is insufficient
“Black-box”: limited information can be leveraged
A single type of test fails to unveil both issues
Our Solution: CNetVerifier
Ø Cellular-specific model checking
§ Extract full-stack cellular model from 3GPP standards
§ Create a variety of usage scenarios
§ Define desirable user-perspective properties
§ Discover counterexamples for possible design defects
[Figure: protocol stacks, usage settings, and desirable properties are fed to a model checker, which reports violated properties and counterexamples.]
Ø Phone-based experimental validation
§ Instrument end devices to verify design flaws
§ Discover operational slips in real networks
[Figure: counterexamples from the model checker guide scenario setup on real phones against the “black-box” network; outputs: design flaws and operational slips.]
Finding Overview
Three dimensions: cross-layer, cross-domain, cross-system
I. Necessary cooperation
II. Independent operations
Improper Cooperation: Cross-System
Ø Scenario: run data services during 4G→3G→4G
1. Set up 4G connectivity to access the Internet
2. 4G→3G: the 4G conn. context is converted to 3G for a seamless switch
3. 3G→4G: the 3G conn. context is converted back to 4G
[Figure: 4G conn. context (22.205.176.1) and 3G conn. context (22.205.176.1).]
Improper Cooperation: Cross-System
How and why?
Ø Problematic scenario: the 3G context is deactivated before returning to 4G
1. The 3G conn. context (131.179.176.1) is deleted.
2. 3G→4G: no 4G context without an existing 3G context: “out of service”
Causes of deletion (in 3GPP)
¤ Low-layer failures
¤ User disables data services
¤ Not enough resources
¤ ….
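The problematic scenario above is the kind of counterexample an exhaustive state search can find. The following is a minimal sketch of explicit-state checking over a toy two-system model. It is not CNetVerifier and not a model derived from 3GPP; the state encoding, action names, and property are all assumptions made for illustration.

```python
from collections import deque

# Toy state: (serving system, whether a PS connectivity context exists).
INITIAL = ("4G", True)


def transitions(state):
    """Yield (action, next_state) pairs of the hypothetical model."""
    system, has_context = state
    if system == "4G":
        # 4G -> 3G: the context, if any, is converted to a 3G context.
        yield "switch_to_3G", ("3G", has_context)
    else:
        # 3G -> 4G: the 3G context, if any, is converted back.
        yield "switch_to_4G", ("4G", has_context)
        if has_context:
            # In 3G the PS context is optional and may be deactivated.
            yield "deactivate_3G_context", ("3G", False)


def in_service(state):
    """Property: a device served by 4G must hold a PS context."""
    system, has_context = state
    return system != "4G" or has_context


def find_counterexample(initial, transitions, prop):
    """Breadth-first search for a shortest action sequence violating prop."""
    queue = deque([(initial, [])])
    seen = {initial}
    while queue:
        state, path = queue.popleft()
        if not prop(state):
            return path
        for action, nxt in transitions(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [action]))
    return None  # property holds in every reachable state


if __name__ == "__main__":
    print(find_counterexample(INITIAL, transitions, in_service))
```

Breadth-first order is used so that the reported trace is a shortest one, which keeps counterexamples easy to replay on a phone.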
Improper Cooperation: Cross-System
How and why?
Ø Root cause: different PS conn. management in 3G and 4G
§ 3G (PS+CS): PS context is optional
§ 4G (PS only): PS context is mandatory
§ Shared context is not well protected in 3G
Real-‐world impact
§ Occurs 3.1% in user study
§ “out-‐of-‐service” for up to 25s
Ø
Lessons: a design defect
§ Different demands of packet swHching in 3G & 4G
§ Desirable but not enforced: shared context should be
consistently protected in 4G & 3G
Ø
Proposed remedies
§ Avoid unnecessary 3G PS context deacHvaHon
§ Immediately enable 4G PS context reacHvaHon
24
þ
Improper Coopera-on: Cross-‐System
Improper Cooperation: Cross-Domain, Cross-System
Ø Scenario: voice calls for 4G users via 3G CS (CS Fallback)
1. To make a call, the 4G user switches to 3G (4G→3G)
2. When the call ends, the user returns to 4G (3G→4G)
Improper Cooperation: Cross-Domain, Cross-System
How and why?
Ø Problematic scenario: call with background data
1. A call triggers 4G→3G; data is migrated to 3G, too
2. When the call ends, no 3G→4G switch occurs (data is still on)
Improper Cooperation: Cross-Domain, Cross-System
How and why?
Ø Unexpected loop in RRC state machine
[Figure: 3G-RRC and 4G-RRC state machines (IDLE, CONN-ED) for two cases: voice only, and voice + data (certain setting).]
User gets stuck in 3G, losing 4G.
Root cause: the 3G-RRC state transition policy is not consistent with every cross-domain, inter-system option
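The loop can be illustrated with a few lines of code. This is a minimal sketch under two loudly stated assumptions that stand in for the "certain setting" on the slide: 3G-RRC is shared by the CS and PS domains and stays connected while either has traffic, and the 3G→4G switch is only triggered from the 3G-RRC IDLE state. All function names are hypothetical.

```python
def rrc_3g_state(voice_active, data_active):
    """Assumption: 3G-RRC is shared by CS and PS, so it stays connected
    while either domain has an active session."""
    return "CONN-ED" if (voice_active or data_active) else "IDLE"


def can_return_to_4g(voice_active, data_active):
    """Assumed policy: the 3G->4G switch is only triggered from RRC IDLE."""
    return rrc_3g_state(voice_active, data_active) == "IDLE"


def system_after_call(background_data):
    """Serving system once the CS Fallback call has ended (voice is off)."""
    return "4G" if can_return_to_4g(False, background_data) else "3G"


if __name__ == "__main__":
    print("voice only  :", system_after_call(background_data=False))
    print("voice + data:", system_after_call(background_data=True))
```

With background data the PS session keeps RRC connected, so under this policy the condition for returning to 4G is never met.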
Improper Cooperation: Cross-Domain, Cross-System
Ø Real-world impact
§ 62.1% of 4G users get stuck in 3G after the call
§ Stuck in 3G for 39.6 s on average
Ø Lessons: a design defect
§ 3G CS and 3G PS are indirectly coupled in RRC
§ State transition is not consistent with every 3G→4G option
Ø Proposed remedies
§ Revise the RRC state transition for possible settings
Improper Cooperation: Cross-Layer
How and why?
Ø Problematic scenario: messages are lost during registration, followed by a location update
[Figure: message sequence: Attach request, Attach accept, Attach complete, Location update, Location update response (error). States alternate between Deregistered and Registered and end in Deregistered.]
“Out of service” right after being attached
The upper layer (MM) assumes reliable, in-sequence signal transfer from the layer below, but the lower layer (RRC) does not guarantee it
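A toy replay makes the dependence on in-sequence delivery concrete. This is a minimal sketch, assuming a simplified network-side MM that rejects a location update from a device it does not yet consider registered; the message names, states, and rejection rule are hypothetical simplifications, not the 3GPP procedure.

```python
def run_network_mm(received):
    """Replay signaling messages in the order the network receives them.

    Returns (final_state, location_update_rejected).
    """
    state = "DEREGISTERED"
    attach_pending = False
    for msg in received:
        if msg == "attach_request":
            attach_pending = True  # network answers with attach accept
        elif msg == "attach_complete" and attach_pending:
            state, attach_pending = "REGISTERED", False
        elif msg == "location_update" and state != "REGISTERED":
            # Assumed behavior: the update is answered with an error and
            # the device ends up deregistered.
            return "DEREGISTERED", True
    return state, False


if __name__ == "__main__":
    in_order = ["attach_request", "attach_complete", "location_update"]
    lost = ["attach_request", "location_update"]  # attach_complete lost
    reordered = ["attach_request", "location_update", "attach_complete"]
    for name, seq in [("in order", in_order), ("lost", lost), ("reordered", reordered)]:
        print(name, run_network_mm(seq))
```

Only the in-order delivery leaves the device registered in this model; a lost or reordered attach complete ends in deregistration.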
Unnecessary Coupling: Cross-Layer
Ø Scenario: voice/data request with location update
1. Location update is triggered by MM (e.g., user moves)
2. After the location update, the user can send/receive voice and data
Unnecessary Coupling: Cross-Layer
How and why?
Ø Problematic scenario: voice/data request during the location update
1. Location update is triggered by MM (e.g., user moves): “Updating the location”
2. User dials out: the outgoing call is delayed
Unnecessary Coupling: Cross-Layer
How and why?
“Without user location, the cellular network cannot route user voice/data.”
However, outgoing voice/data requests can be routed without user location.
Root cause: unnecessary prioritization of location update over outgoing call/data
Unnecessary Coupling: Cross-Layer
Ø Real-world impact
§ Up to 8.3 s call delay and 4.1 s data delay
Ø Lessons: a design defect
§ Outgoing data/voice requests and location update are independent, but they are artificially correlated
Ø Proposed remedies
§ Decouple location update and outgoing data/voice requests
§ E.g., two parallel MM threads for different purposes
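The effect of the proposed decoupling can be sketched with simple timing arithmetic. The two functions below are hypothetical models, not the standard's behavior: one serializes an outgoing call behind an ongoing location update, the other lets the call proceed in parallel. The times in the example are made-up inputs, not measurements.

```python
def call_start_serialized(dial_time, lu_start, lu_duration):
    """Coupled model: a call dialed during a location update waits for it."""
    lu_end = lu_start + lu_duration
    if lu_start <= dial_time < lu_end:
        return lu_end
    return dial_time


def call_start_decoupled(dial_time, lu_start, lu_duration):
    """Decoupled model: call setup runs in parallel with the location update."""
    return dial_time


if __name__ == "__main__":
    # Hypothetical numbers: update starts at t=0.0 s and lasts 4.0 s;
    # the user dials at t=1.0 s.
    dial, lu_start, lu_len = 1.0, 0.0, 4.0
    print("serialized delay:", call_start_serialized(dial, lu_start, lu_len) - dial)
    print("decoupled delay :", call_start_decoupled(dial, lu_start, lu_len) - dial)
```

In the serialized model the delay equals the remaining location update time at the moment of dialing; in the decoupled model it is zero.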
Unnecessary Coupling: Cross-Domain
Ø Scenario: dial a call during data service in 3G
1. Access the Internet at full rate
2. Dial a call: data service rate declines by up to 74%
[Figure: Circuit Switching (CS) and Packet Switching (PS) in 3G; labeled rates: 10 Mbps, 2.5 Mbps, 12.2 Kbps.]
Root cause: voice and data have competing demands on the channel, but they have to share the radio channel
§ Voice: low rate, low loss (e.g., 16QAM)
§ Data: high rate, loss tolerant (e.g., 64QAM)
Unnecessary Coupling: Cross-System
Ø Scenario: 4G user tries to switch to 3G
1. Update 4G location after the 3G→4G switch, and notify the 3G MSC
2. 3G rejects the redundant update, so 4G later detaches the user
[Figure labels: “MSC unavailable”, “Detach”.]
Root cause: 3G improperly propagates internal failures to 4G
Summary: How Do Problems Happen?
Ø The layering rule is not fully honored
§ Functions from layers are not cleanly decoupled
Ø CS-voice and PS-data domains were designed to be independent, but are indirectly correlated
§ Concurrent data and voice use, misconfiguration
Ø Hybrid 3G/4G raises distributed system issues
§ Context sharing, fault isolation and tolerance
Related Work
Ø Protocol verification for the Internet
§ Since the 1990s
§ Single protocol with implementation
§ E.g., [Cohrs’89, SIGCOMM], [Holzmann’91], [Smith’96], TCP [NSDI’04], Routing [SIGCOMM’05], …
Ø Emerging techniques for network verification
§ E.g., Anteater [SIGCOMM’11], Header Space Analysis [NSDI’12], NICE [NSDI’12], Alloy [SIGCOMM’13], NetCheck [NSDI’14], Software Dataplane [NSDI’14], …
Ø Largely unexplored territory in cellular networks
§ Few efforts, e.g., 2G handoff [Orava’92], Authentication [Tang’13]
Takeaway
Ø The cellular network’s control plane is more complex, so it inevitably raises more issues
§ Real impact in real systems
Ø Uncover design and operation problems for signaling protocol interactions in three dimensions
§ Where might the problems be?
§ How to verify or even solve them?
• Real systems: too big, limited access (limitation of existing techniques)
• Standard + empirical study
Ø More rigorous design and analysis efforts are needed by the cellular and networking community