
Validating the System Behavior of Large-Scale Networked Computers


Academic year: 2021


(1)

Chen-Nee Chuah

Robust & Ubiquitous Networking (RUBINET) Lab http://www.ece.ucdavis.edu/rubinet

Electrical & Computer Engineering University of California, Davis

Validating the System Behavior of Large-Scale Networked Computers

(2)

Networked Computers: Some Observations

– Laptops

– Wireless sensors

– Hand-held devices (PDAs, etc.)

– Smart home appliances

– Supercomputer

– Intelligent transportation system

– Mars rover vehicle

Different capabilities/constraints

- Getting smaller & getting bigger

Different requirements

Explosive growth in numbers

We know how each individual component/layer behaves, but when they are interconnected and start interacting with each other, we become clueless!

(3)

What do we care about when we design networks?

End-to-end behavior

– Reachability

– Performance in terms of delay, losses, throughput

– Security

– Stability/fault-resilience of the end-to-end path

– …

System-wide behavior

– Load distribution within a domain

– Stability/Robustness/Survivability

– Manageability

– Evolvability and other X-ities

(4)

How do we know when we get there?

■ We know how to do the following fairly well:

– Prove correctness/completeness of stand-alone system or protocol

• E.g., algorithm complexity, convergence behavior

– Look at steady state, worst-case, and average scenario

• E.g., queuing models

– Run simulations/experiments to show improvement of protocol/architecture Z over A, B, C, D, …

■ What is lacking:

– End-to-end validation of the design solution or system behavior

• Is the system behavior what we really intended?

• How do we verify what type of behaviors/properties are ‘correct’ and what are ‘abnormal’?

– Verification of the system ‘dynamics’, e.g., how different components or network layers interact

(5)

Challenges

■ End-to-end system behavior depends on:

– Physical topology

– Routing protocols

– BGP policies

– NAT boxes, firewalls, packet filters, packet transformers

– Logical topology

– Traffic demand

■ Messy dependency graphs => there is a lot to model if we truly want to understand and be able to validate system behavior

(6)

Problem Areas

Validating

1. End-to-end network properties

– Example: end-to-end reachability and/or security

2. Interactions between multiple control loops (across protocol layers or between multiple entities)

– Example: overlay/IP-layer routing

3. Measurement/monitoring methodologies

– How do we know we’re measuring the traffic features that are really important instead of distorting them?

(7)

End-to-End Reachability/Security

■ When user A sends a packet from a source node S to a destination node D in the Internet:

– How do we verify that a route indeed exists between S and D?

– How do we verify that the packet follows a path that adheres to inter-domain peering relationships?

– How do we verify that this end-to-end connection satisfies some higher-level security policy?

• E.g., only user A can reach D and other users are blocked?

■ Answer depends on:

– Router configurations & BGP policies

(8)

Distributed Firewalls

[Figure: distributed firewalls across ISP A and ISP B]

(9)

Validating End-to-End Reachability/Security

Effectiveness of firewalls depends on (mis)configuration!

– Policy violation

– Inconsistency: shadowing, generalization, …

How do we verify configuration of firewall rules?

– Borrow model checking techniques from software programming

Example static analysis approach

– Control flow analysis: possible flow path

– Data flow analysis: catching anomalies
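
The inconsistencies named above can be illustrated with a minimal Python sketch. It is not the authors' tool: the `Rule` type, the `find_anomalies` helper, and the example rule list are hypothetical, rules match on source prefix only, and only pairs of rules are compared (a rule hidden by the union of several earlier rules would be missed). A later rule is flagged as shadowed when an earlier rule with a different action already covers everything it matches, and as a generalization when it covers a superset of an earlier rule with a different action.

```python
import ipaddress
from typing import NamedTuple


class Rule(NamedTuple):
    """Hypothetical firewall rule: match on source prefix, then act."""
    src: str     # source prefix in CIDR notation
    action: str  # "accept" or "drop"


def find_anomalies(rules):
    """Pairwise check of an ordered rule list (first match wins)."""
    nets = [ipaddress.ip_network(r.src) for r in rules]
    found = []
    for j in range(len(rules)):
        for i in range(j):  # i is an earlier rule, j a later one
            if rules[i].action == rules[j].action:
                continue  # same action: at worst redundant, not reported here
            if nets[j].subnet_of(nets[i]):
                # Every packet matched by rule j is already taken by rule i.
                found.append(("shadowing", i, j))
            elif nets[i].subnet_of(nets[j]):
                # Rule j is a broader rule that overrides part of its range.
                found.append(("generalization", i, j))
    return found


# Assumed example configuration, for illustration only.
rules = [
    Rule("10.0.0.0/8", "drop"),
    Rule("10.1.0.0/16", "accept"),  # never fires: covered by rule 0
    Rule("0.0.0.0/0", "accept"),    # generalizes rule 0
]
anomalies = find_anomalies(rules)
```

A real checker would also match on destination, ports, and protocol, and would track the set of packets still unmatched at each rule rather than comparing pairs.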

(10)

[Figure: IPX Model. Multiple access lists in sequential order: start, then filter 1 … filter n (each accept or drop), then an implicit drop.]

[Figure: IPtable / Netfilter Model, modeled as function calls. Built-in chain X runs filter 1 … filter N and can jump to user chains Y and Z; each filter accepts, drops, or returns to the calling chain, and the built-in chain ends with its policy action.]
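
The "modeled as function calls" view of the chain model can be sketched in a few lines of Python. This is an illustrative model, not Netfilter code: the chain names, the match predicates, and the packet fields (`proto`, `dport`) are assumptions. Jumping to a user chain is a function call, `return` (or falling off the end of a user chain) hands control back to the caller, and the built-in chain's policy applies when nothing decides.

```python
def run_chain(chains, name, packet):
    """Evaluate one chain; return 'accept', 'drop', or None (no decision)."""
    for match, target in chains[name]:
        if not match(packet):
            continue
        if target in ("accept", "drop"):
            return target
        if target == "return":
            return None  # hand control back to the calling chain
        # Any other target is the name of a user chain: call it.
        verdict = run_chain(chains, target, packet)
        if verdict is not None:
            return verdict
        # The user chain made no decision: continue with the next filter.
    return None


def filter_packet(chains, policy, packet, builtin="X"):
    """Run the built-in chain and fall back to its policy action."""
    verdict = run_chain(chains, builtin, packet)
    return verdict if verdict is not None else policy


# Assumed example rule set: built-in chain X and user chain Y.
chains = {
    "X": [
        (lambda p: p["dport"] == 23, "drop"),
        (lambda p: p["proto"] == "tcp", "Y"),
        (lambda p: p["dport"] == 53, "accept"),
    ],
    "Y": [
        (lambda p: p["dport"] == 22, "return"),
        (lambda p: p["dport"] == 80, "accept"),
    ],
}

web = filter_packet(chains, "drop", {"proto": "tcp", "dport": 80})
ssh = filter_packet(chains, "drop", {"proto": "tcp", "dport": 22})
```

In the sequential access-list model, by contrast, there is only one list and the fallback is an implicit drop, so `run_chain` without the jump branch would suffice.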

(11)

Binary Decision Diagram (BDD) Representations

[Figure: BDDs over the source-IP bits sip[7] … sip[0] and the source-port bits, for the predicates: Source IP = 10.0.0.0/8; Source Port > 1023; Source Port < 49152; 1023 < Source Port < 49152.]
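
To make the BDD representation concrete, here is a small Python sketch that builds a reduced ordered BDD for the predicate "Source IP = 10.0.0.0/8" over the eight first-octet bits sip[7] … sip[0]. It is a naive construction by exhaustive Shannon expansion, chosen for readability; the function names (`build_bdd`, `evaluate`, `to_bits`) are made up for this example, and a practical tool would use a BDD library with proper apply operations instead of enumerating all inputs.

```python
def build_bdd(pred, nvars):
    """Build a reduced ordered BDD for pred over nvars boolean variables.

    Nodes are tuples (var, low, high); terminals are True/False.
    Returns (root, unique_table).
    """
    unique = {}  # (var, low, high) -> shared node, so equal subgraphs merge

    def mk(var, low, high):
        if low == high:  # the test on var is redundant: skip the node
            return low
        key = (var, low, high)
        return unique.setdefault(key, key)

    def expand(bits):
        if len(bits) == nvars:
            return bool(pred(bits))
        var = len(bits)  # variables are tested in index order
        return mk(var, expand(bits + (0,)), expand(bits + (1,)))

    return expand(()), unique


def evaluate(node, bits):
    """Follow one root-to-terminal path for a concrete assignment."""
    while not isinstance(node, bool):
        var, low, high = node
        node = high if bits[var] else low
    return node


def to_bits(octet):
    """First-octet bits, most significant first (index 0 is sip[7])."""
    return tuple((octet >> (7 - i)) & 1 for i in range(8))


# Predicate: first octet equals 10, i.e. source IP in 10.0.0.0/8.
root, nodes = build_bdd(lambda bits: bits == to_bits(10), 8)
in_prefix = evaluate(root, to_bits(10))
outside = evaluate(root, to_bits(11))
```

For this predicate the reduced diagram is a single chain with one node per bit, which is why prefixes and port ranges are cheap to store and combine in this form.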

(12)

Network of Firewalls: Remaining Issues

■ How do we validate/verify dynamic behavioral changes?

– With multi-homing and dynamic load-balancing, the end-to-end path and sequence of firewalls traversed could change over time

– Adaptation of firewall rules on demand depending on applications

■ How do we optimize firewall configurations?

– Inter-firewall & inter-path optimization

• Must interface with routing plane

– Heavy traffic ‘accepted’ first?

(13)

#2: Interaction between Multiple Control Loops

Example 1: Overlay/IP-layer Interactions

■ Overlays compete with the IP layer to control routing decisions

– ISPs & overlays are unaware of decisions made by the other layer

– Multiple overlay networks co-exist and make independent decisions

■ Side Effects

(a) Challenges to ISP’s traffic engineering (TE)

• The traffic matrix (TM), essential for most TE tasks, becomes harder to estimate

• Overlays shift and/or duplicate TM values, making the TM more dynamic

(b) Multiple overlays can get synchronized

• Interfere with load balancing or failure restoration, leading to oscillations

(c) Coupling of multiple ASes

• Overlay networks may respond to failures in one AS by shifting traffic in an upstream AS.

(14)

(b) Race Conditions & Load Oscillations

Multiple overlays can get synchronized!

[Figure: example topology (nodes A, B, C, D, E, F, H) with per-link loads, shared by Overlay-1 and Overlay-2; one link is marked X. Link load > 50 is overload.]

- Synchronization is a result of:

• Periodic nature of path probing process

• Partial/full overlap of primary and alternate paths
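
The oscillation described above can be reproduced with a toy Python simulation. Everything numeric here is an assumption made for illustration (two overlapping paths P and Q, a background load of 15, and 20 units of traffic per overlay), apart from the overload threshold of 50 taken from the slide. Each overlay moves its traffic to the other path whenever the path it uses is overloaded. When both overlays probe at the same instant they act on the same stale view and swap back and forth; when their probes are staggered, the second overlay sees the first one's move and stays put.

```python
OVERLOAD = 50  # link load above this value counts as overload


def other(path):
    return "Q" if path == "P" else "P"


def load(path, choices, base, demand):
    """Background load plus the demand of every overlay using the path."""
    return base + demand * sum(1 for c in choices if c == path)


def simulate(rounds, synchronized, base=15, demand=20):
    """Return the path chosen by each of two overlays after every round."""
    choices = ["P", "P"]  # both overlays start on the same path
    history = []
    for _ in range(rounds):
        if synchronized:
            # Both overlays probe at once and decide on the same snapshot.
            snapshot = list(choices)
            for i in range(2):
                if load(snapshot[i], snapshot, base, demand) > OVERLOAD:
                    choices[i] = other(snapshot[i])
        else:
            # Staggered probing: overlay 1 sees overlay 0's latest move.
            for i in range(2):
                if load(choices[i], choices, base, demand) > OVERLOAD:
                    choices[i] = other(choices[i])
        history.append(tuple(choices))
    return history


sync_history = simulate(4, synchronized=True)
staggered_history = simulate(4, synchronized=False)
```

In the synchronized run both overlays keep landing on the same path, so one path is always at 55 and overloaded; in the staggered run the load settles at 35 on each path after the first round.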

(15)

Insights from Economic & Social Foundations

Related Studies

■ Qiu et al. investigate the performance of selfish routing of multiple co-existing overlays [QYZ03]

– Optimal average latency is achieved at the cost of overloading some links

■ Liu et al. model the interaction between IP traffic engineering and overlay routing as a two-player game [LZ+05]

Other example problems

■ Tuning IGP routing protocol parameters

– Stability vs. fast convergence

(16)

#3: Measurement/Monitoring Methodologies

Network measurement/monitoring is traditionally useful for network design and traffic engineering purposes

– E.g., given a topology, how to select the optimal set of IGP link weights for routing all origin-destination (OD) pairs so that load is distributed evenly across the network.

Increasingly important for anomaly detection & security forensics

– E.g., online detection of DoS/DDoS attacks, worm/virus propagation, flash crowd, etc.

(17)

Challenges: high data speed, limited storage

– ‘Sampling’ is typically done to reduce overhead

Questions:

– What is the optimal sampling rate?

– Does sampling preserve the traffic features that are crucial for anomaly detection (in addition to volume estimation for TE)?

– Can we sample less if we collect measurements at more points?
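
The second question can be made concrete with a small Python sketch. The traffic mix is invented for illustration (one heavy flow of 9,850 packets followed by 50 three-packet flows, as a scan might produce), and deterministic 1-in-N sampling is assumed so the outcome does not depend on a random seed. Scaling the sample count by N recovers the total volume, which is what TE needs, while most of the small flows that an anomaly detector would care about never appear in the sample.

```python
def sample_every_nth(packets, n):
    """Deterministic 1-in-n sampling: keep every n-th packet."""
    return packets[n - 1::n]


def summarize(packets, n):
    """Compare what the full trace shows with what the sample shows."""
    sampled = sample_every_nth(packets, n)
    return {
        "true_packets": len(packets),
        "estimated_packets": len(sampled) * n,  # volume estimate for TE
        "true_flows": len(set(packets)),
        "sampled_flows": len(set(sampled)),     # flows visible to a detector
    }


# Assumed trace: each packet is represented by its flow label only.
packets = ["heavy"] * 9850 + [
    f"small-{flow}" for flow in range(50) for _ in range(3)
]
stats = summarize(packets, 100)
```

Rerunning `summarize` with a smaller `n` shows the trade-off directly: more small flows become visible, at the cost of proportionally more samples to store and process.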

(18)

Summary

1. Validate end-to-end security/reachability properties

– Example: firewall

– Useful toolkit:

• Model checking from software programming

• Combinatorial optimization

2. Model system dynamics and interactions between entities

– Example: overlay/IP-layer routing

– Borrow economic models: game theory

3. Verify measurement/monitoring methodologies

– How do we know we’re measuring the traffic features that are really important instead of distorting them?
