Understanding Flow and Packet Deduplication
© 2012 Riverbed Technology. All rights reserved.
This paper describes two different concepts used by the Riverbed® Cascade® product family architecture: flow deduplication, which is used by Riverbed® Cascade® Gateway software and Riverbed® Cascade® Profiler appliances, and packet deduplication, which is used by Riverbed® Cascade® Shark products and Riverbed® Cascade® Sensor appliances.
What is a flow?
A flow is a set of IP packets in the network that all share a common set of attributes. A typical flow is based on the 5-tuple:
1. Source IP
2. Destination IP
3. Protocol
4. Source Port
5. Destination Port
It also includes additional information such as:
Number of bytes transmitted
Number of packets transmitted
Inbound and Outbound interfaces
CoS/QoS markings
TCP Flags used
In general, a flow is unidirectional; for example, it describes only half of a TCP connection. A flow may also be defined by only a subset of the available attributes, such as just <SrcIP, DstIP>.
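As a rough illustration of the definition above, the following Python sketch groups packets into unidirectional flows keyed on the 5-tuple and accumulates byte and packet counts. The `FlowKey` and `Packet` types and the `build_flows` function are hypothetical names chosen for this example; they are not part of any Cascade product or export format.

```python
from collections import defaultdict
from typing import NamedTuple


class FlowKey(NamedTuple):
    """The 5-tuple that identifies a unidirectional flow."""
    src_ip: str
    dst_ip: str
    protocol: int
    src_port: int
    dst_port: int


class Packet(NamedTuple):
    """Hypothetical packet summary holding only the fields used here."""
    src_ip: str
    dst_ip: str
    protocol: int
    src_port: int
    dst_port: int
    length: int


def build_flows(packets):
    """Group packets by 5-tuple and count bytes and packets per flow."""
    flows = defaultdict(lambda: {"bytes": 0, "packets": 0})
    for p in packets:
        key = FlowKey(p.src_ip, p.dst_ip, p.protocol, p.src_port, p.dst_port)
        flows[key]["bytes"] += p.length
        flows[key]["packets"] += 1
    return dict(flows)


packets = [
    Packet("10.0.0.1", "10.0.0.2", 6, 40000, 80, 100),   # client to server
    Packet("10.0.0.1", "10.0.0.2", 6, 40000, 80, 200),   # client to server
    Packet("10.0.0.2", "10.0.0.1", 6, 80, 40000, 1500),  # server to client
]
flows = build_flows(packets)
```

Because the key is order-sensitive, the two directions of the TCP connection end up as two separate flows, which matches the unidirectional definition given above.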
Who exports a flow?
Most Enterprise-class routers
Cascade Sensor appliance
Cascade Shark appliance
Some switches (Layer 3 switches)
WAN optimizers (Riverbed® Steelhead® products, Juniper)
Some other devices (Packeteer, nProbe)
Types of flow Riverbed Cascade supports
| Type of Flow | Description | Supported Vendors |
|---|---|---|
| NetFlow v5 | Widely in use, supported by multiple vendors; fixed-content flow record with basic counters/info; "generally" supports ingress only | Cisco, Riverbed |
| NetFlow v9 | Drastic increase in available fields; templates allow customization of data collected; official support for ingress and egress flows | Cisco, Riverbed |
| J-Flow | NetFlow-like variants; generally look like NetFlow v5 | Juniper |
| Bluecoat Packeteer FDR | Includes flow record values plus layer-7 identifier | Bluecoat |
| sFlow | sFlow uses sampled packets for network monitoring | HP, Brocade, Extreme Networks |
| IPFIX (IP Flow Information Export) | Similar to the NetFlow protocol; IPFIX considers a flow to be any number of packets observed in a specific timeslot and sharing a number of properties, e.g. same source, same destination, same protocol | |
| VMware NetFlow | Similar to NetFlow v5; VXLAN (Virtual Extensible LAN) information | VMware |
| Steelhead Cascade Flow | Performance metrics: network RTT / response time; WAN interface identification; TCP retransmissions | Riverbed |
| Cascade Sensor Flow | NetFlow variant for Cascade use; includes L7 application tag; includes performance metrics: network RTT / response time; TCP retransmissions | Riverbed |
| Cascade Shark Flow | NetFlow-like variant for Cascade use; includes performance metrics: network RTT / response time; includes TCP retransmissions | Riverbed |
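To make the "fixed content" nature of a NetFlow v5 record concrete, here is a minimal Python sketch that decodes one 48-byte v5 flow record with the standard `struct` module. The field layout follows the publicly documented NetFlow v5 record format. The function name `parse_v5_record` and the dictionary keys are choices made for this example, and the 24-byte export packet header that precedes the records is not handled here.

```python
import socket
import struct

# NetFlow v5 flow record: 48 bytes, network byte order.
# srcaddr, dstaddr, nexthop, input, output, dPkts, dOctets, first, last,
# srcport, dstport, pad1, tcp_flags, prot, tos, src_as, dst_as,
# src_mask, dst_mask, pad2
V5_RECORD = struct.Struct("!4s4s4sHHIIIIHHBBBBHHBBH")


def parse_v5_record(data: bytes) -> dict:
    """Decode a single 48-byte NetFlow v5 record into a dictionary."""
    (src, dst, nexthop, in_if, out_if, pkts, octets, first, last,
     sport, dport, _pad1, tcp_flags, proto, tos,
     src_as, dst_as, src_mask, dst_mask, _pad2) = V5_RECORD.unpack(data)
    return {
        "src_ip": socket.inet_ntoa(src),
        "dst_ip": socket.inet_ntoa(dst),
        "protocol": proto,
        "src_port": sport,
        "dst_port": dport,
        "in_if": in_if,
        "out_if": out_if,
        "packets": pkts,
        "bytes": octets,
        "tcp_flags": tcp_flags,
        "tos": tos,
    }


# Build a synthetic record so the sketch can be run without a live exporter.
sample = V5_RECORD.pack(
    socket.inet_aton("10.0.0.1"), socket.inet_aton("10.0.0.2"),
    socket.inet_aton("0.0.0.0"),
    1, 2,            # input and output interface indexes
    10, 1500,        # packets and bytes
    0, 1000,         # first and last timestamps (sysUptime, ms)
    40000, 80,       # source and destination ports
    0, 0x18, 6, 0,   # pad, TCP flags (PSH+ACK), protocol (TCP), ToS
    0, 0, 24, 24, 0, # AS numbers, prefix masks, pad
)
record = parse_v5_record(sample)
```

Every v5 record has the same fields at the same offsets, which is why a single format string is enough. NetFlow v9 and IPFIX instead describe their records with templates, so a decoder for those has to read the template first.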
What is Flow Deduplication?
Flow deduplication is the process of collating and normalizing reports from multiple sources about the same flow. Multiple flow exporters in the network can report to the same flow collector (for example, Cascade Gateway), which can result in multiple flow records describing the same network traffic.
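The idea can be sketched in a few lines of Python. This paper does not describe the algorithm Cascade actually uses, so the sketch rests on a stated assumption: records with the same 5-tuple from different exporters describe the same traffic, and the volume is counted once by keeping the largest report rather than summing them. The record layout and the `deduplicate` function are hypothetical.

```python
def deduplicate(records):
    """Merge records that describe the same flow into one entry per 5-tuple.

    Each input record is a dict with the keys: key (the 5-tuple),
    exporter, in_if, out_if, bytes, packets.
    """
    merged = {}
    for rec in records:
        agg = merged.setdefault(
            rec["key"], {"bytes": 0, "packets": 0, "observations": []}
        )
        # Assumption: count the traffic once by keeping the largest
        # report instead of adding up what every exporter saw.
        agg["bytes"] = max(agg["bytes"], rec["bytes"])
        agg["packets"] = max(agg["packets"], rec["packets"])
        # Keep the per-exporter, per-interface detail alongside the total.
        agg["observations"].append({
            "exporter": rec["exporter"],
            "in_if": rec["in_if"],
            "out_if": rec["out_if"],
            "bytes": rec["bytes"],
            "packets": rec["packets"],
        })
    return merged


key = ("10.0.0.1", "10.0.0.2", 6, 40000, 80)
records = [
    {"key": key, "exporter": "router-a", "in_if": 1, "out_if": 2,
     "bytes": 1000, "packets": 10},
    {"key": key, "exporter": "router-b", "in_if": 3, "out_if": 4,
     "bytes": 990, "packets": 10},
]
flows = deduplicate(records)
```

Summing the two reports would double-count the conversation as roughly 2000 bytes. The merged entry reports it once while still recording what each exporter saw on which interfaces.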
Why Flow Deduplication and Coalescing?
A typical client-to-server connection traverses several segments of the network, often both LAN and WAN. Each segment has the possibility of congestion or packet loss, and each router may cause queuing loss or QoS changes. A connection may also take an asymmetric path, meaning the client-to-server path isn't the same as the server-to-client path. All of these factors are part of the daily realities that engineers must deal with when troubleshooting.
While all vendors enable you to report on the traffic seen at a given observation point (a NetFlow source or a probe on the wire), this leaves operators with 2, 3, or even 8 or more individual reports to examine for each connection. Even the simple example diagram above would give operators 4 different values, depending on which observation point was being reported on. Needing to know what path a conversation took, and manually reconciling reports from each observation point, is a cumbersome and lengthy process.
When a Cascade Profiler appliance sees multiple flow exporters in the network, each reporting the same conversation, it automatically deduplicates that traffic into a single record. Cascade takes care to preserve any per-interface data. Further, Cascade recognizes that data beyond NetFlow may have value. Cascade Sharks, Sensors, WAN optimizers, shapers, capture appliances, load balancers, and other devices may all see the conversation and have valuable additional data to share about it. This process of integrating additional connection metrics, such as network round trip time, server delay, and layer-7 application name, is called data coalescing. Cascade can examine a conversation end-to-end and report the path taken in each direction, as well as byte counts, QoS markings, round trip times, and so on. And because some of these factors, such as round trip time, apply to the connection as a whole, even an interface that did not measure RTT (a simple NetFlow exporter) can still be aware of it when reporting.
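A small Python sketch may help show what coalescing means in practice. It is not Cascade's implementation: the report layout, the `coalesce` and `report_for` functions, and the rule that the first source to supply a metric wins are all assumptions made for illustration.

```python
def coalesce(reports):
    """Combine partial reports about one conversation into a single view."""
    conversation = {"observed_at": [], "metrics": {}, "tags": set()}
    for rep in reports:
        # Remember every observation point that saw the conversation.
        conversation["observed_at"].append((rep["source"], rep.get("interface")))
        for name, value in rep.get("metrics", {}).items():
            # Assumption: the first source to supply a metric wins.
            conversation["metrics"].setdefault(name, value)
        conversation["tags"].update(rep.get("tags", ()))
    return conversation


def report_for(conversation, source):
    """Per-observation-point view that still carries the shared metrics."""
    interfaces = [i for s, i in conversation["observed_at"] if s == source]
    return {
        "source": source,
        "interfaces": interfaces,
        **conversation["metrics"],
        "tags": sorted(conversation["tags"]),
    }


reports = [
    {"source": "router-a", "interface": "Gi0/1"},        # plain NetFlow exporter
    {"source": "shark-1", "metrics": {"rtt_ms": 42.0}},  # measured round trip time
    {"source": "sensor-1", "metrics": {"server_delay_ms": 7.5}, "tags": {"HTTP"}},
]
conversation = coalesce(reports)
view = report_for(conversation, "router-a")
```

Here `router-a` never measured round trip time or identified the application, yet its view of the conversation includes both, because those values were contributed by other sources and attached to the shared record.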
Benefits of flow deduplication and coalescing
End-to-end visibility for a connection
Report on any element or component in the network (IP address of server, TCP port, application, QoS, etc., or any combination thereof) without having to first select an interface or observation point
Identify the path a conversation has taken as well as all metrics along the way in a single report
Accurately report on a conversation even if it takes an asymmetric path
Identify changing QoS tags per hop and in each direction
Greatly simplified and more powerful monitoring: with conversations now identified as a single entity, anomaly and policy-based alarming is much simpler to configure, is more comprehensive, and eliminates duplicate notifications of the same event
Continuous drill down and pivoting between different data views
With the ability to associate all records of a conversation together, navigating that data can take on many new dimensions that are not available when you must report interface by interface
Simplified and automated WAN optimization bandwidth reduction reporting
Shared knowledge as reported by different sources
Tagging by one source associates the tag with the flows from all other sources in the aggregate flow
Minimizes storage requirements: common information is stored only once
Why Packet Deduplication?
It is common for a single packet capturing device (such as Cascade Shark or Cascade Sensor) to be fed copies of the same data from the same network. For an easy deployment model, it is typical to use SPAN (also referred to as port mirroring) to send multiple VLANs or multiple ports on a switch or router to the same packet capturing device. The problem with this model is that the packet capturing device may see the exact same IP packet multiple times even though there was no errant network behavior. This happens because the switch being monitored may send one copy of the IP packet as it enters a VLAN (or port), and a second copy of the same IP packet as it leaves that VLAN (or port) or as it enters the next VLAN. If all this traffic goes to the same packet capturing device, the device may conclude that data is being retransmitted by the sender and may also over-count the volume of data associated with the conversation.
Riverbed Cascade Sensor/Shark uses packet deduplication to avoid counting the same IP packet multiple times and mistaking the extra copies for retransmissions. This is an optional feature that may be enabled or disabled on a per-port basis. Enabling this feature when multiple VLANs are SPANed ensures that conversations have correct packet counts and that only true retransmissions are reported.
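One common way to approach this kind of packet deduplication is to hash each packet and drop any packet whose hash was already seen within a short time window. The Python sketch below shows that general technique only. It is not a description of how Cascade Sensor or Shark implement the feature, and the `PacketDeduplicator` class, the 5 ms window, and the assumption that mirrored copies are byte-identical are all illustrative.

```python
import hashlib
from collections import OrderedDict


class PacketDeduplicator:
    """Flag packets already seen within a short time window."""

    def __init__(self, window_seconds=0.005):
        self.window = window_seconds
        # digest -> timestamp of the copy that was kept, oldest first.
        self.seen = OrderedDict()

    def is_duplicate(self, timestamp, ip_packet: bytes) -> bool:
        """Return True if an identical packet was kept within the window.

        Timestamps are assumed to be non-decreasing.
        """
        # Expire entries that have fallen out of the window.
        while self.seen:
            old_digest, old_ts = next(iter(self.seen.items()))
            if timestamp - old_ts > self.window:
                self.seen.popitem(last=False)
            else:
                break
        digest = hashlib.sha1(ip_packet).digest()
        if digest in self.seen:
            return True
        self.seen[digest] = timestamp
        return False


dedup = PacketDeduplicator()
packet = b"example IP packet bytes"
results = [
    dedup.is_duplicate(0.000, packet),  # first copy, seen entering the VLAN
    dedup.is_duplicate(0.001, packet),  # mirrored copy, seen leaving the VLAN
    dedup.is_duplicate(1.000, packet),  # same bytes one second later
]
```

The window is what separates a mirrored copy from a genuine retransmission: a copy arriving within milliseconds is dropped, while the same bytes arriving a second later fall outside the window and are counted. Real mirrored copies can differ slightly (for example, in VLAN tag), so a practical implementation would have to decide which bytes to include in the hash.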