Value-Added Services and Service Chaining: Deployment Considerations and Challenges


An Industry Whitepaper

Executive Summary

In telecommunications, value-added services (VAS) come in many forms, including consumer and enterprise, and those that generate incremental revenue and those that do not.

To deploy a single VAS, there are two practical options available to a CSP: integration and redirection.

A topic closely related to VAS is service chaining (i.e., service function chaining), a technique for selecting and steering data traffic flows through various ‘service functions’ that is being investigated and developed by the Internet Engineering Task Force (IETF) Network Working Group. In order to realize the full promise and potential of VAS and service chaining, the IETF has identified a number of challenges that a VAS deployment approach must overcome.

These challenges, or problem areas, provide a framework by which potential enablement solutions can be evaluated and compared.

An integrated approach is viable if a CSP has a firm understanding of precisely what service functions they want to deploy, if the list of service functions is very small and unlikely to change, and if the integrated service functions are of acceptable quality as to fulfill the requirements.

However, the redirection-based enablement is the superior option overall: when done correctly, redirection can overcome all of the challenges identified by the IETF. Most importantly, redirection preserves choice and flexibility – the CSP is free to choose any vendor for any service function, and can introduce or remove service functions as needs change over time.


Introduction to Value-Added Services and Service Chaining

A value-added service (VAS) is a service that is not a core requirement. In telecommunications, VAS come in many forms, including consumer and enterprise, and those that generate incremental revenue and those that do not. A single service might fall into multiple categories: for instance, a parental control function might be available for an incremental subscription fee at one communications service provider (CSP), but might be included as a market differentiator at another CSP. Alternatively, it might cost extra to subscribers in a basic service tier, but be included for those in a higher tier.

There are also services that a CSP might have to implement due to regulatory requirement (e.g., URL filtering with the Internet Watch Foundation); functionally, these can be considered equivalent to VAS. A topic closely related to VAS is service chaining (i.e., service function chaining), a technique for selecting and steering data traffic flows through various ‘service functions’ that is being investigated and developed by the Internet Engineering Task Force (IETF) Network Working Group. A service function need not necessarily be a value-added service, but it certainly can be, and the general challenges associated with service chaining apply to enabling value-added services.

In order to realize the full promise and potential of VAS and service chaining, there are a number of challenges that must be overcome, some of which are described in the work-in-progress Internet-Draft document “Network Service Chaining Problem Statement” [1].

To keep things as simple as possible, this whitepaper uses the same definitions found in the “Network Service Chaining Problem Statement” Internet-Draft:

• Service Function: A function that is responsible for specific treatment of received packets. A service function can act at the network layer or other OSI layers. A service function can be a virtual instance or be embedded in a physical network element. One or multiple service functions can be embedded in the same network element. Multiple instances of the service function can be enabled in the same administrative domain. A non-exhaustive list of service functions includes: firewalls, WAN and application acceleration, Deep Packet Inspection (DPI), server load balancers, NAT44 [RFC3022], NAT64 [RFC6146], HOST_ID injection [RFC6967], HTTP Header Enrichment functions, TCP optimizer, etc. The generic term "L4-L7 services" is often used to describe many service functions.

• Service Function Chain (SFC): A service function chain defines an ordered set of service functions that must be applied to packets and/or layer-2 frames selected as a result of classification. The implied order may not be a linear progression as nodes may copy to more than one branch. The term service chain is often used as shorthand for service function chain.

• Service Function Path (SFP): The instantiation of a service function chain in the network. Packets follow a service function path from a classifier through the required instances of service functions in the network.

• Service Node (SN): Physical or virtual element that hosts one or more service functions.
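The relationship between these four terms can be sketched as a small data model. This is only an illustration under stated assumptions: the class names, the `instantiate_path` helper, and the "first hosting node wins" placement rule are hypothetical and do not come from the Internet-Draft, and the chain is modeled as a simple linear list even though the definition above allows branching.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class ServiceFunction:
    """A function that applies a specific treatment to received packets."""
    name: str  # illustrative label, e.g. "firewall" or "dpi"


@dataclass
class ServiceNode:
    """Physical or virtual element that hosts one or more service functions."""
    node_id: str
    virtual: bool
    hosted: List[ServiceFunction] = field(default_factory=list)


@dataclass
class ServiceFunctionChain:
    """Ordered set of service functions applied to classified traffic."""
    name: str
    functions: List[ServiceFunction]


def instantiate_path(chain: ServiceFunctionChain,
                     nodes: List[ServiceNode]) -> List[Tuple[str, str]]:
    """Build a service function path: one hosting node per chain step.

    Assumption: the first node that hosts a function is chosen; a real
    deployment would likely weigh load, locality, and health instead.
    """
    path = []
    for sf in chain.functions:
        node = next((n for n in nodes if sf in n.hosted), None)
        if node is None:
            raise LookupError(f"no service node hosts {sf.name}")
        path.append((sf.name, node.node_id))
    return path


firewall = ServiceFunction("firewall")
dpi = ServiceFunction("dpi")
nodes = [
    ServiceNode("sn-1", virtual=False, hosted=[firewall]),
    ServiceNode("sn-2", virtual=True, hosted=[dpi, firewall]),
]
chain = ServiceFunctionChain("web-protect", [dpi, firewall])
path = instantiate_path(chain, nodes)
```

Here the chain is the abstract ordering (DPI, then firewall), while the path is its concrete instantiation on particular service nodes.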

Deployment Architecture Alternatives

To deploy a single VAS, there are three main options available to a CSP: redirection (Figure 1), integration (Figure 2), and dedicated inline (Figure 3).

[1] Which can be found here: http://datatracker.ietf.org/doc/draft-ietf-sfc-problem-statement/


Figure 1 - Redirection: the VAS deployments (service functions) are separate from the data path, and a data path component (e.g., PCEF) redirects traffic

Figure 2 - Integration: the service functions are integrated within an element already in the data path, and redirection is local

Figure 3 - Inline: the service functions are all deployed in the data path; this deployment is not a practical option

The inline model can be immediately disregarded for at least two reasons: the added complexity and risk of having many inline devices, and the strict/fixed order in which traffic passes through the service functions.


Considerations for Service Chain Enablement

The IETF has identified 12 ‘Problem Areas’ that represent the primary challenges related to service chaining. The italicized text in the subsections below is reproduced from the IETF problem statement draft; the remainder explains the implications for operators when they are choosing how to deploy value-added service chains.

Topological Dependencies

Network service deployments are often coupled to network topology, whether it be real or virtualized, or a hybrid of the two. Such dependency imposes constraints on the service delivery, potentially inhibiting the network operator from optimally utilizing service resources, and reduces the flexibility. This limits scale, capacity, and redundancy across network resources.

These topologies serve only to "insert" the service function (i.e., ensure that traffic traverses a service function); they are not required from a native packet delivery perspective. For example, firewalls often require an "in" and "out" layer-2 segment and adding a new firewall requires changing the topology (i.e., adding new layer-2 segments). As more service functions are required - often with strict ordering - topology changes are needed before and after each service function resulting in complex network changes and device configuration. In such topologies, all traffic, whether a service function needs to be applied or not, often passes through the same strict order.

The topological coupling limits placement and selection of service functions: service functions are "fixed" in place by topology and therefore placement and service function selection taking into account network topology information is not viable. Furthermore, altering the services traversed, or their order, based on flow direction is not possible. A common example is web servers using a server load balancer as the default gateway. When the web service responds to non-load balanced traffic (e.g., administrative or backup operations) all traffic from the server must traverse the load balancer forcing network administrators to create complex routing schemes or create additional interfaces to provide an alternate topology.

The key takeaway from this problem area is that, to reduce the complexity of introducing new service functions and to maintain flexibility (e.g., to support real or virtualized functions, and to support varied placement and selection of service functions), the enabling deployment must abstract the service function from the physical network topology.

The redirection model can achieve this requirement, while the integrated solution fails - only those service functions that can be integrated are supported, so there is a dependency that extends far beyond topology and ultimately limits placement, selection, and format (e.g., physical or virtual) of available service functions.

Configuration Complexity

A direct consequence of topological dependencies is the complexity of the entire configuration, specifically in deploying service function chains. Simple actions such as changing the order of the service functions in a service function chain require changes to the topology. Changes to the topology are avoided by the network operator once installed, configured and deployed in production environments fearing misconfiguration and downtime. All of this leads to very static service delivery deployments. Furthermore, the speed at which these topological changes can be made is not rapid or dynamic enough as it often requires manual intervention, or use of slow provisioning systems.

It is imperative that the enablement solution maintain flexibility, ease of configuration, and ease of reordering of service functions. Again, a redirection-based deployment can theoretically fulfill these requirements (e.g., by abstracting the service function from the physical network topology and providing a simple means of adding, removing, and changing the order of service functions), while an integrated solution fails due to the restrictive nature of the implementation.
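To make the idea of "abstracting the service function from the physical network topology" concrete, the sketch below keeps the chain definition (logical names only) separate from a locator table that records where each function lives. Everything here is hypothetical: the function names, the `reorder` and `resolve` helpers, and the addresses (taken from the 192.0.2.0/24 documentation range) are illustrative assumptions, not a description of any vendor's platform.

```python
# Logical chain definitions: service function names only, no topology.
chains = {"default": ["firewall", "parental-control", "video-optimizer"]}

# Locator table: where each service function can currently be reached.
# Addresses are placeholders from the documentation range.
locators = {
    "firewall": "192.0.2.11",
    "parental-control": "192.0.2.12",
    "video-optimizer": "192.0.2.13",
}


def reorder(chains, chain_name, new_order):
    """Return a copy of `chains` with one chain reordered.

    Only the ordering changes; the set of service functions must be
    identical, and no locator (topology) entry is touched.
    """
    if sorted(chains[chain_name]) != sorted(new_order):
        raise ValueError("new order must contain exactly the same service functions")
    updated = dict(chains)
    updated[chain_name] = list(new_order)
    return updated


def resolve(chains, chain_name, locators):
    """Translate a logical chain into the ordered list of next hops."""
    return [locators[name] for name in chains[chain_name]]


reordered = reorder(chains, "default",
                    ["parental-control", "firewall", "video-optimizer"])
next_hops = resolve(reordered, "default", locators)
```

In this model, changing the order of service functions is a change to one list, and relocating a service function is a change to one locator entry; neither requires the other.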

Constrained High Availability

An effect of topological dependency is constrained service function high availability. Worse, when modified, inadvertent non-high availability or downtime can result. Since traffic reaches many service functions based on network topology, alternate, or redundant service functions must be placed in the same topology as the primary service.

Ideally, how a CSP chooses to deploy service functions should not impact the availability of those functions; that is, the service functions should be highly available regardless of the enablement mechanism. In reality though, this is not the case; both models introduce risk.

The redirection-based deployment has the benefit of decoupling the availability of each service function from the others: every service function is available, or not, based on its own merits. However, all are dependent upon the redirection mechanism working correctly. Should that mechanism go down, then the entire service chain becomes unavailable. Therefore, the redirection platform itself must have a reliable means of achieving high availability.

Provided that the integrated service functions are decoupled from each other within the integrated platform, the availability issues are practically equivalent to those of the redirection model.

In either scenario, there should be a health-check mechanism to detect if/when a service function is no longer available and to omit that service function from the service chain.
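A health-check mechanism of this kind might look like the following minimal sketch. The `prune_unavailable` helper and the probe callback are hypothetical; a real probe would be whatever liveness test the platform supports, which this whitepaper does not specify. Whether a given service function may safely be bypassed when it is down is a policy decision that the sketch does not attempt to model.

```python
def prune_unavailable(chain, probe):
    """Split a chain into (active, omitted) while preserving order.

    `probe` is any callable that returns a truthy value when the named
    service function is healthy. A probe that raises is treated as a
    failed health check.
    """
    active, omitted = [], []
    for name in chain:
        try:
            ok = bool(probe(name))
        except Exception:
            ok = False
        (active if ok else omitted).append(name)
    return active, omitted


# Stub health status standing in for real probes (illustrative values).
status = {"firewall": True, "parental-control": False, "video-optimizer": True}
active, omitted = prune_unavailable(
    ["firewall", "parental-control", "video-optimizer"],
    lambda name: status[name],
)
```

Traffic would then be steered through `active` only, and `omitted` would be reported to the operator.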

Consistent Ordering of Service Functions

Service functions are typically independent; service function_1 (SF1)...service function_n (SFn) are unrelated and there is no notion at the service layer that SF1 occurs before SF2. However, to an administrator many service functions have a strict ordering that must be in place, yet the administrator has no consistent way to impose and verify the ordering of the service functions that are used to deliver a given service.

Service function chains today are most typically built through manual configuration processes. These are slow and error prone. With the advent of newer service deployment models the control and policy planes provide not only connectivity state, but will also be increasingly utilized for the creation of network services. Such control/management planes could be centralized, or be distributed.

Essentially, the solution must allow the CSP to define and control a specific and consistent (subject to conscious decisions to change) ordering of service functions within the chain.


Both deployment architectures should be able to provide an interface through which the CSP can modify the order in which service functions are applied, and through which the operator can verify the ordering after the fact.
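Verifying the ordering after the fact could be as simple as comparing the sequence of service functions a flow actually traversed against the configured chain. The sketch below assumes that such a per-flow traversal record is available and that each service function appears at most once in the configured chain; both are assumptions, and the `respects_order` name is hypothetical.

```python
def respects_order(configured, observed):
    """Check that `observed` follows the order defined by `configured`.

    Functions may be skipped (for example, because they were unhealthy
    or not applicable), but an unknown function, a repeated function,
    or an out-of-order function fails the check.
    """
    position = {name: index for index, name in enumerate(configured)}
    last = -1
    for name in observed:
        index = position.get(name)
        if index is None or index <= last:
            return False
        last = index
    return True


configured = ["firewall", "parental-control", "video-optimizer"]
in_order = respects_order(configured, ["firewall", "video-optimizer"])
out_of_order = respects_order(configured, ["video-optimizer", "firewall"])
```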

Application of Service Policy

Service functions rely on topology information such as VLANs or packet (re)classification to determine service policy selection, i.e. the service function specific action taken. Topology information is increasingly less viable due to scaling, tenancy, and complexity reasons. The topological information is often stale, providing the operator with inaccurate placement that can result in suboptimal resource utilization. Per-service function packet classification is inefficient and prone to errors, duplicating functionality across service functions. Furthermore packet classification is often too coarse, lacking the ability to determine class of traffic with enough detail.

This problem area essentially translates to a requirement that there be an effective global (rather than at each service function) means of determining what packets should go to what service functions. In theory, there should be no variation between the integrated approach and the redirection-based approach.
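One way to picture a global means of determining which packets go to which service functions is a single ordered policy table that is evaluated once per flow, rather than separate classification logic inside each service function. The sketch below is an assumption-laden illustration: the `FlowInfo` fields, the rule contents, and the chain names are all hypothetical, and it presumes that some upstream classifier has already identified the application and the subscriber tier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowInfo:
    subscriber_tier: str  # e.g. "basic" or "premium" (illustrative)
    application: str      # assumed to come from an upstream classifier


# Global policy: (predicate, chain) pairs; the first matching rule wins.
POLICY = [
    (lambda f: f.application == "video" and f.subscriber_tier == "premium",
     ["video-optimizer"]),
    (lambda f: f.application == "web",
     ["url-filter", "header-enrichment"]),
]
DEFAULT_CHAIN = []  # traffic matching no rule is not redirected


def select_chain(flow):
    """Evaluate the global policy once and return the chain for this flow."""
    for matches, chain in POLICY:
        if matches(flow):
            return chain
    return DEFAULT_CHAIN


web_chain = select_chain(FlowInfo("basic", "web"))
video_chain = select_chain(FlowInfo("premium", "video"))
other_chain = select_chain(FlowInfo("basic", "video"))
```

Because the decision is made in one place, the individual service functions do not need to duplicate the classification.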

Transport Dependence

Service functions can and will be deployed in networks with a range of transports, including under and overlays. The coupling of service functions to topology requires service functions to support many transport encapsulations or for a transport gateway function to be present.

This problem area imposes a number of requirements upon the enablement mechanism:

• It must be completely agnostic of the access technology, or combination of access technologies, within the network

• It must be able to apply redirection to traffic that is tunneled

• It must be able to apply redirection to traffic that is encapsulated

To support these last two requirements, the enablement platform must therefore be able to remove and reapply the headers. In theory, there is no reason why the redirection-based architecture and the integrated architecture should perform differently; however, in practice the CSP must ask pointed questions to ensure fulfillment of these requirements.
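Header removal and reapplication can be illustrated with the simplest common case, an IEEE 802.1Q VLAN tag on an Ethernet frame. This is an assumption chosen for brevity: the whitepaper does not name particular encapsulations, and tunnels such as GTP or GRE would need their own parsing. The function names are hypothetical.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q tag


def strip_vlan(frame: bytes):
    """Return (untagged_frame, saved_tag); saved_tag is None if untagged.

    An 802.1Q tag is 4 bytes (TPID + TCI) inserted after the 12 bytes
    of destination and source MAC addresses.
    """
    if len(frame) >= 18 and struct.unpack("!H", frame[12:14])[0] == TPID_8021Q:
        return frame[:12] + frame[16:], frame[12:16]
    return frame, None


def reapply_vlan(frame: bytes, saved_tag):
    """Reinsert a previously saved 802.1Q tag, if there was one."""
    if saved_tag is None:
        return frame
    return frame[:12] + saved_tag + frame[12:]


# Build a sample tagged frame: VLAN 100 carrying an IPv4 EtherType.
dst = bytes.fromhex("020000000001")
src = bytes.fromhex("020000000002")
payload = b"example payload"
tagged = dst + src + struct.pack("!HH", TPID_8021Q, 100) + struct.pack("!H", 0x0800) + payload

untagged, tag = strip_vlan(tagged)      # hand `untagged` to the service function
restored = reapply_vlan(untagged, tag)  # put the header back afterwards
```

The pattern is the same for any encapsulation: save the outer header, present the inner traffic to the service function, then restore the header on the way back.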

Elastic Service Delivery

Given that the current state of the art for adding/removing service functions largely centers around VLANs and routing changes, rapid changes to the service deployment can be hard to realize due to the risk and complexity of such changes.

In theory, this problem area substantially favors enablement via redirection. The redirection model enables rapid changes to the service chain (e.g., by safely adding or removing service functions outside of the data path and then simply changing the configuration on the redirection platform) and also provides a significant degree of elastic service delivery (since each service function can be scaled independently of the others).


With an integrated design, processing consumed by one service function automatically makes that processing capacity unavailable to other service functions.

Traffic Selection Criteria

Traffic selection is coarse, that is, all traffic on a particular segment traverses service functions whether the traffic requires service enforcement or not. This lack of traffic selection is largely due to the topological nature of service deployment since the forwarding topology dictates how (and what) data traverses service function(s). In some deployments, more granular traffic selection is achieved using policy routing or access control filtering. This results in operationally complex configurations and is still relatively inflexible.

This problem area presupposes that there is no means of efficiently and effectively determining what traffic should go to what service functions, but this is not the case. What is true, however, is that the degrees of efficiency and effectiveness vary greatly.

It is very important to note that even with integrated solutions, the VAS component typically resides on a separate blade or processing group; as a result, the platform must still redirect traffic to these processors, even though the redirection is at a process-level or internal to a larger chassis.

In the theoretical best case, only traffic that meets criteria specific to a service function gets sent to that service function. Such criteria might include traffic pertaining to a particular subscriber, subscriber segment, device, application, protocol, CDN, delivery route, video resolution, video provider, etc.

Regardless of the means of implementing the service chain - whether via an integrated solution or through redirection - there is a general requirement to send only pertinent traffic to a particular service function. CSPs, then, need to inquire as to how a particular vendor determines what traffic gets sent to what service function, as efficiency varies enormously from vendor to vendor.

For instance, on one end of the spectrum are inefficient and rudimentary port-based redirections that simply forward all traffic on a particular port (e.g., Port 80 for HTTP) to the service functions. At the other end of the spectrum are highly efficient, truly intelligent redirection systems that consider application, subscriber identity and entitlement, and other relevant factors (e.g., video provider, video resolution, video container, etc.); these intelligent systems are often protected by patents.

Somewhere in the middle are systems that apply some level of heuristic guessing to be more precise than the port-based systems without getting near the efficiency of, or infringing on the patents of, the advanced systems.
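The difference between the two ends of this spectrum can be shown with a toy comparison for a parental-control service function. The flow records, subscriber names, and the entitlement set below are invented purely for illustration, and the `application` label is assumed to be supplied by some upstream traffic classifier.

```python
# Illustrative flow records; none of these values are measurements.
flows = [
    {"subscriber": "alice", "dst_port": 80, "application": "web"},
    {"subscriber": "bob", "dst_port": 80, "application": "web"},
    {"subscriber": "alice", "dst_port": 443, "application": "web"},
    {"subscriber": "bob", "dst_port": 80, "application": "software-update"},
    {"subscriber": "alice", "dst_port": 5060, "application": "voip"},
]

# Subscribers who have the parental-control service (hypothetical).
entitled = {"alice"}


def port_based(flow):
    """Rudimentary selection: redirect everything on port 80."""
    return flow["dst_port"] == 80


def attribute_based(flow):
    """Selection on subscriber entitlement and identified application."""
    return flow["subscriber"] in entitled and flow["application"] == "web"


port_selected = [f for f in flows if port_based(f)]
attribute_selected = [f for f in flows if attribute_based(f)]
```

In this toy data set, the port-based rule redirects three flows, two of which belong to a subscriber without the service, and it misses the entitled subscriber's web flow on port 443. The attribute-based rule redirects exactly the two pertinent flows.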

Limited End-to-End Service Visibility

Troubleshooting service related issues is a complex process that involves both network-specific and service-network-specific expertise. This is especially the case when service function chains span multiple DCs, or across administrative boundaries. Furthermore, the physical and virtual environments (network and service), can be highly divergent in terms of topology and that topological variance adds to these challenges.

Whether enabled through redirection or integration, it is imperative that the platform provide visibility into service function availability and performance. In theory, the integrated approach has an [...] through a single interface [2]. In the redirection model, the redirection platform would be able to provide visibility into redirection metrics and anything that can be extracted via API from the other service functions, but it is likely that the service functions themselves would have dedicated troubleshooting and diagnostic interfaces.

Per-Service (re)Classification

Classification occurs at each service function independent from previously applied service functions. More importantly, the classification functionality often differs per service function and service functions may not leverage the results from other service functions.

This problem area presupposes that the service chain is not configured in an end-to-end manner, but there is no reason why this cannot be the case (regardless of whether the service functions are integrated or enabled via redirection).

Symmetric Traffic Flows

Service function chains may be unidirectional or bidirectional depending on the state requirements of the service functions. In a unidirectional chain traffic is passed through a set of service functions in one forwarding direction only. Bidirectional chains require traffic to be passed through a set of service functions in both forwarding directions. Many common service functions such as DPI and firewall often require bidirectional chaining in order to ensure flow state is consistent.

Existing service deployment models provide a static approach to realizing forward and reverse service function chain association most often requiring complex configuration of each network device throughout the SFC.

Provided that the enablement platform (whether integrated or redirection-based) can resolve network asymmetry, the service functions themselves will not be exposed to asymmetric traffic and the problems that it poses [3].
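One ingredient of keeping both directions of a flow on the same service function instance is a direction-independent flow key. The sketch below is a hedged illustration only: the `flow_key` and `pick_instance` helpers are hypothetical, the addresses come from documentation ranges, and a stable CRC32 hash is used simply because it is available in the standard library. It does not by itself resolve routing asymmetry, where the two directions cross different links; it shows only how the two directions can be recognized as one flow once the platform sees them.

```python
import zlib


def flow_key(src_ip, src_port, dst_ip, dst_port, protocol):
    """Return the same key for both directions of a flow.

    The two endpoints are sorted so that swapping source and
    destination does not change the result.
    """
    a, b = (src_ip, src_port), (dst_ip, dst_port)
    return (protocol,) + (a + b if a <= b else b + a)


def pick_instance(key, instances):
    """Map a flow key to one instance using a stable hash."""
    digest = zlib.crc32(repr(key).encode("utf-8"))
    return instances[digest % len(instances)]


instances = ["dpi-1", "dpi-2", "dpi-3"]  # hypothetical instance names
forward = flow_key("198.51.100.7", 51000, "203.0.113.9", 443, "tcp")
reverse = flow_key("203.0.113.9", 443, "198.51.100.7", 51000, "tcp")
forward_instance = pick_instance(forward, instances)
reverse_instance = pick_instance(reverse, instances)
```

Because `forward` and `reverse` are equal, a stateful function such as DPI or a firewall sees both halves of the conversation on the same instance.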

Multi-Vendor Service Functions

Deploying service functions from multiple vendors often requires per-vendor expertise: insertion models differ, there are limited common attributes and inter-vendor service functions do not share information.

Perhaps more than any other problem area, this one strongly favors the redirection-based architecture. With an integrated approach, the CSP can only choose from those service functions that are already integrated (or could be integrated via additional effort). Practically, this restriction prevents the CSP from choosing between a range of best-of-breed options to select the optimal choice.

Redirection preserves the CSP’s ability to choose service functions from any vendor, provided they can interoperate with the redirection platform. In practice, interoperation necessitates meeting some fairly low requirements (although CSPs should be mindful that this does vary).

[2] This might not necessarily be the case, however: if the integrated solutions have been acquired (as opposed to built), then there might well still be multiple management interfaces.

[3] A comprehensive explanation of routing asymmetry and its implications for network policy control is available in the Sandvine whitepaper Applying Network Policy Control to Asymmetric Traffic: Considerations and Solutions


Conclusion

In order to realize the full promise and potential of VAS and service chaining, there are a number of challenges that must be overcome – many of these are included in the “Network Service Chaining Problem Statement”, from the IETF.

Practically, there are two approaches that can be used to implement service function chains:

• Integration: the service functions are integrated within an element already in the data path, and redirection is local

• Redirection: the VAS deployments (service functions) are separate from the data path, and a data path component (e.g., PCEF) redirects traffic

Using the problem areas outlined in the IETF document as a guide, it is apparent that both approaches are promising.

An integrated approach is viable if a CSP has a firm understanding of precisely what service functions they want to deploy, if the list of service functions is very small and unlikely to change, and if the integrated service functions are of acceptable quality as to fulfill the requirements.

However, the redirection-based enablement is the superior option overall: when done correctly, redirection can overcome all of the challenges identified in the IETF problem statement. Most importantly, redirection preserves choice and flexibility – the CSP is free to choose any vendor for any service function, and can introduce or remove service functions as needs change over time.

In order to make an educated choice about VAS and service chain enablement, CSPs must ask pointed questions of their potential platform vendors.

Requirements for Service Chain Enablement

The table below summarizes the high-level requirements that emerge from each IETF problem area.

Consideration | Requirement
Topological Dependencies | VAS-enablement platform should abstract the service functions from the physical network topology
Configuration Complexity | VAS-enablement platform must maintain flexibility, ease of configuration, and ease of reordering service functions
Constrained High Availability | Service functions should be highly available regardless of the VAS-enablement mechanism; said alternatively, the VAS-enablement should be decoupled from the availability of the service functions
Constrained High Availability | VAS-enablement platform must be highly available
Constrained High Availability | VAS-enablement platform should have a health-check mechanism to detect availability and health of service functions
Consistent Ordering of Service Functions | VAS-enablement platform must allow the CSP to define and control a specific and consistent ordering of service functions within the service function chain
Application of Service Policy | VAS-enablement platform must provide an effective global (rather than at each service function) means of determining what packets should go to what service functions
Transport Dependence | VAS-enablement platform must function completely agnostic of the network’s access technologies
Transport Dependence | VAS-enablement platform must be able to redirect traffic that is tunneled, with header removal and reapplication
Transport Dependence | VAS-enablement platform must be able to redirect traffic that is encapsulated, with header removal and reapplication
Elastic Service Delivery | VAS-enablement platform must accommodate rapid changes to the service chain
Traffic Selection Criteria | VAS-enablement platform must provide an efficient means of redirecting only relevant traffic to each service function; practically, this means that the redirection should be based on a combination of application, protocol, provider, subscriber, and other factors relevant to the service function
Limited End-to-End Service Visibility | VAS-enablement platform must provide visibility into service function availability and performance
Per-Service (re)Classification | VAS-enablement platform should provide end-to-end configuration of the service chain, to avoid the need for per-service (re)classification
Symmetric Traffic Flows | VAS-enablement platform must be able to provide redirection for all appropriate traffic when intersecting asymmetric traffic routes
Multi-Vendor Service Functions | VAS-enablement platform must not restrict the CSP’s choice of service function vendors

Additional Resources

In addition to the resources cited in the footnotes throughout this document, please consider reading the Sandvine technology showcase Enabling Service Function Chains and Value-Added Services with Sandvine Divert, available on www.sandvine.com.

