• No results found

Is Your Elephant a Gazelle?

N/A
N/A
Protected

Academic year: 2021

Share "Is Your Elephant a Gazelle?"

Copied!
17
0
0

Loading.... (view fulltext now)

Full text

(1)

Fosdem 2021

Is Your Elephant a Gazelle?

Fan Zhang

(2)

Agenda

What is elephant flow and where is the

bottleneck to secure them with open-source

IPsec solutions?

FD.io VPP and VPP IPsec

VPP Synchronous Crypto Infra Introduction

VPP Asynchronous Crypto Infra Introduction

Scale Single IPsec Flow Even More

(3)

Fosdem 2021 3

What is Elephant Flow (EP)?

Extremely

large

continuous flow in

Internet

4.7% packets in

total, takes

41.3%

bandwidth

How Userspace

data plane handles

IPsec:

• Isolated and limited

per core processing resource, including stack and crypto.

• Flow-to-core affinity.

This makes IPsec EP

(4)

Pain Points of Processing IPsec EP

Crypto processing requires large amount cycles, while EP is mostly

large packets

Flow-to-core affinity always make one core extremely busy, while

other cores relaxing.

A perfect core extremely powerful to handle the flow also means

wasting the cycles most of the time.

Load-balancing single flow to multiple cores will cause race

condition when anti-replay is enabled.

We propose our answer to resolve the problems with FD.io VPP

(5)

Fosdem 2021 5

5

Development Toolkit for … Extensible network functions

Framework, API & Libraries Packet Processing Pipeline, configuration driven, composable

& extensible Wide Device Support Select Native Drivers &

Libraries & Sample Code Wide Protocol Support

SDK Model Plugin Model

Realized through Support for

Interfaces Networking

Extensible Integrations

Out of

the box Cloud Infra, Discrete Appliance, Virtual Network Functions … Software

Architecture

Kubernetes AppliancesDiscrete

(6)

FD.io VPP IPsec

Open-Source Production-ready IPsec Implementation.

Capable of single server 1Tb IPsec processing.

Supports AH, ESP (tunnel and transport), ESPoUDP, ESPoGRE.

Supports major crypto algorithms (CBC, HMAC-SHA*,

AES-GCM).

Supports multiple crypto engine plugins.

Supports both CPU based crypto (VAES) and Lookaside HW

accelerations (QAT)

(7)

Fosdem 2021 7

FD.io VPP Native Crypto Infra (before VPP 20.05)

• A generic infrastructure to provide symmetric crypto

service within VPP

• Provides generic API and multiple crypto plugin

engines supporting:

• Key management (add, delete, and update)

• Crypto operation (cipher, hash, AEAD)

• Advantages • Performance • Availability • Flexible • Disadvantages • No HW offload support

• Single IPsec Flow crypto Scaling not possible

Ip4-midchain Esp4-encrypt-tun Interface-output SA Lookup Process crypto VPP Crypto Infra Engine Operation Handler Drop fail pkts 7

(8)

Scale VPP IPsec Single Flow Throughput With

Crypto Offload

IPsec = packet processing (pps sensitive) + crypto (bps sensitive)

Offload crypto workload to

• Dedicated HW (e.g. QAT)

• Dedicated CPU core(s)

helps gaining more cycles to packet I/O and stack processing.

To support both, we need a generic asynchronous crypto

(9)

Fosdem 2021 9

VPP Async Crypto Infra

Released in VPP 20.05

Share the same key

management as synchronous

crypto infra.

Provides Generic Enqueue and

Dequeue Handler.

User graph node enqueues the

packets to the target engine.

A dedicated dispatch graph

node will handle dequeue.

Ip4-midchain Esp4-encrypt-tun Interface-output SA Lookup Enqueue crypto VPP Crypto Async Infra Engine Enqueue Handler Drop fail pkts Crypto Dispatch Esp4-encrypt-tun-post VPP Crypto Async Infra Engine Dequeue Handler Cryptodev/ QAT 9

(10)

Adding QAT Hardware acceleration with DPDK

Cryptodev

• New DPDK Cryptodev RAW API

• A more compact data structure.

• Raw buffer pointer and physical address as input.

• More sophisticated enqueue/dequeue control method.

• Customizable status field set callback function design.

• ~15% performance improvement.

• New DPDK Cryptodev Raw API will be released in DPDK 20.11

• The change has already been merged in VPP 20.09 as a DPDK 20.08 patch.

(11)

Fosdem 2021 11

Elephant flow without QAT? SW-scheduler

Crypto Engine

• A pure SW crypto engine that

utilizes dedicated CPU cores to process crypto workload.

• Crypto worker threads

actively scan the frame queue, mark unprocessed

frame as “WIP”, and

processed frame as

“Complete”.

• Dispatch dequeue first N

“Complete” frames.

Crypto worker core 1

Crypto worker core 2

Enqueue Handler Store (no lock)

Find 1st frame status as IDLE mark as WIP atomic

Find 2nd frame status as IDLE mark as WIP atomic Mark processed frame status as COMPLETE

Mark processed frame status as COMPLETE

RX thread Crypto Dispatch

Dequeue Handler

Dequeue 1st and 2nd frame (no lock)

Op handler

(12)

… Also cloud friendly!

• Crypto Dispatch Node running in

polling mode can achieve best possible performance, but it is

unfriendly to cloud-native use case.

• That’s why we made it supporting

interrupt mode.

• Active polling within an interrupt

handling

• Precise signaling when a crypto

frame is enqueued/processed.

RX thread

Crypto Dispatch

Enqueue Handler

Crypto worker core 2 Crypto Dispatch Crypto worker core 1 Crypto Dispatch

Signal

Signal

Signal

Signal

(13)

Fosdem 2021 13

Can We Scale Single IPsec Flow Even More?

▪ With Async Crypto we achieved single

IPsec flow processing capability of up to 40Gbps.

▪ Even with crypto offloaded, there are

still heavy I/O and stack processing left.

▪ Intel® DLB or DPDK SW eventdev offers

the way to distribute the packets to

multiple CPU cores. The packet ordering is maintained from RX to TX.

▪ With the help of DLB or SW eventdev,

we may load-balance most single flow IPsec workload to more cores.

0 DLB queue

1

2

3

4

NIC DLB queue 5 NIC

(14)

Can We Scale Single IPsec Flow Even More? (cont.)

▪ Only non-distributable workload (SQN

update and check) is handled by a single core.

▪ Load-balancing more workload to other

CPU cores helps regaining more cycles to receive packets.

• IPsec Stack Processing • Crypto

• Tx (post IPsec)

▪ Development ongoing, estimate to finish

and upstream EOY 2021.

▪ Our goal is to achieve 100Gbps single

IPsec flow processing capability.

SA Lookup

Update SA SQN Add Tunnel Header

+ ICV Sync Crypto Pre-ipsec Pre-ipsec SA Lookup Update SA SQN Add Tunnel Header + ICV Sync Crypto Post-IPsec Post-IPsec Core 1 (RX) Core 2 ~ N-1 (Ipsec) Core N (TX) Event 1 ENQ Event 2 ENQ Event 1 DEQ Event 2 DEQ

(15)

Fosdem 2021 15

Summary

• VPP Synchronous Crypto Infra provides amazing performance to process IPsec

workload, but fails to scale with bigger flow.

• We provided asynchronous crypto infra to make SW and HW offloading

possible to scale IPsec single flow throughput. The infra supports interrupt mode to make it cloud-native friendly.

• We also provided Cryptodev and SW scheduler async crypto engines.

• Both async crypto engines helped to achieve 40Gbps IPsec elephant flow

processing.

• To scale the single IPsec flow even further, we process offload both crypto and

most IPsec stack to other cores with Intel® DLB or DPDK Eventdev. 15

(16)

Thank you very much!

Q&A

For questions please contact: roy.fan.zhang@intel.com

(17)

References

Related documents

SPACs, crypto ledgers, crypto coins, crypto-joke-coins, NFTs, Beeple, meme stocks, RobinHood stocks, GameStop stock shorts or stocks of socks and secondhand sneakers.. And

Tempe industrial wastewater also has a phosphate content that can be absorbed by plant roots, although in a low amount it can increase the growth of corn cobs in terms of a

From the way we understand crypto jacking it is the utilisation of victim’s electronic resources without the consent of the user and leverage the processing

It is therefore fitting in recognition of the hard and innovative work we have done, and for its historical significance, that we will be signing the new Treaty of Basseterre to

Select Objects → Crypto Validation Credentials to create a Crypto Validation Credential object using the Crypto Certificate object that is based on the root certificate.. Provide a

Pitney Bowes Business Insight’s DOC1 document composition solution can centrally manage all customer communications, including transactional, on-demand and interactive documents, in

“[T]he sequence in which the forums obtained jurisdiction is irrelevant because this litigation and [the state court case] are not identical.” Rivera-Plug, 983 F.2d at 321 (arriving

BA allows the block val- idation protocol to reach an agreement on the unique block that can reference an earlier block in the blockchain, and allows the transaction validation