DPDK is a set of libraries and drivers for fast packet processing that runs mostly in Linux userland. Its libraries provide the so-called "Environment Abstraction Layer" (EAL), which hides the details of the environment and provides a standard programming interface. Common use cases are specialized solutions such as network function virtualization and advanced high-throughput network switching. DPDK uses a run-to-completion model for fast data plane performance and accesses devices via polling, which eliminates the latency of interrupt processing at the cost of higher CPU consumption. It was designed to run on any processor. The first supported CPU was Intel x86, and support has since been extended to IBM Power 8, EZchip TILE-Gx and ARM.
Ubuntu currently supports DPDK version 2.2 and provides some infrastructure to ease its usability.
5.1. Prerequisites
This package is currently compiled for the lowest possible CPU requirements, which still means the CPU has to support at least SSE3.
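The SSE3 requirement can be checked before installing anything. The following is a minimal sketch, assuming a Linux system that exposes CPU flags in /proc/cpuinfo, where SSE3 is listed under the name "pni". The function name has_sse3 is made up for this illustration.

```shell
#!/bin/sh
# Return success if the given CPU flag list contains SSE3.
# Linux reports SSE3 in /proc/cpuinfo as "pni" (not to be confused with "ssse3").
has_sse3() {
    case " $1 " in
        *" pni "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Flags of the first CPU as reported by the kernel (empty if unavailable).
cpu_flags=$(grep -m1 '^flags' /proc/cpuinfo 2>/dev/null | cut -d: -f2)

if has_sse3 "$cpu_flags"; then
    echo "SSE3 supported"
else
    echo "SSE3 not found"
fi
```

Matching on the surrounding spaces avoids a false positive from the similarly named ssse3 flag.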
The list of network cards supported by upstream DPDK can be found at supported NICs [18]. However, many of those are disabled by default in the upstream project because they are not yet in a stable state. The subset of network cards that DPDK has enabled in the package as available in Ubuntu 16.04 is:
Intel
• e1000 [19] (82540, 82545, 82546)
• e1000e [20] (82571..82574, 82583, ICH8..ICH10, PCH..PCH2)
• igb [21] (82575..82576, 82580, I210, I211, I350, I354, DH89xx)
• ixgbe [22] (82598..82599, X540, X550)
• i40e [23] (X710, XL710, X722)
• fm10k [24] (FM10420)
Chelsio
• cxgbe [25] (Terminator 5)
Cisco
• enic [26] (UCS Virtual Interface Card)
Paravirtualization
• virtio-net [27] (QEMU)
• vmxnet3 [28]
Others
• af_packet [29] (Linux AF_PACKET socket)
• ring [30] (memory)
In addition, it experimentally enables the following two poll mode drivers (PMDs), as they represent (virtual) devices that are very accessible to end users.
Paravirtualization
• xenvirt [31] (Xen)
Others
• pcap [32] (file or kernel driver)
Cards have to be unassigned from their kernel driver and instead be assigned to uio_pci_generic or vfio-pci.
uio_pci_generic is older and usually easier to get working. The newer vfio-pci requires that you activate the following kernel parameters to enable the IOMMU.
iommu=pt intel_iommu=on
In addition, for vfio-pci you then have to configure and assign the IOMMU groups accordingly.
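Whether the kernel was actually booted with these parameters, and which IOMMU groups exist, can be inspected from a shell. This is a sketch under the assumption of a Linux host with /proc/cmdline and /sys/kernel/iommu_groups. The helper name has_kernel_param is hypothetical. On Ubuntu, kernel parameters are usually added to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, followed by running update-grub and rebooting.

```shell
#!/bin/sh
# Return success if parameter $1 appears as a whole word in command line $2.
has_kernel_param() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

cmdline=$(cat /proc/cmdline 2>/dev/null)
for param in iommu=pt intel_iommu=on; do
    if has_kernel_param "$param" "$cmdline"; then
        echo "$param: set"
    else
        echo "$param: missing"
    fi
done

# List every PCI device together with the IOMMU group it belongs to.
# Prints nothing when the IOMMU is disabled or unsupported.
for dev in /sys/kernel/iommu_groups/*/devices/*; do
    [ -e "$dev" ] || continue
    group=${dev#/sys/kernel/iommu_groups/}
    group=${group%%/*}
    echo "group $group: ${dev##*/}"
done
```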
Manual configuration can be done via sysfs or with the tool dpdk_nic_bind.
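For the sysfs route, the following sketch only prints the writes that would rebind a device, so it is safe to run as an unprivileged user. It assumes a kernel that offers the driver_override attribute for PCI devices and that the target module (for example vfio-pci) is already loaded. The function name rebind_commands is hypothetical, and the PCI ID is the example device used elsewhere in this chapter.

```shell
#!/bin/sh
# Print the sysfs writes that move one PCI device to another driver.
# $1: PCI ID such as 0000:04:00.1   $2: target driver such as vfio-pci
rebind_commands() {
    dev="/sys/bus/pci/devices/$1"
    # 1. Detach the device from its current driver.
    echo "echo $1 > $dev/driver/unbind"
    # 2. Tell the kernel which driver may claim the device next.
    echo "echo $2 > $dev/driver_override"
    # 3. Ask the kernel to probe the device again.
    echo "echo $1 > /sys/bus/pci/drivers_probe"
}

rebind_commands 0000:04:00.1 vfio-pci
```

The printed lines have to be executed as root to take effect. Printing them first makes it easy to review the PCI ID before detaching a card that might carry your current connection.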
An exception to the rule that you need to reassign drivers is virtio-pci. There you have to do the opposite and always provide --pci-blacklist PCI-ID on the EAL command line of your DPDK application. Otherwise DPDK will "grab" all virtio-pci devices and you will lose connectivity to your guest.
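When several virtio devices have to be protected, the blacklist options can be generated instead of typed by hand. The sketch below builds one --pci-blacklist option per PCI ID. The function name blacklist_args and the PCI IDs are placeholders for illustration only. Use the IDs of the devices that must stay with the kernel in your guest.

```shell
#!/bin/sh
# Build one "--pci-blacklist <id>" EAL option for every PCI ID given.
blacklist_args() {
    args=""
    for id in "$@"; do
        args="$args --pci-blacklist $id"
    done
    # Strip the leading space left over from the first iteration.
    echo "${args# }"
}

blacklist_args 0000:00:03.0 0000:00:04.0
# prints: --pci-blacklist 0000:00:03.0 --pci-blacklist 0000:00:04.0
```

The result is meant to be placed among the EAL options of the DPDK application, before the application's own arguments.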
5.2. DPDK Device configuration
The dpdk package provides init scripts that ease the configuration of device assignment and huge pages. It also makes them persistent across reboots.
The following is an example of the file /etc/dpdk/interfaces configuring two ports of a network card, one with uio_pci_generic and the other with vfio-pci.
[27] http://dpdk.org/doc/guides/nics/virtio.html
# <bus> Currently only "pci" is supported
# <id> Device ID on the specified bus
# <driver> Driver to bind against (vfio-pci or uio_pci_generic)
#
# Note that depending on your network card and what you want to set up also the
# drivers ixgbe or virtio-pci might apply, but these are the default drivers
# and therefore have not to be rebound as dpdk interfaces.
#
# Be aware that those two drivers are part of linux-image-extra-<VERSION>
# package in case you run into missing module issues.
#
# <bus> <id> <driver>
pci 0000:04:00.0 uio_pci_generic
pci 0000:04:00.1 vfio-pci
Cards are identified by their PCI-ID. If you are unsure, you can use the tool dpdk_nic_bind to show the currently available devices and the drivers they are assigned to.
dpdk_nic_bind --status
Network devices using DPDK-compatible driver
============================================
0000:04:00.0 'Ethernet Controller 10-Gigabit X540-AT2' drv=uio_pci_generic unused=ixgbe
Network devices using kernel driver
===================================
0000:02:00.0 'NetXtreme BCM5719 Gigabit Ethernet PCIe' if=eth0 drv=tg3 unused=uio_pci_generic *Active*
0000:02:00.1 'NetXtreme BCM5719 Gigabit Ethernet PCIe' if=eth1 drv=tg3 unused=uio_pci_generic
0000:02:00.2 'NetXtreme BCM5719 Gigabit Ethernet PCIe' if=eth2 drv=tg3 unused=uio_pci_generic
0000:02:00.3 'NetXtreme BCM5719 Gigabit Ethernet PCIe' if=eth3 drv=tg3 unused=uio_pci_generic
0000:04:00.1 'Ethernet Controller 10-Gigabit X540-AT2' if=eth5 drv=ixgbe unused=uio_pci_generic
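In scripts it can be handy to extract only the PCI IDs that are already bound to a DPDK-compatible driver. The following sketch filters status output of the shape shown above with awk. It assumes the two section headings appear exactly as in this example, and the function name dpdk_bound_devices is hypothetical. The sample input is an abbreviated copy of the listing above so the block runs without DPDK installed.

```shell
#!/bin/sh
# Read "dpdk_nic_bind --status" style output on stdin and print the PCI IDs
# listed under the "DPDK-compatible driver" heading.
dpdk_bound_devices() {
    awk '
        /^Network devices using DPDK-compatible driver/ { in_dpdk = 1; next }
        /^Network devices using kernel driver/          { in_dpdk = 0; next }
        in_dpdk && /^[0-9a-fA-F]+:[0-9a-fA-F]+:[0-9a-fA-F]+\.[0-9a-fA-F]/ { print $1 }
    '
}

dpdk_bound_devices <<'EOF'
Network devices using DPDK-compatible driver
============================================
0000:04:00.0 'Ethernet Controller 10-Gigabit X540-AT2' drv=uio_pci_generic unused=ixgbe
Network devices using kernel driver
===================================
0000:04:00.1 'Ethernet Controller 10-Gigabit X540-AT2' if=eth5 drv=ixgbe unused=uio_pci_generic
EOF
# prints: 0000:04:00.0
```

On a real system the input would come from a pipe, for example dpdk_nic_bind --status | dpdk_bound_devices.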
5.3. DPDK HugePage configuration
DPDK makes heavy use of huge pages to eliminate pressure on the TLB. Therefore hugepages have to be configured in your system.
The dpdk package has a config file, /etc/dpdk/dpdk.conf, and scripts that try to ease hugepage configuration for DPDK. If your system has more consumers of hugepages than DPDK, or you have very special requirements for how your hugepages are set up, you will likely want to allocate and control them yourself.
If not, this can greatly simplify getting DPDK to run.
Here is an example configuring 1024 hugepages of 2M each:
NR_2M_PAGES=1024
This file also supports configuring the larger 1G hugepages (or a mix of both). It will make sure there are proper hugetlbfs mountpoints for DPDK to find both sizes no matter what your default huge page size is. The config file itself holds more details on certain corner cases and a few hints if you want to allocate hugepages manually via a kernel parameter.
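To judge how much memory a given setting takes away from the rest of the system, the page counts can be converted to MiB. The sketch below does that arithmetic and then shows what the kernel currently reports. The function name hugepage_mib is hypothetical, and the sysfs path assumes an x86 system with 2M hugepages.

```shell
#!/bin/sh
# Memory in MiB reserved by $1 hugepages of 2M plus $2 hugepages of 1G.
hugepage_mib() {
    echo $(( $1 * 2 + $2 * 1024 ))
}

hugepage_mib 1024 0    # prints 2048, the NR_2M_PAGES=1024 example

# What the running kernel currently reports (silent if unavailable).
grep '^HugePages_Total' /proc/meminfo 2>/dev/null
cat /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages 2>/dev/null
```

With the example value, 2048 MiB of RAM are set aside for hugepages and are no longer available for ordinary allocations.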
5.4. Compile DPDK Applications
Currently there are not many consumers of the DPDK library that are stable and released. OpenVswitch-DPDK is an exception (see below), but in general it is very likely that you will want or have to compile an application against the library.
You will often find guides that tell you to fetch the DPDK sources, build them to your needs and eventually build your application based on DPDK by setting RTE_* values for the build system. Since Ubuntu provides an already compiled DPDK for you, you can skip all that. To simplify setting the proper variables, you can source the file /usr/share/dpdk/dpdk-sdk-env.sh before building your application. Here is an excerpt building the l2fwd example application delivered with the dpdk-doc package:
sudo apt-get install dpdk-dev libdpdk-dev
. /usr/share/dpdk/dpdk-sdk-env.sh
make -C /usr/share/dpdk/examples/l2fwd
Depending on what you build, it might be a good idea to install all of DPDK's build dependencies before running make.
sudo apt-get build-dep dpdk
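If a build fails with complaints about missing RTE_* settings, it helps to confirm what the environment script actually exported. The following sketch sources the script only if it is readable and then lists the resulting RTE_* variables. The function name load_sdk_env is hypothetical, and which variables appear depends on the installed package.

```shell
#!/bin/sh
# Source an SDK environment script and show the RTE_* variables it exports.
# $1: absolute path of the script
load_sdk_env() {
    if [ -r "$1" ]; then
        . "$1"
        env | grep '^RTE_' || echo "no RTE_* variables exported"
    else
        echo "missing: $1"
        return 1
    fi
}

# Does not abort when the dpdk-dev package is not installed.
load_sdk_env /usr/share/dpdk/dpdk-sdk-env.sh || true
```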
5.5. OpenVswitch-DPDK
Being a library, DPDK doesn't do a lot on its own, so it depends on emerging projects making use of it. One consumer of the library that is already bundled in the Ubuntu 16.04 release is OpenVswitch with DPDK support, in the package openvswitch-switch-dpdk.
Here is an example of how to install and configure a basic OpenVswitch-DPDK:
sudo apt-get install openvswitch-switch-dpdk
sudo update-alternatives --set ovs-vswitchd /usr/lib/openvswitch-switch-dpdk/ovs-vswitchd-dpdk
echo "DPDK_OPTS='--dpdk -c 0x1 -n 4 -m 2048'" | sudo tee -a /etc/default/openvswitch-switch
sudo service openvswitch-switch restart
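In the DPDK_OPTS string, -c is the EAL core mask (a hexadecimal bitmask of the cores DPDK may use), -n is the number of memory channels and -m is the amount of memory in megabytes. The core mask is the least obvious of these, so here is a small sketch that derives it from a list of core numbers. The function name coremask is hypothetical.

```shell
#!/bin/sh
# Build a hexadecimal EAL core mask from a list of core numbers.
# Bit N of the mask is set when core N is in the list.
coremask() {
    mask=0
    for core in "$@"; do
        mask=$(( mask | (1 << core) ))
    done
    printf '0x%x\n' "$mask"
}

coremask 0        # prints 0x1, as used in the example above
coremask 0 2 3    # prints 0xd
```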
Note: please remember that depending on your environment you have to:
• virtio based environment: add blacklists/whitelists to the DPDK_OPTS
• all other environments: assign devices to DPDK compatible drivers (see above)
Your running OpenVswitch now supports everything OpenVswitch usually does, plus DPDK port types. Here is an example of how to create a bridge and, instead of a normal external port, add an external DPDK port to it:
ovs-vsctl add-br ovsdpdkbr0 -- set bridge ovsdpdkbr0 datapath_type=netdev
ovs-vsctl add-port ovsdpdkbr0 dpdk0 -- set Interface dpdk0 type=dpdk
5.6. OpenVswitch DPDK to KVM Guests
Unless you are building some sort of SDN switch or NFV on top of DPDK, it is very likely that you want to forward traffic to KVM guests. The good news is that with the new qemu/libvirt/dpdk/openvswitch versions in Ubuntu 16.04 this no longer requires manually appending command line strings. This chapter covers a basic configuration for connecting a KVM guest to an OpenVswitch-DPDK instance.
The guest has to be backed by shared hugepages for DPDK/vhost_user to work. This is also supported via libvirt; just add the following snippet to your virsh XML (or the equivalent virsh interface you use). Those XML files can, for example, be used to easily spawn guests with "uvt-kvm create".
<numa>
<cell id='0' cpus='0' memory='6291456' unit='KiB' memAccess='shared'/>
</numa>
The new and recommended way to get to a KVM guest is using vhost_user. This will cause DPDK to create a socket that qemu will connect the guest to. Here is an example of how to add such a port to the bridge you created (see above):
ovs-vsctl add-port ovsdpdkbr0 vhost-user-1 -- set Interface vhost-user-1 type=dpdkvhostuser
This will create a vhost_user socket at /var/run/openvswitch/vhost-user-1
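Before starting a guest it can be worth confirming that the socket really exists, since qemu cannot connect otherwise. The following sketch polls for a Unix socket with a timeout. The function name wait_for_socket and the ten second timeout are arbitrary choices for illustration.

```shell
#!/bin/sh
# Wait until $1 exists and is a Unix socket, for at most $2 seconds.
wait_for_socket() {
    waited=0
    while [ "$waited" -lt "$2" ]; do
        [ -S "$1" ] && return 0
        sleep 1
        waited=$((waited + 1))
    done
    # Final check, also covers a timeout of 0 seconds.
    [ -S "$1" ]
}

# Example call (takes up to 10 seconds if the socket never appears):
#   wait_for_socket /var/run/openvswitch/vhost-user-1 10 && echo ready
```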
Note: this section is affected by a bug (LP #1546565 [33]) and the instructions may not work as intended. Current workarounds are listed in the bug report.
To let libvirt/KVM consume this socket and create a guest virtio network device for it, add a snippet like this to your guest definition as the network definition.
<interface type='vhostuser'>
5.7. DPDK in KVM Guests
If you have no access to supported network cards you can still work with DPDK by using its support for virtio. Create guests backed by hugepages (see above).
In addition, the CPU is required to support at least SSE3. The default CPU model qemu/libvirt uses only goes up to SSE2, so you will have to define a model that passes the proper feature flag, and of course have a host system that supports it. An example can be found in the following snippet for your virsh XML (or the equivalent virsh interface you use).
<cpu mode='host-passthrough'>
This example is rather aggressive and passes all host features. That in turn makes the guest hard to migrate, as the target would need all the features as well. A "softer" way is to just add sse3 to the default model.
virtio nowadays also supports multiqueue, which DPDK in turn can exploit for better speed. To give a normal virtio NIC multiple queues, which can later be consumed e.g. by DPDK in the guest, add the following to your interface definition.
<driver name="vhost" queues="4"/>
[33] https://bugs.launchpad.net/ubuntu/+source/dpdk/+bug/1546565
5.8. Support and Troubleshooting
DPDK is a fast-evolving project. When searching for support and further guides, it is highly recommended to first check whether they apply to the current version.
• DPDK Mailing Lists [34]
• For OpenVswitch-DPDK: OpenVswitch Mailing Lists [35]
• Known issues in the DPDK Launchpad Area [36]
• Join the IRC channels #DPDK or #openvswitch on freenode.
5.9. Resources
• DPDK Documentation [37]
• Release Notes matching the version packaged in Ubuntu 16.04 [38]
• Linux DPDK User Getting Started [39]
• DPDK API Documentation [40]
• OpenVswitch DPDK installation [41]
• Wikipedia's definition of DPDK [42]
[34] http://dpdk.org/ml
[35] http://openvswitch.org/mlists
[36] https://bugs.launchpad.net/ubuntu/+source/dpdk
[37] http://dpdk.org/doc
[38] http://dpdk.org/doc/guides/rel_notes/release_2_2.html
[39] http://dpdk.org/doc/guides/linux_gsg/index.html
[40] http://dpdk.org/doc/api/
[41] https://github.com/openvswitch/ovs/blob/branch-2.5/INSTALL.DPDK.md
[42] https://en.wikipedia.org/wiki/Data_Plane_Development_Kit