
HIGH-PERFORMANCE GPU VIDEO ENCODING ABHIJIT PATAIT SR. MANAGER, NVIDIA


ABHIJIT PATAIT

SR. MANAGER, NVIDIA

HIGH-PERFORMANCE GPU

VIDEO ENCODING

(2)

AGENDA

GPU Video Encoding Overview

NVIDIA Video Encoding Capabilities

Kepler, Maxwell Gen 1, Maxwell Gen 2

Software API

Performance & Quality

Roadmap

(3)
(4)

BENEFITS

Low power

Fixed function hardware, free CPU

Reduced memory transfers

Low latency

High performance

Higher density

Scalability

Automatic benefit from improvements in hardware

Ease of programming

(5)
(6)

MAIN FEATURES

Feature                            Benefits
H.264 base, main, high profiles    Wide range of use-cases
H.265/HEVC main profile            Lower bitrates at same quality
High performance (4K @ 60 fps)     “Blazing-speed” encoding
YUV 4:2:0 and 4:4:4 support        High quality encoding without chroma subsampling
QP maps                            Customizable quality, region of interest encoding
4K encoding in hardware            High resolution encode
API - NV Encode SDK & GRID SDK     Flexible, Win/Linux, DirectX/CUDA
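The QP-map feature listed above lets an application bias quality per block, for example to spend more bits on a region of interest. The following is a minimal sketch of building such a map in plain Python. The 16x16 block size, the delta values and the `build_qp_delta_map` helper are illustrative assumptions and are not part of the NVENC API.

```python
def build_qp_delta_map(width, height, roi, roi_delta=-6, background_delta=4, block=16):
    """Return a row-major grid of per-block QP deltas (hypothetical helper).

    roi is (x, y, w, h) in pixels. Blocks that overlap the ROI get roi_delta
    (lower QP, higher quality); all other blocks get background_delta.
    """
    cols = (width + block - 1) // block   # round up to whole blocks
    rows = (height + block - 1) // block
    rx, ry, rw, rh = roi
    qp_map = []
    for by in range(rows):
        row = []
        for bx in range(cols):
            x0, y0 = bx * block, by * block
            # Axis-aligned rectangle overlap test between block and ROI.
            overlaps = (x0 < rx + rw and x0 + block > rx and
                        y0 < ry + rh and y0 + block > ry)
            row.append(roi_delta if overlaps else background_delta)
        qp_map.append(row)
    return qp_map


# Example: a 320x240 region of interest inside a 720p frame.
qp_map = build_qp_delta_map(1280, 720, roi=(480, 200, 320, 240))
print(len(qp_map), "rows x", len(qp_map[0]), "columns of blocks")
```

A real encoder would consume such a grid through its own QP-map interface; the sketch only shows how the per-block values could be derived from a pixel rectangle.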

(7)

FEATURE COMPARISON

Kepler
H.264 only
Planar 4:4:4 & proprietary 4:4:4; no lossless encoding
~240 fps 2-pass encoding @ 720p
GRID K340/K520, K1/K2, Quadro, Tesla K10/K20
GeForce – 2 full-speed encode sessions/system
NV Encode SDK 1.0-5.0 (Now)

Maxwell Gen 1 (GM10x)
H.264 only
Standard 4:4:4 and H.264 lossless encoding
~500 fps 2-pass encoding @ 720p
Maxwell-based GRID & Quadro products
GeForce – 2 full-speed encode sessions/system
NV Encode SDK 4.0+ (Now)

Maxwell Gen 2 (GM20x)
H.264 and HEVC/H.265
Standard 4:4:4 and H.264 lossless encoding
~900 fps 2-pass encoding @ 720p
Products: TBA
GeForce – 2 full-speed encode sessions/system
NV Encode SDK 5.0+ (Now)
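The 720p throughput figures in this comparison can be turned into a rough stream-density estimate. The sketch below assumes a 30 fps target per stream and that throughput divides evenly across concurrent sessions. Both are assumptions, and the estimate ignores product limits such as the GeForce cap of 2 sessions per system.

```python
# Approximate 2-pass 720p encode throughput per generation, from the slide.
ENCODE_FPS_720P = {
    "Kepler": 240,
    "Maxwell Gen 1 (GM10x)": 500,
    "Maxwell Gen 2 (GM20x)": 900,
}


def realtime_streams(encode_fps, stream_fps=30):
    """Whole number of real-time streams that fit in the given throughput.

    stream_fps=30 is an assumed target frame rate, not a figure from the slide.
    """
    return encode_fps // stream_fps


for gpu, fps in ENCODE_FPS_720P.items():
    print(f"{gpu}: about {realtime_streams(fps)} streams at 30 fps")
```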

(8)

WHAT’S NEW – HARDWARE

HEVC

8-bit encoding

Main8 profile

Optimized for low-latency applications (I and P frames)

> 300 fps at very high quality 720p

H.264

Improved performance (~80% higher compared to 1st Gen Maxwell)

(9)

WHAT’S NEW - SOFTWARE

NVENC SDK 5.0

NVIDIA GPU driver 347.18 and above

HEVC

Unified API for H.264 and HEVC

Linux & Windows

Intra refresh, ref-pic invalidation, etc. for H.264 and HEVC

Support for all NVENC hardware up to GM20x

Adaptive quantization

Quality improvements

(10)
(11)

USING NVENC

NVENC SDK: Direct Encode
• No capture
• Transcoding
• Archiving
• Video editing
• CUDA pre-process + encoding
• Granular encoder settings
• D3D, CUDA interop

GRID SDK: Capture + Encode
• Capture + encode
• Optimized for low-latency apps
• Capture + CUDA pre-process + encoding
• Encoder settings optimized for streaming
• D3D, CUDA interop

(12)

DIRECT ENCODE (NVENC SDK)

Diagram components: Client application, NVENC API, NVENC Driver, DirectX Driver, CUDA Driver, NVENC firmware + hardware

Diagram labels: Initialize, Configure HW; HW Encode; Encoded

(13)

CAPTURE & ENCODE (GRID SDK)

Diagram components: Client application, NvFBC/NvIFR, NVENC Driver, DirectX/OGL Driver, NVENC Hardware, GPU 3D Engine

Diagram labels: DX/OGL Present, Capture YUV, Encode, Encoded Bitstream

(14)

NVENC SDK (1/2)

Available on NVIDIA developer zone

https://developer.nvidia.com/nvidia-video-codec-sdk

Current release: 5.0

Interface header, documentation, sample application

.dll/.so included in the driver

Unified API for Windows and Linux

Works on x86/x64

APIs, presets, rate control modes for

Low-latency streaming

Transcoding

(15)

NVENC SDK (2/2)

Unified API for H.264 and HEVC

Flexibility

Dynamic resolution/bitrate change

Low-level encoder settings

Windows, Linux, DirectX, CUDA, OGL (via CUDA)

Works on GeForce (2 sessions/system)

Error concealment

Reference picture invalidation

Intra-refresh

Greater flexibility for quality/performance trade-off

Lossless encoding only in NVENC SDK
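Intra-refresh, listed above under error concealment, spreads intra-coded blocks across several frames instead of sending a full I-frame, which keeps frame sizes even for low-latency streaming. The sketch below shows one common scheme, a vertical band of macroblock columns that sweeps across the picture. The slides do not say how NVENC schedules the refresh, so the band layout, the `intra_refresh_columns` helper and the period of 10 frames are all assumptions.

```python
def intra_refresh_columns(frame_index, mb_columns, refresh_period):
    """Return the macroblock columns to intra-code in this frame.

    The picture is split into refresh_period vertical bands and one band is
    refreshed per frame, so every column is refreshed once per period.
    Hypothetical helper, not an NVENC API call.
    """
    slot = frame_index % refresh_period
    start = slot * mb_columns // refresh_period
    end = (slot + 1) * mb_columns // refresh_period
    return range(start, end)


MB_COLUMNS = 80   # 1280 pixels wide / 16-pixel macroblocks
PERIOD = 10       # assumed refresh period in frames

covered = set()
for n in range(PERIOD):
    band = intra_refresh_columns(n, MB_COLUMNS, PERIOD)
    covered.update(band)
    print(f"frame {n}: columns {band.start}..{band.stop - 1}")

print("all columns refreshed:", len(covered) == MB_COLUMNS)
```

After one full period every column has been intra-coded, so a decoder that lost a frame can recover without requesting a keyframe.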

(16)

GRID SDK ENCODE

NDA only – older release available on NV developer zone

https://developer.nvidia.com/grid-app-game-streaming

Current release: 3.1 (Now – NDA), 2.3 (Public)

Interface header, documentation, sample apps

.dll/.so included in the driver

Windows and Linux

Works on x86/x64

Presets and APIs for

Remote graphics (Cloud gaming, remote desktop, capture & stream)

Optimized for low latency

(17)
(18)

H.264 QUALITY – 1-PASS ENCODING

Chart: H.264 quality with 1-pass rate control. PSNR (dB, axis 34 to 46) versus bitrate (Mbps, 6 to 20) for the presets Default, LL-Default, HP, HQ, BD and LL-HQ.
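The quality charts report PSNR in dB. As a reminder of what that number measures, here is a minimal sketch of the standard PSNR formula for 8-bit samples. The slides do not state which planes or test clips were measured, so the sample values below are made up purely for illustration.

```python
import math


def psnr(reference, encoded, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists."""
    if len(reference) != len(encoded) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - e) ** 2 for r, e in zip(reference, encoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10 * math.log10(peak * peak / mse)


# Illustrative 8-bit luma samples (not data from the talk).
original = [100, 120, 140, 160]
decoded = [101, 119, 142, 158]
print(f"PSNR: {psnr(original, decoded):.2f} dB")
```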

(19)

H.264 QUALITY – 2-PASS ENCODING

Chart: H.264 quality with 2-pass rate control. PSNR (dB, axis 34 to 46) versus bitrate (Mbps, 6 to 20) for the presets Default, LL-Default, HP, HQ, BD and LL-HQ.

(20)

COMPARISON: 1-PASS VS 2-PASS

Chart: H.264 quality comparison: 1-pass vs 2-pass. PSNR (dB, axis 37.5 to 40.5) per encoder preset (Default, LL-Default, HP, HQ, BD, LL-HQ), with one 1-pass bar and one 2-pass bar for each preset.

(21)

BITRATE SAVINGS

Bitrate savings - Default preset

PSNR       H.264 bitrate   HEVC bitrate   Bitrate savings
39.5 dB    6               4              33%
41.0 dB    8               6              25%
42.0 dB    9.8             8              18%

Bitrate savings - HQ preset

PSNR       H.264 bitrate   HEVC bitrate   Bitrate savings
39.5 dB    5.8             3.9            33%
41.0 dB    7.8             5.8            26%
42.0 dB    9.7             7.9            19%
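The savings percentages on this slide follow from the bitrate pairs at equal PSNR. The sketch below recomputes them, reading the larger value in each pair as the H.264 bitrate and the smaller as the HEVC bitrate. That pairing is an interpretation of the flattened chart data, though it reproduces the slide's percentages.

```python
PSNR_DB = [39.5, 41.0, 42.0]

# Bitrates at equal PSNR, as read from the slide (units as on the slide).
H264 = {"Default": [6, 8, 9.8], "HQ": [5.8, 7.8, 9.7]}
HEVC = {"Default": [4, 6, 8], "HQ": [3.9, 5.8, 7.9]}


def savings_percent(h264_rate, hevc_rate):
    """Bitrate saved by HEVC relative to H.264 at the same quality."""
    return 100 * (h264_rate - hevc_rate) / h264_rate


for preset in H264:
    for db, h264_rate, hevc_rate in zip(PSNR_DB, H264[preset], HEVC[preset]):
        pct = savings_percent(h264_rate, hevc_rate)
        print(f"{preset} preset @ {db} dB: {pct:.0f}% lower bitrate with HEVC")
```

The savings shrink as the quality target rises, from about a third at 39.5 dB to under a fifth at 42.0 dB.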

(22)

H.264 VS HEVC

(23)

H.264 VS HEVC

(24)
(25)

H.264 PERFORMANCE – GM20X

H.264 Performance (1080p), GM20x: encode FPS

Preset    Single pass   Two pass
HP        464 fps       306 fps
LL-HP     342 fps       246 fps
HQ        291 fps       171 fps
LL-HQ     293 fps       181 fps
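The single-pass and two-pass figures on this slide make the throughput cost of 2-pass rate control easy to quantify. A short sketch using only the slide's numbers:

```python
# 1080p H.264 encode throughput on GM20x, in fps, from the slide.
SINGLE_PASS = {"HP": 464, "LL-HP": 342, "HQ": 291, "LL-HQ": 293}
TWO_PASS = {"HP": 306, "LL-HP": 246, "HQ": 171, "LL-HQ": 181}


def throughput_drop_percent(single_fps, two_fps):
    """Percentage of throughput lost when switching from 1-pass to 2-pass."""
    return 100 * (single_fps - two_fps) / single_fps


for preset in SINGLE_PASS:
    drop = throughput_drop_percent(SINGLE_PASS[preset], TWO_PASS[preset])
    print(f"{preset}: 2-pass is {drop:.0f}% slower than 1-pass")
```

On these figures, two-pass encoding costs roughly 28% to 41% of throughput depending on the preset.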

(26)

H.264/HEVC PERF COMPARISON

H.264/HEVC Performance: 2-pass, encode FPS

Preset    H.264      H.265
HP        306 fps    220 fps
LL-HP     246 fps    214 fps
HQ        171 fps    102 fps
LL-HQ     181 fps    153 fps
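For low-latency use it can help to view these throughput figures as average time per frame. The sketch below converts the slide's 2-pass fps values to milliseconds and compares them with a 60 fps budget. The 60 fps target is an assumption, and average time per frame derived from throughput is not the same as end-to-end encode latency.

```python
# 2-pass encode throughput in fps, from the slide.
TWO_PASS_FPS = {
    "H.264": {"HP": 306, "LL-HP": 246, "HQ": 171, "LL-HQ": 181},
    "H.265": {"HP": 220, "LL-HP": 214, "HQ": 102, "LL-HQ": 153},
}


def frame_time_ms(fps):
    """Average time spent per frame at the given throughput."""
    return 1000.0 / fps


BUDGET_MS = frame_time_ms(60)  # assumed real-time target of 60 fps

for codec, presets in TWO_PASS_FPS.items():
    for preset, fps in presets.items():
        t = frame_time_ms(fps)
        print(f"{codec} {preset}: {t:.2f} ms/frame, "
              f"{BUDGET_MS / t:.1f}x faster than a 60 fps budget")
```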

(27)

PERFORMANCE - TREND

Chart: encode performance trend by GPU generation, Kepler (2011), Maxwell Gen 1 (2013), Maxwell Gen 2 (2014) and Future, on an axis running up to 1800 fps.

(28)
(29)

ROADMAP

Core GPU chip IP

Motion estimation only mode – 2H2015

SAO, 10/12-bit, HEVC B-frames

Lossless/4:4:4

Improved quality for screen content encoding

ME performance and quality enhancements

Today: 4K@60fps
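To put 4K@60fps and the roadmap's higher bit depths and 4:4:4 sampling in perspective, the sketch below estimates the uncompressed input rate the encoder has to consume. It assumes 4K means 3840x2160, which the slide does not specify, and the `raw_rate_mbps` helper is illustrative.

```python
# Average samples per pixel for common chroma formats (luma + two chroma planes).
SAMPLES_PER_PIXEL = {"4:2:0": 1.5, "4:2:2": 2.0, "4:4:4": 3.0}


def raw_rate_mbps(width, height, fps, bit_depth=8, chroma="4:2:0"):
    """Uncompressed video data rate in megabits per second."""
    bits_per_frame = width * height * SAMPLES_PER_PIXEL[chroma] * bit_depth
    return bits_per_frame * fps / 1e6


# Assumed 4K resolution of 3840x2160.
print(f"8-bit 4:2:0:  {raw_rate_mbps(3840, 2160, 60):.0f} Mbps")
print(f"10-bit 4:4:4: {raw_rate_mbps(3840, 2160, 60, bit_depth=10, chroma='4:4:4'):.0f} Mbps")
```

Moving from 8-bit 4:2:0 to 10-bit 4:4:4 multiplies the raw input rate by 2.5, which is one reason such modes place extra demands on encoder hardware.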

(30)
