Multimedia Communications

Academic year: 2021

Aljoscha Smolic Multimedia Communications

FOR CLASS USE ONLY, DO NOT DISTRIBUTE

Multimedia Communications

Dr.-Ing. Aljoscha Smolic


MMC Overview

1. Introduction

2. Fundamentals (Signal Processing, Information Theory)

3. Speech Processing & Coding

4. Audio Processing & Coding

5. Still Image Coding (JPEG, etc.)

6. Video Coding (MPEG, etc.)

7. MPEG-4 Multimedia Framework, MPEG-7

8. 3D Video and Free Viewpoint Video

MPEG-4/7 Overview

MPEG-4 Face Animation

Layered Video Coding

MPEG-4 Multimedia Framework

3D Mesh Compression


Materials

Introduction to MPEG-7: Multimedia Content Description Interface, B. S. Manjunath (Editor), Philippe Salembier (Editor), Thomas Sikora (Editor), ISBN: 978-0-471-48678-7, Hardcover, 396 pages, April 2002.

The MPEG-4 Book, Fernando C. N. Pereira, Touradj Ebrahimi, ISBN-10: 0130616214, ISBN-13: 9780130616210, Publisher: Prentice Hall, Copyright: 2003, Format: Paper; 896 pp.

J.-R. Ohm, Multimedia Communication Technology, Springer-Verlag

Materials

http://www.chiariglione.org/mpeg/standards/mpeg‐7/mpeg‐7.htm

http://www.chiariglione.org/mpeg/tutorials/papers/IEEEMM_mp7

overview_withcopyrigth.pdf

http://www.chiariglione.org/mpeg/standards/mpeg‐4/mpeg‐4.htm


MPEG-4 Face Animation

Face & Body Animation


Head Model Adaptation

Generic head model

Adapted head model

Adaptation

Courtesy of Prof. Peter Eisert, HU Berlin, Fraunhofer HHI

MPEG-4 FDP, FAP

FDP: Face Definition Parameters

FAP: Face Animation Parameters

Peter_Vektor


MPEG-4 FDP, FAP

Facial feature tracking

Courtesy of Prof. Peter Eisert, HU Berlin, Fraunhofer HHI

Model-based Video Coding


Model-Based Coding

1 kbit/s !!!

Courtesy of Prof. Peter Eisert, HU Berlin, Fraunhofer HHI

Model-Based Coding

1 kbit/s !!!
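To put the 1 kbit/s figure in perspective, the arithmetic below compares the per-frame bit budget with the size of an uncompressed frame. The frame rate (25 Hz) and the CIF 4:2:0 frame format are assumptions made for illustration only; the slides do not state them.

```python
# Assumed values for illustration only (not given on the slide).
FRAME_RATE_HZ = 25            # assumed frame rate
CIF_WIDTH, CIF_HEIGHT = 352, 288
BITS_PER_PIXEL_420 = 12       # 8-bit 4:2:0 sampling: 1.5 bytes per pixel


def bits_per_frame(bitrate_bps: float, frame_rate_hz: float) -> float:
    """Average number of bits available for each frame."""
    return bitrate_bps / frame_rate_hz


def raw_frame_bits(width: int, height: int, bits_per_pixel: int) -> int:
    """Size of one uncompressed frame in bits."""
    return width * height * bits_per_pixel


budget = bits_per_frame(1000, FRAME_RATE_HZ)
raw = raw_frame_bits(CIF_WIDTH, CIF_HEIGHT, BITS_PER_PIXEL_420)
print(f"budget per frame: {budget:.0f} bits")
print(f"raw CIF frame:    {raw} bits")
print(f"ratio:            {raw / budget:.0f} : 1")
```

Under these assumptions only a few tens of bits are available per frame, which is why model-based coding transmits a small set of animation parameters rather than pixel data.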


Model-Based Coding

1 kbit/s !!!

Courtesy of Prof. Peter Eisert, HU Berlin, Fraunhofer HHI

Character Animation


Character Animation

Courtesy of Prof. Peter Eisert, HU Berlin, Fraunhofer HHI

Character Animation


Character Animation

Courtesy of Prof. Peter Eisert, HU Berlin, Fraunhofer HHI

Text-Driven Animation

Hello Peter, how are you doing today?


A E O Z B

Voice signal

Phoneme

Viseme

[a:], [æ]

Voice controlled lip movement

Courtesy of Prof. Peter Eisert, HU Berlin, Fraunhofer HHI

Emotions by high level feature control
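The phoneme-to-viseme step for voice-controlled lip movement can be sketched as a lookup table. The table below is hypothetical: the slide only shows the viseme labels A, E, O, Z, B and the phonemes [a:] and [æ] sharing one mouth shape; all other entries and the "neutral" fallback are assumptions for illustration.

```python
# Hypothetical phoneme-to-viseme table (several phonemes share one mouth shape).
PHONEME_TO_VISEME = {
    "a:": "A", "ae": "A",          # [a:] and [æ] look alike on the lips
    "e": "E",
    "o": "O",
    "z": "Z", "s": "Z",
    "b": "B", "p": "B", "m": "B",  # closed-lip sounds
}


def visemes_for(phonemes, neutral="neutral"):
    """Map a phoneme sequence to visemes, merging consecutive repeats."""
    result = []
    for phoneme in phonemes:
        viseme = PHONEME_TO_VISEME.get(phoneme, neutral)
        if not result or result[-1] != viseme:
            result.append(viseme)
    return result


print(visemes_for(["b", "a:", "ae", "z"]))
```

Merging repeats reflects that the mouth shape does not change between two phonemes of the same viseme class; a real system would also carry timing information from the voice signal.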


SMS to Video

• SMS is sent to the provider

• User chooses a character (real person or cartoon-like)

• MMS (video animation) is generated and sent to the receiver

• There, SMS is read by the chosen character

Courtesy of Prof. Peter Eisert, HU Berlin, Fraunhofer HHI

User Selection

• Different characters can be selected

• Additional variations with emoticons: :) :(, …

• Characters created from single image


Animated Text Messages

Courtesy of Prof. Peter Eisert, HU Berlin, Fraunhofer HHI

Text-Driven Animation

The Borg have assimilated many species

with many mythologies to explain such moments of clarity.

I have always dismissed them as trivial.

Perhaps I was wrong


SMS-2-Video Service

Input:

Character:

Language: US English, GB English, German, French; others are possible as well.

Text: “Hello, how are you doing today? You have received 5 emails and 2 phone calls. Would you like me to read your emails for you?”

Emotion:

Text-To-Speech Server, Animation Server

video @ < 100 kbit/s

http://www.hhi.fraunhofer.de/en/departments/image-processing/applications/text2video-conversion/

Courtesy of Prof. Peter Eisert, HU Berlin, Fraunhofer HHI

Text-Driven Animation


Layered Video Coding

Layered Coding

Goal:

Better image quality compared to standard block-based MC/DCT

Method:


Layered Coding

Shoulder region: 2-D sprite coding

Facial region: 3-D wire-grid coding

Background: static or 2-D sprite coding

Model failure region: standard coding (MPEG-4 VOP)
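The region-to-coder assignment of the layered coder can be written down as a simple dispatch table. This is only a sketch: the region label strings and the per-pixel label mask are hypothetical conventions chosen for illustration, not part of the scheme shown on the slide.

```python
from collections import Counter

# Coder assigned to each region type, as listed on the slide.
REGION_CODER = {
    "shoulder": "2-D sprite coding",
    "facial": "3-D wire-grid coding",
    "background": "static or 2-D sprite coding",
    "model_failure": "standard coding (MPEG-4 VOP)",
}


def summarize_mask(mask):
    """Return {region: (coder, pixel_count)} for a 2-D mask of region labels."""
    counts = Counter(label for row in mask for label in row)
    return {region: (REGION_CODER[region], n) for region, n in counts.items()}


# Tiny hypothetical 2x3 region mask.
mask = [
    ["background", "facial", "facial"],
    ["shoulder", "shoulder", "model_failure"],
]
for region, (coder, pixels) in summarize_mask(mask).items():
    print(f"{region:14s} {pixels} px -> {coder}")
```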


3-D Coding of Facial Region

Sprite Coding of 2-D Regions


Region Mask for Sequence “Foreman”

2nd original frame

Region mask

2nd Reconstructed Frame of Sequence “Foreman”


9th Reconstructed Frame of Sequence “Foreman”

Layered coder

H.263

H.263

Layered coder

Frame Difference of 9th Frame of Sequence “Foreman”


MPEG-4 Multimedia Framework


MPEG-4 Scene Composition

MPEG-4 Concept

• Not only coding of media

• Definition of an audio-visual scene (2D/3D), e.g. distribution of AV-objects in a virtual 3D room

• AV-scene consists of audio, video and synthetic objects => the scene is composed

• Described in a specific script language (BInary Format for Scenes, BIFS, superset of VRML)


MPEG-4 Multimedia Standard

Audio-visual Scene, consists of

Audio

Video (arbitrary shape)

Still images

2D/3D computer graphics

Text

Interaction mechanisms

AV-Scenes are composed and rendered

Scene Description:

BIFS (Binary Format for Scenes)

MPEG-4 Scene

BIFS Scene graph
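The idea of a scene graph can be illustrated with a small tree structure. This is a minimal sketch, not a BIFS parser or encoder: the `Node` class and `walk` helper are hypothetical, and only the node type names (Group, Viewpoint, Transform, Shape, ImageTexture) are borrowed from VRML/BIFS.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Tuple


@dataclass
class Node:
    """One scene graph node: a type name, its fields, and child nodes."""
    kind: str
    fields: Dict[str, object] = field(default_factory=dict)
    children: List["Node"] = field(default_factory=list)


def walk(node: Node, depth: int = 0) -> Iterator[Tuple[int, str]]:
    """Depth-first traversal, as a renderer would visit the scene."""
    yield depth, node.kind
    for child in node.children:
        yield from walk(child, depth + 1)


# Hypothetical scene: a viewpoint plus one textured shape moved by a transform.
scene = Node("Group", children=[
    Node("Viewpoint", {"position": (0, 0, 0)}),
    Node("Transform", {"translation": (0, -5, 0)}, [
        Node("Shape", children=[Node("ImageTexture")]),
    ]),
])

for depth, kind in walk(scene):
    print("  " * depth + kind)
```

Composition then amounts to traversing the tree and applying each Transform to everything below it.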


MPEG-4 BIFS

BIFSConfig {
  nodeIDbits 10
  routeIDbits 10
  protoIDbits 10
  isCommandStream TRUE
  pixelMetric FALSE
  hasSize FALSE
  pixelWidth 0
  pixelHeight 0
}
ObjectDescriptor {
  objectDescriptorID 100
  streamType JPEG
  fileName "images/auckland_0.jpg"
}
ObjectDescriptor {
  objectDescriptorID 101
  streamType JPEG
  fileName "images/auckland_1.jpg"
}
ObjectDescriptor {
  objectDescriptorID 102
  streamType JPEG
  fileName "images/auckland_2.jpg"
}
...

MPEG-4 BIFS

Group {
  children [
    NavigationInfo {
      type [ "ROTATE", "WALK", "EXAMINE", "ANY" ]
      headlight FALSE
    } #NavigationInfo
    Viewpoint {
      fieldOfView 0.33
      position 0 0 0
      orientation 0 1 0 0
    } #Viewpoint
    DEF Cyl Transform {
      translation 0 -5 0
      children [
        DEF Sw0 Switch {
          whichChoice -1
          choice [
            Shape {
              appearance Appearance {
                texture ImageTexture


• Superset of VRML

• Text or binary (compressed)

• Streamable, updatable, timing model

• Includes audio and video

• Face and body animation

• Etc.

MPEG-4 BIFS


Omni-directional Video


Omni-directional camera shown at IBC 2007

Resolution of 5000 x 2000 pixels with 5 HD cameras

Ralf Schäfer

Scene Creation and Video Tiling

• Cylindrical or spherical geometry approximated by planar tiles

• Tile size and number depend on rendering viewpoint and graphics hardware


Visibility Sensors

visibility sensor

Pre-fetching of neighbouring patches

Unloading of patches not contributing to screen view

current view on screen

• Usage of a head tracker allows comfortable navigation

• A head-mounted display can also be used for scene visualization

Immersive Omni-directional Video with HMD
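The visibility-sensor behaviour (show visible patches, pre-fetch neighbours, unload the rest) can be sketched for a cylindrical panorama cut into equal-width tiles. The function names, the tile indexing, and the one-tile pre-fetch margin are assumptions for illustration; the slides do not specify how patches are selected.

```python
import math


def visible_tiles(yaw_deg, fov_deg, n_tiles):
    """Indices of tiles overlapping the horizontal field of view."""
    tile_width = 360.0 / n_tiles
    first = math.floor((yaw_deg - fov_deg / 2) / tile_width)
    last = math.floor((yaw_deg + fov_deg / 2) / tile_width)
    tiles = []
    for i in range(first, last + 1):
        index = i % n_tiles          # wrap around the cylinder
        if index not in tiles:
            tiles.append(index)
    return tiles


def plan(yaw_deg, fov_deg, n_tiles, loaded):
    """Return (visible, prefetch, unload) tile lists for the current view."""
    visible = visible_tiles(yaw_deg, fov_deg, n_tiles)
    neighbours = {(t + d) % n_tiles for t in visible for d in (-1, 1)}
    prefetch = sorted(neighbours - set(visible))
    unload = sorted(set(loaded) - set(visible) - set(prefetch))
    return visible, prefetch, unload


# 12 tiles of 30 degrees each, viewer looking at yaw 15 with a 60 degree FOV.
print(plan(15, 60, 12, loaded={0, 5, 11}))
```

Pre-fetching the neighbours hides loading latency when the viewer turns, while unloading keeps memory use bounded by the view size rather than the panorama size.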


3D Mesh Compression

Humanoid sequence provided by Vrije Universiteit Brussel (VUB) consisting of 117 keyframes in different resolutions


Dynamic 3D Mesh Compression

[Block diagram: encoder with Intra/Inter Switch, Octree Clustering, Scaling/Quantization, Arithmetic Coding, Memory, Reconstruction/Inverse Scaling, Octree Reconstruction and MPEG-4 3DMC; signals $m(t)$, $o(t)$, $d(t)$, $y(t)$ and reconstructions $\hat{o}(t)$, $\hat{d}(t)$, $\hat{m}(t)$, $\hat{m}(t-1)$]
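The inter-frame branch of such a coder is a closed prediction loop: the encoder predicts the current mesh from the previously reconstructed mesh in memory, quantizes the difference, and updates the memory with the same reconstruction the decoder will compute. The sketch below shows only this loop with a uniform scalar quantizer; octree clustering, arithmetic coding and the MPEG-4 3DMC intra path are omitted, and reading m(t) as the input mesh and d(t) as the residual is an interpretation of the diagram labels.

```python
def encode(frames, step):
    """Quantize frame-to-frame differences of vertex coordinates."""
    memory = [0.0] * len(frames[0])      # previous reconstruction, m^(t-1)
    symbols = []
    for frame in frames:                 # frame = m(t), flat coordinate list
        residual = [m - r for m, r in zip(frame, memory)]       # d(t)
        q = [round(d / step) for d in residual]                 # scaling/quantization
        memory = [r + s * step for r, s in zip(memory, q)]      # m^(t)
        symbols.append(q)
    return symbols


def decode(symbols, step):
    """Rebuild the meshes by accumulating the dequantized differences."""
    memory = [0.0] * len(symbols[0])
    frames = []
    for q in symbols:
        memory = [r + s * step for r, s in zip(memory, q)]
        frames.append(list(memory))
    return frames


# Hypothetical 3-frame sequence with three vertex coordinates per frame.
frames = [[0.0, 1.0, 2.0], [0.1, 1.2, 2.05], [0.3, 1.1, 2.2]]
symbols = encode(frames, step=0.1)
print(symbols)
print(decode(symbols, step=0.1))
```

Because the encoder predicts from its own reconstruction rather than from the original, quantization errors do not accumulate: each decoded coordinate stays within half a quantizer step of the input.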

Evaluation – Humanoid, 1940 vertices

Original

AFX-IC: 60.1 kbit/s

D3DMC: 62.7 kbit/s


Chicken Crossing

915 kbit/s

507 kbit/s

original

400 time-consistent meshes with 3030 vertices

Evaluation – Chicken, 3030 vertices


MPEG-7

Tremendous amount of multimedia is available and growing

Search for content gets more and more difficult

Automatic tools to assist search are necessary, search engines for the Internet

Metadata are tagged to multimedia data for content description and classification


MPEG-7

Simplest form: descriptive text

Manual generation

Often this is produced during production anyway (playlist, story board, cast list, scripts, etc.)

Has to be associated with the data in a standardized way

Metadata Tagging

Martina Schmidt, correspondence chess opponent, [email protected]; English Channel, 23.10.2001; temperature 13°C; viewpoint, 162 m, formed from chalk during the Ice Age, ca. 10000 BCE


MPEG-7

Automatic extraction of signal-based features

Visual: color, shape, texture, motion, etc.

Audio: harmony, melody, frequency features, etc.

Features are captured in compact form (few bits) in so-called "Descriptors"

Groups of "Descriptors" can be combined to "Description Schemes"

MPEG-7 System

[Diagram labels: MPEG-7 content description; Description Scheme 1, Description Scheme 2; Descriptor 1, Descriptor 2, Descriptor 3; descriptive parameters; feature extraction; similarity analysis; application]


Visual Descriptors


Query by Example, Color

Example image

Retrieved images

Database with 5000 images
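Query by example with a color feature can be sketched as: extract a compact color histogram from each image, then rank database images by their distance to the example's histogram. This is a simplified stand-in and not one of the standardized MPEG-7 color descriptors; the bin count, the L1 distance, and representing an image as a non-empty list of (r, g, b) tuples are all assumptions for illustration.

```python
def color_histogram(pixels, bins_per_channel=2):
    """Normalized RGB histogram of a non-empty list of (r, g, b) pixels."""
    n = bins_per_channel
    hist = [0.0] * (n ** 3)
    for r, g, b in pixels:
        index = (r * n // 256) * n * n + (g * n // 256) * n + (b * n // 256)
        hist[index] += 1
    return [h / len(pixels) for h in hist]


def l1_distance(h1, h2):
    """Sum of absolute bin differences (0 = identical color distribution)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def query_by_example(example, database, top_k=3):
    """Names of the top_k database images most similar in color."""
    query = color_histogram(example)
    ranked = sorted(
        database,
        key=lambda name: l1_distance(query, color_histogram(database[name])),
    )
    return ranked[:top_k]


# Hypothetical tiny database of 4-pixel images.
database = {
    "blue": [(10, 10, 250)] * 4,
    "reddish": [(200, 30, 30)] * 3 + [(10, 10, 250)],
    "red": [(250, 10, 10)] * 4,
}
print(query_by_example([(255, 0, 0)] * 4, database))
```

In practice the histograms would be extracted once and stored as metadata, so a query only compares compact descriptors instead of decoding every image.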


Visualization of TV Channels by Color


Query by Example, Texture


Region Shape Descriptor


Parametric 2D Motion Models

Name | Parameter set | Transformation

Translation | $a_1, b_1$ | $x_1 = a_1 + x_0$, $y_1 = b_1 + y_0$

4-parameter | $a_1, a_2, a_3, b_1$ | $x_1 = a_1 + a_2 x_0 + a_3 y_0$, $y_1 = b_1 - a_3 x_0 + a_2 y_0$

Affine | $a_1, a_2, a_3, b_1, b_2, b_3$ | $x_1 = a_1 + a_2 x_0 + a_3 y_0$, $y_1 = b_1 + b_2 x_0 + b_3 y_0$

Perspective | $a_1, a_2, a_3, b_1, b_2, b_3, c_1, c_2$ | $x_1 = \frac{a_1 + a_2 x_0 + a_3 y_0}{1 + c_1 x_0 + c_2 y_0}$, $y_1 = \frac{b_1 + b_2 x_0 + b_3 y_0}{1 + c_1 x_0 + c_2 y_0}$

Parabolic | $a_1, \dots, a_6, b_1, \dots, b_6$ | $x_1 = a_1 + a_2 x_0 + a_3 y_0 + a_4 x_0^2 + a_5 y_0^2 + a_6 x_0 y_0$, $y_1 = b_1 + b_2 x_0 + b_3 y_0 + b_4 x_0^2 + b_5 y_0^2 + b_6 x_0 y_0$
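The five models can be compared by applying them to a point. The sketch below maps $(x_0, y_0)$ to $(x_1, y_1)$, with the affine model written as $x_1 = a_1 + a_2 x_0 + a_3 y_0$, $y_1 = b_1 + b_2 x_0 + b_3 y_0$ and the other models as restrictions or extensions of it. The function name, the model name strings, the parameter ordering in the tuples, and the sign convention of the 4-parameter model are assumptions for illustration.

```python
def warp(model, p, x0, y0):
    """Map point (x0, y0) to (x1, y1) with the named parametric motion model."""
    if model == "translation":
        a1, b1 = p
        return a1 + x0, b1 + y0
    if model == "four_parameter":
        a1, a2, a3, b1 = p
        # Assumed sign convention: a2, a3 are shared between both rows.
        return a1 + a2 * x0 + a3 * y0, b1 - a3 * x0 + a2 * y0
    if model == "affine":
        a1, a2, a3, b1, b2, b3 = p
        return a1 + a2 * x0 + a3 * y0, b1 + b2 * x0 + b3 * y0
    if model == "perspective":
        a1, a2, a3, b1, b2, b3, c1, c2 = p
        w = 1 + c1 * x0 + c2 * y0            # common denominator
        return (a1 + a2 * x0 + a3 * y0) / w, (b1 + b2 * x0 + b3 * y0) / w
    if model == "parabolic":
        a1, a2, a3, a4, a5, a6, b1, b2, b3, b4, b5, b6 = p
        x1 = a1 + a2 * x0 + a3 * y0 + a4 * x0 ** 2 + a5 * y0 ** 2 + a6 * x0 * y0
        y1 = b1 + b2 * x0 + b3 * y0 + b4 * x0 ** 2 + b5 * y0 ** 2 + b6 * x0 * y0
        return x1, y1
    raise ValueError(f"unknown model: {model}")


print(warp("translation", (5, -3), 10, 20))
print(warp("affine", (0, 1.1, 0, 0, 0, 1.1), 10, 20))  # uniform scaling
```

With $c_1 = c_2 = 0$ the perspective model reduces to the affine one, and with $a_2 = b_3 = 1$ and the remaining non-translation parameters zero the affine model reduces to pure translation.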

Influence of Different Parameters

[Figure: grid deformation panels showing the influence of $a_1$ (x-translation), $a_2$ (x-scaling) and $b_3$ (y-scaling); axes x and y, each from -150 to 150]


MPEG-7: Parametric Motion Descriptor

Search for different types of global motion by estimation and evaluation of motion parameters

E.g. "translation to the left" ($a_1$), "zoom out" ($a_2$, $b_3$)
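Evaluating the motion parameters for such a search can be sketched as simple threshold tests on $a_1$ (x-translation) and on $a_2$, $b_3$ (x- and y-scaling). The thresholds, the label strings, and the sign convention (negative $a_1$ means content moves left, scale factors below 1 mean zoom out) are assumptions for illustration; the actual convention depends on the direction in which the motion model maps between frames.

```python
def describe_global_motion(affine, shift_px=1.0, zoom_tol=0.02):
    """Label the dominant global motion of an affine parameter set.

    affine = (a1, a2, a3, b1, b2, b3) with
    x1 = a1 + a2*x0 + a3*y0 and y1 = b1 + b2*x0 + b3*y0.
    """
    a1, a2, _a3, _b1, _b2, b3 = affine
    labels = []
    if a1 <= -shift_px:
        labels.append("translation to the left")
    elif a1 >= shift_px:
        labels.append("translation to the right")
    if a2 < 1 - zoom_tol and b3 < 1 - zoom_tol:
        labels.append("zoom out")
    elif a2 > 1 + zoom_tol and b3 > 1 + zoom_tol:
        labels.append("zoom in")
    return labels or ["no dominant global motion"]


print(describe_global_motion((-4.0, 1.0, 0.0, 0.0, 0.0, 1.0)))
print(describe_global_motion((0.0, 0.9, 0.0, 0.0, 0.0, 0.9)))
```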

System for Motion-Based Retrieval

Determination of shot boundaries (SG), and key frames (KF) as visual representatives

Estimation of average global motion over a shot

Storage of extracted metadata with the video in a database

[Diagram: video signal segmented at shot boundaries (SG); each shot yields a key frame (KF) and a Parametric Motion Descriptor (PMD), stored in an MPEG-7 database]


Selection of database

Selection of processing modes

Results

Selected

example

Video Search Engine

MPEG-7 Audio

Melody DS, captures characteristics of melodies in compact form

Allows search by melody:

• http://www.musicline.de/de/melodiesuche
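One way to capture a melody in compact form is to keep only the direction of each pitch step. The sketch below uses a three-symbol contour (U = up, D = down, R = repeat) and substring matching; this is a simplified stand-in chosen for illustration, not the MPEG-7 Melody DS itself, and the MIDI note numbers and song names are hypothetical.

```python
def contour(midi_pitches):
    """Encode a melody as U/D/R steps between consecutive notes."""
    steps = []
    for previous, current in zip(midi_pitches, midi_pitches[1:]):
        if current > previous:
            steps.append("U")
        elif current < previous:
            steps.append("D")
        else:
            steps.append("R")
    return "".join(steps)


def search_by_melody(query_pitches, database):
    """Names of melodies whose contour contains the query contour."""
    query = contour(query_pitches)
    return sorted(
        name for name, pitches in database.items() if query in contour(pitches)
    )


# Hypothetical database with two short melodies (MIDI note numbers).
database = {
    "tune_a": [60, 60, 67, 67, 69, 69, 67],
    "tune_b": [64, 62, 60, 62, 64, 64, 64],
}
print(contour(database["tune_a"]))
print(search_by_melody([50, 50, 57, 57], database))
```

Because only the direction of each step is kept, a query hummed in a different key or slightly out of tune still matches, at the cost of more false positives for short queries.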

Extractor of MPEG-7 descriptors:
