DS504/CS586: Big Data Analytics Deep Generative models (I) --Generative Adversarial Networks (GANs) Prof. Yanhua Li

(1)

Welcome to DS504/CS586: Big Data Analytics

Deep Generative models (I): Generative Adversarial Networks (GANs)

Prof. Yanhua Li

Time: 6:00pm–8:50pm R, Zoom Lecture

(2)

Next Week Arrangement

No class next Thursday

https://www.wpi.edu/sites/default/files/2019/03/21/GR_2019-20.pdf

(3)


Project 2

Deadline: extended to next Thursday 3/18/2021.

Leaderboard by now:

https://github.com/yanhuata/DS504CS586-S21/tree/master/project2

Office hours next week.

1. Monday 10-11AM Prof Li

2. Monday 2-3PM TA Guojun (extra office hour)

3. Wednesday 2-3PM TA Guojun

(4)


Team project #5

Team list: available on Canvas

• 6 teams (now)
• Starts today
• Week 9 (3/25 R): proposal due
• Week 13 (4/22 R): progress report due
• Week 15 (5/4 T): project due (upload it to Canvas)
• https://github.com/yanhuata/DS504CS586-S21/tree/master/project5

• Open project: you define it with your teammates.
• We provided a few examples from AI conference competitions (see the above link).
• Discuss the idea with your teammates, and submit a project proposal by 3/25.
• We recommend scheduling an appointment with Prof Li to discuss your idea.

(5)

Urban Sensing & Data Acquisition

Participatory Sensing, Crowd Sensing, Mobile Sensing

Traffic, Road Networks, POIs, Air Quality, Human Mobility, Meteorology, Social Media, Energy

Urban Data Management

Spatio-temporal index, streaming, trajectory, and graph data management,...

Urban Data Analytics

Data Mining, Machine Learning, Visualization

Service Providing

Improve urban planning, Ease Traffic Congestion, Save Energy, Reduce Air Pollution, ...

Urban Computing: concepts, methodologies, and applications.

Zheng, Y., et al. ACM transactions on Intelligent Systems and Technology.

Deep Learning

Generative adversarial Networks (GANs)

Flow based Generative models

Meta-Learning

Adversarial attacks

Explainable AI (XAI)

(6)

Deep Generative Models

(7)

Introduction to Generative Adversarial Networks (GANs)

(8)

Mihaela Rosca, Balaji Lakshminarayanan, David Warde-Farley, Shakir Mohamed, “Variational Approaches for Auto-Encoding Generative Adversarial Networks”, arXiv, 2017

All Kinds of GAN …

https://github.com/hindupuravinash/the-gan-zoo

GAN ACGAN BGAN DCGAN EBGAN fGAN GoGAN CGAN

……

(9)

ICASSP

[Figure: number of ICASSP papers per year, 2012 to 2018, whose titles include the keywords "generative", "adversarial", or "reinforcement" (y-axis from 0 to 45). Keyword search on session index page, so session names are included.]

GAN has become a very important technology.

(10)

Outline

Basic Idea of GAN

GAN as structured learning

Can Generator learn by itself?

Can Discriminator generate?

A little bit of theory

(11)

Generation

Image Generation: vector (in a specific range) → NN Generator → image

Sentence Generation: vector (in a specific range) → NN Generator → sentence, e.g. "Good morning.", "How are you?", "Good afternoon."

We will control what to generate later. → Conditional Generation

(12)

Basic Idea of GAN

Generator

It is a neural network (NN), or a function.

[Figure: vector → Generator → image (a high-dimensional vector). Changing one dimension of the input vector changes one characteristic of the generated image, e.g. longer hair.]

Powered by: http://mattya.github.io/chainer-DCGAN/

Each dimension of the input vector represents some characteristic.

(13)

Basic Idea of GAN

Discriminator

It is a neural network (NN), or a function.

image → Discriminator → scalar

Larger value means real, smaller value means fake.

[Figure: example images with Discriminator scores 1.0, 0.1, and 0.1.]

(14)

Basic Idea of GAN

Brown veins

Butterflies are not brown.

Butterflies do not have veins. ……

Generator

(15)

Basic Idea of GAN

[Figure: NN Generator v1 → v2 → v3 and Discriminator v1 → v2 → v3 evolve in turn; each Discriminator compares the current Generator's outputs with real images.]

This is where the term “adversarial” comes from.

You can explain the process in different ways…….

(16)

Basic Idea of GAN

Generator (student), Discriminator (teacher)

[Figure: Generator v1 → v2 → v3 and Discriminator v1 → v2; labels: "Two eyes", "Black-and-white".]

(17)

Algorithm

• Initialize generator G and discriminator D
• In each training iteration:

[Figure: randomly sampled vectors are fed to G to produce generated objects (label 0); objects sampled from the database are real (label 1); D is updated on both sets.]

Step 1: Fix generator G, and update discriminator D

Discriminator learns to assign high scores to real objects

and low scores to generated objects.

(18)

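Algorithm

• Initialize generator G and discriminator D
• In each training iteration:

Step 2: Fix discriminator D, and update generator G

[Figure: vector → NN Generator → (hidden layer) → Discriminator → 0.13. Generator and Discriminator together form one large network; the Generator part is updated by gradient ascent while the Discriminator part is fixed.]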

(19)

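Algorithm

• In each training iteration:

Learning D
• Sample m examples $\{x^1, x^2, \dots, x^m\}$ from database
• Sample m noise samples $\{z^1, z^2, \dots, z^m\}$ from a distribution
• Obtain generated data $\{\tilde{x}^1, \tilde{x}^2, \dots, \tilde{x}^m\}$, $\tilde{x}^i = G(z^i)$
• Update discriminator parameters $\theta_d$ to maximize
  • $\tilde{V} = \frac{1}{m} \sum_{i=1}^{m} \log D(x^i) + \frac{1}{m} \sum_{i=1}^{m} \log\left(1 - D(\tilde{x}^i)\right)$
  • $\theta_d \leftarrow \theta_d + \eta \nabla \tilde{V}(\theta_d)$

Learning G
• Sample m noise samples $\{z^1, z^2, \dots, z^m\}$ from a distribution
• Update generator parameters $\theta_g$ to maximize
  • $\tilde{V} = \frac{1}{m} \sum_{i=1}^{m} \log D\left(G(z^i)\right)$
  • $\theta_g \leftarrow \theta_g + \eta \nabla \tilde{V}(\theta_g)$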
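The two alternating updates above (Learning D, then Learning G) can be sketched on a one-dimensional toy problem. Everything specific here is an assumption made for illustration and is not from the lecture: the generator is the linear map G(z) = a·z + b, the discriminator is D(x) = sigmoid(w·x + c), the "database" is a Gaussian with mean 4, and the gradients are written out by hand instead of using a deep learning library.

```python
import math
import random


def sigmoid(s):
    """Numerically stable logistic function."""
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)


def D(x, theta_d):
    """Toy discriminator (assumption): D(x) = sigmoid(w * x + c)."""
    w, c = theta_d
    return sigmoid(w * x + c)


def G(z, theta_g):
    """Toy generator (assumption): G(z) = a * z + b."""
    a, b = theta_g
    return a * z + b


def discriminator_step(theta_d, theta_g, real, noise, lr):
    """Learning D: one ascent step on mean log D(x) + mean log(1 - D(G(z)))."""
    w, c = theta_d
    fake = [G(z, theta_g) for z in noise]
    d_real = [D(x, theta_d) for x in real]
    d_fake = [D(x, theta_d) for x in fake]
    # d/ds log(sigmoid(s)) = 1 - sigmoid(s); d/ds log(1 - sigmoid(s)) = -sigmoid(s)
    grad_w = (sum((1 - d) * x for d, x in zip(d_real, real)) / len(real)
              - sum(d * x for d, x in zip(d_fake, fake)) / len(fake))
    grad_c = (sum(1 - d for d in d_real) / len(real)
              - sum(d_fake) / len(fake))
    return (w + lr * grad_w, c + lr * grad_c)


def generator_step(theta_d, theta_g, noise, lr):
    """Learning G: one ascent step on mean log D(G(z)), with D held fixed."""
    w, _ = theta_d
    a, b = theta_g
    slack = [1 - D(G(z, theta_g), theta_d) for z in noise]
    grad_a = sum(s * w * z for s, z in zip(slack, noise)) / len(noise)
    grad_b = sum(s * w for s in slack) / len(noise)
    return (a + lr * grad_a, b + lr * grad_b)


def train(iterations=200, m=64, lr=0.05, seed=0):
    """Alternate the two steps; returns the final (theta_d, theta_g)."""
    rng = random.Random(seed)
    theta_d, theta_g = (0.0, 0.0), (1.0, 0.0)
    for _ in range(iterations):
        real = [rng.gauss(4.0, 0.5) for _ in range(m)]   # stand-in "database"
        noise = [rng.gauss(0.0, 1.0) for _ in range(m)]
        theta_d = discriminator_step(theta_d, theta_g, real, noise, lr)
        noise = [rng.gauss(0.0, 1.0) for _ in range(m)]
        theta_g = generator_step(theta_d, theta_g, noise, lr)
    return theta_d, theta_g
```

With such a simple discriminator the generator's offset b tends to chase the real mean and may oscillate around it rather than settle, so this sketch shows the update rules, not a well-behaved training recipe.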

(20)

Anime Face Generation

100 updates

(21)

Anime Face Generation

(22)

Anime Face Generation

(23)

Anime Face Generation

(24)

Anime Face Generation

(25)

Anime Face Generation

(26)

Anime Face Generation

(27)

The faces generated by machine.

(28)

[Figure: images produced by G for input values swept from 0.0 to 0.9 in steps of 0.1.]

(29)

Outline

Basic Idea of GAN

GAN as structured learning

Can Generator learn by itself?

Can Discriminator generate?

A little bit of theory

(30)

Structured Learning

Machine learning is to find a function f

$f : X \to Y$

Regression: output a scalar

Classification: output a “class” (one-hot vector), e.g. (1, 0, 0), (0, 1, 0), (0, 0, 1)

Structured Learning/Prediction: output a sequence, a matrix, a graph, a tree ……

Output is composed of components with dependency

(31)

Output Sequence

$f : X \to Y$

Machine Translation
X: “机器学习” (sentence of language 1)
Y: “Machine learning” (sentence of language 2)

Speech Recognition
X: (speech)
Y: “Thanks for joining me” (transcription)

Chat-bot
X: (what a user says)
Y: (response of machine)

(32)

Output Matrix

$f : X \to Y$

Image to Image
X: image, Y: image (e.g. colorization). Ref: https://arxiv.org/pdf/1611.07004v1.pdf

Text to Image
X: “this white and yellow flower have thin white petals and a round yellow stamen”, Y: image. Ref: https://arxiv.org/pdf/1605.05396.pdf

(33)

Why Is Structured Learning Challenging?

• One-shot/Zero-shot Learning:
  • In classification, each class has some examples.
  • In structured learning,
    • If you consider each possible output as a “class” ……
    • Since the output space is huge, most “classes” do not have any training data.
    • Machine has to create new stuff during testing.
    • Need more intelligence

(34)

Why Is Structured Learning Challenging?

• Machine has to learn to do planning
  • Machine generates objects component-by-component, but it should have a big picture in its mind.
  • Because the output components have dependency, they should be considered globally.

[Figure: examples for image generation and sentence generation.]

(35)

Structured Learning Approach

Bottom Up (Generator): learn to generate the object at the component level

Top Down (Discriminator): evaluate the whole object, and find the best one

Generative Adversarial Network (GAN): Generator + Discriminator

(36)

Outline

Basic Idea of GAN

GAN as structured learning

Can Generator learn by itself?

Can Discriminator generate?

A little bit of theory

(37)

Generator

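[Figure: code vectors → NN Generator → image, trained so that each output image is as close as possible to its target image. Compare with an NN Classifier: image → outputs trained to be as close as possible to a one-hot target (1, 0, …).]

Image: (training images)

code: (0.1, 0.9), (0.1, −0.5), (0.2, −0.1), (0.3, 0.2) (where do they come from?)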

(38)

Generator

Encoder in auto-encoder provides the code

[Figure: image → NN Encoder → code c; code vectors → NN Generator → image.]

Image: (training images)

code: (0.1, 0.9), (0.1, −0.5), (0.2, −0.1), (0.3, 0.2) (where do they come from?)

(39)

Auto-encoder

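NN Encoder: input object (e.g. 28 × 28 = 784 dimensions) → code c (low dimension), a compact representation of the input object

NN Decoder: code → output; can reconstruct the original object

Both networks are trainable and are learned together: input → NN Encoder → code → NN Decoder → output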

(40)

Auto-encoder

[Figure: input → NN Encoder → code → NN Decoder → output, with the output as close as possible to the input.]

Randomly generate a vector as code → NN Decoder → Image?

NN Decoder = Generator

(41)

Auto-encoder

[Figure: a 2D code → NN Decoder → image. Codes such as (−1.5, 0) and (1.5, 0), taken from real examples, are decoded into images; each code axis ranges from −1.5 to 1.5.]

(42)

Auto-encoder

[Figure: images decoded from 2D codes, with each code axis ranging from −1.5 to 1.5.]

(43)

Auto-encoder

[Figure: code a → NN Generator → image; code b → NN Generator → image; what does the NN Generator produce for the code 0.5 × a + 0.5 × b?]

Image: (training images)

code: (0.1, 0.9), (0.1, −0.5), (0.2, −0.1), (0.3, 0.2) (where do they come from?)

(44)

Auto-encoder

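Variational Auto-encoder (VAE)

[Figure: input → NN Encoder → $(m_1, m_2, m_3)$ and $(\sigma_1, \sigma_2, \sigma_3)$; $(e_1, e_2, e_3)$ are sampled from a normal distribution; the code is $c_i = \exp(\sigma_i) \times e_i + m_i$; $(c_1, c_2, c_3)$ → NN Decoder → output. Training minimizes the reconstruction error.]

Auto-encoder: input → NN Encoder → code → NN Decoder → output; NN Decoder = Generator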
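The sampling step of the VAE, $c_i = \exp(\sigma_i) \times e_i + m_i$ with $e_i$ drawn from a normal distribution, can be written in a few lines. This is only a sketch of that one formula: the encoder outputs `m` and `sigma` below are made-up numbers, and no encoder or decoder network is included.

```python
import math
import random


def sample_code(m, sigma, rng):
    """Return c with c_i = exp(sigma_i) * e_i + m_i, where e_i ~ N(0, 1)."""
    e = [rng.gauss(0.0, 1.0) for _ in m]
    return [math.exp(s_i) * e_i + m_i for m_i, s_i, e_i in zip(m, sigma, e)]


rng = random.Random(0)
m = [0.5, -1.0, 2.0]         # hypothetical encoder outputs m_1..m_3
sigma = [0.0, -1.0, -2.0]    # hypothetical encoder outputs sigma_1..sigma_3
c = sample_code(m, sigma, rng)   # one noisy code to feed to the decoder
```

A very negative $\sigma_i$ makes $\exp(\sigma_i)$ close to zero, so the code collapses to $m_i$ and the model behaves like a plain auto-encoder on that dimension.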

(45)

What do we miss?

[Figure: G → generated image, trained to be as close as possible to the target image.]

It will be fine if the generator can truly copy the target image. What if the generator makes some mistakes …….

(46)

What do we miss?

[Figure: a target image next to generated images with 1 pixel error and with 6 pixel errors; two of the generated images are marked "Not good".]

(47)

What do we miss?

[Figure: two generated images, one marked "Not good" and one marked "fine".]

The relations between the components are critical.

Each neuron in the output layer corresponds to a pixel (ink or empty).

Although highly correlated, the output neurons cannot influence each other ("Hi, neighbor, let's keep the same color." "No.").

Need deep structure to catch the relation between components.

[Figure: Layer L−1 → Layer L (output layer).]
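A small sketch of why counting pixel errors misses the relation between components. The 7×7 images and the `isolated_ink` measure are invented for illustration (they are not the images from the slide): one candidate differs from the target by a single stray pixel, the other by six pixels that merely thicken and extend the stroke.

```python
def parse(rows):
    """Turn rows of '#' (ink) and '.' (empty) into a 0/1 grid."""
    return [[1 if ch == "#" else 0 for ch in row] for row in rows]


def pixel_errors(a, b):
    """Number of pixels on which two images disagree."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def isolated_ink(img):
    """Count ink pixels with no ink among their 4 neighbours (a crude structure check)."""
    h, w = len(img), len(img[0])
    count = 0
    for r in range(h):
        for c in range(w):
            if img[r][c] != 1:
                continue
            neighbours = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            if not any(0 <= i < h and 0 <= j < w and img[i][j] == 1
                       for i, j in neighbours):
                count += 1
    return count


target = parse([".......", "...#...", "...#...", "...#...",
                "...#...", "...#...", "......."])
stray = parse([".#.....", "...#...", "...#...", "...#...",
               "...#...", "...#...", "......."])   # one stray dot
thick = parse(["...#...", "...##..", "...##..", "...##..",
               "...##..", "...##..", "......."])   # thicker, longer stroke

for name, img in [("stray", stray), ("thick", thick)]:
    print(name, pixel_errors(target, img), isolated_ink(img))
```

The pixel count prefers the image with the stray dot, while the neighbour-based check flags it, which is the kind of component relation a per-pixel objective cannot express.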

(48)

Outline

Basic Idea of GAN

GAN as structured learning

Can Generator learn by itself?

Can Discriminator generate?

A little bit of theory

(49)

Discriminator

• Discriminator is a function D (a network, can be deep)

  $D : X \to \mathbb{R}$

• Input x: an object x (e.g. an image)
• Output D(x): scalar which represents how “good” an object x is

[Figure: D scores example images, e.g. 1.0 and 0.1.]

Can we use the discriminator to generate objects? Hard.

Also known as: Evaluation Function, Potential Function, Energy Function …

(50)

Discriminator

• Suppose we already have a good discriminator D(x) …

How to learn the discriminator?

Enumerate all possible x !!!

(51)

Generator vs. Discriminator

• Generator
  • Pros:
    • Easy to generate even with deep model
  • Cons:
    • Imitate the appearance
    • Hard to learn the correlation between components

• Discriminator
  • Pros:
    • Considering the big picture
  • Cons:
    • Generation is not always feasible
      • Especially when your model is deep
    • How to do negative

(52)

Generator + Discriminator

• General Algorithm
  • Given a set of positive examples, randomly generate a set of negative examples.
  • In each iteration
    • Learn a discriminator D that can discriminate positive and negative examples.
    • Generate negative examples by discriminator D: $\tilde{x} = \arg\max_{x \in X} D(x)$

D vs. G: the generator produces $\tilde{x}$ directly ($G \to \tilde{x}$) in place of $\tilde{x} = \arg\max_{x \in X} D(x)$.
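Generating with a discriminator alone means solving $\tilde{x} = \arg\max_{x \in X} D(x)$. A brute-force sketch on 3×3 binary images makes the cost visible; the scoring function `toy_discriminator` and its `PATTERN` are hypothetical stand-ins for a trained D.

```python
from itertools import product

PATTERN = (0, 1, 0,
           1, 1, 1,
           0, 1, 0)  # hypothetical "ideal" 3x3 image


def toy_discriminator(x):
    """Hypothetical D(x): fraction of pixels that agree with PATTERN."""
    return sum(a == b for a, b in zip(x, PATTERN)) / len(PATTERN)


def argmax_by_enumeration(d, n_pixels):
    """Solve x~ = arg max_x D(x) by scoring every binary image with n_pixels pixels."""
    candidates = product((0, 1), repeat=n_pixels)
    return max(candidates, key=d)


best = argmax_by_enumeration(toy_discriminator, 9)
n_candidates = 2 ** 9   # every extra pixel doubles the search space
```

Nine binary pixels already give 512 candidates; a 28 × 28 binary image would give 2^784, which is why enumeration is not a practical way to generate.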

(53)

Benefit of GAN

• From Discriminator’s point of view
  • Using generator to generate negative samples: $G \to \tilde{x}$ in place of $\tilde{x} = \arg\max_{x \in X} D(x)$ (efficient)

• From Generator’s point of view
  • Still generate the object component-by-component
  • But it is learned from the discriminator with global view.

(54)

(Variational) Auto-encoder

[Figure: G maps codes $z_1, z_2$ to outputs $x_1, x_2$.]

(55)

VAE-GAN

https://arxiv.org/abs/1512.09300

VAE GAN

(56)

Outline

Basic Idea of GAN

GAN as structured learning

Can Generator learn by itself?

Can Discriminator generate?

A little bit of theory

(57)

Generator

• A generator G is a network. The network defines a probability distribution $P_G$

[Figure: z (sampled from a normal distribution) → generator G → $x = G(z)$; the resulting distribution $P_G(x)$ should be as close as possible to $P_{data}(x)$.]

x: an image (a high-dimensional vector)

$G^* = \arg\min_G \mathrm{Div}(P_G, P_{data})$

$\mathrm{Div}(P_G, P_{data})$: divergence between distributions $P_G$ and $P_{data}$

How to compute the divergence?

(58)

Discriminator

$G^* = \arg\min_G \mathrm{Div}(P_G, P_{data})$

Although we do not know the distributions of $P_G$ and $P_{data}$, we can sample from them.

Sampling from $P_{data}$: sample objects from the database.

Sampling from $P_G$: sample vectors from a normal distribution and feed them to G.

(59)

Discriminator

$G^* = \arg\min_G \mathrm{Div}(P_G, P_{data})$

Train the Discriminator on data sampled from $P_{data}$ and data sampled from $P_G$.

Example Objective Function for D (G is fixed):

$V(G, D) = E_{x \sim P_{data}}[\log D(x)] + E_{x \sim P_G}[\log(1 - D(x))]$

Training: $D^* = \arg\max_D V(D, G)$

Using the example objective function is exactly the same as training a binary classifier.

The maximum objective value is related to JS divergence.
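For the example objective, the relation is usually stated as follows: the maximizing discriminator is $D^*(x) = P_{data}(x) / (P_{data}(x) + P_G(x))$, and the maximum value equals $-2\log 2 + 2\,\mathrm{JSD}(P_{data} \| P_G)$. The sketch below checks this numerically on a three-point sample space; the two distributions are arbitrary made-up values and are assumed strictly positive so that every logarithm is defined.

```python
import math


def V(p_data, p_g, d):
    """V(G, D) = E_{x~P_data}[log D(x)] + E_{x~P_G}[log(1 - D(x))] on a finite set."""
    return sum(pd * math.log(dx) + pg * math.log(1.0 - dx)
               for pd, pg, dx in zip(p_data, p_g, d))


def optimal_discriminator(p_data, p_g):
    """Pointwise maximizer of V: D*(x) = P_data(x) / (P_data(x) + P_G(x))."""
    return [pd / (pd + pg) for pd, pg in zip(p_data, p_g)]


def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) for discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js(p, q):
    """Jensen-Shannon divergence between discrete distributions p and q."""
    mix = [(pi + qi) / 2.0 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, mix) + 0.5 * kl(q, mix)


p_data = [0.7, 0.2, 0.1]   # made-up data distribution
p_g = [0.1, 0.3, 0.6]      # made-up generator distribution
d_star = optimal_discriminator(p_data, p_g)
v_max = V(p_data, p_g, d_star)
print(v_max, -2.0 * math.log(2.0) + 2.0 * js(p_data, p_g))
```

When the two distributions coincide, the divergence is zero and the best the discriminator can do is output 0.5 everywhere, giving $-2\log 2$.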

(60)

Discriminator

$G^* = \arg\min_G \mathrm{Div}(P_G, P_{data})$

Training: $D^* = \arg\max_D V(D, G)$

Train the Discriminator on data sampled from $P_{data}$ and data sampled from $P_G$:

• hard to discriminate → small divergence
• easy to discriminate → large divergence

(61)

$G^* = \arg\min_G \mathrm{Div}(P_G, P_{data})$, with the divergence measured by $\max_D V(G, D)$:

$G^* = \arg\min_G \max_D V(G, D)$

$D^* = \arg\max_D V(D, G)$

The maximum objective value is related to JS divergence.

• Initialize generator and discriminator
• In each training iteration:
  Step 1: Fix generator G, and update discriminator D
  Step 2: Fix discriminator D, and update generator G

(62)

Outline

Basic Idea of GAN

GAN as structured learning

Can Generator learn by itself?

Can Discriminator generate?

A little bit of theory

(63)

Unsupervised Conditional Generation

Transform an object from one domain to another without paired data (e.g. style transfer)

[Figure: Domain X → G → Domain Y. Example: male → female. Example sentences: "It is good.", "It’s a good day.", "I love you." → "It is bad.", "It’s a bad day.", "I don’t love you."]

(64)

Unsupervised

Conditional Generation

• Approach 1: Direct Transformation
  Domain X → $G_{X \to Y}$ → Domain Y
  For texture or color change

• Approach 2: Projection to Common Space
  Domain X image → $EN_X$ (encoder of domain X) → $DE_Y$ (decoder of domain Y) → Domain Y image
  Larger change, only keep the semantics

(65)

Direct Transformation

Domain X → $G_{X \to Y}$ → ? (should become similar to domain Y)

$D_Y$ (trained on Domain Y): outputs a scalar, whether the input image belongs to domain Y or not

(66)

Direct Transformation

Domain X → $G_{X \to Y}$ → output (should become similar to domain Y)

$D_Y$ (trained on Domain Y): outputs a scalar, whether the input image belongs to domain Y or not

Not what we want! The generator may ignore its input.

(67)

Direct Transformation

Domain X → $G_{X \to Y}$ → output (should become similar to domain Y)

$D_Y$: outputs a scalar, whether the input image belongs to domain Y or not

Not what we want! The generator may ignore its input.

The issue can be avoided by network design. [Tomer Galanti, et al. ICLR, 2018]

Simpler generator makes the input and

(68)

Direct Transformation

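Domain X → $G_{X \to Y}$ → output (should become similar to domain Y)

$D_Y$: outputs a scalar, whether the input image belongs to domain Y or not

[Figure: the input image and the generated image are each passed through a pre-trained Encoder Network; the two encoder outputs should be as close as possible.]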

(69)

Unsupervised

Conditional Generation

• Approach 1: Direct Transformation
  Domain X → $G_{X \to Y}$ → Domain Y
  For texture or color change

• Approach 2: Projection to Common Space
  Domain X image → $EN_X$ (encoder of domain X) → $DE_Y$ (decoder of domain Y) → Domain Y image
  Larger change, only keep the semantics

(70)

Projection to Common Space

[Figure: Domain X image → $EN_X$ → common space (face attributes) → $DE_X$ → image; Domain Y image → $EN_Y$ → common space (face attributes) → $DE_Y$ → image.]

(71)

Projection to Common Space

Training

[Figure: image → $EN_X$ / $EN_Y$ → common space → $DE_X$ / $DE_Y$ → image, trained by minimizing reconstruction error, with $D_X$ (discriminator of X domain) and $D_Y$ (discriminator of Y domain).]

Cycle Consistency: used in ComboGAN [Asha Anoosheh, et al., arXiv, 2017]
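A minimal sketch of a cycle-consistency reconstruction error, assuming the cycle runs x → $EN_X$ → $DE_Y$ → $EN_Y$ → $DE_X$ → x̂ and that the error is the mean absolute difference between x̂ and x. The four scalar functions below are hypothetical stand-ins for the encoder and decoder networks, chosen so that the cycle is exact.

```python
def cycle_loss(xs, en_x, de_y, en_y, de_x):
    """Mean absolute error of x -> EN_X -> DE_Y -> EN_Y -> DE_X -> x_hat."""
    total = 0.0
    for x in xs:
        y = de_y(en_x(x))        # translate X -> Y through the common space
        x_hat = de_x(en_y(y))    # translate back Y -> X
        total += abs(x_hat - x)
    return total / len(xs)


# Hypothetical scalar "networks": the shared code is x - 1 for domain X
# and y / 2 for domain Y.
def en_x(x):
    return x - 1.0


def de_x(c):
    return c + 1.0


def en_y(y):
    return y / 2.0


def de_y(c):
    return 2.0 * c


xs = [0.0, 0.5, 1.0, 3.0]
consistent = cycle_loss(xs, en_x, de_y, en_y, de_x)
broken = cycle_loss(xs, en_x, de_y, en_y, lambda c: c)  # decoder that drops the +1 shift
print(consistent, broken)
```

A decoder that does not invert its encoder leaves a nonzero reconstruction error, which is the signal that training with this term would try to drive down.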

(72)

Reference

• Jun-Yan Zhu, Taesung Park, Phillip Isola, Alexei A. Efros, Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks, ICCV, 2017

• Zili Yi, Hao Zhang, Ping Tan, Minglun Gong, DualGAN: Unsupervised Dual Learning for Image-to-Image Translation, ICCV, 2017

• Tomer Galanti, Lior Wolf, Sagie Benaim, The Role of Minimal Complexity Functions in Unsupervised Learning of Semantic Mappings, ICLR, 2018

• Yaniv Taigman, Adam Polyak, Lior Wolf, Unsupervised Cross-Domain Image Generation, ICLR, 2017

• Asha Anoosheh, Eirikur Agustsson, Radu Timofte, Luc Van Gool, ComboGAN: Unrestrained Scalability for Image Domain Translation, arXiv, 2017

• Amélie Royer, Konstantinos Bousmalis, Stephan Gouws, Fred Bertsch, Inbar Mosseri, Forrester Cole, Kevin Murphy, XGAN: Unsupervised Image-to-Image Translation for Many-to-Many Mappings, arXiv, 2017

(73)