
DS504/CS586: Big Data Analytics Deep Generative models (I) --Generative Adversarial Networks (GANs) Prof. Yanhua Li


Academic year: 2021


(1)

DS504/CS586: Big Data Analytics

Deep Generative models (I)

--

Generative Adversarial Networks (GANs)

Prof. Yanhua Li

Welcome to

Time: 6:00pm –8:50pm R Zoom Lecture

(2)

Next Week Arrangement

No class next Thursday

https://www.wpi.edu/sites/default/files/2019/03/21/GR_2019-20.pdf

(3)


Project 2

Deadline: extended to next Thursday 3/18/2021.

Leaderboard by now:

https://github.com/yanhuata/DS504CS586-S21/tree/master/project2

Office hours next week.

1. Monday 10-11AM Prof Li

2. Monday 2-3PM TA Guojun (extra office hour)

3. Wednesday 2-3PM TA Guojun

(4)


Team project #5

Team list: available on Canvas

• 6 teams (now)
• Starts today
• Week 9 (3/25 R): proposal due
• Week 13 (4/22 R): progress report due
• Week 15 (5/4 T): project due (upload it to Canvas)
• https://github.com/yanhuata/DS504CS586-S21/tree/master/project5
• Open project: you define it with your teammates.
• We provided a few examples from AI conference competitions (see the above link).
• Discuss the idea with your teammates, and submit a project proposal by 3/25.
• We recommend scheduling an appointment with Prof. Li to discuss your idea.

(5)

Urban Sensing & Data Acquisition

Participatory Sensing, Crowd Sensing, Mobile Sensing

Traffic, Road Networks, POIs, Air Quality, Human Mobility, Meteorology, Social Media, Energy

Urban Data Management

Spatio-temporal index, streaming, trajectory, and graph data management,...

Urban Data Analytics

Data Mining, Machine Learning, Visualization

Service Providing

Improve urban planning, Ease Traffic Congestion, Save Energy, Reduce Air Pollution, ...

Urban Computing: concepts, methodologies, and applications.

Zheng, Y., et al. ACM transactions on Intelligent Systems and Technology.

Deep Learning

Generative adversarial Networks (GANs)

Flow based Generative models

Meta-Learning

Adversarial attacks

Explainable AI (XAI)

(6)

Deep Generative Models

(7)

Introduction of Generative

Adversarial Network (GAN)

(8)

Mihaela Rosca, Balaji Lakshminarayanan, David Warde-Farley, Shakir Mohamed, “Variational Approaches for Auto-Encoding Generative Adversarial Networks”, arXiv, 2017

All Kinds of GAN …

https://github.com/hindupuravinash/the-gan-zoo

GAN ACGAN BGAN DCGAN EBGAN fGAN GoGAN CGAN

……

(9)

ICASSP

[Figure: number of ICASSP papers per year, 2012 to 2018, whose titles include the keywords "generative", "adversarial", or "reinforcement". Keyword search on session index page, so session names are included.]

GAN becomes a very

important technology.

(10)

Outline

Basic Idea of GAN

GAN as structured learning

Can Generator learn by itself?

Can Discriminator generate?

A little bit of theory

(11)

Generation

Image Generation: vector (e.g. [0.1, −0.1, …, 0.7]) → NN Generator → image

Sentence Generation: vector (e.g. [0.3, −0.1, …, −0.7]) → NN Generator → sentence (e.g. "Good morning.", "How are you?", "Good afternoon.")

The input vectors are in a specific range.

We will control what to generate later. → Conditional Generation

(12)

Basic Idea of GAN

Generator

It is a neural network (NN), or a function.

Input: a vector (e.g. [0.1, −3, …, 2.4, 0.9]). Output: an image (a high-dimensional vector).

Each dimension of the input vector represents some characteristic (e.g. changing one dimension gives longer hair).

Powered by: http://mattya.github.io/chainer-DCGAN/

(13)

Basic Idea of GAN

Discriminator

It is a neural network (NN), or a function.

Input: an image. Output: a scalar.

Larger value means real, smaller value means fake.

[Figure: four example images scored by the discriminator as 1.0, 1.0, 0.1, and 0.1.]

(14)

Basic Idea of GAN

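[Figure: a generator evolving against a discriminator, illustrated with butterflies. Captions: "Brown", "veins", "Butterflies are not brown", "Butterflies do not have veins", ……]

Generator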

(15)

Basic Idea of GAN

[Figure: NN Generator v1, v2, v3 paired with Discriminator v1, v2, v3; each discriminator compares generated images with real images.]

This is where the term "adversarial" comes from.

You can explain the process in different ways…….

(16)

Basic Idea of GAN

Generator (student), Discriminator (teacher)

[Figure: Generator v1, v2, v3 and Discriminator v1, v2. Captions: "Two eyes", "Black-and-white".]

(17)

Algorithm

Initialize generator and discriminator

In each training iteration:

Step 1: Fix generator G, and update discriminator D

[Figure: objects randomly sampled from the database are labeled 1; objects generated by G from sampled vectors are labeled 0; D is updated on both sets.]

Discriminator learns to assign high scores to real objects and low scores to generated objects.

(18)

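Algorithm

Initialize generator and discriminator

In each training iteration:

Step 2: Fix discriminator D, and update generator G

[Figure: vector → NN Generator → (hidden layer) → Discriminator → 0.13. Generator and discriminator together form one large network; the generator part is updated and the discriminator part is fixed.]

Gradient Ascent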

(19)

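Algorithm

• In each training iteration:

Learning D

• Sample m examples $\{x^1, x^2, \dots, x^m\}$ from database
• Sample m noise samples $\{z^1, z^2, \dots, z^m\}$ from a distribution
• Obtain generated data $\{\tilde{x}^1, \tilde{x}^2, \dots, \tilde{x}^m\}$, $\tilde{x}^i = G(z^i)$
• Update discriminator parameters $\theta_d$ to maximize
  • $\tilde{V} = \frac{1}{m}\sum_{i=1}^{m} \log D(x^i) + \frac{1}{m}\sum_{i=1}^{m} \log\left(1 - D(\tilde{x}^i)\right)$
  • $\theta_d \leftarrow \theta_d + \eta \nabla \tilde{V}(\theta_d)$

Learning G

• Sample m noise samples $\{z^1, z^2, \dots, z^m\}$ from a distribution
• Update generator parameters $\theta_g$ to maximize
  • $\tilde{V} = \frac{1}{m}\sum_{i=1}^{m} \log D\left(G(z^i)\right)$
  • $\theta_g \leftarrow \theta_g + \eta \nabla \tilde{V}(\theta_g)$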
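The two updates above can be sketched on a toy 1-D problem. This is only an illustration under stated assumptions, not the course's implementation: the "database" is a Gaussian with mean 4, the generator is the linear map G(z) = a·z + b, the discriminator is D(x) = sigmoid(w·x + c), and all function names are hypothetical. Gradients are written out by hand so the gradient-ascent steps stay visible.

```python
import math
import random

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def d_objective(theta_d, real, fake):
    """V~ = mean(log D(x)) + mean(log(1 - D(x~))) with D(x) = sigmoid(w*x + c)."""
    w, c = theta_d
    v_real = sum(math.log(sigmoid(w * x + c)) for x in real) / len(real)
    v_fake = sum(math.log(1.0 - sigmoid(w * x + c)) for x in fake) / len(fake)
    return v_real + v_fake

def d_step(theta_d, real, fake, lr):
    """Learning D: one gradient-ascent step on V~ with the generator fixed."""
    w, c = theta_d
    gw = gc = 0.0
    for x in real:   # d/ds log sigmoid(s) = 1 - sigmoid(s)
        g = 1.0 - sigmoid(w * x + c)
        gw += g * x / len(real)
        gc += g / len(real)
    for x in fake:   # d/ds log(1 - sigmoid(s)) = -sigmoid(s)
        g = -sigmoid(w * x + c)
        gw += g * x / len(fake)
        gc += g / len(fake)
    return (w + lr * gw, c + lr * gc)

def g_objective(theta_g, theta_d, noise):
    """V~ = mean(log D(G(z))) with G(z) = a*z + b."""
    a, b = theta_g
    w, c = theta_d
    return sum(math.log(sigmoid(w * (a * z + b) + c)) for z in noise) / len(noise)

def g_step(theta_g, theta_d, noise, lr):
    """Learning G: one gradient-ascent step on V~ with the discriminator fixed."""
    a, b = theta_g
    w, c = theta_d
    ga = gb = 0.0
    for z in noise:  # chain rule through D and then through G
        g = (1.0 - sigmoid(w * (a * z + b) + c)) * w
        ga += g * z / len(noise)
        gb += g / len(noise)
    return (a + lr * ga, b + lr * gb)

def train(iterations=200, m=64, lr=0.05, seed=0):
    rng = random.Random(seed)
    theta_d, theta_g = (0.1, 0.0), (1.0, 0.0)   # arbitrary starting values
    for _ in range(iterations):
        real = [rng.gauss(4.0, 1.0) for _ in range(m)]    # sample from "database"
        noise = [rng.gauss(0.0, 1.0) for _ in range(m)]   # sample noise
        fake = [theta_g[0] * z + theta_g[1] for z in noise]
        theta_d = d_step(theta_d, real, fake, lr)
        noise = [rng.gauss(0.0, 1.0) for _ in range(m)]   # fresh noise for G
        theta_g = g_step(theta_g, theta_d, noise, lr)
    return theta_d, theta_g

theta_d, theta_g = train()
```

A real GAN replaces the two linear models with deep networks and the hand-written gradients with automatic differentiation; the alternating structure stays the same.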

(20)

Anime Face Generation

100 updates

(21)

Anime Face Generation

(22)

Anime Face Generation

(23)

Anime Face Generation

(24)

Anime Face Generation

(25)

Anime Face Generation

(26)

Anime Face Generation

(27)

The faces

generated by

machine.

(28)

[Figure: images produced by G for the input vectors (0.0, 0.0), (0.1, 0.1), …, (0.9, 0.9).]

(29)

Outline

Basic Idea of GAN

GAN as structured learning

Can Generator learn by itself?

Can Discriminator generate?

A little bit of theory

(30)

Structured Learning

Machine learning is to find a function f

$f : X \rightarrow Y$

Regression: output a scalar

Classification: output a "class" (one-hot vector), e.g. 1 0 0, 0 1 0, 0 0 1

Structured Learning/Prediction: output a sequence, a matrix, a graph, a tree ……

Output is composed of components with dependency
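As a small illustration of the one-hot vectors used for classification outputs, here is a sketch; the helper name `one_hot` is hypothetical.

```python
def one_hot(index, num_classes):
    """Return a list with a single 1 at position `index` and 0 elsewhere."""
    if not 0 <= index < num_classes:
        raise ValueError("index out of range")
    return [1 if i == index else 0 for i in range(num_classes)]

# The three classes shown on the slide.
vectors = [one_hot(i, 3) for i in range(3)]
```

Each class is one fixed vector, whereas a structured output (sequence, matrix, graph) has many interdependent components.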

(31)

Output Sequence

$f : X \rightarrow Y$

Machine Translation. X: "机器学习" (sentence of language 1). Y: "Machine learning" (sentence of language 2).

Speech Recognition. X: (speech). Y: "Thanks for joining me" (transcription).

Chat-bot. X: (what a user says). Y: (response of machine).

(32)

Output Matrix

$f : X \rightarrow Y$

Image to Image. X: image. Y: image (e.g. colorization). Ref: https://arxiv.org/pdf/1611.07004v1.pdf

Text to Image. X: "this white and yellow flower have thin white petals and a round yellow stamen". Y: image. Ref: https://arxiv.org/pdf/1605.05396.pdf

(33)

Why Is Structured Learning Challenging?

One-shot/Zero-shot Learning:

In classification, each class has some examples.

In structured learning, if you consider each possible output as a "class" ……

Since the output space is huge, most "classes" do not have any training data.

Machine has to create new stuff during testing.

Need more intelligence
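To get a feel for how huge the output space is, the sketch below counts the possible outputs if each output is a 28 × 28 image with binary pixels. Both the image size (borrowed from the auto-encoder example later in the deck) and the binary-pixel restriction are assumptions made only for this illustration.

```python
def num_binary_images(height, width):
    """Number of distinct images when every pixel is either 0 or 1."""
    return 2 ** (height * width)

count = num_binary_images(28, 28)
digits = len(str(count))   # number of decimal digits in the count
print(f"2^784 has {digits} decimal digits")
```

No training set can cover more than a vanishing fraction of that many "classes", which is why the machine has to produce outputs it has never seen.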

(34)

Why Is Structured Learning Challenging?

Machine has to learn to do planning

• Machine generates objects component-by-component, but it should have a big picture in its mind.

• Because the output components have dependency, they should be considered globally.

[Figure: examples for image generation and sentence generation.]

(35)

Structured Learning Approach

Bottom Up (Generator): learn to generate the object at the component level

Top Down (Discriminator): evaluate the whole object, and find the best one

Generative Adversarial Network (GAN): Generator + Discriminator

(36)

Outline

Basic Idea of GAN

GAN as structured learning

Can Generator learn by itself?

Can Discriminator generate?

A little bit of theory

(37)

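Generator

[Figure: an NN Generator maps code vectors (e.g. [0.1, 0.9], [0.1, −0.5], [0.2, −0.1], [0.3, 0.2]) to images, trained so that each output is as close as possible to its target image. For comparison, an NN Classifier maps an image to outputs $y_1, y_2, \dots$, trained to be as close as possible to the one-hot target 1, 0, ….]

Image: … code: … (where do they come from?)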

(38)

Generator

Encoder in auto-encoder provides the code.

[Figure: image → NN Encoder → code c; code vectors (e.g. [0.1, 0.9], [0.1, −0.5], [0.2, −0.1], [0.3, 0.2]) → NN Generator → image.]

Image: … code: … (where do they come from?)

(39)

Auto-encoder

NN Encoder: input object (e.g. 28 × 28 = 784) → code, a compact, low-dimensional representation of the input object

NN Decoder: code → can reconstruct the original object

Learn together: NN Encoder → code c → NN Decoder (both trainable)
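A minimal sketch of "learn together", assuming a deliberately tiny setting: 2-D inputs, a 1-D code, and a linear encoder and decoder trained by gradient descent on the squared reconstruction error. The names and the toy data are hypothetical; a real auto-encoder would use neural networks on images.

```python
def encode(w, x):
    """Linear encoder: 2-D input -> 1-D code."""
    return w[0] * x[0] + w[1] * x[1]

def decode(v, c):
    """Linear decoder: 1-D code -> 2-D reconstruction."""
    return (v[0] * c, v[1] * c)

def reconstruction_loss(w, v, data):
    total = 0.0
    for x in data:
        xh = decode(v, encode(w, x))
        total += (xh[0] - x[0]) ** 2 + (xh[1] - x[1]) ** 2
    return total / len(data)

def train_autoencoder(data, steps=500, lr=0.02):
    w, v = [0.5, 0.5], [0.5, 0.5]          # arbitrary starting weights
    n = len(data)
    for _ in range(steps):
        gw, gv = [0.0, 0.0], [0.0, 0.0]
        for x in data:
            c = encode(w, x)
            xh = decode(v, c)
            r = (xh[0] - x[0], xh[1] - x[1])        # reconstruction residual
            rv = r[0] * v[0] + r[1] * v[1]
            for k in range(2):
                gv[k] += 2.0 * r[k] * c / n         # dLoss/dv_k
                gw[k] += 2.0 * rv * x[k] / n        # dLoss/dw_k
        # Encoder and decoder are updated in the same step.
        w = [w[k] - lr * gw[k] for k in range(2)]
        v = [v[k] - lr * gv[k] for k in range(2)]
    return w, v

# Toy data lying on a line, so a 1-D code is enough to reconstruct it.
data = [(t / 10.0, 2.0 * t / 10.0) for t in range(-10, 11)]
w, v = train_autoencoder(data)
```

After training, `decode(v, c)` on its own plays the role the later slides assign to the decoder: a generator driven by a code.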

(40)

Auto-encoder

Input → NN Encoder → code → NN Decoder → output, as close as possible to the input

Randomly generate a vector as code → NN Decoder → Image?

NN Decoder = Generator

(41)

Auto-encoder

[Figure: with a 2-D code, the NN Decoder maps codes such as (−1.5, 0) and (1.5, 0) to images; real examples are shown for comparison.]

(42)

Auto-encoder

[Figure: images decoded from 2-D codes over the range −1.5 to 1.5.]

(43)

Auto-encoder

[Figure: code a → NN Generator → image; code b → NN Generator → image; 0.5 × a + 0.5 × b → NN Generator → ?]

Image: … code: … (where do they come from?)
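The mixed code 0.5 × a + 0.5 × b is a linear interpolation between two codes. A sketch of that step alone follows; the helper name `interpolate` is hypothetical, and what the generator outputs for the mixed code is exactly the open question on the slide.

```python
def interpolate(a, b, t):
    """Return (1 - t) * a + t * b, computed component-wise on two code vectors."""
    return [(1.0 - t) * ai + t * bi for ai, bi in zip(a, b)]

code_a = [0.0, 2.0]
code_b = [4.0, 0.0]
midpoint = interpolate(code_a, code_b, 0.5)   # the 0.5 * a + 0.5 * b case
```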

(44)

Auto-encoder

Input → NN Encoder → code → NN Decoder (= Generator) → output

Variational Auto-encoder (VAE)

Input → NN Encoder → $(m_1, m_2, m_3)$ and $(\sigma_1, \sigma_2, \sigma_3)$

$(e_1, e_2, e_3)$: from a normal distribution

$c_i = \exp(\sigma_i) \times e_i + m_i$

$(c_1, c_2, c_3)$ → NN Decoder → output

Minimize reconstruction error
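The code computation $c_i = \exp(\sigma_i) \times e_i + m_i$ can be sketched as follows. The values of m and σ are made up here; in a VAE they would come from the encoder, and the function name is hypothetical.

```python
import math
import random

def sample_code(m, sigma, rng):
    """c_i = exp(sigma_i) * e_i + m_i, with each e_i drawn from a standard normal."""
    return [math.exp(s) * rng.gauss(0.0, 1.0) + mi for mi, s in zip(m, sigma)]

rng = random.Random(0)
m = [1.0, -2.0, 0.5]          # stand-in for encoder outputs m_1..m_3
sigma = [0.0, -1.0, -3.0]     # stand-in for encoder outputs sigma_1..sigma_3
c = sample_code(m, sigma, rng)
```

The more negative $\sigma_i$ is, the smaller the noise added to $m_i$, so the code approaches the deterministic auto-encoder code.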

(45)

What do we miss?

[Figure: G → generated image, as close as possible to the target.]

It will be fine if the generator can truly copy the target image.

What if the generator makes some mistakes …….

(46)

What do we miss?

[Figure: a target image compared with generated images that have 1 pixel error and 6 pixel errors. Captions: "Not good", "Not good".]
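Counting pixel errors can be sketched as below. The tiny binary images are invented for illustration: one candidate adds a single stray pixel far from the stroke, the other thickens the stroke by one row. The count says how many pixels differ but nothing about whether the differing pixels fit together with their neighbours.

```python
def pixel_errors(image_a, image_b):
    """Number of positions where two equally sized binary images differ."""
    return sum(
        pa != pb
        for row_a, row_b in zip(image_a, image_b)
        for pa, pb in zip(row_a, row_b)
    )

target = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
# One isolated pixel in the corner.
stray = [row[:] for row in target]
stray[3][7] = 1
# The stroke drawn one row thicker.
thicker = [row[:] for row in target]
thicker[2] = [0, 1, 1, 1, 1, 1, 1, 0]

errors_stray = pixel_errors(target, stray)
errors_thicker = pixel_errors(target, thicker)
```

A purely pixel-wise loss prefers `stray` because it has fewer errors, regardless of how either image looks as a whole.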

(47)

What do we miss?

[Figure: two generated images, one labeled "Not good" and one labeled "fine".]

The relations between the components are critical.

Each neuron in the output layer corresponds to a pixel (ink or empty).

[Figure: Layer L−1 → Layer L. Neighboring output neurons: "Hi, neighbor, let's keep the same color." "No."]

Although highly correlated, they cannot influence each other.

Need deep structure to catch the relation between components.

(48)

Outline

Basic Idea of GAN

GAN as structured learning

Can Generator learn by itself?

Can Discriminator generate?

A little bit of theory

(49)

Discriminator

Discriminator is a function D (a network, which can be deep)

$D : X \rightarrow \mathbb{R}$

• Input x: an object x (e.g. an image)

• Output D(x): scalar which represents how "good" an object x is

[Figure: D scores one image 1.0 and another 0.1.]

Also called: Evaluation Function, Potential Function, Energy Function …

Can we use the discriminator to generate objects?

Hard.

(50)

Discriminator

Suppose we already have a good discriminator D(x) …

How to learn the discriminator?

Enumerate all possible x !!!
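If generating with a discriminator means solving $\tilde{x} = \arg\max_{x \in X} D(x)$ (the form that appears on the "Generator + Discriminator" slide), then brute force has to score every possible x. The sketch below does this for 3 × 3 binary images with a made-up scoring function; `toy_discriminator` and its template are hypothetical stand-ins for a trained D.

```python
from itertools import product

def toy_discriminator(x):
    """Hypothetical score: fraction of pixels agreeing with a fixed template."""
    template = (0, 1, 0, 1, 1, 1, 0, 1, 0)
    return sum(a == b for a, b in zip(x, template)) / len(template)

def generate_by_enumeration(score, num_pixels):
    """Return the binary image with the highest score by trying all 2**num_pixels."""
    candidates = product((0, 1), repeat=num_pixels)
    return max(candidates, key=score)

best = generate_by_enumeration(toy_discriminator, 9)   # 2**9 = 512 candidates
```

With 9 pixels this loop is instant; the number of candidates doubles with every extra pixel, so the same loop is hopeless for realistic images.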

(51)

Generator vs. Discriminator

Generator

Pros:
• Easy to generate even with deep model

Cons:
• Imitate the appearance
• Hard to learn the correlation between components

Discriminator

Pros:
• Considering the big picture

Cons:
• Generation is not always feasible
• Especially when your model is deep
• How to do negative

(52)

Generator + Discriminator

General Algorithm

• Given a set of positive examples, randomly generate a set of negative examples.

• In each iteration

  • Learn a discriminator D that can discriminate positive and negative examples.

  • Generate negative examples by discriminator D: $\tilde{x} = \arg\max_{x \in X} D(x)$

vs. generating $\tilde{x}$ with the generator G

(53)

Benefit of GAN

From Discriminator's point of view

• Using generator to generate negative samples: $\tilde{x}$ is produced by G instead of solving $\tilde{x} = \arg\max_{x \in X} D(x)$ (efficient)

From Generator's point of view

• Still generate the object component-by-component

• But it is learned from the discriminator with global view.

(54)

(Variational) Auto-encoder

[Figure: (variational) auto-encoder diagram with objects $x^1, x^2$, codes $z^1, z^2$, and generator G.]

(55)

VAE-GAN

https://arxiv.org/abs/1512.09300

(56)

Outline

Basic Idea of GAN

GAN as structured learning

Can Generator learn by itself?

Can Discriminator generate?

A little bit of theory

(57)

Generator

• A generator G is a network. The network defines a probability distribution $P_G$

z (Normal Distribution) → generator G → $x = G(z)$

x: an image (a high-dimensional vector)

$P_G(x)$ and $P_{data}(x)$: as close as possible

$G^* = \arg\min_G Div(P_G, P_{data})$

Divergence between distributions $P_G$ and $P_{data}$

How to compute the divergence?

(58)

Discriminator

$G^* = \arg\min_G Div(P_G, P_{data})$

Although we do not know the distributions of $P_G$ and $P_{data}$, we can sample from them.

Sampling from $P_{data}$: sample objects from the database.

Sampling from $P_G$: sample vectors from a normal distribution and pass them through G.

(59)

Discriminator

$G^* = \arg\min_G Div(P_G, P_{data})$

[Figure: data sampled from $P_{data}$ and data sampled from $P_G$ are used to train the discriminator.]

Example Objective Function for D (G is fixed):

$V(G, D) = E_{x \sim P_{data}}[\log D(x)] + E_{x \sim P_G}[\log(1 - D(x))]$

Training: $D^* = \arg\max_D V(D, G)$

Using the example objective function is exactly the same as training a binary classifier.

The maximum objective value is related to JS divergence.
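The relation to JS divergence can be checked numerically on small discrete distributions. The sketch below relies on the standard result that, for fixed G, the maximizer is $D^*(x) = P_{data}(x) / (P_{data}(x) + P_G(x))$ and the maximum value is $-2\log 2 + 2\,JSD(P_{data} \| P_G)$; that closed form is not on the slide and is stated here as background. The two example distributions are arbitrary.

```python
import math

def v_objective(p_data, p_g, d):
    """V(G, D) for discrete distributions, with d[i] the value of D on outcome i."""
    return sum(p * math.log(di) + q * math.log(1.0 - di)
               for p, q, di in zip(p_data, p_g, d))

def optimal_discriminator(p_data, p_g):
    return [p / (p + q) for p, q in zip(p_data, p_g)]

def kl(p, q):
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    mid = [(pi + qi) / 2.0 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, mid) + 0.5 * kl(q, mid)

p_data = [0.1, 0.2, 0.7]
p_g = [0.3, 0.3, 0.4]
d_star = optimal_discriminator(p_data, p_g)
max_v = v_objective(p_data, p_g, d_star)
from_jsd = -2.0 * math.log(2.0) + 2.0 * js_divergence(p_data, p_g)
```

When the two distributions coincide, $D^*$ is 0.5 everywhere and the maximum is $-2\log 2$, its smallest possible value, which corresponds to a JS divergence of zero.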

(60)

Discriminator

$G^* = \arg\min_G Div(P_G, P_{data})$

Training: $D^* = \arg\max_D V(D, G)$

[Figure: data sampled from $P_{data}$ and data sampled from $P_G$, shown for two cases.]

• After training, the two sets are hard to discriminate: small divergence

• After training, the two sets are easy to discriminate: large divergence

(61)

$G^* = \arg\min_G Div(P_G, P_{data})$, with the divergence replaced by $\max_D V(G, D)$:

$G^* = \arg\min_G \max_D V(G, D)$

$D^* = \arg\max_D V(D, G)$

The maximum objective value is related to JS divergence.

• Initialize generator and discriminator
• In each training iteration:
  • Step 1: Fix generator G, and update discriminator D
  • Step 2: Fix discriminator D, and update generator G

(62)

Outline

Basic Idea of GAN

GAN as structured learning

Can Generator learn by itself?

Can Discriminator generate?

A little bit of theory

(63)

Unsupervised Conditional Generation

Transform an object from one domain to another without paired data (e.g. style transfer)

Domain X → G → Domain Y

• male → female

• "It is good.", "It's a good day.", "I love you." → "It is bad.", "It's a bad day.", "I don't love you."

(64)

Unsupervised Conditional Generation

Approach 1: Direct Transformation

Domain X image → $G_{X \to Y}$ → Domain Y image

For texture or color change

Approach 2: Projection to Common Space

Domain X image → $EN_X$ (Encoder of domain X) → $DE_Y$ (Decoder of domain Y) → Domain Y image

Larger change, only keep the semantics

(65)

Direct Transformation

Domain X image → $G_{X \to Y}$ → generated image (become similar to domain Y)

Domain Y images are used to train $D_Y$, which outputs a scalar: input image belongs to domain Y or not

(66)

Direct Transformation

Domain X image → $G_{X \to Y}$ → generated image (become similar to domain Y)

Domain Y images are used to train $D_Y$, which outputs a scalar: input image belongs to domain Y or not

Not what we want: $G_{X \to Y}$ can ignore the input.

(67)

Direct Transformation

Domain X image → $G_{X \to Y}$ → generated image (become similar to domain Y)

$D_Y$ outputs a scalar: input image belongs to domain Y or not

Not what we want: $G_{X \to Y}$ can ignore the input.

[Tomer Galanti, et al. ICLR, 2018]

The issue can be avoided by network design. Simpler generator makes the input and

(68)

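Direct Transformation

Domain X image → $G_{X \to Y}$ → generated image (become similar to domain Y)

$D_Y$ outputs a scalar: input image belongs to domain Y or not

A pre-trained Encoder Network is applied to both the input image and the generated image; the two encodings should be as close as possible.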

(69)

Unsupervised Conditional Generation

Approach 1: Direct Transformation

Domain X image → $G_{X \to Y}$ → Domain Y image

For texture or color change

Approach 2: Projection to Common Space

Domain X image → $EN_X$ (Encoder of domain X) → $DE_Y$ (Decoder of domain Y) → Domain Y image

Larger change, only keep the semantics

(70)

Projection to Common Space

[Figure: Domain X image → $EN_X$ → face attribute → $DE_X$ → image; Domain Y image → $EN_Y$ → face attribute → $DE_Y$ → image.]

(71)

Projection to Common Space

Training

[Figure: image → $EN_X$ → $DE_X$ → image; image → $EN_Y$ → $DE_Y$ → image. $D_X$: Discriminator of X domain. $D_Y$: Discriminator of Y domain.]

Minimizing reconstruction error

Cycle Consistency: used in ComboGAN [Asha Anoosheh, et al., arXiv, 2017]
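Cycle consistency can be sketched as a reconstruction error after a round trip X → Y → X. In the sketch below the two mappings are toy scalar functions standing in for the learned networks (for the common-space approach, the forward mapping would be the composition of $EN_X$ and $DE_Y$, and the backward mapping that of $EN_Y$ and $DE_X$); the function name and the use of an absolute-error average are assumptions.

```python
def cycle_consistency_loss(xs, x_to_y, y_to_x):
    """Mean absolute error between each x and its round trip y_to_x(x_to_y(x))."""
    return sum(abs(y_to_x(x_to_y(x)) - x) for x in xs) / len(xs)

xs = [0.0, 1.0, 2.0]

# A forward mapping that keeps the input's information, with an exact inverse.
consistent = cycle_consistency_loss(xs, lambda x: 2.0 * x + 1.0,
                                    lambda y: (y - 1.0) / 2.0)

# A forward mapping that ignores its input: the round trip cannot recover x.
ignoring = cycle_consistency_loss(xs, lambda x: 5.0,
                                  lambda y: (y - 1.0) / 2.0)
```

The second case is the failure mode flagged on the Direct Transformation slides: a mapping that ignores its input may still fool a domain discriminator, but it pays a reconstruction penalty.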

(72)

Reference

• Jun-Yan Zhu, Taesung Park, Phillip Isola, Alexei A. Efros, Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks, ICCV, 2017

• Zili Yi, Hao Zhang, Ping Tan, Minglun Gong, DualGAN: Unsupervised Dual Learning for Image-to-Image Translation, ICCV, 2017

• Tomer Galanti, Lior Wolf, Sagie Benaim, The Role of Minimal Complexity Functions in Unsupervised Learning of Semantic Mappings, ICLR, 2018

• Yaniv Taigman, Adam Polyak, Lior Wolf, Unsupervised Cross-Domain Image Generation, ICLR, 2017

• Asha Anoosheh, Eirikur Agustsson, Radu Timofte, Luc Van Gool, ComboGAN: Unrestrained Scalability for Image Domain Translation, arXiv, 2017

• Amélie Royer, Konstantinos Bousmalis, Stephan Gouws, Fred Bertsch, Inbar Mosseri, Forrester Cole, Kevin Murphy, XGAN: Unsupervised Image-to-Image Translation for Many-to-Many Mappings, arXiv, 2017

(73)
