DS504/CS586: Big Data Analytics
Deep Generative models (I)
--
Generative Adversarial Networks (GANs)
Prof. Yanhua Li
Welcome to
Time: 6:00pm –8:50pm R Zoom Lecture
Next Week Arrangement
No class next Thursday
https://www.wpi.edu/sites/default/files/2019/03/21/GR_2019-20.pdf
Project 2
Deadline: extended to next Thursday 3/18/2021.
Leaderboard by now:
https://github.com/yanhuata/DS504CS586-S21/tree/master/project2
Office hours next week.
1. Monday 10-11AM Prof Li
2. Monday 2-3PM TA Guojun (extra office hour)
3. Wednesday 2-3PM TA Guojun
Team project #5
Team list: available on Canvas
• 6 teams (now)
• Starts today
• Week 9 (3/25 R): proposal due
• Week 13 (4/22 R): progress report due
• Week 15 (5/4 T): project due (upload it to Canvas)
• https://github.com/yanhuata/DS504CS586-S21/tree/master/project5
• Open project: you define it with your teammates.
• We provide a few examples from AI conference competitions (see the link above).
• Discuss ideas with your teammates and submit a project proposal by 3/25.
• We recommend scheduling an appointment with Prof. Li to discuss your idea.
Urban Sensing & Data Acquisition
Participatory Sensing, Crowd Sensing, Mobile Sensing
Traffic, Road Networks, POIs, Air Quality, Human Mobility, Meteorology, Social Media, Energy
Urban Data Management
Spatio-temporal index, streaming, trajectory, and graph data management,...
Urban Data Analytics
Data Mining, Machine Learning, Visualization
Service Providing
Improve urban planning, Ease Traffic Congestion, Save Energy, Reduce Air Pollution, ...
Urban Computing: concepts, methodologies, and applications.
Zheng, Y., et al. ACM transactions on Intelligent Systems and Technology.
Deep Learning
Generative adversarial Networks (GANs)
Flow based Generative models
Meta-Learning
Adversarial attacks Explainable AI (XAI)
Deep Generative Models
Introduction of Generative
Adversarial Network (GAN)
Mihaela Rosca, Balaji Lakshminarayanan, David Warde-Farley, Shakir Mohamed, “Variational Approaches for Auto-Encoding Generative Adversarial Networks”, arXiv, 2017
All Kinds of GAN …
https://github.com/hindupuravinash/the-gan-zoo
GAN, ACGAN, BGAN, DCGAN, EBGAN, fGAN, GoGAN, CGAN, ……
[Figure: number of ICASSP papers per year, 2012 to 2018, whose titles include the keyword "generative", "adversarial", or "reinforcement". Keyword search on the session index page, so session names are included.]
GAN has become a very important technology.
Outline
Basic Idea of GAN
GAN as structured learning
Can Generator learn by itself?
Can Discriminator generate?
A little bit theory
Generation
Image Generation: a vector (in a specific range) → NN Generator → image
Sentence Generation: a vector (in a specific range) → NN Generator → sentence, e.g. "Good morning.", "How are you?", "Good afternoon."
We will control what to generate later. → Conditional Generation
Basic Idea of GAN
Generator
It is a neural network (NN), or a function.
Generator: vector → image (a high-dimensional vector)
[Figure: the same generator applied to input vectors that differ in a single dimension, e.g. (0.1, −3, …, 2.4, 0.9) vs. (3, −3, …, 2.4, 0.9); the output image changes in one characteristic, e.g. longer hair.]
Powered by: http://mattya.github.io/chainer-DCGAN/
Each dimension of the input vector represents some characteristic.
Basic Idea of GAN
Discriminator
It is a neural network (NN), or a function.
Discriminator: image → scalar
Larger value means real, smaller value means fake.
[Figure: four example images scored by the discriminator: 1.0, 1.0, 0.1, 0.1.]
Basic Idea of GAN
Brown → veins
"Butterflies are not brown." "Butterflies do not have veins." ……..
Generator
Basic Idea of GAN
NN Generator v1 ↔ Discriminator v1
NN Generator v2 ↔ Discriminator v2
NN Generator v3 ↔ Discriminator v3
Real images: [figure]
This is where the term "adversarial" comes from.
You can explain the process in different ways…….
Basic Idea of GAN
Generator (student), Discriminator (teacher)
Generator v1 → Discriminator v1 → Generator v2 → Discriminator v2 → Generator v3
Labels in the figure: "Two eyes", "Black-and-White"
Algorithm
• Initialize generator G and discriminator D
• In each training iteration:
Step 1: Fix generator G, and update discriminator D
[Figure: objects randomly sampled from the database are labeled 1; objects generated by G from sampled vectors are labeled 0; D is updated on both sets.]
Discriminator learns to assign high scores to real objects and low scores to generated objects.
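Algorithm
• Initialize generator G and discriminator D
• In each training iteration:
Step 2: Fix discriminator D, and update generator G
[Figure: vector → NN Generator → hidden layer → Discriminator → 0.13. Generator and discriminator together form one large network; the generator part is updated by gradient ascent while the discriminator part is fixed.]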
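• In each training iteration:
  • Sample m examples $\{x^1, x^2, \dots, x^m\}$ from the database
  • Sample m noise samples $\{z^1, z^2, \dots, z^m\}$ from a distribution
  • Obtain generated data $\{\tilde{x}^1, \tilde{x}^2, \dots, \tilde{x}^m\}$, $\tilde{x}^i = G(z^i)$
  • Update discriminator parameters $\theta_d$ to maximize
    • $\tilde{V} = \frac{1}{m}\sum_{i=1}^{m} \log D(x^i) + \frac{1}{m}\sum_{i=1}^{m} \log\left(1 - D(\tilde{x}^i)\right)$
    • $\theta_d \leftarrow \theta_d + \eta \nabla \tilde{V}(\theta_d)$
  • Sample m noise samples $\{z^1, z^2, \dots, z^m\}$ from a distribution
  • Update generator parameters $\theta_g$ to maximize
    • $\tilde{V} = \frac{1}{m}\sum_{i=1}^{m} \log D(G(z^i))$
    • $\theta_g \leftarrow \theta_g + \eta \nabla \tilde{V}(\theta_g)$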
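The two alternating updates can be tried on a one-dimensional toy problem. The sketch below is only an illustration under assumptions that are not in the slides: the "database" is a Gaussian with mean 3, the generator is the linear map G(z) = a·z + b, the discriminator is D(x) = sigmoid(w·x + c), and the gradients are written out by hand. The function name `train_toy_gan` and all hyperparameters are hypothetical.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def log_sigmoid(t):
    # Numerically stable log(sigmoid(t)); log(1 - sigmoid(t)) equals log_sigmoid(-t).
    return -np.logaddexp(0.0, -t)

def train_toy_gan(iterations=200, m=64, eta=0.05, seed=0):
    """Alternate the discriminator and generator updates on 1-D data."""
    rng = np.random.default_rng(seed)
    a, b = 1.0, 0.0   # generator parameters theta_g (assumed form: G(z) = a*z + b)
    w, c = 0.1, 0.0   # discriminator parameters theta_d (assumed form: D(x) = sigmoid(w*x + c))
    history = []      # discriminator objective V~ measured before each D update
    for _ in range(iterations):
        # Learning D: fix G, gradient ascent on V~ with respect to (w, c).
        x = rng.normal(3.0, 1.0, size=m)   # m examples from the toy "database"
        z = rng.normal(0.0, 1.0, size=m)   # m noise samples
        x_tilde = a * z + b                # generated data
        d_real = sigmoid(w * x + c)
        d_fake = sigmoid(w * x_tilde + c)
        v = np.mean(log_sigmoid(w * x + c)) + np.mean(log_sigmoid(-(w * x_tilde + c)))
        history.append(float(v))
        grad_w = np.mean((1.0 - d_real) * x) - np.mean(d_fake * x_tilde)
        grad_c = np.mean(1.0 - d_real) - np.mean(d_fake)
        w, c = w + eta * grad_w, c + eta * grad_c

        # Learning G: fix D, gradient ascent on (1/m) sum log D(G(z)) with respect to (a, b).
        z = rng.normal(0.0, 1.0, size=m)
        d_fake = sigmoid(w * (a * z + b) + c)
        grad_a = np.mean((1.0 - d_fake) * w * z)
        grad_b = np.mean((1.0 - d_fake) * w)
        a, b = a + eta * grad_a, b + eta * grad_b
    return (a, b), (w, c), history

if __name__ == "__main__":
    (a, b), (w, c), history = train_toy_gan()
    print("generator:", a, b, "discriminator:", w, c)
```

With models this small the two players may oscillate rather than settle, so the sketch is meant to show the order of the updates, not a well-tuned training recipe.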
Algorithm: Learning D (the discriminator update), then Learning G (the generator update)
Anime Face Generation
[Figure: generated anime faces at successive stages of training, starting from 100 updates.]
The faces generated by machine.
[Figure: outputs of generator G for input vectors (0.0, 0.0), (0.1, 0.1), …, (0.9, 0.9).]
Outline
Basic Idea of GAN
GAN as structured learning
Can Generator learn by itself?
Can Discriminator generate?
A little bit theory
Structured Learning
Machine learning is to find a function f: X → Y
Regression: output a scalar
Classification: output a "class" (one-hot vector), e.g. (1 0 0), (0 1 0), (0 0 1)
Structured Learning/Prediction: output a sequence, a matrix, a graph, a tree ……
Output is composed of components with dependency
Output Sequence
f: X → Y
Machine Translation: X: "机器学习" (sentence of language 1) → Y: "Machine learning" (sentence of language 2)
Speech Recognition: X: (speech) → Y: "Thanks for joining me" (transcription)
Chat-bot: X: (what a user says) → Y: (response of machine)
Output Matrix
f: X → Y
Image to Image: X: image → Y: image, e.g. Colorization. Ref: https://arxiv.org/pdf/1611.07004v1.pdf
Text to Image: X: "this white and yellow flower have thin white petals and a round yellow stamen" → Y: image. Ref: https://arxiv.org/pdf/1605.05396.pdf
Why Structured Learning Challenging?
• One-shot/Zero-shot Learning:
  • In classification, each class has some examples.
  • In structured learning,
    • If you consider each possible output as a "class" ……
    • Since the output space is huge, most "classes" do not have any training data.
    • Machine has to create new stuff during testing.
    • Need more intelligence
Why Structured Learning Challenging?
• Machine has to learn to do planning
  • Machine generates objects component-by-component, but it should have a big picture in its mind.
  • Because the output components have dependency, they should be considered globally.
Image Generation
Sentence Generation
Structured Learning Approach
Bottom Up (Generator): learn to generate the object at the component level
Top Down (Discriminator): evaluate the whole object, and find the best one
Generative Adversarial Network (GAN): Generator + Discriminator
Outline
Basic Idea of GAN
GAN as structured learning
Can Generator learn by itself?
Can Discriminator generate?
A little bit theory
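Generator
NN Generator: code → image
Image: [figure]  code: (0.1, 0.9), (0.1, −0.5), (0.2, −0.1), (0.3, 0.2) (where do they come from?)
Train the NN Generator so that its output for each code vector is as close as possible to the corresponding image.
c.f. NN Classifier: image → $(y_1, y_2, \dots)$, trained to be as close as possible to the label $(1, 0, \dots)$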
Generator
Encoder in auto-encoder provides the code
Image → NN Encoder → code c
NN Generator: code vectors → image
Image: [figure]  code: (0.1, 0.9), (0.1, −0.5), (0.2, −0.1), (0.3, 0.2) (where do they come from?)
Auto-encoder
NN Encoder: input object (e.g. 28 × 28 = 784) → code c (low dimension). The code is a compact representation of the input object.
NN Decoder: code → output. The decoder can reconstruct the original object from the code.
Encoder and decoder are both trainable and are learned together.
Auto-encoder
input → NN Encoder → code → NN Decoder → output, as close as possible to the input
Randomly generate a vector as code → NN Decoder → Image?
NN Decoder = Generator
Auto-encoder
[Figure: a 2D code space, each axis from −1.5 to 1.5. The NN Decoder maps codes such as (−1.5, 0) and (1.5, 0) to images, shown alongside real examples.]
Auto-encoder
NN Generator: code vectors → image
Image: [figure]  code: (0.1, 0.9), (0.1, −0.5), (0.2, −0.1), (0.3, 0.2) (where do they come from?)
a → NN Generator → image; b → NN Generator → image; 0.5 × a + 0.5 × b → NN Generator → ?
Auto-encoder
Variational Auto-encoder (VAE)
input → NN Encoder → $(m_1, m_2, m_3)$ and $(\sigma_1, \sigma_2, \sigma_3)$
$(e_1, e_2, e_3)$: sampled from a normal distribution
$c_i = \exp(\sigma_i) \times e_i + m_i$
$(c_1, c_2, c_3)$ → NN Decoder → output
Minimize reconstruction error
input → NN Encoder → code → NN Decoder → output; NN Decoder = Generator
What do we miss?
G → Generated Image, as close as possible to the Target
It will be fine if the generator can truly copy the target image.
What if the generator makes some mistakes …….
What do we miss?
[Figure: a target image and generated images with 1 pixel error and with 6 pixel errors, labeled "Not good".]
What do we miss?
[Figure: two generated images, one labeled "Not good" and one labeled "fine".]
The relations between the components are critical.
Although highly correlated, the output components cannot influence each other.
Need deep structure to catch the relation between components.
[Figure: Layer L−1 → Layer L. Each neuron in the output layer corresponds to a pixel (ink or empty). One neuron says "Hi, Neighbor, let's keep the same color"; the reply is "No."]
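A small experiment can make the point about pixel errors concrete. The sketch below is an illustration only: the 8 × 8 images, the helper names `pixel_errors` and `has_isolated_ink`, and the "isolated ink pixel" check (a crude, hypothetical stand-in for structure) are all assumptions, not material from the slides. It shows that counting wrong pixels treats each component independently, so it cannot tell a stray dot from a plausibly extended stroke.

```python
import numpy as np

def pixel_errors(target, generated):
    """Number of pixels where the generated image differs from the target."""
    return int(np.sum(target != generated))

def has_isolated_ink(image):
    """True if some ink pixel has no ink among its 8 neighbours (hypothetical structure check)."""
    padded = np.pad(image, 1)  # zero border, so edge pixels have a full 3x3 window
    rows, cols = image.shape
    for r in range(rows):
        for c in range(cols):
            if image[r, c] == 1 and padded[r:r + 3, c:c + 3].sum() == 1:
                return True
    return False

# Target: a short vertical stroke (1 = ink, 0 = empty).
target = np.zeros((8, 8), dtype=int)
target[3:5, 3] = 1

# Image A: the target plus one stray ink pixel far from the stroke.
stray = target.copy()
stray[0, 7] = 1

# Image B: the stroke extended to the full column, six extra ink pixels.
longer = target.copy()
longer[:, 3] = 1

for name, img in [("stray dot", stray), ("longer stroke", longer)]:
    print(name, "errors:", pixel_errors(target, img), "isolated ink:", has_isolated_ink(img))
```

The image with fewer pixel errors is the one that breaks the relation between neighbouring pixels, which a per-pixel reconstruction loss does not see.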
Outline
Basic Idea of GAN
GAN as structured learning
Can Generator learn by itself?
Can Discriminator generate?
A little bit theory
Discriminator
• Discriminator is a function D (a network, can be deep), D: X → R
• Input x: an object x (e.g. an image)
• Output D(x): a scalar which represents how "good" an object x is
[Figure: D scores one image 1.0 and another image 0.1.]
Can we use the discriminator to generate objects?
Hard.
Evaluation Function, Potential Function, Energy Function …
Discriminator
• Suppose we already have a good discriminator D(x) …
How to learn the discriminator?
Enumerate all possible x !!!
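Generating with a discriminator alone means searching for the object with the highest score. A brute-force version of that search can be sketched as follows; the 3 × 3 binary images, the hand-written `toy_discriminator` (it simply prefers images close to a fixed template), and the function names are hypothetical choices made for illustration, not the lecture's method.

```python
import itertools
import numpy as np

# Hypothetical "good" object: a horizontal bar in the middle row.
TEMPLATE = np.array([[0, 0, 0],
                     [1, 1, 1],
                     [0, 0, 0]])

def toy_discriminator(x):
    """Stand-in for D(x): higher is better, 0 is the best possible score."""
    return -int(np.sum(x != TEMPLATE))

def generate_by_enumeration(score, shape=(3, 3)):
    """Return the highest-scoring binary image by scoring every candidate."""
    n = shape[0] * shape[1]
    best_x, best_score, count = None, -np.inf, 0
    for bits in itertools.product((0, 1), repeat=n):
        x = np.array(bits).reshape(shape)
        s = score(x)
        count += 1
        if s > best_score:
            best_x, best_score = x, s
    return best_x, best_score, count

best_x, best_score, count = generate_by_enumeration(toy_discriminator)
print(best_x)
print("candidates scored:", count)
```

Even this 9-pixel example scores 2^9 = 512 candidates; a 28 × 28 binary image would need 2^784, which is why enumeration is not a practical way to generate.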
Generator v.s. Discriminator
• Generator
  • Pros:
    • Easy to generate even with deep model
  • Cons:
    • Imitate the appearance
    • Hard to learn the correlation between components
• Discriminator
  • Pros:
    • Considering the big picture
  • Cons:
    • Generation is not always feasible
      • Especially when your model is deep
    • How to do negative
Generator + Discriminator
• General Algorithm
  • Given a set of positive examples, randomly generate a set of negative examples.
  • In each iteration
    • Learn a discriminator D that can discriminate positive and negative examples.
    • Generate negative examples by discriminator D: $\tilde{x} = \arg\max_{x \in X} D(x)$
v.s. generating with a generator: G → $\tilde{x}$
Benefit of GAN
• From Discriminator's point of view
  • Using generator to generate negative samples: G → $\tilde{x}$ in place of $\tilde{x} = \arg\max_{x \in X} D(x)$ (efficient)
• From Generator's point of view
  • Still generate the object component-by-component
  • But it is learned from the discriminator with global view.
(Variational) Auto-encoder v.s. GAN
[Figure: a generator G maps $(z_1, z_2)$ to $(x_1, x_2)$; panels labeled VAE and GAN.]
https://arxiv.org/abs/1512.09300
Outline
Basic Idea of GAN
GAN as structured learning
Can Generator learn by itself?
Can Discriminator generate?
A little bit theory
Generator
• A generator G is a network. The network defines a probability distribution $P_G$
z (Normal Distribution) → generator G → $x = G(z)$
$P_G(x)$ should be as close as possible to $P_{data}(x)$
x: an image (a high-dimensional vector)
$G^* = \arg\min_G Div(P_G, P_{data})$
$Div(P_G, P_{data})$: divergence between distributions $P_G$ and $P_{data}$
How to compute the divergence?
Discriminator
$G^* = \arg\min_G Div(P_G, P_{data})$
Although we do not know the distributions of $P_G$ and $P_{data}$, we can sample from them.
Sampling from $P_{data}$: sample from the database
Sampling from $P_G$: sample vectors from a normal distribution and pass them through G
Discriminator
$G^* = \arg\min_G Div(P_G, P_{data})$
[Figure: data sampled from $P_{data}$ and data sampled from $P_G$ are used to train the Discriminator.]
Example Objective Function for D (G is fixed):
$V(G, D) = E_{x \sim P_{data}}[\log D(x)] + E_{x \sim P_G}[\log(1 - D(x))]$
Training: $D^* = \arg\max_D V(G, D)$
Using the example objective function is exactly the same as training a binary classifier.
The maximum objective value is related to JS divergence.
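The link to JS divergence can be checked numerically on a small discrete example. The sketch assumes the standard result from the original GAN analysis, which the slide does not spell out: for fixed G the maximizer is D*(x) = P_data(x) / (P_data(x) + P_G(x)), and the maximum value is −2 log 2 + 2·JSD(P_data ‖ P_G). The two four-outcome distributions and the function names are made up for illustration.

```python
import numpy as np

def objective(p_data, p_g, d):
    """V(G, D) for discrete distributions, with D given as a vector of scores in (0, 1)."""
    return float(np.sum(p_data * np.log(d)) + np.sum(p_g * np.log(1.0 - d)))

def kl(p, q):
    """KL divergence between strictly positive discrete distributions."""
    return float(np.sum(p * np.log(p / q)))

def js_divergence(p, q):
    mid = 0.5 * (p + q)
    return 0.5 * kl(p, mid) + 0.5 * kl(q, mid)

# Hypothetical distributions over four outcomes.
p_data = np.array([0.1, 0.2, 0.3, 0.4])
p_g = np.array([0.4, 0.3, 0.2, 0.1])

d_star = p_data / (p_data + p_g)           # assumed optimal discriminator
v_star = objective(p_data, p_g, d_star)    # maximum objective value
v_from_jsd = -2.0 * np.log(2.0) + 2.0 * js_divergence(p_data, p_g)

print("max_D V(G, D)     :", v_star)
print("-2 log 2 + 2 JSD  :", v_from_jsd)
```

When the two distributions are identical, D* is 0.5 everywhere and the objective drops to −2 log 2, matching the intuition that a small divergence means the two sample sets are hard to discriminate.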
Discriminator
$G^* = \arg\min_G Div(P_G, P_{data})$
[Figure: data sampled from $P_{data}$ and data sampled from $P_G$ are used to train the Discriminator. When the two sets are hard to discriminate, the divergence is small; when they are easy to discriminate, the divergence is large.]
Training: $D^* = \arg\max_D V(G, D)$
Replacing the divergence with the maximum objective value: $G^* = \arg\min_G \max_D V(G, D)$
The maximum objective value is related to JS divergence.
• Initialize generator and discriminator
• In each training iteration:
Step 1: Fix generator G, and update discriminator D
Step 2: Fix discriminator D, and update generator G
$D^* = \arg\max_D V(G, D)$
Outline
Basic Idea of GAN
GAN as structured learning
Can Generator learn by itself?
Can Discriminator generate?
A little bit theory
Unsupervised Conditional Generation
Transform an object from one domain to another without paired data (e.g. style transfer)
Domain X → G → Domain Y
Examples: male → female; "It is good." "It's a good day." "I love you." → "It is bad." "It's a bad day." "I don't love you."
Unsupervised Conditional Generation
• Approach 1: Direct Transformation
  Domain X → $G_{X \to Y}$ → Domain Y. For texture or color change.
• Approach 2: Projection to Common Space
  Domain X → $EN_X$ (encoder of domain X) → $DE_Y$ (decoder of domain Y) → Domain Y. Larger change, only keep the semantics.
Direct Transformation
Domain X → $G_{X \to Y}$ → output that should become similar to domain Y
$D_Y$ (trained on Domain Y images): outputs a scalar, whether the input image belongs to domain Y or not
Direct Transformation
Domain X → $G_{X \to Y}$ → output that should become similar to domain Y, judged by $D_Y$
Not what we want: the generator can ignore its input.
Direct Transformation
Domain X → $G_{X \to Y}$ → output that should become similar to domain Y
$D_Y$: outputs a scalar, whether the input image belongs to domain Y or not
Not what we want: the generator can ignore its input.
The issue can be avoided by network design. Simpler generator makes the input and
[Tomer Galanti, et al. ICLR, 2018]
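Direct Transformation
Domain X → $G_{X \to Y}$ → output that should become similar to domain Y
$D_Y$: outputs a scalar, whether the input image belongs to domain Y or not
The input and the output of the generator are each passed through a pre-trained Encoder Network; the two embeddings should be as close as possible.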
Unsupervised Conditional Generation
• Approach 1: Direct Transformation
  Domain X → $G_{X \to Y}$ → Domain Y. For texture or color change.
• Approach 2: Projection to Common Space
  Domain X → $EN_X$ (encoder of domain X) → $DE_Y$ (decoder of domain Y) → Domain Y. Larger change, only keep the semantics.
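[Figure: Domain X image → $EN_X$ → code (Face Attribute) → $DE_X$ → image; Domain Y image → $EN_Y$ → code (Face Attribute) → $DE_Y$ → image.]
Projection to Common Space
[Figure: the same two encoder-decoder pairs, with $D_X$ (discriminator of X domain) and $D_Y$ (discriminator of Y domain) judging the decoded images.]
Projection to Common Space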
Training
Minimizing reconstruction error
Cycle Consistency: used in ComboGAN [Asha Anoosheh, et al., arXiv, 2017]
Reference
• Jun-Yan Zhu, Taesung Park, Phillip Isola, Alexei A. Efros, Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks, ICCV, 2017
• Zili Yi, Hao Zhang, Ping Tan, Minglun Gong, DualGAN: Unsupervised Dual Learning for Image-to-Image Translation, ICCV, 2017
• Tomer Galanti, Lior Wolf, Sagie Benaim, The Role of Minimal Complexity Functions in Unsupervised Learning of Semantic Mappings, ICLR, 2018
• Yaniv Taigman, Adam Polyak, Lior Wolf, Unsupervised Cross-Domain Image Generation, ICLR, 2017
• Asha Anoosheh, Eirikur Agustsson, Radu Timofte, Luc Van Gool, ComboGAN: Unrestrained Scalability for Image Domain Translation, arXiv, 2017
• Amélie Royer, Konstantinos Bousmalis, Stephan Gouws, Fred Bertsch, Inbar Mosseri, Forrester Cole, Kevin Murphy, XGAN: Unsupervised Image-to-Image Translation for Many-to-Many Mappings, arXiv, 2017