Introduction: Mathematical optimization
Motivating Example Applications
Least-squares(LS) and linear programming(LP)-Very common place
Course goals and topics Nonlinear optimization
Brief history of convex optimization
Almost Every Problem can be posed as an Optimization Problem
Given a setC⊆ ℜn find the ellipsoidE⊆ ℜ n that is of smallest volume suchthatC⊆E. Hint: First work out the problem in lower dimensions
C
x in C is a vector of size n
x = [x1,x2...xn]
Constraint: x1^2/a1^2 + x2^2/a2^2
+ ....xn^2/an^2 <= 1
a2
a1
Almost Every Problem can be posed as an Optimization Problem
Given a setC⊆ ℜ n find the ellipsoidE⊆ ℜ n that is of smallest volume suchthatC⊆E. Hint: First work out the problem in lower dimensions
SphereS r ⊆ ℜn centered at0is expressedas:
Sr = { ||x||_2 <= r}
2-norm is the square root of sum of squares
of the individual components of x
Almost Every Problem can be posed as an Optimization Problem
u
A'
Our basic ellipsoid
had A' = diagonal
Given a setC⊆ ℜ n find the ellipsoidE⊆ ℜ n that is of smallest volume suchthatC⊆E. Hint: First work out the problem in lower dimensions
SphereS r⊆ ℜn centered at0is expressed as:S r ={u∈ ℜ n|∥u∥2≤r}
EllipsoidE⊆ ℜ n is expressedas:
Av + b
A'u + b'
Almost Every Problem can be posed as an Optimization Problem
Given a setC⊆ ℜ n find the ellipsoidE⊆ ℜ n that is of smallest volume suchthatC⊆E. Hint: First work out the problem in lower dimensions
SphereS r⊆ ℜn centered at0is expressed as:S r ={u∈ ℜ n|∥u∥2≤r}
EllipsoidE⊆ ℜ n is expressedas: n E= v∈ ℜ|Av+b∈S{ 1} { n 2 = v∈ ℜ |∥Av+b∥ ≤ } n ++
1 . Here,A∈S , that is,Ais
ann×n(strictly) positive definitematrix. The optimization problem willbe:
1) A is an nxn matrix
(Sphere and Ellipsoid are both
in R^n)
2) This brings an additional
constraint that A is symmetic,
and it is positive definite
Prof. Ganesh Ramakrishnan (IIT Bombay) Introduction to Convex Optimization : CS709 July 17,2018 4 / 42
3)That is, A has positive eigen
values..
4) The positive eigenvalues will
correspond to scaling of the axis
and corresponding eigenvectors
to the new axes
Almost Every Problem can be posed as an Optimization Problem
Given a setC⊆ ℜ n find the ellipsoidE⊆ ℜ n that is of smallest volume suchthatC⊆E. Hint: First work out the problem in lower dimensions
SphereS r⊆ ℜn centered at0is expressed as:S r ={u∈ ℜ n|∥u∥2≤r}
EllipsoidE⊆ ℜ n is expressedas: n
E= v∈ ℜ|Av+b∈S{ 1}= v∈ ℜ|∥Av+b∥{ n
2≤1 . Here,A} ∈S ++n , that is,Ais
ann×n(strictly) positive definite matrix. The optimization problem willbe:
minimize
[a11,a12..,ann,b1,..bn]
det(A−1)
subjecttov TAv>0,∀v̸= 0
A is positive definite
Almost Every Problem can be posed as an Optimization Problem (contd.)
Given a polygonP⊆ ℜ n find the ellipsoidE⊆ ℜ n that is of smallest volume suchthat
et v1,v 2, ...vp be the corners of thepolygonP P ⊆E.
L
The optimization problem willbe: minimize [a11,a12..,ann,b1,..bn] subject to−v i det(A−1) TAv>0,∀v̸= 0 ∥Av +b∥ 2≤1,i∈{1..p}
Introduction to Convex Optimization: CS709 July 17, 2018 5 / 42
v1
v2
v3
v4
v5
Given that the specified set S is indeed a polygon, is this problem
with a simplified set of constraints equivalent to the original problem?
Almost Every Problem can be posed as an Optimization Problem (contd.)
Given a polygonP⊆ ℜ n find the ellipsoidE⊆ ℜ n that is of smallest volume suchthat P⊆E.
Let v1,v 2, ...vp be the corners of the polygonP
The optimization problem willbe: minimize [a11,a12..,ann,b1,..bn] subject to−v det(A−1) TAv>0,∀v̸= 0 ∥Avi +b∥ 2≤1,i∈{1..p}
How would you pose an optimization problem to find the ellipsoidE ′ of largest volume that
So Again: Mathematical optimization
fi :ℜ n→ℜ,i= 1, ...,m: constraintfunctions
minixmize f0(x)
subject tofi(x)≤b i,i= 1, . . . ,m. x= (x 1, ...,x n): optimizationvariables
optimal solutionx ∗ has smallest value off 0 among all vectors that satisfy theconstraints
Examples
portfolio optimization
variables: amounts invested in differentassets
Examples
Data fitting- Machine learning
variables: modelparameters
constraints: prior information, parameter limits objective: measure of misfit or prediction error
Problem in Perspective
1,c2, . . . ,ck}
Given data pointsx i,i= 1,2, . . . ,m
Possible class choices:c 1,c 2, . . . ,c k
Wish to generate amapping/classifier
f:x→{ c
To get classlabelsy1,y2, . . . ,ym
Problem in Perspective
In general, series ofmappings
x−→y −→z −→{c ,c , . . .,f(·) g(·) h(·) 1 2 c }k
Problem in Perspective
In general, series ofmappings
x−→y −→f(·) g(·) z −→{c ,h(·) 1c , . . . ,c }2 k
wherey,zare in some latentspace.
Perceptron Classifier
Consider abinary classificationproblem:f(x)∈{−1,+1}
w^Tx + b
w
Objective: Learn a linear classifier With linearlyseparability
Perceptron Classifier
Consider abinary classificationproblem:f(x)∈{−1,+1}
Objective: Learn alinear classifier
With linearly separability, any finite time learningalgorithm?
Perceptron Classifier
Consider abinary classificationproblem:f(x)∈{−1,+1}
Desirable: Any new input pattern similar to a seen patternisclassifiedcorrectly
Linear Classification?
w⊤ϕ(x) +b≥0for +ve points (y= +1)
w⊤ϕ(x) +b<0for -ve points (y= -1)
Perceptron Update Rule: Error Perspective
Unsigned distance from hyperplane:ywT(ϕ(x))
wTϕ(x) = 0b ϕ(x0)
ϕ(x)
D
w
Perceptron Update Rule: Error Minimization
Perceptron update tries to minimize the errorfunction
E= negative of sum of unsigned distances over misclassified examples = sum of
misclassificationcosts ∑
E=− ywTϕ(x)
(x,y)∈M
whereM⊆Dis the set of misclassifiedexamples.
Machine Learning as Optimization
wb∗ =argmin L(w) +Ω(∥w∥2)
w (1)
0-1 Loss:
L(w) = ∑(x,y) δ (y̸=w Tϕ(x)) (2)
Minimizing the 0-1 Loss is NP-hard. We therefore look forsurrogates. Perceptron:A Non-convexSurrogate
L(w) =− ∑(x,y)∈MywTϕ(x) (3)
Convex Surrogates for 0-1 Loss in Classification
wb∗ =argmin L(w) +Ω(∥w∥2) w (4) Logistic Regression: L(w) =− 1∑m m i=1 y w ϕ((i) T (i) ( ( Tx )−log 1 +exp w ϕ x( (i)) )
)
(5)
Sigmoidal Neural Net:
1 L(w) =− m m K ∑ ∑ (i) ( L (i) ) yk log σk x + ( ) ( (i) k (
1−y log 1−σ kxL (i)
) ( ) )
(6)
i=1 k=1
More Generally..
xrepresents some action suchas
▶ portfolio decisions to be made ▶resources to beallocated ▶schedule to becreated ▶vehicle/airline deflections
Constraints impose conditions on outcome basedon
▶performancerequirements ▶manufacturingprocess
Objectivef 0(x)might correspond to one of the following and should be desirably small ▶total cost
▶risk
Solving optimization problems
General optimization problems very difficult tosolve
methods involve some compromise, e.g., very long computation time, or not always finding thesolution
Exceptions:certain problem classes can be solved efficiently and reliably least-squaresproblems
linear programming problems convex optimization problems
Least-squares
minixmize∥ Ax−b∥ 2
2
solving least-squares problems
analytical solution: x∗ = (ATA) −1ATb
reliable and efficient algorithms andsoftware
computation time proportional to n2k (A∈R k×n); less if structured a
maturetechnology using least-squares
least-squares problems are easy to recognize
a few standard techniques increase flexibility (e.g., including weights, adding regularization terms)
Linear programming
minixmize cTx T i
subject toa x≥b i,i= 1, . . . ,m.
solving linear programs
no analytical formula forsolution
reliable and efficient algorithms andsoftware
computation time proportional to n2m if m≥n; less with structure a
maturetechnology using linear programs
not as easy to recognize as least-squaresproblems
a few standard tricks used to convert problems into linear programs (e.g., problems involving l1- or l∞-norms, piecewise-linearfunctions)
[OPTIONAL]
Convex optimization problem
minixmize f0(x)
subject tofi(x)≤b i,i= 1, . . . ,m.
objective and constraint functions areconvex:
fi(αx1 +βx 2)≤αf i(x1) +βfi(x2)
Example: m lamps illuminating n(small, flat) patches
k
intensity Ik at patch k depends linearly on lamp powerspj: n
∑
I = a p ,kj jakj=rkj−2max{cosθ ,0}kj
j
j=1
problem:Provided the fixed locations(a kj’s), achieve desired illumination Ides with bounded
lamp powers
minipmize maxk=1,..,n |log(I k)−log(Ides)|
subject to0≤p j≤p max,j= 1, . . . ,m.
[OPTIONAL]
Course goals and topics
Goals
recognize/formulate problems (such as the illumination problem) as convex optimization problem
develop code for problems of moderate size (1000 lamps, 5000patches)
characterize optimal solution (optimal power distribution), give limits of performance,etc Topics
Convex sets, (Univariate) Functions, Optimization problem Unconstrained Optimization: Analysis and Algorithms Constrained Optimization: Analysis and Algorithms Optimization Algorithms for MachineLearning
Grading and Audit
Grading
Quizzes and Assignments: 15% Midsem:25%
Endsem: 45% Project:15% Audit requirement
Quizzes and Assignments and Project