• No results found

Introduction: Mathematical optimization

N/A
N/A
Protected

Academic year: 2021

Share "Introduction: Mathematical optimization"

Copied!
31
0
0

Loading.... (view fulltext now)

Full text

(1)

Introduction: Mathematical optimization

Motivating Example Applications

Least-squares(LS) and linear programming(LP)-Very common place

Course goals and topics Nonlinear optimization

Brief history of convex optimization

(2)

Almost Every Problem can be posed as an Optimization Problem

Given a setC⊆ ℜn find the ellipsoidE⊆ ℜ n that is of smallest volume suchthatC⊆E. Hint: First work out the problem in lower dimensions

C

x in C is a vector of size n

x = [x1,x2...xn]

Constraint: x1^2/a1^2 + x2^2/a2^2

+ ....xn^2/an^2 <= 1

a2

a1

(3)

Almost Every Problem can be posed as an Optimization Problem

Given a setC⊆ ℜ n find the ellipsoidE⊆ ℜ n that is of smallest volume suchthatC⊆E. Hint: First work out the problem in lower dimensions

SphereS r ⊆ ℜn centered at0is expressedas:

Sr = { ||x||_2 <= r}

2-norm is the square root of sum of squares

of the individual components of x

(4)

Almost Every Problem can be posed as an Optimization Problem

u

A'

Our basic ellipsoid

had A' = diagonal

Given a setC⊆ ℜ n find the ellipsoidE⊆ ℜ n that is of smallest volume suchthatC⊆E. Hint: First work out the problem in lower dimensions

SphereS r⊆ ℜn centered at0is expressed as:S r ={u∈ ℜ n|∥u∥2≤r}

EllipsoidE⊆ ℜ n is expressedas:

Av + b

A'u + b'

(5)

Almost Every Problem can be posed as an Optimization Problem

Given a setC⊆ ℜ n find the ellipsoidE⊆ ℜ n that is of smallest volume suchthatC⊆E. Hint: First work out the problem in lower dimensions

SphereS r⊆ ℜn centered at0is expressed as:S r ={u∈ ℜ n|∥u∥2≤r}

EllipsoidE⊆ ℜ n is expressedas: n E= v∈ ℜ|Av+b∈S{ 1} { n 2 = v∈ ℜ |∥Av+b∥ ≤ } n ++

1 . Here,A∈S , that is,Ais

ann×n(strictly) positive definitematrix. The optimization problem willbe:

1) A is an nxn matrix

(Sphere and Ellipsoid are both

in R^n)

2) This brings an additional

constraint that A is symmetic,

and it is positive definite

Prof. Ganesh Ramakrishnan (IIT Bombay) Introduction to Convex Optimization : CS709 July 17,2018 4 / 42

3)That is, A has positive eigen

values..

4) The positive eigenvalues will

correspond to scaling of the axis

and corresponding eigenvectors

to the new axes

(6)

Almost Every Problem can be posed as an Optimization Problem

Given a setC⊆ ℜ n find the ellipsoidE⊆ ℜ n that is of smallest volume suchthatC⊆E. Hint: First work out the problem in lower dimensions

SphereS r⊆ ℜn centered at0is expressed as:S r ={u∈ ℜ n|∥u∥2≤r}

EllipsoidE⊆ ℜ n is expressedas: n

E= v∈ ℜ|Av+b∈S{ 1}= v∈ ℜ|∥Av+b∥{ n

2≤1 . Here,A} ∈S ++n , that is,Ais

ann×n(strictly) positive definite matrix. The optimization problem willbe:

minimize

[a11,a12..,ann,b1,..bn]

det(A−1)

subjecttov TAv>0,∀v̸= 0

A is positive definite

(7)

Almost Every Problem can be posed as an Optimization Problem (contd.)

Given a polygonP⊆ ℜ n find the ellipsoidE⊆ ℜ n that is of smallest volume suchthat

et v1,v 2, ...vp be the corners of thepolygonP P ⊆E.

L

The optimization problem willbe: minimize [a11,a12..,ann,b1,..bn] subject to−v i det(A−1) TAv>0,∀v̸= 0 ∥Av +b∥ 2≤1,i∈{1..p}

Introduction to Convex Optimization: CS709 July 17, 2018 5 / 42

v1

v2

v3

v4

v5

Given that the specified set S is indeed a polygon, is this problem

with a simplified set of constraints equivalent to the original problem?

(8)

Almost Every Problem can be posed as an Optimization Problem (contd.)

Given a polygonP⊆ ℜ n find the ellipsoidE⊆ ℜ n that is of smallest volume suchthat P⊆E.

Let v1,v 2, ...vp be the corners of the polygonP

The optimization problem willbe: minimize [a11,a12..,ann,b1,..bn] subject to−v det(A−1) TAv>0,∀v̸= 0 ∥Avi +b∥ 2≤1,i∈{1..p}

How would you pose an optimization problem to find the ellipsoidE of largest volume that

(9)

So Again: Mathematical optimization

fi :ℜ n→ℜ,i= 1, ...,m: constraintfunctions

minixmize f0(x)

subject tofi(x)≤b i,i= 1, . . . ,m. x= (x 1, ...,x n): optimizationvariables

optimal solutionx has smallest value off 0 among all vectors that satisfy theconstraints

(10)

Examples

portfolio optimization

variables: amounts invested in differentassets

(11)

Examples

Data fitting- Machine learning

variables: modelparameters

constraints: prior information, parameter limits objective: measure of misfit or prediction error

(12)
(13)

Problem in Perspective

1,c2, . . . ,ck}

Given data pointsx i,i= 1,2, . . . ,m

Possible class choices:c 1,c 2, . . . ,c k

Wish to generate amapping/classifier

f:x→{ c

To get classlabelsy1,y2, . . . ,ym

(14)

Problem in Perspective

In general, series ofmappings

x−→y −→z −→{c ,c , . . .,f(·) g(·) h(·) 1 2 c }k

(15)

Problem in Perspective

In general, series ofmappings

x−→y −→f(·) g(·) z −→{c ,h(·) 1c , . . . ,c }2 k

wherey,zare in some latentspace.

(16)

Perceptron Classifier

Consider abinary classificationproblem:f(x)∈{−1,+1}

w^Tx + b

w

Objective: Learn a linear classifier With linearlyseparability

(17)

Perceptron Classifier

Consider abinary classificationproblem:f(x)∈{−1,+1}

Objective: Learn alinear classifier

With linearly separability, any finite time learningalgorithm?

(18)

Perceptron Classifier

Consider abinary classificationproblem:f(x)∈{−1,+1}

Desirable: Any new input pattern similar to a seen patternisclassifiedcorrectly

Linear Classification?

wϕ(x) +b≥0for +ve points (y= +1)

wϕ(x) +b<0for -ve points (y= -1)

(19)
(20)

Perceptron Update Rule: Error Perspective

Unsigned distance from hyperplane:ywT(ϕ(x))

wTϕ(x) = 0b ϕ(x0)

ϕ(x)

D

w

(21)

Perceptron Update Rule: Error Minimization

Perceptron update tries to minimize the errorfunction

E= negative of sum of unsigned distances over misclassified examples = sum of

misclassificationcosts

E=− ywTϕ(x)

(x,y)∈M

whereM⊆Dis the set of misclassifiedexamples.

(22)

Machine Learning as Optimization

wb=argmin L(w) +Ω(∥w∥2)

w (1)

0-1 Loss:

L(w) = (x,y) δ (y̸=w Tϕ(x)) (2)

Minimizing the 0-1 Loss is NP-hard. We therefore look forsurrogates. Perceptron:A Non-convexSurrogate

L(w) =− (x,y)∈MywTϕ(x) (3)

(23)

Convex Surrogates for 0-1 Loss in Classification

wb=argmin L(w) +Ω(∥w∥2) w (4) Logistic Regression: L(w) =− 1∑m m i=1 y w ϕ((i) T (i) ( ( T

x )−log 1 +exp w ϕ x( (i)) )

)

(5)

Sigmoidal Neural Net:

1 L(w) =− m m K ∑ ∑ (i) ( L (i) ) yk log σk x + ( ) ( (i) k (

1−y log 1−σ kxL (i)

) ( ) )

(6)

i=1 k=1

(24)

More Generally..

xrepresents some action suchas

portfolio decisions to be maderesources to beallocatedschedule to becreatedvehicle/airline deflections

Constraints impose conditions on outcome basedon

performancerequirementsmanufacturingprocess

Objectivef 0(x)might correspond to one of the following and should be desirably smalltotal cost

risk

(25)

Solving optimization problems

General optimization problems very difficult tosolve

methods involve some compromise, e.g., very long computation time, or not always finding thesolution

Exceptions:certain problem classes can be solved efficiently and reliably least-squaresproblems

linear programming problems convex optimization problems

(26)

Least-squares

minixmize∥ Ax−b∥ 2

2

solving least-squares problems

analytical solution: x= (ATA) −1ATb

reliable and efficient algorithms andsoftware

computation time proportional to n2k (A∈R k×n); less if structured a

maturetechnology using least-squares

least-squares problems are easy to recognize

a few standard techniques increase flexibility (e.g., including weights, adding regularization terms)

(27)

Linear programming

minixmize cTx T i

subject toa x≥b i,i= 1, . . . ,m.

solving linear programs

no analytical formula forsolution

reliable and efficient algorithms andsoftware

computation time proportional to n2m if m≥n; less with structure a

maturetechnology using linear programs

not as easy to recognize as least-squaresproblems

a few standard tricks used to convert problems into linear programs (e.g., problems involving l1- or l-norms, piecewise-linearfunctions)

[OPTIONAL]

(28)

Convex optimization problem

minixmize f0(x)

subject tofi(x)≤b i,i= 1, . . . ,m.

objective and constraint functions areconvex:

fi(αx1 +βx 2)≤αf i(x1) +βfi(x2)

(29)

Example: m lamps illuminating n(small, flat) patches

k

intensity Ik at patch k depends linearly on lamp powerspj: n

I = a p ,kj jakj=rkj−2max{cosθ ,0}kj

j

j=1

problem:Provided the fixed locations(a kj’s), achieve desired illumination Ides with bounded

lamp powers

minipmize maxk=1,..,n |log(I k)−log(Ides)|

subject to0≤p j≤p max,j= 1, . . . ,m.

[OPTIONAL]

(30)

Course goals and topics

Goals

recognize/formulate problems (such as the illumination problem) as convex optimization problem

develop code for problems of moderate size (1000 lamps, 5000patches)

characterize optimal solution (optimal power distribution), give limits of performance,etc Topics

Convex sets, (Univariate) Functions, Optimization problem Unconstrained Optimization: Analysis and Algorithms Constrained Optimization: Analysis and Algorithms Optimization Algorithms for MachineLearning

(31)

Grading and Audit

Grading

Quizzes and Assignments: 15% Midsem:25%

Endsem: 45% Project:15% Audit requirement

Quizzes and Assignments and Project

References

Related documents

                r† out E out E ϕ ϕ รูปที- 3.10 การปรับมุมการวางของแผ่นสะท้อน นอกจากการจัดเฟสแผ่นสะท้อนทั#ง s

In another research, data on the concentrations of seven air pollutants (CH 4 , NMHC, CO, CO 2 , NO, NO 2 , and SO 2 ) and meteorological variables (wind speed and

5 Use your answer from Question 4, your value for the boiling point of ester X, your value for the melting point of the pure sample of carboxylic acid Y and the data provided in

Changes to the credit risk standard for standardised banks are likely to be particularly relevant for NBDTs, although the Reserve Bank will continue to tailor the NBDT

Normal metabolic effects of thyroid hormone on different tissues: Liver: increase glycolysis, cholesterol synthesis, and conversion of cholesterol into bile salts Adipose: amplifies

As we will see, there is a possibility to implement SUSY Ward identities for theories with exact supersymmetry and an S -matrix invariant under SUSY transformations, by examining

The external CSR is the obligation of the business firm to consider their external aspects, like use of resources, caring for their society and work closely to the benefit of

Use Figure 17.8 in your text to label the following elements of the figure below: TATA box, RNA polymerase II, transcription factors, template DNA strand, start point, 5' and 3',