The Euler–Lagrange equation - Optimization in Geometry and Physics

This chapter closely follows the presentation in [6, Ch. 1], though in less detail.

3.1. Function spaces

The one-dimensional optimization problems that we study are defined through the minimization (or maximization) of functionals $J$ of integral form, i.e.,

$$J(y) = \int_a^b F(x, y(x), y'(x)) \, dx$$

for real-valued functions $y : [a, b] \to \mathbb{R}$, or of the form

$$J(x, y) = \int_a^b F(x(t), y(t), \dot{x}(t), \dot{y}(t)) \, dt$$

for curves $(x, y) : [a, b] \to \mathbb{R}^2$.

The first step is to describe the domain of $J$. In order for the integral to be well defined, we assume that the so-called Lagrangian (or Lagrange function)

$$F : [a, b] \times \mathbb{R} \times \mathbb{R} \to \mathbb{R}$$

is continuous, and that $[a, b] := \{x \mid a \le x \le b\}$ is a compact interval in $\mathbb{R}$.

We use the following (common) notation for function spaces.

Definition 3.1. The set of continuous functions on $[a, b]$ is defined by

$$C[a, b] := \{ y \mid y : [a, b] \to \mathbb{R} \text{ is continuous} \}.$$

The set of continuously differentiable functions on $[a, b]$ is defined by

$$C^1[a, b] := \{ y \mid y \in C[a, b],\ y \text{ is differentiable on } [a, b],\ y' \in C[a, b] \}.$$

Here, we say that $y$ is differentiable at the boundary points $a, b$ if the one-sided derivatives exist.

The set of piecewise continuously differentiable functions on $[a, b]$ is defined by

$$C^{1,pw}[a, b] := \{ y \mid y \in C[a, b],\ y \in C^1[x_{i-1}, x_i],\ i = 1, \dots, m \},$$

where $a = x_0 < x_1 < \dots < x_m = b$ is a partition of $[a, b]$ (depending on $y$).

Remark 3.2. Obviously,

$$C^1[a, b] \subset C^{1,pw}[a, b] \subset C[a, b],$$

and all three sets are infinite-dimensional vector spaces over $\mathbb{R}$ with the usual pointwise addition and scalar multiplication, i.e., for all functions $y_1$ and $y_2$ and $\alpha \in \mathbb{R}$,

$$(y_1 + y_2)(x) = y_1(x) + y_2(x), \qquad (\alpha y_1)(x) = \alpha y_1(x).$$


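For $y \in C^{1,pw}[a, b]$ the functional is understood as the sum of the integrals over the subintervals of the partition,

$$J(y) = \sum_{i=1}^m \int_{x_{i-1}}^{x_i} F(x, y(x), y'(x)) \, dx.$$

Note that this sum of integrals exists, because by definition the integrand is continuous on each subinterval $[x_{i-1}, x_i]$.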

In order to measure the distance between two functions, and indeed to differentiate on function spaces, we need to define norms on these spaces.

Definition 3.3. For $y \in C[a, b]$ we define

$$\|y\|_0 := \|y\|_{C[a,b]} := \max_{x \in [a, b]} |y(x)|.$$

The distance between two functions $y_1, y_2$ in a function space $X$ is given by the norm of their difference, i.e.,

$$\operatorname{dist}(y_1, y_2) := \|y_1 - y_2\|_X.$$

In the case of the space C^1,pw[a, b] one has to consider the union of the partitions of both functions.
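As a rough numerical illustration of the distance in $C[a, b]$, the maximum norm can be approximated by sampling on a grid. This is only a sketch: the helper names `sup_norm` and `dist0`, the grid size and the two sample functions are our own choices, and a grid maximum in general only approximates the true maximum from below.

```python
import math

def sup_norm(f, a, b, n=10_000):
    """Approximate max |f(x)| over [a, b] on a uniform grid of n + 1 points."""
    return max(abs(f(a + (b - a) * k / n)) for k in range(n + 1))

def dist0(y1, y2, a, b, n=10_000):
    """Approximate the distance ||y1 - y2||_0 in C[a, b]."""
    return sup_norm(lambda x: y1(x) - y2(x), a, b, n)

# Distance between sin(x) and x on [0, 1]; x - sin(x) is increasing there,
# so the maximum is attained at the right end point x = 1.
d = dist0(math.sin, lambda x: x, 0.0, 1.0)
print(d, 1.0 - math.sin(1.0))
```

Here the grid contains the maximizing point, so the approximation agrees with $1 - \sin 1$ up to rounding.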

The following lemma shows that the above definitions are norms, that is, they allow us to measure lengths and distances of vectors (here, functions) while respecting the vector space structure of the underlying function space.

Lemma 3.4. The maps $\|\cdot\| : X \to \mathbb{R}$ defined in Definition 3.3 for $X = C[a, b]$, $C^1[a, b]$ and $C^{1,pw}[a, b]$ are norms, that is, they satisfy

(i) $\|y\| \ge 0$ for all $y \in X$, and $\|y\| = 0 \Leftrightarrow y = 0$,

(ii) $\|\alpha y\| = |\alpha| \, \|y\|$ for all $\alpha \in \mathbb{R}$, $y \in X$,

(iii) $\|y_1 + y_2\| \le \|y_1\| + \|y_2\|$ for all $y_1, y_2 \in X$ (triangle inequality).

Hence these function spaces with their respective norms are so-called normed vector spaces.

Proof. See Assignment 2.

The reason for defining a norm is that one is able to study convergence and continuity on these function spaces, and later differentiability. We define these notions analogous to those in Real Analysis (compare to your Calculus and Analysis classes).

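Definition 3.5. We say that a sequence $(y_n)_{n \in \mathbb{N}}$ in $X$ converges to $y_0 \in X$, for short $\lim_{n \to \infty} y_n = y_0$, if $\lim_{n \to \infty} \|y_n - y_0\|_X = 0$.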


Remark 3.6. Note that continuity is equivalent to sequential continuity here (the proof is analogous to that for $\mathbb{R}$). That is, $J$ is continuous at $y_0 \in D$ if and only if for each sequence $(y_n)_{n \in \mathbb{N}} \subseteq D \subseteq X$,

$$\lim_{n \to \infty} y_n = y_0 \implies \lim_{n \to \infty} J(y_n) = J(y_0).$$

Remark 3.7. One can show that the spaces $C[a, b]$ and $C^1[a, b]$ with the norms defined above are complete normed spaces (also called Banach spaces), i.e., every Cauchy sequence converges in those spaces. This is not true for $C^{1,pw}[a, b]$, and hence more complicated Sobolev spaces are generally used in the calculus of variations. However, for our purposes with "broken extrema", the use of $C^{1,pw}[a, b]$ is sufficient.

3.2. The first variation

Finite-dimensional optimization problems can be solved using differentiation. There, the domain is R or Rⁿ. In the following we will introduce a notion of differentiability on an infinite-dimensional function space, based on directional derivatives. Differentiation requires knowledge of the function near a given value.

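Definition 3.8. Let $X$ be a normed vector space, $D$ an open subset of $X$, and $J : D \to \mathbb{R}$. If for $y \in D$ and $h \in X$ the limit

$$dJ(y, h) := \lim_{t \to 0} \frac{J(y + th) - J(y)}{t}$$

exists, then we say that $J$ is Gâteaux-differentiable in $y$ in direction $h$.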

Note that what we actually do in Definition 3.8, since $y$ and $h$ are fixed, is differentiate at $t = 0$ the real-valued function

$$g : (-\varepsilon, \varepsilon) \to \mathbb{R}, \qquad g(t) := J(y + th),$$

something that we already know how to do!
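The reduction to a one-variable function can be tried out numerically. The following is a minimal sketch under our own assumptions: the functional $J(y) = \int_0^1 (y')^2 \, dx$, the point $y(x) = x^2$, the direction $h(x) = \sin(\pi x)$, the midpoint rule and the step size are all illustrative choices, not taken from the text. For this choice a direct computation gives $g'(0) = 4\pi \int_0^1 x \cos(\pi x) \, dx = -8/\pi$.

```python
import math

def integrate(f, a, b, n=2000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    w = (b - a) / n
    return w * sum(f(a + (k + 0.5) * w) for k in range(n))

def J(dy):
    """J(y) = integral of (y')^2 over [0, 1]; only y' enters this Lagrangian."""
    return integrate(lambda x: dy(x) ** 2, 0.0, 1.0)

dy = lambda x: 2.0 * x                          # derivative of y(x) = x^2
dh = lambda x: math.pi * math.cos(math.pi * x)  # derivative of h(x) = sin(pi x)

def g(t):
    """g(t) = J(y + t h), a real-valued function of the real variable t."""
    return J(lambda x: dy(x) + t * dh(x))

t = 1e-4
dJ = (g(t) - g(-t)) / (2.0 * t)  # central difference approximating g'(0)
print(dJ, -8.0 / math.pi)
```

Since $g$ is a quadratic polynomial in $t$ here, the central difference is exact up to quadrature and rounding errors.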

Remark 3.9. One can show that $dJ(y, \alpha h) = \alpha \, dJ(y, h)$ for any $\alpha \in \mathbb{R}$. This means that the Gâteaux differential is homogeneous. In general, $dJ(y, h)$ need not be linear or continuous in $h$ (see the counterexample in [6, Sec. 1.2]). In the finite-dimensional case, i.e., for $J : \mathbb{R}^n \to \mathbb{R}$, one does obtain by the chain rule that

$$dJ(y, h) = \langle \nabla J(y), h \rangle,$$

where $\langle \cdot, \cdot \rangle$ is the usual Euclidean scalar product in $\mathbb{R}^n$. So in the finite-dimensional case linearity does actually hold.

Since linearity does not hold in general, we introduce a separate notion for the case in which it does.

Definition 3.10. If dJ (y, h) exists for y ∈ D ⊆ X and for h ∈ X, and dJ (y, h) is linear in h, then we call

δJ (y)h := dJ (y, h) the first variation of J in y in direction h.

If the first variation exists for all h ∈ X₀ ⊆ X, where X₀ is a subspace of X, then δJ (y) : X0→ R

is a linear functional.


If $X_0 = X = \mathbb{R}^n$ and $J : \mathbb{R}^n \to \mathbb{R}$ is totally differentiable, then

$$\delta J(y)h = \langle \nabla J(y), h \rangle,$$

since the scalar product is linear in $h$. In the finite-dimensional case, linearity implies continuity, so that $\delta J(y) : X_0 \to \mathbb{R}$ is also continuous. If $X$ is infinite-dimensional, linearity need not imply continuity (see Assignment 2).

In what follows we provide a sufficient condition for the existence and linearity of the Gâteaux differential. In order to formulate the theorem we define the set of piecewise continuously differentiable functions that vanish at the boundary.

Definition 3.11.

C₀^1,pw[a, b] := C^1,pw[a, b] ∩ {h(a) = 0 = h(b)}.

Proposition 3.12. Consider the functional $J : C^{1,pw}[a, b] \supseteq D \to \mathbb{R}$, defined by

$$J(y) := \int_a^b F(x, y, y') \, dx.$$

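Assume that the Lagrangian $F : [a, b] \times \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ is continuous and continuously differentiable with respect to the last two variables. Furthermore, suppose that for every $h \in C_0^{1,pw}[a, b]$ and for $y \in D$ the translated function $y + th \in D$ for all $t \in (-\varepsilon, \varepsilon)$ (here, $\varepsilon > 0$ depends on $h$). Then the Gâteaux differential $dJ(y, h)$ exists and is linear in $h$, so that the first variation is given by

$$\delta J(y)h = \int_a^b \bigl( F_y(x, y, y') \, h + F_{y'}(x, y, y') \, h' \bigr) \, dx. \tag{3.1}$$

Proof. Let $a = x_0 < x_1 < \dots < x_m = b$ be a common partition for both functions $y$ and $h$. Note that

$$dJ(y, h) = \lim_{t \to 0} \sum_{i=1}^m \int_{x_{i-1}}^{x_i} \frac{F(x, y + th, y' + th') - F(x, y, y')}{t} \, dx. \tag{3.2}$$

Two steps are necessary to finish the proof. We need to exchange the limit with integration, and we need to compute the limit under the integral. We start with the second step because it also shows that the convergence is uniform, which immediately implies that we can exchange the limit with integration in (3.2).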

We first express the integrand in a suitable way so that we can compute the limit. Using the Fundamental Theorem of Calculus, partial differentiation and the chain rule we obtain for

¹ Partial derivatives are defined as derivatives of a function of multiple variables when all but the variable of interest are held fixed during the differentiation. For more details see Calculus B and Analysis 2.

because the additional colored terms on the last right-hand side add up to zero.

We next show that, due to the uniform continuity of $F_y$ and $F_{y'}$ on compact sets, the last two integral terms converge to 0 as $t \to 0$ uniformly for all $x \in [x_{i-1}, x_i]$: Let $\varepsilon > 0$. Since $y$ and

for $|t|$ and $\|h\|_{1,pw}$ sufficiently small (depending on $\varepsilon$ and so that $|t| \, \|h\|_{1,pw} < \delta(\varepsilon)$; see [6, Prop. 1.2.1] for all the details). A similar estimate holds for the second integrand in (3.3) involving $F_{y'}$. Thus (3.3) implies for $|t|$ sufficiently small. Hence, as $t \to 0$, the difference quotient converges:

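$$\lim_{t \to 0} \frac{1}{t} \bigl[ F(x, y(x) + th(x), y'(x) + th'(x)) - F(x, y(x), y'(x)) \bigr] = F_y(x, y(x), y'(x)) \, h(x) + F_{y'}(x, y(x), y'(x)) \, h'(x), \tag{3.4}$$

where the convergence is uniform, i.e., its rate is independent of $x \in [x_{i-1}, x_i]$, $i = 1, \dots, m$.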


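Due to this uniform convergence, the limit can be exchanged with integration in (3.2) (it can always be exchanged with the finite sum $\sum$), which implies

$$dJ(y, h) = \lim_{t \to 0} \sum_{i=1}^m \int_{x_{i-1}}^{x_i} \frac{F(x, y + th, y' + th') - F(x, y, y')}{t} \, dx = \sum_{i=1}^m \int_{x_{i-1}}^{x_i} \bigl( F_y(x, y, y') \, h + F_{y'}(x, y, y') \, h' \bigr) \, dx = \int_a^b \bigl( F_y(x, y, y') \, h + F_{y'}(x, y, y') \, h' \bigr) \, dx,$$

which is linear in $h$.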

Corollary 3.13. Under the same assumptions as in Proposition 3.12, it follows that

$$\delta J(y) : C_0^{1,pw}[a, b] \to \mathbb{R}$$

is linear and continuous for every $y \in D$. In particular,

$$|\delta J(y)h| \le C(y) \, \|h\|_{1,pw} \quad \text{for all } h \in C_0^{1,pw}[a, b],$$

where $C(y) > 0$ is a constant depending on $y \in D$.

Sketch of proof. This follows from the proof of Proposition 3.12, because $F_y$ and $F_{y'}$ are (uniformly) continuous and hence bounded on compact subsets. (Details in Assignment 3.)

Remark 3.14. If J is defined on all of C^1,pw[a, b], then one can obtain all results in Proposition 3.12 and Corollary 3.13 for h ∈ C^1,pw[a, b], since the assumption h(a) = h(b) = 0 itself is not used anywhere in the proofs (it is just needed to ensure that y + th ∈ D, since D is usually specified in terms of the boundary conditions at a and b).
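The integral formula for the first variation can be sanity-checked numerically by comparing it with a difference quotient of $t \mapsto J(y + th)$. The sketch below rests on our own illustrative assumptions: the Lagrangian $F(x, y, p) = y^2 p^2$ (with $p$ standing for $y'$), the functions $y(x) = 1 + x$ and $h(x) = \sin(\pi x)$ on $[0, 1]$, and the midpoint rule are not taken from the text.

```python
import math

def integrate(f, a, b, n=2000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    w = (b - a) / n
    return w * sum(f(a + (k + 0.5) * w) for k in range(n))

# Hypothetical Lagrangian F(x, y, p) = y^2 p^2 and its partial derivatives.
F = lambda x, y, p: y * y * p * p
F_y = lambda x, y, p: 2.0 * y * p * p
F_p = lambda x, y, p: 2.0 * y * y * p

y, dy = (lambda x: 1.0 + x), (lambda x: 1.0)
h = lambda x: math.sin(math.pi * x)             # h(0) = h(1) = 0
dh = lambda x: math.pi * math.cos(math.pi * x)

def J_shifted(t):
    """J(y + t h) for the Lagrangian above."""
    return integrate(lambda x: F(x, y(x) + t * h(x), dy(x) + t * dh(x)), 0.0, 1.0)

t = 1e-4
finite_difference = (J_shifted(t) - J_shifted(-t)) / (2.0 * t)
first_variation = integrate(
    lambda x: F_y(x, y(x), dy(x)) * h(x) + F_p(x, y(x), dy(x)) * dh(x), 0.0, 1.0)
print(finite_difference, first_variation)
```

Both numbers use the same quadrature rule, so they differ only by the truncation error of the difference quotient, which is of order $t^2$.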

3.3. The Fundamental Lemma of Calculus of Variations

In order to solve an equality of the type $\delta J(y)h = 0$, which local extrema $y$ will be shown to satisfy, we need to analyze integrals of the type

$$\int_a^b (f h + g h') \, dx = 0,$$

appearing in (3.1) for certain $f, g, h$. Ultimately this will lead us to the Fundamental Lemma of Calculus of Variations, which translates this "weak" integral equality into a "strong" differential equation between $f$ and $g$. The application to the first variation, leading to the Euler–Lagrange equation, will then follow in the next section. We start with preparations involving the relevant function spaces and integrals.

The derivative of a piecewise continuously differentiable function is piecewise continuous.

We thus introduce a notation for this space.


Definition 3.15.

$$C^{pw}[a, b] := \{ y : [a, b] \to \mathbb{R} \mid y \in C[x_{i-1}, x_i],\ i = 1, \dots, m \},$$

where the partition $a = x_0 < x_1 < \dots < x_m = b$ depends on $y$.

In this definition, y is allowed to have two different values at the jump points. Strictly speaking, such a y wouldn’t be a function, but in order to avoid further technicalities and more complicated spaces, we still work with this notion. In short, since it only affects a finite number of points it does not matter.

Our goal in this section is to investigate the weak notion of derivative that appears in (3.1).

This requires the use of so-called test functions, differentiable functions with compact support.

The space of test functions is a subset of the space C^∞[a, b] of smooth functions on [a, b] and defined as follows.

Definition 3.16.

C₀^∞[a, b] := C^∞[a, b] ∩ {h(a) = h(b) = 0}

If $f = 0$, then clearly $f h = 0$ holds for all $h \in C_0^\infty[a, b]$. In particular, $\int_a^b f h \, dx = 0$. The (weak) converse also holds, as the following result shows.

Lemma 3.17. If $f \in C^{pw}[a, b]$ and

$$\int_a^b f h \, dx = 0 \quad \text{for all } h \in C_0^\infty[a, b],$$

then $f(x) = 0$ for all $x \in [a, b]$.

Proof. The proof is carried out by contradiction. Assume there is an $x \in [x_{i-1}, x_i] \subseteq [a, b]$ such that $f(x) \ne 0$, say w.l.o.g. $f(x) > 0$. By continuity of $f$ on $[x_{i-1}, x_i]$ there exists a subinterval $I \subseteq [a, b]$ such that $f > 0$ on $I$. Choose² a test function $h \in C_0^\infty(I) \subseteq C^\infty[a, b]$ which has

Figure 3.1. Construction of I and test function h in the proof of Lemma 3.17.

² Such functions always exist, but we will not prove this fact in this course.


the same constant sign as $f$ on the interior of $I$. Then $f h > 0$ in the interior of $I$ and $f h = 0$ on $[a, b] \setminus I$. Thus $\int_a^b f h \, dx > 0$, a contradiction to the assumption that $\int_a^b f h \, dx = 0$.

A similar trick can be done for the "weak derivative" of $f$, as follows.

Lemma 3.18. If $f \in C^{pw}[a, b]$ and

$$\int_a^b f h' \, dx = 0 \quad \text{for all } h \in C_0^{1,pw}[a, b],$$

then $f$ is constant on $[a, b]$.

Proof. See Assignment 3.

If f is (piecewise) continuously differentiable, then the integral of Lemma 3.18 can be computed via partial integration.
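For reference, the standard integration-by-parts formula that this refers to can be written as follows. This is a sketch under our own assumption that both $f$ and $h$ lie in $C^{1,pw}[a, b]$; the exact hypotheses used in the course may differ.

```latex
% Integration by parts for f, h in C^{1,pw}[a,b] (assumed hypotheses)
\int_a^b f \, h' \, dx
  = \bigl[ f \, h \bigr]_a^b - \int_a^b f' \, h \, dx
  = f(b) h(b) - f(a) h(a) - \int_a^b f' \, h \, dx .
% If in addition h(a) = 0 = h(b), the boundary term vanishes:
\int_a^b f \, h' \, dx = - \int_a^b f' \, h \, dx .
```

For $h \equiv 1$ the left-hand side vanishes and the formula reduces to $\int_a^b f' \, dx = f(b) - f(a)$.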

Sketch of proof. This follows from the fact that this formula holds on each subinterval of the partition of $[a, b]$.

Remark 3.20. If we plug in $h \equiv 1$ in Lemma 3.19, then this is just the Fundamental Theorem of Calculus for piecewise continuously differentiable functions. In fact, one can show that the Fundamental Theorem of Calculus is even true for so-called absolutely continuous functions (because $f$ then has a derivative $f'$ almost everywhere which is integrable).

We next state the main result of this section, which combines the two previous statements and which allows us to transform a weak statement about the variation into a strong one. As such it can be viewed as a regularization theorem.

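Lemma 3.21 (Fundamental Lemma of Calculus of Variations). If $f, g \in C^{pw}[a, b]$ satisfy

$$\int_a^b (f h + g h') \, dx = 0 \quad \text{for all } h \in C_0^{1,pw}[a, b], \tag{3.5}$$

then $g \in C^{1,pw}[a, b]$ and $g' = f$ piecewise on $[a, b]$.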

Proof. By the first part of the Fundamental Theorem of Calculus, the function

$$F(x) := \int_a^x f(s) \, ds$$

is well-defined, continuous and satisfies³

$$F'(x) = f(x) \quad \text{for } x \in [x_{i-1}, x_i].$$

³ At the boundary points we just take the one-sided derivatives.

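By assumption (3.5) and integration by parts (Lemma 3.19, using $h(a) = 0 = h(b)$) we thus have that

$$\int_a^b (g - F) \, h' \, dx = 0 \quad \text{for all } h \in C_0^{1,pw}[a, b].$$

Since we assumed that $g \in C^{pw}[a, b]$ we can employ Lemma 3.18 and thus obtain that there is a $c \in \mathbb{R}$ such that

$$g = F + c \quad \text{on } [a, b].$$

Hence $g \in C^{1,pw}[a, b]$ and $g' = F' = f$ piecewise on $[a, b]$.

3.4. The Euler–Lagrange equation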

In the previous sections we defined how to differentiate functionals J and learned how to handle weak derivatives. In the next step we essentially translate a weak formulation of setting the functional derivative to zero (in variational form, δJ = 0) into a strong formulation (via a differential equation). It is important to note that just like in Real Analysis we only obtain a necessary criterion for an extremum, not a sufficient one. Further conditions are required to prove that the result is really a minimizer/maximizer—and as in Real Analysis, convexity will be the key.

We first define what it means to be a minimizer for a functional

$$J(y) = \int_a^b F(x, y, y') \, dx, \tag{3.6}$$

which is defined on $D \subseteq C^{1,pw}[a, b]$. A maximizer is defined analogously.

Definition 3.23. A function $y \in D$ is called a local minimizer of a functional $J : X \supseteq D \to \mathbb{R}$ if

$$J(y) \le J(\tilde{y}) \quad \text{for all } \tilde{y} \in D \text{ with } \|y - \tilde{y}\|_{1,pw} < d$$

holds for some constant $d > 0$.

Remark 3.24. If in the previous definition we used $\|y - \tilde{y}\|_0 < d$ instead, then it would become a lot more difficult for $y$ to be a minimizer. We would call it a strong local minimizer in this case. Not every local minimizer is also a strong local minimizer (see the counterexample in [6, Ex. 1.4.7]). We will not use this term further.

For the rest of this section we assume that for y ∈ D the first variation δJ (y)h is defined for all h ∈ C₀^1,pw[a, b], in particular, y + th ∈ D for t ∈ (−ε, ε) sufficiently small (possibly depending on h). If D is only defined by boundary conditions, e.g.,

D = C^1,pw[a, b] ∩ {y(a) = A, y(b) = B},

then obviously y + th ∈ D for all t. We will mostly encounter domains D of this type.

Theorem 3.25. Suppose the Lagrangian $F : [a, b] \times \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ is continuous and continuously partially differentiable with respect to the last two variables. If $y \in D \subseteq C^{1,pw}[a, b]$ is a local minimizer of the functional (3.6), then

$$F_{y'}(\cdot, y, y') \in C^{1,pw}[a, b] \subset C[a, b] \tag{3.7}$$

and

$$\frac{d}{dx} F_{y'}(\cdot, y, y') = F_y(\cdot, y, y') \quad \text{piecewise on } [a, b]. \tag{3.8}$$

If $y \in C^1[a, b]$, then also $F_{y'}(\cdot, y, y') \in C^1[a, b]$ and (3.8) holds on all of $[a, b]$.

Proof. By assumption, for $y \in D$ we also have $y + th \in D$ for each $h \in C_0^{1,pw}[a, b]$ and $t \in (-\varepsilon, \varepsilon)$ sufficiently small. In particular, we can assume that $\|th\|_{1,pw} < d$, and hence $g(t) = J(y + th)$, with $J$ as in (3.6), has a local minimum at $t = 0$. By Proposition 3.12 the Gâteaux differential exists, so that $dJ(y, h) = g'(0) = 0$, and (3.1) implies

$$\delta J(y)h = \int_a^b \bigl( F_y(x, y, y') \, h + F_{y'}(x, y, y') \, h' \bigr) \, dx = 0 \tag{3.9}$$

for all $h \in C_0^{1,pw}[a, b]$. Since $F_y(\cdot, y, y'), F_{y'}(\cdot, y, y') \in C^{pw}[a, b]$ due to the assumptions on $y$ and $F$, the Fundamental Lemma of Calculus of Variations (Lemma 3.21) yields (3.7) and (3.8). Corollary 3.22 implies the last sentence in the statement.

Definition 3.26. The equation (3.8) is called the Euler–Lagrange equation. It is a (piecewise) differential equation which must be satisfied by local minimizers (maximizers) $y$.

More precisely, (3.8) is called the strong version of the Euler–Lagrange equation, while (3.9) is called the weak version.

Lemma 3.19 implies the following.

Theorem 3.27. The strong version (3.8) and the weak version (3.9) of the Euler–Lagrange equation are equivalent.

This result is no longer true for several independent variables (that is, when the Euler–Lagrange equations are partial differential equations).
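As a small worked illustration of how (3.8) is read off from a Lagrangian, consider the following example. The Lagrangian $F(x, y, y') = (y')^2 + y^2$ is our own hypothetical choice and does not appear in the text.

```latex
% Hypothetical Lagrangian: F(x, y, y') = (y')^2 + y^2
F_y(x, y, y') = 2y, \qquad F_{y'}(x, y, y') = 2y' .
% Strong Euler--Lagrange equation (3.8):
\frac{d}{dx}\bigl( 2y' \bigr) = 2y
\quad\Longleftrightarrow\quad
y'' = y ,
% with general solution
y(x) = c_1 e^{x} + c_2 e^{-x}, \qquad c_1, c_2 \in \mathbb{R} .
```

The two constants are then fixed by the boundary conditions $y(a) = A$, $y(b) = B$ that define $D$.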

3.5. Existence of a minimizer

Note that by Theorem 3.25 we only know that if $y$ is a local minimizer then it satisfies the Euler–Lagrange equation. The converse is not necessarily true⁴, as the following example shows.

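Example 3.28. Consider the functional

$$J(y) = \int_0^1 (y')^3 \, dx$$

on $D = C^{1,pw}[0, 1] \cap \{y(0) = 0, y(1) = 0\}$. By Theorem 3.25 an admissible function $y \in D$ solves the Euler–Lagrange equation if $F_{y'}(\cdot, y, y') = 3(y')^2 \in C^{1,pw}[0, 1] \subset C[0, 1]$, and

$$\frac{d}{dx} 3 (y')^2 = 0 \quad \text{piecewise on } [0, 1].$$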

⁴ This is also the case in Real Analysis. For instance, $f(x) = x^3$ has no local extremum at $x = 0$ although $f'(0) = 0$.


Hence for some constant $c \ge 0$

$$(y')^2 = c \quad \text{and thus} \quad y' = \pm\sqrt{c} \quad \text{on } [0, 1]. \tag{3.10}$$

In addition, $y$ must satisfy the boundary conditions $y(0) = 0 = y(1)$. There exist infinitely many solutions to this problem. One particular solution is $y \equiv 0$. Generally, the solutions to (3.10) are piecewise straight lines with slopes $\pm\sqrt{c}$. None of those solutions, however, is a local minimizer (or maximizer) for $J$ (see Assignment 4 for the justification).

Figure 3.2. Some solutions to the optimization problem in Example 3.28 with different $c \ne 0$.

In what follows we want to derive a sufficient criterion for the existence of an extremum based on a convexity assumption (and, of course, on having a suitable candidate for an extremum from solving the Euler–Lagrange equation). If the second variation exists for a local minimizer, then one can also use its sign instead (again, just like in Real Analysis).

Figure 3.3. Visualization of the convexity condition (3.11) in Definition 3.29.


Definition 3.29. A continuous Lagrangian $F : [a, b] \times \mathbb{R} \times \mathbb{R} \to \mathbb{R}$, which is continuously differentiable with respect to the last two variables, is called convex with respect to the last two variables if

$$F(x, \tilde{y}, \tilde{y}') \ge F(x, y, y') + F_y(x, y, y')(\tilde{y} - y) + F_{y'}(x, y, y')(\tilde{y}' - y') \tag{3.11}$$

holds for all $(x, y, y'), (x, \tilde{y}, \tilde{y}') \in [a, b] \times \mathbb{R} \times \mathbb{R}$.

Geometrically, condition (3.11) means that the graph of F (x, ., .) lies above each tangent plane which is spanned by the tangents to the graphs of F (x, y, .) and F (x, ., y⁰).

Theorem 3.30. Let a functional

$$J(y) = \int_a^b F(x, y, y') \, dx$$

be defined on D = C^1,pw[a, b] ∩ {y(a) = A, y(b) = B} for a Lagrangian F : [a, b] × R × R → R that is continuous, and continuously differentiable and convex with respect to the last two variables. Then every solution y ∈ D of the Euler–Lagrange equation is a global minimizer for J .

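Proof. Suppose $y \in D$ is a solution to the Euler–Lagrange equation. For any other $\tilde{y} \in D$ the difference $h := \tilde{y} - y \in C_0^{1,pw}[a, b]$, since $y$ and $\tilde{y}$ satisfy the same boundary conditions. Due to the convexity assumption (3.11) and the (weak) Euler–Lagrange equation (3.9) we can estimate

$$J(\tilde{y}) - J(y) = \int_a^b \bigl( F(x, \tilde{y}, \tilde{y}') - F(x, y, y') \bigr) \, dx \ge \int_a^b \bigl( F_y(x, y, y') \, h + F_{y'}(x, y, y') \, h' \bigr) \, dx = 0.$$

Hence every $y \in D$ that satisfies the weak or strong Euler–Lagrange equation is a global minimizer.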

Note that if $F$ is concave (or equivalently, $-F$ is convex), then a solution $y$ of the Euler–Lagrange equation is a global maximizer for $J$ (because it is a minimizer for $-J$).
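The convexity condition (3.11) can be spot-checked on sample points before one tries to prove it. The sketch below uses our own hypothetical Lagrangian $F(x, y, p) = y^2 + p^2$ (with $p$ standing for $y'$) and a small grid of sample values; passing such a check is of course only a sanity test, not a proof of convexity.

```python
import itertools

# Hypothetical convex Lagrangian F(x, y, p) = y^2 + p^2 and its partial derivatives.
F = lambda x, y, p: y * y + p * p
F_y = lambda x, y, p: 2.0 * y
F_p = lambda x, y, p: 2.0 * p

def convexity_gap(x, y, p, y_t, p_t):
    """Left-hand side minus right-hand side of (3.11) at the given points."""
    tangent = F(x, y, p) + F_y(x, y, p) * (y_t - y) + F_p(x, y, p) * (p_t - p)
    return F(x, y_t, p_t) - tangent

grid = [k / 2.0 for k in range(-4, 5)]  # sample values -2, -1.5, ..., 2
min_gap = min(convexity_gap(0.0, y, p, y_t, p_t)
              for y, p, y_t, p_t in itertools.product(grid, repeat=4))
print(min_gap)  # a negative value would disprove convexity
```

For this particular $F$ the gap equals $(\tilde{y} - y)^2 + (\tilde{y}' - y')^2$, which explains why no sample point violates (3.11).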

Remark 3.31. Suppose the second variation of $J$ exists for a $y \in D$, i.e., for a direction $h$ we have

$$\delta^2 J(y)(h, h) = \int_a^b \bigl( F_{yy} h^2 + 2 F_{yy'} h h' + F_{y'y'} (h')^2 \bigr) \, dx. \tag{3.12}$$

This is the case if $F$ is twice continuously differentiable with respect to the last two variables (we will not prove this formula, but you can find some explanation on how to do it in [6, Ex. 1.2.2]). Then one can show (similarly to how it is done in Real Analysis) that for a local minimizer $y \in D$ we must have that

$$\delta^2 J(y)(h, h) \ge 0 \quad \text{for all } h \in C_0^{1,pw}[a, b].$$

One can show that if, in addition, $F_{yy'}(\cdot, y, y') \in C^{1,pw}[a, b]$, then a necessary criterion for a minimizer is (Legendre's condition)

$$F_{y'y'}(x, y(x), y'(x)) \ge 0 \quad \text{for all } x \in [a, b].$$


This criterion is easy to prove, hence useful.

Moreover, one can even deduce a sufficient criterion (here presented without proof).

Theorem 3.32. Suppose that $J$ and $F$ are defined as in Theorem 3.30. In addition, we assume that $F$ is twice continuously differentiable with respect to the last two variables. Then the second variation $\delta^2 J(y)(h, h)$ of $J$ in $y \in D \subseteq C^{1,pw}[a, b]$ in direction $h \in C_0^{1,pw}[a, b]$ exists. Moreover, if

$$\delta J(y)h = 0 \quad \text{and} \quad \delta^2 J(\tilde{y})(h, h) \ge 0 \quad \text{for all } \tilde{y} \in D \text{ and for all } h \in C_0^{1,pw}[a, b],$$

then $y$ is a global minimizer for $J$ on $D$.

3.6. Examples

Example 3.33. Consider

$$J(y) = \int_{-1}^{1} y^2 (2x - y')^2 \, dx$$

on the domain $D = C^1[-1, 1] \cap \{y(-1) = 0, y(1) = 1\}$. Note that $J(y) \ge 0$ for all $y \in D$. We try to guess a possible minimizer, which requires that either $y = 0$ or $y' = 2x$. The function

$$y(x) = \begin{cases} 0 & \text{if } x \in [-1, 0], \\ x^2 & \text{if } x \in [0, 1], \end{cases}$$

is in $D$ and satisfies $J(y) = \int_{-1}^{0} 0 \, dx + \int_0^1 x^4 (2x - 2x)^2 \, dx = 0$. Hence it is a global minimizer.

Figure 3.4. A global minimizer for Example 3.33.

Note that $y$ is not twice differentiable at $x = 0$, so $y \notin C^2[-1, 1]$; however, $y$ satisfies the Euler–Lagrange equation on all of $[-1, 1]$ since

$$F_y(x, y(x), y'(x)) = 2 y(x) (2x - y'(x))^2, \qquad F_{y'}(x, y(x), y'(x)) = -2 y(x)^2 (2x - y'(x)),$$

and $F_y \equiv 0 \equiv F_{y'}$ on $[-1, 1]$ for the minimizer. (Note that we did not prove that this minimizer is unique.)
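The claim $J(y) = 0$ for the guessed minimizer is easy to confirm numerically. The sketch below is an illustration only: the midpoint rule and the comparison function $\hat{y}(x) = (x + 1)/2$, which also lies in $D$, are our own choices.

```python
def integrate(f, a, b, n=2000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    w = (b - a) / n
    return w * sum(f(a + (k + 0.5) * w) for k in range(n))

def y(x):
    """Candidate minimizer: 0 on [-1, 0] and x^2 on [0, 1]."""
    return 0.0 if x <= 0.0 else x * x

def dy(x):
    """Its derivative, which is continuous at x = 0."""
    return 0.0 if x <= 0.0 else 2.0 * x

def J(y, dy):
    """J(y) = integral of y^2 (2x - y')^2 over [-1, 1]."""
    return integrate(lambda x: y(x) ** 2 * (2.0 * x - dy(x)) ** 2, -1.0, 1.0)

J_min = J(y, dy)
J_line = J(lambda x: (x + 1.0) / 2.0, lambda x: 0.5)  # another element of D
print(J_min, J_line)
```

The integrand vanishes identically for the candidate, whereas the straight line through $(-1, 0)$ and $(1, 1)$ gives a strictly positive value.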

Example 3.34. Consider the Dirichlet integral

$$J(y) = \int_a^b (y')^2 \, dx$$

on $D = C^{1,pw}[a, b] \cap \{y(a) = A, y(b) = B\}$. The Euler–Lagrange equation reads

$$\frac{d}{dx} 2y' = 2y'' = 0 \quad \text{piecewise on } [a, b],$$

i.e., solutions are of the form

$$y(x) = c_1^i x + c_2^i \quad \text{for } x \in [x_{i-1}, x_i],\ i = 1, \dots, m.$$

It seems like $y$ is defined piecewise; however, by (3.7) we know that $2y' = F_{y'}(\cdot, y, y') \in C^{1,pw}[a, b] \subset C[a, b]$. Hence $c_1^i = c_1$ for all $i = 1, \dots, m$, and since $y \in C[a, b]$ we must also have that $c_2^i = c_2$. Due to the boundary conditions we obtain that

$$y(x) = c_1 x + c_2 \quad \text{with} \quad c_1 = \frac{B - A}{b - a} \quad \text{and} \quad c_2 = \frac{bA - aB}{b - a}.$$

Figure 3.5. Solution of the Euler–Lagrange equation in Example 3.34.

Recall, however, that a solution to the Euler–Lagrange equation need not be an extremum.

In order to find minimizers we can employ Theorem 3.30 or, for sufficient regularity, the second variation. By formula (3.12) we have

$$\delta^2 J(\tilde{y})(h, h) = \int_a^b 2 (h')^2 \, dx \ge 0$$

for all $\tilde{y} \in D$ and all $h \in C_0^{1,pw}[a, b]$. Hence by Theorem 3.32 the straight line through $(a, A)$ and $(b, B)$ is a minimizer. One can also argue more directly that $J(\tilde{y}) \ge J(y)$ (see Assignment 4).
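The minimality of the straight line can be observed numerically by perturbing it in an admissible direction. In the sketch below the data $a = 0$, $b = 2$, $A = 1$, $B = 5$, the direction $h(x) = \sin(\pi (x - a)/(b - a))$ and the midpoint rule are illustrative assumptions of ours.

```python
import math

def integrate(f, a, b, n=2000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    w = (b - a) / n
    return w * sum(f(a + (k + 0.5) * w) for k in range(n))

a, b, A, B = 0.0, 2.0, 1.0, 5.0       # hypothetical boundary data
c1 = (B - A) / (b - a)
c2 = (b * A - a * B) / (b - a)
line = lambda x: c1 * x + c2          # solution of the Euler-Lagrange equation

def J(dy):
    """Dirichlet integral of a function with derivative dy."""
    return integrate(lambda x: dy(x) ** 2, a, b)

J_line = J(lambda x: c1)

# Admissible direction h(x) = sin(pi (x - a) / (b - a)) with h(a) = h(b) = 0.
dh = lambda x: math.pi / (b - a) * math.cos(math.pi * (x - a) / (b - a))
excess = [J(lambda x, t=t: c1 + t * dh(x)) - J_line for t in (-1.0, -0.5, 0.5, 1.0)]
print(line(a), line(b), J_line, excess)
```

Each entry of `excess` approximates $t^2 \int_a^b (h')^2 \, dx > 0$, in line with the sign of the second variation computed above.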

Example 3.35 (Counterexample of Weierstraß). Define

$$J(y) = \int_{-1}^{1} x^2 (y')^2 \, dx$$

on the domain $D = C^1[-1, 1] \cap \{y(-1) = -1, y(1) = 1\}$. Clearly $J(y) \ge 0$ for all $y \in D$. One can show that the sequence $(y_n)_{n \in \mathbb{N}} \subset D$, defined by $y_n(x) = \frac{\arctan nx}{\arctan n}$, satisfies⁵

$$J(y_n) = \int_{-1}^{1} \frac{n^2 x^2}{\arctan^2 n \, (1 + n^2 x^2)^2} \, dx < \frac{1}{\arctan^2 n} \int_{-1}^{1} \frac{dx}{1 + n^2 x^2} = \frac{2}{n \arctan n},$$

⁵ Recall that $\arctan' x = \frac{1}{1 + x^2}$ and $\arctan x = -\arctan(-x)$.


Figure 3.6. A potential minimizer (but not in D) for Weierstraß counterexample 3.35.

and thus $\lim_{n \to \infty} J(y_n) = 0$. Therefore, indeed,

$$\inf_{y \in D} J(y) = 0.$$
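The decay of $J(y_n)$ and the upper bound $2/(n \arctan n)$ can be observed numerically. The following sketch uses a midpoint rule with a fine grid, chosen by us so that the peak of the integrand near $x = \pm 1/n$ is resolved; the selected values of $n$ are arbitrary.

```python
import math

def integrate(f, a, b, n=200_000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    w = (b - a) / n
    return w * sum(f(a + (k + 0.5) * w) for k in range(n))

def J_of_yn(n):
    """J(y_n) for y_n(x) = arctan(n x) / arctan(n), with y_n' in closed form."""
    dy = lambda x: n / (math.atan(n) * (1.0 + (n * x) ** 2))
    return integrate(lambda x: (x * dy(x)) ** 2, -1.0, 1.0)

ns = (1, 10, 100, 1000)
values = [J_of_yn(n) for n in ns]
bounds = [2.0 / (n * math.atan(n)) for n in ns]
for n, v, bnd in zip(ns, values, bounds):
    print(n, v, bnd)
```

The computed values stay below the bound and decrease towards 0 as $n$ grows, while every $y_n$ satisfies the boundary conditions of $D$.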

Note, however, that no minimum exists in $D$. Such a function would have to satisfy $x y' = 0$ on all of $[-1, 1]$, which due to the boundary conditions implies that we must have

$$y(x) = \begin{cases} -1 & x \in [-1, 0), \\ 1 & x \in (0, 1]. \end{cases}$$

However, this function satisfies $y \notin D$. In Assignment 4 we will also see that none of the solutions of the Euler–Lagrange equation are in $D$.

Historically, this is a very important example. Weierstraß used this Lagrangian to disprove Dirichlet, who claimed that functionals which are bounded below always admit a minimizer.

3.7. Special cases of Lagrangians

Special cases occur when the Lagrangian $F(x, y, y')$ does not depend on all three variables $x, y, y'$. Let us consider these simplifications separately, as they occur frequently (see also [6, p. 21ff]).

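3.7.1. F(y, y′). If $J(y) = \int_a^b F(y, y') \, dx$ and $y \in C^{1,pw}[a, b] \cap C^2(a, b)$, then the Euler–Lagrange equation

$$\frac{d}{dx} F_{y'}(y, y') = F_y(y, y') \quad \text{on } [a, b]$$

implies that

$$\begin{aligned} \frac{d}{dx} \bigl( F(y, y') - y' F_{y'}(y, y') \bigr) &= F_y(y, y') \, y' + F_{y'}(y, y') \, y'' - y'' F_{y'}(y, y') - y' \frac{d}{dx} F_{y'}(y, y') \\ &= \Bigl( F_y(y, y') - \frac{d}{dx} F_{y'}(y, y') \Bigr) y'. \end{aligned}$$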


This implies that every solution of the Euler–Lagrange equation, as well as every constant function, solves the first-order differential equation

$$F(y, y') - y' F_{y'}(y, y') = c_1 \quad \text{on } [a, b], \tag{3.13}$$

for some constant $c_1 \in \mathbb{R}$. If this equation can be solved for $y'$, one obtains $y' = f(y, c_1)$, which is a differential equation that can potentially be solved by separation of variables. See also Section 4.1 below.
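The first integral (3.13) can be checked on a concrete case. In the sketch below the Lagrangian $F(y, p) = y^2 + p^2$ (with $p$ standing for $y'$) is our own hypothetical example; its Euler–Lagrange equation is $y'' = y$, which is solved by $y(x) = \cosh x$, and the sample points are arbitrary.

```python
import math

# Hypothetical Lagrangian without explicit x-dependence: F(y, p) = y^2 + p^2.
F = lambda y, p: y * y + p * p
F_p = lambda y, p: 2.0 * p

# y(x) = cosh(x) solves y'' = y, the Euler-Lagrange equation for this F.
y, dy = math.cosh, math.sinh

xs = (0.0, 0.25, 0.5, 0.75, 1.0)
first_integral = [F(y(x), dy(x)) - dy(x) * F_p(y(x), dy(x)) for x in xs]
print(first_integral)  # the same constant c_1 at every sample point
```

Here $F - y' F_{y'} = y^2 - (y')^2 = \cosh^2 x - \sinh^2 x$, so the constant is $c_1 = 1$ up to rounding.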

3.7.2. F(x, y′). If the Lagrangian is of the form $F(x, y')$, then the Euler–Lagrange equation yields $F_{y'}(x, y') = c_1$, which it may be possible to solve for $y'(x) = f(x, c_1)$ and then integrate to obtain

$$y(x) = \int f(x, c_1) \, dx + c_2.$$
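A short worked case shows the two steps (solve for $y'$, then integrate). The Lagrangian $F(x, y') = x (y')^2$ on an interval $[a, b]$ with $a > 0$ is our own hypothetical example and is not taken from the text.

```latex
% Hypothetical Lagrangian: F(x, y') = x (y')^2 on [a, b] with a > 0
F_{y'}(x, y') = 2 x \, y' = c_1
\quad\Longrightarrow\quad
y'(x) = f(x, c_1) = \frac{c_1}{2x},
% integrating once more:
y(x) = \int \frac{c_1}{2x} \, dx + c_2 = \frac{c_1}{2} \ln x + c_2 .
```

The constants $c_1$ and $c_2$ are then determined by the boundary conditions $y(a) = A$ and $y(b) = B$.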

3.7.3. F(x, y). In case the Lagrangian F (x, y) does not depend on y⁰ the Euler–Lagrange equation Fy(x, y(x)) = 0 is not a differential equation.
