Particles and Fields
5.4 Einstein Sum Convention
If necessity is the mother of invention, laziness is the father. The Einstein summation convention is an offspring of this happy marriage. We introduced it in Section 4.4.2, and now we explore its use a little further.
Whenever you see the same index both downstairs and upstairs in a single term, you automatically sum over that index. Summation is implied, and you don’t need a summation symbol. For example, the term
$$A^\mu A_\mu \tag{5.19}$$
means
$$A^0 A_0 + A^1 A_1 + A^2 A_2 + A^3 A_3$$
because the same index μ appears both upstairs and downstairs in the same term. On the other hand, the term
$$A^\nu A_\mu$$
does not imply summation, because the upstairs and downstairs indices are not the same. Likewise,
$$A_\nu A_\nu$$
does not imply summation even though the index ν is repeated, because both indices are downstairs.
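Here is a minimal Python sketch of the rule. The component values and the helper name `contract` are invented for illustration, and the lower-index components are simply taken as given rather than derived.

```python
# Hypothetical components of one 4-vector, indexed 0..3 (values are made up).
A_upper = [2, 1, 3, 5]    # A^mu
A_lower = [-2, 1, 3, 5]   # A_mu, taken as given for this sketch

def contract(upper, lower):
    """Sum over an index that appears once upstairs and once downstairs."""
    return sum(upper[mu] * lower[mu] for mu in range(4))

# A^mu A_mu: same letter, one up and one down, so the sum is implied.
summed = contract(A_upper, A_lower)

# A^nu A_mu: different letters, so nothing is summed.
# What you get is a 4 x 4 collection of separate products.
products = [[A_upper[nu] * A_lower[mu] for mu in range(4)] for nu in range(4)]

print(summed)  # 31
```

The first expression collapses to a single number, while the second keeps both indices free and therefore keeps all sixteen products.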
You may recall that some of the equations in Section 3.4.3 used the symbol to signify the sum of squares of space components. By using upstairs and downstairs indices along with the summation convention, we could have written
which is more elegant and precise.
The operation of Expression 5.19 has the effect of changing the sign of the time component. I should warn you that some authors follow the convention (+1, −1, −1, −1) for the placement of these minus signs. I prefer the convention (−1, 1, 1, 1), typically used by those who study general relativity.
An index that triggers the summation convention, like ν in the following example, doesn’t have a specific value. It’s called a summation index or a dummy index; it’s a thing you sum over. By contrast, an index that is not summed over is called a free index. The expression
$$A_\mu = \eta_{\mu\nu} A^\nu \tag{5.20}$$
depends on μ (which is a free index), but it doesn’t depend on the summation index ν. If we replace ν with any other Greek letter, the expression would have exactly the same meaning.
I should also mention that the terms upstairs index and downstairs index have formal names. An upper index is called contravariant, and a lower index is called covariant. I often use the simpler words upper and lower, but you should learn the formal terms as well. We can have A with an upper (contravariant) index, or A with a lower (covariant) index, and we use the matrix η to convert one to the other. Converting one kind of index to the other kind is called raising the index or lowering the index, depending on which way we go.
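Lowering an index with η can be sketched in Python as follows. The names `ETA`, `lower_index`, and `lower_index_renamed` are hypothetical, the component values are made up, and the metric uses the (−1, 1, 1, 1) convention preferred in the text.

```python
# Metric eta with the (-1, 1, 1, 1) sign convention.
ETA = [
    [-1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]

def lower_index(a_upper):
    """Return the components eta_{mu nu} A^nu: mu is free, nu is summed."""
    return [sum(ETA[mu][nu] * a_upper[nu] for nu in range(4)) for mu in range(4)]

def lower_index_renamed(a_upper):
    """Same operation with the dummy index called sigma instead of nu."""
    return [sum(ETA[mu][sigma] * a_upper[sigma] for sigma in range(4))
            for mu in range(4)]

A_upper = [2, 1, 3, 5]           # illustrative values for A^mu
A_lower = lower_index(A_upper)   # one component per value of the free index

print(A_lower)  # [-2, 1, 3, 5]
```

The result is a list with one entry per value of the free index, and renaming the loop variable that plays the role of the dummy index changes nothing.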
Exercise 5.1: Show that $A^\nu A_\nu$ has the same meaning as $A^\mu A_\mu$.
Exercise 5.2: Write an expression that undoes the effect of Eq. 5.20. In other words, how do we “go backwards”?
Let’s have another look at Expression 5.19, $A^\mu A_\mu$.
This expression is summed over because it contains a repeated index, one upper and one lower. Previously, we expanded it using the indices 0 through 3.
We can write the same expression using the labels t, x, y, and z:
$$A^\mu A_\mu = A^t A_t + A^x A_x + A^y A_y + A^z A_z.$$
For the three space components, the covariant and contravariant versions are exactly the same. The first space component is just $(A^x)^2$, and it doesn’t matter whether you put the index upstairs or downstairs. The same is true for the y and z components. But the time component becomes $-(A^t)^2$,
$$A^\mu A_\mu = -(A^t)^2 + (A^x)^2 + (A^y)^2 + (A^z)^2.$$
The time component has a minus sign because the operation of lowering or raising that index changes its sign. The contravariant and covariant time components have opposite signs, and $A^t$ times $A_t$ is $-(A^t)^2$. On the other hand, the contravariant and covariant space components have the same signs.
The quantity $A^\mu A_\mu$ is exactly what we think of as a scalar. It’s the sum of the squares of the space components minus the square of the time component. If $A^\mu$ happens to be a displacement such as $X^\mu$, then it’s the same as the quantity $\tau^2$, except with an overall minus sign; in other words, it’s $-\tau^2$. But whatever sign it has, this sum is clearly a scalar.
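As a quick numerical illustration, here is a sketch that contracts a displacement with itself. The displacement values and the helper name `contract_with_self` are made up, and the units are chosen so that c = 1.

```python
import math

def contract_with_self(a_upper):
    """A^mu A_mu = -(A^t)^2 + (A^x)^2 + (A^y)^2 + (A^z)^2 in the (-1, 1, 1, 1) convention."""
    t, x, y, z = a_upper
    return -t**2 + x**2 + y**2 + z**2

# Hypothetical displacement X^mu = (t, x, y, z), in units with c = 1.
X = [5.0, 3.0, 0.0, 0.0]

s = contract_with_self(X)   # equals -tau^2
tau = math.sqrt(-s)         # real here because this displacement is timelike

print(s, tau)  # -16.0 4.0
```

For this particular displacement the contraction is −16, so the proper time is 4, in agreement with the statement that the contraction equals −τ².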
This process is called contracting the indices, and it’s very general. As long as $A^\mu$ is a 4-vector, the quantity $A^\mu A_\mu$ is a scalar. We can take any 4-vector at all and make a scalar by contracting its indices. We can also write $A^\mu A_\mu$ a little differently by referring to Eq. 5.20 and replacing $A_\mu$ with $\eta_{\mu\nu} A^\nu$. In other words, we can write
$$A^\mu A_\mu = \eta_{\mu\nu} A^\mu A^\nu. \tag{5.21}$$
On the right side, we use the metric η and sum over μ and ν. Both sides of Eq. 5.21 represent the same scalar.
Now let’s look at an example involving two different 4-vectors, A and B. Consider the expression
$$A^\mu B_\mu.$$
Is this a scalar? It certainly looks like one. It has no indices because all the indices have been summed over.
To prove that it’s a scalar, we’ll need to rely on the fact that the sums and differences of scalars are also scalars. If we have two scalar quantities, then by definition you and I will agree about their values even though our reference frames are different. But if we agree about their values, we must also agree about the value of their sum and the value of their difference. Therefore, the sum of two scalars is a scalar, and the difference of two scalars is also a scalar. If we keep this in mind, the proof is easy. Just start with two 4-vectors $A^\mu$ and $B^\mu$ and write the expression
$$(A + B)^\mu (A + B)_\mu.$$
This expression must be a scalar. Why is that? Because both $A^\mu$ and $B^\mu$ are 4-vectors, their sum $(A + B)^\mu$ is also a 4-vector. If you contract any 4-vector with itself, the result is a scalar. Now, let’s modify this expression by subtracting $(A - B)^\mu (A - B)_\mu$. This becomes
$$(A + B)^\mu (A + B)_\mu - (A - B)^\mu (A - B)_\mu. \tag{5.22}$$
This modified expression is still a scalar because it’s the difference of two scalars. If we expand the expression, we find that the $A^\mu A_\mu$ terms cancel, and so do the $B^\mu B_\mu$ terms. The only remaining terms are $A^\mu B_\mu$ and $A_\mu B^\mu$, and the result is
$$2\left(A^\mu B_\mu + A_\mu B^\mu\right).$$
I’ll leave it as an exercise to prove that
$$A^\mu B_\mu = A_\mu B^\mu.$$
It doesn’t matter if you put the ups down and the downs up; the result is the same. Therefore, the expression evaluates to
$$4 A^\mu B_\mu.$$
Because we know that the original Expression 5.22 is a scalar, the result $A^\mu B_\mu$ must also be a scalar.
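The algebra in this argument can be spot-checked numerically. The sketch below uses made-up components and a hypothetical helper `dot`. It is only a check for one pair of vectors, not a substitute for the proof.

```python
def dot(a_upper, b_upper):
    """A^mu B_mu in the (-1, 1, 1, 1) convention, from upper-index components."""
    signs = [-1, 1, 1, 1]
    return sum(signs[mu] * a_upper[mu] * b_upper[mu] for mu in range(4))

# Illustrative components for A^mu and B^mu.
A = [2, 1, 3, 5]
B = [1, 4, 0, 2]

plus = [a + b for a, b in zip(A, B)]    # (A + B)^mu
minus = [a - b for a, b in zip(A, B)]   # (A - B)^mu

# Difference of two self-contractions, each of which is a scalar.
difference = dot(plus, plus) - dot(minus, minus)

print(difference, 4 * dot(A, B))  # 48 48
```

The difference of the two self-contractions comes out to four times the mixed contraction, as the expansion predicts.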
You may have noticed that the expression $A^\mu B_\mu$ looks a lot like the ordinary dot product of two space vectors. You can think of $A^\mu B_\mu$ as the Lorentz or Minkowski version of the dot product. The only real difference is the change of sign for the time component, facilitated by the metric η.
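To see the scalar property at work, here is a sketch that computes the Minkowski dot product in two frames. It assumes the standard Lorentz boost along the x axis in units with c = 1; the names `minkowski_dot` and `boost_x`, the velocity 0.6, and the component values are all illustrative choices.

```python
import math

# Metric eta with the (-1, 1, 1, 1) sign convention.
ETA = [
    [-1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]

def minkowski_dot(a_upper, b_upper):
    """eta_{mu nu} A^mu B^nu, summed over both mu and nu."""
    return sum(ETA[mu][nu] * a_upper[mu] * b_upper[nu]
               for mu in range(4) for nu in range(4))

def boost_x(a_upper, v):
    """Components of the same 4-vector in a frame moving with velocity v along x (c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    t, x, y, z = a_upper
    return [gamma * (t - v * x), gamma * (x - v * t), y, z]

A = [2.0, 1.0, 3.0, 5.0]   # illustrative components
B = [1.0, 4.0, 0.0, 2.0]

before = minkowski_dot(A, B)
after = minkowski_dot(boost_x(A, 0.6), boost_x(B, 0.6))

# The ordinary four-term sum without the metric, for comparison.
euclid = sum(a * b for a, b in zip(A, B))

print(before, after, euclid)
```

The two Minkowski products agree up to floating-point rounding, while the plain four-term sum differs from them by the sign of the time term.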