6.1.1 4-Vector Summary
6.1.2 Forming Scalars
For any two 4-vectors A and B, we can form a product $A_\nu B^\nu$ using the upstairs components of one and the downstairs components of the other. The result is a scalar, which means it has the same value in every reference frame. In symbols,
$$A_\nu B^\nu = \text{scalar}.$$
The repeated index ν indicates a summation over four values. The long form of this expression is
$$A_\nu B^\nu = A_0 B^0 + A_1 B^1 + A_2 B^2 + A_3 B^3.$$
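As a quick numerical sanity check, here is a minimal sketch of the contraction in Python. It assumes units with c = 1, a boost along the x axis, and the rule that lowering an index flips the sign of the time component. The helper names (`lower`, `contract`, `boost_x`) and the sample vectors are made up for illustration.

```python
import math

def lower(a_up):
    """Covariant components, assuming lowering flips the sign of the time component."""
    return [-a_up[0], a_up[1], a_up[2], a_up[3]]

def contract(a_up, b_up):
    """A_nu B^nu, summed over nu = 0, 1, 2, 3."""
    a_down = lower(a_up)
    return sum(a_down[nu] * b_up[nu] for nu in range(4))

def boost_x(a_up, v):
    """Components of a contravariant 4-vector after a boost along x (c = 1)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [g * (a_up[0] - v * a_up[1]), g * (a_up[1] - v * a_up[0]), a_up[2], a_up[3]]

# Two arbitrary 4-vectors, given by their contravariant components.
A = [2.0, 1.0, 0.5, -1.0]
B = [3.0, -2.0, 1.0, 4.0]

s = contract(A, B)                                    # value in the original frame
s_boosted = contract(boost_x(A, 0.6), boost_x(B, 0.6))  # value in the boosted frame
print(s, s_boosted)
```

The two printed numbers agree up to rounding, which is what "scalar" means here: the individual components change from frame to frame, but the contracted sum does not.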
6.1.3 Derivatives
Coordinates and their displacements are the prototypes for contravariant (upstairs) components. In the same way, derivatives are the prototype for covariant (downstairs) components. The symbol $\partial_\mu$ stands for
$$\partial_\mu = \frac{\partial}{\partial X^\mu}.$$
In Lecture 5, we explained why these four derivatives are the covariant components of a 4-vector. We can also write them in contravariant form. To summarize:
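Covariant Components:
$$\partial_\mu = \left(\frac{\partial}{\partial t},\ \frac{\partial}{\partial x},\ \frac{\partial}{\partial y},\ \frac{\partial}{\partial z}\right)$$
Contravariant Components:
$$\partial^\mu = \left(-\frac{\partial}{\partial t},\ \frac{\partial}{\partial x},\ \frac{\partial}{\partial y},\ \frac{\partial}{\partial z}\right)$$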
As usual, the only difference between them is the sign of the time component.
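The symbol $\partial_\mu$ doesn’t mean much all by itself; it has to act on some object. When it does, it adds a new index μ to that object. For example, if $\partial_\mu$ operates on a scalar, it creates a new object with covariant index μ. Taking a scalar field ϕ as a concrete example, we could write
$$\partial_\mu \phi = \frac{\partial \phi}{\partial X^\mu}.$$
The right side is a collection of derivatives that forms the covariant vector
$$\left(\frac{\partial \phi}{\partial t},\ \frac{\partial \phi}{\partial x},\ \frac{\partial \phi}{\partial y},\ \frac{\partial \phi}{\partial z}\right).$$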
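The symbol $\partial_\mu$ also provides a new way to construct a scalar from a vector. Suppose we have a 4-vector $B^\mu(t, x)$ that depends on time and position. In other words, B is a 4-vector field. If B is differentiable, it makes sense to consider the quantity
$$\partial_\mu B^\mu(t, x).$$
Under the summation convention, this expression tells us to differentiate each component of $B^\mu$ with respect to the corresponding spacetime coordinate and add up the results:
$$\partial_\mu B^\mu = \frac{\partial B^0}{\partial t} + \frac{\partial B^1}{\partial x} + \frac{\partial B^2}{\partial y} + \frac{\partial B^3}{\partial z}.$$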
The result is a scalar.
The summing process we’ve illustrated here is very general; it’s called index contraction. Index contraction means identifying an upper index with an identical lower index within a single term, and then summing.
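To make the contraction of a derivative index with a vector index concrete, here is a small sketch that approximates the four-term sum by central differences. The field `B` is a made-up example chosen so the exact answer is easy to work out by hand, and `four_divergence` is an illustrative helper, not anything defined in the lectures.

```python
def B(t, x, y, z):
    """A hypothetical 4-vector field B^mu(t, x, y, z), chosen only for illustration."""
    return (t * x, x * x, y * z, z)

def four_divergence(field, point, h=1e-5):
    """Approximate the sum over mu of dB^mu/dX^mu using central differences."""
    total = 0.0
    for mu in range(4):
        fwd = list(point)
        bwd = list(point)
        fwd[mu] += h
        bwd[mu] -= h
        # The upper index mu on B is matched with the lower index mu on the derivative.
        total += (field(*fwd)[mu] - field(*bwd)[mu]) / (2 * h)
    return total

point = (1.0, 2.0, 3.0, 4.0)            # (t, x, y, z)
div = four_divergence(B, point)
exact = 3 * point[1] + point[3] + 1     # by hand: x + 2x + z + 1
print(div, exact)
```

Each pass through the loop picks one value of μ, takes component μ of the field, and differentiates it with respect to coordinate μ. Adding the four results is the contraction.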
6.1.4 General Lorentz Transformation
Back in Lecture 1, we introduced the general Lorentz transformation. Here, we return to that idea and add some details.
Lorentz transformations make just as much sense along the y axis or the z axis as they do along the x axis. There’s certainly nothing special about the x direction or any other direction. In Lecture 1, we explained that there’s another class of transformations—rotations of space—that are also considered members of the family of Lorentz transformations. Rotations of space do not affect time components in any way.
Once you accept this broader definition of Lorentz invariance, you can say that a Lorentz transformation along the y axis is simply a rotation of the Lorentz transformation along the x axis. You can combine rotations together with the “normal” Lorentz transformations to make a Lorentz transformation in any direction or a rotation about any axis. This is the general set of transformations under which physics is invariant. The proof of this result is not important to us right now. What is important is that physics is invariant not only under simple Lorentz transformations but also under a broader category of transformations that includes rotations of space.
How can we fold Lorentz transformations into our index-based notation scheme? Let’s consider the transformation of a contravariant vector $A^\mu$. By definition, this vector transforms in the same way as the contravariant displacement vector $X^\mu$. For example, the transformation equation for the time component $A^0$ is (in units where c = 1)
$$(A')^0 = \frac{A^0 - vA^1}{\sqrt{1 - v^2}}.$$
This is the familiar Lorentz transformation along the x axis except that I’ve called the time component $A^0$ and I’ve called the x component $A^1$. We can always write these transformations in the form of a matrix acting on the components of a vector. For example, I can write $(A')^\mu$ to represent the components of the 4-vector $A^\mu$ in my frame of reference. To express these components as functions of the components in your frame, we’ll define a matrix $L^\mu{}_\nu$ with upper index μ and lower index ν,
$$(A')^\mu = L^\mu{}_\nu A^\nu. \qquad (6.3)$$
$L^\mu{}_\nu$ is a matrix because it has two indices; it’s a 4 × 4 matrix that multiplies the 4-vector $A^\nu$.
Let’s make sure that Eq. 6.3 is properly formed. The left side has a free index μ, which can take any of the values (0, 1, 2, 3). The right side has two indices, μ and ν. Because ν is a summation index, it is summed away and is not a free variable in the equation. The only free index on the right side is μ. In other words, each side of the equation has a free contravariant index μ. Therefore, the equation is properly formed; it has the same number of free indices on the left side as it does on the right side, and their contravariant characters match.
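Eq. 6.4 gives an example of how we would use $L^\mu{}_\nu$ in practice. We have filled in matrix elements that correspond to a Lorentz transformation along the x axis (with c = 1).
$$\begin{pmatrix} t' \\ x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix} \frac{1}{\sqrt{1 - v^2}} & \frac{-v}{\sqrt{1 - v^2}} & 0 & 0 \\ \frac{-v}{\sqrt{1 - v^2}} & \frac{1}{\sqrt{1 - v^2}} & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} t \\ x \\ y \\ z \end{pmatrix} \qquad (6.4)$$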
What does the equation say? Following the rules of matrix multiplication, Eq. 6.4 is equivalent to four simple equations. The first equation specifies the value of t′, which is the first element of the vector on the left side. We set t′ equal to the dot product of the first row of the matrix with the column vector on the right side. In other words, the equation for t′ becomes
$$t' = \frac{t}{\sqrt{1 - v^2}} - \frac{vx}{\sqrt{1 - v^2}} + 0 \cdot y + 0 \cdot z$$
or simply
$$t' = \frac{t - vx}{\sqrt{1 - v^2}}.$$
Carrying out the same process for the second row of $L^\mu{}_\nu$ gives the equation for x′:
$$x' = \frac{x - vt}{\sqrt{1 - v^2}}.$$
The third and fourth rows produce the equations
$$y' = y, \qquad z' = z.$$
It’s easy to recognize these equations as the standard Lorentz transformation along the x axis. If we wanted to transform along the y axis instead, we would just shuffle these matrix elements around. I’ll leave it to you to figure out how to do that.
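The row-times-column bookkeeping can be checked numerically. The following is a minimal sketch in plain Python, assuming units with c = 1; the helper names `boost_x_matrix` and `apply` and the sample numbers are invented for illustration.

```python
import math

def boost_x_matrix(v):
    """4 x 4 matrix for a Lorentz transformation along the x axis (c = 1)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [
        [g, -g * v, 0.0, 0.0],
        [-g * v, g, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

def apply(L, A):
    """(A')^mu = L^mu_nu A^nu: dot row mu of L with the column A, summing over nu."""
    return [sum(L[mu][nu] * A[nu] for nu in range(4)) for mu in range(4)]

v = 0.6
t, x, y, z = 5.0, 3.0, 1.0, 2.0
t2, x2, y2, z2 = apply(boost_x_matrix(v), [t, x, y, z])

# Compare with the standard formulas written out directly.
t_formula = (t - v * x) / math.sqrt(1.0 - v * v)
x_formula = (x - v * t) / math.sqrt(1.0 - v * v)
print(t2, t_formula, x2, x_formula, y2, z2)
```

The inner sum in `apply` runs over the repeated index ν, while the outer list is indexed by the free index μ, mirroring the structure of the matrix equation.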
Now let’s consider a different operation: a rotation in the y, z plane, where the variables t and x play no part at all. A rotation can also be represented as a matrix, but the elements would be different from our first example. To begin with, the upper-left quadrant would look like a 2 × 2 unit matrix. That assures that t and x are not affected by the transformation. What about the lower-right quadrant? You probably know the answer. To rotate the coordinates by angle θ, the matrix elements would be the sines and cosines shown in Eq. 6.5.
Following the rules of matrix multiplication, Eq. 6.5 is equivalent to the four equations
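One way to write such a rotation and its four component equations is sketched below. The placement of the minus sign on the sine terms is an assumption about the direction in which θ is measured; the opposite convention swaps the two signs.

```latex
\[
\begin{pmatrix} t' \\ x' \\ y' \\ z' \end{pmatrix}
=
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & \cos\theta & -\sin\theta \\
0 & 0 & \sin\theta & \cos\theta
\end{pmatrix}
\begin{pmatrix} t \\ x \\ y \\ z \end{pmatrix}
\quad\Longrightarrow\quad
\begin{aligned}
t' &= t \\
x' &= x \\
y' &= y\cos\theta - z\sin\theta \\
z' &= y\sin\theta + z\cos\theta
\end{aligned}
\]
```

The unit block in the upper left leaves t and x alone, and the lower-right block mixes y and z while keeping $y^2 + z^2$ fixed.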
In a similar way, we can write matrices representing rotations in the x, y or x, z planes by shuffling these matrix elements to different locations within the matrix.
Once we define a set of transformation matrices for simple linear motion and spatial rotations, we can multiply these matrices together to make more complicated transformations. In this way, we can represent a complicated transformation using a single matrix. The simple transformation matrices shown here, along with their y and z counterparts, are the basic building blocks.
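Here is a minimal sketch of the building-block idea: multiply a boost matrix and a rotation matrix into a single matrix, then confirm that applying it once gives the same result as applying the two steps in sequence. It assumes c = 1, a particular sign convention for the rotation angle, and an interval with a minus sign on the time term. All helper names are illustrative.

```python
import math

def boost_x_matrix(v):
    """Lorentz transformation along the x axis (c = 1)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [[g, -g * v, 0.0, 0.0],
            [-g * v, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def rotation_yz_matrix(theta):
    """Rotation in the y, z plane; the sign of the sine terms is an assumed convention."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, c, -s],
            [0.0, 0.0, s, c]]

def matmul(P, Q):
    """Matrix product P Q of two 4 x 4 matrices."""
    return [[sum(P[i][k] * Q[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def apply(L, A):
    """(A')^mu = L^mu_nu A^nu."""
    return [sum(L[mu][nu] * A[nu] for nu in range(4)) for mu in range(4)]

def interval(A):
    """-t^2 + x^2 + y^2 + z^2 for the components A = (t, x, y, z)."""
    return -A[0] ** 2 + A[1] ** 2 + A[2] ** 2 + A[3] ** 2

# Boost first, then rotate: the matrix applied first sits on the right.
combined = matmul(rotation_yz_matrix(0.3), boost_x_matrix(0.6))

A = [5.0, 3.0, 1.0, 2.0]
one_step = apply(combined, A)
two_step = apply(rotation_yz_matrix(0.3), apply(boost_x_matrix(0.6), A))
print(one_step, two_step, interval(A), interval(one_step))
```

This particular pair happens to commute because the boost touches only t and x while the rotation touches only y and z. In general the order of the factors matters, so it is worth keeping track of which transformation is applied first.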