Manifolds Application II

Einstein's Equations

Bianchi, Ricci, and Italian Surnames in Geometry



Welcome back everyone! Hopefully you've had enough time to review all of my blog posts thus far, because the topics discussed in this post will span the majority of my previous content. Most importantly, I hope that you were able to follow along with my last blog post, which introduced the structural equations and gave a look into how to deduce geodesics over a Riemannian manifold; we will be doing something very similar in this blog post. Previous posts have been archived; my own journey through algebraic geometry has made me realize there are far more enlightening discussions of Riemannian geometry than mine, and my colloquial blogpost-style approach may be more misleading than helpful. Any necessary reference material can be found in Introduction to Riemannian Manifolds (Lee).

Before we jump into the thick of things, I must clear up some prerequisite material that I failed to cover (this seems to be a common theme across most of my posts) and introduce some unfamiliar notation.

Up to this point, I have used the term 'tensor' to refer to our covectors \(dx_1, dx_2, \dots, dx_n\); this is not a correct definition (in either the algebraic or differential geometry sense). More generally, let \(V\) be some vector space (as you may expect, this will often be \(V = T_pM\) for some manifold \(M\)). As before, we say that a map \(f(v_1, \dots, v_n) \) is multilinear if it satisfies

  1. \(f(\dots, a_i + b_i, \dots ) = f(\dots, a_i, \dots) + f(\dots, b_i, \dots) \) at each index \(1\leq i\leq n\).
  2. \( f(\dots, kv_i,\dots ) = k\,f(\dots, v_i, \dots) \) at each index \(1\leq i\leq n\).

which really isn't all that surprising since it simply means 'linear when all other variables become fixed'. Given some integers \(p,q \geq 0\), we (correctly) define a \( (p,q) \)-tensor to be a multilinear map

$$ T: \underbrace{ V \times \dots \times V }_{p\ \textrm{copies}} \times \underbrace{V^* \times \dots \times V^*}_{q\ \textrm{copies}} \to \mathbb{R} $$

We refer to the space of \((p,q)\)-tensors as \( \mathcal{T}^p_q(V) \) (if either \(p\) or \(q\) is \(0\), we simply omit that index). An algebraist may want to be pedantic here and say that the tensor is actually the object associated under the universal property to this map, but for the sake of this blogpost I will consider that a bit of a red herring. To bring this definition full circle, if \(T \in \mathcal{T}^n(V)\) is skew-symmetric (i.e. so that

$$ T(x_1, \dots,x_i, x_{i+1},\dots, x_n ) = -T(x_1, \dots, x_{i+1},x_i,\dots, x_n) $$

for all \(x_i\) ) then we call \(T\) an exterior form. The archetypal examples are our differential forms \(dx_i\); these are \( (1, 0)\)-tensors, while a general \(k\)-form \( dx_{i_1} \wedge \dots \wedge dx_{i_k} \) is a \( (k,0) \)-tensor. Alternatively, our vector fields can simply be described as \( (0, 1) \)-tensors.

Most of everything I have just explained should be simple review from my first manifolds post, with a little bit fancier notation sprinkled in.
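To make the two multilinearity conditions and the skew-symmetry condition concrete, here is a minimal sketch that checks them for the 2-form \(dx_1 \wedge dx_2\) on \(\mathbb{R}^2\). The helper names (`wedge_dx1_dx2`, `add`, `scale`) and the test vectors are my own choices for illustration.

```python
from fractions import Fraction

def wedge_dx1_dx2(v, w):
    """Evaluate the 2-form dx_1 ^ dx_2 on two vectors of R^2."""
    return v[0] * w[1] - v[1] * w[0]

def add(v, w):
    """Componentwise sum of two vectors."""
    return tuple(a + b for a, b in zip(v, w))

def scale(k, v):
    """Scalar multiple of a vector."""
    return tuple(k * a for a in v)

# Arbitrary test data (exact arithmetic, so == comparisons are safe).
a, b, w = (1, 2), (3, -1), (4, 5)
k = Fraction(7, 3)

# Condition 1: additivity in the first slot.
additive = wedge_dx1_dx2(add(a, b), w) == wedge_dx1_dx2(a, w) + wedge_dx1_dx2(b, w)
# Condition 2: homogeneity in the first slot.
homogeneous = wedge_dx1_dx2(scale(k, a), w) == k * wedge_dx1_dx2(a, w)
# Skew-symmetry: swapping the two arguments flips the sign.
skew = wedge_dx1_dx2(a, w) == -wedge_dx1_dx2(w, a)

print(additive, homogeneous, skew)
```

A check on a handful of vectors is of course not a proof, but it is a quick way to catch a sign or index slip.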







The next thing I want to introduce is the distinction between covariant and contravariant parts of a tensor.

Suppose we have some manifold \(M\). By this point, hopefully you're not confused when I say \( \frac{\partial}{\partial x_1}\vert_p, \dots, \frac{\partial}{\partial x_n}\vert_p \) forms a basis for our tangent space \(T_p M\). The multivariable chain rule says that if you wanted to change coordinate systems from \(x_1, \dots, x_n\) to some \(y_1, \dots, y_n\) then we would apply the following transformation:

$$ \frac{\partial}{\partial x_i}\vert_p = \sum_j \frac{\partial y_j}{\partial x_i} (p) \frac{\partial}{\partial y_j}\vert_p $$

This is known as the covariant transform. On the other hand, when we look at a change of coordinate system for the cotangent space \(T_p^\ast M = (T_p M)^\vee\) we wind up with the following:

$$ dx_i = \sum_j \frac{\partial x_i}{\partial y_j}\,dy_j $$

This is known as the contravariant transformation. Going by these two transformation laws, given some \((p,q)\)-tensor one would refer to its \(p\) vector arguments as "covariant" and its \(q\) covector arguments as "contravariant".

I'll go ahead and stop right there to say it for you: this notation / naming convention is completely backwards: wouldn't you expect that covectors be labelled as covariant vectors?


Classical terminology used these same words [covariant and contravariant], and it just happens to have reversed this: a vector field is called a contravariant vector field, while a section of \(T^*M\) is called a covariant vector field. And no one has had the gall or authority to reverse terminology so sanctified by years of usage. So it's very easy to remember which kind of vector field is covariant, and which contravariant — it's just the opposite of what it logically ought to be.

Michael Spivak in A Comprehensive Introduction to Differential Geometry
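Returning to the two transformation laws themselves, a quick worked example may help. The choice of coordinates is mine: take the plane with Cartesian coordinates \((x_1, x_2) = (x, y)\) and polar coordinates \((y_1, y_2) = (r, \theta)\), so that \(x = r\cos\theta\) and \(y = r\sin\theta\). The basis vector \(\frac{\partial}{\partial x}\) picks up the partial derivatives of the new coordinates with respect to the old, while \(dx\) picks up the partial derivatives of the old coordinates with respect to the new:

$$ \frac{\partial}{\partial x} = \cos\theta\,\frac{\partial}{\partial r} - \frac{\sin\theta}{r}\,\frac{\partial}{\partial \theta} \hspace{3em} dx = \cos\theta\,dr - r\sin\theta\,d\theta $$

Pairing the two gives \(dx\big(\frac{\partial}{\partial x}\big) = \cos^2\theta + \sin^2\theta = 1\), as it should.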






With this correction to our terminology handled, recall that we derived the following structural equations in a previous blog post:

  1. First Structural Equation: \(\tau^i = d\theta^i + \sum_j \omega^i_j \wedge \theta^j \)
  2. Second Structural Equation: \( \Omega^i_j = d\omega^i_j + \sum_k \omega^i_k \wedge \omega^k_j \)

We may use these two equations to derive yet another two equations, known as the Bianchi Identities.


First Bianchi Identity:
Let \(\nabla\) be an affine connection. Let \(\theta\) and \(\tau\) denote the column vectors corresponding to the dual \(1\)-forms \(\theta^i\) and torsion forms \(\tau^i\), respectively. Moreover, let \(\Omega = [ \Omega^i_j ]\) and \(\omega = [\omega^i_j] \) represent the curvature and connection matrices. Then $$ d\tau = \Omega \wedge \theta - \omega \wedge \tau $$



Second Bianchi Identity:
Let \(\nabla\) be a connection over some smooth vector bundle \(E\), and let \(\omega = [\omega^i_j]\) and \(\Omega = [\Omega^i_j]\) be the connection and curvature matrices, respectively. Then $$ d\Omega = \Omega \wedge \omega - \omega \wedge \Omega $$


If you remember, we previously defined a Riemannian connection to be \((i)\) torsion-free and \((ii)\) compatible with the metric. Thus, over Riemannian bundles we have that \(\tau = d\tau = 0\). Plugging this into the First Bianchi Identity, we can see that

$$ \Omega \wedge \theta = 0 $$

We will use this simple identity in our next theorem which will be used to characterize "curvature-like" tensors:


Theorem:
Let \(X, Y, Z, W \in \mathfrak{X}(M)\) be smooth vector fields over a Riemannian manifold \(M\). If \(R\) denotes the curvature tensor with respect to a Riemannian connection \(\nabla\), then we have that

  1. Skew Symmetry: \(R(X, Y) = -R(Y, X)\)

  2. Skew Symmetry under Metric: \( \langle R(X, Y)Z, W\rangle = -\langle R(X,Y)W, Z\rangle \)

  3. First Bianchi Identity (Alt.): \(R(X,Y)Z + R(Y,Z)X + R(Z,X)Y = 0 \)


The theorem above actually gives us a great way to generalize curvature across tensors. In general, we say that a covariant tensor of degree 4 (i.e. a \((4,0)\)-tensor) \(T \in \mathcal{T}^4(V)\) over a vector space is curvature-like if it satisfies

  1. \(T(v_1, v_2, v_3, v_4) = -T(v_2, v_1, v_3, v_4)\)
  2. \( T(v_1, v_2, v_3, v_4) = -T(v_1, v_2, v_4, v_3) \)
  3. \(T(v_1, v_2, v_3, v_4) + T(v_1, v_3, v_4, v_2) + T(v_1, v_4, v_2, v_3) = 0\)

for all \(v_1, v_2, v_3, v_4 \in V\).

Our previous theorem shows that the Riemannian curvature tensor is curvature-like, which should be\(\dots\) well, obvious. But what else can we say about curvature-like tensors? For starters, we may apply characteristics \(1\)-\(3\) above to see that

$$ \begin{align} 2T(v_1, v_2,v_3,v_4) &= T(v_1, v_2, v_3, v_4) - T(v_2, v_1, v_3, v_4) \\&= - \big( T(v_1, v_3, v_4, v_2) + T(v_1, v_4, v_2, v_3) \big) \\&\hspace{2em} + \big( T(v_2, v_3, v_4, v_1) + T(v_2, v_4, v_1, v_3)\big) \\&= - \big( T(v_3, v_1, v_2, v_4) + T(v_3, v_2, v_4, v_1) \big) \\&\hspace{2em} + \big( T(v_4, v_1, v_2, v_3) + T(v_4, v_2, v_3, v_1) \big) \\&= T(v_3, v_4, v_1, v_2) - T(v_4, v_3, v_1, v_2) \\&= 2T(v_3, v_4, v_1, v_2) \end{align} $$

In terms of our Riemannian curvature, this tells us that \( \langle R(X,Y)Z, W\rangle = \langle R(Z,W)X, Y\rangle \).
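As a sanity check on the three defining properties and on the pair symmetry just derived, here is a small sketch using a model tensor of my own choosing (it is not taken from the earlier posts): \(T(a,b,c,d) = \langle a,d\rangle\langle b,c\rangle - \langle a,c\rangle\langle b,d\rangle\) on \(\mathbb{R}^3\) with the usual dot product. The test vectors are arbitrary integers so that every comparison is exact.

```python
from itertools import product

def dot(a, b):
    """Euclidean inner product on R^3."""
    return sum(x * y for x, y in zip(a, b))

def T(a, b, c, d):
    # Hypothetical model tensor, chosen only for illustration:
    # T(a, b, c, d) = <a,d><b,c> - <a,c><b,d>
    return dot(a, d) * dot(b, c) - dot(a, c) * dot(b, d)

# Integer test vectors keep every comparison exact.
vectors = [(1, 0, 2), (0, 3, -1), (2, 1, 1), (-1, 4, 0)]

skew_first = skew_last = cyclic = pair_symmetric = True
for v1, v2, v3, v4 in product(vectors, repeat=4):
    # Property 1: skew in the first two slots.
    skew_first = skew_first and T(v1, v2, v3, v4) == -T(v2, v1, v3, v4)
    # Property 2: skew in the last two slots.
    skew_last = skew_last and T(v1, v2, v3, v4) == -T(v1, v2, v4, v3)
    # Property 3: cyclic sum over the last three slots vanishes.
    cyclic = cyclic and (
        T(v1, v2, v3, v4) + T(v1, v3, v4, v2) + T(v1, v4, v2, v3) == 0
    )
    # Consequence derived in the text: T(v1,v2,v3,v4) = T(v3,v4,v1,v2).
    pair_symmetric = pair_symmetric and T(v1, v2, v3, v4) == T(v3, v4, v1, v2)

print(skew_first, skew_last, cyclic, pair_symmetric)
```

All four flags come out `True` for this tensor, which is consistent with the computation above: the pair symmetry follows from properties 1 to 3 alone.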

We may use any curvature-like tensor \(T \in \mathcal{T}^4(V)\) to define two other kinds of curvature familiar to differential geometers: the sectional curvature and the Ricci curvature.

Let \(H\) be some plane in a tangent space \(T_pM\) of our Riemannian manifold \(M\), spanned by orthonormal vectors \(X\) and \(Y\). We define the sectional curvature \(K_T(H)\) with respect to our curvature-like tensor \(T\) to be

$$ K_T(H) = T(X, Y, Y, X) $$

As the reader may be able to visualize, our sectional curvature allows us to look at the curvature of level sets on our manifold (with the proper choice of plane \(H\)). In other words, we are essentially cutting our manifold and looking at the cross sections.

The sectional curvature given by cutting through by a plane

Given an orthonormal frame \(e_1, \dots, e_n\), we may additionally define what is known as the Ricci curvature tensor (with respect to \(T\)) by

$$ \textrm{Ric}_T(X,Y) = \sum_{i=1}^n T(e_i, X, Y, e_i) $$

The Ricci curvature tensor is a complicated object, but the best explanation I've seen for it is on John Baez's website. Suppose you are an astronaut in space and you "spill" (more like let loose) a can of coffee grounds. The geometry and curvature of spacetime determines how each coffee ground in the can co-moves with one another. If the curvature of spacetime is flat, then all the coffee grounds will travel in a uniform, homogeneous ball until the end of time. However, since gravity affects spacetime curvature (spoiler alert!), the gravity of each individual coffee grind will affect one another. Thus, our initial ball of coffee grinds may eventually deform into a different (possibly ellipsoid) shape. The Ricci curvature tensor keeps track of the rate of change of the volume.
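To see the two definitions in action, here is a minimal sketch that evaluates the sectional curvature and the Ricci tensor for a model curvature-like tensor on \(\mathbb{R}^3\). The tensor \(T(a,b,c,d) = \langle a,d\rangle\langle b,c\rangle - \langle a,c\rangle\langle b,d\rangle\), the standard basis as orthonormal frame, and the vectors `X`, `Y` are all assumptions made for illustration.

```python
def dot(a, b):
    """Euclidean inner product on R^3."""
    return sum(x * y for x, y in zip(a, b))

def T(a, b, c, d):
    # Hypothetical model tensor: <a,d><b,c> - <a,c><b,d>
    return dot(a, d) * dot(b, c) - dot(a, c) * dot(b, d)

# Standard basis of R^3, used as the orthonormal frame e_1, e_2, e_3.
frame = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]

def sectional(X, Y):
    """K_T(H) = T(X, Y, Y, X) for orthonormal X, Y spanning H."""
    return T(X, Y, Y, X)

def ricci(X, Y):
    """Ric_T(X, Y) = sum_i T(e_i, X, Y, e_i)."""
    return sum(T(e, X, Y, e) for e in frame)

K = sectional(frame[0], frame[1])
X, Y = (1, 2, 3), (4, -1, 2)
print(K, ricci(X, Y), ricci(Y, X), 2 * dot(X, Y))
```

For this particular tensor every plane has sectional curvature \(1\) and the Ricci tensor is \((n-1)\langle X, Y\rangle\) with \(n = 3\), which is why the last three printed values agree.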



Theorem:
Let \(T \in \mathcal{T}^4(V)\) be a curvature-like tensor over some vector space \(V\). Then the Ricci curvature tensor with respect to \(T\) is symmetric; that is $$ \textrm{Ric}_T(v,w) = \textrm{Ric}_T(w,v) $$ for all \(v, w \in V \).
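The post states this without proof, so here is a short sketch of one way to see it. Apply the pair symmetry \(T(v_1, v_2, v_3, v_4) = T(v_3, v_4, v_1, v_2)\) derived earlier, and then both skew-symmetries (the two sign changes cancel):

$$ \textrm{Ric}_T(v,w) = \sum_i T(e_i, v, w, e_i) = \sum_i T(w, e_i, e_i, v) = \sum_i T(e_i, w, v, e_i) = \textrm{Ric}_T(w,v) $$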







Einstein's Field Equations



Over the years popular culture has hyper-focused on only one of Einstein's famous equations

$$ E = mc^2 $$

(which really should be written \( E^2 = m^2 + p^2 \) by setting \(c=1\)) and little love has been given to the more mathematically interesting equations (pertaining to geometry). It somewhat makes sense as the notation gets quite messy when playing with differential geometry, but I would still like to give a crack at it for those interested.

In the previous section, we built up a lot of theory regarding curvature-like tensors \(T \in \mathcal{T}^4(V)\). For the remainder of this article, we will only concern ourselves with the Riemannian curvature tensor \(\textrm{Rm}(X, Y, Z, W)\) defined by

$$ \textrm{Rm}\,(X,Y,Z,W) = \langle R(X,Y)Z,W\rangle $$

As previously noted, this was the archetypal example for curvature-like tensors — thus, we use it to define the Ricci curvature \(\textrm{Ric}\) on a Riemannian manifold:

$$ \textrm{Ric}(X,Y) = \sum_i \textrm{Rm}(e_i, X, Y, e_i) $$

As an aside, I've never been a fan of physicist's notation — there's a lot of indices flying around and it can look like an absolute mess. I'll do my best to clarify a few common notation conventions used in differential geometry.


  • Einstein Summation Convention: The same symbol that occurs twice (generally, once as superscript and once as subscript) is implied to be summed over; that is, \(A_\alpha B^\alpha =\sum_\alpha A_\alpha B^\alpha\). Similarly, an index variable that is not balanced on both sides of an equals sign (=) is assumed to be summed over; that is, \( y = a_ix^i \) means the same as \(y = \sum_i a_ix^i \).

  • Contraction: Similar to Einstein summation, if an index variable is repeated in both an upper index and lower index then they may be cancelled out by summing. For example, suppose we were to represent our Riemannian curvature by \(\textrm{Rm}^i_{jkl} = \textrm{Rm}(e_i, e_j, e_k, e_l)\). According to Einstein summation convention, we may represent the Ricci curvature as \( \textrm{Ric}_{jk} = \textrm{Rm}^i_{jki} \); in this case, we sum over the \(i\) to give us \( \textrm{Ric}_{jk} = \sum_i \textrm{Rm}^i_{jki} \). It is worth noting that our coefficients \(\textrm{Ric}_{jk}\) no longer contain the index variable \(i\); that is, it has been contracted.

  • Covariant Derivative as Subscript: Whenever a semicolon (;) is present in subscript followed by a single index variable, it implies that you take the covariant derivative of the term with respect to that index variable. For instance, if \(e_1, \dots, e_n\) is a smooth frame then \( R^i_{jkl;m} \) really means \(\nabla_{e_m} R^i_{jkl}\).

  • Riemannian Metric as Coefficient: Let \(e_1, \dots, e_n\) be some smooth frame for the tangent bundle with dual frame \(\theta^1, \dots, \theta^n\). Then we may represent our Riemannian metric \(g\) as \( g = \sum_{i,j} g_{ij} \theta^i \otimes \theta^j \). Consequently, we may define a matrix representation of our Riemannian metric \( [g_{ij}] \). We denote the entries of the inverse matrix using superscripts; that is, \([g_{ij}]^{-1} = [g^{ij}]\).
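The conventions above are easier to digest with numbers in hand. The following sketch, with made-up component values, spells out the implied sum \(y = a_i x^i\), raises an index with the inverse metric (\(A^i_j = g^{ik}A_{kj}\)), and then contracts (\(A^i_i\)). The 2-dimensional diagonal metric and the array `A` are purely illustrative assumptions.

```python
from fractions import Fraction

# Einstein summation: y = a_i x^i means a sum over the repeated index i.
a = [2, -1, 3]   # components a_i (made-up values)
x = [1, 4, 2]    # components x^i (made-up values)
y = sum(a[i] * x[i] for i in range(3))

# Inverse of the (hypothetical) diagonal metric g = diag(-1, 4).
g_inv = [[Fraction(-1), Fraction(0)],
         [Fraction(0), Fraction(1, 4)]]

# A symmetric array of components A_ij (made-up values).
A = [[3, 1],
     [1, 8]]
n = 2

# Raise the first index: A^i_j = g^{ik} A_{kj} (sum over k).
A_mixed = [[sum(g_inv[i][k] * A[k][j] for k in range(n)) for j in range(n)]
           for i in range(n)]

# Contract: A^i_i (sum over i), i.e. the trace with respect to g.
trace = sum(A_mixed[i][i] for i in range(n))

print(y, trace)
```

Note that the contraction is taken after raising an index, so the metric's signs and scale factors enter the result; this is the same mechanism used for the scalar curvature later on.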

Using what we have just learned (specifically regarding contraction), we may succinctly define what is known as the scalar curvature as follows:

$$ S = \textrm{Ric}^i_i = \sum_i \textrm{Ric}(e_i, e_i) $$

I'll now go ahead and introduce the more popular form of the Second Bianchi Identity, especially so among physicists:


Theorem:
Let \(\nabla\) be a connection compatible with the metric on a Riemannian manifold \(M\). Then $$ \textrm{Rm}^i_{jkl;m} + \textrm{Rm}^i_{jlm;k} + \textrm{Rm}^i_{jmk;l} = 0 $$


Recall that we defined our Riemannian curvature tensor \(\textrm{Rm}\) by

$$ \textrm{Rm}(X, Y, Z, W) = \langle R(X, Y)Z, W\rangle $$

If we let \(\textrm{Rm}_{ijkl}\) denote the components for our local representation of the Riemannian curvature tensor (i.e. \(\textrm{Rm}_{ijkl} = \textrm{Rm}(e_i, e_j, e_k, e_l) \) ), it follows that

$$ \textrm{Rm}_{ijkl} = g_{ir}R^r_{jkl} = \sum_{r=1}^n g_{ir}R^r_{jkl} $$

Multiplying through by the inverse matrix representation of our Riemannian metric, we get that

$$ R^m_{jkl;m} = \sum_r g^{mr}\textrm{Rm}_{rjkl;m} $$

Thus, our previous theorem simplifies further to

$$ \begin{align} 0 &= \sum_m R^m_{jkl;m} + \sum_m R^m_{jlm;k} + \sum_m R^m_{jmk;l} \\&= \sum_{m,r}g^{mr} \textrm{Rm}_{rjkl;m} - \textrm{Ric}_{jl;k} + \textrm{Ric}_{jk;l} \\&= - \sum_{m,r,l} g^{mr}R^l_{rkl;m} - S_{;k} + \sum_l \textrm{Ric}^l_{k;l} \hspace{3em}(\textrm{raise}\ \textrm{index}\ j,\ \textrm{set}\ j=l, \ \textrm{contract}) \end{align} $$

We may lower the index \(l\) to give us

$$ \sum_l ( \textrm{Ric}_{kl} - \frac{1}{2} Sg_{kl})_{;l} = 0 $$

The equation above leads to the famous Einstein tensor:

$$ G = \textrm{Ric} - \frac{1}{2}Sg $$
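In case the last two steps of the contraction went by quickly, here is my reading of the intermediate computation. I am assuming the convention \(\textrm{Ric}_{jk} = \sum_i R^i_{jik}\) that the second line of the derivation appears to use, so that \(\sum_l R^l_{rkl} = -\textrm{Ric}_{rk}\). The first term of the last line then becomes another copy of the divergence of \(\textrm{Ric}\):

$$ -\sum_{m,r,l} g^{mr}R^l_{rkl;m} = \sum_{m,r} g^{mr}\,\textrm{Ric}_{rk;m} = \sum_m \textrm{Ric}^m_{k;m} \hspace{2em}\Longrightarrow\hspace{2em} 0 = 2\sum_l \textrm{Ric}^l_{k;l} - S_{;k} = 2\sum_l \Big( \textrm{Ric}^l_k - \frac{1}{2}S\,\delta^l_k \Big)_{;l} $$

Lowering the index \(l\) (the metric passes through the covariant derivative because the connection is compatible with it) turns \(\delta^l_k\) into \(g_{kl}\), which is where the combination \(\textrm{Ric} - \frac{1}{2}Sg\) comes from.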

If you didn't follow the steps — trust me, you're not alone. A running joke that I have heard countless times in the differential geometry community is


Differential geometry is the study of things that are invariant under change of notation.
Anonymous




Around early November 1915, Einstein believed that the Ricci tensor \(\textrm{Ric}\) should be proportional to the stress-energy tensor \(T\). In other words, Einstein postulated that

$$ \textrm{Ric}_{\mu\nu} = a T_{\mu\nu} $$

for some constant \(a \in \mathbb{R}\) and indices \(\mu,\nu\).

The stress-energy tensor is a bit of a complicated object that arises in continuum mechanics — I could go down the rabbit hole trying to explain the concepts in detail, but for now I'll try to give a surface-level overview. The stress-energy tensor itself is a 4-dimensional analogue of what is known as the Cauchy stress tensor, which we may mathematically represent by the following matrix:

$$ \begin{pmatrix} T^{11} & T^{12} & T^{13} \\ T^{21} & T^{22} & T^{23} \\ T^{31} & T^{32} & T^{33} \end{pmatrix} $$

If you can imagine some 3-manifold divided into a bunch of infinitesimal cubes with three perpendicular vectors normal to each direction of our cubes, the elements \(T^{ij}\) represent the push, pull, and shear stress experienced along each coordinate direction. For this reason, the diagonal elements \(T^{ii}\) are referred to as the pressure, while the non-diagonal elements are referred to as the shear stress.

For example, let's visualize what's going on in two dimensions:

Shear stress of square

When our matrix \([T^{ij}]\) is not symmetric, the shear stress becomes unbalanced, thus causing angular momentum. In higher dimensions, we have that a non-symmetric stress-energy tensor corresponds to a non-zero torsion element; consequently, we will assume from here on out that the stress energy tensor is symmetric.

Recall that in 4-dimensional Minkowski space, we assume that the first coordinate \( x_0 = ict \) corresponds to time. For this reason, our elements \(T^{ij}\) become a little bit different when a \(0\) is present in the index. When only one zero is present (i.e. \(T^{i0}\) or \(T^{0j}\)), we refer to the element as energy flux — however, the element \(T^{00}\) is referred to as the energy density.

In physics, one defines an ideal fluid to be a gas or liquid that has constant density, no conductivity and no viscosity. When conductivity is zero, we have that the energy flux must be zero as well; moreover, if there is no viscosity then the liquid doesn't experience shear stress. Therefore, the stress energy tensor of a perfect fluid can be described by the diagonal matrix

$$ \begin{pmatrix} \rho & 0 & 0 & 0 \\ 0 & p & 0 & 0 \\ 0 & 0 & p & 0 \\ 0 & 0 & 0 & p \end{pmatrix} $$

where \(\rho\) denotes the energy density and \(p\) is the pressure. This may be put into tensor form via the equation \(T^{\alpha\beta} = (\rho + p)u^\alpha u^\beta + p g^{\alpha\beta}\), where \(u^\alpha\) represents four-velocity.
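As a quick consistency check between the tensor form and the diagonal matrix, here is a sketch that evaluates \(T^{\alpha\beta} = (\rho + p)u^\alpha u^\beta + p g^{\alpha\beta}\) in the fluid's rest frame. I am assuming units with \(c = 1\), the flat inverse metric \(\textrm{diag}(-1, 1, 1, 1)\), the rest-frame four-velocity \(u = (1, 0, 0, 0)\), and arbitrary sample values for \(\rho\) and \(p\).

```python
# Sample values (arbitrary, for illustration only).
rho, p = 5, 2

# Four-velocity of the fluid in its own rest frame (c = 1).
u = [1, 0, 0, 0]

# Inverse Minkowski metric g^{ab} = diag(-1, 1, 1, 1).
g_inv = [[(-1 if i == 0 else 1) if i == j else 0 for j in range(4)]
         for i in range(4)]

# T^{ab} = (rho + p) u^a u^b + p g^{ab}
T = [[(rho + p) * u[a] * u[b] + p * g_inv[a][b] for b in range(4)]
     for a in range(4)]

for row in T:
    print(row)
```

The \(00\) entry works out to \((\rho + p) - p = \rho\) and the spatial diagonal entries to \(p\), matching the matrix above.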

The reason I bring this up is because — in recent years — there has been notable research in the field of relativity showing that spacetime behaves very similar to an ideal fluid.

One important fact regarding the stress-energy tensor is that it's used to generalize the conservation of energy in the terse equation \(T^\mu_{\nu;\mu} = 0\) (this is often referred to as the continuity equation). Going back to Einstein's postulate that \(\textrm{Ric}_{\mu\nu} = a T_{\mu\nu}\), if you take the covariant derivative of both sides you'll find that \(\textrm{Ric}^\mu_{\nu;\mu} \neq 0\) in general; this was obviously a huge point of conflict for Einstein in his early work. However, after making similar derivations to ours above, Einstein found that his tensor \(G\) alleviated the long outstanding contention between curvature and energy. Later in November 1915, Einstein published Grundgedanken der allgemeinen Relativitätstheorie und Anwendung dieser Theorie in der Astronomie (a bit of a mouthful, but nonetheless one of his most famous papers), which outlined the beautiful equation:

$$ G_{\mu \nu} + \Lambda g_{\mu\nu} = \frac{8\pi\gamma}{c^4} T_{\mu\nu} $$

This became known as Einstein's Field Equation. On the left side of the equation, the constant \(\Lambda\) became known as the cosmological constant and describes the vacuum energy (i.e. energy density of empty space) in our observable universe. On the right side of the equation, we use \(\gamma\) to denote Newton's gravitational constant; this would typically be denoted by \(G\) (like in Newton's equation \(F = \frac{Gm_1m_2}{r^2}\)) if Einstein's tensor were not present.

Einstein's field equation doesn't roll off the tongue as naturally as \(E = mc^2\); despite this, it has remained one of the most important equations in theoretical physics since its conception. Over the past several decades, physicists have spent countless hours fervently studying the vacuum solutions to Einstein's equations, in which case \(T \equiv 0\). It is worth noting that when the stress energy tensor vanishes (and \(\Lambda = 0\)), we have that

$$ R_{\mu\nu} = \frac{1}{2}Sg_{\mu\nu} $$

Using a bit of Ricci calculus, it isn't hard to show that when the stress-energy tensor \(T\) vanishes the Ricci curvature tensor \(\textrm{Ric}\) vanishes as well. In light of this, many physicists and mathematicians have recently turned their attention to Ricci-flat manifolds in order to model vacuum solutions to Einstein's field equations. Such manifolds are commonly referred to as Calabi-Yau manifolds, and are a huge focal point of theoretical physics (and are also easy to play with from a Hodge-theoretic perspective, cf. the Torelli Theorems).
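The "bit of Ricci calculus" is short enough to spell out. This is a sketch under the assumptions \(T \equiv 0\), \(\Lambda = 0\), and dimension \(n = 4\). Contract both sides of \(\textrm{Ric}_{\mu\nu} = \frac{1}{2}Sg_{\mu\nu}\) with the inverse metric \(g^{\mu\nu}\), and use \(g^{\mu\nu}g_{\mu\nu} = n\):

$$ S = g^{\mu\nu}\,\textrm{Ric}_{\mu\nu} = \frac{1}{2}S\,g^{\mu\nu}g_{\mu\nu} = \frac{n}{2}S = 2S \hspace{2em}\Longrightarrow\hspace{2em} S = 0 \hspace{2em}\Longrightarrow\hspace{2em} \textrm{Ric}_{\mu\nu} = \frac{1}{2}\cdot 0\cdot g_{\mu\nu} = 0 $$

The same argument goes through in any dimension \(n \neq 2\).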








Schwarzschild-De Sitter Spacetime



In the previous section, we described how Einstein's field equation describes how the geometry of our universe is affected by mass and energy, similar to how dropping a bowling ball on a bed sheet induces curvature. Moreover, by looking at the vacuum solutions \(T_{\mu\nu} \equiv 0\), we are able to isolate our Riemannian metric \(g_{\mu\nu}\) and describe our observable universe as a Riemannian manifold. In this case, matter and energy simply become analogous to deformations of our spacetime manifold.

There are numerous different models used to describe this universe's spacetime, but the most prevalent is known as Schwarzschild-De Sitter Spacetime, which we will represent by \(dS_n\). Topologically, \(dS_n\) is simply \(\mathbb{R}^2 \times S^{n-1}\) (where the real components correspond to time and radius). Ultimately, the presence of the sphere should make sense if we assume the universe would have expanded homogeneously without the presence of matter. Consequently, in \(3\) spatial dimensions we will consider the coordinate system \( (t, r, \theta, \phi) \).

It doesn't take much work to show that a 2-dimensional sphere of radius \(a\) comes equipped with a Riemannian metric

$$ g\vert_{S^2} = a^2d\theta^2 + a^2\sin^2\theta d\phi^2 $$
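For completeness, here is one way to carry out that "not much work", assuming the usual spherical parametrization \(x = a\sin\theta\cos\phi\), \(y = a\sin\theta\sin\phi\), \(z = a\cos\theta\) and restricting the Euclidean metric \(dx^2 + dy^2 + dz^2\) to the sphere:

$$ \begin{align} dx &= a\cos\theta\cos\phi\,d\theta - a\sin\theta\sin\phi\,d\phi \\ dy &= a\cos\theta\sin\phi\,d\theta + a\sin\theta\cos\phi\,d\phi \\ dz &= -a\sin\theta\,d\theta \\ dx^2 + dy^2 + dz^2 &= a^2(\cos^2\theta + \sin^2\theta)\,d\theta^2 + a^2\sin^2\theta(\sin^2\phi + \cos^2\phi)\,d\phi^2 = a^2\,d\theta^2 + a^2\sin^2\theta\,d\phi^2 \end{align} $$

The cross terms in \(d\theta\,d\phi\) coming from \(dx^2\) and \(dy^2\) are equal and opposite, so they cancel.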

Using this, we expect our Riemannian metric \(g\vert_{dS_3}\) in de Sitter space to contain the terms \(r^2d\theta^2 + r^2\sin^2\theta d\phi^2\). The best we can do for the remaining terms corresponding to \(dt^2\) and \(dr^2\) is simply guess. If we let \(f_1\) and \(f_2\) be some smooth functions over \(dS_3\), we have the following Riemannian metric:

$$ g\vert_{dS_3} = f_1 \,dt^2 + f_2\,dr^2 + r^2(d\theta^2 + \sin^2\theta d\phi^2) $$

For the sake of simplicity, we assume that our choice functions \(f_1\) and \(f_2\) are of the form \(f_1 = -e^{2\alpha}\) and \(f_2 = e^{2\beta}\) for some smooth functions \(\alpha\) and \(\beta\). Using this, we get the following orthonormal frame

$$ e_1 = e^{-\alpha} \frac{\partial}{\partial t}\hspace{2em} e_2 = e^{-\beta} \frac{\partial}{\partial r} \hspace{2em} e_3 = \frac{1}{r} \frac{\partial}{\partial \theta} \hspace{2em} e_4 = \frac{1}{r\,\sin\theta} \frac{\partial}{\partial \phi} $$

and dual local basis

$$ \theta^1 = e^\alpha \,dt \hspace{2em} \theta^2 = e^\beta \,dr \hspace{2em} \theta^3 = r\,d\theta \hspace{2em} \theta^4 = r\sin\theta \,d\phi $$

I'll go ahead and apologize for the repetitive notation used for \(\theta\) — without a superscript we use it as a coordinate corresponding to angle, but with a superscript we use it to represent the dual coordinate basis.




Suppose we represent our connection forms by \(\omega^i_j = a^i_j \,dt + b^i_j \,dr + c^i_j\,d\theta + d^i_j d\phi\). By the first structural equation, we may directly compute

$$ \begin{align} 0 &= d\theta^i + \omega^i_1 \wedge \theta^1 + \omega^i_2 \wedge \theta^2 + \omega^i_3 \wedge \theta^3 + \omega^i_4 \wedge \theta^4 \end{align} $$

for \(i=1,2,3,4\). I'll go ahead and spare you all the algebra, but this gives us

$$ \begin{align} \omega^1_2 &= \omega^2_1 = \frac{\partial \alpha}{\partial r}e^{\alpha - \beta}\, dt \\ \omega^2_3 &= -\omega^3_2 = - e^{-\beta}\,d\theta \\ \omega^2_4 &= - \omega^4_2 = -e^{-\beta}\sin \theta \,d\phi \\ \omega^3_4 &= -\omega^4_3 = - \cos \theta\,d\phi \end{align} $$

and \(\omega^i_j = 0\) otherwise. (Because the time direction carries a negative sign in the metric, the forms with an index equal to \(1\) are symmetric rather than skew-symmetric.) If we let \(\alpha' = \frac{\partial \alpha}{\partial r}\) and \(\beta' = \frac{\partial \beta}{\partial r}\), the second structural equation gives us

$$ \begin{align} \Omega^1_2 &= \Omega^2_1 = d\omega^1_2 = -e^{\alpha-\beta}( \alpha'' + (\alpha')^2 -\alpha'\beta')\,dt \wedge dr \\&= -e^{-2\beta}(\alpha'' + (\alpha')^2 -\alpha'\beta') \, \theta^1 \wedge \theta^2 \\ \Omega^1_3 &= \Omega^3_1 = \omega^1_2 \wedge \omega^2_3 = -\frac{1}{r} \alpha' e^{-2\beta}\, \theta^1 \wedge \theta^3 \\ \Omega^1_4 &= \Omega^4_1 = \omega^1_2 \wedge \omega^2_4 = -\frac{1}{r} \alpha' e^{-2\beta}\, \theta^1 \wedge \theta^4 \\ \Omega^2_3 &= -\Omega^3_2 = d\omega^2_3 + \omega^2_4 \wedge \omega^4_3 = \beta'e^{-\beta}\,dr \wedge d\theta \\&= \frac{1}{r} \beta' e^{-2\beta}\, \theta^2 \wedge \theta^3 \\ \Omega^2_4 &= -\Omega^4_2 = d\omega^2_4 + \omega^2_3 \wedge \omega^3_4 = \beta' e^{-\beta}\sin\theta\, dr \wedge d\phi \\&= \frac{1}{r}\beta'e^{-2\beta}\,\theta^2 \wedge \theta^4 \\ \Omega^3_4 &= -\Omega^4_3 = d\omega^3_4 + \omega^3_2 \wedge \omega^2_4 = (\sin \theta - e^{-2\beta} \sin \theta) \, d\theta \wedge d\phi \\&=\frac{1}{r^2} (1 - e^{-2\beta})\, \theta^3 \wedge \theta^4 \end{align} $$
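Going back one step, the connection forms are easy to sanity-check against the first structural equation. As a sample (my choice of index), take \(i = 3\), where \(\theta^3 = r\,d\theta\), \(\omega^3_2 = e^{-\beta}\,d\theta\), \(\omega^3_4 = -\cos\theta\,d\phi\), and \(\omega^3_1 = 0\):

$$ \begin{align} d\theta^3 + \omega^3_2 \wedge \theta^2 + \omega^3_4 \wedge \theta^4 &= dr \wedge d\theta + e^{-\beta}\,d\theta \wedge e^{\beta}\,dr - \cos\theta\,d\phi \wedge r\sin\theta\,d\phi \\&= dr \wedge d\theta + d\theta \wedge dr - 0 = 0 \end{align} $$

The other three values of \(i\) can be checked in the same way.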

Recall that, given a smooth frame \(e_1, \dots, e_n\), we defined the curvature forms to be the \(2\)-forms which satisfy

$$ R(X, Y)e_j = \sum_{i=1}^n \Omega^i_j(X, Y)e_i $$

Since we defined the curvature-like tensor \(R^i_{jkl}\) by \(R(e_k, e_l)e_j = \sum_{i=1}^n R^i_{jkl}e_i\), it follows that we may define \(R^i_{jkl}\) solely in terms of the curvature forms:

$$ R^i_{jkl} = \Omega^i_j(e_k, e_l) $$

Alternatively, if we are able to represent our curvature tensors \(R^i_{jkl}\) in terms of the dual frame \(\theta^1, \dots, \theta^n\), then we may also define the curvature forms in terms of the curvature tensor:

$$ \Omega^i_j = \sum_{k \lt l} R^i_{jkl} \theta^k \wedge \theta^l $$

Fortunately, we calculated the curvature forms \(\Omega^i_j\) both in terms of our original coordinates \((t, r, \theta, \phi)\) and our dual frame! Using the latter, we are able to compute the curvature tensors as follows:

$$ \begin{align} R^1_{212} &= - R^1_{221} = R^2_{112} = - R^2_{121} = -e^{-2\beta}(\alpha'' + (\alpha')^2 - \alpha'\beta') \\ R^1_{i1i} &= -R^1_{ii1} = R^i_{11i} = -R^i_{1i1} = - \frac{1}{r} \alpha' e^{-2\beta}\ \hspace{2em} \textrm{for} \ \ i=3,4 \\ R^2_{i2i} &= -R^2_{ii2} = -R^i_{22i} = R^i_{2i2} = \frac{1}{r}\beta' e^{-2\beta} \ \hspace{2em} \textrm{for} \ \ i=3,4 \\ R^3_{434} &= -R^3_{443} = -R^4_{334} = R^4_{343} = \frac{1}{r^2} (1 - e^{-2\beta}) \end{align} $$

and \(R^i_{jkl} = 0\) otherwise. Using our previous contraction \(\textrm{Ric}_{jk} = R^i_{jik}\), we get

$$ \begin{align} \textrm{Ric}_{11} &= R^2_{121} + R^3_{131} + R^4_{141} = e^{-2\beta} (\alpha'' + (\alpha')^2 -\alpha'\beta') + \frac{2}{r} \alpha' e^{-2\beta} \\ \textrm{Ric}_{22} &= R^1_{212} + R^3_{232} + R^4_{242} = -e^{-2\beta}(\alpha'' + (\alpha')^2 - \alpha' \beta') + \frac{2}{r}\beta' e^{-2\beta} \\ \textrm{Ric}_{33} &= R^1_{313} + R^2_{323} + R^4_{343} = \frac{1}{r} (\beta' - \alpha')e^{-2\beta} + \frac{1}{r^2}(1 - e^{-2\beta}) \\ \textrm{Ric}_{44} &= R^1_{414} + R^2_{424} + R^3_{434} = \frac{1}{r} (\beta' - \alpha')e^{-2\beta} + \frac{1}{r^2}(1 - e^{-2\beta}) \end{align} $$

Since we are using a negative coefficient for the time coordinate (much like in Minkowski space \(\mathbb{R}^4_1\)) in our metric \(g \vert_{dS_3}\), we anticipate a negative coefficient in the first index for any matrix operations on our tensors (with respect to the metric). I originally defined the scalar curvature \(S\) in the standard way that a physicist might: as the contraction of the Ricci tensor \(S = \textrm{Ric}_{ii}\). An equivalent definition that I have seen more so in differential geometry textbooks is by defining \(S = \textrm{tr}_g \textrm{Ric} \). From the latter definition, it is much more intuitive that the scalar curvature is

$$ \begin{align} S &= -\textrm{Ric}_{11} + \textrm{Ric}_{22} + \textrm{Ric}_{33} + \textrm{Ric}_{44} \\&= -2e^{-2\beta} (\alpha '' + (\alpha')^2 -\alpha'\beta') + \frac{4}{r} (\beta' - \alpha') e^{-2\beta} + \frac{2}{r^2}(1 - e^{-2\beta}) \end{align} $$

Finally, we calculate the Einstein tensor \(G = \textrm{Ric} - \frac{1}{2}Sg\). Again, it is worth noting that \(g\) has a negative sign in the first component so that

$$ \begin{align} G_{11} &= \textrm{Ric}_{11} - \frac{1}{2}Sg_{11} = \textrm{Ric}_{11} + \frac{1}{2}S = \frac{2}{r}\beta' e^{-2\beta} + \frac{1}{r^2} (1 - e^{-2\beta}) \\&= \frac{1}{r^2} \big\{ r(1 - e^{-2\beta}) \big\}' \\ G_{22} &= \textrm{Ric}_{22} - \frac{1}{2}Sg_{22} = \textrm{Ric}_{22} - \frac{1}{2}S = \frac{2}{r}(\alpha + \beta)' e^{-2\beta} - \frac{1}{r^2} \big\{ r(1 - e^{-2\beta}) \big\}' \\ G_{ii} &= \textrm{Ric}_{ii} - \frac{1}{2}Sg_{ii} = \textrm{Ric}_{ii} - \frac{1}{2}S = e^{-2\beta} (\alpha'' + (\alpha')^2 - \alpha'\beta') - \frac{1}{r}(\beta' - \alpha')e^{-2\beta} \\&= e^{-2\beta} \Big\{ (\alpha + \beta)'' + (\alpha + \beta)' \left( \alpha' - 2\beta' + \frac{1}{r} \right) \Big\} - \frac{1}{2r} \big\{ r(1 - e^{-2\beta}) \big\}'' \ \hspace{2em} \ \textrm{for} \ i = 3,4 \end{align} $$

In order for this metric to be a vacuum solution to Einstein's field equations, we must have that \(G + \Lambda g = 0\). In the second, third, and fourth components above, this means that \((\alpha + \beta)' \) must vanish identically. Basic high-school calculus tells us this occurs when \(\alpha + \beta = \textrm{const} \). Similarly, all four components above must also have that \( r(1 - e^{-2\beta}) = \frac{\Lambda}{3}r^3 + \textrm{const} \).

The constant from the first equation simply represents time dilation; thus, if we normalize the time coordinate we should have \(\alpha + \beta = 0\). However, the constant from the second equality carries far more weight than the first: this constant is typically denoted \(R_s\) and is known as the Schwarzschild radius.

Putting the conditions \(\alpha + \beta = 0\) and \(r(1 - e^{-2\beta}) = \frac{\Lambda}{3}r^3 + R_s \) together, we wind up with

$$ e^{2\alpha} = e^{-2\beta} = 1 - \frac{R_s}{r} - \frac{\Lambda}{3}r^2 $$

giving us the Schwarzschild-De Sitter metric

$$ g\vert_{dS} = -\left( 1 - \frac{R_s}{r} - \frac{\Lambda}{3}r^2 \right)\,dt^2 + \frac{1}{\left( 1 - \frac{R_s}{r} - \frac{\Lambda}{3}r^2 \right)}\, dr^2 + r^2\,d\theta^2 + r^2 \sin^2\theta \,d\phi^2 $$
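Since a lot of algebra was skipped on the way here, it is reassuring to check numerically that this metric really does satisfy \(G + \Lambda g = 0\). The sketch below plugs \(f = e^{2\alpha} = e^{-2\beta} = 1 - \frac{R_s}{r} - \frac{\Lambda}{3}r^2\) into the formulas for \(\textrm{Ric}_{ii}\) and \(S\) given earlier, using exact rational arithmetic. The function name `vacuum_residuals` and the sample values of \(r\), \(R_s\), \(\Lambda\) are my own; the frame components of \(g\) are taken to be \(\textrm{diag}(-1, 1, 1, 1)\).

```python
from fractions import Fraction

def vacuum_residuals(r, Rs, Lam):
    """Return (G11 + Lam*g11, G22 + Lam*g22, G33 + Lam*g33) in the orthonormal frame."""
    f = 1 - Rs / r - Lam * r**2 / 3        # f = e^{2 alpha} = e^{-2 beta}
    df = Rs / r**2 - 2 * Lam * r / 3       # f'
    d2f = -2 * Rs / r**3 - 2 * Lam / 3     # f''

    a1 = df / (2 * f)                      # alpha'  (alpha = log(f) / 2)
    a2 = (d2f * f - df**2) / (2 * f**2)    # alpha''
    b1 = -a1                               # beta' = -alpha' since alpha + beta = 0
    e = f                                  # e^{-2 beta}

    bracket = a2 + a1**2 - a1 * b1         # alpha'' + (alpha')^2 - alpha' beta'

    # Ricci components and scalar curvature, as in the formulas above.
    ric11 = e * bracket + (2 / r) * a1 * e
    ric22 = -e * bracket + (2 / r) * b1 * e
    ric33 = (b1 - a1) * e / r + (1 - e) / r**2
    S = -ric11 + ric22 + 2 * ric33         # Ric_44 equals Ric_33

    # Einstein tensor G = Ric - (1/2) S g with g = diag(-1, 1, 1, 1).
    G11 = ric11 + S / 2
    G22 = ric22 - S / 2
    G33 = ric33 - S / 2

    return (G11 - Lam, G22 + Lam, G33 + Lam)

# Sample point outside the event horizon and inside the cosmological horizon.
residuals = vacuum_residuals(Fraction(3), Fraction(1), Fraction(1, 10))
print(residuals)
```

All three residuals vanish exactly at the sample point, and the same holds for any other rational inputs with \(f \neq 0\).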






From this point forward, there are several directions we may go:

  1. Schwarzschild Spacetime : Prior to the 1998 discovery that the expansion of the universe is accelerating, many believed that the cosmological constant \(\Lambda\) was \(0\). For this reason, when \(\Lambda = 0\) we call $$ g_{Sw} = -\left( 1 - \frac{R_s}{r} \right)\,dt^2 + \frac{1}{1 - \frac{R_s}{r}}\,dr^2 + r^2\,d\theta^2 + r^2 \sin^2\theta\, d\phi^2 $$ the Schwarzschild metric. This metric has two singularities at \(r = 0\) and \(r = R_s\), implying that Schwarzschild spacetime is made up of two connected components \(M_1 = \mathbb{R} \times (0, R_s) \times S^2 \) and \(M_2 = \mathbb{R} \times (R_s, \infty) \times S^2\). The second singularity \(r = R_s\) is actually a removable singularity and represents a membrane which connects the two components \(M_1\) and \(M_2\); it is often referred to as an event horizon. The region \(M_1\) is additionally referred to as a Schwarzschild black hole, and the singularity \(r = 0\) is referred to as the gravitational singularity.

  2. De Sitter Space : When \(R_s = 0\) and \(\Lambda \gt 0\), spacetime is referred to as De Sitter Space with metric $$ g\vert_{dS} = -\left( 1 - \frac{\Lambda}{3}r^2 \right)\,dt^2 + \frac{1}{1 - \frac{\Lambda}{3}r^2 }\,dr^2 + r^2\,d\theta^2 + r^2\sin^2\theta \,d\phi^2 $$ Since observations indicate that \(\Lambda\) has a (small) positive value, this is viewed as a good model for our universe far away from black holes.

  3. Anti-De Sitter Space : When \(R_s = 0\) and \(\Lambda \lt 0\), spacetime is referred to as Anti-De Sitter Space (AdS). Under this model for the universe, the attractive forces of the universe would eventually cause it to stop expanding and eventually collapse in on itself like a balloon letting out air.

  4. Minkowski Space : When \(R_s = 0\) and \(\Lambda = 0\), the metric simply becomes the Minkowski metric written in spherical coordinates, as described in a previous blog post. Einstein's equivalence principle says that spacetime is locally Minkowski.
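
The four cases above differ only in the values of \(R_s\) and \(\Lambda\) plugged into the function \(f(r) = 1 - \frac{R_s}{r} - \frac{\Lambda}{3}r^2\) that appears in the \(dt^2\) and \(dr^2\) terms. Here is a minimal sketch of that bookkeeping in plain Python; the names `lapse` and `classify` are my own labels for illustration and are not standard terminology.

```python
import math

def lapse(r, R_s=0.0, Lambda=0.0):
    """Return f(r) = 1 - R_s/r - (Lambda/3) r^2, so that g_tt = -f and g_rr = 1/f."""
    return 1.0 - R_s / r - (Lambda / 3.0) * r**2

def classify(R_s, Lambda):
    """Name the special case selected by the parameters (R_s, Lambda)."""
    if R_s == 0:
        if Lambda > 0:
            return "De Sitter"
        if Lambda < 0:
            return "Anti-De Sitter"
        return "Minkowski"
    if Lambda == 0:
        return "Schwarzschild"
    return "Schwarzschild-De Sitter" if Lambda > 0 else "Schwarzschild-Anti-De Sitter"

# Schwarzschild: f vanishes at the Schwarzschild radius r = R_s.
schwarzschild_at_horizon = lapse(2.0, R_s=2.0)
# De Sitter: f vanishes at r = sqrt(3 / Lambda).
de_sitter_at_horizon = lapse(math.sqrt(3.0 / 0.75), Lambda=0.75)
# Minkowski: f is identically 1.
minkowski_value = lapse(5.0)

print(classify(2.0, 0.0), schwarzschild_at_horizon)
print(classify(0.0, 0.75), de_sitter_at_horizon)
print(classify(0.0, 0.0), minkowski_value)
```

The zeros of \(f\) are exactly the places where the \(dr^2\) coefficient blows up, which is why \(r = R_s\) shows up as a singularity of the Schwarzschild metric in these coordinates.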

As I mentioned, the singularity \(r = R_s\) is a removable singularity representing the event horizon of a Schwarzschild black hole, but how do we actually remove this singularity? Around 1958, Professor David Finkelstein discovered that by making the change of coordinates

$$ \tilde{r} = r + R_s \log \Big\lvert \frac{r}{R_s} - 1 \Big\rvert $$

one could ultimately resolve the singularity. The coordinate \(\tilde{r}\) is known as the tortoise coordinate on account of Zeno's paradox, in which swift-footed Achilles could never overtake a tortoise that had a head start (presumably because \( \tilde{r} \to -\infty \) as \(r \to R_s \), so the horizon sits infinitely far away in \(\tilde{r}\)). Under this change of coordinates, we have

$$ d\tilde{r} = \frac{1}{1 - \frac{R_s}{r}}\,dr \Rightarrow dr = (1 - \frac{R_s}{r})\,d\tilde{r} $$
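
If you'd rather not take the derivative on faith, here is a small numerical sanity check, written as a sketch in plain Python. It compares a finite-difference slope of \(\tilde{r}\) against \(\frac{1}{1 - R_s/r}\) and watches \(\tilde{r}\) run off toward \(-\infty\) near the horizon. The function names, the choice \(R_s = 1\), and the sample radii are arbitrary choices of mine.

```python
import math

def tortoise(r, R_s):
    """Tortoise coordinate r~ = r + R_s * log|r/R_s - 1| (undefined at r = R_s)."""
    return r + R_s * math.log(abs(r / R_s - 1.0))

def tortoise_slope(r, R_s, h=1e-6):
    """Central finite-difference estimate of d(r~)/dr."""
    return (tortoise(r + h, R_s) - tortoise(r - h, R_s)) / (2.0 * h)

R_s = 1.0

# Compare the numerical slope with 1 / (1 - R_s/r) outside and inside the horizon.
for r in (3.0, 1.5, 0.5):
    exact = 1.0 / (1.0 - R_s / r)
    print(r, tortoise_slope(r, R_s), exact)

# r~ decreases without bound as r approaches R_s from above.
approach = [tortoise(R_s * (1.0 + 10.0**-k), R_s) for k in (2, 4, 8)]
print(approach)
```

The absolute value inside the logarithm is what lets the same formula work on both sides of \(r = R_s\).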

so that our Schwarzschild metric becomes

$$ g\vert_{Sw} = \Big( 1 - \frac{R_s}{r} \Big) (-dt^2 + d\tilde{r}^2) + r^2(d\theta^2 + \sin^2\theta\,d\phi^2) $$

I'll skip over the calculations, but the radial geodesics \(\gamma\) of a travelling light ray in this system are given by

$$ (t \pm \tilde{r}) \circ \gamma = \textrm{const} $$

Therefore, we replace our time coordinate \(t\) with one of the coordinates \(\nu = t + \tilde{r} \) and \(u = t - \tilde{r} \), known as the ingoing Eddington-Finkelstein coordinate and outgoing Eddington-Finkelstein coordinate respectively. In the former, a simple differentiation shows us that

$$ -dt^2 + d\tilde{r}^2 = -d\nu(d\nu - 2d\tilde{r}) $$

so the Schwarzschild metric becomes

$$ g\vert_{Sw} = -\Big( 1 - \frac{R_s}{r} \Big)\,d\nu^2 + 2d\nu dr + r^2 (d\theta^2 + \sin^2\theta d\phi^2) $$

Since \(\det [g_{ij}] = -r^4 \sin^2\theta \neq 0 \) for \(r \gt 0\) (away from the poles \(\sin\theta = 0\)), the metric is non-degenerate at the Schwarzschild radius \(r = R_s\). Our previous equation \((t \pm \tilde{r}) \circ \gamma = \textrm{const} \) implies that \(\nu \circ \gamma = \textrm{const} \) for ingoing light and \((\nu - 2 \tilde{r}) \circ \gamma = \textrm{const} \) for outgoing light. Consequently, light cannot cross the event horizon \(r = R_s\) from the inside, and every light ray in the region \(r \lt R_s\) is bound to fall into the singularity \(r = 0\). Ultimately, this shows that the Schwarzschild model for spacetime is geodesically incomplete.

Ingoing Eddington Finkelstein coordinates
Ingoing (dashed-red) and outgoing (solid blue) null geodesics in ingoing Eddington-Finkelstein coordinates. Adapted from Black Holes, Fay Dowker. Imperial College of London, 2015.
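
To see the trapping numerically, set \(d\theta = d\phi = 0\) and \(ds^2 = 0\) in the ingoing metric. This gives \(d\nu\left(-\left(1 - \frac{R_s}{r}\right)d\nu + 2\,dr\right) = 0\), so either \(\nu\) is constant (ingoing light) or \(\frac{dr}{d\nu} = \frac{1}{2}\left(1 - \frac{R_s}{r}\right)\) (outgoing light). The sketch below integrates the second equation with a crude Euler scheme; the step size, starting radii, and function name are illustrative assumptions, and a serious computation would want a better integrator.

```python
def outgoing_null_ray(r0, R_s=1.0, d_nu=1e-3, steps=2000):
    """Euler-integrate dr/d(nu) = (1 - R_s/r) / 2, the outgoing radial null
    geodesic equation in ingoing Eddington-Finkelstein coordinates.
    Integration stops early if the ray reaches the singularity r = 0."""
    r = r0
    path = [r]
    for _ in range(steps):
        r += 0.5 * (1.0 - R_s / r) * d_nu
        if r <= 0.0:
            break
        path.append(r)
    return path

inside = outgoing_null_ray(0.9)    # starts just inside the horizon
outside = outgoing_null_ray(1.1)   # starts just outside the horizon

print(inside[0], inside[-1])    # the "outgoing" ray inside the horizon moves inward
print(outside[0], outside[-1])  # the ray outside the horizon moves outward
```

The sign of \(1 - \frac{R_s}{r}\) does all the work here: it is negative for \(r \lt R_s\), so even the "outgoing" family has decreasing \(r\) inside the horizon.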

On the flip-side, when we consider the outgoing Eddington-Finkelstein coordinate \(u\), we have that

$$ -dt^2 + d\tilde{r}^2 = -du(du + 2\,d\tilde{r}) $$

so that the Schwarzschild metric becomes

$$ g\vert_{Sw} = -\Big( 1 - \frac{R_s}{r} \Big)du^2 - 2du\,dr + r^2 (d\theta^2 + \sin^2\theta\, d\phi^2) $$

In this case, our equations for the geodesics become \((u + 2\tilde{r}) \circ \gamma = \textrm{const} \) for ingoing light and \( u \circ \gamma = \textrm{const} \) for outgoing light. Thus, light cannot cross the event horizon \(r = R_s\) from the outside; it can only cross from the inside. For this reason, the region \( \{ r \lt R_s \} \) is known as a white hole.

Outgoing Eddington Finkelstein coordinates
Ingoing (dashed-red) and outgoing (solid blue) null geodesics in outgoing Eddington-Finkelstein coordinates. Adapted from Black Holes, Fay Dowker. Imperial College of London, 2015.

You may see a pretty big inconsistency here: the ingoing and outgoing Eddington-Finkelstein coordinates extend the exterior region in two different ways, one containing a black hole and the other a white hole. In other words, black holes and white holes are completely different physical and geometric objects. However, we may resolve this issue by embedding them as a pair in a larger Lorentzian manifold.

As before, take \(u = t - \tilde{r}\) and \(\nu = t + \tilde{r}\). Using this change of coordinates, we may represent our previous coordinates by \(t = \frac{1}{2} (u + \nu) \) and \(\tilde{r} = \frac{1}{2} (\nu - u) \). We define two new coordinate functions

$$ \tilde{\nu} = \textrm{exp}\left(\frac{\nu}{2R_s} \right) \hspace{3em} \tilde{u} = -\textrm{exp}\left( -\frac{u}{2R_s} \right) $$

to resolve any singularities at \(r = R_s\). Again, we may attempt to convert this over to our previous coordinate system by \( T = \frac{1}{2} (\tilde{\nu} + \tilde{u}) \) and \( R = \frac{1}{2} (\tilde{\nu} - \tilde{u}) \). It should be a fairly simple high-school level exercise to show that

$$ R^2 - T^2 = -\tilde{\nu}\tilde{u} = \left( \frac{r}{R_s} - 1 \right) \, \textrm{exp}\left( \frac{r}{R_s} \right) $$

We can represent our new coordinates explicitly by

$$ \begin{align} T &= \sqrt{ \left(\frac{r}{R_s} - 1\right)}\, \textrm{exp}\left( \frac{r}{2R_s} \right)\, \sinh \left( \frac{t}{2R_s} \right) \\ R &= \sqrt{ \left(\frac{r}{R_s} - 1\right)} \,\textrm{exp}\left( \frac{r}{2R_s} \right)\, \cosh \left( \frac{t}{2R_s} \right) \end{align} $$

when \( r \gt R_s\), and

$$ \begin{align} T &= \sqrt{ \left(1 - \frac{r}{R_s}\right)} \,\textrm{exp}\left( \frac{r}{2R_s} \right)\, \cosh \left( \frac{t}{2R_s} \right) \\ R &= \sqrt{ \left(1 - \frac{r}{R_s}\right)}\, \textrm{exp}\left( \frac{r}{2R_s} \right)\, \sinh \left( \frac{t}{2R_s} \right) \end{align} $$

when \(0 \lt r \lt R_s\). This gives an inverse transform of \( \tanh\left( \frac{t}{2R_s} \right) = \frac{T}{R} \) for \(r \gt R_s\) and \(\tanh\left( \frac{t}{2R_s} \right) = \frac{R}{T}\) for \(0 \lt r \lt R_s \). Using these transformations, we get that
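
Here is a quick numerical check of these formulas, as a sketch in plain Python. One assumption to flag: the code uses \(\sqrt{\lvert r/R_s - 1 \rvert}\) as the prefactor so that the square root stays real on both sides of the horizon. The function names, the choice \(R_s = 1\), and the sample points are mine.

```python
import math

def kruskal(t, r, R_s=1.0):
    """Map Schwarzschild (t, r), r != R_s, to Kruskal-Szekeres (T, R).
    Uses sqrt(|r/R_s - 1|) so the root is real on both sides of r = R_s."""
    amp = math.sqrt(abs(r / R_s - 1.0)) * math.exp(r / (2.0 * R_s))
    s = math.sinh(t / (2.0 * R_s))
    c = math.cosh(t / (2.0 * R_s))
    if r > R_s:
        return amp * s, amp * c   # exterior: T uses sinh, R uses cosh
    return amp * c, amp * s       # interior: T uses cosh, R uses sinh

def check(t, r, R_s=1.0):
    """Return the residuals of R^2 - T^2 = (r/R_s - 1) exp(r/R_s)
    and of the inverse relation for tanh(t / 2R_s)."""
    T, R = kruskal(t, r, R_s)
    radial = (R**2 - T**2) - (r / R_s - 1.0) * math.exp(r / R_s)
    ratio = T / R if r > R_s else R / T
    return radial, ratio - math.tanh(t / (2.0 * R_s))

print(check(0.7, 2.5))   # exterior sample point
print(check(0.7, 0.4))   # interior sample point
```

Both residuals should come out at the level of floating-point round-off. Note that \(R^2 - T^2\) is positive outside the horizon and negative inside, which is what separates the regions of the \((T, R)\) plane.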

$$ \begin{align} -dT^2 + dR^2 &= -d\tilde{\nu}d\tilde{u} = \frac{1}{4R_s^2} (- \tilde{\nu}\tilde{u})(- d\nu du) \\&= \frac{r}{4R_s^3}\left( 1 - \frac{R_s}{r} \right)\,\textrm{exp}\left( \frac{r}{R_s} \right)(-dt^2 + d\tilde{r}^2) \end{align} $$

This new system of coordinates \( (T, R, \theta, \phi) \) is known as Kruskal-Szekeres coordinates, and ultimately transforms our Schwarzschild metric into the following form:

$$ g\vert_{KS} = \frac{4R_s^3}{r} \, \textrm{exp}\left( -\frac{r}{R_s} \right)(-dT^2 + dR^2) + r^2 (d\theta^2 + \sin^2\theta\,d\phi^2) $$
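
As a sanity check on the conformal factor, the sketch below pulls \(\frac{4R_s^3}{r}\,\textrm{exp}\left(-\frac{r}{R_s}\right)(-dT^2 + dR^2)\) back to the \((t, r)\) coordinates using finite-difference partial derivatives, and compares the result with the Schwarzschild components \(-\left(1 - \frac{R_s}{r}\right)\), \(0\), and \(\frac{1}{1 - R_s/r}\). It is written in plain Python with \(R_s = 1\); the helper names are hypothetical, and the prefactor \(\sqrt{\lvert r/R_s - 1\rvert}\) is an assumption that keeps the square root real inside the horizon.

```python
import math

def kruskal(t, r, R_s=1.0):
    """Schwarzschild (t, r), r != R_s, to Kruskal-Szekeres (T, R)."""
    amp = math.sqrt(abs(r / R_s - 1.0)) * math.exp(r / (2.0 * R_s))
    s = math.sinh(t / (2.0 * R_s))
    c = math.cosh(t / (2.0 * R_s))
    return (amp * s, amp * c) if r > R_s else (amp * c, amp * s)

def central_diff(func, x, h=1e-6):
    """Central difference of a function that returns a (T, R) pair."""
    (T_plus, R_plus), (T_minus, R_minus) = func(x + h), func(x - h)
    return (T_plus - T_minus) / (2.0 * h), (R_plus - R_minus) / (2.0 * h)

def pulled_back_metric(t, r, R_s=1.0):
    """Components (g_tt, g_tr, g_rr) of F(r) * (-dT^2 + dR^2) in (t, r)."""
    F = 4.0 * R_s**3 / r * math.exp(-r / R_s)
    T_t, R_t = central_diff(lambda x: kruskal(x, r, R_s), t)
    T_r, R_r = central_diff(lambda x: kruskal(t, x, R_s), r)
    g_tt = F * (-T_t**2 + R_t**2)
    g_tr = F * (-T_t * T_r + R_t * R_r)
    g_rr = F * (-T_r**2 + R_r**2)
    return g_tt, g_tr, g_rr

for t, r in ((0.7, 2.5), (0.7, 0.4)):
    f = 1.0 - 1.0 / r
    print(pulled_back_metric(t, r), (-f, 0.0, 1.0 / f))
```

The two tuples on each printed line should agree up to finite-difference error, both outside and inside the horizon.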

This gives us a new Lorentzian manifold \(\mathcal{M} = \{ (T, R) \in \mathbb{R}^2 \mid T^2 - R^2 \lt 1 \} \times S^2 \) with four regions of interest:

$$ \begin{align} \mathcal{M}_{I} &= \{ (T,R, \theta, \phi) \in \mathcal{M} \mid |T| \lt R \} \\ \mathcal{M}_{II} &= \{ (T, R, \theta, \phi) \in \mathcal{M} \mid |R| \lt T \lt \sqrt{R^2 + 1} \} \\ \mathcal{M}_{III} &= \{ (T, R, \theta, \phi) \in \mathcal{M} \mid |T| \lt -R \} \\ \mathcal{M}_{IV} &= \{ (T, R, \theta, \phi) \in \mathcal{M} \mid -\sqrt{1 + R^2} \lt T \lt -|R| \} \end{align} $$
A depiction of the four regions in Kruskal Szekeres coordinates
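
To make the picture concrete, here is a small sketch that sorts a point of the \((T, R)\) plane into one of the four regions. It assumes that \(\mathcal{M}_{III}\) is the mirror image of \(\mathcal{M}_I\), that is \(R \lt -|T|\), and it returns `None` on the horizons \(|T| = |R|\) and for points outside \(\mathcal{M}\). The function name is my own.

```python
def kruskal_region(T, R):
    """Classify a point of the (T, R) plane into the four Kruskal regions.
    Assumes region III is the mirror image of region I (R < -|T|).
    Returns None on the horizons |T| = |R| and outside the manifold."""
    if T**2 - R**2 >= 1.0:
        return None              # at or beyond the singularity T^2 - R^2 = 1
    if abs(T) < R:
        return "I"               # exterior region
    if abs(T) < -R:
        return "III"             # second exterior region
    if T > abs(R):
        return "II"              # black hole interior
    if T < -abs(R):
        return "IV"              # white hole interior
    return None                  # on a horizon, |T| = |R|

for point in ((0.0, 1.0), (0.5, 0.0), (0.0, -1.0), (-0.5, 0.0), (2.0, 0.0)):
    print(point, kruskal_region(*point))
```

The angular coordinates \((\theta, \phi)\) play no role in the classification, so they are left out.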


The regions \(\mathcal{M}_I\) and \(\mathcal{M}_{III}\) are both isometric to the external Schwarzschild spacetime; however, these two regions cannot be connected by any null geodesic. The correspondence between these two spaces is known as an Einstein-Rosen bridge. If you follow the link, you'll notice that there are mathematically rigorous ways to describe the change of coordinates used to describe such a bridge.

Similarly, the regions \(\mathcal{M}_{II}\) and \(\mathcal{M}_{IV}\) are our black hole and white hole, respectively. The purple-dotted line represents the singularity of our black / white hole — it is given by the hyperbola \(T^2 - R^2 = 1\) (yes, I know that I didn't draw it like a hyperbola in the picture).

I'll go ahead and wrap things up here since these topics open up into a much broader theory, and we have already accomplished what we set out to achieve. Interestingly enough, once you make it past deriving the Schwarzschild metric, all the remaining material simply becomes high-school algebra again! In the end, you should understand that white holes and Einstein-Rosen bridges are mathematical abstractions, not physical entities. Hopefully not too many of you are let down that I took the mathematically rigorous approach and not the clickbait-scifi approach, but at least you'll know what you're talking about the next time some coworker tries to explain Interstellar or Donnie Darko to you.

Thanks for reading! 😁