Dirac equation

The Dirac equation is a relativistic quantum mechanical wave equation formulated by Paul Dirac in 1928. It provides a description of elementary spin-½ particles, such as electrons, that is fully consistent with both the principles of quantum mechanics and the theory of special relativity. It also accounts in a natural way for the nature of particle spin and the existence of antiparticles.
Introduction
Since the Dirac equation was originally formulated to describe the electron, we will generally speak of "electrons" in this article. The equation also applies to other types of elementary spin-½ particles, such as quarks (but not neutrinos, because the left-handed and the hypothetical right-handed neutrinos have different Majorana masses). A modified Dirac equation can be used to approximately describe protons and neutrons, which are made of smaller particles called quarks and are therefore not elementary particles.
The Dirac equation is
 <math> \left(\alpha_0 mc^2 + \sum_{j = 1}^3 \alpha_j p_j \, c\right) \psi (\mathbf{x},t) = i \hbar \frac{\partial\psi}{\partial t} (\mathbf{x},t) <math>
where m is the rest mass of the electron, c is the speed of light, p is the momentum operator, <math>\hbar<math> is the reduced Planck constant, x and t are the space and time coordinates respectively, and ψ(x, t) is a four-component wavefunction. (The wavefunction has to be formulated as a four-component spinor, rather than a simple scalar, due to the demands of special relativity. The physical meanings of the components are discussed below.) The α's are linear operators acting on the wavefunction; when the wavefunction is written as a column matrix, they are represented by 4×4 matrices known as Dirac matrices. There is more than one way to choose a set of Dirac matrices, a convenient choice being
 <math>\alpha_0 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{bmatrix} \quad \alpha_1 = \begin{bmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{bmatrix} <math>
 <math>\alpha_2 = \begin{bmatrix} 0 & 0 & 0 & -i \\ 0 & 0 & i & 0 \\ 0 & -i & 0 & 0 \\ i & 0 & 0 & 0 \end{bmatrix} \quad \alpha_3 = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \\ 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \end{bmatrix} <math>
All possible choices are related by similarity transformations, because the irreducible four-dimensional matrix representation of the algebra obeyed by the Dirac matrices is unique up to equivalence.
The Dirac equation describes the probability amplitudes for a single electron. This single-particle theory gives a fairly good prediction of the spin and magnetic moment of the electron and explains much of the fine structure observed in atomic spectral lines. It also makes the peculiar prediction that there exists an infinite set of quantum states in which the electron possesses negative energy. This strange result led Dirac to predict, via a remarkable hypothesis known as "hole theory", the existence of particles behaving like positively charged electrons. This prediction was verified by the discovery of the positron in 1932.
Despite these successes, the theory is flawed by its neglect of the possibility of creating and destroying particles, one of the basic consequences of relativity. This difficulty is resolved by reformulating it as a quantum field theory. Adding a quantized electromagnetic field to this theory leads to the modern theory of quantum electrodynamics (QED). For a more detailed discussion of the field formulation, refer to the article on Dirac field theory.
The analogous equation for spin-3/2 particles is called the Rarita-Schwinger equation.
A hand-waving derivation of the Dirac equation
The Dirac equation is a special case of the Schrödinger equation, which describes the time evolution of a quantum mechanical system:
 <math> H \left| \psi (t) \right\rangle = i \hbar {d\over d t} \left| \psi (t) \right\rangle<math>
For convenience, we will work in the position basis, in which the state of the system is represented by a wavefunction, ψ(x,t). In this basis, the Schrödinger equation becomes
 <math> H \psi (\mathbf{x},t) = i \hbar \frac{\partial\psi}{\partial t} (\mathbf{x},t) <math>
where the Hamiltonian H now denotes an operator acting on wavefunctions rather than state vectors.
We have to specify the Hamiltonian so that it appropriately describes the total energy of the system in question. Let us consider a "free" electron isolated from all external force fields. For a nonrelativistic model, we adopt a Hamiltonian analogous to the kinetic energy of classical mechanics (ignoring spin for the moment):
 <math> H = \sum_{j=1}^3 \frac{p_j^2}{2m} <math>
where the p's are the momentum operators in each of the three spatial directions j=1,2,3. Each momentum operator acts on the wavefunction as a spatial derivative:
 <math>p_j \psi(\mathbf{x},t) \equiv - i \hbar \, \frac{\partial\psi}{\partial x_j} (\mathbf{x},t)<math>
To describe a relativistic system, we have to find a different Hamiltonian. Assume that the momentum operators retain the above definition. According to Albert Einstein's famous mass-momentum-energy relationship, the total energy of a system is given by
 <math>E = \sqrt{(mc^2)^2 + \sum_{j=1}^3 (p_jc)^2}<math>
This prescribes something like
 <math> \sqrt{(mc^2)^2 + \sum_{j=1}^3 (p_jc)^2} \; \psi = i \hbar \frac{d\psi}{d t} <math>
This is not a satisfactory equation, for it does not treat time and space on an equal footing, one of the basic tenets of special relativity. The square of this equation leads to the Klein-Gordon equation. Dirac reasoned that, since the right side of the equation contains a first-order derivative in time, the left side should contain equally simple first-order derivatives in space (i.e., in the momentum operators). One way for this to happen is if the quantity in the square root is a perfect square. Suppose that you set
 <math>E \cdot I = \alpha_0 mc^2 + c \sum_{i=1}^3 \alpha_i p_i <math>
Here, I stands for the identity element. Replacing E by the time-derivative operator, as in the Schrödinger equation, yields the free Dirac equation:
 <math>i\hbar \frac{d\psi}{dt} = \left[ c \sum_{i=1}^3 \alpha_i p_i + \alpha_0 mc^2 \right] \psi<math>
where the α's are constants to be determined by requiring that the square of this expression reproduce the relativistic energy relation:
 <math> E^2 = (mc^2)^2 + \sum_{j=1}^3 (p_jc)^2 = \left( \alpha_0 mc^2 + \sum_{j=1}^3 \alpha_j p_j \, c \right)^2 <math>
Expanding the square and comparing coefficients on each side, we obtain the following conditions for the α's:
 <math> \alpha_0^2 = I <math>
 <math> \alpha_i \alpha_0 + \alpha_0 \alpha_i = 0 \,,\quad \alpha_i \alpha_j + \alpha_j \alpha_i = 2\delta_{ij} \cdot I \,,\quad i,j = 1, 2, 3 <math>
These last conditions may be written more concisely as
 <math>\left\{\alpha_\mu , \alpha_\nu\right\} = 2\delta_{\mu \nu} \cdot I \,,\quad \mu,\nu = 0, 1, 2, 3<math>
where {...} is the anticommutator, defined as {A,B}≡AB+BA, and δ is the Kronecker delta, which has the value 1 if its two subscripts are equal and 0 otherwise. See Clifford algebra.
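The comparison of coefficients can be spelled out as a short worked expansion. This is only a sketch: it assumes that the α's commute with the momentum operators and that the momentum components commute with one another, which holds for a free particle.

```latex
\begin{aligned}
\Big( \alpha_0 mc^2 + c \sum_{j=1}^3 \alpha_j p_j \Big)^2
 &= \alpha_0^2 \,(mc^2)^2
  + mc^3 \sum_{j=1}^3 \big( \alpha_0 \alpha_j + \alpha_j \alpha_0 \big)\, p_j
  + c^2 \sum_{j,k=1}^3 \alpha_j \alpha_k \, p_j p_k \\
 &= \alpha_0^2 \,(mc^2)^2
  + mc^3 \sum_{j=1}^3 \{ \alpha_0 , \alpha_j \}\, p_j
  + \frac{c^2}{2} \sum_{j,k=1}^3 \{ \alpha_j , \alpha_k \}\, p_j p_k
\end{aligned}
```

The last step symmetrises the double sum using p_j p_k = p_k p_j. The result equals (mc²)² + Σ (p_j c)² for arbitrary momenta only if α_0² = I, {α_0, α_j} = 0 and {α_j, α_k} = 2δ_{jk} I.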
These conditions cannot be satisfied if the α's are ordinary numbers, but they can be satisfied if the α's are matrices. The matrices must be Hermitian, so that the Hamiltonian is Hermitian. The smallest matrices that work are 4×4 matrices, but there is more than one possible choice, or representation, of matrices. Although the choice of representation does not affect the properties of the Dirac equation, it does affect the physical meaning of the individual components of the wavefunction.
In the introduction, we presented the representation used by Dirac. This representation can be more compactly written as
 <math>\alpha_0 = \begin{bmatrix} I & 0 \\ 0 & -I \end{bmatrix} \quad \alpha_j = \begin{bmatrix} 0 & \sigma_j \\ \sigma_j & 0 \end{bmatrix} <math>
where 0 and I are the 2×2 zero and identity matrices, respectively, and the σ_{j}'s (j=1,2,3) are the Pauli matrices.
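The algebraic conditions on the α's can be checked numerically for this representation. The following is a minimal pure-Python sketch; the helper names (`matmul`, `block`, `clifford_ok`, and so on) are illustrative choices, and the block layout assumed is α_0 = diag(I, −I) with σ_j in the off-diagonal blocks of α_j.

```python
# Check {alpha_mu, alpha_nu} = 2 * delta_mu_nu * I and Hermiticity
# for the Dirac representation built from the Pauli matrices.
# All helper names are hypothetical; nothing here is a library API.

def matmul(a, b):
    """Multiply two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matadd(a, b):
    """Add two matrices entry by entry."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [tl[0] + tr[0], tl[1] + tr[1], bl[0] + br[0], bl[1] + br[1]]

ZERO2 = [[0, 0], [0, 0]]
I2 = [[1, 0], [0, 1]]
NEG_I2 = [[-1, 0], [0, -1]]
SIGMA = [
    [[0, 1], [1, 0]],      # sigma_1
    [[0, -1j], [1j, 0]],   # sigma_2
    [[1, 0], [0, -1]],     # sigma_3
]

# ALPHA[0] is alpha_0; ALPHA[1..3] are alpha_1..alpha_3.
ALPHA = [block(I2, ZERO2, ZERO2, NEG_I2)] + [block(ZERO2, s, s, ZERO2) for s in SIGMA]

def anticommutator(a, b):
    return matadd(matmul(a, b), matmul(b, a))

def is_hermitian(a):
    n = len(a)
    return all(a[i][j] == complex(a[j][i]).conjugate()
               for i in range(n) for j in range(n))

def clifford_ok(mats):
    """True if every pair of matrices anticommutes to 2 * delta * identity."""
    n = len(mats[0])
    for mu, a in enumerate(mats):
        for nu, b in enumerate(mats):
            expected = [[2 if (mu == nu and i == j) else 0 for j in range(n)]
                        for i in range(n)]
            if anticommutator(a, b) != expected:
                return False
    return True

if __name__ == "__main__":
    print("anticommutation relations hold:", clifford_ok(ALPHA))
    print("all matrices Hermitian:", all(is_hermitian(a) for a in ALPHA))
```

Because the entries are small integers and multiples of i, the arithmetic is exact and the comparison needs no tolerance.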
The Hamiltonian in this equation,
 <math> H = \,\alpha_0 mc^2 + \sum_{j = 1}^3 \alpha_j p_j \, c <math>
is called the Dirac Hamiltonian.
Nature of the wavefunction
Since the wavefunction ψ is acted on by the 4×4 Dirac matrices, it must be a four-component object. We will see, in the next section, that the wavefunction contains two sets of degrees of freedom, one associated with positive energies and the other with negative energies, with each set containing two degrees of freedom that describe the probability amplitudes for the spin to be pointing "up" or "down" along a specified direction.
We may explicitly write the wavefunction as a column matrix:
 <math>\psi(\mathbf{x},t) \equiv \begin{bmatrix}\psi_1(\mathbf{x},t) \\ \psi_2(\mathbf{x},t) \\ \psi_3(\mathbf{x},t) \\ \psi_4(\mathbf{x},t) \end{bmatrix} <math>
The dual wavefunction can be written as a row matrix:
 <math>\psi^\dagger(\mathbf{x},t) \equiv \begin{bmatrix}\psi_1^*(\mathbf{x},t) & \psi_2^*(\mathbf{x},t) & \psi_3^*(\mathbf{x},t) & \psi_4^*(\mathbf{x},t) \end{bmatrix} <math>
where the * superscript denotes complex conjugation. By comparison, the dual of a scalar (one-component) wavefunction is just its complex conjugate.
As in ordinary single-particle quantum mechanics, the "absolute square" of the wavefunction gives the probability density of the particle at each position x and time t. In this case, the "absolute square" is the scalar product of the wavefunction with its dual:
 <math>\psi^\dagger \psi \, (\mathbf{x},t) = \sum_{j = 1}^4 \psi_j^*(\mathbf{x},t) \psi_j(\mathbf{x},t) <math>
The conservation of probability gives the normalization condition
 <math>\int \psi^\dagger \psi \, (\mathbf{x},t) \; d^3x = 1 <math>
By applying Dirac's equation, we can examine the local flow of probability:
 <math>\frac{\partial}{\partial t} \psi^\dagger \psi \, (\mathbf{x},t) = - \nabla \cdot \mathbf{J} <math>
The probability current J is given by
 <math> J_j = c \psi^\dagger \alpha_j \psi<math>
Multiplying J by the electron charge e yields the electric current density j carried by the electron.
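The continuity equation follows from the free Dirac equation in a few lines. The sketch below assumes Hermitian α's and the position-space form p_j = −iħ ∂/∂x_j of the momentum operator.

```latex
\begin{aligned}
\frac{\partial \psi}{\partial t}
 &= \frac{mc^2}{i\hbar}\, \alpha_0 \psi - c \sum_{j=1}^3 \alpha_j \frac{\partial \psi}{\partial x_j} ,
\qquad
\frac{\partial \psi^\dagger}{\partial t}
  = -\frac{mc^2}{i\hbar}\, \psi^\dagger \alpha_0 - c \sum_{j=1}^3 \frac{\partial \psi^\dagger}{\partial x_j} \alpha_j \\
\frac{\partial}{\partial t} \big( \psi^\dagger \psi \big)
 &= \psi^\dagger \frac{\partial \psi}{\partial t} + \frac{\partial \psi^\dagger}{\partial t} \psi
  = - c \sum_{j=1}^3 \left( \psi^\dagger \alpha_j \frac{\partial \psi}{\partial x_j}
      + \frac{\partial \psi^\dagger}{\partial x_j} \alpha_j \psi \right)
  = - \sum_{j=1}^3 \frac{\partial}{\partial x_j} \big( c\, \psi^\dagger \alpha_j \psi \big)
\end{aligned}
```

The two mass terms cancel, and what remains is the divergence of J_j = c ψ†α_jψ with a minus sign.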
The values of the wavefunction components depend on the coordinate system. Dirac showed how ψ transforms under general changes of coordinate system, including rotations in three-dimensional space as well as Lorentz transformations between relativistic frames of reference. It turns out that ψ does not transform like a vector under rotations and is in fact a type of object known as a spinor.
Energy spectrum
It is instructive to find the energy eigenstates of the Dirac Hamiltonian. To do this, we must solve the time-independent Schrödinger equation,
 <math>H \psi_0 (\mathbf{x}) = E \psi_0(\mathbf{x}) <math>
where ψ_{0} is the time-independent part of the energy eigenfunction:
 <math>\psi (\mathbf{x}, t) = \psi_0 (\mathbf{x}) e^{- i E t / \hbar} <math>
Let us look for a plane-wave solution. For convenience, we align the z axis with the direction in which the particle is moving, so that
 <math> \psi_0 = w e^{\frac{ipz}{\hbar}} <math>
where w is a constant four-component spinor and p is the momentum of the particle, as we can verify by applying the momentum operator to this wavefunction. In the Dirac representation, the equation for ψ_{0} reduces to the eigenvalue equation:
 <math> \begin{bmatrix} mc^2 & 0 & pc & 0 \\ 0 & mc^2 & 0 & -pc \\ pc & 0 & -mc^2 & 0 \\ 0 & -pc & 0 & -mc^2 \end{bmatrix} w = E w <math>
For each value of p, there are two eigenspaces, both two-dimensional. One eigenspace contains positive eigenvalues, and the other negative eigenvalues, of the form:
 <math>E_\pm (p) = \pm \sqrt{(mc^2)^2 + (pc)^2}<math>
The positive eigenspace is spanned by the eigenstates:
 <math>\left\{ \begin{bmatrix}pc \\ 0 \\ \epsilon \\ 0 \end{bmatrix} \,,\, \begin{bmatrix}0 \\ pc \\ 0 \\ -\epsilon \end{bmatrix} \right\} \times \frac{1}{\sqrt{\epsilon^2+(pc)^2}}<math>
and the negative eigenspace by the eigenstates:
 <math>\left\{ \begin{bmatrix}-\epsilon \\ 0 \\ pc \\ 0 \end{bmatrix} \,,\, \begin{bmatrix}0 \\ \epsilon \\ 0 \\ pc \end{bmatrix} \right\} \times \frac{1}{\sqrt{\epsilon^2+(pc)^2}}<math>
where
 <math>\epsilon \equiv |E| - mc^2<math>
The first spanning eigenstate in each eigenspace has spin pointing in the +z direction ("spin up"), and the second eigenstate has spin pointing in the −z direction ("spin down").
In the nonrelativistic limit, the ε spinor component reduces to the kinetic energy of the particle, which is negligible compared to pc:
 <math>\epsilon \sim \frac{p^2}{2m} \ll pc <math>
In this limit, therefore, we can interpret the four wavefunction components as the respective amplitudes of (i) spin-up with positive energy, (ii) spin-down with positive energy, (iii) spin-up with negative energy, and (iv) spin-down with negative energy. This description is not accurate in the relativistic regime, where the nonzero spinor components have similar sizes.
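The plane-wave eigenstates can be verified numerically. The sketch below is one way to do so in pure Python; it assumes the Dirac representation with α_0 = diag(1, 1, −1, −1), motion along z with nonzero p, and ε = |E| − mc². The function names and the sample values of m, p and c are arbitrary illustrative choices.

```python
import math

def dirac_matrix(m, p, c=1.0):
    """4x4 matrix of H = alpha_0 m c^2 + alpha_3 p c for a plane wave along z."""
    mc2, pc = m * c**2, p * c
    return [
        [mc2, 0.0, pc, 0.0],
        [0.0, mc2, 0.0, -pc],
        [pc, 0.0, -mc2, 0.0],
        [0.0, -pc, 0.0, -mc2],
    ]

def apply(h, w):
    """Matrix-vector product h @ w for a 4x4 matrix and a 4-component spinor."""
    return [sum(h[i][k] * w[k] for k in range(4)) for i in range(4)]

def eigen_pairs(m, p, c=1.0):
    """Return (energy, normalised spinor) pairs; requires p != 0."""
    mc2, pc = m * c**2, p * c
    e_plus = math.sqrt(mc2**2 + pc**2)
    eps = e_plus - mc2                    # epsilon = |E| - m c^2
    norm = math.sqrt(eps**2 + pc**2)

    def scale(v):
        return [x / norm for x in v]

    return [
        (+e_plus, scale([pc, 0.0, eps, 0.0])),    # positive energy, spin up
        (+e_plus, scale([0.0, pc, 0.0, -eps])),   # positive energy, spin down
        (-e_plus, scale([-eps, 0.0, pc, 0.0])),   # negative energy, spin up
        (-e_plus, scale([0.0, eps, 0.0, pc])),    # negative energy, spin down
    ]

def max_residual(m, p, c=1.0):
    """Largest component of H w - E w over the four candidate eigenstates."""
    h = dirac_matrix(m, p, c)
    worst = 0.0
    for energy, w in eigen_pairs(m, p, c):
        hw = apply(h, w)
        worst = max(worst, max(abs(hw[i] - energy * w[i]) for i in range(4)))
    return worst

if __name__ == "__main__":
    print("max residual:", max_residual(1.0, 0.75))
```

A residual at the level of floating-point rounding indicates that each spinor is an eigenvector with the stated eigenvalue. For small p the third and fourth components of the positive-energy spinors become small compared with the first two, in line with the nonrelativistic interpretation above.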
Hole theory
The negative E solutions found in the preceding section are problematic, for relativistic mechanics tells us that the energy of a particle at rest (p = 0) should be E = mc² rather than E = −mc². Mathematically speaking, however, there seems to be no reason for us to reject the negative-energy solutions. Since they exist, we cannot simply ignore them, for once we include the interaction between the electron and the electromagnetic field, any electron placed in a positive-energy eigenstate would decay into negative-energy eigenstates of successively lower energy by emitting excess energy in the form of photons. Real electrons obviously do not behave in this way.
To cope with this problem, Dirac introduced the hypothesis, known as hole theory, that the vacuum is the many-body quantum state in which all the negative-energy electron eigenstates are occupied. This description of the vacuum as a "sea" of electrons is called the Dirac sea. Since the Pauli exclusion principle forbids electrons from occupying the same state, any additional electron would be forced to occupy a positive-energy eigenstate, and positive-energy electrons would be forbidden from decaying into negative-energy eigenstates.
Dirac further reasoned that if the negative-energy eigenstates are incompletely filled, each unoccupied eigenstate – called a hole – would behave like a positively charged particle. The hole possesses a positive energy, since energy is required to create a particle–hole pair from the vacuum. Dirac initially thought that the hole was a proton, but Hermann Weyl pointed out that the hole should behave as if it had the same mass as an electron, whereas the proton is about 1800 times heavier. The hole was eventually identified as the positron, experimentally discovered by Carl Anderson in 1932.
However, it is not entirely satisfactory to describe the "vacuum" using an infinite sea of negative-energy electrons. We must postulate that the negative-energy electrons do not contribute to the total energy and momentum of the vacuum, which would otherwise be infinite, and that the negative-energy electrons do not produce an electric field, although they can be affected by an external field. These difficulties led physicists to abandon hole theory in favour of Dirac field theory, which bypasses the problem of negative energy states by treating positrons as true particles. (Caveat: in certain applications of condensed matter physics, the underlying concepts of "hole theory" are certainly valid. The sea of conduction electrons in an electrical conductor, called a Fermi sea, contains electrons with energies up to the chemical potential of the system. An unfilled state in the Fermi sea behaves like a positively charged electron, though it is referred to as a "hole" rather than a "positron". The negative charge of the Fermi sea is balanced by the positively charged ionic lattice of the material.)
Electromagnetic interaction
So far, we have considered an electron that is not in contact with any external fields. Proceeding by analogy with the Hamiltonian of a charged particle in classical electrodynamics, we can modify the Dirac Hamiltonian to include the effect of an electromagnetic field. The revised Hamiltonian is (in SI units):
 <math>H = \alpha_0 mc^2 + \sum_{j=1}^3 \alpha_j \left[p_j - e A_j(\mathbf{x}, t) \right] c + e \phi(\mathbf{x}, t) <math>
where e is the electric charge of the electron (in this convention, e is negative), and A and φ are the electromagnetic vector and scalar potentials, respectively.
By setting φ = 0 and working in the nonrelativistic limit, Dirac solved for the top two components in the positive-energy wavefunctions (which, as discussed earlier, are the dominant components in the nonrelativistic limit), obtaining
 <math> \left( \frac{1}{2m} \sum_j \left[ p_j - e A_j(\mathbf{x}, t) \right]^2 - \frac{\hbar e}{2m} \sum_j \sigma_j B_j(\mathbf{x}) \right) \begin{bmatrix}\psi_1 \\ \psi_2 \end{bmatrix} = (E - mc^2) \begin{bmatrix}\psi_1 \\ \psi_2 \end{bmatrix}<math>
where B = <math>\nabla<math>×A is the magnetic field acting on the particle. This is precisely the Pauli equation for a nonrelativistic spin-½ particle, with magnetic moment <math>\hbar e/2m<math> in the SI units used here (i.e., a spin g-factor of 2). The actual magnetic moment of the electron is larger than this, though only by about 0.12%. The discrepancy is due to quantum fluctuations in the electromagnetic field, which have been neglected. See vertex function.
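To put numbers on these statements, the Dirac value of the magnetic moment can be evaluated directly. The sketch below assumes SI units, in which the g = 2 moment has magnitude eħ/2m (the Bohr magneton). The constants are CODATA 2018 values and `G_MEASURED` is an approximate measured electron g-factor; all of these, like the function names, are inputs supplied here for illustration rather than quantities taken from the text.

```python
# Magnitude of the g = 2 magnetic moment and the fractional excess of the
# measured electron moment over it. Constants are assumed inputs (CODATA 2018).
E_CHARGE = 1.602176634e-19   # elementary charge, C
HBAR = 1.054571817e-34       # reduced Planck constant, J s
M_E = 9.1093837015e-31       # electron mass, kg
G_MEASURED = 2.00231930436   # approximate magnitude of the measured electron g-factor

def dirac_moment(charge, mass):
    """Magnitude of the magnetic moment for g = 2, in J/T (SI units)."""
    return HBAR * charge / (2.0 * mass)

def fractional_excess(g):
    """Relative amount by which a moment with g-factor g exceeds the g = 2 value."""
    return (g - 2.0) / 2.0

if __name__ == "__main__":
    print("Dirac moment (J/T):", dirac_moment(E_CHARGE, M_E))
    print("fractional excess:", fractional_excess(G_MEASURED))
```

The second number comes out at a little under 0.12%, which is the size of the discrepancy quoted above.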
For several years after the discovery of the Dirac equation, most physicists believed that it also described the proton and the neutron, which are both spin-½ particles. However, beginning with the experiments of Stern and Frisch in 1933, the magnetic moments of these particles were found to disagree significantly with the predictions of the Dirac equation. The proton has a magnetic moment 2.79 times larger than predicted (with the proton mass inserted for m in the above formulas), i.e., a g-factor of 5.58. The neutron, which is electrically neutral, has a g-factor of −3.83. These "anomalous magnetic moments" were the first experimental indication that the proton and neutron are not elementary particles. They are in fact composed of smaller particles called quarks. Incidentally, quarks are themselves spin-½ particles described by the Dirac equation.
Interaction Hamiltonian
It is noteworthy that the Hamiltonian can be written as the sum of two terms:
 <math>H = H_{\mathrm{free}} + H_{\mathrm{int}} \,<math>
where H_{free} is the Dirac Hamiltonian for a free electron and H_{int} is the Hamiltonian of the electromagnetic interaction. The latter may be written as
 <math>H_{\mathrm{int}} = e \phi(\mathbf{x}, t) - ec \sum_{j=1}^3 \alpha_j A_j(\mathbf{x}, t) <math>
It has the expectation value
 <math>\langle H_{\mathrm{int}} \rangle = \int \, \psi^\dagger H_{\mathrm{int}} \psi \, d^3x = \int \, \left(\rho \phi - \sum_{i=1}^3 j_i A_i \right) \, d^3x <math>
where ρ is the electric charge density and j is the electric current density defined earlier. The integrand in the final expression is the interaction energy density. It is a relativistically covariant scalar quantity, as we can see by writing it in terms of the current-charge four-vector j = (ρc, j) and the potential four-vector A = (φ/c, A):
 <math>\langle H_{\mathrm{int}} \rangle = \int \, \left( \sum_{\mu,\nu = 0}^3 \eta^{\mu\nu} j_\mu A_\nu \right) \; d^3x<math>
where η is the metric of flat spacetime:
 <math>\eta^{00} = 1 <math>
 <math>\eta^{ii} \;= -1 \quad\, \forall \, i=1,2,3 <math>
 <math>\eta^{\mu\nu} = 0 \qquad \forall \, \mu \ne \nu <math>
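Written out term by term, the contraction reproduces the interaction energy density. This short check assumes the metric signature (+, −, −, −) and the four-vectors j = (ρc, j) and A = (φ/c, A).

```latex
\sum_{\mu,\nu=0}^{3} \eta^{\mu\nu} j_\mu A_\nu
  = \eta^{00} j_0 A_0 + \sum_{i=1}^{3} \eta^{ii} j_i A_i
  = (\rho c) \left( \frac{\phi}{c} \right) - \sum_{i=1}^{3} j_i A_i
  = \rho \phi - \sum_{i=1}^{3} j_i A_i
```

Only the diagonal metric components contribute, and the factors of c cancel in the time component.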
Relativistically covariant notation
Let us return to the Dirac equation for the free electron. It is often useful to write the equation in a relativistically covariant form, in which the derivatives with time and space are treated on the same footing.
To do this, first recall that the momentum operator p acts like a spatial derivative:
 <math>\mathbf{p} \psi(\mathbf{x},t) = - i \hbar \nabla \psi(\mathbf{x},t)<math>
Multiplying each side of the Dirac equation by α_{0} (recalling that α_{0}²=I) and plugging in the above definition of p, we obtain
 <math> \left[ i\hbar c \left(\alpha_0 \frac{\partial}{c \partial t} + \sum_{j=1}^3 \alpha_0 \alpha_j \frac{\partial}{\partial x_j} \right) - mc^2 \right] \psi = 0 <math>
Now, define four gamma matrices:
 <math> \gamma^0 \equiv \alpha_0 \,,\quad \gamma^j \equiv \alpha_0 \alpha_j <math>
These matrices possess the property that
 <math>\left\{\gamma^\mu , \gamma^\nu \right\} = 2\eta^{\mu\nu} \cdot I\,,\quad \mu,\nu = 0, 1, 2, 3<math>
where η once again stands for the metric of flat spacetime. These relations define a Clifford algebra called the Dirac algebra.
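The Dirac algebra can be checked numerically for the gamma matrices built from the Dirac representation. The sketch below is pure Python; it assumes α_0 = diag(1, 1, −1, −1), the Pauli-block form of α_1 to α_3, and the metric signature (+, −, −, −). The names `GAMMA`, `ETA_DIAG` and `dirac_algebra_holds` are illustrative.

```python
# Build gamma^0 = alpha_0 and gamma^j = alpha_0 alpha_j, then test
# {gamma^mu, gamma^nu} = 2 eta^{mu nu} I entry by entry.

def matmul(a, b):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

ALPHA_0 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
ALPHA_1 = [[0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0]]
ALPHA_2 = [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, -1j, 0, 0], [1j, 0, 0, 0]]
ALPHA_3 = [[0, 0, 1, 0], [0, 0, 0, -1], [1, 0, 0, 0], [0, -1, 0, 0]]

GAMMA = [ALPHA_0] + [matmul(ALPHA_0, a) for a in (ALPHA_1, ALPHA_2, ALPHA_3)]
ETA_DIAG = [1, -1, -1, -1]   # diagonal of the flat-spacetime metric

def dirac_algebra_holds(gammas, eta_diag):
    """True if the anticommutators equal 2 * eta^{mu nu} times the identity."""
    for mu in range(4):
        for nu in range(4):
            ab = matmul(gammas[mu], gammas[nu])
            ba = matmul(gammas[nu], gammas[mu])
            for i in range(4):
                for j in range(4):
                    expected = 2 * eta_diag[mu] if (mu == nu and i == j) else 0
                    if ab[i][j] + ba[i][j] != expected:
                        return False
    return True

if __name__ == "__main__":
    print("Dirac algebra holds:", dirac_algebra_holds(GAMMA, ETA_DIAG))
```

The spatial gamma matrices square to −I rather than +I because α_0 anticommutes with each α_j, which is where the minus signs of the metric come from.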
The Dirac equation may now be written, using the position-time four-vector x = (ct,x), as
 <math>\left(i\hbar c \, \sum_{\mu=0}^3 \; \gamma^\mu \, \partial_\mu - mc^2 \right) \psi = 0<math>
With this notation, the Dirac equation can be generated by extremising the action
 <math>\mathcal{S} = \int \bar\psi(i \hbar c \, \sum_\mu \gamma^\mu \partial_\mu - mc^2)\psi \, d^4 x <math>
where
 <math>\bar\psi \equiv \psi^\dagger \gamma_0 <math>
is called the Dirac adjoint of ψ. This is the basis for the use of the Dirac equation in quantum field theory.
A notation called the "Feynman slash" is sometimes used. Writing
 <math>a\!\!\!/ \leftrightarrow \sum_\mu \gamma^\mu a_\mu<math>
the Dirac equation becomes
 <math>(i \hbar c \, \partial\!\!\!/ - mc^2) \psi = 0<math>
and the expression for the action becomes
 <math>\mathcal{S} = \int \bar\psi(i \hbar c \, \partial \!\!\!/ - mc^2)\psi \, d^4 x <math>
References
Selected papers
 P.A.M. Dirac, Proc. R. Soc. A117 610 (1928)
 P.A.M. Dirac, Proc. R. Soc. A126 360 (1930)
 C.D. Anderson, Phys. Rev. 43, 491 (1933)
 R. Frisch and O. Stern, Z. Phys. 85 4 (1933)
Textbooks
 Dirac, P.A.M., Principles of Quantum Mechanics, 4th edition (Clarendon, 1982)
 Shankar, R., Principles of Quantum Mechanics, 2nd edition (Plenum, 1994)
 Bjorken, J.D. & Drell, S.D., Relativistic Quantum Mechanics (McGraw-Hill, 1964)
 Thaller, B., The Dirac Equation, Texts and Monographs in Physics (Springer, 1992)