Books
W. Pauli, “Die Allgemeinen Principien der Wellenmechanik”; Handbuch der Physik, 2 ed., Vol. 24,
Part 1; Edwards reprint, Ann Arbor 1947. (In German) [1]
W. Heitler, Quantum Theory of Radiation, 2nd Edition, Oxford. 3rd edition just published. [2]
G. Wentzel, Introduction to the Quantum Theory of Wave-Fields, Interscience, N.Y. 1949. [3]
Throughout this course I will refer to these works as we go along; the later sections draw primarily on the research of Feynman and Schwinger. You are not expected to have read these texts, but they will be integral to our discussions.
Subject Matter
Having completed a comprehensive course in non-relativistic quantum theory, you can apply its principles with confidence. They remain valid in relativistic contexts: everything you have learned still holds.
Having completed a course in classical mechanics and electrodynamics, including special relativity, you understand what is meant by a relativistic system: one whose equations of motion are invariant under Lorentz transformations. General relativity will not be covered.
The subject of this course is how to construct a Lorentz-invariant quantum theory. This is not a general dynamical method like non-relativistic quantum theory, applicable to any system. No such universal method is yet known; instead, the goal is to identify the particular systems and equations of motion that can be handled by non-relativistic quantum dynamics and are at the same time Lorentz invariant.
In non-relativistic theory, nearly any classical system can be quantized. Relativistic quantization, by contrast, admits only a limited range of systems. This is a crucial distinction: starting from the principles of relativity and quantization alone, only very special types of objects can exist mathematically. Consequently, the theory can predict significant phenomena in the real world. The most striking examples are:
(i) Dirac from a study of the electron predicted the positron, which was later discovered [9].
(ii) Yukawa from a study of nuclear forces predicted the meson, which was later discovered [10].
The most striking feature of relativistic quantum theory is that a relativistic quantum theory of a finite number of particles is impossible. Such a theory necessarily contains an indefinite number of particles of one or more types, with the particles of each type identical and indistinguishable from one another, and with the possibility of particle creation and annihilation.
The principles of relativity and quantum theory, when combined, thus lead to a world built out of various types of elementary particles, which gives us some confidence that we are on the right way to an understanding of the real world. Furthermore, specific detailed properties of the observed particles emerge as necessary consequences of the general theory. For example:
(i) Magnetic moment of Electron (Dirac) [9].
(ii) Relation between spin and statistics (Pauli) [11].
Detailed Program
We will not immediately develop a comprehensive theory involving many particles; instead, we will follow the historical progression of the subject. We begin by constructing a relativistic quantum theory for a single particle and see how far it can be taken before it runs into trouble. This foundational work will then guide us in modifying the theory to accommodate many particles. Single-particle theories remain valuable: they provide accurate approximations wherever creation of new particles does not occur and a non-relativistic framework is insufficient. An example is the Dirac theory of the Hydrogen atom.
The non-relativistic theory gives the energy levels correctly but no fine-structure (accuracy of one part in 10,000). The Dirac one-particle theory gives all the main features of the fine-structure correctly: the number of components is right, and the separations are good to about 10% but not better (overall accuracy of one part in 100,000). The Dirac many-particle theory gives the fine-structure separations observed in the Lamb experiment correctly to about one part in 10,000 (overall accuracy of about one part in 10^8).
To achieve an accuracy better than 1 in 10^8, the Dirac many-particle theory alone is probably insufficient, and one will need to take into account various meson effects that have not yet been treated properly. Experiments are so far only good to about 1 in 10^8.
In this course, I will begin by going through the one-particle theories and showing how they break down. I will then introduce the new methods developed by Feynman and Schwinger for constructing a relativistic quantum theory. This foundation will lead us to the many-particle theories, whose general characteristics I will discuss. Finally, I will focus on the specific case of quantum electrodynamics and cover as much ground as possible before the course concludes.
One-Particle Theories
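Take the simplest case, one particle with no forces. Then the non-relativistic wave-mechanics tells you to take the equation \( E = \frac{1}{2m} p^2 \) of classical mechanics, and write
\[ E \to i\hbar \frac{\partial}{\partial t}, \qquad p_x \to -i\hbar \frac{\partial}{\partial x} \tag{1} \]
to get the wave-equation
\[ i\hbar \frac{\partial \psi}{\partial t} = -\frac{\hbar^2}{2m} \nabla^2 \psi \tag{2} \]
satisfied by the wave-function ψ.
To give a physical meaning to ψ, we state that \( \rho = \psi^* \psi \) is the probability of finding the particle at the point x, y, z at time t. And the probability is conserved because
\[ \frac{\partial \rho}{\partial t} + \nabla \cdot \mathbf{j} = 0 \tag{3} \]
where
\[ \mathbf{j} = \frac{\hbar}{2mi} \left( \psi^* \nabla \psi - \psi \nabla \psi^* \right) \tag{4} \]
and \( \psi^* \) is the complex conjugate of ψ.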
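Now do this relativistically. We have classically
\[ E^2 = m^2 c^4 + c^2 p^2 \tag{5} \]
which, with the substitution (1), gives the wave equation
\[ \frac{1}{c^2} \frac{\partial^2 \psi}{\partial t^2} - \nabla^2 \psi + \frac{m^2 c^2}{\hbar^2} \psi = 0 \tag{6} \]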
This is the historic Klein–Gordon equation, a milestone in the development of relativistic quantum theory. Schrödinger tried as early as 1926 to build a relativistic quantum theory on it. He failed, as did many others, until Pauli and Weisskopf gave the many-particle theory in 1934.
The reason is that to interpret the wave-function as a probability we must have a continuity equation. This can be derived from the wave equation only if we take the current density \( \mathbf{j} \) as in (4) and define the density as
\[ \rho = \frac{i\hbar}{2mc^2} \left( \psi^* \frac{\partial \psi}{\partial t} - \frac{\partial \psi^*}{\partial t} \psi \right) \tag{7} \]
But since the equation is of second order, ψ and ∂ψ/∂t can be chosen arbitrarily. Hence ρ need not be positive, and we have negative probabilities. This defeated all attempts to make a sensible one-particle theory.
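The sign problem can be illustrated numerically. The following is a minimal sketch, assuming natural units (ħ = m = c = 1) and a plane wave whose time dependence is exp(−iEt/ħ); the function names are hypothetical. It evaluates the Klein–Gordon density for a positive- and a negative-frequency solution, and the spatial phase factor is omitted because it cancels in ρ.

```python
import cmath

# Natural units (hbar = m = c = 1) are an assumption made for illustration.
HBAR, MASS, C = 1.0, 1.0, 1.0

def plane_wave(energy, t):
    """Time factor of a plane wave, psi = exp(-i E t / hbar)."""
    return cmath.exp(-1j * energy * t / HBAR)

def kg_density(energy, t=0.3, dt=1e-6):
    """rho = (i hbar / 2 m c^2) (psi* dpsi/dt - dpsi*/dt psi).

    The time derivative is approximated by a central difference.
    """
    psi = plane_wave(energy, t)
    dpsi_dt = (plane_wave(energy, t + dt) - plane_wave(energy, t - dt)) / (2 * dt)
    rho = (1j * HBAR / (2 * MASS * C**2)) * (
        psi.conjugate() * dpsi_dt - dpsi_dt.conjugate() * psi
    )
    return rho.real

rho_positive = kg_density(+1.0)  # positive-frequency solution
rho_negative = kg_density(-1.0)  # negative-frequency solution
print(rho_positive, rho_negative)
```

For this plane wave the density reduces analytically to E/mc², so the two solutions give densities of opposite sign.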
The theory can be made sensible by describing an assembly of particles with both positive and negative charges, where ρ represents the net charge density at any point. This approach, due to Pauli and Weisskopf, gives correct results for π-mesons, the particles produced in the nearby synchrotron. I will discuss it further later.
The Form of the Dirac Equation
Historically, the one-particle theory of Dirac came before the general relativistic quantum theory. It was so successful in describing the electron that for many years it was the only respectable relativistic quantum theory in existence. Its difficulties are also much less immediate than those of the one-particle Klein–Gordon theory.
Dirac supposed that a particle can exist in several distinct states with the same momentum, differing only in the orientation of the spin. The wave function ψ satisfying (6) must therefore have several components; it is not a scalar but a column matrix. Each component is the probability amplitude for finding the particle at a given place and in a given substate.
Dirac assumed that the probability density at any point is still given by
\[ \rho = \sum_\alpha \psi_\alpha^* \psi_\alpha \tag{8} \]
which we write \( \rho = \psi^* \psi \) as in the non-relativistic theory. Here \( \psi^* \) is a row matrix.
For (3) to remain satisfied, ψ must obey a wave equation of first order in t. Since the equations are relativistic, the equation must then also be of first order in x, y, z. Thus the most general possible wave equation is
\[ \frac{1}{c} \frac{\partial \psi}{\partial t} + \sum_{k=1}^{3} \alpha^k \frac{\partial \psi}{\partial x_k} + i \frac{mc}{\hbar} \beta \psi = 0 \tag{9} \]
where \( x_1, x_2, x_3 \) are written for x, y, z and \( \alpha^1, \alpha^2, \alpha^3, \beta \) are square matrices whose elements are numbers. The conjugate of (9) gives
\[ \frac{1}{c} \frac{\partial \psi^*}{\partial t} + \sum_{k=1}^{3} \frac{\partial \psi^*}{\partial x_k} \alpha^{k*} - i \frac{mc}{\hbar} \psi^* \beta^* = 0 \tag{10} \]
where \( \alpha^{k*} \) and \( \beta^* \) are Hermitian conjugates.
Now to get (3) out of (8), (9) and (10) we must have \( \alpha^{k*} = \alpha^k \), \( \beta^* = \beta \), so \( \alpha^k \) and β are Hermitian; and
\[ j_k = c \left( \psi^* \alpha^k \psi \right) \tag{11} \]
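Next, what more do we want from equation (9)? Two things: (A) it must be consistent with the second-order equation (6) we started from; (B) the whole theory must be Lorentz invariant.
First consider (A). If (9) is consistent with (6), it must be possible to get exactly (6) by multiplying (9) by the operator
\[ \frac{1}{c} \frac{\partial}{\partial t} - \sum_{\ell=1}^{3} \alpha^\ell \frac{\partial}{\partial x_\ell} - i \frac{mc}{\hbar} \beta \tag{12} \]
chosen so that the terms with mixed derivatives \( \partial/\partial t, \partial/\partial x_k \) and the terms with \( \partial/\partial t \) alone cancel. This gives
\[ \frac{1}{c^2} \frac{\partial^2 \psi}{\partial t^2} = \sum_{k \neq \ell} \frac{1}{2} \left( \alpha^k \alpha^\ell + \alpha^\ell \alpha^k \right) \frac{\partial^2 \psi}{\partial x_k \partial x_\ell} + \sum_k (\alpha^k)^2 \frac{\partial^2 \psi}{\partial x_k^2} - \frac{m^2 c^2}{\hbar^2} \beta^2 \psi + i \frac{mc}{\hbar} \sum_k \left( \alpha^k \beta + \beta \alpha^k \right) \frac{\partial \psi}{\partial x_k} \]
This agrees with (6) if and only if
\[ \alpha^k \alpha^\ell + \alpha^\ell \alpha^k = 0 \ (k \neq \ell), \qquad \alpha^k \beta + \beta \alpha^k = 0, \qquad (\alpha^k)^2 = \beta^2 = I \ \text{(identity matrix)} \tag{13} \]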
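Thus we could not possibly factorize the second-order equation into two first-order operators involving ordinary numbers. But we can do it with matrices. Consider the Pauli spin matrices
\[ \sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{14} \]
you are familiar with. They satisfy \( \sigma_k \sigma_\ell + \sigma_\ell \sigma_k = 2\delta_{\ell k} \).
But we cannot make 4 matrices of this type all anti-commuting. They must be at least 4×4.
One possible set of \( \alpha^k \) and β is, in 2×2 blocks,
\[ \alpha^k = \begin{pmatrix} 0 & \sigma_k \\ \sigma_k & 0 \end{pmatrix}, \qquad \beta = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{15} \]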
These are Hermitian as required. Of course if \( \alpha^k \) and β are any set satisfying (13), then \( S \alpha^k S^{-1} \) and \( S \beta S^{-1} \) are another set, where S is any unitary matrix, \( S S^* = 1 \). Conversely, it can be proved that every possible set of 4×4 matrices \( \alpha^k \) and β is of this form with some such matrix S. We do not prove this here.
The Dirac equation is thus a set of 4 simultaneous linear partial differential equations in the four functions \( \psi_\alpha \).
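The algebra (13) can be checked mechanically for the block representation (15). The following is a minimal sketch in plain Python; the helper names (`matmul`, `anticomm`, `block`, `check_dirac_algebra`) are hypothetical and serve only this illustration.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def anticomm(a, b):
    """Anticommutator ab + ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] + ba[i][j] for j in range(n)] for i in range(n)]

def dagger(a):
    """Hermitian conjugate."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def is_close(a, b, tol=1e-12):
    n = len(a)
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(n) for j in range(n))

def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

Z2 = [[0, 0], [0, 0]]
I2 = [[1, 0], [0, 1]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

ALPHA = [block(Z2, s, s, Z2) for s in PAULI]    # alpha^k of (15)
BETA = block(I2, Z2, Z2, [[-1, 0], [0, -1]])    # beta of (15)
I4 = block(I2, Z2, Z2, I2)
ZERO4 = [[0] * 4 for _ in range(4)]
TWO_I4 = [[2 * x for x in row] for row in I4]

def check_dirac_algebra():
    """True if (13) holds and all four matrices are Hermitian."""
    ok = is_close(matmul(BETA, BETA), I4) and is_close(dagger(BETA), BETA)
    for k in range(3):
        ok = ok and is_close(anticomm(ALPHA[k], BETA), ZERO4)
        ok = ok and is_close(dagger(ALPHA[k]), ALPHA[k])
        for l in range(3):
            expected = TWO_I4 if k == l else ZERO4
            ok = ok and is_close(anticomm(ALPHA[k], ALPHA[l]), expected)
    return ok

print(check_dirac_algebra())
```

The same check applied to any unitarily transformed set \( S\alpha^k S^{-1} \), \( S\beta S^{-1} \) would pass as well, since the relations (13) are preserved by such a transformation.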
Lorentz Invariance of the Dirac Equation
What does this mean? Consider a general Lorentz transformation. If \( x'_\mu \) are the new coordinates:
\[ x'_\mu = \sum_{\nu=0}^{3} a_{\mu\nu} x_\nu \qquad (x_0 = ct) \tag{16} \]
In the new coordinate system the wave function will be ψ′, and we do not expect that ψ′ = ψ. For instance, in the Maxwell theory, which is relativistic, the magnetic field H is no longer a pure magnetic field in a moving system; instead it transforms as a tensor. We therefore have to find a transformation law for ψ that leaves invariant the physical consequences of the equations.
We need in fact two things: (i) the interpretation of \( \psi^* \psi \) as a probability density must be preserved, (ii) the validity of the Dirac equation must be preserved in the new system.
First consider (i). The quantity which can be directly observed and must be invariant is the quantity
\( (\psi^* \psi) \times V \), where V is a volume. Now in going to a new Lorentz system with relative velocity v the volume V changes by Fitzgerald contraction to the value
\[ V' = V \sqrt{1 - \frac{v^2}{c^2}} \]
Therefore
\[ (\psi'^* \psi') = \frac{\psi^* \psi}{\sqrt{1 - v^2/c^2}} \tag{17} \]
and so \( (\psi^* \psi) = \rho \) transforms like an energy, i.e. like the fourth component of a vector. This shows incidentally that ψ′ ≠ ψ. Since ρ and \( \mathbf{j} \) are related by the continuity equation, the space components of the 4-vector are
\[ (S_1, S_2, S_3) = \psi^* \alpha^k \psi = \frac{1}{c} j_k \tag{18} \]
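So we require that the 4 quantities
\[ (S_1, S_2, S_3, S_0) = \left( \psi^* \alpha^k \psi,\ \psi^* \psi \right) \tag{19} \]
transform like a 4-vector. This will be enough to preserve the interpretation of the theory.
Assume that
\[ \psi' = S \psi \tag{20} \]
where S is a linear operator. Then
\[ \psi'^* = \psi^* S^* \tag{21} \]
So we require, writing \( \alpha^0 = I \),
\[ \psi'^* \alpha^k \psi' = \psi^* S^* \alpha^k S \psi = \sum_{\nu=0}^{3} a_{k\nu} \psi^* \alpha^\nu \psi, \qquad \psi'^* \psi' = \psi^* S^* S \psi = \sum_{\nu=0}^{3} a_{0\nu} \psi^* \alpha^\nu \psi \tag{22} \]
Thus we need
\[ S^* \alpha^\mu S = \sum_{\nu=0}^{3} a_{\mu\nu} \alpha^\nu, \qquad \mu = 0, 1, 2, 3 \tag{23} \]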
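Next consider (ii). The Dirac equation for ψ′ is
\[ \sum_{\nu=0}^{3} \alpha^\nu \frac{\partial}{\partial x'_\nu} \psi' + i \frac{mc}{\hbar} \beta \psi' = 0 \tag{24} \]
Now the original Dirac equation for ψ expressed in terms of the new coordinates is
\[ \sum_{\mu=0}^{3} \sum_{\nu=0}^{3} \alpha^\mu a_{\nu\mu} \frac{\partial}{\partial x'_\nu} S^{-1} \psi' + i \frac{mc}{\hbar} \beta S^{-1} \psi' = 0 \tag{25} \]
The sets of equations (24) and (25) have to be equivalent, not identical. Thus (25) must be the same as (24) multiplied by \( \beta S^{-1} \beta \). The condition for this is
\[ \beta S^{-1} \beta \alpha^\nu = \sum_{\lambda=0}^{3} \alpha^\lambda a_{\nu\lambda} S^{-1} \tag{26} \]
But (23) and (26) are identical if \( \beta S^{-1} \beta = S^* \), which means
\[ S^* \beta S = \beta \tag{27} \]
Thus β transforms like a scalar and \( \alpha^\nu \) like a 4-vector when multiplied by \( S^* \) on the left and S on the right.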
To Find the S
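When two coordinate transformations are applied in succession, and the matrices S for each are already known, the combined transformation corresponds to the product of these matrices. It is therefore enough to consider three simple types of transformation.
(1) Pure rotations: \( x'_0 = x_0,\ x'_3 = x_3,\ x'_1 = x_1 \cos\theta + x_2 \sin\theta,\ x'_2 = -x_1 \sin\theta + x_2 \cos\theta \). Then \( S = \cos\tfrac{1}{2}\theta + i \sigma_3 \sin\tfrac{1}{2}\theta \) (28), where \( \sigma_3 \) here denotes the 4×4 matrix with the Pauli matrix \( \sigma_3 \) in both diagonal blocks.
(2) Pure Lorentz transformations: \( x'_1 = x_1,\ x'_2 = x_2,\ x'_3 = x_3 \cosh\theta + x_0 \sinh\theta,\ x'_0 = x_3 \sinh\theta + x_0 \cosh\theta \). Then \( S = S^* = \cosh\tfrac{1}{2}\theta + \alpha^3 \sinh\tfrac{1}{2}\theta \) (29).
(3) Pure reflections: \( x'_1 = -x_1,\ x'_2 = -x_2,\ x'_3 = -x_3,\ x'_0 = x_0 \). Then \( S = S^* = \beta \) (30).
Note that in all cases S is ambiguous by a factor ±1. So in case (1) a rotation through 360° gives S = −1.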
Problem 1. Find the S corresponding to a general infinitesimal coordinate transformation. Compare and show that it agrees with the exact solutions given here.
The \( \psi_\alpha \)'s transforming with these S-transformations are called spinors. They are a direct extension of the non-relativistic 2-component spin-functions. The mathematical theory of spinors is not very useful in practice: calculations are usually done most easily by avoiding any explicit representation of the spinors and using only formal algebra and the commutation relations of the matrices.
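The conditions (23) and (27) can be tested numerically for the first two types of transformation. The sketch below assumes the standard forms \( S = \cos\tfrac12\theta + i\sigma_3\sin\tfrac12\theta \) for a rotation about the 3-axis and \( S = \cosh\tfrac12\theta + \alpha^3\sinh\tfrac12\theta \) for a boost along the 3-axis, together with the representation (15); the helper names are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def lin(*terms):
    """Linear combination sum(coef * matrix) of 4x4 matrices."""
    return [[sum(c * m[i][j] for c, m in terms) for j in range(4)]
            for i in range(4)]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(4)] for i in range(4)]

def is_close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))

def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

Z2 = [[0, 0], [0, 0]]
I2 = [[1, 0], [0, 1]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
ALPHA = [block(Z2, s, s, Z2) for s in PAULI]
BETA = block(I2, Z2, Z2, [[-1, 0], [0, -1]])
I4 = block(I2, Z2, Z2, I2)
SIGMA3 = block(PAULI[2], Z2, Z2, PAULI[2])  # 4x4 spin matrix

def rotation_S(theta):
    return lin((math.cos(theta / 2), I4), (1j * math.sin(theta / 2), SIGMA3))

def boost_S(theta):
    return lin((math.cosh(theta / 2), I4), (math.sinh(theta / 2), ALPHA[2]))

def transformed(S, m):
    """S* m S, the left-hand side of (23) and (27)."""
    return matmul(matmul(dagger(S), m), S)

def check_rotation(theta):
    S, c, s = rotation_S(theta), math.cos(theta), math.sin(theta)
    return (is_close(transformed(S, ALPHA[0]), lin((c, ALPHA[0]), (s, ALPHA[1])))
            and is_close(transformed(S, ALPHA[1]), lin((-s, ALPHA[0]), (c, ALPHA[1])))
            and is_close(transformed(S, ALPHA[2]), ALPHA[2])
            and is_close(transformed(S, BETA), BETA))

def check_boost(theta):
    S, ch, sh = boost_S(theta), math.cosh(theta), math.sinh(theta)
    return (is_close(transformed(S, ALPHA[2]), lin((ch, ALPHA[2]), (sh, I4)))
            and is_close(transformed(S, I4), lin((sh, ALPHA[2]), (ch, I4)))
            and is_close(transformed(S, ALPHA[0]), ALPHA[0])
            and is_close(transformed(S, BETA), BETA))

print(check_rotation(0.7), check_boost(0.4))
print(is_close(rotation_S(2 * math.pi), lin((-1, I4))))  # 360 degree rotation
```

The last line illustrates the sign ambiguity: a full turn returns the coordinates to themselves but multiplies the spinor by −1.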
The Covariant Notation
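To avoid having to distinguish between covariant and contravariant vectors, it is useful to introduce an imaginary fourth coordinate
\[ x_4 = i x_0 = ict \tag{31} \]
In this coordinate system the four matrices
\[ \gamma_{1,2,3,4} = \left( -i\beta\alpha^{1,2,3},\ \beta \right) \tag{32} \]
i.e., in 2×2 blocks,
\[ \gamma_k = \begin{pmatrix} 0 & -i\sigma_k \\ i\sigma_k & 0 \end{pmatrix}, \qquad \gamma_4 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \]
are a 4-vector. They are all Hermitian and satisfy
\[ \gamma_\mu \gamma_\nu + \gamma_\nu \gamma_\mu = 2\delta_{\mu\nu} \tag{33} \]
The Dirac equation and its conjugate may now be written
\[ \sum_{\mu=1}^{4} \gamma_\mu \frac{\partial \psi}{\partial x_\mu} + \frac{mc}{\hbar} \psi = 0, \qquad \sum_{\mu=1}^{4} \frac{\partial \bar\psi}{\partial x_\mu} \gamma_\mu - \frac{mc}{\hbar} \bar\psi = 0 \tag{34} \]
with
\[ \bar\psi = \psi^* \beta \tag{35} \]
and
\[ s_\mu = i \left( \bar\psi \gamma_\mu \psi \right) = \left( \frac{1}{c} \mathbf{j},\ i\rho \right) \tag{36} \]
These notations are the most convenient for calculations.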
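As a quick consistency check, the matrices \( \gamma_k = -i\beta\alpha^k \), \( \gamma_4 = \beta \) can be built from the representation (15) and tested against (33). This is a minimal sketch in plain Python with hypothetical helper names.

```python
def matmul(a, b):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def lin(*terms):
    """Linear combination sum(coef * matrix) of 4x4 matrices."""
    return [[sum(c * m[i][j] for c, m in terms) for j in range(4)]
            for i in range(4)]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(4)] for i in range(4)]

def is_close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))

def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

Z2 = [[0, 0], [0, 0]]
I2 = [[1, 0], [0, 1]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
ALPHA = [block(Z2, s, s, Z2) for s in PAULI]
BETA = block(I2, Z2, Z2, [[-1, 0], [0, -1]])
I4 = block(I2, Z2, Z2, I2)

# gamma_1, gamma_2, gamma_3 = -i beta alpha^k ; gamma_4 = beta
GAMMA = [lin((-1j, matmul(BETA, a))) for a in ALPHA] + [BETA]

def check_gamma_algebra():
    """True if all gamma_mu are Hermitian and satisfy (33)."""
    for mu in range(4):
        if not is_close(dagger(GAMMA[mu]), GAMMA[mu]):
            return False
        for nu in range(4):
            anti = lin((1, matmul(GAMMA[mu], GAMMA[nu])),
                       (1, matmul(GAMMA[nu], GAMMA[mu])))
            expected = lin((2 if mu == nu else 0, I4))
            if not is_close(anti, expected):
                return False
    return True

print(check_gamma_algebra())
```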
Conservation Laws Existence of Spin
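The Hamiltonian in this theory is defined by
\[ i\hbar \frac{\partial \psi}{\partial t} = H\psi \tag{38} \]
\[ H = -i\hbar c \sum_{k=1}^{3} \alpha^k \frac{\partial}{\partial x_k} + mc^2 \beta = -i\hbar c\, \boldsymbol{\alpha} \cdot \nabla + mc^2 \beta \tag{39} \]
This commutes with the momentum \( \mathbf{p} = -i\hbar\nabla \). So the momentum p is a constant of motion.
However the angular momentum operator
\[ \mathbf{L} = \mathbf{r} \times \mathbf{p} = -i\hbar\, \mathbf{r} \times \nabla \tag{40} \]
is not a constant, for
\[ [H, \mathbf{L}] = -\hbar^2 c\, \boldsymbol{\alpha} \times \nabla \tag{41} \]
But \( [H, \boldsymbol{\sigma}] = 2\hbar c\, \boldsymbol{\alpha} \times \nabla \), where \( \boldsymbol{\sigma} = (\sigma_1, \sigma_2, \sigma_3) \). Thus
\[ \mathbf{L} + \tfrac{1}{2} \hbar \boldsymbol{\sigma} = \hbar \mathbf{J} \tag{42} \]
is a constant, the total angular momentum, because by (40), (41) and (42) \( [H, \mathbf{J}] = 0 \).
L is the orbital angular momentum and \( \tfrac{1}{2}\hbar\boldsymbol{\sigma} \) the spin angular momentum. This agrees with the non-relativistic theory. But in that theory the spin and L of a free particle were separately constant. This is no longer the case. When a central force potential V(r) is added to H, the operator J still is constant.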
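The failure of the spin alone to be conserved can be seen at the level of 4×4 matrices. The sketch below works on a momentum eigenstate, so that p is an ordinary vector of numbers, and checks the matrix identity \( [H, \sigma_3] = 2ic\,(\alpha^1 p_2 - \alpha^2 p_1) \); this is the third component of \( [H, \boldsymbol{\sigma}] = 2\hbar c\,\boldsymbol{\alpha}\times\nabla \) with \( \mathbf{p} = -i\hbar\nabla \). Units with m = c = 1 and all helper names are assumptions made for illustration.

```python
def matmul(a, b):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def lin(*terms):
    """Linear combination sum(coef * matrix) of 4x4 matrices."""
    return [[sum(c * m[i][j] for c, m in terms) for j in range(4)]
            for i in range(4)]

def is_close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))

def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

Z2 = [[0, 0], [0, 0]]
I2 = [[1, 0], [0, 1]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
ALPHA = [block(Z2, s, s, Z2) for s in PAULI]
BETA = block(I2, Z2, Z2, [[-1, 0], [0, -1]])
SIGMA3 = block(PAULI[2], Z2, Z2, PAULI[2])
ZERO4 = [[0] * 4 for _ in range(4)]

def hamiltonian(p, m=1.0, c=1.0):
    """H = c alpha.p + m c^2 beta acting on a momentum eigenstate."""
    return lin((c * p[0], ALPHA[0]), (c * p[1], ALPHA[1]),
               (c * p[2], ALPHA[2]), (m * c * c, BETA))

def spin_commutator(p, c=1.0):
    """[H, sigma_3] as a 4x4 matrix."""
    H = hamiltonian(p, c=c)
    return lin((1, matmul(H, SIGMA3)), (-1, matmul(SIGMA3, H)))

def expected_commutator(p, c=1.0):
    """2 i c (alpha^1 p_2 - alpha^2 p_1)."""
    return lin((2j * c * p[1], ALPHA[0]), (-2j * c * p[0], ALPHA[1]))

p_general = (0.3, -0.4, 0.5)
p_along_z = (0.0, 0.0, 0.5)
print(is_close(spin_commutator(p_general), expected_commutator(p_general)))
print(is_close(spin_commutator(p_along_z), ZERO4))
```

The orbital part contributes the opposite amount, \( [H, L_3] = -i\hbar c\,(\alpha^1 p_2 - \alpha^2 p_1) \), so the sum \( L_3 + \tfrac12\hbar\sigma_3 \) commutes with H. The second printed check shows that the spin component along the direction of motion is conserved by itself.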
Elementary Solutions
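For a particle with a particular momentum p and energy E, the wave function will be
\[ \psi(x, t) = u \exp\left( \frac{i\,\mathbf{p}\cdot\mathbf{x}}{\hbar} - \frac{iEt}{\hbar} \right) \tag{43} \]
where u is a constant spinor. The Dirac equation then becomes an equation for u only:
\[ E u = \left( c\,\boldsymbol{\alpha}\cdot\mathbf{p} + mc^2\beta \right) u \tag{44} \]
We write now
\[ p_+ = p_1 + ip_2, \qquad p_- = p_1 - ip_2 \tag{45} \]
Then (44) written out in full becomes
\[ \begin{aligned} (E - mc^2)\,u_1 &= c\,(p_3 u_3 + p_- u_4) \\ (E - mc^2)\,u_2 &= c\,(p_+ u_3 - p_3 u_4) \\ (E + mc^2)\,u_3 &= c\,(p_3 u_1 + p_- u_2) \\ (E + mc^2)\,u_4 &= c\,(p_+ u_1 - p_3 u_2) \end{aligned} \tag{46} \]
These 4 equations determine \( u_3 \) and \( u_4 \) given \( u_1 \) and \( u_2 \), or vice-versa. And either \( u_1 \) and \( u_2 \), or \( u_3 \) and \( u_4 \), can be chosen arbitrarily provided that
\[ E^2 = m^2c^4 + c^2p^2 \tag{47} \]
Thus given p and \( E = +\sqrt{m^2c^4 + c^2p^2} \), there are two independent solutions of (46); these are, in non-normalized form:
\[ u = \begin{pmatrix} 1 \\ 0 \\ \dfrac{c\,p_3}{E + mc^2} \\ \dfrac{c\,p_+}{E + mc^2} \end{pmatrix}, \qquad u = \begin{pmatrix} 0 \\ 1 \\ \dfrac{c\,p_-}{E + mc^2} \\ \dfrac{-c\,p_3}{E + mc^2} \end{pmatrix} \tag{48} \]
This gives the two spin-states of an electron with given momentum, as required physically.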
But there are also solutions with \( E = -\sqrt{m^2c^4 + c^2p^2} \): again two independent solutions, making four altogether. These are the well-known negative energy states. We cannot simply agree to ignore them, because when fields are present the theory gives transitions from positive to negative energy states. For instance, the Hydrogen atom would decay into a negative energy state in about 10⁻¹⁰ seconds or less.
Negative energy particles are also not physically acceptable: they could never be stopped by matter at rest, since with every collision they would move faster and faster. This drove Dirac to the following proposal.
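The plane-wave solutions can be checked directly. The sketch below assumes units with m = c = 1 and uses the standard non-normalized positive-energy spinors \( (1, 0, cp_3/(E+mc^2), cp_+/(E+mc^2)) \) and \( (0, 1, cp_-/(E+mc^2), -cp_3/(E+mc^2)) \); the function names are hypothetical. It verifies Hu = Eu for both spinors and that H² = E²·I, which is why every momentum also carries two solutions with energy −E.

```python
import math

def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

Z2 = [[0, 0], [0, 0]]
I2 = [[1, 0], [0, 1]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
ALPHA = [block(Z2, s, s, Z2) for s in PAULI]
BETA = block(I2, Z2, Z2, [[-1, 0], [0, -1]])

def hamiltonian(p, m=1.0, c=1.0):
    """H = c alpha.p + m c^2 beta for a momentum eigenstate."""
    return [[c * sum(p[k] * ALPHA[k][i][j] for k in range(3))
             + m * c * c * BETA[i][j] for j in range(4)] for i in range(4)]

def apply(mat, vec):
    return [sum(mat[i][j] * vec[j] for j in range(4)) for i in range(4)]

def positive_energy_spinors(p, m=1.0, c=1.0):
    """Return (E, [u_a, u_b]) with E = +sqrt(m^2 c^4 + c^2 p^2)."""
    energy = math.sqrt(m**2 * c**4 + c**2 * sum(x * x for x in p))
    p_plus, p_minus = p[0] + 1j * p[1], p[0] - 1j * p[1]
    d = energy + m * c * c
    u_a = [1, 0, c * p[2] / d, c * p_plus / d]
    u_b = [0, 1, c * p_minus / d, -c * p[2] / d]
    return energy, [u_a, u_b]

def is_eigenvector(mat, vec, value, tol=1e-12):
    return all(abs(x - value * y) < tol for x, y in zip(apply(mat, vec), vec))

def h_squared_is_e_squared(p, m=1.0, c=1.0, tol=1e-12):
    """Check H^2 = E^2 I, so the eigenvalues of H are +E and -E."""
    H = hamiltonian(p, m, c)
    e2 = m**2 * c**4 + c**2 * sum(x * x for x in p)
    H2 = [[sum(H[i][k] * H[k][j] for k in range(4)) for j in range(4)]
          for i in range(4)]
    return all(abs(H2[i][j] - (e2 if i == j else 0)) < tol
               for i in range(4) for j in range(4))

momentum = (0.3, -0.4, 0.5)
E, spinors = positive_energy_spinors(momentum)
H = hamiltonian(momentum)
print([is_eigenvector(H, u, E) for u in spinors])
print(h_squared_is_e_squared(momentum))
```

Since H is traceless and H² = E²·I, two of its four eigenvalues are +E and two are −E.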
The Hole Theory
All negative-energy states are normally occupied by one electron each. Because of the exclusion principle, transitions of ordinary electrons into these states are forbidden. If occasionally a negative-energy state with momentum −p and energy −E is empty, this appears as a particle of momentum p and energy +E, with the opposite charge to an electron, i.e. an ordinary positron.
Thus we are led at once to a many-particle theory in order to get sensible results: with spin-0 particles, to obtain positive probabilities; with spin-1/2 particles, to obtain positive energies.
The Dirac theory in its one-particle form cannot properly describe the interaction between several particles. But so long as we are talking only about free particles, we can describe them with one-particle wave functions.
Positron States
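So which wave-function will describe a positron with momentum p and energy E? Clearly the wave function should be of the form
\[ \phi(x, t) = v \exp\left( \frac{i\,\mathbf{p}\cdot\mathbf{x}}{\hbar} - \frac{iEt}{\hbar} \right) \tag{49} \]
as always in quantum mechanics. But the negative-energy electron whose absence is the positron has a wave-function
\[ \psi(x, t) = u \exp\left( -\frac{i\,\mathbf{p}\cdot\mathbf{x}}{\hbar} + \frac{iEt}{\hbar} \right) \tag{50} \]
since it has momentum −p and energy −E.
Thus we must take
\[ \phi = C\psi^+, \quad \text{i.e.} \quad v = Cu^+ \tag{51} \]
where \( \psi^+ \) is ψ with complex conjugate elements but not transposed, and C is a suitable constant matrix; \( \psi^+(x, t) = u^+ \exp\left( i\,\mathbf{p}\cdot\mathbf{x}/\hbar - iEt/\hbar \right) \).
We know that u is a solution of
\[ E u = \left( c\,\boldsymbol{\alpha}\cdot\mathbf{p} - mc^2\beta \right) u \tag{52} \]
We want the theory to make no distinction between electrons and positrons, and so v must also satisfy the Dirac equation
\[ E v = \left( c\,\boldsymbol{\alpha}\cdot\mathbf{p} + mc^2\beta \right) v, \qquad E\,Cu^+ = \left( c\,\boldsymbol{\alpha}\cdot\mathbf{p} + mc^2\beta \right) Cu^+ \tag{53} \]
But from (52) we have for \( u^+ \) the equation
\[ E u^+ = \left( c\,\boldsymbol{\alpha}^+\cdot\mathbf{p} - mc^2\beta^+ \right) u^+ \tag{54} \]
In order that (53) and (54) be identical we should have
\[ C\alpha^{k+} = \alpha^k C, \qquad C\beta^+ = -\beta C \tag{55} \]
In the representation (15), \( \alpha^{1+} = \alpha^1 \), \( \alpha^{3+} = \alpha^3 \), \( \alpha^{2+} = -\alpha^2 \) and \( \beta^+ = \beta \), so a suitable C is
\[ C = -i\beta\alpha^2 \tag{56} \]
The relation between ψ and φ is symmetrical because \( C^2 = I \); hence
\[ \psi = C\phi^+ \tag{57} \]
The φ is called the charge-conjugate wave-function corresponding to the negative-energy electron ψ. Clearly
\[ \phi^*\phi = (C\psi^+)^*(C\psi^+) = \psi^T C^* C \psi^+ = \psi^* (C^*C)^T \psi = \psi^*\psi \tag{58} \]
and
\[ \phi^*\alpha^k\phi = \psi^T C^* \alpha^k C \psi^+ = \psi^* C \alpha^{kT} C \psi = \psi^*\alpha^k\psi \tag{59} \]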
Thus the probability and flow densities are the same for a positron as for the conjugate negative electron.
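The algebraic requirements on the charge-conjugation matrix can be verified numerically. The sketch below assumes the representation (15) and the choice \( C = -i\beta\alpha^2 \), and checks \( C\alpha^{k+} = \alpha^k C \), \( C\beta^+ = -\beta C \) and \( C^2 = I \), where the superscript + denotes element-wise complex conjugation without transposition. Helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def scale(c, m):
    return [[c * m[i][j] for j in range(4)] for i in range(4)]

def conj_elements(m):
    """Element-wise complex conjugate, NOT transposed (the '+' operation)."""
    return [[complex(m[i][j]).conjugate() for j in range(4)] for i in range(4)]

def is_close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))

def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

Z2 = [[0, 0], [0, 0]]
I2 = [[1, 0], [0, 1]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
ALPHA = [block(Z2, s, s, Z2) for s in PAULI]
BETA = block(I2, Z2, Z2, [[-1, 0], [0, -1]])
I4 = block(I2, Z2, Z2, I2)

# Assumed choice of the charge-conjugation matrix: C = -i beta alpha^2
C_MATRIX = scale(-1j, matmul(BETA, ALPHA[1]))

def check_charge_conjugation():
    ok = is_close(matmul(C_MATRIX, C_MATRIX), I4)
    ok = ok and is_close(matmul(C_MATRIX, conj_elements(BETA)),
                         scale(-1, matmul(BETA, C_MATRIX)))
    for a in ALPHA:
        ok = ok and is_close(matmul(C_MATRIX, conj_elements(a)),
                             matmul(a, C_MATRIX))
    return ok

print(check_charge_conjugation())
for row in C_MATRIX:
    print([int(round(complex(x).real)) for x in row])
```

In this representation C comes out real and symmetric, which is what makes the relation between ψ and φ symmetrical.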
In many applications it is more convenient to represent positrons directly by the ψ wave-function, for example when calculating cross-sections for pair creation. But when a positron is actually observed in detail, as in positronium experiments, the φ wave-function must be used to represent aspects such as the direction in which the spin is pointing.
This is all we shall say about free electrons and positrons.
Electromagnetic Properties of the Electron
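Suppose we are given an external (c-number) electromagnetic field defined by the potentials
\( A_\mu,\ \mu = 1, 2, 3, 4,\ A_4 = i\Phi \), given functions of space and time. Then the motion of a particle in the field is found by substituting in the free-particle Lagrangian
\[ E + e\Phi \ \text{for}\ E, \qquad \mathbf{p} + \frac{e}{c}\mathbf{A} \ \text{for}\ \mathbf{p} \tag{60} \]
where (−e) is the electron charge. We write the momentum-energy 4-vector
\[ p = (p_1, p_2, p_3, p_4 = iE/c) \tag{61} \]
Then we have to substitute simply
\[ p_\mu + \frac{e}{c}A_\mu \ \text{for}\ p_\mu \tag{62} \]
Now in the quantum theory
\[ p_\mu \to -i\hbar \frac{\partial}{\partial x_\mu} \tag{63} \]
Therefore the Dirac equation with fields is
\[ \sum_{\mu=1}^{4} \gamma_\mu \left( \frac{\partial}{\partial x_\mu} + \frac{ie}{\hbar c}A_\mu \right) \psi + \frac{mc}{\hbar}\psi = 0 \tag{64} \]
\[ \sum_{\mu=1}^{4} \left( \frac{\partial}{\partial x_\mu} - \frac{ie}{\hbar c}A_\mu \right) \bar\psi \gamma_\mu - \frac{mc}{\hbar}\bar\psi = 0 \tag{65} \]
In the non-covariant notations this is
\[ i\hbar \frac{\partial \psi}{\partial t} = \left[ -e\Phi + \sum_{k=1}^{3} \left( -i\hbar c \frac{\partial}{\partial x_k} + eA_k \right) \alpha^k + mc^2\beta \right] \psi \tag{66} \]
Since by (57) we have \( \bar\psi\gamma_\mu = \psi^*\beta\gamma_\mu = (C\phi^+)^T\beta\gamma_\mu = \phi^T C^T \beta\gamma_\mu \), the wave function \( \phi = C\psi^+ \) of a positron satisfies by (65)
\[ \sum_{\mu=1}^{4} \left( \frac{\partial}{\partial x_\mu} - \frac{ie}{\hbar c}A_\mu \right) \gamma_\mu^T \beta C \phi - \frac{mc}{\hbar} \beta C \phi = 0 \tag{67} \]
Multiplying by Cβ, this gives
\[ \sum_{\mu=1}^{4} \left( \frac{\partial}{\partial x_\mu} - \frac{ie}{\hbar c}A_\mu \right) \gamma_\mu \phi + \frac{mc}{\hbar}\phi = 0 \tag{68} \]
This is exactly the Dirac equation for a particle of positive charge (+e). We have used
\[ C\beta\gamma_\mu^T\beta C = -\gamma_\mu \tag{69} \]
which follows from (15), (32) and (55).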
The Hydrogen Atom
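This is the one problem which it is possible to treat very accurately using the one-electron Dirac theory. The problem is to find the eigenstates of the equation
\[ E\psi = H\psi, \qquad H = -i\hbar c\,\boldsymbol{\alpha}\cdot\nabla + mc^2\beta - \frac{e^2}{r} \tag{70} \]
As in the non-relativistic theory, we have as quantum numbers in addition to E itself the quantities
\[ j_z = -i\,[\mathbf{r}\times\nabla]_3 + \tfrac{1}{2}\sigma_3 \tag{71} \]
\[ j(j+1) = J^2 = \left[ -i\,(\mathbf{r}\times\nabla) + \tfrac{1}{2}\boldsymbol{\sigma} \right]^2 \tag{72} \]
where \( j_z \) and j are half-odd integers by the ordinary theory of angular momenta. These quantum numbers alone are not enough to fix the state, because each value of j may correspond to two non-relativistic states with orbital angular momentum ℓ = j ± 1/2. We therefore need an additional operator that commutes with H and distinguishes between states with σ parallel or antiparallel to J. The obvious choice is
\( Q = \boldsymbol{\sigma}\cdot\mathbf{J} \). But [H, σ] is non-zero and rather complicated. So it is better to try
\[ Q = \beta\,\boldsymbol{\sigma}\cdot\mathbf{J} \tag{73} \]
which is the same in the non-relativistic limit.
Then \( [H, Q] = [H, \beta\boldsymbol{\sigma}\cdot\mathbf{J}] = [H, \beta\boldsymbol{\sigma}]\cdot\mathbf{J} + \beta\boldsymbol{\sigma}\cdot[H, \mathbf{J}] \). But [H, J] = 0; furthermore, since \( \alpha^k\beta\sigma_\ell = \beta\sigma_\ell\alpha^k \) for k ≠ ℓ and \( \alpha^k\beta\sigma_k = -\beta\sigma_k\alpha^k \), we get after a short calculation \( [H, \beta\boldsymbol{\sigma}]\cdot\mathbf{J} = [H, \tfrac{1}{2}\beta] \).
Hence the quantity which commutes with H and is a constant of the motion is
\[ K = \beta\,\boldsymbol{\sigma}\cdot\mathbf{J} - \tfrac{1}{2}\beta \tag{74} \]
There must be a relation between K and J. In fact
\[ K^2 = \left( \frac{\boldsymbol{\sigma}\cdot\mathbf{L}}{\hbar} + 1 \right)^2 = J^2 + \tfrac{1}{4} \tag{75} \]
Therefore K has integer eigenvalues not zero,
\[ K = k = \pm\left( j + \tfrac{1}{2} \right) \tag{76} \]
\[ j = |k| - \tfrac{1}{2}, \qquad k = \pm 1, \pm 2, \pm 3, \ldots \tag{77} \]
Using the eigenvalue for K, we can simplify the Hamiltonian, which we could not do in the non-relativistic theory with the eigenvalue of L² alone. First
\[ \boldsymbol{\sigma}\cdot\mathbf{r}\ \boldsymbol{\sigma}\cdot(\mathbf{r}\times\nabla) = i\,\boldsymbol{\sigma}\cdot\left( \mathbf{r}\times(\mathbf{r}\times\nabla) \right) = i\,(\boldsymbol{\sigma}\cdot\mathbf{r})(\mathbf{r}\cdot\nabla) - i r^2\,\boldsymbol{\sigma}\cdot\nabla \tag{78} \]
Let now
\[ \epsilon = -i\alpha^1\alpha^2\alpha^3, \qquad \sigma_k = \epsilon\alpha^k \tag{79} \]
Then multiplying (78) by \( \epsilon^{-1} \) we get:
\[ -r^2\, i\,\boldsymbol{\alpha}\cdot\nabla = \boldsymbol{\alpha}\cdot\mathbf{r}\ \boldsymbol{\sigma}\cdot(\mathbf{r}\times\nabla) - i\,\boldsymbol{\alpha}\cdot\mathbf{r}\left( r\frac{\partial}{\partial r} \right) \]
Let \( \alpha_r = \frac{1}{r}\boldsymbol{\alpha}\cdot\mathbf{r} \); then by (40) and (42)
\[ -i\,\boldsymbol{\alpha}\cdot\nabla = \frac{1}{r}\alpha_r\left( i\,\boldsymbol{\sigma}\cdot\mathbf{J} - \tfrac{3}{2}i \right) - i\alpha_r\frac{\partial}{\partial r} = \frac{1}{r}\alpha_r\left( i\beta K - i \right) - i\alpha_r\frac{\partial}{\partial r} \]
Thus finally we can write (70) in the form
\[ H = mc^2\beta - \frac{e^2}{r} + i\hbar c\,\alpha_r\left( \frac{\beta K}{r} - \frac{1}{r} - \frac{\partial}{\partial r} \right) \tag{80} \]
This gives the Dirac equation as an equation in the single variable r, having separated all angular variables.
For the solution of this equation, see Dirac, Quantum Mechanics, Third Edition, Sec. 72, pp. 268–271.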
Solution of Radial Equation
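We may choose a two-component representation in which
\[ \beta = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad \alpha_r = \begin{pmatrix} 0 & i \\ -i & 0 \end{pmatrix}, \qquad \psi = \begin{pmatrix} u \\ v \end{pmatrix} \tag{81} \]
Then
\[ (E - mc^2)\,u = -\frac{e^2}{r}u + \hbar c\left( \frac{1+K}{r} + \frac{\partial}{\partial r} \right) v, \qquad (E + mc^2)\,v = -\frac{e^2}{r}v + \hbar c\left( -\frac{1-K}{r} - \frac{\partial}{\partial r} \right) u \tag{82} \]
Let now
\[ a_1 = \frac{-E + mc^2}{\hbar c}, \qquad a_2 = \frac{E + mc^2}{\hbar c}, \qquad \alpha = \frac{e^2}{\hbar c} \tag{83} \]
where α is the fine structure constant. Then
\[ \left( -a_1 + \frac{\alpha}{r} \right) u = \left( \frac{1+K}{r} + \frac{\partial}{\partial r} \right) v, \qquad \left( a_2 + \frac{\alpha}{r} \right) v = \left( -\frac{1-K}{r} - \frac{\partial}{\partial r} \right) u \tag{84} \]
Next put \( a = \sqrt{a_1 a_2} = \sqrt{m^2c^4 - E^2}/\hbar c \), which is the magnitude of the imaginary momentum of a free electron of energy E. Then \( \psi \sim e^{-ar} \) at infinity. Hence we write
\[ u = \frac{e^{-ar}}{r} f, \qquad v = \frac{e^{-ar}}{r} g \tag{85} \]
so that
\[ \left( \frac{\alpha}{r} - a_1 \right) f = \left( \frac{\partial}{\partial r} - a + \frac{k}{r} \right) g, \qquad \left( \frac{\alpha}{r} + a_2 \right) g = \left( -\frac{\partial}{\partial r} + a + \frac{k}{r} \right) f \tag{86} \]
Now we try solutions in series
\[ f = \sum c_s r^s, \qquad g = \sum d_s r^s \tag{87} \]
This gives
\[ \alpha c_s - a_1 c_{s-1} = -a d_{s-1} + (s+k)\,d_s, \qquad \alpha d_s + a_2 d_{s-1} = +a c_{s-1} + (-s+k)\,c_s \tag{88} \]
Putting \( e_s = a_1 c_{s-1} - a d_{s-1} \) we have
\[ e_s = \alpha c_s - (s+k)\,d_s = \frac{a_1}{a}\left( \alpha d_s + (s-k)\,c_s \right) \]
\[ c_s = \frac{a_1\alpha + a(s+k)}{a_1\alpha^2 + a_1(s^2 - k^2)}\,e_s, \qquad d_s = \frac{a\alpha - a_1(s-k)}{a_1\alpha^2 + a_1(s^2 - k^2)}\,e_s \]
\[ e_{s+1} = \frac{(a_1^2 - a^2)\,\alpha + 2s\,a a_1}{a_1\alpha^2 + a_1(s^2 - k^2)}\,e_s \]
Suppose the series do not terminate. Then for large s
\[ \frac{e_{s+1}}{e_s} \approx \frac{c_{s+1}}{c_s} \approx \frac{2a}{s}, \quad \text{hence} \quad f \approx \exp(2ar) \]
This is permissible when a is imaginary. Thus there is a continuum of states with
\[ E > mc^2 \tag{89} \]
For real a the series must terminate at both ends in order not to blow up at infinity. Suppose then that \( e_s \) is non-zero for
\[ s = \epsilon + 1,\ \epsilon + 2,\ \ldots,\ \epsilon + n, \qquad n \geq 1 \tag{90} \]
and otherwise zero. This gives
\[ \alpha^2 + \epsilon^2 - k^2 = 0, \qquad (a_1^2 - a^2)\,\alpha + 2(\epsilon + n)\,a a_1 = 0 \]
Now not both \( c_\epsilon \) and \( d_\epsilon \) are zero, thus the wave function \( r^{-1+\epsilon} \) must be integrable at zero. This gives \( \epsilon > -\tfrac{1}{2} \). But \( \epsilon = \pm\sqrt{k^2 - \alpha^2} \). Now \( k^2 \geq 1 \), hence \( \sqrt{k^2 - \alpha^2} > \tfrac{1}{2} \), and
\[ \epsilon = +\sqrt{k^2 - \alpha^2} \tag{91} \]
Also
\[ (\epsilon + n)^2 = \left( \frac{a_1^2 - a^2}{2 a a_1} \right)^2 \alpha^2 = \frac{E^2\alpha^2}{m^2c^4 - E^2} \]
Hence in this case
\[ E = \frac{mc^2}{\sqrt{1 + \dfrac{\alpha^2}{\left( n + \sqrt{k^2 - \alpha^2} \right)^2}}} \tag{92} \]
Given this positive value of E, the expression \( (a_1^2 - a^2) \) is negative, so it is allowable to square \( (\epsilon + n) \) to find these solutions without introducing any difficulties. Consequently, for each
\[ k = \pm 1, \pm 2, \pm 3, \ldots, \qquad n = 1, 2, 3, \ldots \tag{93} \]
solutions exist, with E given by (92).
The alternative possibility is that all \( e_s \) are zero. Suppose not both of \( c_\epsilon \) and \( d_\epsilon \) are zero. Then \( \alpha^2 + \epsilon^2 - k^2 = 0 \) as before and so \( \epsilon = \sqrt{k^2 - \alpha^2} \). But now
\[ a_1 c_\epsilon - a d_\epsilon = 0, \qquad \alpha c_\epsilon - (\epsilon + k)\,d_\epsilon = 0 \]
Hence \( a\alpha - a_1(\epsilon + k) = 0 \), and k must be positive to make \( \epsilon + k = \sqrt{k^2 - \alpha^2} + k > 0 \). After this the solution goes as before. So solutions (92) exist for
\[ n = 0, \qquad k = +1, +2, +3, \ldots \tag{94} \]
The principal quantum number N is \( N = n + |k| \). Expanding in powers of α,
\[ E = mc^2\left[ 1 - \frac{1}{2}\frac{\alpha^2}{N^2} + \frac{\alpha^4}{N^3}\left( \frac{3}{8N} - \frac{1}{2|k|} \right) \right] \tag{95} \]
where the second term gives the non-relativistic levels and the third the fine structure.
There is exact degeneracy between the two states of a given |k|. The non-relativistic states correspond to j = ℓ ± 1/2, that is, |k| = j + 1/2.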
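The level formula can be evaluated numerically. The sketch below assumes the closed form \( E = mc^2 \big/ \sqrt{1 + \alpha^2/(n + \sqrt{k^2 - \alpha^2})^2} \) with principal quantum number N = n + |k|, and uses approximate values of the fine structure constant and of the electron rest energy; these numerical constants and the function names are assumptions of this illustration.

```python
import math

ALPHA_FS = 1 / 137.035999   # fine structure constant (approximate value)
MC2_EV = 510998.95          # electron rest energy in eV (approximate value)

def dirac_energy(n, k, alpha=ALPHA_FS, mc2=MC2_EV):
    """Bound-state energy for radial number n and angular number k.

    n = 0 is allowed only for positive k; n >= 1 is allowed for either sign.
    """
    if k == 0 or n < 0 or (n == 0 and k < 0):
        raise ValueError("no bound state for these quantum numbers")
    eps = math.sqrt(k * k - alpha * alpha)
    return mc2 / math.sqrt(1 + alpha**2 / (n + eps) ** 2)

# Ground state: N = 1, i.e. n = 0, k = +1
ground_binding_ev = MC2_EV - dirac_energy(0, 1)

# N = 2 levels: |k| = 1 (j = 1/2, two degenerate states) and k = 2 (j = 3/2)
e_j_half = dirac_energy(1, 1)
e_j_half_other = dirac_energy(1, -1)
e_j_three_halves = dirac_energy(0, 2)
fine_structure_split_ev = e_j_three_halves - e_j_half

print(f"ground-state binding energy: {ground_binding_ev:.4f} eV")
print(f"N = 2 fine-structure splitting: {fine_structure_split_ev:.3e} eV")
print("degenerate j = 1/2 pair:", e_j_half == e_j_half_other)
```

The N = 2 splitting agrees with the leading term \( mc^2\alpha^4/32 \) of the expansion in α. The exact degeneracy of the two j = 1/2 states follows because the energy depends on k only through k².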
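Behaviour of an Electron in a Non-Relativistic Approximation
Multiplying the Dirac equation (64) by \( \sum_\nu \gamma_\nu \left( \frac{\partial}{\partial x_\nu} + \frac{ie}{\hbar c}A_\nu \right) - \frac{mc}{\hbar} \) we have
\[ \sum_\mu \sum_\nu \gamma_\mu\gamma_\nu \left( \frac{\partial}{\partial x_\mu} + \frac{ie}{\hbar c}A_\mu \right) \left( \frac{\partial}{\partial x_\nu} + \frac{ie}{\hbar c}A_\nu \right) \psi - \frac{m^2c^2}{\hbar^2}\psi = 0 \tag{96} \]
Using \( \gamma_\mu^2 = 1 \) and \( \gamma_\mu\gamma_\nu + \gamma_\nu\gamma_\mu = 0 \) for μ ≠ ν, this gives
\[ \sum_\mu \left( \frac{\partial}{\partial x_\mu} + \frac{ie}{\hbar c}A_\mu \right)^2 \psi - \frac{m^2c^2}{\hbar^2}\psi + \frac{ie}{2\hbar c} \sum_\mu \sum_\nu \sigma_{\mu\nu} F_{\mu\nu}\,\psi = 0 \tag{97} \]
Here
\[ \sigma_{\mu\nu} = \tfrac{1}{2}\left( \gamma_\mu\gamma_\nu - \gamma_\nu\gamma_\mu \right), \qquad F_{\mu\nu} = \frac{\partial A_\nu}{\partial x_\mu} - \frac{\partial A_\mu}{\partial x_\nu} \]
Thus
\( F_{12} = H_3 \) is a magnetic field component;
\( F_{14} = i\,\frac{\partial\Phi}{\partial x_1} + \frac{i}{c}\frac{\partial A_1}{\partial t} = -iE_1 \) is an electric field component;
\( \sigma_{12} = i\sigma_3 \) is a spin component;
\( \sigma_{14} = i\alpha_1 \) is a velocity component.
Thus (97) becomes
\[ \sum_\mu \left( \frac{\partial}{\partial x_\mu} + \frac{ie}{\hbar c}A_\mu \right)^2 \psi - \frac{m^2c^2}{\hbar^2}\psi - \frac{e}{\hbar c}\left\{ \boldsymbol{\sigma}\cdot\mathbf{H} - i\,\boldsymbol{\alpha}\cdot\mathbf{E} \right\} \psi = 0 \tag{98} \]
This is still exact.
Now in the non-relativistic approximation \( i\hbar\,\partial/\partial t = mc^2 + O(1) \), so the time-derivative part of (98) factorizes and reduces to a first-order operator:
\[ \left( \frac{\partial}{\partial x_4} + \frac{ie}{\hbar c}A_4 \right)^2 - \frac{m^2c^2}{\hbar^2} = \frac{1}{\hbar^2c^2}\left( i\hbar\frac{\partial}{\partial t} + e\Phi - mc^2 \right)\left( i\hbar\frac{\partial}{\partial t} + e\Phi + mc^2 \right) \approx \frac{2m}{\hbar^2}\left( i\hbar\frac{\partial}{\partial t} + e\Phi - mc^2 \right) \]
The non-relativistic approximation means dropping the terms \( O(1/mc^2) \). Thus the non-relativistic Schrödinger equation is
\[ i\hbar\frac{\partial\psi}{\partial t} = \left\{ mc^2 - e\Phi - \frac{\hbar^2}{2m}\sum_{k=1}^{3}\left( \frac{\partial}{\partial x_k} + \frac{ie}{\hbar c}A_k \right)^2 + \frac{e\hbar}{2mc}\left( \boldsymbol{\sigma}\cdot\mathbf{H} - i\,\boldsymbol{\alpha}\cdot\mathbf{E} \right) \right\} \psi \tag{99} \]
The term \( \boldsymbol{\alpha}\cdot\mathbf{E} \) is really relativistic, and should be dropped or treated more exactly. We then have exactly the equation of motion of a non-relativistic particle with a spin magnetic moment equal to
\[ \mathbf{M} = -\frac{e\hbar}{2mc}\boldsymbol{\sigma} \tag{100} \]
This is one of the greatest triumphs of Dirac, that he got this magnetic moment right out of his general assumptions without any arbitrariness.
It is confirmed by measurements to about one part in 1000. Note that the most recent experiments show a definite discrepancy, and agree with the value
\[ \mathbf{M} = -\frac{e\hbar}{2mc}\boldsymbol{\sigma}\left\{ 1 + \frac{e^2}{2\pi\hbar c} \right\} \tag{101} \]
calculated by Schwinger using the complete many-particle theory.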
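The size of the correction is easy to evaluate. The following sketch assumes the factor \( 1 + e^2/(2\pi\hbar c) = 1 + \alpha/2\pi \) multiplying the Dirac moment and an approximate numerical value of the fine structure constant; the function name is hypothetical.

```python
import math

ALPHA_FS = 1 / 137.035999  # fine structure constant e^2 / (hbar c), approximate

def schwinger_factor(alpha=ALPHA_FS):
    """Ratio of the corrected magnetic moment to the Dirac value e*hbar/(2mc)."""
    return 1 + alpha / (2 * math.pi)

factor = schwinger_factor()
relative_shift = factor - 1
print(f"moment in units of e*hbar/2mc: {factor:.5f}")
print(f"relative correction: about {relative_shift * 1000:.2f} parts in 1000")
```

The correction is slightly more than one part in 1000, which is consistent with the Dirac value being confirmed only to about that accuracy.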
Problem 2. Calculate energy values and wave functions of a Dirac particle moving in a homogeneous infinite magnetic field. This can be done exactly. See F. Sauter, Zeitschrift für Physik 69 (1931) 742.
Take the field B in the z direction.
The second-order Dirac equation (98) gives for a stationary state of energy ±E
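Taking a representation with \( \sigma_z \) diagonal, this splits at once into two states with \( \sigma_z = \pm 1 \). Also
\( L_z = -i\hbar\left( x\,\frac{\partial}{\partial y} - y\,\frac{\partial}{\partial x} \right) \) is a constant of the motion, say \( L_z = \ell\hbar \) where ℓ is an integer. And \( -i\hbar\,\frac{\partial}{\partial z} = p_z \).
This is an eigenvalue problem with eigenvalues of a two-dimensional harmonic oscillator.
\[ E = \sqrt{m^2c^4 + c^2p_z^2 + M\,|eB\hbar c|} \qquad \text{with } M = 0, 1, 2, \ldots \]
The lowest state has energy exactly \( mc^2 \).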
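Summary of Matrices in the Dirac Theory in Our Notation
We use the following representation: the Pauli matrices \( \sigma_k \) of (14), with \( \alpha^k \) and β as in (15) and \( \gamma_\mu \) as in (32).
Comparison with the Dirac notation: \( \rho_1 = \epsilon,\ \rho_2 = \eta,\ \rho_3 = \beta \). Latin indices: 1, 2, 3. Greek indices: 1, 2, 3, 4.
Summary of Matrices in the Dirac Theory in the Feynman Notation
\( \alpha_k\alpha_\ell + \alpha_\ell\alpha_k = 2\delta_{k\ell} I \)
\( \alpha_k\beta + \beta\alpha_k = 0 \)
\( g_{00} = +1,\quad g_{kk} = -1,\quad g_{\mu\nu} = 0 \ (\mu \neq \nu) \)
\( \sigma_k\sigma_\ell + \sigma_\ell\sigma_k = 2\delta_{k\ell} I \)
\( \beta^2 = I \)
\( \gamma_k = \beta\alpha_k,\quad \alpha_k = \beta\gamma_k,\quad \gamma_0 = \beta \)
\( \gamma_\mu\gamma_\nu + \gamma_\nu\gamma_\mu = 2g_{\mu\nu} I \)
\( (\gamma_k)^* = -\gamma_k \)
\( \alpha_k\gamma_\ell - \gamma_\ell\alpha_k = -2\delta_{\ell k}\beta \)
\( \gamma_5 = i\gamma_0\gamma_1\gamma_2\gamma_3 \)
\( \gamma_\mu\gamma_5 + \gamma_5\gamma_\mu = 0 \)
\( \alpha_k\gamma_5 - \gamma_5\alpha_k = 0 \)
\( \gamma_5^2 = -I \)
Representation: the Pauli matrices \( \sigma_k \) as in (14).
k, ℓ, m = (1, 2, 3) cyclically permuted. Latin indices: 1, 2, 3. Greek indices: 0, 1, 2, 3.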
General Discussion
The scattering of a Dirac particle by a potential can be treated exactly by finding the continuum solutions of the Dirac equation. This is a complicated business even for the simplest case of a Coulomb force. It was done by Mott in a 1932 paper in the Proceedings of the Royal Society A.
For most purposes in relativistic problems, and always when the scattering is produced by complicated effects involving radiation theory, the Born approximation is used. That is, the scattering is treated only to first order in the interaction, or only to some definite order of interest. The formula for scattering from an initial state A to a final state B lying in a continuum of states is the transition probability per unit time,
\[ w = \frac{2\pi}{\hbar}\,\rho_E\,|V_{BA}|^2 \tag{102} \]
This you ought to know. \( \rho_E \) is the density of final states per unit energy interval.
\( V_{BA} \) is the matrix element of the potential V for the transition. Here V may be anything, and may itself be a second-order or higher-order effect obtained from higher-order perturbation theory.
The difficulties in real calculations usually come from the factors 2 and π and from the correct normalization of states. I shall always normalize continuum states not in the usual way (one particle per unit volume), which is non-invariant, but instead as follows:
\[ \text{One particle per volume } \frac{mc^2}{|E|} \tag{103} \]
where |E| is the energy of the particles. Then under a Lorentz transformation the volume of a fixed region transforms like 1/|E| and so the definition stays invariant.
Thus a continuum state given by the spinor \( \psi = u \exp\{ (i\,\mathbf{p}\cdot\mathbf{x} - iEt)/\hbar \} \) is normalized so that
\[ u^* u = \frac{|E|}{mc^2} \tag{104} \]
Now if we multiply the Dirac equation for a free particle, (44), by \( \bar{u} \) on the left, we get \( E u^*\beta u = c\,u^*\beta\boldsymbol{\alpha}\cdot\mathbf{p}\,u + mc^2 u^* u \); its complex conjugate is
\( E u^*\beta u = -c\,u^*\beta\boldsymbol{\alpha}\cdot\mathbf{p}\,u + mc^2 u^* u \), since βα is anti-Hermitian; then by adding we get
\[ E\,\bar{u}u = mc^2\,u^* u \tag{105} \]
Therefore the normalization becomes
\[ \bar{u}u = +1 \ \text{for electron states}, \qquad \bar{u}u = -1 \ \text{for positron states}; \qquad \bar{u}u = \epsilon \tag{106} \]
This is the definition of ε.
With this normalization the density of states in momentum space is one per volume h³ of phase space, that is to say
\[ \rho = \frac{1}{h^3}\,\frac{mc^2}{|E|}\,dp_1\,dp_2\,dp_3 \tag{107} \]
per volume \( dp_1\,dp_2\,dp_3 \) of momentum space, for each direction of spin and each sign of charge. This again involves an invariant differential,
\[ \frac{dp_1\,dp_2\,dp_3}{|E|} \tag{108} \]
Projection Operators
Usually we are not interested in the spin of an intermediate, initial or final state. We therefore have to make sums over spin states, which are of the form
\[ S = \sum_{2} (\bar{s}\,O\,u)(\bar{u}\,P\,r) \tag{109} \]
where O and P are some kind of operators, s and r some kind of spin states, and the sum is over the two spin states u of an electron of momentum p and energy E.
Let us write
\[ \not{p} = \sum_\mu p_\mu\gamma_\mu, \qquad p_4 = iE/c \tag{110} \]
The Dirac equation satisfied by u is
\[ (\not{p} - imc)\,u = 0 \tag{111} \]
The two spin states with momentum 4-vector −p satisfy
\[ (\not{p} + imc)\,u = 0 \tag{112} \]
As one can easily show from (48), these 4 states are all orthogonal in the sense that \( (\bar{u}'u) = 0 \) for each pair u′, u. Therefore the identity operator may be written in the form
\[ I = \sum_{4} (u\bar{u})\,\epsilon \tag{113} \]
summed over all 4 states, with ε defined as earlier. Hence by (111) and (112) we can write (109) as
\[ S = \sum_{4} \left( \bar{s}\,O\,\frac{\not{p} + imc}{2imc}\,\epsilon\,u \right)(\bar{u}\,P\,r) = (\bar{s}\,O\,\Lambda_+ P\,r) \tag{114} \]
by virtue of (113); here the operator
\[ \Lambda_+ = \frac{\not{p} + imc}{2imc} \tag{115} \]
is a projection operator for electrons of momentum p.
In the same way for a sum over the two positron states u with momentum p and energy E,
\[ S = \sum_{2} (\bar{s}\,O\,u)(\bar{u}\,P\,r) = (\bar{s}\,O\,\Lambda_- P\,r) \tag{116} \]
with
\[ \Lambda_- = \frac{\not{p} - imc}{2imc} \tag{117} \]
and we have
\[ \Lambda_+ - \Lambda_- = I \tag{118} \]
These projection operators are covariant. In Heitler the business is done in a different way which makes them non-covariant and more difficult to handle.
Note that here charge-conjugate wave functions are not used. The positrons of momentum p are represented by the electron wave functions u of momentum −p and energy −E.
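The algebraic properties of these operators can be checked numerically. The sketch below assumes \( \not{p} = \sum_\mu p_\mu\gamma_\mu \) with \( p_4 = iE/c \), the matrices \( \gamma_k = -i\beta\alpha^k \), \( \gamma_4 = \beta \) in the representation (15), \( \Lambda_\pm = (\not{p} \pm imc)/(2imc) \), and units with m = c = 1; helper names are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def lin(*terms):
    """Linear combination sum(coef * matrix) of 4x4 matrices."""
    return [[sum(c * m[i][j] for c, m in terms) for j in range(4)]
            for i in range(4)]

def is_close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))

def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

Z2 = [[0, 0], [0, 0]]
I2 = [[1, 0], [0, 1]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
ALPHA = [block(Z2, s, s, Z2) for s in PAULI]
BETA = block(I2, Z2, Z2, [[-1, 0], [0, -1]])
I4 = block(I2, Z2, Z2, I2)
ZERO4 = [[0] * 4 for _ in range(4)]
GAMMA = [lin((-1j, matmul(BETA, a))) for a in ALPHA] + [BETA]

def projectors(p, m=1.0, c=1.0):
    """Return (Lambda_plus, Lambda_minus) for an on-shell momentum p."""
    energy = math.sqrt(m**2 * c**4 + c**2 * sum(x * x for x in p))
    p4 = 1j * energy / c
    pslash = lin((p[0], GAMMA[0]), (p[1], GAMMA[1]),
                 (p[2], GAMMA[2]), (p4, GAMMA[3]))
    inv = 1 / (2j * m * c)
    lam_plus = lin((inv, pslash), (0.5, I4))    # (pslash + imc) / (2imc)
    lam_minus = lin((inv, pslash), (-0.5, I4))  # (pslash - imc) / (2imc)
    return lam_plus, lam_minus

lam_plus, lam_minus = projectors((0.3, -0.4, 0.5))
print(is_close(matmul(lam_plus, lam_plus), lam_plus))            # idempotent
print(is_close(matmul(lam_plus, lam_minus), ZERO4))              # orthogonal
print(is_close(lin((1, lam_plus), (-1, lam_minus)), I4))         # difference is I
print(is_close(matmul(lam_minus, lam_minus), lin((-1, lam_minus))))
```

With these definitions \( \Lambda_+ \) is idempotent, while \( \Lambda_- \) squares to \( -\Lambda_- \); the minus sign reflects the convention \( \bar{u}u = -1 \) for the negative-energy states.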
Calculation of Traces
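Suppose we have to calculate an expression such as
\[ \tfrac{1}{2}\sum_{I}\sum_{F} (\bar{u}_F\,O\,u_I)(\bar{u}_I\,O\,u_F) \]
summed over electron states only. This gives
\[ \tfrac{1}{2}\sum (\bar{u}_F\,O\,\Lambda_+ O\,\Lambda_+ u_F)\,\epsilon \]
summed over all four spin states \( u_F \). To calculate this, let us consider the general expression
\[ \sum_u \epsilon\,(\bar{u}\,Q\,u) \]
summed over all 4 spin states, where Q is any 4×4 matrix.
Let Q have the eigenvectors \( w_1, w_2, w_3, w_4 \) with eigenvalues \( \lambda_1, \lambda_2, \lambda_3, \lambda_4 \). Then, expanding in these eigenvectors and using (113),
\[ \sum_u \epsilon\,(\bar{u}\,Q\,u) = \sum_k \lambda_k \]
Now \( \sum\lambda \) = sum of diagonal elements of Q = Trace of Q. Thus
\[ \sum_u \epsilon\,(\bar{u}\,Q\,u) = \operatorname{Tr} Q \]
and this is always easy to calculate.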
Problem 3. Given a steady potential V, a function of position, and a beam of incident particles (electrons), solve the Schrödinger equation in the Born approximation
(a) By stationary perturbation theory.
(b) By time-dependent perturbation theory.
Show that the results agree, with a transition probability per unit time given by \( w = (2\pi/\hbar)\,\rho_E\,|V_{BA}|^2 \). Evaluate the cross section in the case \( V = -Ze^2/r \), averaging the spin over the initial state and summing over the final state. Then repeat the calculation for particles obeying the Klein–Gordon equation, leaving out the V² term, by either method. Compare the angular distributions in the two cases.
Problem 4. A nucleus (\( \mathrm{O}^{16} \)) has an even j = 0 ground state and an even j = 0 excited state at 6 MeV. Calculate the total rate of emission of pairs, and the angular and momentum distributions.
Let ∆E be the excitation energy, and let \( \rho_N \) and \( \mathbf{j}_N \) be the charge and current density operators of the nucleus. For the transition under consideration, \( \rho_N \) and \( \mathbf{j}_N \) are functions of the position r, with the time variation of the single matrix element given by \( \exp\{-i\Delta E\,t/\hbar\} \). Also
\[ \nabla\cdot\mathbf{j}_N = -\frac{\partial\rho_N}{\partial t} = i\,\frac{\Delta E}{\hbar}\,\rho_N \tag{119} \]
The electrostatic potential V of the nucleus has the matrix element given by
\[ \nabla^2 V = -4\pi\rho_N \tag{120} \]
The states being spherically symmetric, \( \rho_N \) is a function of r only, and so the general solution of Poisson’s equation simplifies to
Outside the nucleus \( V(r) = Ze^2/r \) is constant in time, and so the matrix element of V(r) for this transition is zero. In fact from (119) and (120) we get by integration
\[ V(r) = \frac{\hbar}{i\Delta E}(-4\pi)(-r)\,j_{No}(r) = \frac{4\pi r\hbar}{i\Delta E}\,j_{No}(r) \tag{122} \]
where \( j_{No} \) is the outward component of the current.
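The interaction which creates pairs is then
\[ I = \int \frac{4\pi r\hbar}{i\Delta E}\,j_{No}(r)\,\left( -e\,\psi^*\psi(r) \right) d\tau \tag{123} \]
As an approximation consider the de Broglie wavelengths of all pairs long compared with the nuclear size. Then
\[ I = \psi^*\psi(0)\,\frac{4\pi\hbar e\,i}{\Delta E} \int r\,j_{No}(r)\,d\tau \tag{124} \]
The constant \( \int r\,j_{No}(r)\,d\tau \) is not known exactly. As a rough estimate, suppose that the nucleus of charge Ze is uniformly spread over a sphere of radius \( r_o \), in the ground state and also in the excited state. Since \( \rho_N \) is then roughly uniform inside the nucleus, integrating (119) gives \( \mathbf{j}_N = \frac{i\Delta E}{3\hbar}\,\mathbf{r}\,\rho_N \), and thus
\[ I = \psi^*\psi(0)\left( \frac{-4\pi e}{3} \right) \int r^2\rho_N(r)\,d\tau = \psi^*\psi(0)\left( \frac{-4\pi e}{3} \right) Q \tag{125} \]
Q is roughly a measure of the charge-moment of inertia of the nucleus, and is equal to \( \tfrac{3}{5}Ze\,r_o^2 \). Thus
\[ I = -\frac{4\pi Ze^2}{5}\,r_o^2\,\left\{ \psi^*\psi(0) \right\} \tag{126} \]
So the problem is just to compute the probabilities of pair emission with this interaction. Note that real radiation is strictly forbidden in a 0–0 transition, and so these pairs are actually observed in the reaction
\[ p + \mathrm{F}^{19} \to \mathrm{O}^{*16} + \alpha \to \mathrm{O}^{16} + e^+ + e^- + \alpha \tag{127} \]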
Is it correct to take for the interaction just
V(r) (−eψ ∗ ψ)dτ taking the Coulomb potential of the nuclear charge and ignoring all electro- dynamic effects? Yes Because in general the interaction would be
\(\ldots\,)\,d\tau\)   (128)
where ϕ, \(A_k\) are the scalar and vector potentials satisfying the Maxwell equations
\(\ldots = -\dfrac{4\pi}{c}\,\mathbf{j}_N\)
The matrix element of the interaction (128) is unchanged by any gauge transformation of the (A, ϕ). Therefore we may take the gauge in which
\(\nabla\cdot\mathbf{A} = 0\)
Incidentally, since ϕ = V(r), the second Maxwell equation reduces to
Now, since there is no free radiation present, also \(\nabla\times\mathbf{A} = 0\), and hence \(\mathbf{A} = 0\) in this gauge, and therefore we can indeed ignore all electrodynamic effects.
Let us calculate then the probability of pair emission with the interaction (126). A typical final state has an electron of momentum \(\mathbf{p}_1\) and a positron of momentum \(\mathbf{p}_2\), with energies \(E_1, E_2\) and spins \(u_1, u_2\) respectively. For the creation of this pair the matrix element of I is
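The density of final states is by (107)
\[ \frac{1}{(2\pi\hbar)^6}\,\frac{m^2c^4}{E_1E_2}\,p_1^2\,dp_1\,d\omega_1\,p_2^2\,dp_2\,d\omega_2 \qquad (129) \]
where \(d\omega_1\) and \(d\omega_2\) are the solid angles for \(\mathbf{p}_1\) and \(\mathbf{p}_2\). The creation probability per unit time is thus by (102) \(w = 2\pi\,\ldots\)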
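(130) Now fixing \(\mathbf{p}_1\),
\[ \frac{dp_2}{d(E_1+E_2)} = \frac{dp_2}{dE_2} = \frac{E_2}{c^2p_2} \]
\[ \ldots = -1 + \frac{\mathbf{p}_1\cdot\mathbf{p}_2}{m^2c^2} + \frac{E_1E_2}{m^2c^4} = \frac{E_1E_2 - m^2c^4 + c^2p_1p_2\cos\theta}{m^2c^4} \]
where θ is the angle between the pair. Then writing in (130)
\[ dE_1 = dp_1\,\frac{c^2p_1}{E_1}, \qquad d\omega_1 = 4\pi, \qquad d\omega_2 = 2\pi\sin\theta\,d\theta \]
we obtain the differential probability in \(E_1\) and θ, \(w_o = 4Z^2e^4r_o^4\,\ldots\)
Since \(\Delta E = 6\ \mathrm{MeV} = 12\,mc^2\), we can to a good approximation treat all particles as extreme relativistic. Thus
\[ w_o = \frac{4Z^2e^4r_o^4}{25\pi c^6\hbar^7}\,E_1^2E_2^2\,dE_1\,(1+\cos\theta)\sin\theta\,d\theta \qquad (132) \]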
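So the pairs have an angular distribution concentrated in the same hemisphere, and predominantly equal energies. Then, since
\[ \int_0^{\Delta E}\!\!\int_0^{\pi} E_1^2E_2^2\,(1+\cos\theta)\sin\theta\,d\theta\,dE_1 = \frac{1}{15}(\Delta E)^5 \]
the total creation probability per unit time is
\[ w_T = \frac{4Z^2e^4r_o^4}{25\pi\hbar^7c^6}\,\frac{(\Delta E)^5}{15} \]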
10, since \(r_o = 4\times10^{-13}\) cm. Hence the lifetime will be \(\tau = 15\times25\pi\times10^{5}\times17^{2}\times 1\)
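The numerical factor 1/15 used in passing from the differential to the total probability can be checked exactly. The sketch below assumes the extreme relativistic limit stated above, so that \(E_1 + E_2 = \Delta E\), and works in units of ∆E; the helper name `integrate_poly` is only illustrative.

```python
from fractions import Fraction

def integrate_poly(coeffs, a, b):
    """Exactly integrate sum(coeffs[n] * x**n) from a to b using rationals."""
    a, b = Fraction(a), Fraction(b)
    return sum(Fraction(c) * (b ** (n + 1) - a ** (n + 1)) / (n + 1)
               for n, c in enumerate(coeffs))

# Energy integral in units of Delta E: with x = E_1 / Delta E and E_2 = Delta E - E_1,
# E_1^2 E_2^2 dE_1 becomes (Delta E)^5 * (x^2 - 2 x^3 + x^4) dx on [0, 1].
energy_part = integrate_poly([0, 0, 1, -2, 1], 0, 1)

# Angular integral: with u = cos(theta), (1 + cos theta) sin theta d(theta)
# becomes (1 + u) du on [-1, 1].
angular_part = integrate_poly([1, 1], -1, 1)

total = energy_part * angular_part
print(energy_part, angular_part, total)
```

The product of the two factors is the coefficient of \((\Delta E)^5\) in the total creation probability.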
Scattering of Two Electrons in Born Approximation
We calculate the transition scattering matrix element M between an initial state A, consisting of two electrons with momenta \(p_1, p_2\) and spin states \(u_1, u_2\), and a final state B, consisting of two electrons with momenta \(p'_1, p'_2\) and spin states \(u'_1, u'_2\). Thus M gives the probability amplitude for arriving in state B after a long time when the system is known to be in state A to begin with. Hence M itself should be an invariant relativistically.
In the Born approximation we treat the interaction as weak, so that the particles go directly from the free-particle state A to the free-particle state B through a single application of the interaction operator. For electrons at reasonably high or relativistic velocities this is a very good approximation, since the relevant expansion parameter \(e^2/\hbar v\) is then small.
Also we treat the electromagnetic interaction classically, just as in the pair-creation problem above: the field produced by particle 1 is taken according to the classical Maxwell equations and acts directly on particle 2. This ignores the fact that the field consists of quanta. However, as we shall see when we develop the quantum field theories, this simplification does not lead to significant errors so long as we are in the Born approximation.
For the field produced by particle 1 in a transition from the state \((p_1, u_1)\) to \((p'_1, u'_1)\), we have the matrix elements \(\phi_{(1)}\) and \(\mathbf{A}_{(1)}\), say. We now use not the gauge in which \(\nabla\cdot\mathbf{A} = 0\), but the covariant gauge in which \(\sum_\mu \partial A_\mu/\partial x_\mu = 0\).
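So using covariant notations we have in this gauge
\[ \sum_\nu \frac{\partial^2}{\partial x_\nu^2}\,A_{\mu(1)} = +4\pi e\,s_{\mu(1)} \quad (\text{charge is } -e) \qquad (136) \]
\[ s_{\mu(1)} = i\left(\bar u'_1\gamma_\mu u_1\right)\exp\left\{\sum_\nu \frac{i}{\hbar}\left(p_{1\nu} - p'_{1\nu}\right)x_\nu\right\} \]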
The effect of the field (138) on particle 2 is given by the interaction term in the Dirac equation for particle 2
This gives for particle 2, for the transition from state \(p_2, u_2\) to \(p'_2, u'_2\), a transition matrix element
\(\ldots\,\psi_2\)   (141)
a 3-dimensional integral over space at the time t, say. For the total transition matrix element M by first order perturbation method
\(\ldots\,\psi_2\)   (142)
where the 4-fold integral is over \(dx_1\,dx_2\,dx_3\,dx_0\), with \(x_0 = ct\). Putting in the values of \(A_{\mu(1)}\), \(\psi'_2\) and \(\psi_2\), we get
The exchange process, in which the particle \((p_1, u_1)\) goes to \((p'_2, u'_2)\) and vice versa, contributes to M with a minus sign, because the wave function must be antisymmetric between the two particles.
This covariant formula is elegant and easy to arrive at. The question now is, how does one go from such a formula to a cross-section?
Generally, suppose in such a 2-particle collision process the transition matrix is
\[ M = K\,(2\pi\hbar)^4\,\delta^4(p_1 + p_2 - p'_1 - p'_2) \qquad (145) \]
What then is the cross-section expressed in terms of K? We do this calculation once here, so that later we can stop as soon as we have found a formula for M of the type (145), the form in which such formulae come conveniently out of, for example, radiation theory.
Relation of Cross-sections to Transition Amplitudes
Let w be the transition probability per unit volume and per unit time. This is related to the transition probability for a single final state, which is
\[ w_s = c\,|K|^2\,(2\pi\hbar)^4\,\delta^4(p_1 + p_2 - p'_1 - p'_2) \qquad (146) \]
since in \(|M|^2\) one of the two \((2\pi\hbar)^4\,\delta^4(p_1 + p_2 - p'_1 - p'_2)/c\) factors represents merely the volume of space-time in which the interaction can occur. The number of final states is by (107)
\[ \frac{1}{(2\pi\hbar)^6}\,\frac{mc^2}{|E'_1|}\,\frac{mc^2}{|E'_2|}\,dp'_{11}\,dp'_{12}\,dp'_{13}\,dp'_{21}\,dp'_{22}\,dp'_{23} \qquad (147) \]
Multiplying (146) by (147) gives the total transition probability
\[ w = |K|^2\,\frac{1}{(2\pi\hbar)^2}\,\frac{m^2c^4}{E'_1E'_2}\,c\,\delta^4(p_1 + p_2 - p'_1 - p'_2)\,dp'_{11}\,dp'_{12}\,dp'_{13}\,dp'_{21}\,dp'_{22}\,dp'_{23} \]
Now
\[ \delta^4(p_1 + p_2 - p'_1 - p'_2) = \delta^3(\mathbf{p}_1 + \mathbf{p}_2 - \mathbf{p}'_1 - \mathbf{p}'_2)\;c\,\delta(E_1 + E_2 - E'_1 - E'_2) \]
and the integration over \(d\mathbf{p}'_2\) gives then by the momentum conservation
\[ w = |K|^2\,\frac{c^2}{(2\pi\hbar)^2}\,\frac{m^2c^4}{E'_1E'_2}\,\delta(E_1 + E_2 - E'_1 - E'_2)\,dp'_{11}\,dp'_{12}\,dp'_{13} \qquad (148a) \]
Furthermore, if f(a) = 0, we have \(f(x) = f(a) + f'(a)(x - a) = f'(a)(x - a)\) and thus
\[ \delta(f(x)) = \delta\{f'(a)(x - a)\} = \frac{\delta(x - a)}{f'(a)} \]
Applying this to (148a) with \(f(x) = f(p'_{13}) = E_1 + E_2 - E'_1 - E'_2\) and \(a = (p'_{13})_c\) = the value of \(p'_{13}\) giving momentum and energy conservation, we get
\[ \delta(E_1 + E_2 - E'_1 - E'_2) = \frac{1}{\dfrac{d(E_1 + E_2 - E'_1 - E'_2)}{dp'_{13}}}\;\delta\big(p'_{13} - (p'_{13})_c\big) \]
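The delta-function rule used here can be checked numerically. The sketch below replaces the delta function by a narrow Gaussian (an assumption of the illustration, as are the test functions `f` and `g`), and compares the integral of \(g(x)\,\delta(f(x))\) with \(g(a)/|f'(a)|\). The absolute value covers the case of negative \(f'(a)\); for positive \(f'(a)\) it agrees with the form written above.

```python
import math

def nascent_delta(u, eps):
    """Gaussian of width eps; tends to the delta function as eps -> 0."""
    return math.exp(-0.5 * (u / eps) ** 2) / (eps * math.sqrt(2.0 * math.pi))

def f(x):
    return x ** 3 + x - 2.0      # single real root at a = 1, with f'(a) = 4

def g(x):
    return math.cos(x)           # arbitrary smooth test function

def integral_with_delta(eps, lo=0.0, hi=2.0, n=200_000):
    """Trapezoidal estimate of the integral of g(x) * delta_eps(f(x)) dx."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        x = lo + i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * g(x) * nascent_delta(f(x), eps)
    return total * h

a, fprime_a = 1.0, 4.0
numeric = integral_with_delta(eps=0.01)
expected = g(a) / abs(fprime_a)
print(numeric, expected)
```

The two printed numbers should agree up to corrections that shrink with the Gaussian width.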
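Choose a Lorentz-system in which \(\mathbf{p}_1\) and \(\mathbf{p}_2\) are both along the \(x_3\) direction, and take \(p'_{11}\) and \(p'_{12}\) as the variables over which the transition probability is taken. This is necessary for relativistic invariance. Then, with \(p'_{11}\) and \(p'_{12}\) fixed, momentum conservation gives \(dp'_{13} = -dp'_{23}\), and using \(dE/dp_3 = c^2p_3/E\) we get
\[ \left|\frac{d(E'_1 + E'_2)}{dp'_{13}}\right| = \left|\frac{dE'_1}{dp'_{13}} - \frac{dE'_2}{dp'_{23}}\right| = c^2\,\frac{|E'_2\,p'_{13} - E'_1\,p'_{23}|}{E'_1E'_2} \qquad (149) \]
Then the cross-section σ is defined in this system by
\[ \sigma = \frac{w\,V_1V_2}{|v_1 - v_2|} \qquad (150) \]
where \(V_1\) is the normalization volume for particle 1, and \(v_1\) its velocity. In fact by (103)
\[ V_1 = \frac{mc^2}{E_1}, \qquad V_2 = \frac{mc^2}{E_2}, \qquad v_1 - v_2 = \frac{c^2p_1}{E_1} - \frac{c^2p_2}{E_2} \]
Hence the cross-section becomes
\[ \sigma = \frac{w\,(mc^2)^2}{c^2\,|p_1E_2 - p_2E_1|} \]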
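The factor \(p_1E_2 - p_2E_1\) is invariant under Lorentz transformations that leave the \(x_1\) and \(x_2\) components unchanged, e.g. boosts parallel to the \(x_3\) axis. To prove this, we have to show that \(p_{13}E_2 - p_{23}E_1 = \tilde p_{13}\tilde E_2 - \tilde p_{23}\tilde E_1\), where the tilde denotes the quantities after the Lorentz transformation
\[ \tilde E = E\cosh\theta - cp\sinh\theta, \qquad \tilde p = p\cosh\theta - \frac{E}{c}\sinh\theta \]
Since \(E^2 = p^2c^2 + m^2c^4\), we can write
\[ E = mc^2\cosh\phi, \qquad pc = mc^2\sinh\phi \]
which makes
\[ \tilde E = mc^2\cosh(\phi - \theta), \qquad \tilde pc = mc^2\sinh(\phi - \theta) \]
and thus
\[ \tilde p_{13}\tilde E_2 - \tilde p_{23}\tilde E_1 = m^2c^3\{\sinh(\phi_1 - \theta)\cosh(\phi_2 - \theta) - \cosh(\phi_1 - \theta)\sinh(\phi_2 - \theta)\} = m^2c^3\sinh(\phi_1 - \phi_2) \]
independently of θ. Hence we see that σ is invariant under Lorentz transformations parallel to the \(x_3\) axis.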
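The invariance argument can be illustrated numerically. The following sketch boosts two collinear momenta along the \(x_3\) axis with the transformation quoted above and checks that \(p_1E_2 - p_2E_1\) is unchanged and equals \(m^2c^3\sinh(\phi_1 - \phi_2)\). Units with c = 1 and the particular momenta are assumptions of the example.

```python
import math

C = 1.0   # units with c = 1 (assumption of this sketch)
M = 1.0   # common rest mass of the two particles

def energy(p):
    """Relativistic energy from E^2 = p^2 c^2 + m^2 c^4."""
    return math.sqrt(p * p * C * C + M * M * C ** 4)

def boost(E, p, theta):
    """Boost along x3 with rapidity theta, in the form used in the text."""
    return (E * math.cosh(theta) - C * p * math.sinh(theta),
            p * math.cosh(theta) - (E / C) * math.sinh(theta))

def flux_factor(E1, p1, E2, p2):
    return p1 * E2 - p2 * E1

p1, p2 = 2.5, -0.7                      # hypothetical collinear momenta
E1, E2 = energy(p1), energy(p2)
before = flux_factor(E1, p1, E2, p2)

# Rapidities phi defined by pc = m c^2 sinh(phi).
phi1, phi2 = math.asinh(p1 / (M * C)), math.asinh(p2 / (M * C))
closed_form = M * M * C ** 3 * math.sinh(phi1 - phi2)

after = []
for theta in (-1.0, 0.3, 2.0):
    E1b, p1b = boost(E1, p1, theta)
    E2b, p2b = boost(E2, p2, theta)
    after.append(flux_factor(E1b, p1b, E2b, p2b))

print(before, closed_form, after)
```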
Results for Møller Scattering
One electron initially at rest, the other initially with energy \(E = \gamma mc^2\); \(\gamma = 1/\sqrt{1 - (v/c)^2}\)
scattering angle = θ in the lab system
= θ* in the center-of-mass system
Then the differential cross-section is (Mott and Massey, Theory of Atomic Collisions, 2nd ed., p. 368)
2 + (γ−1) sin 2 θ Without spin you get simply
The effect of spin is a measurable increase of scattering over the Mott formula. The effect of exchange is roughly the \(3/(1 - x^2)\) term. Positron-electron scattering is very similar. Only the exchange effect is different because of annihilation possibility.
Note on the Treatment of Exchange Effects
The correctly normalized initial and final states in this problem are
(154) where \(\psi_2(1)\) means the particle 2 in the state 1, and so on. With these states the matrix element M is exactly as we have calculated it including the exchange term.
The number of possible final states for two indistinguishable particles is half that of two distinguishable particles; however, this does not affect the differential cross-section since the density of antisymmetrical states matches the density of states for distinguishable particles in a specified momentum range Consequently, the differential cross-section does not include a factor of 1/2, while the total cross-section does, as each final state is counted only once when integrating over angles.
Relativistic Treatment of Several Particles
The Møller treatment of the interaction of two electrons succeeds because the field of particle 1 is calculated without taking account of the effect of particle 2 on particle 1. To do a better calculation, we must construct an equation of motion which follows the motions of both particles continuously in time and keeps them in step with each other. So we must have a Dirac equation for two electrons which takes exact account of their interaction by including in the equation the behaviour of the Maxwell field too.
This kind of 2-particle Dirac equation is no longer relativistically invariant, if we give each particle a separate position in space but all the same time.
Dirac proposed a many-time theory in which each electron has its own private time coordinate and satisfies its own private Dirac equation. This is all right in principle, but it becomes hopelessly complicated when pairs are created, since one then has equations with new time coordinates suddenly appearing and disappearing. In fact, the whole program of quantizing the electron theory as a theory of discrete particles, each with its private time, becomes nonsense when one is dealing with an infinite "sea" of particles. So we have come to the end of what we can do with the relativistic quantum theory of particles.
The trouble arose because the position r of a particle was represented by an operator, whereas the time t is merely a number. This asymmetry made the interpretation of the formalism essentially non-relativistic, even when the equations were formally invariant. In equations like the Klein-Gordon and Dirac equations, however, the space and time coordinates appear symmetrically, and this suggests a new viewpoint.
Relativistic quantum theory is the study of quantities ψ which are functions of four coordinates \((x_1, x_2, x_3, x_0)\), all the coordinates being ordinary numbers. Only the expressions containing ψ are operators describing the dynamical system.
The dynamical system is specified by the quantity ψ existing at all points of space-time, and so consists of a system of fields. Relativistic quantum theory is necessarily a field theory.
The process of reinterpreting a one-particle wave-function like the Dirac ψ as a quantized field operator is called Second Quantization.
Before we can begin on the program of constructing our quantum theory of fields, we must make some remarks about Classical Field Theory.
Classical Relativistic Field Theory
We take a field with components (vector, spinor etc.) labeled by a suffix α. Let
\[ \phi^\alpha_\mu = \frac{\partial\phi^\alpha}{\partial x_\mu} \]
The theory is fully described by an invariant function of position called the Lagrangian Density,
\[ L = L\big(\phi^\alpha(x),\ \phi^\alpha_\mu(x)\big) \]
a function of \(\phi^\alpha\) and its first derivatives at the point x. The behaviour of the field is fixed by the Action Principle. If Ω is any finite or infinite region of space-time, then
\[ I(\Omega) = \frac{1}{c}\int_\Omega L\,d^4x \qquad (157) \]
is stationary for the physically possible fields \(\phi^\alpha\). Thus the variation \(\phi^\alpha \to \phi^\alpha + \delta\phi^\alpha\) produces no change in I to first order in \(\delta\phi^\alpha\), if \(\delta\phi^\alpha\) is an arbitrary variation equal to zero on the boundary of Ω.
It is always assumed that L is at most quadratic in the \(\phi^\alpha_\mu\) and is in various other respects a well-behaved function.
Let Σ denote the boundary of Ω, and dσ an element of 3-dimensional volume on Σ. Let \(n_\mu\) be the outward unit vector normal to dσ, and write
\[ d\sigma_\mu = n_\mu\,d\sigma, \qquad \sum_\mu n_\mu^2 = -1, \qquad \mu = 1, 2, 3, 4, \qquad x_0 = ct \]
\[ d\sigma_\mu = (dx_2\,dx_3\,dx_0,\ dx_1\,dx_3\,dx_0,\ dx_1\,dx_2\,dx_0,\ -i\,dx_1\,dx_2\,dx_3) \qquad (158) \]
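So the principle of action gives the field equations
\[ \frac{\partial L}{\partial\phi^\alpha} - \sum_\mu \frac{\partial}{\partial x_\mu}\left(\frac{\partial L}{\partial\phi^\alpha_\mu}\right) = 0 \qquad (160) \]
defining the motion of the fields.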
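A lattice caricature may help to see what the Action Principle asserts. The sketch below is an illustration under stated assumptions, not part of the lectures: a single massless scalar field in one space dimension, unit lattice spacing in both x and t, and units with c = 1. For this discrete action the stationarity condition at an interior site is the lattice wave equation, which any function of (x − t) satisfies exactly. The first-order change of the action under a variation that vanishes on the boundary is therefore zero for such a field, and non-zero for a field that does not obey the field equation.

```python
import math

N = 12  # lattice sites in each of t and x; spacing 1 in both, units with c = 1

def action(phi):
    """Discrete action: sum of (1/2)[(time difference)^2 - (space difference)^2]."""
    total = 0.0
    for t in range(N):
        for x in range(N):
            if t + 1 < N:
                total += 0.5 * (phi[t + 1][x] - phi[t][x]) ** 2
            if x + 1 < N:
                total -= 0.5 * (phi[t][x + 1] - phi[t][x]) ** 2
    return total

def make_field(func):
    return [[func(t, x) for x in range(N)] for t in range(N)]

def first_variation(phi, eta, eps=0.1):
    """Symmetric difference [I(phi + eps*eta) - I(phi - eps*eta)] / (2*eps).
    For a quadratic action this is exactly the first-order coefficient."""
    plus = [[phi[t][x] + eps * eta[t][x] for x in range(N)] for t in range(N)]
    minus = [[phi[t][x] - eps * eta[t][x] for x in range(N)] for t in range(N)]
    return (action(plus) - action(minus)) / (2.0 * eps)

def bump(t, x):
    """Variation that is exactly zero on the boundary of the lattice region."""
    if t in (0, N - 1) or x in (0, N - 1):
        return 0.0
    return math.sin(math.pi * t / (N - 1)) * math.sin(math.pi * x / (N - 1))

eta = make_field(bump)
wave = make_field(lambda t, x: math.sin(0.4 * (x - t)))  # obeys the lattice wave equation
static = make_field(lambda t, x: float(x * x))           # does not obey it

dI_wave = first_variation(wave, eta)
dI_static = first_variation(static, eta)
print(dI_wave, dI_static)
```

The first number is zero up to rounding, while the second is of order one hundred for this choice of variation.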
The quantity \(\pi_\alpha\), formed from \(\partial L/\partial\phi^\alpha_\mu\) in (161), is the momentum conjugate to \(\phi^\alpha\), defined at x and with respect to the surface Σ.
A more general type of variation is made by varying not only the \(\phi^\alpha\) but also the boundary of Ω, each point x being moved to the position \(x + \delta x\), where δx is either constant or may vary over the surface. Writing \({}_N\phi^\alpha\) for the new \(\phi^\alpha\) and \({}_O\phi^\alpha\) for the old one, we have
\[ \delta\phi^\alpha(x) = {}_N\phi^\alpha(x + \delta x) - {}_O\phi^\alpha(x), \qquad \Delta\phi^\alpha(x) = {}_N\phi^\alpha(x) - {}_O\phi^\alpha(x) \qquad (162) \]
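Therefore under the joint variation
\[ c\,\delta I(\Omega) = \int_\Sigma \sum_\mu n_\mu\,\delta x_\mu\,L\,d\sigma + c\int_\Sigma \sum_\alpha \pi_\alpha(x)\,\Delta\phi^\alpha(x)\,d\sigma \]
the latter being true by (159) if we assume (160).
Now since by (162)
\[ \delta\phi^\alpha(x) = {}_N\phi^\alpha(x) + \sum_\mu \delta x_\mu\,{}_N\phi^\alpha_\mu(x) - {}_O\phi^\alpha(x) = \Delta\phi^\alpha + \sum_\mu \delta x_\mu\,{}_N\phi^\alpha_\mu(x) \]
hence we get finally
\[ \delta I(\Omega) = \int_\Sigma \left\{\sum_\alpha \pi_\alpha\,\delta\phi^\alpha + \sum_\mu\left(\frac{1}{c}\,n_\mu L - \sum_\alpha \phi^\alpha_\mu\,\pi_\alpha\right)\delta x_\mu\right\}d\sigma \qquad (163) \]
with all the new quantities on the RHS.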
In all physically significant cases the motion is uniquely determined by specifying the values of the \(\phi^\alpha\) on two space-like surfaces \(\sigma_2\) and \(\sigma_1\), which are the past and future boundaries of the volume Ω. A space-like surface is one on which every two points are outside each other’s light-cones, so that the fields can be fixed independently at every point.
Special case of non-relativistic theory: both \(\sigma_1\) and \(\sigma_2\) are just space at the times \(t_1\) and \(t_2\), and \(\delta x_\mu\) is ic times a displacement of the time by \(\delta t_1\) and \(\delta t_2\). Then we may write \(n_\mu = (0, 0, 0, i)\) and \(\pi_\alpha = \partial L/\partial\dot\phi^\alpha\), and the Hamiltonian is
\[ H = \int d\tau\left(\sum_\alpha \pi_\alpha\,\dot\phi^\alpha - L\right) \]
The classical theory's key aspect is that the Action Principle applies solely to variations that vanish on the boundary of the domain Ω This allows for the deduction of the impact on I(Ω) from variations that do not vanish on the boundary Each state of motion is defined by fixing independent field quantities, such as all fields on two space-like surfaces or all fields and their time-derivatives on one surface, which in turn determines the entire past and future of the motion through the field equations.
Field equations can be written in the Hamiltonian form \(\dot\phi^\alpha = \partial H/\partial\pi_\alpha\), \(\dot\pi_\alpha = -\partial H/\partial\phi^\alpha\).
(167) where we consider ψ and ψ* independent one-component fields.
3. Maxwell Field, four component \(A_\mu\), Fermi form,
5. Dirac Field interacting with Maxwell Field
\[ L_Q = L_D + L_M - \sum_\lambda ie\,A_\lambda\,\bar\psi\gamma_\lambda\psi \qquad (170) \]
here Q stands for quantum electrodynamics.
Problem 5. Work out these examples: derive the field equations, find the momentum conjugate to each component of the field, and construct the Hamiltonian function (the momenta and Hamiltonian being defined for the case of a flat space σ only). Verify that the Hamiltonian gives a correct canonical representation of the field equations as Hamiltonian equations of motion.
Quantum Relativistic Field Theory
Classical relativistic field theories have traditionally been quantized using the Hamiltonian form of the field equations and bringing in the commutation relations of non-relativistic quantum mechanics. For this approach see Wentzel's book. It is a very clumsy and complicated method, and it makes it difficult to show that the resulting theory is relativistic, because the Hamiltonian formalism is not covariant.
Just recently we learnt a much better way of doing it, which I shall now expound in these lectures. It is due to Feynman and Schwinger. (References: R. P. Feynman, Rev. Mod. Phys. 20 (1948) 367.)
It is relativistic all the way, and it is much simpler than the old methods. It is based directly on the Action Principle form of the classical theory which I have just given you, not the Hamiltonian form.
In quantum theory, the \(\phi^\alpha\) are operators defined at each point of space-time as before. They satisfy the same field equations as before, and this is ensured if we assume that the Action Principle
\[ \delta I(\Omega) = 0, \qquad I(\Omega) = \frac{1}{c}\int_\Omega L\big(\phi^\alpha, \phi^\alpha_\mu\big)\,d^4x \qquad (171) \]
holds for all variations \(\delta\phi^\alpha\) of the operators vanishing on the boundaries of Ω.
In quantum theory, the complementarity relations make it impossible to assign numerical values to all field operators throughout a physical motion. A state of motion is specified by giving numerical values to the \(\phi^\alpha\) on one space-like surface only, and the future of such a state cannot then be determined from the field equations, which are in general second-order differential equations. Consequently the classical action principle is not sufficient, and we must make some additional statement about the behaviour of δI for variations \(\delta\phi^\alpha\) which are not zero on the boundaries of Ω.
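A state of motion is specified by a space-like surface σ and a set of numerical values \(\phi'^\alpha\) for the eigenvalues which the operators \(\phi^\alpha\) on σ have in this state. The state is denoted by the Dirac ket vector \(|\phi'^\alpha, \sigma\rangle\). This is a special kind of state in which the \(\phi^\alpha\) on σ have eigenvalues; a general state is a linear combination of the \(|\phi'^\alpha, \sigma\rangle\) with various values of \(\phi'^\alpha\). Physically observable quantities are expressions such as the matrix element
\[ \langle\phi'^\alpha_1, \sigma_1|\,\phi^\beta(x)\,|\phi'^\alpha_2, \sigma_2\rangle \qquad (172) \]
of the field operator \(\phi^\beta(x)\) between the two states specified by \(\phi'^\alpha_1\) on \(\sigma_1\) and by \(\phi'^\alpha_2\) on \(\sigma_2\). In particular, the transition probability amplitude between the two states is
\[ \langle\phi'^\alpha_1, \sigma_1\,|\,\phi'^\alpha_2, \sigma_2\rangle \qquad (173) \]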
The squared modulus of this is the probability of finding the values \(\phi'^\alpha_1\) for the fields on \(\sigma_1\), in the motion which is determined by the fields being given the values \(\phi'^\alpha_2\) on \(\sigma_2\).
The Feynman Method of Quantization
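The Feynman method of quantizing the theory consists in writing down an explicit formula for the transition amplitude (173). Namely
\[ \langle\phi'^\alpha_1, \sigma_1\,|\,\phi'^\alpha_2, \sigma_2\rangle = N\sum_H \exp\left\{\frac{i}{\hbar}\,I_H(\Omega)\right\} \qquad (174) \]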
Here H denotes a History of the fields between \(\sigma_2\) and \(\sigma_1\), i.e. any set of classical functions \(\phi^\alpha(x)\) defined in the region Ω between \(\sigma_2\) and \(\sigma_1\), and taking the values \(\phi'^\alpha_1\) on \(\sigma_1\) and \(\phi'^\alpha_2\) on \(\sigma_2\). \(I_H(\Omega)\) is the value of I(Ω) calculated with these particular functions.
The sum \(\sum_H\) is taken over all possible histories, a continuously infinite sum whose exact mathematical definition is not easy to formulate. N is a normalization factor independent of the particular states considered, chosen so as to make the sum of the squares of the amplitudes from a given state to all other states equal to 1.
Feynman's formula is rooted in fundamental principles, leveraging Huygens' principle to bridge wave mechanics and wave optics This single formula has the power to quantize an entire theory, providing a framework to address any physical problem Notably, its applications extend beyond field theory to encompass ordinary non-relativistic quantum theory, offering a versatile tool for solving complex problems.
We just show that it gives the same results as the usual quantum mechanics.
For a discussion of the difficulties in defining the sum \(\sum_H\), and a method of doing it in simple cases, see C. Morette, Phys. Rev. 81 (1951) 848.
The most general Correspondence Principle follows from formula (174): the theory goes over into the classical theory in the limit \(\hbar \to 0\). For in this limit the exponential factor in (174) becomes an extremely rapidly oscillating function of the history H, except for that history for which I(Ω) is stationary. Therefore in the limit the sum \(\sum_H\) reduces to the contribution from the classical motion leading from \(\phi'^\alpha_2\) on \(\sigma_2\) to \(\phi'^\alpha_1\) on \(\sigma_1\), all other contributions interfering destructively. The classical motion is defined by the condition that δI(Ω) = 0 for all small variations of the \(\phi^\alpha\) between \(\sigma_2\) and \(\sigma_1\). This passage to the classical theory is precisely analogous to the passage from wave optics to geometrical optics when the wavelength of light is allowed to tend to zero. The WKB approximation is obtained by taking ħ small but not quite zero.
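The destructive interference of non-stationary contributions can be seen in a one-variable caricature of the sum over histories. The sketch below is only an illustration under stated assumptions: a single variable x stands in for the whole history, the "action" is \(S(x) = x^2/2\), and a broad Gaussian window is inserted to make the oscillatory integral converge. It compares the full integral of \(\exp(iS/\hbar)\) with the contribution of the stationary point alone, \(\sqrt{2\pi i\hbar/S''(0)}\), and shows that the two agree better as ħ decreases.

```python
import cmath
import math

def sum_over_paths(hbar, width=3.0, half_range=15.0, n=120_000):
    """Trapezoidal estimate of the integral of exp(i*S(x)/hbar) * window(x) dx,
    with S(x) = x**2 / 2 and a broad Gaussian window of the given width."""
    h = 2.0 * half_range / n
    total = 0.0 + 0.0j
    for i in range(n + 1):
        x = -half_range + i * h
        weight = 0.5 if i in (0, n) else 1.0
        action = 0.5 * x * x
        window = math.exp(-0.5 * (x / width) ** 2)
        total += weight * window * cmath.exp(1j * action / hbar)
    return total * h

def stationary_phase(hbar):
    """Contribution of the stationary point x = 0, where S''(0) = 1."""
    return cmath.sqrt(2j * math.pi * hbar)

def relative_error(hbar):
    full = sum_over_paths(hbar)
    classical = stationary_phase(hbar)
    return abs(full - classical) / abs(classical)

err_large = relative_error(0.2)
err_small = relative_error(0.05)
print(err_large, err_small)
```

The residual difference comes from the window, which is a device of this sketch and not part of the Feynman formula.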
To make the connection between the Feynman method and the usual method of quantization, Feynman has to define what he means by an operator in his formulation. Let x be any space-time point inside Ω, and let O(x) be any field operator defined at x, for example \(\phi^\beta(x)\) or \(\phi^\beta_\mu(x)\). Then O(x) is given a meaning by defining its matrix element between the states \(|\phi'^\alpha_2, \sigma_2\rangle\) and \(|\phi'^\alpha_1, \sigma_1\rangle\), where \(\sigma_2\) and \(\sigma_1\) are any two surfaces to the past and future of x. This matrix element is
\[ \langle\phi'^\alpha_1, \sigma_1|\,O(x)\,|\phi'^\alpha_2, \sigma_2\rangle = N\sum_H O_H\,\exp\left\{\frac{i}{\hbar}\,I_H(\Omega)\right\} \]
The number \(O_H\) is the value which the expression O takes when the \(\phi^\alpha\) are given the values which they have in the history H. These definitions are consistent with the physical meaning of transition amplitudes and operator matrix elements. However, the Feynman method has one serious limitation: it cannot be used until we have some practical way of calculating, or at least using, the sums over histories. Fortunately, Schwinger has shown how to derive from the Feynman method an Action Principle formulation which avoids this difficulty.
The Schwinger Action Principle
The Field Equations
In the special case where the variation \(\delta\phi^\alpha\) vanishes on the boundary of Ω and \(\delta L = \delta x_\mu = 0\), the amplitude \(\langle\phi'^\alpha_1, \sigma_1|\phi'^\alpha_2, \sigma_2\rangle\) depends only on the operators \(\phi^\alpha\) on \(\sigma_1\) and \(\sigma_2\) and is unaffected by the variation. Therefore for all such variations δI(Ω) = 0.
That is to say, the classical action principle and the classical field equations are valid for the quantum field operators.
The generalization in equation (177) effectively encapsulates the essence of the original variation principle (171), incorporating crucial information for quantum theory regarding the impact of variations that do not diminish at the boundary of the domain Ω.
The Schrödinger Equation for the State-function
Specialize \(\sigma_1\) and \(\sigma_2\) to be the whole space at the times \(t_1\) and \(t_2\). Then
\[ \langle\phi'^\alpha_1, \sigma_1|\phi'^\alpha_2, \sigma_2\rangle = \langle\phi'^\alpha_1, t_1|\phi'^\alpha_2, t_2\rangle = \Psi(\phi'^\alpha_1, t_1) \]
is a Schrödinger wave-function: the probability amplitude for finding the system in the state \(\phi'^\alpha_1\) at the time \(t_1\), given the initial conditions \(\phi'^\alpha_2\) at \(t_2\). The development of \(\Psi(\phi'^\alpha_1, t_1)\) with the time \(t_1\) is thus a description of the development of the state of the system with time in the Schrödinger representation.
Take in (177) a variation in which \(\delta\phi^\alpha = \delta L = 0\), the surface \(\sigma_1\) being just moved through the displacement δt in the time direction. Then, using the expression for δI(Ω) in terms of the Hamiltonian, we obtain the ordinary Schrödinger equation in Dirac's notation. It shows that the Schwinger action principle contains enough information for predicting the future behaviour of a system given initially in a known quantum state.
Operator Form of the Schwinger Principle
Feynman introduced operators by providing a formula for their matrix elements, which are defined between states on two distinct surfaces The initial state is determined in the past, while the final state is designated for the future, with the operator corresponding to a specific time considered as the present.
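More usually, operators are defined by specifying their matrix elements between states given on the same surface. Thus we consider the matrix element
\[ \langle\phi'^\alpha, \sigma|\,O\,|\phi''^\alpha, \sigma\rangle \qquad (181) \]
where \(\phi'^\alpha\) and \(\phi''^\alpha\) are given sets of eigenvalues and σ is a surface which may be past, present or future in relation to the field-points to which O refers. Suppose that a reference surface \(\sigma_o\) is chosen in the remote past. Let the \(\phi^\alpha\), σ and L be varied in such a way that everything on \(\sigma_o\) remains fixed. For such a variation, (178) gives, if we assume that (179) holds,
\[ \delta I(\Omega) = \frac{1}{c}\int_\Omega \delta L\,d^4x + \int_\sigma\left\{\sum_\alpha \pi_\alpha\,\delta\phi^\alpha + \sum_\mu\left(\frac{1}{c}\,n_\mu L - \sum_\alpha \phi^\alpha_\mu\,\pi_\alpha\right)\delta x_\mu\right\}d\sigma \qquad (182) \]
where Ω is the region bounded by \(\sigma_o\) and σ. Let us now first calculate the variation of (181) arising from the change in the meaning of the states \(|\phi'^\alpha, \sigma\rangle\) and \(|\phi''^\alpha, \sigma\rangle\). The operator O itself is at this point fixed and not affected by the variations in \(\phi^\alpha\), σ and L. Then
\[ \langle\phi'^\alpha, \sigma|O|\phi''^\alpha, \sigma\rangle = \sum_{\phi'_o}\sum_{\phi''_o}\langle\phi'^\alpha, \sigma|\phi'^\alpha_o, \sigma_o\rangle\,\langle\phi'^\alpha_o, \sigma_o|O|\phi''^\alpha_o, \sigma_o\rangle\,\langle\phi''^\alpha_o, \sigma_o|\phi''^\alpha, \sigma\rangle \qquad (183) \]
therefore, denoting \(\langle\phi'^\alpha, \sigma|O|\phi''^\alpha, \sigma\rangle = \langle\sigma'|O|\sigma''\rangle\) etc., we have
\[ \delta\langle\sigma'|O|\sigma''\rangle = \sum\sum\big(\delta\langle\sigma'|\sigma'_o\rangle\big)\langle\sigma'_o|O|\sigma''_o\rangle\langle\sigma''_o|\sigma''\rangle + \sum\sum\langle\sigma'|\sigma'_o\rangle\langle\sigma'_o|O|\sigma''_o\rangle\big(\delta\langle\sigma''_o|\sigma''\rangle\big) \]
because \(|\phi'^\alpha_o\rangle\) and \(|\phi''^\alpha_o\rangle\) are not changed by the variation, and neither is O. Therefore, using (177) we have
\[ \delta\langle\sigma'|O|\sigma''\rangle = \frac{i}{\hbar}\langle\sigma'|\,\delta I_{\sigma-\sigma_o}\,O\,|\sigma''\rangle + \frac{i}{\hbar}\langle\sigma'|\,O\,\delta I_{\sigma_o-\sigma}\,|\sigma''\rangle \]
where the subscript \(\sigma - \sigma_o\) refers to the surface integrals in (178). Since \(\delta I_{\sigma-\sigma_o} = -\delta I_{\sigma_o-\sigma}\), we get finally
\[ \delta\langle\phi'^\alpha, \sigma|O|\phi''^\alpha, \sigma\rangle = \frac{i}{\hbar}\langle\phi'^\alpha, \sigma|\,[\delta I(\Omega),\ O]\,|\phi''^\alpha, \sigma\rangle \qquad (184) \]
where [P, R] = PR − RP. This applies for the case when O is fixed and the states vary.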
Now we want to calculate the variation of \(\langle\phi'^\alpha, \sigma|O|\phi''^\alpha, \sigma\rangle\) in the case when the states are fixed and \(O = O(\phi^\alpha(\sigma))\) changes. This will be the same as in the previous case, except with the opposite sign, because the variation of the matrix element
\[ \langle\phi'^\alpha, \sigma|\,O\,|\phi''^\alpha, \sigma\rangle \qquad (185) \]
if both the states and O change simultaneously is zero. Therefore, if we use a representation in which matrix elements of O are defined between states not subject to variation, we get
\[ i\hbar\,\delta O(\sigma) = [\delta I(\Omega),\ O(\sigma)] \qquad (186) \]
This is the Schwinger action principle in operator form. It is related to (177) exactly as the Heisenberg representation is to the Schrödinger representation in elementary quantum mechanics.
The Canonical Commutation Laws
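Taking for σ the space at time t, for O(σ) the operator \(\phi^\alpha(r, t)\) at the space-point r, and \(\delta x_\mu = \delta L = 0\), we have by (182) and (186) for an arbitrary variation \(\delta\phi^\alpha\)
\[ -i\hbar\,\delta\phi^\alpha(r, t) = \sum_\beta\int\left[\pi_\beta(r', t)\,\delta\phi^\beta(r', t),\ \phi^\alpha(r, t)\right]d^3r' \qquad (187) \]
because \(d\sigma = -n_\mu\,d\sigma_\mu = -i(-i\,dx'_1\,dx'_2\,dx'_3) = -d^3r'\) by (158); the unit vector in the increasing time direction is i, and this is the outward direction since we choose \(\sigma_o\) in the past. Hence for every r, r′
\[ [\phi^\alpha(r, t),\ \pi_\beta(r', t)] = i\hbar\,\delta_{\alpha\beta}\,\delta^3(r - r') \qquad (188) \]
Also since the \(\phi^\alpha(r)\) on σ are assumed independent variables,
\[ [\phi^\alpha(r, t),\ \phi^\beta(r', t)] = 0 \qquad (189) \]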
This method automatically establishes the correct canonical commutation laws for fields, eliminating the need to verify the consistency of these rules with the field equations, a requirement present in older approaches.
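A finite-dimensional analogue may make the canonical commutation law concrete. The sketch below is an assumption-laden illustration rather than a field-theory calculation: it treats one degree of freedom as a harmonic oscillator with unit mass and frequency, keeps only the lowest `DIM` levels, and builds a coordinate-like operator and its conjugate momentum from the lowering operator. The commutator equals iħ on every diagonal element except the last, which is an artefact of the truncation.

```python
import math

HBAR = 1.0   # units with hbar = 1 (assumption of this sketch)
DIM = 6      # number of oscillator levels kept (hypothetical truncation)

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combine(A, B, ca, cb):
    """Return ca*A + cb*B for square matrices stored as nested lists."""
    n = len(A)
    return [[ca * A[i][j] + cb * B[i][j] for j in range(n)] for i in range(n)]

# Lowering operator in the number basis: <i| a |j> = sqrt(j) for i = j - 1.
a = [[math.sqrt(j) if j == i + 1 else 0.0 for j in range(DIM)] for i in range(DIM)]
a_dag = [[a[j][i] for j in range(DIM)] for i in range(DIM)]

# phi_op = sqrt(hbar/2) (a + a_dag),  pi_op = i sqrt(hbar/2) (a_dag - a).
scale = math.sqrt(HBAR / 2.0)
phi_op = combine(a, a_dag, scale, scale)
pi_op = combine(a, a_dag, -1j * scale, 1j * scale)

commutator = combine(matmul(phi_op, pi_op), matmul(pi_op, phi_op), 1.0, -1.0)

diagonal = [commutator[n][n] for n in range(DIM)]
print(diagonal)
```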
The Heisenberg Equation of Motion
Suppose that σ is a flat surface at time t, and that a variation is made by moving the surface through the small time δt as in B above. But now let O(t) = O(σ) be an operator built up out of the field operators \(\phi^\alpha\) on σ. Then the change in O(t) produced by the variation is given by (186) as
\[ i\hbar\,\delta O(t) = [-H(t)\,\delta t,\ O(t)] \]
That is to say, O(t) satisfies the Heisenberg equation of motion
\[ i\hbar\,\frac{dO(t)}{dt} = [O(t),\ H(t)] \qquad (190) \]
where H(t) is the total Hamiltonian operator.
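The Heisenberg equation of motion can be checked on the smallest possible example. The sketch below assumes a two-level system with a diagonal Hamiltonian (the splitting `OMEGA` and the choice of operator are hypothetical), constructs \(O(t) = U^\dagger(t)\,O(0)\,U(t)\) with \(U(t) = \exp(-iHt/\hbar)\), and compares \(i\hbar\,dO/dt\), estimated by a central difference, with the commutator [O(t), H].

```python
import cmath

HBAR = 1.0
OMEGA = 1.3  # hypothetical level splitting of the two-level system

H = [[0.5 * HBAR * OMEGA, 0.0], [0.0, -0.5 * HBAR * OMEGA]]
O0 = [[0.0, 1.0], [1.0, 0.0]]  # operator at t = 0 (a Pauli-x matrix)

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(A):
    return [[complex(A[j][i]).conjugate() for j in range(2)] for i in range(2)]

def evolution(t):
    """U(t) = exp(-i H t / hbar); H is diagonal, so U is diagonal too."""
    return [[cmath.exp(-1j * H[0][0] * t / HBAR), 0.0],
            [0.0, cmath.exp(-1j * H[1][1] * t / HBAR)]]

def heisenberg(t):
    """Heisenberg-picture operator O(t) = U(t)^dagger O(0) U(t)."""
    U = evolution(t)
    return matmul(dagger(U), matmul(O0, U))

def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(2)] for i in range(2)]

t, dt = 0.7, 1e-5
O_plus, O_minus = heisenberg(t + dt), heisenberg(t - dt)
lhs = [[1j * HBAR * (O_plus[i][j] - O_minus[i][j]) / (2.0 * dt) for j in range(2)]
       for i in range(2)]
rhs = commutator(heisenberg(t), H)

max_diff = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(max_diff)
```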
General Covariant Commutation Laws
From (186) we derive at once the general covariant form of the commutation laws discovered by Peierls in 1950 [13] This covariant form is not easy to reach in the Hamiltonian formalism.
Let two field points z and y be given, and two operators R(z) and Q(y) depending on the field quantities \(\phi^\alpha\) at z and y. Let a reference surface \(\sigma_o\) be fixed, past of both z and y. Suppose the quantity
\[ \delta_R(L) = \epsilon\,\delta^4(x - z)\,R(z) \qquad (191) \]
is added to the Lagrangian density L(x), where ε is an infinitesimal number. This will make at most a certain infinitesimal change \(\epsilon\,\delta_R\phi^\alpha(x)\) in the solutions \(\phi^\alpha(x)\) of the field equations. Supposing the new \(\phi^\alpha(x)\) to be identical with the old ones on \(\sigma_o\), then \(\delta_R\phi^\alpha(x)\) is different from zero only in the future light-cone of z.
Similarly, adding \(\delta_Q(L) = \epsilon\,\delta^4(x - y)\,Q(y)\) to L(x) produces at most a change \(\epsilon\,\delta_Q\phi^\alpha(x)\) in the \(\phi^\alpha(x)\). Let \(\epsilon\,\delta_RQ(y)\) be the change in Q(y) produced by the addition \(\delta_R(L)\), while \(\epsilon\,\delta_QR(z)\) is the change in R(z) produced by the addition \(\delta_Q(L)\). Suppose y lies on a surface σ lying in the future of z. Then we take Q(y) for O(σ) in (186), and δL as given by (191). The δI(Ω) given by (182) reduces then simply to
\[ \delta I(\Omega) = \frac{1}{c}\,\epsilon\,R(z) \]
It is assumed that there is no intrinsic change \(\delta\phi^\alpha\) of \(\phi^\alpha\) or \(\delta x_\mu\) of σ apart from the change whose effect is already included in the δL term. Thus (186) gives
\[ [R(z),\ Q(y)] = i\hbar c\,\delta_RQ(y) \qquad (y_0 > z_0) \]
When y and z are separated by a space-like interval, this gives a zero commutator, because the disturbance R(z) propagates only within the future light-cone of z, so that \(\delta_RQ(y) = 0\). Similarly, when z lies in the future of y,
\[ [R(z),\ Q(y)] = -i\hbar c\,\delta_QR(z) \qquad (z_0 > y_0) \]
Peierls’ formula, valid for any pair of field operators, is
\[ [R(z),\ Q(y)] = i\hbar c\,\{\delta_RQ(y) - \delta_QR(z)\} \qquad (194) \]
This is a useful formula for calculating commutators in a covariant way.
Anticommuting Fields
There is a type of field theory which can be constructed easily by Schwinger’s action principle, but which does not come out of Feynman’s picture. Suppose a classical field theory in which a group of field operators \(\psi^\alpha\) always occurs in the Lagrangian in bilinear combinations like \(\bar\psi^\beta\psi^\alpha\) with the group of field operators \(\bar\psi\). Examples are the Dirac Lagrangian \(L_D\) and the quantum electrodynamics Lagrangian \(L_Q\).
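Then instead of taking every \(\phi^\alpha\) on a given surface σ to commute as in (189), we may take every pair of \(\psi^\alpha\) to anticommute, thus \(\{\psi^\alpha(r, t),\ \psi^\beta(r', t)\} = 0\), where \(\{P, R\} = PR + RP\).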
The bilinear combinations will still commute, like the \(\phi^\alpha\) did before. The \(\psi^\alpha\) commute with any other field quantities on σ, apart from the ψ fields themselves. Schwinger then assumes (177) to hold precisely as before, except that in calculating δI(Ω) the variation \(\delta\psi^\alpha\) anticommutes with all operators \(\psi^\alpha\) and \(\bar\psi^\beta\). In these theories it turns out that the momentum \(\pi_\alpha\) conjugate to \(\psi^\alpha\) is just a linear combination of the \(\bar\psi\), because the Lagrangian is only linear in the derivatives of ψ. With the anticommuting fields the field equations are deduced as before, and also the Schrödinger equation, the commutation rules being given by (186) and (187). But now, in order to make (187) valid, since \(\delta\psi^\beta\) anticommutes with the ψ and π operators, the canonical commutation law must be written as an anticommutation relation between \(\psi^\alpha\) and \(\pi_\beta\).
The general commutation rule (194) is still valid provided that Q and R are also expressions bilinear in the ψ and \(\bar\psi\).
The interpretation of operators and the justification of the Schwinger principle for anticommuting fields remains unclear; however, it offers a consistent and straightforward formulation of relativistic quantum field theory Despite the lack of complete understanding of its conceptual foundations, utilizing this method leads to a mathematically unambiguous theory that aligns with experimental results, which is satisfactory for practical purposes.
The Maxwell Field
Momentum Representations
We have
\[ \delta^4(x) = \frac{1}{(2\pi)^4}\int \exp(ik\cdot x)\,d^4k \qquad (206) \]
where the integral is fourfold, over \(dk_1\,dk_2\,dk_3\,dk_4\). Therefore
The corresponding expression for \(D_R\) is an integral of \(\exp(ik\cdot x)\) multiplied by a factor \(1/k^2\), where \(k^2 = |\mathbf{k}|^2 - k_0^2\). The integration with respect to \(k_1, k_2, k_3\) is an ordinary real integration. That with respect to \(k_0\) is a contour integration going along the real axis and above the two poles at \(k_0 = \pm|\mathbf{k}|\).
For detailed calculations, see the Appendix below. This gives the correct behaviour of \(D_R\) being zero for \(x_0 < 0\). For \(x_0 > 0\), we have to take the bottom path; then
\[ \ldots\int_+ \frac{e^{i\mathbf{k}\cdot\mathbf{x}}\,e^{-ik_0x_0}}{|\mathbf{k}|^2 - k_0^2}\,d^3k\,dk_0 \]
where k and x are 3-dimensional vectors.
Now because of the clockwise direction
Fourier Analysis of Operators
Let us analyze the potential \(A_\mu\) into Fourier components
\[ A_\mu(x) = B\int d^3k\,|\mathbf{k}|^{-1/2}\left\{a_{k\mu}\exp(ik\cdot x) + \tilde a_{k\mu}\exp(-ik\cdot x)\right\} \qquad (211) \]
where the factor \(|\mathbf{k}|^{-1/2}\) appears only as a matter of convenience; the actual Fourier coefficients are then \(|\mathbf{k}|^{-1/2}a_{k\mu}\) and \(|\mathbf{k}|^{-1/2}\tilde a_{k\mu}\). The integration is over all 4-vectors (k) with \(k_0 = +|\mathbf{k}|\). B is a normalization factor to be fixed later. The \(a_{k\mu}\) and \(\tilde a_{k\mu}\) are operators independent of x.
Since \(A_1, A_2, A_3\) and \(A_0\) are Hermitian,
\[ \tilde a_{k\mu} = a^*_{k\mu} = \text{Hermitian conjugate of } a_{k\mu}, \qquad \mu = 1, 2, 3, 0 \]
and therefore
\[ \tilde a_{k4} = -a^*_{k4} = -\,\text{Hermitian conjugate of } a_{k4} \qquad (212) \]
Computing the commutator \([A_\mu(z), A_\lambda(y)]\) from (211) and comparing the result with (203) in the momentum representation (210), we have first, since the result is a function of (z − y) only,
\[ [a_{k\mu},\ \tilde a_{k'\lambda}] = \delta^3(\mathbf{k} - \mathbf{k}')\,\delta_{\mu\lambda} \]
And the two results for the commutator agree then precisely if we take
Emission and Absorption Operators
The operators \(A_\mu(x)\) obey the Heisenberg equations of motion for operators (190a)
\[ i\hbar\,\frac{\partial A_\mu}{\partial t} = [A_\mu, H] \]
Therefore the operator \(a_{k\mu}\exp(ik\cdot x)\) has non-zero matrix elements between an initial state of energy \(E_1\) and a final state of energy \(E_2\) only if
\[ i\hbar(-ick_0) = E_1 - E_2 = \hbar c|\mathbf{k}| \]
where the factor \(c\) appears because \(x_0 = ct\). This follows from (211) and (190a), which give
\[ \psi_1\, i\hbar(-ick_0)\, a_{k\mu}\, \psi_2 = \psi_1\, \hbar c|\mathbf{k}|\, a_{k\mu}\, \psi_2 \]
Now \(\hbar c|\mathbf{k}|\) is a constant energy, characteristic of the frequency \(\omega = c|\mathbf{k}|\) of the particular Fourier component of the field. The operator \(a_{k\mu}\) can only operate so as to reduce the energy of a system by a lump of energy of this size. In the same way, \(\tilde a_{k\mu}\) will only operate when
\[ E_1 - E_2 = -\hbar c|\mathbf{k}| \]
to increase the energy by the same amount.
Quantized field operators fundamentally alter a system's energy in discrete jumps rather than continuously This characteristic accurately reflects the experimentally observed quantum behavior of radiation.
We call \(a_{k\mu}\) the absorption operator for the field oscillator with propagation vector \(\mathbf{k}\) and polarization direction \(\mu\). Likewise \(\tilde a_{k\mu}\) is the emission operator.
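The statement that an absorption operator lowers the energy by one fixed lump can be checked on a toy model. The sketch below is only an illustration under stated assumptions: it replaces the continuum of field oscillators by a single discrete oscillator truncated to `n` levels, takes the energy to be `quantum` times the number of quanta, and uses hypothetical helper names (`lowering_matrix`, `commutator_with_energy`). It verifies that \([a, H] = \text{quantum}\cdot a\) for these matrices.

```python
import math

def lowering_matrix(n):
    """Absorption operator a in the number basis, truncated to n levels (toy model)."""
    a = [[0.0] * n for _ in range(n)]
    for m in range(n - 1):
        a[m][m + 1] = math.sqrt(m + 1)  # <m| a |m+1> = sqrt(m+1)
    return a

def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(x):
    return [list(row) for row in zip(*x)]

def commutator_with_energy(n, quantum=1.0):
    """Return (a, [a, H]) with H = quantum * (a_dagger a), an assumed single-mode energy."""
    a = lowering_matrix(n)
    number = matmul(transpose(a), a)               # diagonal: 0, 1, 2, ...
    h = [[quantum * v for v in row] for row in number]
    ah, ha = matmul(a, h), matmul(h, a)
    comm = [[ah[i][j] - ha[i][j] for j in range(n)] for i in range(n)]
    return a, comm
```

In this toy model the only non-zero elements of `a` connect a level to the level one quantum below it, which is the discrete analogue of the energy condition stated above.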
A photon with a specific momentum can exhibit four polarization directions; however, not all of these are observed in electromagnetic radiation Free radiation is limited to transverse waves, resulting in only two possible polarization states This limitation arises from a supplementary condition that restricts the physically allowable states Ψ.
\[ \sum_\mu \frac{\partial A^{(+)}_\mu}{\partial x_\mu}\,\Psi = 0 \qquad (216) \]
where \(A^{(+)}_\mu\) is the positive frequency part of \(A_\mu\), i.e. the part containing the absorption operators. In the classical theory we have
\[ \sum_\mu \frac{\partial A_\mu}{\partial x_\mu} = 0 \]
the condition imposed in order to simplify the Maxwell equations to the simple form \(\Box^2 A_\mu = 0\). In the quantum theory it was usual to take
\[ \sum_\mu \frac{\partial A_\mu}{\partial x_\mu}\,\Psi = 0 \]
but this means that photons of a certain kind cannot be emitted at all, which is hard to interpret physically and leads to mathematical inconsistencies in the theory. So we assume only (216), which says merely that these photons are not present and cannot be absorbed from a physical state, and this makes good sense. Also, in the classical limit \(\sum_\mu \partial A_\mu/\partial x_\mu\) is a real quantity, so that \(\sum_\mu \partial A_\mu/\partial x_\mu = 0\) follows correctly from (216) alone. The method of using (216) as a supplementary condition is due to Gupta and Bleuler.
References: S. N. Gupta, Proc. Roy. Soc. A 63 (1950) 681.
The older treatment is unnecessary and difficult, so we will not bother about it.
By (211), (216) is equivalent to assuming
\[ \Big(\sum_\mu k_\mu a_{k\mu}\Big)\Psi = 0 \qquad (216a) \]
for each momentum vector \(\mathbf{k}\) of a photon.
Gupta and Bleuler's work indicates that supplementary conditions are not practically applied in the theory Consequently, we can effectively utilize the theory and achieve accurate results while disregarding these supplementary conditions.
Gauge-Invariance of the Theory
The theory is gauge-invariant, meaning that adding a gradient term to the potentials does not alter the physically observable fields Consequently, all states that differ solely by this addition to the potentials are considered physically identical.
If \(\Psi\) is any state, then
\[ \Psi' = \Big(1 + \lambda\sum_\mu k_\mu \tilde a_{k\mu}\Big)\Psi \]
is a state obtained from \(\Psi\) by emitting a pseudo-photon with potentials proportional to \(\partial\Lambda/\partial x_\mu\). Hence \(\Psi'\) should be indistinguishable from \(\Psi\). Now if \(\Psi_2\) is any state whatever satisfying the supplementary condition (216a), the matrix element is
\[ (\Psi_1'^{*}, \Psi_2) = \Big(\Psi_1^*\Big(1 + \lambda\sum_\mu k_\mu^*\,\tilde a^*_{k\mu}\Big), \Psi_2\Big) = \Big(\Psi_1^*, \Big(1 + \lambda\sum_\mu k_\mu a_{k\mu}\Big)\Psi_2\Big) = (\Psi_1^*, \Psi_2) \qquad (216b) \]
Hence the matrix elements of \(\Psi_1'\) and \(\Psi_1\) to any physical state \(\Psi_2\) are equal, and so the results of the theory are independent of whether the state is represented by the vector \(\Psi_1\) or by \(\Psi_1'\). This is enough to show that the theory is properly gauge-invariant, in spite of the fact that states are specified by the potentials, which are not themselves gauge-invariant.
The Vacuum State
The vacuum state is by definition the state of lowest energy, so that all absorption operators operating on it give zero:
\[ a_{k\mu}\Psi_o = 0 \qquad (217a) \]
and therefore by (212)
\[ (\Psi_o^*\,\tilde a_{k\mu}) = \pm\,(a_{k\mu}\Psi_o)^* = 0 \]
Given any operator \(Q\), we are interested in the "vacuum expectation value" of \(Q\), defined as
\[ \langle Q\rangle_o = (\Psi_o^*, Q\Psi_o) \]
Then we have at once
\[ \langle a_{k\mu}a_{k'\lambda}\rangle_o = 0, \qquad \langle \tilde a_{k\mu}\tilde a_{k'\lambda}\rangle_o = 0, \qquad \langle \tilde a_{k\mu}a_{k'\lambda}\rangle_o = 0 \qquad (219) \]
and, by the commutation laws (213) together with the last of these,
\[ \langle a_{k\mu}\tilde a_{k'\lambda}\rangle_o = \langle [a_{k\mu}, \tilde a_{k'\lambda}]\rangle_o = \delta^3(\mathbf{k} - \mathbf{k}')\,\delta_{\mu\lambda} \qquad (220) \]
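The vacuum expectation value \(\langle A_\mu(z)A_\lambda(y)\rangle_o\) is thus just the part of the commutator \([A_\mu(z), A_\lambda(y)]\) which contains positive frequencies \(\exp\{ik\cdot(z - y)\}\) with \(k_0 > 0\), as one can see using (211), (219) and (220). Thus
\[ \langle A_\mu(z)A_\lambda(y)\rangle_o = i\hbar c\,\delta_{\mu\lambda}\,D^{+}(z - y) \qquad (221) \]
where \(D^{+}\) denotes the positive-frequency part of \(D\). The even function \(D^{(1)}\) is then defined by
\[ \langle A_\mu(z)A_\lambda(y) + A_\lambda(y)A_\mu(z)\rangle_o = \hbar c\,\delta_{\mu\lambda}\,D^{(1)}(z - y) \qquad (225) \]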
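As a consistency check, (221) and (225) can be combined with the commutator. The short worked step below assumes that the commutation rule (203) has the form \([A_\mu(z), A_\lambda(y)] = i\hbar c\,\delta_{\mu\lambda}D(z - y)\); it is a sketch of how \(D^{+}\), \(D\) and \(D^{(1)}\) are related, not a quotation of the original derivation.

```latex
\begin{aligned}
\langle A_\mu(z)A_\lambda(y)\rangle_o
  &= \tfrac12\,\langle [A_\mu(z),A_\lambda(y)]\rangle_o
   + \tfrac12\,\langle A_\mu(z)A_\lambda(y)+A_\lambda(y)A_\mu(z)\rangle_o \\
  &= \tfrac12\,\hbar c\,\delta_{\mu\lambda}\,
     \bigl\{\, i\,D(z-y) + D^{(1)}(z-y) \,\bigr\} \\[4pt]
\Longrightarrow\qquad
D^{+}(x) &= \tfrac12\,\bigl\{ D(x) - i\,D^{(1)}(x) \bigr\}
\end{aligned}
```

Under that assumption the odd function \(D\) and the even function \(D^{(1)}\) are, up to factors, the real and imaginary combinations of the single positive-frequency function \(D^{+}\).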
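It is then not hard to prove (see the Appendix below) that
\[ D^{(1)}(x) = \frac{1}{2\pi^2 x^2} \]
The functions \(D\) and \(D^{(1)}\) are the two independent solutions of \(\Box^2 D = 0\), one odd and the other even. Then we define the function
\[ \bar D(x) = \tfrac12\bigl(D_R(x) + D_A(x)\bigr) = \frac{1}{(2\pi)^4}\int \exp(ik\cdot x)\,\frac{1}{k^2}\,d^4k \qquad (228) \]
the last being a real principal value integral. This is the even solution of the point-source equation
\[ \Box^2 \bar D(x) = -\delta^4(x) \]
In detail, carrying out the angular integrations in \(D^{(1)}\) leaves
\[ D^{(1)}(x) = \frac{1}{4\pi^2|\mathbf{x}|}\int_0^\infty d|\mathbf{k}|\,\bigl\{\sin((|\mathbf{x}| + x_0)|\mathbf{k}|) + \sin((|\mathbf{x}| - x_0)|\mathbf{k}|)\bigr\} \]
Taking the integral in the Abelian sense,
\[ \lim_{\epsilon\to 0}\int_0^\infty e^{-\epsilon k}\sin(ak)\,dk = \lim_{\epsilon\to 0}\frac{a}{a^2 + \epsilon^2} = \frac{1}{a} \]
hence
\[ D^{(1)}(x) = \frac{1}{4\pi^2|\mathbf{x}|}\left\{\frac{1}{|\mathbf{x}| + x_0} + \frac{1}{|\mathbf{x}| - x_0}\right\} = \frac{1}{2\pi^2 x^2} \]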
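The Abelian limit can be checked numerically. The sketch below assumes the regulator is an exponential damping factor \(e^{-\epsilon k}\) (the usual meaning of an Abel-summed integral); the function names and the cut-off of the integration range at \(40/\epsilon\) are illustrative choices. It compares a Simpson-rule estimate of \(\int_0^\infty e^{-\epsilon k}\sin(ak)\,dk\) with the closed form \(a/(a^2 + \epsilon^2)\), which tends to \(1/a\) as \(\epsilon \to 0\).

```python
import math

def abel_closed_form(a, eps):
    """Closed form of the damped integral: a / (a^2 + eps^2)."""
    return a / (a * a + eps * eps)

def damped_sine_integral(a, eps, steps=200_000):
    """Simpson estimate of the integral of exp(-eps*k) * sin(a*k) over k >= 0.

    The upper limit 40/eps is an assumed cut-off; beyond it the integrand
    is suppressed by exp(-40). `steps` must be even.
    """
    upper = 40.0 / eps
    h = upper / steps

    def f(k):
        return math.exp(-eps * k) * math.sin(a * k)

    total = f(0.0) + f(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3.0
```

For example, `damped_sine_integral(2.0, 0.05)` can be compared with `abel_closed_form(2.0, 0.05)`, and `abel_closed_form(2.0, eps)` approaches `0.5` as `eps` shrinks.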
The Gupta-Bleuler Method
There is one difficulty in the preceding theory. We assume according to (220)
\[ \langle a_{k\mu}\,a^*_{k'\lambda}\rangle_o = \pm\,\delta^3(\mathbf{k} - \mathbf{k}')\,\delta_{\mu\lambda} \qquad (220a) \]
where, by (212), the plus sign holds for \(\mu = 1, 2, 3\) and the minus sign for \(\mu = 4\). Now for a harmonic oscillator with the operators \(a_k\) and \(a^*_k\) represented by matrices, the vacuum expectation value of \(a_k a^*_k\) is always positive, i.e. the plus sign would hold in (220a) in all four cases, including \(\mu = 4\). In fact \(a_k a^*_k\) has a positive expectation value in any state whatever if the photon oscillators are treated as ordinary elementary oscillators.
Therefore we must distinguish between the scalar product \((\Psi^*_1, \Psi_2)\) as defined by our covariant theory and the scalar product \((\Psi^*_1, \Psi_2)_E\) calculated from the explicit matrix representations of the operators. The latter has no physical significance, because the matrix representations of the \(a_{k4}\) refer to states with photons polarized purely in the time dimension, which cannot occur physically. It is nevertheless convenient to be able to use the matrix representations in practice.
To use the matrix representations, we have only to define an operator \(\eta\) by the condition
\[ \eta\,\Phi = (-1)^N\,\Phi \]
where \(\Phi\) is any state with a definite number \(N\) of photons polarized in the 4-direction. Then the physical scalar product is given in terms of the explicit matrix representations by
\[ (\Psi^*_1, \Psi_2) = (\Psi^*_1, \eta\,\Psi_2)_E \qquad (220c) \]
The definition (220c), introduced by Gupta, makes the matrix representations consistent with all the requirements of the covariant theory; in particular it gives (220) correctly. From the point of view of the matrix representations, the physical scalar product is thus an indefinite metric. However, we have seen in (216b) that for any physical states the scalar product \((\Psi^*_1, \Psi_2)\) is equal to \((\Psi^*_{1T}, \Psi_{2T})\), where \(\Psi_{1T}\) and \(\Psi_{2T}\) are states involving transverse photons only, so that the norm of a physical state is positive. Thus for physical states the metric is definite, and this is all that we require of it.
Example: Spontaneous Emission of Radiation
Spontaneous emission is a purely quantum-mechanical effect. A classical treatment, considering the reaction of the atom to a classical applied Maxwell field, gives a correct account of the absorption of radiation and of stimulated emission, but fails to give the spontaneous emission.
Let an atom have two states, the ground state 1 and an excited state 2 with energy \(\hbar cq\). For the transition 2 → 1 let the charge-current density of the atom have the unintegrated matrix elements \(j_{\mu A}(x) = j_{\mu A}(\mathbf{r}, t)\) at the point \(x = (\mathbf{r}, t)\).
The interaction with the Maxwell field has matrix element
\[ -\frac{1}{c}\int \sum_\mu j_{\mu A}(\mathbf{r}, t)\,\langle A_\mu(\mathbf{r}, t)\rangle_{\rm emit}\,d^3r \qquad (230) \]
for making a transition with emission of a photon. The total emission probability per unit time is obtained using time dependent perturbation theory:
\[ w = \frac{1}{Tc^4\hbar^2}\sum_{\text{photon states}}\Big|\int \sum_\mu j_{\mu A}(x)\,\langle A_\mu(x)\rangle_{\rm emit}\,d^4x\Big|^2 \]
the integral being over all space for a long time \(T\), and the sum over the possible final states of the photon. Only physical photon states may be included in this sum. It is not correct to take the photon states to be the four states with polarization directions \(\mu = 1, 2, 3, 4\), since these are not physical states.
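Using a sum-rule to sum over the states,
\[ w = \frac{1}{Tc^4\hbar^2}\int\!\!\int \sum_{\lambda,\mu} j^*_{\lambda A}(x')\,j_{\mu A}(x)\,\langle A^*_\lambda(x')\,A_\mu(x)\rangle_o\,d^4x\,d^4x' \]
Write \(\tilde j_{\lambda A}(x')\) for the matrix element of \(j_{\lambda A}(x')\) in the reverse transition 1 → 2. Then \(j^*_{\lambda A}A^*_\lambda = \tilde j_{\lambda A}A_\lambda\), since both factors change sign under conjugation for \(\lambda = 4\) and neither does for \(\lambda = 1, 2, 3\). Inserting (221) and carrying out the time integrations, which give
\[ \frac{\sin^2\!\left[\tfrac12 cT(q - k_0)\right]}{\tfrac12 cT\,(q - k_0)^2} \to \pi\,\delta(q - k_0) \quad \text{if } cT \to \infty \]
we obtain
\[ w = \frac{1}{4\pi^2\hbar c^2}\int d^3k\;\delta(|\mathbf{k}|^2 - q^2)\sum_\mu \tilde j_{\mu A}(\mathbf{k})\,j_{\mu A}(\mathbf{k}) \]
where
\[ j_{\mu A}(\mathbf{k}) = \int j_{\mu A}(\mathbf{r})\,e^{-i\mathbf{k}\cdot\mathbf{r}}\,d^3r, \qquad \tilde j_{\mu A}(\mathbf{k}) = \int \tilde j_{\mu A}(\mathbf{r})\,e^{i\mathbf{k}\cdot\mathbf{r}}\,d^3r \]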
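By the charge conservation law
\[ \sum_\mu k_\mu\,j_{\mu A}(\mathbf{k}) = 0, \qquad \sum_\mu k_\mu\,\tilde j_{\mu A}(\mathbf{k}) = 0 \]
and so
\[ \sum_\mu \tilde j_{\mu A}(\mathbf{k})\,j_{\mu A}(\mathbf{k}) = |\mathbf{j}_A|^2 - \frac{1}{q^2}|\mathbf{k}\cdot\mathbf{j}_A|^2 = |\mathbf{j}_A|^2(1 - \cos^2\theta) = |j_{1A}|^2 + |j_{2A}|^2 \]
where \(\theta\) is the angle between \(\mathbf{k}\) and \(\mathbf{j}_A\), and 1 and 2 are the two directions of transverse polarization. This shows how the third and fourth polarization directions do not appear in real emission problems. The same result would be obtained by using the indefinite metric explicitly, i.e. by summing over the four polarization states with a minus sign for the fourth state. However, it is always simpler to work directly with the covariant formalism than to bother with the non-physical photon states and then use the metric to get the right answers.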
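The cancellation of the third and fourth polarization directions can be checked numerically. The sketch below is illustrative only: it assumes the convention \(k_4 = ik_0 = iq\), fixes the fourth component of the current from charge conservation, and uses hypothetical function names. It compares the covariant sum over four components with the sum over the two transverse components for an arbitrary complex current.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def covariant_sum(q, n, j):
    """Sum over mu = 1..4 of j~_mu j_mu, with j_4 fixed by charge conservation.

    n is a real unit vector along k (so k = q n); j is a complex 3-vector.
    Assumes k_4 = i q, so that k.j + k_4 j_4 = 0 gives j_4 = i (k.j) / q.
    """
    k = [q * c for c in n]
    j_rev = [z.conjugate() for z in j]      # spatial part for the reverse transition
    j4 = 1j * dot(k, j) / q
    j4_rev = 1j * dot(k, j_rev) / q
    return dot(j_rev, j) + j4_rev * j4

def transverse_sum(n, j):
    """|j_1|^2 + |j_2|^2 along two unit vectors orthogonal to n."""
    seed = [1.0, 0.0, 0.0] if abs(n[0]) < 0.9 else [0.0, 1.0, 0.0]
    proj = dot(seed, n)
    e1 = [s - proj * c for s, c in zip(seed, n)]
    norm = math.sqrt(dot(e1, e1))
    e1 = [c / norm for c in e1]
    e2 = cross(n, e1)
    return abs(dot(e1, j)) ** 2 + abs(dot(e2, j)) ** 2
```

The fourth component enters with a negative contribution that removes exactly the longitudinal part, so the two functions agree for any choice of direction and current.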
The emission probability in the direction of polarization 1 and in the direction of propagation given by the solid angle \(d\Omega\) is, using \(\delta(|\mathbf{k}|^2 - q^2) = \frac{1}{2q}\,\delta(|\mathbf{k}| - q)\) for \(q > 0\),
\[ w = \frac{q\,d\Omega}{8\pi^2\hbar c^2}\,|j_{1A}|^2 \]
For dipole radiation by a one-electron atom with coordinates \((x, y, z)\), \(j_1 = e\dot x\) and the matrix element oscillates with frequency \(cq\), so that \(|j_{1A}|^2 = e^2c^2q^2|\langle x\rangle|^2\) and
\[ w = \frac{e^2q^3\,d\Omega}{8\pi^2\hbar}\,|\langle x\rangle|^2 \]
This checks with the value given in Bethe's Handbuch article.
The example illustrates the effectiveness of covariant methods, even for basic problems where they may not be ideally suited This approach simplifies the process by eliminating the need to consider the normalization of photon states, as factors such as 2 and π are automatically accounted for when using the specified equation.
The Hamiltonian Operator
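Using the commutation rules (213) we can find an operator \(H\) which satisfies all these conditions simultaneously, that is, the requirements \([a_{k\mu}, H] = \hbar c|\mathbf{k}|\,a_{k\mu}\) and \([\tilde a_{k\mu}, H] = -\hbar c|\mathbf{k}|\,\tilde a_{k\mu}\) implied by (190a) and (211). Namely
\[ H = \int d^3k\;\hbar c|\mathbf{k}|\sum_{\lambda=1}^{4}\tilde a_{k\lambda}\,a_{k\lambda} \qquad (234a) \]
This operator is in fact unique apart from an arbitrary additive constant.
To fix the constant we require \(\langle H\rangle_o = 0\), which leads precisely to (234a), as one can see at once from (219). Hence (234a) is the Hamiltonian of this theory, which is very simple in the momentum representation.
To derive \(H\) from the Lagrangian is also possible but much more tedious. From (234a) we see that
\[ N_{k\lambda} = \tilde a_{k\lambda}\,a_{k\lambda} \quad (\text{not summed}) \]
is an operator representing the number of quanta with frequency \(\mathbf{k}\) and polarization \(\lambda\). Because of the singular δ-function factor in the commutation rules (213), which comes from the continuous spectrum, \(N_{k\lambda}\) is in fact the number of quanta per unit frequency range. It follows from (213) that \(\int N_{k\lambda}\,d^3k\) integrated over any region of momentum-space has the integer eigenvalues 0, 1, 2, .... This is so because the state with \(n_i\) particles with momentum \(\mathbf{k}_i\) is
\[ \Psi = \prod_{i=1}^{\ell}(\tilde a_{k_i\lambda})^{n_i}\,\Psi_o \]
Then taking \(\int_\Omega N_{k\lambda}\,d^3k\) over a region \(\Omega\) including the momenta \(\mathbf{k}_1, \mathbf{k}_2, \dots, \mathbf{k}_j\) we get
\[ \int_\Omega N_{k\lambda}\,d^3k\;\Psi = \sum_{i=1}^{j} n_i\,\Psi \]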
Fluctuations of the Fields
Electromagnetic fields E and H are quantum-mechanical variables that lack well-defined values in states where energy and momentum are clearly defined, such as in the vacuum state The state of these fields can be described by either fixing the values of E and H or by specifying the number of quanta present with different momenta and energies These two descriptions are complementary and can only coexist in the classical limit, characterized by large numbers of quanta and very strong fields.
L. P. Smith, Phys. Rev. 69 (1946), gives a good discussion of this subject for a cavity resonator with a single mode of oscillation. The essential point is that the time-dependence of the field (its phase) cannot be determined if the number of quanta (the energy) is fixed.
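We consider a more general problem. What is the mean-square fluctuation in the vacuum state of a field-quantity? We define the field-quantity (235) as a single field component, for example \(H_1\), averaged over some finite space-volume \(V\) and also over a time \(T\). Let
\[ V(\mathbf{k}) = \frac{1}{V}\int_V e^{-i\mathbf{k}\cdot\mathbf{r}}\,d\tau \]
Then since \(\mathbf{H} = \nabla\times\mathbf{A}\), we have
\[ \Big\langle\Big\{\frac{1}{VT}\int\!\!\int H_1\,d\tau\,dt\Big\}^2\Big\rangle_o = \frac{\hbar c}{16\pi^3}\int \frac{d^3k}{|\mathbf{k}|}\,(k_2^2 + k_3^2)\,|V(\mathbf{k})|^2\;\frac{4\sin^2\!\left(\tfrac12 c|\mathbf{k}|T\right)}{c^2|\mathbf{k}|^2T^2} \]
Taking for \(V\) any finite volume and \(T\) a finite time, this mean-square fluctuation is finite. Example: a sphere of radius \(R\) gives
\[ V(\mathbf{K}) = \frac{3}{R^3|\mathbf{K}|^3}\bigl(\sin R|\mathbf{K}| - R|\mathbf{K}|\cos R|\mathbf{K}|\bigr) \]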
If either R or T approaches zero, the fluctuations increase significantly and ultimately diverge Therefore, only measurements of field quantities that are averaged over both space and time possess any physical reality.
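For the spherical example, the volume average \(\frac{1}{V}\int_V e^{-i\mathbf{k}\cdot\mathbf{r}}\,d\tau\) reduces to a one-dimensional radial integral, \(\frac{3}{R^3K}\int_0^R r\sin(Kr)\,dr\), which has the closed form \(3(\sin x - x\cos x)/x^3\) with \(x = KR\). The sketch below, with illustrative function names and a simple midpoint rule as an assumed quadrature, checks the closed form against the numerical integral and shows that the form factor tends to 1 for \(KR \ll 1\) and falls off for \(KR \gg 1\).

```python
import math

def sphere_form_factor(K, R):
    """Closed form of (1/V) * integral over a sphere of exp(-i k.r), with x = K*R."""
    x = K * R
    return 3.0 * (math.sin(x) - x * math.cos(x)) / x ** 3

def sphere_form_factor_numeric(K, R, steps=20_000):
    """Midpoint-rule estimate of (3 / (R^3 K)) * integral_0^R r sin(K r) dr."""
    h = R / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h
        total += r * math.sin(K * r)
    return 3.0 * total * h / (R ** 3 * K)
```

The rapid fall-off for \(KR \gg 1\) is what makes the fluctuation integral converge at large \(|\mathbf{K}|\) for a finite averaging volume.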
Fluctuation of Position of an Electron in a Quantized Electromagnetic Field. The Lamb Shift
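Consider an electron represented by an extended spherical charge of radius \(R\), lying in a stationary state in the potential \(\phi(r)\) of a hydrogen atom, with wave-function \(\psi(\mathbf{r})\). We treat everything non-relativistically except for the quantized radiation field with which the electron interacts. The effect of this fluctuating field is to produce a rapid fluctuation in the position of the electron. For rapid fluctuations we have
\[ m\ddot{\mathbf{r}} = -e\mathbf{E} \]
where \(m\) is the electron's mass, \(e\) its charge, and \(\mathbf{E}\) the electric field.
Thus a fluctuating component of \(\mathbf{E}\) with frequency \(c|\mathbf{K}|\) produces a fluctuation in \(\mathbf{r}\) of the same frequency, with amplitude multiplied by the factor \(\frac{e}{m}\,\frac{1}{c^2|\mathbf{K}|^2}\). The electron cannot follow the slow fluctuations of \(\mathbf{E}\) whose frequency is less than the atomic frequency \(cK_H\). Hence, making \(T \to 0\) in the mean-square fluctuation of the field, we find
\[ \langle r_1^2\rangle_o = \frac{e^2}{m^2}\,\frac{\hbar c}{16\pi^3}\int_{K_H}^{\infty}\frac{d^3K}{|\mathbf{K}|}\,(K_2^2 + K_3^2)\,|V(\mathbf{K})|^2\,\frac{1}{c^4|\mathbf{K}|^4} \qquad (237) \]
because \(\lim_{x\to 0}\frac{\sin^2 x}{x^2} = 1\). The integral now converges at \(\infty\), because of the finite size of the electron. Since \(R\) is very small we may approximate (237) by using
\[ |V(\mathbf{K})|^2 = \begin{cases} 1 & \text{for } |\mathbf{K}|R < 1 \\ 0 & \text{for } |\mathbf{K}|R > 1 \end{cases} \]
Then, since \((K_2^2 + K_3^2) = |\mathbf{K}|^2(1 - \cos^2\theta)\) and \(\int_0^\pi (1 - \cos^2\theta)\sin\theta\,d\theta = \frac{4}{3}\), we have
\[ \langle r_1^2\rangle_o = \frac{e^2\hbar}{6\pi^2m^2c^3}\int_{K_H}^{1/R}\frac{d|\mathbf{K}|}{|\mathbf{K}|} = \frac{e^2\hbar}{6\pi^2m^2c^3}\,\log\frac{1}{RK_H} \]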
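This fluctuation in position produces a change in the effective potential acting on the electron. Thus
\[ \langle V(\mathbf{r} + \delta\mathbf{r})\rangle = V(\mathbf{r}) + \langle \delta\mathbf{r}\cdot\nabla V(\mathbf{r})\rangle_o + \tfrac12\langle r_1^2\rangle_o\,\nabla^2V = V(\mathbf{r}) + \tfrac12\langle r_1^2\rangle_o\,\nabla^2V \]
because \(\langle \delta\mathbf{r}\cdot\nabla V(\mathbf{r})\rangle_o = 0\), being odd. Now in a hydrogen atom, \(\nabla^2V = e^2\delta^3(\mathbf{r})\) (Heaviside units!). Hence the change in the energy of the electron due to the fluctuations is (\(a_o\) = Bohr radius)
\[ \Delta E = \int \psi^*\,\delta V\,\psi\,d\tau = \tfrac12\langle r_1^2\rangle_o\,e^2\,|\psi(0)|^2 = \begin{cases} \dfrac{e^4\hbar}{12\pi^2m^2c^3}\,\log\dfrac{1}{RK_H}\;\dfrac{1}{\pi n^3a_o^3} & \text{for s-states} \\[6pt] 0 & \text{for all others} \end{cases} \qquad (239) \]
because for the hydrogen atom \(|\psi_{n00}(0)|^2 = 1/(\pi n^3a_o^3)\), while the wave-functions with \(\ell \neq 0\) vanish at the origin.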
There will also be a much larger addition to the kinetic energy arising from the fluctuations. We ignore it on the grounds that it will be the same for all atomic states and so will not give any relative displacement. Of course this is not a good argument.
Hence we find the first approximation to the Lamb shift; the 2s state is shifted relative to the 2p states by
\[ \Delta E = +\frac{e^4\hbar}{96\pi^3m^2c^3a_o^3}\,\log\frac{1}{RK_H} \]
We take \(R = \hbar/mc\), the electron Compton wave-length, since it is at this frequency that the non-relativistic treatment becomes completely wrong. Then, with \(\alpha = e^2/4\pi\hbar c\), \(a_o = 4\pi\hbar^2/me^2\) and \(\mathrm{Ry} = e^4m/32\pi^2\hbar^2\) in Heaviside units,
\[ \Delta E = \frac{\alpha^3}{3\pi}\,\mathrm{Ry}\,\log\frac{1}{RK_H} \]
In frequency units \(\frac{\alpha^3}{3\pi}\,\mathrm{Ry}\) is 136 Mc. This gives an effect of the right sign and order of magnitude. The method is due to Welton. The size of the logarithm is wrong because the low-frequency cut-off was badly done: we find \(\Delta E \sim 1600\) Mc instead of the correct value of 1060 Mc. But physically the origin of the shift is correctly described in this way.
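The quoted numbers can be reproduced with a few lines of arithmetic. The sketch below is a rough check under stated assumptions: the values of the fine-structure constant and of the Rydberg frequency are approximate inputs, the function names are illustrative, and the cut-off product \(RK_H = \alpha^2/8\) is an assumed value (it corresponds to \(R = \hbar/mc\) with \(K_H\) equal to a quarter of the Rydberg wave number, a choice not stated in the text above).

```python
import math

ALPHA = 1.0 / 137.036      # fine-structure constant (approximate, assumed input)
RYDBERG_HZ = 3.2898e15     # Rydberg frequency Ry/h in Hz (approximate, assumed input)

def welton_prefactor_hz():
    """(alpha^3 / 3 pi) * Ry, expressed as a frequency in Hz."""
    return ALPHA ** 3 / (3.0 * math.pi) * RYDBERG_HZ

def welton_shift_hz(r_times_kh):
    """Estimate of the 2s-2p shift: prefactor times log(1 / (R * K_H))."""
    return welton_prefactor_hz() * math.log(1.0 / r_times_kh)

# Assumed cut-off: R * K_H = alpha^2 / 8 (hypothetical choice, see lead-in).
ASSUMED_R_KH = ALPHA ** 2 / 8.0
```

With these inputs `welton_prefactor_hz()` is close to 136 Mc and `welton_shift_hz(ASSUMED_R_KH)` is about 1600 Mc, the overestimate attributed above to the crude low-frequency cut-off.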