LOG#048. Thomas precession.




Let me begin this post with an uncommon representation of Lorentz transformations in terms of “uncommon matrices”. Lorentz transformations can be characterized, as we have seen before, as the set of linear transformations leaving invariant

    \[ ds^2=d\mathbf{x}^2-c^2dt^2\]

Therefore, Lorentz transformations are, naively, X'=\mathbb{L}X. Let \mathbf{A}, \mathbf{B} be 3-rowed column matrices, let M, R, \mathbb{I} denote 3\times 3 matrices, and let the superscript T denote (unless stated otherwise) matrix transposition, i.e., the interchange of rows and columns in a matrix.

The invariance of ds'^2=ds^2 implies the following results from the previous definitions:

    \[ \gamma^2-\mathbf{B}^2=1\]

    \[ M^TM =\mathbf{A}\mathbf{A}^T+\mathbb{I}\]

    \[ M^T\mathbf{B}=\gamma \mathbf{A}\leftrightarrow \mathbf{B}^T M=\gamma \mathbf{A}^T\]

Then, we can write the matrix for a Lorentz transformation (boost) in the following non-standard manner:

    \[ \boxed{\mathbb{L}=\begin{pmatrix}\gamma & -\mathbf{A}^T\\ -\mathbf{B} & M\end{pmatrix}}\]

and the inverse transformation will be

    \[ \boxed{\mathbb{L}^{-1}=\begin{pmatrix}\gamma & \mathbf{B}^T\\ \mathbf{A} & M^T\end{pmatrix}}\]

Thus, we have \mathbb{L}\mathbb{L}^{-1}=\mathbb{I}_{4x4}\equiv \mathbb{E}, where we also have

    \[ \gamma^2-\mathbf{A}^2=1\]

    \[ M\mathbf{A}=\gamma \mathbf{B}\]

    \[ MM^T=\mathbf{B}\mathbf{B}^T+\mathbb{I}_{3x3}\]

Let us define, in addition, the reference frames S and \overline{S}', corresponding to the coordinates \mathbf{X} and \overline{\mathbb{X}}'. Then, if the velocity reads \mathbf{v}=\mathbf{A}/\gamma, the boost matrix can be recast as

    \[ L_{v}=\begin{pmatrix}\gamma & -\gamma \mathbf{v}^T\\ -\gamma \mathbf{v} & \mathbb{I}+\frac{\gamma^2}{1+\gamma}\mathbf{v}\mathbf{v}^T\end{pmatrix}=\begin{pmatrix}\gamma & -\mathbf{A}^T\\ -\mathbf{A} & \mathbb{I}+\frac{\mathbf{A}\mathbf{A}^T}{1+\gamma}\end{pmatrix}\]
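
As a quick numerical sanity check, here is a minimal Python sketch (not part of the original derivation) that builds L_{v} from the formula above, with c=1, and verifies that it leaves the metric of ds^2 invariant and that L_{-v} is its inverse. The helper names `boost`, `matmul`, `transpose` and `max_abs_diff`, as well as the sample velocity, are my own illustrative choices.

```python
import math

def boost(v):
    """Return the 4x4 boost matrix L_v for a 3-velocity v (units with c = 1)."""
    v2 = sum(x * x for x in v)
    if v2 >= 1.0:
        raise ValueError("the speed must satisfy |v| < 1")
    g = 1.0 / math.sqrt(1.0 - v2)
    L = [[0.0] * 4 for _ in range(4)]
    L[0][0] = g
    for i in range(3):
        L[0][i + 1] = -g * v[i]          # row block    -gamma v^T
        L[i + 1][0] = -g * v[i]          # column block -gamma v
        for j in range(3):
            delta = 1.0 if i == j else 0.0
            L[i + 1][j + 1] = delta + g * g / (1.0 + g) * v[i] * v[j]
    return L

def matmul(a, b):
    """Plain matrix product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# Metric of ds^2 = dx^2 - dt^2, ordering the coordinates as (t, x, y, z).
ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]
IDENTITY = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

v = [0.3, -0.4, 0.5]
L = boost(v)
metric_error = max_abs_diff(matmul(transpose(L), matmul(ETA, L)), ETA)
inverse_error = max_abs_diff(matmul(boost([-x for x in v]), L), IDENTITY)
print(metric_error, inverse_error)   # both should be at rounding-error level
```

Both printed numbers should be of the order of the floating-point rounding error, which is all this sketch is meant to show.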

Remark: a Lorentz transformation will differ from boosts only by rotations in the general case. That is, with these conventions, the most general Lorentz transformations include both boosts and rotations.

For all \gamma>0, the above transformation is well-defined, but if \gamma<0 we face transformations containing a reversal of time. The time reversal operation is a different thing from matrix transposition, so I will denote it by \mathbb{T} in order to distinguish it from the superscript T, although there is little danger of confusion in general. The time reversal can be written as:

    \[ \mathbb{T}=\begin{pmatrix}-1 & \mathbf{0}^T\\ \mathbf{0} & \mathbb{I}\end{pmatrix}\]

In that case (\gamma<0), after the boost L_{v}, we have to make the changes \gamma \rightarrow \vert \gamma\vert and \mathbf{A}\rightarrow -\mathbf{A}. Once these changes are made, the reference frames \overline{S} and \overline{S}' can be easily related:

    \[ \overline{X}'=LX=LL^{-1}_{v}\overline{X}\]

in such a way that

    \[ LL^{-1}_{v}=\begin{pmatrix}1 & \mathbf{0}^T\\ \mathbf{0} & R\end{pmatrix}=L_R\]

where the rotation matrix is given formally by the next equation:

    \[ R=M-\dfrac{\mathbf{B}\mathbf{A}^T}{1+\gamma}\]

R must be an orthogonal matrix, i.e., R^TR=\mathbb{I}_{3x3}. Then (\det R)^2=1, or \det R=\pm 1. For \det R=-1 we have the parity matrix

    \[ \mathbb{P}=\begin{pmatrix}1 & \mathbf{0}^T\\ \mathbf{0} & -\mathbb{I}_{3x3}\end{pmatrix}\]

and it will transform right-handed frames to left-handed frames \overline{S} or \overline{S}'. The rotation vector \alpha can be defined as well:

    \[ 1+2\cos \alpha=Tr (R)\rightarrow \cos\alpha=\dfrac{Tr R-1}{2}\]

so \alpha^\mu=\dfrac{1}{2}\epsilon^{\mu\nu\lambda}R^\nu_{\lambda}\dfrac{\alpha}{\sin\alpha}, \forall 0\leq \alpha<\pi. The rotation acting on 3-rowed matrices:

    \[ R\mathbf{A}=\mathbf{B}\]

implies that \overline{X}'=R\overline{X}, and it changes the components -\mathbf{A}/\gamma of the velocity of S relative to \overline{S} into the components -\mathbf{B}/\gamma of the velocity of S relative to \overline{S}'. Passing from one frame to another, \overline{S}' to S', we can then define a boost L_{-\mathbf{B}/\gamma}. In fact,

    \[ L_{-\mathbf{B}/\gamma}L=\begin{pmatrix}1 & \mathbf{0}^T\\ \mathbf{0} & R\end{pmatrix}=L_R\]


Remark(I): Without the time reversal, we would get L_{R\mathbf{v}}L_R=L=L_RL_{\mathbf{v}}

with \mathbf{v}=\mathbf{A}/\gamma and R=M-\dfrac{\mathbf{BA}^T}{1+\gamma}.

Remark (II): From L=L_RL_{\mathbf{v}} we get L^T=L^T_{\mathbf{v}}L_R^T=L_{\mathbf{v}}L_{R^T}. If L is symmetric, L^T=L=L_{R\mathbf{v}}L_R, then the uniqueness of the decomposition provides R\mathbf{v}=\mathbf{v} and R=R^T=R^{-1}, i.e., R is a symmetric orthogonal matrix. If, in addition, R is proper (\det R=+1), then we get \sin\alpha=0, and thus \alpha=0 or \alpha=\pi, and so R=\mathbb{I} or R=2\mathbf{n}\mathbf{n}^T-\mathbb{I}, with the unit vector \mathbf{n}, i.e., \vert \mathbf{n}\vert=1. For \mathbf{v}\neq 0 we have \mathbf{n}=\mathbf{v}/\vert\vert \mathbf{v}\vert\vert. Otherwise, if \mathbf{v}=0, \mathbf{n} is an arbitrary unit vector.


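The second step before our treatment of the Thomas precession is to review (setting c=1) the addition of velocities in special relativity. In what follows, juxtaposed vectors such as \mathbf{v}\overline{\mathbf{w}} denote the scalar product. Suppose a point particle moves with velocity \overline{\mathbf{w}} in the reference frame \overline{S}, which itself moves with velocity \mathbf{v} relative to the frame S. With respect to the S-frame (at rest) we will write: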

    \[ \mathbf{x}=\overline{\mathbf{x}}+\dfrac{\gamma^2}{\gamma+1}(\overline{\mathbf{x}}\mathbf{v})\mathbf{v}+\gamma \mathbf{v}\overline{t}\]


    \[t=\gamma \overline{t}+\gamma (\mathbf{v}\overline{\mathbf{x}})\]

and with \overline{\mathbf{x}}=\overline{\mathbf{w}}\overline{t} we can calculate the ratio \mathbf{u}=\mathbf{x}/t:

    \[ \mathbf{u}=\dfrac{\dfrac{\overline{\mathbf{w}}}{\gamma}+\dfrac{\gamma}{1+\gamma}(\mathbf{v}\overline{\mathbf{w}})\mathbf{v}+\mathbf{v}}{1+\mathbf{v}\overline{\mathbf{w}}}\]

and thus

    \[ \mathbf{u}\equiv \dfrac{\mathbf{v}+\mathbf{w}_\parallel+(\mathbf{w}_\perp/\gamma)}{1+\mathbf{v}\overline{\mathbf{w}}}\]

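where we have defined the components of \overline{\mathbf{w}} parallel and orthogonal to \mathbf{v},

    \[ \mathbf{w}_\parallel\equiv \dfrac{(\mathbf{v}\overline{\mathbf{w}})}{\mathbf{v}^2}\mathbf{v}\]

    \[ \mathbf{w}_\perp\equiv \overline{\mathbf{w}}-\mathbf{w}_\parallel\]

and used the identity 1-1/\gamma=\gamma\mathbf{v}^2/(1+\gamma).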

Comment: the composition law for 3-velocities in special relativity is both non-linear AND non-associative.

There are two special cases of motion we usually consider in (special) relativity and inertial frames:

1st. The case of parallel motion between frames. In this case \overline{\mathbf{w}}=\lambda \mathbf{v}, i.e., \overline{\mathbf{w}}\times \mathbf{v}=0. Therefore,

    \[ \mathbf{u}=\dfrac{\mathbf{v}+\overline{\mathbf{w}}}{1+\mathbf{v}\overline{\mathbf{w}}}\]

This is the usual non-linear rule to add velocities in Special Relativity.

2nd. The case of orthogonal motion between frames, where \mathbf{v}\perp\overline{\mathbf{w}}. It means \mathbf{v}\overline{\mathbf{w}}=0. Then,

    \[ \mathbf{u}=\mathbf{v}+\overline{\mathbf{w}}/\gamma= \mathbf{v}+\overline{\mathbf{w}}\sqrt{1-\mathbf{v}^2}\]

This motion orthogonal to the direction of the relative velocity has an interesting phenomenology: the transverse component of the velocity is reduced by the factor 1/\gamma. Since the spatial distances that are orthogonal to \mathbf{v} are equal in both reference frames, this slowdown is entirely due to time dilation.

Furthermore, we get also:

    \[ \mathbf{u}^2=1-\dfrac{(1-\overline{\mathbf{w}}^2)(1-\mathbf{v}^2)}{(1+\mathbf{v}\overline{\mathbf{w}})^2}\leq 1\]

Indeed, the condition \mathbf{u}^2=1 implies that \overline{\mathbf{w}}^2=1 or \mathbf{v}^2=1, and the latter condition is actually forbidden because of our interpretation of \mathbf{v} as a relative velocity between different frames. Thus, this last equation shows that Lorentz invariance in special relativity does not allow superluminal motion to arise from composing subluminal velocities, although, a priori, the composition law could also be used for superluminal speeds, since no restriction applies to them beyond those imposed by the principle of relativity.
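
To make the composition rule concrete, the following minimal Python sketch (with c=1) evaluates the general formula for \mathbf{u} in the parallel and the orthogonal case, and checks the identity 1-\mathbf{u}^2=(1-\overline{\mathbf{w}}^2)(1-\mathbf{v}^2)/(1+\mathbf{v}\overline{\mathbf{w}})^2. The function names `add_velocities` and `speed_squared_formula` and the sample velocities are illustrative assumptions of mine.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def gamma(v):
    return 1.0 / math.sqrt(1.0 - dot(v, v))

def add_velocities(v, w):
    """Velocity u in S of a particle moving with w in a frame that moves with v (c = 1)."""
    g = gamma(v)
    vw = dot(v, w)
    return [(wi / g + g / (1.0 + g) * vw * vi + vi) / (1.0 + vw)
            for vi, wi in zip(v, w)]

def speed_squared_formula(v, w):
    """Right-hand side of u^2 = 1 - (1 - w^2)(1 - v^2)/(1 + v.w)^2."""
    return 1.0 - (1.0 - dot(w, w)) * (1.0 - dot(v, v)) / (1.0 + dot(v, w)) ** 2

v = [0.6, 0.0, 0.0]
w_par = [0.7, 0.0, 0.0]
w_perp = [0.0, 0.7, 0.0]

u_par = add_velocities(v, w_par)        # parallel case: (v + w)/(1 + v w)
u_perp = add_velocities(v, w_perp)      # orthogonal case: v + w/gamma
u_swapped = add_velocities(w_perp, v)   # roles exchanged: same speed, other direction

print(u_par, u_perp, u_swapped)
print(dot(u_perp, u_perp), speed_squared_formula(v, w_perp))
```

Exchanging the roles of the two velocities keeps the speed but changes the direction, which is a first hint of the rotation discussed below.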


We are ready to study the Thomas precession and its meaning. Suppose an inertial frame \overline{\overline{S}} is obtained from another inertial frame \overline{S} by a boost with velocity \overline{\mathbf{w}}, where \overline{S} is itself obtained from S by a boost with velocity \mathbf{v}. Therefore, \overline{\overline{S}} has, relative to S, the velocity \mathbf{u} given by the addition rule we have seen in the previous section. Moreover, we have:

    \[ \overline{\overline{x}}=L_{\overline{w}}\overline{x}=L_{\overline{w}}L_{\mathbf{v}}x\]

Then, we get

    \[ L_{\mathbf{v}}=\begin{pmatrix}\gamma_v & -\gamma_v \mathbf{v}^T\\ -\gamma_v \mathbf{v} & \mathbf{1}+\dfrac{\gamma_v^2}{1+\gamma_v}\mathbf{v}\mathbf{v}^T\end{pmatrix}\]

    \[ L_{\overline{\mathbf{w}}}=\begin{pmatrix}\gamma_{\overline{\mathbf{w}}} & -\gamma_{\overline{\mathbf{w}}} \overline{\mathbf{w}}^T\\ -\gamma_{\overline{\mathbf{w}}} \overline{\mathbf{w}} & \mathbf{1}+\dfrac{\gamma_{\overline{\mathbf{w}}}^2}{1+\gamma_{\overline{\mathbf{w}}}}\overline{\mathbf{w}}\overline{\mathbf{w}}^T\end{pmatrix}\]


    \[ \gamma_{v}=\dfrac{1}{\sqrt{1-\mathbf{v}^2}}\]

    \[ \gamma_{\overline{\mathbf{w}}}=\dfrac{1}{\sqrt{1-\overline{\mathbf{w}}^2}}\]

and then

    \[ \boxed{L\equiv L_{\overline{\mathbf{w}}}L_v=\begin{pmatrix}\gamma & -\mathbf{A}^T\\ -\mathbf{B} & M\end{pmatrix}}\]


    \[ \gamma (\mathbf{v},\overline{\mathbf{w}})=\gamma_v\gamma_{\overline{w}}(1+\mathbf{v}\overline{\mathbf{w}})\equiv \gamma (\overline{\mathbf{w}},\mathbf{v})\]

    \[ \mathbf{A}=\gamma (\mathbf{v},\overline{\mathbf{w}})\,\overline{\mathbf{w}}\circ \mathbf{v}\]

    \[ \mathbf{B}=\gamma (\overline{\mathbf{w}},\mathbf{v})\,\mathbf{v}\circ\overline{\mathbf{w}}\]

    \[ M=M(\overline{\mathbf{w}},\mathbf{v})=\mathbf{1}+\dfrac{\gamma_v^2}{1+\gamma_v}\mathbf{v}\mathbf{v}^T+\dfrac{\gamma_{\overline{\mathbf{w}}}^2}{1+\gamma_{\overline{\mathbf{w}}}}\overline{\mathbf{w}}\overline{\mathbf{w}}^T+\gamma_v\gamma_{\overline{\mathbf{w}}}\left( 1+\dfrac{\gamma_v\gamma_{\overline{\mathbf{w}}}}{(1+\gamma_v)(1+\gamma_{\overline{\mathbf{w}}})}\mathbf{v}\overline{\mathbf{w}}\right)\overline{\mathbf{w}}\mathbf{v}^T\]

Here, we have defined:

    \[ \boxed{\overline{\mathbf{w}}\circ \mathbf{v}\equiv \dfrac{\left( \gamma_{\overline{\mathbf{w}}}\gamma_v\mathbf{v}+\gamma_{\overline{\mathbf{w}}}\overline{\mathbf{w}}+\gamma_{\overline{\mathbf{w}}}\dfrac{\gamma_v^2}{1+\gamma_v}(\overline{\mathbf{w}}\mathbf{v})\mathbf{v}\right)}{\gamma (\mathbf{v},\overline{\mathbf{w}})}}\]

Remark (I): The matrix L given by

    \[ \begin{pmatrix}\gamma & -\mathbf{A}^T\\ -\mathbf{B} & M\end{pmatrix}\]

is NOT symmetric, as we would expect from a pure boost. According to our previous decomposition, the rotation hidden in L can be extracted from the matrix M in the following way

    \[ \boxed{R=R(\overline{\mathbf{w}},\mathbf{v})=M(\overline{\mathbf{w}},\mathbf{v})-\dfrac{\mathbf{B}\mathbf{A}^T}{1+\gamma}}\]

This last equation is called the Thomas precession associated with the 3-vectors \mathbf{v},\overline{\mathbf{w}}. We observe that R is a proper orthogonal matrix from the multiplicative property of determinants and the fact that all boosts have determinant one. Equivalently, it follows from the condition \det R=\pm 1 for every orthogonal matrix R, together with the continuous dependence of R on the velocities and the initial condition R(0,0)=\mathbf{1}.

Remark (II): From the definitions of M, and the vectors \mathbf{A},\mathbf{B}, we deduce that \mathbf{v}\times \overline{\mathbf{w}} is an eigenvector of R with eigenvalue +1 and this gives the axis of rotation. The rotation angle \alpha as calculated from Tr R=1+2\cos\alpha is a complicated expression, and only after some clever manipulations or the use of the geometric algebra framework, it simplifies to

    \[ 1+\cos\alpha=\dfrac{(1+\gamma_u+\gamma_v+\gamma_{\overline{w}})^2}{(1+\gamma_u)(1+\gamma_v)(1+\gamma_{\overline{w}})}>0\]

In order to understand what this equation means, we have to observe that the components of \mathbf{v} and \overline{\mathbf{w}} refer to different reference frames, and then the scalar product \mathbf{v}\overline{\mathbf{w}} and the cross product \mathbf{v}\times\overline{\mathbf{w}} must be given good analytic expressions before the geometric interpretation can be accomplished. Moreover, if we want to interpret the cross product as an axis in the reference frame S, and correspondingly we want to split L=L_{R\mathbf{u}}L_R, from the definition of \overline{\mathbf{w}}\circ\mathbf{v}=\mathbf{u} we deduce that

    \[ \mathbf{v}\times\mathbf{u}=\dfrac{\mathbf{v}\times\overline{\mathbf{w}}}{\gamma_v(1+\mathbf{v}\overline{\mathbf{w}})}\]

and thus, the Thomas rotation of the inertial frame S has its axis orthogonal to the relative velocity vectors \mathbf{v},\mathbf{u} of the reference frames \overline{S}, \overline{\overline{S}} against S.

On the other hand, if we interpret the above last equation as an axis in the reference frame \overline{\overline{S}}, associated to the split L=L_RL_\mathbf{u}, we would deduce that L=L_{R\mathbf{u}}L_R implies the following consequence. The reference frame \overline{\overline{S}} is obtained by boosting a certain frame S’, itself obtained from S by the rotation R. Then, \overline{\overline{S}} has (relative to S or S’) a velocity whose components are R\mathbf{u} in the inertial frame S’. Reciprocally, the components of the velocity of S or S’ against the frame \overline{\overline{S}} are provided, in \overline{\overline S}, by \overline{\overline{\mathbf{u}}}=-R\mathbf{u}. Therefore, from the Thomas precession formula for R we observe that R\mathbf{u} differs from \mathbf{u} only by linear combinations of the vectors \mathbf{v} and \overline{\mathbf{w}}. With all these results we easily derive:

    \[ \overline{\overline{u}}\times \overline{\overline{\mathbf{w}}}=(-R\mathbf{u})\times (-\overline{\mathbf{w}})\propto \mathbf{v}\times \overline{\mathbf{w}}\]

i.e., the axis for the Thomas rotation matrix of \overline{\overline{S}} is orthogonal to the relative velocities \overline{\overline{\mathbf{u}}}, \overline{\overline{\mathbf{w}}} of the inertial frames S, \overline{S} against \overline{\overline{S}}. Finally, to find the rotation matrix, it is enough to restrict the problem to the case where \overline{\mathbf{w}} is small, so that squares of it may be neglected. In this simple case, R becomes:

    \[ \boxed{R\approx \mathbf{1}+\dfrac{\gamma_v}{1+\gamma_v}\left(\overline{\mathbf{w}}\mathbf{v}^T-\mathbf{v}\overline{\mathbf{w}}^T\right)}\]

and where the rotation vector is given by

    \[ \boxed{\alpha\approx -\dfrac{\gamma_v}{1+\gamma_v}\mathbf{v}\times\overline{\mathbf{w}}\approx -\dfrac{\gamma_v^2}{1+\gamma_v}\mathbf{v}\times\mathbf{u}}\]
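
The general (not only small \overline{\mathbf{w}}) Thomas rotation can also be explored numerically. The sketch below, in plain Python with c=1, multiplies two boost matrices, reads off \gamma, \mathbf{A}, \mathbf{B} and M from the blocks of L=L_{\overline{\mathbf{w}}}L_{\mathbf{v}}, forms R=M-\mathbf{B}\mathbf{A}^T/(1+\gamma) and checks that R is orthogonal. It also compares 1+\cos\alpha obtained from the trace with the closed form (1+\gamma_u+\gamma_v+\gamma_{\overline{w}})^2/((1+\gamma_u)(1+\gamma_v)(1+\gamma_{\overline{w}})). The function names and the sample velocities are my own assumptions, chosen only for illustration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def gamma(v):
    return 1.0 / math.sqrt(1.0 - dot(v, v))

def boost(v):
    """4x4 boost matrix L_v (c = 1), coordinates ordered as (t, x, y, z)."""
    g = gamma(v)
    L = [[0.0] * 4 for _ in range(4)]
    L[0][0] = g
    for i in range(3):
        L[0][i + 1] = L[i + 1][0] = -g * v[i]
        for j in range(3):
            L[i + 1][j + 1] = (1.0 if i == j else 0.0) + g * g / (1.0 + g) * v[i] * v[j]
    return L

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def thomas_rotation(w, v):
    """Split L = L_w L_v into (gamma, u, R), with u = A/gamma and R = M - B A^T/(1 + gamma)."""
    L = matmul(boost(w), boost(v))
    g = L[0][0]
    A = [-L[0][j + 1] for j in range(3)]
    B = [-L[i + 1][0] for i in range(3)]
    R = [[L[i + 1][j + 1] - B[i] * A[j] / (1.0 + g) for j in range(3)]
         for i in range(3)]
    return g, [a / g for a in A], R

v = [0.6, 0.0, 0.0]
w = [0.0, 0.5, 0.0]
g_u, u, R = thomas_rotation(w, v)

# Orthogonality check: R^T R should be the identity.
RtR = [[sum(R[k][i] * R[k][j] for k in range(3)) for j in range(3)] for i in range(3)]
orth_error = max(abs(RtR[i][j] - (1.0 if i == j else 0.0))
                 for i in range(3) for j in range(3))

# 1 + cos(alpha) from Tr R = 1 + 2 cos(alpha), versus the closed form in the gammas.
one_plus_cos = (1.0 + R[0][0] + R[1][1] + R[2][2]) / 2.0
g_v, g_w = gamma(v), gamma(w)
closed_form = (1.0 + g_u + g_v + g_w) ** 2 / ((1.0 + g_u) * (1.0 + g_v) * (1.0 + g_w))

print(u, orth_error, one_plus_cos, closed_form)
```

Since both velocities lie in the xy-plane, the rotation axis \mathbf{v}\times\overline{\mathbf{w}} points along z, and the third row and column of R reduce to those of the identity.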

In order to understand the Physics behind the Thomas precession, we will consider one single experiment. Imagine a reference frame S in accelerated motion with respect to an inertial frame I. The spatial axes of S remain parallel at any time, in the sense that the instantaneous inertial frames coinciding with S at times t and t+\Delta t are related by a pure boost in the limit \Delta t\rightarrow 0. This may be achieved if we orient S with the aid of a very fast spinning torque-free gyroscope. Then, from the inertial frame I, S seems to be rotated at each instant of time, and there is a continuous rotation of S against I since the velocity of S changes continuously. This gyroscopic rotation of S relative to I IS the Thomas precession. We can determine the angular velocity of this motion in a straightforward manner. During the small interval of time \Delta t measured from I, the instantaneous velocity \mathbf{v} of S changes by a certain quantity \Delta \mathbf{v}, measured from I. In that case,

    \[ \Delta \alpha=-\gamma_v^2\mathbf{v}\times\dfrac{\Delta\mathbf{v}}{(1+\gamma_v)}\]

for the rotation vector during a time interval \Delta t. Thus, the angular velocity for the Thomas precession will be given by:

    \[ \boxed{\omega_T=-\dfrac{\gamma_v^2}{1+\gamma_v}\mathbf{v}\times\dfrac{d\mathbf{v}}{dt}}\]

or, reintroducing the speed of light, we get

    \[ \boxed{\omega_T=-\dfrac{\gamma_v^2}{1+\gamma_v}\mathbf{v}\times\dfrac{1}{c^2}\dfrac{d\mathbf{v}}{dt}=\dfrac{\gamma_v^2}{1+\gamma_v}\dfrac{1}{c^2}\mathbf{a}\times\mathbf{v}}\]
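
As an illustration of this formula, consider uniform circular motion with orbital angular velocity \Omega. Then \mathbf{a}\perp\mathbf{v}, \vert\mathbf{a}\vert=v\Omega and \gamma_v^2v^2/c^2=\gamma_v^2-1, so the magnitude of \omega_T reduces to (\gamma_v-1)\Omega, with a sense opposite to the orbital one. The minimal Python sketch below checks this numerically; the function name `thomas_angular_velocity` and the numerical values (in units with c=1) are hypothetical choices for the example.

```python
import math

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def thomas_angular_velocity(v, a, c=1.0):
    """omega_T = gamma^2/(1 + gamma) * (a x v)/c^2 for velocity v and acceleration a."""
    g = 1.0 / math.sqrt(1.0 - sum(x * x for x in v) / c ** 2)
    factor = g * g / (1.0 + g) / c ** 2
    return [factor * x for x in cross(a, v)]

# Uniform circular motion in the xy-plane (counterclockwise), evaluated at the
# point (r, 0, 0): v = (0, r*Omega, 0) and a = (-r*Omega^2, 0, 0), with c = 1.
r, Omega = 0.3, 2.0
v = [0.0, r * Omega, 0.0]
a = [-r * Omega ** 2, 0.0, 0.0]
omega_T = thomas_angular_velocity(v, a)

g = 1.0 / math.sqrt(1.0 - (r * Omega) ** 2)
print(omega_T, -(g - 1.0) * Omega)   # the z-component equals -(gamma - 1) * Omega
```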

Remark (I): The special relativistic effect given by the Thomas precession was used by Thomas himself to remove a discrepancy between the non-relativistic theory of the spinning electron and the experimental value of the fine structure. His observation was, in fact, that the gyromagnetic ratio of the electron deduced from the anomalous Zeeman effect led to a fine-structure splitting twice as large as the observed one. The Thomas precession introduces a correction to the equation of motion of an electron in an external electromagnetic field, and this in turn corrects the spin-orbit coupling, explaining the observed value of the fine structure.

Remark (II): In the framework of the relativistic quantum theory of the electron, Dirac realized that the effect of Thomas precession was automatically included!

Remark (III): Inside the Thomas paper, we find these interesting words

“(…)It seems that Abraham (1903) was the first to consider in any detail an electron with an axis. Many have since then considered spinning electron, ring electrons, and the like. Compton (1921) in particular suggested a quantized spin for the electron. It remained for Uhlenbeck and Goudsmit (1925) to show how this idea can be used to explain the anomalous Zeeman effect. The assumptions they had to make seemed to lead to optical and relativity doublet separations twice larger than those we observe. The purpose of the following paper, which contains the results mentioned in my recent letter to Nature (1926), is to investigate the kinematics of an electron with an axis on the basis of the restricted theory of relativity. The main fact used is that the combination of two Lorentz transformations without rotation in general is not of the same form(…)”.

From the historical viewpoint it should also be remarked that the precession effect was known by the end of 1912 to the mathematician E. Borel (C.R.Acad.Sci.,156. 215 (1913)). It was described by him (Borel, 1914) as well as by L. Silberstein (1914) in textbooks as early as 1914. It seems that the effect was even known to A. Sommerfeld in 1909 and before him, perhaps even to H. Poincaré. The importance of Thomas’ work and papers on this subject was thus not only the rediscovery but its relevant application to a pressing problem of that time: the structure of the atomic spectra and, in particular, the fine structure!

Remark (IV): Not every Lorentz transformation can be written as the product of two boosts due to the Thomas precession!


Even though we have not studied group theory in this blog, I feel the need to explain some group theory stuff related to the Thomas precession here.

The kinematical differences between the Galilean and Einsteinian relativity theories are observed at many levels. The essential differences become apparent already at the level of the homogeneous groups without reversals (inversions). Let me first consider the Galileo group. It is generated by space rotations G_R=L_R and Galilean boosts in any number and order. Using the notation we have developed in this post, we could write X'=G_\mathbf{v}X in this way:

    \[ G_\mathbf{v}=\begin{pmatrix}1 & \mathbf{0}^T\\ -\mathbf{v} & \mathbf{1}\end{pmatrix}\]

The following relationships are deduced:

    \[ G_RG_\mathbf{v}=G_{R\mathbf{v}}G_{R}\]

    \[ G_{R_1}G_{R_2}=G_{R_1R_2}\]

    \[ G_{\mathbf{v}_1}G_{\mathbf{v}_2}=G_{\mathbf{v}_1+\mathbf{v}_2}=G_{\mathbf{v}_2}G_{\mathbf{v}_1}\]

In the case of the Lorentz group, these equations are “generalized” into

    \[ L_RL_\mathbf{v}=L_{R\mathbf{v}}L_{R}\]

    \[ L_{R_1}L_{R_2}=L_{R_1R_2}\]

    \[ L_{\mathbf{v}_1}L_{\mathbf{v}_2}=L_{R(\mathbf{v}_1,\mathbf{v}_2)}L_{\mathbf{v}_1 \circ \mathbf{v}_2}\]

where R(\mathbf{v}_1,\mathbf{v}_2) is the Thomas precession and the circle denotes the nonlinear relativistic velocity addition. Be aware that the domain of velocities in special relativity is \vert \mathbf{v}\vert<1, in units with c set to unity.

Both groups (Galileo and Lorentz) contain as a subgroup the group of all spatial rotations G_R\equiv L_R. The sets of Galilean or Lorentzian boosts G_v and L_v are invariant under conjugation by G_R=L_R, since

    \[ G_RG_vG_R^{-1}=G_{Rv}\]

    \[ L_RL_vL_R^{-1}=L_{Rv}\]

are boosts as well. In the case of the Galileo group, the set of (Galilean) boosts forms an (abelian) subgroup, which is therefore an invariant subgroup. We can calculate the factor group with respect to it, and we obtain a group isomorphic to the subgroup of space rotations. Using the group law for the Galileo group:

    \[ \underbrace{G_{R_1}G_{v_1}}\underbrace{G_{R_2}G_{v_2}}=G_{R_1R_2}G_{R_2^{-1}v_1+v_2}=G_{R_3}G_{v_3}\]

with R_3=R_1R_2 and v_3=R_2^{-1}v_1+v_2. As a consequence, the homogeneous Galileo group (without reversals) is called a semidirect product of the rotation group with the Abelian group \mathbb{R}^3 of all boosts given by \mathbf{v}.

The case of the Lorentz group is more complicated. The reason is the Thomas precession. Indeed, the set of boosts does NOT form a subgroup of the Lorentz group! We can define a product on this set:

    \[ \boxed{L_{v_1} \circ L_{v_2}=L_{v_1 \circ v_2}}\]

but, contrary to the result we got with the Galileo group, this product does NOT define a group structure. In fact, mathematicians call objects with this property groupoids. The domain of velocities becomes a groupoid under the multiplication v_1 \circ v_2. It has dramatic consequences. In particular, associativity does not hold for this multiplication and this groupoid structure! Anyway, a weaker form of it is true, involving the Thomas precession/rotation formula:

    \[ \boxed{(v_1 \circ v_2) \circ v_3=(R^{-1}(v_2,v_3)v_1) \circ (v_2 \circ v_3)}\]

In an analogous way, the multiplication is not commutative in general either, but it satisfies a weaker form of commutativity. While in general groupoids one has to distinguish between right and left unit elements (if any), we have indeed \mathbf{v}=\mathbf{0} as a “two-sided” unit element for the velocity groupoid. In the same manner, while in general groupoids right and left inverses may differ (if any), in the case of the Lorentz group the groupoid associated to the Thomas precession has a unique two-sided inverse -\mathbf{v} for any \mathbf{v} relative to the groupoid multiplication law. It is NON-trivial (due to non-associativity), albeit true, that the equation given by

    \[ v_1 \circ v_2=v_3\]

may be solved uniquely for v_2 and, provided we are given v_2, v_3, it may be solved uniquely for v_1. A groupoid satisfying this property (i.e., a groupoid that allows such a uniqueness in the solutions of its equation) is called a quasigroup.
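
The weak associativity law quoted above can be checked numerically. The following Python sketch (c=1) implements the composition \overline{\mathbf{w}}\circ\mathbf{v}=\mathbf{A}/\gamma and the Thomas rotation R=M-\mathbf{B}\mathbf{A}^T/(1+\gamma) from the block formulas of the product L_{\overline{\mathbf{w}}}L_{\mathbf{v}}, and then compares both sides of the law for three sample velocities. The function names `circ`, `thomas` and `apply_inverse`, and the sample velocities, are assumptions of this sketch, and the argument order of `thomas(w, v)` follows the convention L_{\mathbf{w}}L_{\mathbf{v}}=L_{R(\mathbf{w},\mathbf{v})}L_{\mathbf{w}\circ\mathbf{v}}.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def gamma(v):
    return 1.0 / math.sqrt(1.0 - dot(v, v))

def circ(w, v):
    """Velocity composition w o v = A/gamma read off from the product L_w L_v (c = 1)."""
    g, wv = gamma(v), dot(w, v)
    return [(wi / g + g / (1.0 + g) * wv * vi + vi) / (1.0 + wv)
            for wi, vi in zip(w, v)]

def thomas(w, v):
    """Thomas rotation R(w, v) = M - B A^T/(1 + gamma) of the product L_w L_v."""
    gv, gw, wv = gamma(v), gamma(w), dot(w, v)
    g = gv * gw * (1.0 + wv)
    A = [g * x for x in circ(w, v)]
    B = [g * x for x in circ(v, w)]
    av, aw = gv * gv / (1.0 + gv), gw * gw / (1.0 + gw)
    k = gv * gw * (1.0 + gv * gw / ((1.0 + gv) * (1.0 + gw)) * wv)
    return [[(1.0 if i == j else 0.0) + av * v[i] * v[j] + aw * w[i] * w[j]
             + k * w[i] * v[j] - B[i] * A[j] / (1.0 + g)
             for j in range(3)] for i in range(3)]

def apply_inverse(R, x):
    """R^{-1} x, using R^{-1} = R^T for an orthogonal matrix."""
    return [sum(R[j][i] * x[j] for j in range(3)) for i in range(3)]

def max_diff(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

v1, v2, v3 = [0.3, 0.1, -0.2], [-0.1, 0.4, 0.2], [0.2, -0.3, 0.5]

lhs = circ(circ(v1, v2), v3)
rhs = circ(apply_inverse(thomas(v2, v3), v1), circ(v2, v3))
naive = circ(v1, circ(v2, v3))      # what plain associativity would give

print(max_diff(lhs, rhs))    # rounding-error level
print(max_diff(lhs, naive))  # clearly non-zero: the product is not associative
```

The first number should vanish up to rounding errors, while the second one does not, showing that the Thomas rotation is exactly what repairs the failure of associativity.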

In conclusion, we can say that the Lorentz group IS, in sharp contrast to the Galileo group, in no way a semidirect product, being what mathematicians and physicists call a simple group, i.e., it is a noncommutative group having no nontrivial invariant subgroup! It is due to the fact that the multiplication rule of the Lorentz group without reversals makes it, in the sense of our previous definitions, the quasidirect product of the rotation group (as a subgroup of the automorphism group of the velocity groupoid)  with the so-called “weakly associative groupoid of velocities”. Here, weakly associative(-commutative) groupoid means the following: a groupoid with a left-sided unit and left-sided inverses with the next properties:

1. Weak associativeness: R(\mathbf{0},\mathbf{v})=R(-\mathbf{v},\mathbf{v})=\mathbf{1}

2. Loop property (from Thomas precession formula): R(v_1,v_2)=R(v_1,v_1 \circ v_2)

and where the automorphism group of the velocity groupoid is defined by the next equation

Definition (Automorphism group of the velocity groupoid): (Sv_1)\circ(Sv_2)=S(v_1 \circ v_2)

Note: an associative groupoid is called a semigroup, and a semigroup with a two-sided unit element is called a monoid.

This algebraic structure hidden in the Lorentz group has been rediscovered several times throughout the history of mathematical physics. A groupoid satisfying the loop property has been named in other ways. For instance, in 1988, A. A. Ungar derived the above composition laws and the automorphism group of the Thomas precession R. Independently, A. Nesterov and coworkers in the Soviet Union had studied the same problem and quasigroup since 1986. And we can trace this structure even further back. 20 years before the Ungar “rediscovery”, H. Karzel had postulated a version of the same abstract object, and it was integrated into a richer one with two compositions (laws). He called it “near-domain”, where the automorphisms R (Thomas precessions) were to be realized by the (distributive) left multiplication with suitable elements of the near-domain ( the reference is Abh. Math.Sem.Uni. Hamburg, 1968).

However, Ungar himself developed a more systematic treatment and description for the Thomas precession “groupoid” that is behind all this weird non-associative stuff in the Lorentz group in 3+1 dimensions. According to his new approach and terminology, the structure is called “gyrocommutative gyrogroup” and it includes the Thomas precession as “Thomas gyration” in this framework. If you want to learn more about gyrogroups and gyrovector spaces, read this article


Some other authors, like Wefelscheid and coworkers, called these gyrogroups K-loops. Moreover, there are two further sources of this nontrivial mathematical structure.

Firstly, in Japan, M. Kikkawa had studied certain loops with a compatible differentiable structure called “homogeneous symmetric Lie groups” ( Hiroshima Math. J.5, 141 (1975)). Even though he did not discuss any concrete example, it is natural from his definitions that it was the same structure Karzel found. Being romantic, we can see a certain justice in calling gyrogroups K-loops (since Kikkawa and Karzel discovered them first!). The second source can be traced back in time, since the same ideas were already known by L. Sabinin et alii circa 1972 ( Sov. Math. Dokl.13,970(1972)). Their relation to symmetric homogeneous spaces of noncompact type has been discussed some years ago by W. Krammer and H.K.Urbantke, e.g., in Res. Math.33, 310 (1998).

Finally, a purely algebraic loop theory approach (with motivations far away from geometry or physics) was introduced by D. A. Robinson in 1966. In 1995, A. Kreuzer showed that it was indeed identical to K-loops, again adding some extra nomenclature ( Math.Proc.Camb. Philos.Soc.123, 53 (1998)).


We have seen that the composition of 2 Lorentz boosts, generally with 2 non-collinear velocities, results in a Lorentz transformation that IS NOT a pure boost but the composition of a single boost and a single spatial rotation. Indeed, this phenomenon is also called Wigner-Thomas rotation. As a final consequence, any body moving on a curvilinear trajectory experiences a rotational precession, first noted by Thomas in the relativistic theory of the spinning electron.

In this final section, I am going to review the really simple deduction of the Thomas precession formula given in the paper http://arxiv.org/abs/1211.1854

Imagine 3 different inertial observers Anna, Bob and Charles and their respective inertial frames A, B, and C attached to them. We choose A as a non-rotated frame with respect to B, and B as a non-rotated reference frame w.r.t. C. However, surprisingly, C is going to be rotated w.r.t. A, and it is inevitable! We are going to understand it better. Let Bob embrace Charles and let them move together with constant velocity \mathbf{v} w.r.t. Anna. At some point, Charles decides to run away from Bob with a tiny velocity d\mathbf{v}' w.r.t. Bob. Then, Bob is moving with relative velocity -d\mathbf{v}' w.r.t. C and Anna is moving with relative velocity -\mathbf{v} w.r.t. B. We can show these events with the following diagram:


Now, we can write Charles’ velocity in Anna’s frame as the sum \mathbf{v}+d\mathbf{v}. Since the frame C is rotated with respect to the A frame, Anna’s velocity in the C frame, \hat{\mathbf{v}}, has to be calculated step by step as follows. Firstly, we remark that

    \[ \hat{\mathbf{v}}\neq -\mathbf{v}-d\mathbf{v}\]

Secondly, the angle d\mathbf{\Omega} of an infinitesimal rotation is given by:

    \[ d\mathbf{\Omega}=-\dfrac{\hat{\mathbf{v}}}{\vert \hat{\mathbf{v}}\vert }\times \dfrac{\mathbf{v}+d\mathbf{v}}{\vert \mathbf{v}+d\mathbf{v}\vert}\approx -\dfrac{\hat{\mathbf{v}}}{v^2}\times (\mathbf{v}+ d\mathbf{v})\;\;\; (1)\]

The precession rate in the A frame will be provided using the general nonlinear composition rule in SR. If the motion is parallel to the x-axis with velocity V, we do know that

    \[ u'_x=\dfrac{u_x-V}{1-\dfrac{u_x V}{c^2}}\]

    \[ u'_y=\dfrac{u_y\sqrt{1-\dfrac{V^2}{c^2}}}{1-\dfrac{u_x V}{c^2}}\]

    \[ u'_z=\dfrac{u_z\sqrt{1-\dfrac{V^2}{c^2}}}{1-\dfrac{u_x V}{c^2}}\]

and where \mathbf{u}=(u_x,u_y,u_z) and \mathbf{u}'=(u'_x,u'_y,u'_z) are the velocities of some object in the rest frame and the moving frame, respectively. For a relative velocity \mathbf{V}=(V_x,V_y,V_z) pointing in an arbitrary direction we obtain the transformations

    \[ \boxed{\mathbf{u}'=\dfrac{\sqrt{1-\dfrac{V^2}{c^2}}\left(\mathbf{u}-\dfrac{\mathbf{u}\cdot\mathbf{V}}{V^2}\mathbf{V}\right)-\left( \mathbf{V}-\dfrac{\mathbf{u}\cdot\mathbf{V}}{V^2}\mathbf{V}\right)}{1-\dfrac{\mathbf{u}\cdot\mathbf{V}}{c^2}}\;\;\; (2)}\]

and where the unprimed and primed frames are mutually non-rotated to each other. Using this last equation, (2), we can easily describe the transition from the frame A to the frame B. It involves the substitutions:

    \[ \mathbf{V}\rightarrow \mathbf{v}\]

    \[ \mathbf{u}\rightarrow \mathbf{v}+d\mathbf{v}\]

    \[ \mathbf{u}'\rightarrow d\mathbf{v}'\]

Keeping only the first order terms in d\mathbf{v}, we get the following expansion from eq.(2):

    \[ d\mathbf{v}'\approx \dfrac{1}{\sqrt{1-\dfrac{v^2}{c^2}}}\left(d\mathbf{v}-\dfrac{\mathbf{v}\cdot d\mathbf{v}}{v^2}\mathbf{v}\right)+\dfrac{1}{1-\dfrac{v^2}{c^2}}\dfrac{\mathbf{v}\cdot d\mathbf{v}}{v^2}\mathbf{v}\;\;\; (3)\]

Using again eq.(2) to make the transition between the B frame to the C frame, i.e., making the substitutions:

    \[ \mathbf{V}\rightarrow d\mathbf{v}'\]

    \[ \mathbf{u}\rightarrow -\mathbf{v}\]

    \[ \mathbf{u}'\rightarrow \hat{\mathbf{v}}\]

and dropping higher order terms in d\mathbf{v}', we obtain the next formula:

    \[ \boxed{\hat{\mathbf{v}}\approx -\mathbf{v}+\dfrac{\mathbf{v}\cdot d\mathbf{v}'}{c^2}\mathbf{v}-d\mathbf{v}'\;\;\; (4)}\]

The final step is easy: we plug eq.(3) into eq.(4) and the resulting expression into eq.(1). Then, we divide by the differential dt in the final formula to obtain the celebrated Thomas precession formula:

    \[ \boxed{\dot{\Omega}=\dfrac{d\Omega}{dt}=\omega_T=-\dfrac{1}{v^2}\left(\dfrac{1}{\sqrt{1-\dfrac{v^2}{c^2}}}-1\right)\mathbf{v}\times \dot{\mathbf{v}}\;\;\; (5)}\]

or equivalently

    \[ \boxed{\dot{\Omega}=\dfrac{d\Omega}{dt}=\omega_T=-\dfrac{1}{v^2}\left(\gamma_{\mathbf{v}}-1\right)\mathbf{v}\times \mathbf{a}\;\;\; (6)}\]
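
The chain of steps (1)-(6) can be tested numerically without any expansion. The Python sketch below (c=1) applies the exact transformation (2) twice, first from A to B and then from B to C, evaluates the infinitesimal rotation from the cross product of unit vectors in eq.(1), and compares it with the right-hand side of eq.(6) multiplied by dt, i.e., with -(\gamma_{\mathbf{v}}-1)\mathbf{v}\times d\mathbf{v}/v^2. The function name `transform` and the sample values of \mathbf{v} and d\mathbf{v} are assumptions made only for this check.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def norm(a):
    return math.sqrt(dot(a, a))

def transform(u, V, c=1.0):
    """Eq. (2): velocity u as seen from a non-rotated frame moving with velocity V."""
    root = math.sqrt(1.0 - dot(V, V) / c ** 2)
    scale = dot(u, V) / dot(V, V)            # (u.V)/V^2
    denom = 1.0 - dot(u, V) / c ** 2
    return [(root * (ui - scale * Vi) - (Vi - scale * Vi)) / denom
            for ui, Vi in zip(u, V)]

v = [0.6, 0.0, 0.0]                 # velocity of Bob (and Charles) w.r.t. Anna
dv = [2e-7, 1e-7, -3e-7]            # small change of Charles' velocity in frame A

v_plus_dv = [a + b for a, b in zip(v, dv)]
dv_prime = transform(v_plus_dv, v)              # Charles as seen by Bob
v_hat = transform([-x for x in v], dv_prime)    # Anna as seen by Charles

# Eq. (1), keeping the exact norms instead of approximating them by v^2.
n1 = [x / norm(v_hat) for x in v_hat]
n2 = [x / norm(v_plus_dv) for x in v_plus_dv]
d_omega = [-x for x in cross(n1, n2)]

# Right-hand side of eq. (6) multiplied by dt.
g = 1.0 / math.sqrt(1.0 - dot(v, v))
predicted = [-(g - 1.0) / dot(v, v) * x for x in cross(v, dv)]

print(d_omega, predicted)
```

The two vectors agree up to corrections of second order in d\mathbf{v}, as expected from a first-order derivation.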

It can easily be shown that these formulae are the same as the one given previously, writing v^2 in terms of \gamma and performing some elementary algebraic manipulations.
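
For completeness, the elementary manipulation can be sketched as follows. It uses only the definition \gamma_{\mathbf{v}}^{-2}=1-v^2/c^2, so nothing beyond the formulas above is assumed:

    \[ \dfrac{v^2}{c^2}=\dfrac{\gamma_{\mathbf{v}}^2-1}{\gamma_{\mathbf{v}}^2}\;\;\Longrightarrow\;\;\dfrac{\gamma_{\mathbf{v}}-1}{v^2}=\dfrac{\gamma_{\mathbf{v}}^2(\gamma_{\mathbf{v}}-1)}{c^2(\gamma_{\mathbf{v}}-1)(\gamma_{\mathbf{v}}+1)}=\dfrac{1}{c^2}\dfrac{\gamma_{\mathbf{v}}^2}{1+\gamma_{\mathbf{v}}}\]

Inserting this into eq.(6) we recover

    \[ \omega_T=-\dfrac{\gamma_{\mathbf{v}}^2}{1+\gamma_{\mathbf{v}}}\dfrac{1}{c^2}\mathbf{v}\times\mathbf{a}=\dfrac{\gamma_{\mathbf{v}}^2}{1+\gamma_{\mathbf{v}}}\dfrac{1}{c^2}\mathbf{a}\times\mathbf{v}\]

which is the angular velocity obtained before from the matrix approach.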

Aren’t you fascinated by how these wonderful mathematical structures emerge from the physical world? I can say it: Fascinating is not enough for my surprised mind!

LOG#047. The Askaryan effect.


I discussed and reviewed the important Cherenkov effect and radiation in the previous post, here:


Today we are going to study a relatively new effect (new experimentally speaking, because it was first detected when I was an undergraduate student, in 2000), although it is not so new from the theoretical side (it was predicted in 1962). This effect is closely related to the Cherenkov effect. It is named the Askaryan effect or Askaryan radiation, and we will come to it after a brief recapitulation of the Cherenkov effect from the last post in the next lines.

We do know that charged particles moving faster than light through the vacuum emit Cherenkov radiation. How can a particle move faster than light? The weak speed of a charged particle can exceed the speed of light. That is all. About some speculations about the so-called tachyonic gamma ray emissions, let me say that the existence of superluminal energy transfer has not been established so far, and one may ask why. There are two options:

1) The simplest solution is that superluminal quanta just do not exist, the vacuum speed of light being the definitive upper bound.

2) The second solution is that the interaction of superluminal radiation with matter is very small, the quotient of tachyonic and electric fine-structure constants being q_{tach}^2/e^2<10^{-11}. Therefore superluminal quanta and their substratum are hard to detect.

A related and very interesting question can now be asked about the Cherenkov radiation we have studied here. What about neutral particles? Is there some analogue of Cherenkov radiation valid for chargeless or neutral particles? Because neutrinos are electrically neutral, conventional Cherenkov radiation of superluminal neutrinos does not arise or it is otherwise weakened. However, neutrinos do carry electroweak charge and may emit certain Cherenkov-like radiation via weak interactions when traveling at superluminal speeds. The Askaryan effect/radiation is this Cherenkov-like effect for neutrinos, and we are going to shed some light on this effect with this entry.

We are being bombarded by cosmic rays, and even more, we are being bombarded by neutrinos. Indeed, we expect that ultra-high energy (UHE) neutrinos or extreme ultra-high energy (EHE) neutrinos will hit us too. When neutrinos interact with matter, they create showers, specifically in dense media. Thus, we expect that the electrons and positrons in these showers, which travel faster than the speed of light in these media or even in the air, should emit (coherent) Cherenkov-like radiation.

Who was Gurgen Askaryan?

Let me quote what Wikipedia says about him: Gurgen Askaryan (December 14, 1928-1997) was a prominent Soviet (Armenian) physicist, famous for his discovery of the self-focusing of light, pioneering studies of light-matter interactions, and the discovery and investigation of the interaction of high-energy particles with condensed matter. He published more than 200 papers about different topics in high-energy physics.

Other interesting ideas by Askaryan: the bubble chamber (he arrived at the idea independently of Glaser, but he did not publish it, so he did not win the Nobel Prize), laser self-focusing (one of the main contributions of Askaryan to non-linear optics was the self-focusing of light), and the acoustic UHECR detection proposal. Askaryan was the first to note that the outer few metres of the Moon’s surface, known as the regolith, would be a sufficiently transparent medium for detecting microwaves from the charge excess in particle showers. The radio transparency of the regolith has since been confirmed by the Apollo missions.

If you want to learn more about Askaryan ideas and his biography, you can read them here: http://en.wikipedia.org/wiki/Gurgen_Askaryan

What is the Askaryan effect?

The next figure is from the Askaryan radiation detected by the ANITA experiment:


The Askaryan effect is the phenomenon whereby a particle traveling faster than the phase velocity of light in a dense dielectric medium (such as salt, ice or the lunar regolith) produces a shower of secondary charged particles which contains a charge anisotropy and thus emits a cone of coherent radiation in the radio or microwave part of the electromagnetic spectrum. It is similar to, or more precisely it is based on, the Cherenkov effect.

High energy processes such as Compton, Bhabha and Møller scattering, along with positron annihilation, rapidly lead to about a 20%-30% negative charge asymmetry in the electron-photon part of a cascade. For instance, such cascades can be initiated by UHE (above, e.g., 100 PeV) neutrinos.

In 1962, Askaryan first hypothesized this effect and suggested that it should lead to strong coherent radio and microwave Cherenkov emission for showers propagating within the dielectric. Since the dimensions of the clump of charged particles are small compared to the wavelength of the radio waves, the shower radiates coherent radio Cherenkov radiation whose power is proportional to the square of the net charge in the shower. The net charge in the shower is proportional to the primary energy so the radiated power scales quadratically with the shower energy, P_{RF}\propto E^2.

Indeed, this coherent radio emission originates from Cherenkov radiation. We do know that:

    \[ dP_{CR}\propto \nu d\nu\]

for a charged particle in a dense (refractive) medium emitting Cherenkov radiation (CR). Every charge emits a field E\propto \exp (i\mathbf{k}\cdot\mathbf{r}). Then, the power is proportional to \vert E\vert^2 for the total field. In a dense medium, the transverse size of the shower (the Molière radius R_M) is:

    \[ R_{M}\sim 10cm\]

We have two different experimental and interesting cases:

A) The optical case, with \lambda \ll R_M. Then, we expect random phases and P\propto N, where N is the number of radiating charges.

B) The microwave case, with \lambda \gg R_M. In this situation, we expect coherent radiation/waves with P\propto N^2.
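The difference between the two cases can be illustrated by adding N unit phasors, one per radiating charge. This is only a toy sketch, not a shower simulation: the phases are drawn uniformly at random for the optical case and set equal for the microwave case, and all names (`radiated_power`, `coherent_power`, `mean_incoherent_power`) are hypothetical.

```python
import cmath
import math
import random


def radiated_power(phases):
    """|sum of unit phasors exp(i*phase)|^2, in arbitrary units."""
    total = sum(cmath.exp(1j * p) for p in phases)
    return abs(total) ** 2


def coherent_power(n):
    """All charges radiate in phase (lambda >> R_M): power is n^2."""
    return radiated_power([0.0] * n)


def mean_incoherent_power(n, trials=2000, seed=1):
    """Random phases (lambda << R_M): the power averages to about n."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(trials):
        phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
        acc += radiated_power(phases)
    return acc / trials
```

For n=100 the coherent sum gives a power of 10^4 units, while the random-phase average stays close to 100: this is the N versus N^2 scaling quoted above.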

We can exploit this effect in large natural volumes transparent to radio (dry): pure ice, salt formations, lunar regolith, etc. The peak of this coherent radiation for sand is produced at a frequency around 5GHz, while the peak for ice is obtained around 2GHz.

The first experimental confirmations of the Askaryan effect were the next two experiments:

1) 2000 Saltzberg et al., SLAC. They used silica sand as target. The paper is this one: http://arxiv.org/abs/hep-ex/0011001

2) 2002 Gorham et al., SLAC. They used a synthetic salt target. The paper appeared here: http://arxiv.org/abs/hep-ex/0108027

Indeed, in 1965, Askaryan himself proposed ice and salt as possible target media. The reasons are easy to understand:
1st. They provide high densities, which means a higher probability for neutrino interaction.
2nd. They have a high refractive index. Therefore, the Cherenkov emission becomes important.
3rd. Salt and ice are radio transparent, and of course, they are available in large volumes throughout the world.

The advantages of radio detection of UHE neutrinos provided by the Askaryan effect are very interesting:

1) Low attenuation: clear signals from large detection volumes.
2) We can observe distant and inclined events.
3) It has a high duty cycle: good statistics in less time.
4) It has a relatively low cost: large areas covered.
5) It is available for neutrinos and/or any other chargeless/neutral particle!

Problems with this Askaryan effect detection are, though: radio interference, correlation with shower parameters (still unclear), and that it is limited only to particles with very large energies, about E>10^{17}eV.

In summary:

Askaryan effect = coherent Cherenkov radiation from a charge excess induced by (likely) neutral/chargeless particles like (especially highly energetic) neutrinos passing through a dense medium.


Why does the Askaryan effect matter?

It matters since it allows for the detection of UHE neutrinos, and it is “universal” for chargeless/neutral particles like neutrinos, just in the same way that the Cherenkov effect is universal for charged particles. And tracking UHE neutrinos is important because they point back towards their sources, and it is suspected they can help us to solve the riddle of the origin and composition of cosmic rays, the acceleration mechanism of cosmic radiation, the nuclear interactions of astrophysical objects, and to track the highest energy emissions of the Universe we can observe at the current time.

Is it real? Has it been detected? Yes, after 38 years, it has been detected. This effect was first demonstrated in sand (2000), rock salt (2004) and ice (2006), all done in a laboratory at SLAC, and later it has been checked in several independent experiments around the world. Indeed, I remember having heard about this effect during my darker years as an undergraduate student. Fortunately or not, I forgot about it till now. In spite of the beauty of it!

Moreover, it has extra applications to neutrino detection using the Moon as target: GLUE (detectors are Goldstone RTs), NuMoon (Westerbork array; LOFAR), RESUN (EVLA), or the LUNASKA project. Using ice as target, there have been other experiments checking the reality of this effect: FORTE (satellite observing the Greenland ice sheet), RICE (co-deployed on AMANDA strings, viewing Antarctic ice), and the celebrated ANITA (balloon-borne over Antarctica, viewing Antarctic ice) experiment.

Furthermore, some experiments have even used the Moon (and it is likely some others will be built in the near future) as a neutrino detector using the Askaryan radiation (the analogue for neutral particles of the Cherenkov effect, don’t forget the spot!).

Askaryan effect and the mysterious cosmic rays.

Askaryan radiation is important because it is one of the portals to the observation of UHE neutrinos coming from cosmic rays. The mysteries of cosmic rays continue today. We have indeed detected extremely energetic cosmic rays beyond the 10^{20}eV scale. Their origin is still unsolved. We hope that by tracking neutrinos we will discover the sources of those rays and their nature/composition. We don’t know any mechanism able to accelerate particles up to those incredible energies. At the current time, IceCube has not detected UHE neutrinos, and it is a serious issue for current theories and models. It is a challenge if we don’t observe as many UHE neutrinos as the Standard Model would predict. Would it mean that cosmic rays are exclusively composed of heavy nuclei or protons? Are we modelling badly the spectrum of the sources and the nuclear models of stars, as happened before the neutrino oscillations were detected at SuperKamiokande and Kamiokande, e.g., with SN1987A? Is there some kind of new Physics living at those scales and avoiding the GZK limit we would naively expect from our current theories?

LOG#046. The Cherenkov effect.


The Cherenkov effect/Cherenkov radiation, sometimes also called Vavilov-Cherenkov radiation, is our topic here in this post.

In 1934, P.A. Cherenkov was a postgraduate student of S.I. Vavilov. He was investigating the luminescence of uranyl salts under the incidence of gamma rays from radium and he discovered a new type of luminescence which could not be explained by the ordinary theory of fluorescence. It is well known that fluorescence arises as the result of transitions between excited states of atoms or molecules. The average duration of fluorescent emissions is about \tau>10^{-9}s and the transition probability is altered by the addition of “quenching agents” or by some purification process of the material, some change in the ambient temperature, etc. It turned out that none of these methods is able to quench the new radiation discovered by Cherenkov. A subsequent investigation of the new radiation (named Cherenkov radiation by other scientists after Cherenkov’s discovery) revealed some interesting features:

1st. The polarization of the luminescence changes sharply when we apply a magnetic field. Cherenkov radiation luminescence is then caused by charged particles rather than by photons, the \gamma-ray quanta! Cherenkov’s experiment showed that these particles could be electrons produced by the interaction of \gamma-photons with the medium due to the photoelectric effect or the Compton effect itself.

2nd. The intensity of the Cherenkov’s radiation is independent of the charge Z of the medium. Therefore, it can not be of radiative origin.

3rd. The radiation is observed at certain angle (specifically forming a cone) to the direction of motion of charged particles.

The Cherenkov radiation was explained in 1937 by Frank and Tamm based on the foundations of classical electrodynamics. For the discovery and explanation of the Cherenkov effect, Cherenkov, Frank and Tamm were awarded the Nobel Prize in 1958. We will discuss the Frank-Tamm formula later, but let me first explain how classical electrodynamics handles the Vavilov-Cherenkov radiation.

The main conclusion that Frank and Tamm obtained comes from the following observation. They observed that the statement of classical electrodynamics concerning the impossibility of energy loss by radiation for a charged particle moving uniformly and following a straight line in vacuum is no longer valid if we go over from the vacuum to a medium with certain refractive index n>1. They went further with the aid of an easy argument based on the laws of conservation of momentum and energy, a principle that rests in the core of Physics as everybody knows. Imagine a charged particle moving uniformly in a straight line, and suppose it can lose energy and momentum through radiation. In that case, the next equation holds:

    \[ \left(\dfrac{dE}{dp}\right)_{particle}=\left(\dfrac{dE}{dp}\right)_{radiation}\]

This equation can not be satisfied for the vacuum but it MAY be valid for a medium with a refractive index greater than one, n>1. We will simplify our discussion if we consider that the refractive index is constant (but similar conclusions would be obtained if the refractive index is some function of the frequency).

On the other hand, the total energy E of a particle having a non-null mass m\neq 0 and moving freely in vacuum with some momentum p and velocity v will be:

    \[ E=\sqrt{p^2c^2+m^2c^4}\]

and then

    \[ \left(\dfrac{dE}{dp}\right)_{particle}=\dfrac{pc^2}{E}=\beta c=v\]

Moreover, the energy of electromagnetic radiation in vacuum is given by the relativistic relationship

    \[ E_{rad}=pc\]

From this equation, we easily get that

    \[ \left(\dfrac{dE}{dp}\right)_{radiation}=c\]

Since the particle velocity is v<c, we obtain that

    \[ \left(\dfrac{dE}{dp}\right)_{particle}<\left(\dfrac{dE}{dp}\right)_{radiation}\]

In conclusion: the laws of conservation of energy and momentum prevent a charged particle moving with a rectilinear and uniform motion in vacuum from giving away its energy and momentum in the form of electromagnetic radiation! The electromagnetic radiation can not accept the entire momentum given away by the charged particle.

Anyway, we realize that this restriction is removed when the particle moves in a medium with a refractive index n>1. In this case, the velocity of light in the medium would be
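The inequality used in this argument can be checked directly. The snippet below is a minimal sketch in units where c=1 and m=1 (an assumption made only for convenience): it compares the closed form dE/dp=pc^2/E with a numerical derivative of E(p) and confirms that it always stays below the value c of the radiation. The function names are hypothetical.

```python
import math


def energy(p, m, c=1.0):
    """Relativistic energy E = sqrt(p^2 c^2 + m^2 c^4)."""
    return math.sqrt((p * c) ** 2 + (m * c**2) ** 2)


def dE_dp_particle(p, m, c=1.0):
    """Closed form dE/dp = p c^2 / E, i.e. the particle velocity v."""
    return p * c**2 / energy(p, m, c)


def dE_dp_numeric(p, m, c=1.0, h=1e-6):
    """Central finite-difference estimate of dE/dp, used as a cross-check."""
    return (energy(p + h, m, c) - energy(p - h, m, c)) / (2.0 * h)
```

For any finite momentum the result is smaller than c, so the particle and the radiation can not share the same dE/dp in vacuum.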

    \[ c'=c/n<c\]

and the velocity v of the particle may not only become equal to the velocity of light c' in the medium, but even exceed it when the following phenomenological condition is satisfied:

    \[ \boxed{v\geq c'=c/n}\]

It is obvious that, when v=c' the condition

    \[ \left(\dfrac{dE}{dp}\right)_{particle}=\left(\dfrac{dE}{dp}\right)_{radiation}\]

will be satisfied for electromagnetic radiation emitted strictly in the direction of motion of the particle, i.e., in the direction of the angle \theta=0^\circ. If v>c', this equation is satisfied for some direction \theta along which v'=c', where

    \[ v'=v\cos\theta\]

is the projection of the particle velocity v on the observation direction. Then, in a medium with n>1, the conservation laws of energy and momentum allow a charged particle with rectilinear and uniform motion, v\geq c'=c/n, to lose fractions of energy and momentum dE and dp, whenever the lost energy and momentum are carried away by electromagnetic radiation propagating in the medium at an angle/cone given by:

    \[ \boxed{\theta=\arccos\left(\dfrac{1}{n\beta}\right)=\cos^{-1}\left(\dfrac{1}{n\beta}\right)}\]

with respect to the direction of the particle motion.
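As a quick illustration of the boxed condition, the following sketch returns the emission angle for given \beta and n, and nothing below threshold. The function name `cherenkov_angle` is hypothetical.

```python
import math


def cherenkov_angle(beta, n):
    """Return theta in radians with cos(theta) = 1/(n*beta).

    Returns None when n*beta < 1, i.e. when the particle is slower
    than light in the medium and no Cherenkov radiation is emitted.
    """
    x = n * beta
    if x < 1.0:
        return None
    return math.acos(1.0 / x)
```

For example, `cherenkov_angle(1.0, 2.0)` gives \pi/3, while `cherenkov_angle(0.5, 1.33)` returns `None` because 0.5 is below the threshold 1/n.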

These arguments, based on the conservation laws of momenergy, do not provide any idea about the real mechanism by which energy and momentum are lost during the Cherenkov radiation. However, this mechanism must be associated with processes happening in the medium since the losses can not occur (apparently) in vacuum under normal circumstances (we will also discuss later the vacuum Cherenkov effect, and what it means in terms of Physics and symmetry breaking).

We have learned that Cherenkov radiation is of the same nature as certain other processes we do know and observe, for instance, in various media when bodies move in these media at a velocity exceeding that of the wave propagation. This is a remarkable result! Have you ever seen a V-shaped wave in the wake of a ship? Have you ever seen a conical wave caused by a supersonic boom of a plane or missile? In these examples, the wave field of the superfast object is found to be strongly perturbed in comparison with the field of a “slow” object (in terms of the “velocity of sound” of the medium). It begins to decelerate the object!

Question: What is then the mechanism behind the superfast motion of a charged particle in a medium with a refractive index n>1 producing the Cherenkov effect/radiation?

Answer: The mechanism under the Cherenkov effect/radiation is the coherent emission by the dipoles formed due to the polarization of the medium atoms by the charged moving particle!

The idea is as follows. Dipoles are formed under the action of the electric field of the particle, which displaces the electrons of the surrounding atoms relative to their nuclei. The return of the dipoles to the normal state (after the particle has left the given region) is accompanied by the emission of an electromagnetic signal or beam. If a particle moves slowly, the resulting polarization will be distributed symmetrically with respect to the particle position, since the electric field of the particle manages to polarize all the atoms in the near neighbourhood, including those lying ahead in its path. In that case, the resultant field of all dipoles away from the particle is equal to zero and their radiations cancel one another.

Then, if the particle moves in a medium with a velocity exceeding the velocity of propagation of the electromagnetic field in that medium, i.e., whenever v>c'=c/n, a delayed polarization of the medium is observed, and consequently the resulting dipoles will be preferably oriented along the direction of motion of the particle. See the next figure:


It is evident that, if this occurs, there must be a direction along which a coherent radiation from the dipoles emerges, since the waves emitted by the dipoles at different points along the path of the particle may turn out to be in phase. This direction can be easily found experimentally and it can be easily obtained theoretically too. Let us imagine that a charged particle moves from the left to the right with some velocity v in a medium with a refractive index n>1, with c'=c/n. We can apply the Huygens principle to build the wave front of the emitted radiation. If, at instant t, the particle is at the point x=vt, the wave front is the surface enveloping the spherical waves emitted by the particle along its own path from the origin at x=0 to the point x. The radius of the wave emitted at the point x=0 is, at such an instant t, equal to R_0=c't. At the same moment, the wave radius at the point x is equal to R_x=c'(t-(x/v))=0. At any intermediate point x', the wave radius at instant t will be R_{x'}=c'(t-(x'/v)). Then, the radius decreases linearly with increasing x'. Thus, the enveloping surface is a cone with angle 2\varphi, where the angle satisfies in addition

    \[ \sin\varphi=\dfrac{R_0}{x}=\dfrac{c't}{vt}=\dfrac{c'}{v}=\dfrac{c}{vn}=\dfrac{1}{\beta n}\]

The normal to the enveloping surface fixes the direction of propagation of the Cherenkov radiation. The angle \theta between the normal and the x-axis is equal to \pi/2-\varphi, and it is defined by the condition

    \[ \boxed{\cos\theta=\dfrac{1}{\beta n}}\]

or equivalently

    \[ \boxed{\tan\theta=\sqrt{\beta^2n^2-1}}\]

This is the result we anticipated before. Indeed, it is completely general and Quantum Mechanics introduces only a slight and subtle correction to this classical result. From this last equation, we observe that the Cherenkov radiation propagates along the generators of a cone whose axis coincides with the direction of motion of the particle and whose cone angle is equal to 2\theta. This radiation can be registered on a colour film placed perpendicularly to the direction of motion of the particle. Radiation flowing from a radiator of this type leaves a blue ring on the photographic film. These blue rings are the archetypical fingerprints of Vavilov-Cherenkov radiation!
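The Huygens construction can also be verified numerically. The sketch below (with hypothetical function names, and units where c=1) checks that every wavelet of radius R_{x'}=c'(t-x'/v) is tangent to the straight generator through the particle position with \sin\varphi=c'/v, and that \theta=\pi/2-\varphi reproduces the boxed formulas for \cos\theta and \tan\theta.

```python
import math


def cone_angles(beta, n):
    """Return (phi, theta) in radians; requires beta*n >= 1.

    sin(phi) = 1/(beta*n) is the half-angle of the wave-front cone and
    theta = pi/2 - phi is the emission angle of the radiation.
    """
    phi = math.asin(1.0 / (beta * n))
    return phi, math.pi / 2.0 - phi


def wavelet_radius(x_emit, t, v, c_medium):
    """Radius at time t of the wavelet emitted when the particle was at x_emit."""
    return c_medium * (t - x_emit / v)


def distance_to_cone(x_emit, t, v, c_medium):
    """Distance from (x_emit, 0) to the cone generator through (v*t, 0)."""
    sin_phi = c_medium / v
    return (v * t - x_emit) * sin_phi
```

The two last functions agree for every emission point, which is exactly the statement that the cone envelops all the spherical waves.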

The sharp directivity of the Cherenkov radiation makes it possible to determine the particle velocity \beta from the value of the Cherenkov’s angle \theta. From the Cherenkov’s formula above, it follows that the range of measurement of \beta is equal to

    \[ 1/n\leq\beta<1\]

For \beta=1/n, the radiation is observed at an angle \theta=0^\circ, while for the extreme with \beta=1, the angle \theta reaches a maximum value

    \[ \theta_{max}=\cos^{-1}\left(\dfrac{1}{n}\right)=arccos \left(\dfrac{1}{n}\right)\]

For instance, in the case of water, n=1.33 and \beta_{min}=1/1.33=0.75. Therefore, the Cherenkov radiation is observed in water whenever \beta\geq 0.75. For electrons being the charged particles passing through the water, this condition is satisfied if

    \[ T_e=m_ec^2\left(\dfrac{1}{\sqrt{1-\beta^2}}-1\right)=0.5\left( \dfrac{1}{\sqrt{1-0.75^2}}-1\right)=0.26MeV\]
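These numbers for water are easy to reproduce. The following sketch uses m_ec^2=0.511 MeV instead of the rounded 0.5 MeV of the formula above (an assumption of this example), and the function names are hypothetical.

```python
import math

ELECTRON_REST_ENERGY_MEV = 0.511  # m_e c^2; the text rounds it to 0.5 MeV


def beta_threshold(n):
    """Minimum velocity beta = 1/n for Cherenkov emission."""
    return 1.0 / n


def threshold_kinetic_energy(n, rest_energy_mev=ELECTRON_REST_ENERGY_MEV):
    """Kinetic energy (MeV) of a particle moving exactly at beta = 1/n."""
    b = beta_threshold(n)
    return rest_energy_mev * (1.0 / math.sqrt(1.0 - b * b) - 1.0)


def max_angle_deg(n):
    """Largest Cherenkov angle (degrees), reached in the limit beta -> 1."""
    return math.degrees(math.acos(1.0 / n))
```

With n=1.33 the threshold kinetic energy comes out at about 0.26 MeV and the maximum angle at about 41 degrees.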

As a consequence of this, the Cherenkov effect should be observed in water even for low-energy electrons (for instance, in the case of electrons produced by beta decay, or Compton electrons, or photoelectrons resulting from the interaction between water and gamma rays from radioactive products, the above energy can be easily obtained and surpassed!). The maximum angle at which the Cherenkov effect can be observed in water can be calculated from the condition previously seen:

    \[ \cos\theta_{max}=1/n=0.75\]

This angle (for water) turns out to be about \theta_{max}\approx 41.4^\circ. In agreement with the so-called Frank-Tamm formula (please, see below what that formula is and means), the number of photons in the frequency interval between \nu and \nu+d\nu emitted by some particle with charge Z moving with a velocity \beta in a medium with a refractive index n is provided by the next equation:

    \[ \boxed{N(\nu) d\nu=4\pi^2\dfrac{(Zq)^2}{hc^2}\left(1-\dfrac{1}{n^2\beta^2}\right) d\nu}\]

This formula has some striking features:

1st. The spectrum is identical for particles with Z=constant, i.e., the spectrum is exactly the same, irrespective of the nature of the particle. For instance, it could be produced by protons, electrons, pions, muons or their antiparticles!

2nd. As Z increases, the number of emitted photons increases as Z^2.

3rd. N(\nu) increases with \beta, the particle velocity, from zero (with \beta=1/n) to

    \[ N_{max}(\nu)=4\pi^2\dfrac{(Zq)^2}{hc^2}\left(1-\dfrac{1}{n^2}\right)\]

with \beta\approx 1.

4th. N(\nu) is approximately independent of \nu. We observe that dN(\nu)\propto d\nu.

5th. As the spectrum is uniform in frequency, and E=h\nu, this means that the main energy of radiation is concentrated in the extreme short-wave region of the spectrum, i.e.,

    \[ \boxed{dE_{Cherenkov}\propto \nu d\nu}\]

And then, this feature explains the bluish-violet-like colour of the Cherenkov radiation!
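A small numerical illustration of this last feature: for a flat photon spectrum, the upper half of a frequency band contains half of the photons but more than half of the energy. The band limits used in the usage note (400 to 750 THz for visible light) are an assumption of this sketch, and the function names are hypothetical.

```python
def photon_fraction(nu_lo, nu_hi, band_lo, band_hi):
    """Fraction of photons in [nu_lo, nu_hi] for a flat dN/dnu over the band."""
    return (nu_hi - nu_lo) / (band_hi - band_lo)


def energy_fraction(nu_lo, nu_hi, band_lo, band_hi):
    """Fraction of energy in [nu_lo, nu_hi] when dE is proportional to nu*dnu."""
    return (nu_hi**2 - nu_lo**2) / (band_hi**2 - band_lo**2)
```

Splitting the assumed visible band at its midpoint of 575 THz, `photon_fraction(575, 750, 400, 750)` is 0.5 while `energy_fraction(575, 750, 400, 750)` is larger than 0.5: the energy is tilted towards the blue end.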

Indeed, this feature also indicates the necessity of choosing materials for practical applications that are “transparent” up to the highest frequencies ( even the ultraviolet region). As a rule, it is known that n<1 in the X-ray region and hence the Cherenkov condition can not be satisfied! However, it was also shown by clever experimentalists that in some narrow regions of the X-ray spectrum the refractive index is n>1 ( the refractive index depends on the frequency in any reasonable materials. Practical Cherenkov materials are, thus, dispersive! ) and the Cherenkov radiation is effectively observed in apparently forbidden regions.

The Cherenkov effect is currently widely used in diverse applications. For instance, it is useful to determine the velocity of fast charged particles (e.g., neutrino detectors obviously can not detect neutrinos directly, but they can detect muons and other secondary particles produced in the interaction with some polarizable medium, even when they are produced by (electro)weak interactions like those happening in the presence of chargeless neutrinos). The selection of the medium for generating the Cherenkov radiation depends on the range of velocities \beta over which measurements have to be made with the aid of such a “Cherenkov counter”. Cherenkov detectors/counters are filled with liquids and gases and they are found, e.g., in Kamiokande, Superkamiokande and many other neutrino detectors and “telescopes”. It is worth mentioning that velocities of ultrarelativistic particles are measured with Cherenkov detectors whenever they are filled with some special gaseous medium with a refractive index just slightly higher than unity. This value of the refractive index can be changed by regulating the gas pressure in the counter! So, Cherenkov detectors and counters are very flexible tools for particle physicists!

Remark: As I mentioned before, it is important to remember that (the most of) the practical Cherenkov radiators/materials ARE dispersive. It means that if \omega is the photon frequency, and k=2\pi/\lambda is the wavenumber, then the photons propagate with some group velocity v_g=d\omega/dk, i.e.,

    \[ \boxed{v_g=\dfrac{d\omega}{dk}=\dfrac{c}{\left[n(\omega)+\omega \frac{dn}{d\omega}\right]}}\]

Note that if the medium is non-dispersive (dn/d\omega=0), this formula simplifies to the well known formula v_g=c/n, and for vacuum (n=1) it gives v_g=c, as it should.

Accordingly, following the PDG, Tamm showed in a classical paper that for dispersive media the Cherenkov radiation is concentrated in a thin conical shell region whose vertex is at the moving charge and whose opening half-angle \eta is given by the expression

    \[ \boxed{\cot \eta=\left[\dfrac{d}{d\omega}\left(\omega\tan\theta_c\right)\right]_{\omega_0}=\left(\tan\theta_c+\beta^2\omega n(\omega) \dfrac{dn}{d\omega} \cot \theta_c\right)\bigg|_{\omega_0}}\]
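The group velocity formula is easy to evaluate for any model of n(\omega). The sketch below estimates dn/d\omega with a central finite difference; the dispersive law used in the usage note is a made-up, purely illustrative one and does not describe any real material. The function name `group_velocity` is hypothetical.

```python
def group_velocity(n_of_omega, omega, c=299_792_458.0, rel_step=1e-6):
    """v_g = c / (n + omega * dn/domega), with dn/domega from a central difference.

    n_of_omega is any callable returning the refractive index at an
    angular frequency omega (rad/s).
    """
    h = omega * rel_step
    dn_domega = (n_of_omega(omega + h) - n_of_omega(omega - h)) / (2.0 * h)
    return c / (n_of_omega(omega) + omega * dn_domega)
```

For a constant index, `group_velocity(lambda w: 1.33, 4e15)` returns c/1.33. For the illustrative law `lambda w: 1.32 + 1e-33 * w**2` (normal dispersion, dn/d\omega>0) the result is smaller than c/n at the same frequency.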

where \theta_c is the critical Cherenkov angle seen before, \omega_0 is the central value of the small frequency range under consideration under the Cherenkov condition. This cone has an opening half-angle \eta (please, compare with the previous convention with \varphi for consistency), and unless the medium is non-dispersive (i.e. dn/d\omega=0, n=constant), we get \theta_c+\eta\neq 90^\circ. Typical Cherenkov radiation imaging produces blue rings.
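The relation between \eta and \theta_c can be explored with a few lines of code. This is a minimal sketch of the formula for the opening half-angle, evaluated at a single frequency; the function name and the sample values of dn/d\omega are assumptions for illustration only.

```python
import math


def tamm_half_angle(beta, n, omega, dn_domega):
    """Return (theta_c, eta) in radians, from
    cot(eta) = tan(theta_c) + beta^2 * omega * n * (dn/domega) * cot(theta_c).

    Requires beta*n > 1 (strictly above the Cherenkov threshold).
    """
    if beta * n <= 1.0:
        raise ValueError("below or at the Cherenkov threshold")
    theta_c = math.acos(1.0 / (beta * n))
    tan_tc = math.tan(theta_c)
    cot_eta = tan_tc + beta**2 * omega * n * dn_domega / tan_tc
    eta = math.atan2(1.0, cot_eta)
    return theta_c, eta
```

With `dn_domega=0.0` the two angles add up to 90 degrees, and with a positive `dn_domega` the sum falls below 90 degrees, as stated above.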


When we consider the Cherenkov effect in the framework of QM, in particular the quantum theory of radiation, we can deduce the following formula for the Cherenkov effect that includes the quantum corrections due to the backreaction of the particle to the radiation:

    \[ \boxed{\cos\theta=\dfrac{1}{\beta n}+\dfrac{\Lambda}{2\lambda}\left(1-\dfrac{1}{n^2}\right)}\]

where, like before, \beta=v/c, n is the refraction index, \Lambda=\dfrac{h}{p}=\dfrac{h}{mv} is the De Broglie wavelength of the moving particle and \lambda is the wavelength of the emitted radiation.

Cherenkov radiation is observed whenever \beta n>1 (i.e. if v>c/n), and the limit of the emission is on the short wave bands (explaining the typical blue radiation of this effect). Moreover, \lambda_{min} corresponds to \cos\theta\approx 1.

On the other hand, the radiated energy per particle per unit of time is equal to:

    \[ \boxed{-\dfrac{dE}{dt}=\dfrac{e^2v}{c^2}\int_0^{\omega_{max}}\omega\left[1-\dfrac{1}{n^2\beta^2}-\dfrac{\Lambda}{n\beta\lambda}\left(1-\dfrac{1}{n^2}\right)-\dfrac{\Lambda^2}{4\lambda^2}\left(1-\dfrac{1}{n^2}\right)\right]d\omega}\]
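To get a feeling for how small the quantum correction is, one can compare the two terms of the corrected formula for an electron radiating visible light. The sketch below uses the relativistic momentum p=\gamma m\beta c to compute \Lambda=h/p, which is an assumption of this example; the constants are standard values and the function names are hypothetical.

```python
import math

HC_EV_NM = 1239.841984          # h*c in eV*nm
ELECTRON_MC2_EV = 0.51099895e6  # electron rest energy in eV


def de_broglie_nm(beta, mc2_ev=ELECTRON_MC2_EV):
    """De Broglie wavelength h/p in nm, using p*c = gamma*beta*m*c^2."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    pc_ev = gamma * beta * mc2_ev
    return HC_EV_NM / pc_ev


def cos_theta_terms(beta, n, wavelength_nm, mc2_ev=ELECTRON_MC2_EV):
    """Return (classical, quantum) contributions to cos(theta)."""
    classical = 1.0 / (beta * n)
    quantum = (de_broglie_nm(beta, mc2_ev) / (2.0 * wavelength_nm)
               * (1.0 - 1.0 / n**2))
    return classical, quantum
```

For \beta=0.9, n=1.33 and \lambda=400 nm the quantum term is many orders of magnitude smaller than the classical one, which is why the classical cone angle is such a good approximation.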

where \omega=2\pi c/n\lambda is the angular frequency of the radiation, with a maximum value of \omega_{max}=2\pi c/n\lambda_{min}.

Remark: In the non-relativistic case, v\ll c, and the condition \beta n>1 implies that n\gg 1. Therefore, neglecting the quantum corrections (the charged particle self-interaction/backreaction to radiation), we can take the limit \Lambda/\lambda\rightarrow 0 and the previous equations simplify into:

    \[ \boxed{\cos\theta=\dfrac{1}{n\beta}=\dfrac{c}{nv}}\]

    \[ \boxed{-\dfrac{dE}{dt}=\dfrac{e^2 v}{c^2}\int_0^{\omega_{max}}\omega\left(1-\dfrac{c^2}{n^2v^2}\right)d\omega}\]

Remember: \omega_{max} is determined with the condition \beta n(\omega_{max})=1, where n(\omega_{max}) represents the dispersive effect of the material/medium through the refraction index.


The number of photons produced per unit path length and per unit of energy by a charged particle (of charge Zq) is given by the celebrated Frank-Tamm formula:

    \[ \boxed{\dfrac{d^2N}{dEdx}=\dfrac{\alpha Z^2}{\hbar c}\sin^2\theta_c=\dfrac{\alpha^2 Z^2}{r_em_ec^2}\left(1-\dfrac{1}{\beta^2n^2(E)}\right)}\]

In terms of common values of fundamental constants, it takes the value:

    \[ \boxed{\dfrac{d^2N}{dEdx}\approx 370Z^2\sin^2\theta_c(E)eV^{-1}\cdot cm^{-1}}\]

or equivalently, per unit of wavelength, it can be written as follows

    \[ \boxed{\dfrac{d^2N}{dxd\lambda}=\dfrac{2\pi \alpha Z^2}{\lambda^2}\left(1-\dfrac{1}{\beta^2n^2(\lambda)}\right)}\]

The refraction index is a function of the photon energy E=\hbar \omega, as is the sensitivity of the transducer used to detect the light from the Cherenkov effect! Therefore, for practical uses, the Frank-Tamm formula must be multiplied by the transducer response function and integrated over the region for which we have \beta n(\omega)>1.

Remark: When two particles are close together (to be close here means to be separated by a distance d<1 wavelength), the electromagnetic fields from the particles may add coherently and affect the Cherenkov radiation. The Cherenkov radiation for an electron-positron pair at close separation is suppressed compared to two independent leptons!

Remark (II): Coherent radio Cherenkov radiation from electromagnetic showers is significant and it has been applied to the study of cosmic ray air showers. In addition to this, it has been used to search for showers induced by electron neutrinos.
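As an order-of-magnitude illustration of the Frank-Tamm formula, the sketch below integrates it over an energy window for a constant refractive index and a perfect detector. The window of 1.77 to 3.10 eV for visible light and the constant n are assumptions of this example, the constants are standard values, and the function names are hypothetical.

```python
ALPHA = 7.2973525693e-3        # fine-structure constant
HBAR_C_EV_CM = 1.973269804e-5  # hbar*c in eV*cm


def frank_tamm_prefactor():
    """alpha/(hbar*c) in photons per eV per cm (about 370)."""
    return ALPHA / HBAR_C_EV_CM


def photons_per_cm(beta, n, e_min_ev, e_max_ev, z=1):
    """Integral of d2N/dEdx over [e_min_ev, e_max_ev] for a constant n."""
    sin2_theta_c = 1.0 - 1.0 / (beta * n) ** 2
    if sin2_theta_c <= 0.0:
        return 0.0  # below the Cherenkov threshold: no photons
    return frank_tamm_prefactor() * z**2 * sin2_theta_c * (e_max_ev - e_min_ev)
```

For a singly charged particle with \beta\approx 1 in water (n=1.33), `photons_per_cm(1.0, 1.33, 1.77, 3.10)` gives a couple of hundred photons per centimetre before any detector efficiency is applied.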


The applications of Cherenkov detectors for particle identification (generally labelled as PID Cherenkov detectors) go well beyond the range of high-energy Physics. Their uses include: A) Fast particle counters. B) Hadronic particle identification. C) Tracking detectors performing complete event reconstruction. The PDG gives some examples of each category: a) the polarization detector of SLD, b) the hadronic PID detectors at B factories like BABAR or the aerogel threshold Cherenkov in Belle, c) large water Cherenkov counters like those in Superkamiokande and other neutrino detector facilities.

Cherenkov detectors contain two main elements: 1) A radiator/material through which the particle passes, and 2) a photodetector. As Cherenkov radiation is a weak source of photons, light collection and detection must be as efficient as possible. In general, a refractive material specifically designed to detect the particles of interest is almost mandatory.

The number of photoelectrons detected in a given Cherenkov radiation detector device is provided by the following formula (derived from the Frank-Tamm formula simply taking into account the efficiency in a straightforward manner):

    \[ \boxed{N=L\dfrac{\alpha^2 Z^2}{r_em_ec^2}\int \epsilon (E)\sin^2\theta_c(E)dE}\]

where L is the path length of the particle in the radiator/material, \epsilon (E) is the efficiency for collecting the Cherenkov light and transducing it into photoelectrons, and

    \[ \boxed{\dfrac{\alpha^2}{r_em_ec^2}=370eV^{-1}cm^{-1}}\]

Remark: The efficiencies and the Cherenkov critical angle are functions of the photon energy, generally speaking. However, since the typical energy dependent variation of the refraction index is modest, a quantity sometimes called the Cherenkov detector quality factor N_0 can be defined as follows

    \[ \boxed{N_0=\dfrac{\alpha^2Z^2}{r_em_ec^2}\int \epsilon dE}\]

In this case, we can write

    \[ \boxed{N\approx LN_0\langle\sin^2\theta_c\rangle}\]

Remark(II): Cherenkov detectors are classified into imaging or threshold types, depending on their ability to make use of Cherenkov angle information. Imaging counters may be used to track particles as well as to identify them.
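The yield estimate above can be tried numerically. This is only a minimal sketch: it assumes a constant efficiency over a photon energy window, uses the standard relation \cos\theta_c=1/(n\beta) for the Cherenkov angle, and the radiator length, refraction index, efficiency and energy window below are illustrative values, not the parameters of any real detector.

```python
import math

# alpha^2 / (r_e m_e c^2) from the text, in photoelectrons per eV per cm
K_EV_CM = 370.0

def cherenkov_angle(n, beta):
    """Cherenkov angle (rad) from cos(theta_c) = 1/(n*beta); None below threshold."""
    x = 1.0 / (n * beta)
    if x > 1.0:
        return None
    return math.acos(x)

def photoelectrons(length_cm, n, beta, efficiency, delta_e_ev, z=1):
    """N ~ L * N0 * sin^2(theta_c), with N0 = 370 * z^2 * efficiency * delta_E.

    Assumes a constant efficiency over the energy window delta_E (illustrative).
    """
    theta = cherenkov_angle(n, beta)
    if theta is None:
        return 0.0
    n0 = K_EV_CM * z**2 * efficiency * delta_e_ev
    return length_cm * n0 * math.sin(theta) ** 2

# Hypothetical example: 1 m of water-like radiator, 20% efficiency over 1 eV
print(photoelectrons(100.0, 1.33, 1.0, 0.2, 1.0))
```

Below threshold (n\beta<1) the function simply returns zero, which is the behaviour a threshold counter exploits.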

Other main uses/applications of the Vavilov-Cherenkov effect are:

1st. Detection of labeled biomolecules. Cherenkov radiation is widely used to facilitate the detection of small amounts and low concentrations of biomolecules. For instance, radioactive atoms such as phosphorus-32 are readily introduced into biomolecules by enzymatic and synthetic means and subsequently may be easily detected in small quantities for the purpose of elucidating biological pathways and in characterizing the interaction of biological molecules such as affinity constants and dissociation rates.

2nd. Nuclear reactors. Cherenkov radiation is used to detect high-energy charged particles. In pool-type nuclear reactors, the intensity of Cherenkov radiation is related to the frequency of the fission events that produce high-energy electrons, and hence is a measure of the intensity of the reaction. Similarly, Cherenkov radiation is used to characterize the remaining radioactivity of spent fuel rods.

3rd. Astrophysical experiments. The Cherenkov radiation from the charged particles produced by cosmic rays is used to determine the source and intensity of the cosmic rays, which is used for example in the different classes of cosmic ray detection experiments. For instance, Ice-Cube, Pierre-Auger, VERITAS, HESS, MAGIC, SNO, and many others. Cherenkov radiation can also be used to determine properties of high-energy astronomical objects that emit gamma rays, such as supernova remnants and blazars. In this last class of experiments we place STACEE, in New Mexico.

4th. High-energy experiments. We have quoted this already, and there are many examples in the current LHC, for instance, in the ALICE experiment.


Vacuum Cherenkov radiation (VCR) is the alleged and conjectured phenomenon which refers to the Cherenkov radiation/effect of a charged particle propagating in the physical vacuum. You can ask: why should it be possible? It is quite straightforward to understand the answer.

The classical (non-quantum) theory of relativity (both special and general)  clearly forbids any superluminal phenomena/propagating degrees of freedom for material particles, including this one (the vacuum case) because a particle with non-zero rest mass can reach speed of light only at infinite energy (besides, the nontrivial vacuum itself would create a preferred frame of reference, in violation of one of the relativistic postulates).

However, according to modern views coming from the quantum theory, especially our knowledge of Quantum Field Theory, the physical vacuum IS a nontrivial medium which affects the particles propagating through it, and the magnitude of the effect increases with the energies of the particles!

Then, a natural consequence follows: the actual speed of a photon becomes energy-dependent and thus can be less than the fundamental constant c=299792458m/s, the speed of light, so that sufficiently fast particles can overcome it and start emitting Cherenkov radiation. In summary, any charged particle surpassing the speed of light in the physical vacuum should emit (Vacuum) Cherenkov radiation. Note that it is an inevitable consequence of the non-trivial nature of the physical vacuum in Quantum Field Theory. Indeed, some crazy people saying that superluminal particles arise in jets from supernovae, or in colliders like the LHC, fail to explain why those particles don’t emit Cherenkov radiation. It is not true that real particles become superluminal in space or collider rings. It is also wrong in the case of neutrino propagation because, in spite of being chargeless, neutrinos should experience an effect analogous to the Cherenkov radiation, called the Askaryan effect.

Another (alternative) possibility or scenario arises in some Lorentz-violating theories (or even CPT-violating theories, which may or may not be equivalent to such Lorentz violations), when the speed of a propagating particle becomes higher than c, which turns this particle into a tachyon. A tachyon with an electric charge would lose energy as Cherenkov radiation, just as ordinary charged particles do when they exceed the local speed of light in a medium. A charged tachyon traveling in a vacuum therefore undergoes a constant proper-time acceleration and, by necessity, its worldline would form a hyperbola in space-time. This last type of vacuum Cherenkov effect can arise in theories like the Standard Model Extension, where Lorentz-violating terms do appear.

One of the simplest kinematic frameworks for Lorentz violating theories is to postulate some modified dispersion relations (MODRE) for particles, while keeping the usual energy-momentum conservation laws. In this way, we can provide and work out an effective field theory for breaking the Lorentz invariance. There are several alternative definitions of MODRE, since there is no general guide yet to discriminate among the different theoretical models. Thus, we could consider a general expansion in integer powers of the momentum, in the next manner (we set units in which c=1):

    \[ \boxed{E^2=f(p,m,c_n)=p^2+m^2+\sum_{n=-\infty}^{\infty}c_n p^n}\]

However, a softer expansion, depending only on positive powers of the momentum, is generally used in the MODRE. In such a case,

    \[ \boxed{E^2=f(p,m,a_n)=p^2+m^2+\sum_{n=1}^{\infty}a_n p^n}\]

and where p=\vert \mathbf{p}\vert. If Lorentz violations are associated to the yet undiscovered quantum theory of gravity, we would expect deviations from the dispersion relations of the special theory of relativity to appear at the natural scale of quantum gravity, say the Planck mass/energy. Restoring the factors of c for a moment, the Planck mass/energy is:

    \[ \boxed{M_P=\sqrt{\hbar c^5/G_N}=1.22\cdot 10^{19}GeV=1.22\cdot 10^{16}TeV}\]

Let us write and parametrize the Lorentz violations induced by the fundamental scale of quantum gravity (naively this Planck mass scale) by:
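As a quick cross-check of the quoted number, here is a small sketch that evaluates the Planck energy \sqrt{\hbar c^5/G_N} in SI units and converts it to GeV. The constants are the usual CODATA-style values typed in by hand, so treat the last digits as approximate.

```python
import math

HBAR = 1.054571817e-34             # reduced Planck constant, J s
C = 299792458.0                    # speed of light, m/s
G_N = 6.67430e-11                  # Newton constant, m^3 kg^-1 s^-2
JOULE_PER_GEV = 1.602176634e-10    # 1 GeV expressed in joules

def planck_energy_gev():
    """Planck energy sqrt(hbar * c^5 / G_N), converted from joules to GeV."""
    return math.sqrt(HBAR * C**5 / G_N) / JOULE_PER_GEV

print(f"{planck_energy_gev():.3e} GeV")
```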

    \[ \boxed{a_n=\dfrac{\Xi_n}{M_P^{n-2}}}\]

Here, \Xi_n is a dimensionless quantity that can differ from one particle (type) to another (type). Consider, for instance, n=3,4, since the terms with n<3 seem to be ruled out by previous terrestrial experiments. At higher energies, the lowest non-null term with n\geq 3 will dominate the expansion. The MODRE reads:

    \[ E^2=p^2+m^2+\dfrac{\Xi_a p^n}{M_P^{n-2}}\]

and where the label a in the term \Xi_a is specific to the particle type. Such corrections might only become important at the Planck scale, but there are two exceptions:

1st. Particles that propagate over cosmological distances can show differences in their propagation speed.
2nd. Energy thresholds for particle reactions can be shifted, or even forbidden processes can be allowed, if the p^n-term is comparable to the m^2-term in the MODRE. Threshold reactions can then be significantly altered or shifted, because they are determined by the particle masses. So a threshold shift should appear at scales where:

    \[ \boxed{p_{dev}\approx\left(\dfrac{m^2M_P^{n-2}}{\Xi}\right)^{1/n}}\]

Imposing/postulating that \Xi\approx 1, the typical scales for the thresholds for some different kinds of particles can be calculated. Their values for some species are given in the next table:
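The threshold scale p_{dev} is easy to evaluate directly from the boxed formula. The following is a minimal sketch for n=3 and \Xi=1; the electron and proton masses are the usual values, while the 1 eV neutrino mass is only an illustrative assumption of mine.

```python
# Threshold-shift scale p_dev = (m^2 * M_P^(n-2) / Xi)^(1/n), in GeV with c=1.
M_PLANCK_GEV = 1.22e19

# Rest masses in GeV; the 1 eV neutrino mass is only an illustrative guess.
MASSES_GEV = {
    "neutrino (assumed 1 eV)": 1.0e-9,
    "electron": 0.511e-3,
    "proton": 0.938272,
}

def p_dev(mass_gev, n=3, xi=1.0):
    """Momentum at which the Xi * p^n / M_P^(n-2) term is comparable to m^2."""
    return (mass_gev**2 * M_PLANCK_GEV**(n - 2) / xi) ** (1.0 / n)

for name, mass in MASSES_GEV.items():
    print(f"{name}: p_dev ~ {p_dev(mass):.3g} GeV")
```

With these inputs the electron scale comes out in the 10 TeV range and the proton scale in the PeV range, which is why astrophysical sources are the natural place to look.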


We can even study some different sources of modified dispersion relationships:

1. Measurements of time of flight.

2. Thresholds creation for: A) Vacuum Cherenkov effect, B) Photon decay in vacuum.

3. Shift in the so-called GZK cut-off.

4. Modified dispersion relationships induced by non-commutative theories of spacetime. Specially, there are time shifts/delays of photon signals induced by non-commutative spacetime theories.

We will analyse these four cases separately, in a very short and clear fashion. I wish!

Case 1. Time of flight. This is similar to the recently controversial OPERA experiment results. The OPERA experiment, and other similar set-ups, measure the neutrino time of flight. I dedicated a post to it early in this blog


In fact, we can measure the time of flight of any particle, even photons. A modified dispersion relation, like the one we introduced here above, would lead to an energy dependent speed of light. The idea of the time of flight (TOF) approach is to detect a shift in the arrival time of photons (or any other massless/ultra-relativistic particle like neutrinos) with different energies, produced simultaneously in a distant object, where the large distance amplifies the usually Planck-suppressed effect. In the following we use the dispersion relation for n=3 only, as modifications in higher orders are far below the sensitivity of current or planned experiments. The modified group velocity becomes:

    \[ v=\dfrac{\partial E}{\partial p}\]

and then, for photons (m=0, n=3), to first order in p/M with M=M_P,

    \[ v\approx 1+\Xi_\gamma\dfrac{p}{M}\]

The difference in the detection times of photons with different energies will then have the size:

    \[ \Delta t=\Xi_\gamma \dfrac{p}{M}D\]

where D is the distance, multiplied (if it is the case) by the redshift factor (1+z) to correct the energy for the redshift. In recent years, several measurements on different objects in various energy bands have led to constraints on \Xi up to the order of 100. They can be summarized in the next table (note that the best constraint comes from a short flare of the Active Galactic Nucleus (AGN) Mrk 421, detected in the TeV band by the Whipple Imaging Air Cherenkov telescope):


There is still room for improvements with current or planned experiments, although the distance for TeV-observations is limited by absorption of TeV photons in low energy metagalactic radiation fields. Depending on the energy density of the target photon field one gets an energy dependent mean free path length, leading to an energy and redshift dependent cut off energy (the cut off energy is defined as the energy where the optical depth is one).
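To get a feeling for the size of the time-of-flight effect, here is a rough sketch of \Delta t=\Xi_\gamma (p/M)D with M=M_P. The source distance of 10^9 light-years and the 1 TeV photon energy are hypothetical round numbers chosen by me for illustration, not the parameters of any particular measurement, and no redshift correction is applied.

```python
M_PLANCK_GEV = 1.22e19
SECONDS_PER_YEAR = 3.156e7

def tof_delay_seconds(energy_gev, distance_lyr, xi=1.0):
    """Arrival-time shift xi * (p / M_P) * D for n=3, relative to a low energy photon.

    With c=1 the distance D is the light travel time, here converted to seconds.
    """
    travel_time_s = distance_lyr * SECONDS_PER_YEAR
    return xi * (energy_gev / M_PLANCK_GEV) * travel_time_s

# Hypothetical source: 1 TeV photons from 1e9 light-years away, Xi = 1
print(f"{tof_delay_seconds(1.0e3, 1.0e9):.2f} s")
```

A delay of the order of seconds is comparable to the duration of short flares, which is why fast variability is so valuable for this kind of bound.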

2. Thresholds creation for: A) Vacuum Cherenkov effect, B) Photon decay in vacuum. On the other hand, the interaction vertex in quantum electrodynamics (QED) couples one photon with two leptons. We assume for photons and leptons the following dispersion relations (for simplicity we adopt units with M=1):

    \[ \omega_k^2=k^2+\xi k^n\]


    \[E^2_p=p^2+m^2+\Xi p^n\]

Let us write the photon tetramomentum as \mathbb{K}=(\omega_k,\mathbf{k}) and the lepton tetramomenta as \mathbb{P}=(E_p,\mathbf{p}) and \mathbb{Q}=(E_q,\mathbf{q}). It can be shown that tetramomentum conservation at the vertex gives

    \[ \xi k^n+\Xi p^n-\Xi q^n=2(E_p\omega_k-\mathbf{p}\cdot\mathbf{k})\]

where the r.h.s. is always positive. In the Lorentz invariant case the parameters \xi, \Xi are zero, so that this equation can’t be solved and all processes of the single vertex are forbidden. If these parameters are non-zero, there can exist a solution and so these processes can be allowed. We now consider two of these interactions to derive constraints on the parameters \Xi, \xi: the vacuum Cherenkov effect e^-\rightarrow \gamma e^- and the spontaneous photon decay \gamma\rightarrow e^+e^-.

A) As we have studied here, the vacuum Cherenkov effect is the spontaneous emission of a photon by a charged particle, with 0<E_\gamma<E_{par}. This effect occurs if the particle moves faster than the slowest possible radiated photon in vacuum!
In the case of \Xi>0, the maximal attainable speed for the particle, c_{max}, is faster than c. This means that the particle can always be faster than a zero energy photon with

    \[ \displaystyle{c_{\gamma_0}=c\lim_{k\rightarrow 0}\dfrac{\partial \omega}{\partial k}=c\lim_{k\rightarrow 0}\dfrac{2k+n\xi k^{n-1}}{2\sqrt{k^2+\xi k^n}}=c}\]

and it is independent of \xi. In the case of \Xi<0, i.e., when c_{par} decreases with energy, you need a photon with c_\gamma<c_{par}<c. This is only possible if \xi<\Xi.

Therefore, due to the radiation of photons, such an electron loses energy. The observation of highly energetic electrons allows us to derive constraints on \Xi and \xi. In the case of \Xi>0, with n=3, we have the bound

    \[ \Xi<\dfrac{m^2}{2p^3_{max}}\]

Moreover, from the observation of 50 TeV photons in the Crab Nebula (and its pulsar) one can conclude the existence of 50 TeV electrons, due to the inverse Compton scattering of these electrons with those photons. This leads to a constraint on \Xi of about

    \[ \Xi<1.2\times 10^{-2}\]

where we have used \Xi>0 in this case.
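The order of magnitude of this constraint can be reproduced from the bound \Xi<m^2/(2p^3_{max}) after restoring the Planck mass (the text uses units with M=1). This is just a sketch with the electron mass and p_{max}=50 TeV; small differences in the last digit with respect to the quoted value depend on the inputs used.

```python
M_PLANCK_GEV = 1.22e19
M_ELECTRON_GEV = 0.511e-3

def xi_upper_bound(p_max_gev, mass_gev=M_ELECTRON_GEV):
    """Bound Xi < m^2 / (2 p_max^3) in units M_P = 1.

    In GeV units this reads m^2 * M_P / (2 * p_max^3).
    """
    return mass_gev**2 * M_PLANCK_GEV / (2.0 * p_max_gev**3)

# 50 TeV electrons inferred from the Crab Nebula photons
print(f"Xi < {xi_upper_bound(5.0e4):.2e}")
```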

B) The decay of photons into positrons and electrons, \gamma\rightarrow e^+e^-, should be a very rapid spontaneous decay process. Gamma rays from the Crab Nebula are observed on Earth with energies up to E\sim 50TeV. Thus, we can reason that this rapid decay doesn’t occur at energies below 50 TeV. For the constraints on \Xi and \xi this condition means (again we impose n=3):

    \[ \xi<\dfrac{\Xi}{2}+0.08, \mbox{for}\; \xi\geq 0\]

    \[ \xi<\Xi+\sqrt{-0.16\Xi}, \mbox{for}\;\Xi<\xi<0\]


3. Shift in the GZK cut-off. As the energy of a proton increases, the pion production reaction can happen with low energy photons of the Cosmic Microwave Background (CMB).

This leads to an energy dependent mean free path length of the particles, resulting in a cutoff at energies around E_{GZK}\approx 10^{20}eV. This is the celebrated Greisen-Zatsepin-Kuzmin (GZK) cut off. The resonance for the GZK pion photoproduction with the CMB background can be read from the next condition (I will derive this condition in a future post):

    \[ \boxed{E_{GZK}\approx\dfrac{m_p m_\pi}{2E_\gamma}=3\times 10^{20}eV\left(\dfrac{2.7K}{E_\gamma}\right)}\]
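The boxed estimate can be checked with a few lines. In this sketch I take the typical CMB photon energy to be E_\gamma\approx k_BT with T=2.7 K and use the charged pion mass; both choices are assumptions of the illustration, and other reasonable choices move the number by factors of order one.

```python
K_BOLTZMANN_EV_PER_K = 8.617333e-5
M_PROTON_EV = 938.272e6
M_PION_EV = 139.570e6   # charged pion mass, an assumption of this sketch

def gzk_energy_ev(photon_energy_ev):
    """E_GZK ~ m_p * m_pi / (2 * E_gamma), all energies in eV."""
    return M_PROTON_EV * M_PION_EV / (2.0 * photon_energy_ev)

cmb_photon_ev = K_BOLTZMANN_EV_PER_K * 2.7   # E_gamma ~ k_B * T
print(f"E_GZK ~ {gzk_energy_ev(cmb_photon_ev):.2e} eV")
```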

Thus, in a Lorentz invariant world, the mean free path length of a particle of energy 5\cdot 10^{19} eV is 50 Mpc, i.e., particles above this energy are readily absorbed due to the pion photoproduction reaction. But most of the sources of ultra high energy particles are farther away than 50 Mpc. So, one expects no trace of particles of energy above 10^{20}eV on Earth. From the experimental point of view, AGASA has found a few particles with energy higher than the limit given by the GZK cutoff and claimed to disprove the presence of the GZK cutoff, or at least to point to a different threshold for it, whereas HiRes is consistent with the GZK effect. So, there are two main questions, not yet completely solved:

i) How can one get a definite proof of the non-existence of the GZK cut off?
ii) If the GZK cutoff doesn’t exist, what is the reason?

The first question could be answered by the observation of a large sample of events at these energies, which is necessary for a final conclusion, since the GZK cutoff is a statistical phenomenon. The current AUGER experiment, still under construction, may clarify if the GZK cutoff exists or not. The existence of the GZK cutoff would also yield new limits on Lorentz or CPT violation. For the second question, one explanation can be derived from Lorentz violation. If we do the calculation for the GZK cutoff in a Lorentz violating world, we would use the modified proton dispersion relation as described in our previous equations with MODRE.

4. Modified dispersion relationships induced by non-commutative theories of spacetime. As we said above, there are time shifts/delays of photon signals induced by non-commutative spacetime theories. Noncommutative spacetime theories introduce a new source of MODRE: the fuzzy nature of the discreteness of the fundamental quantum spacetime. Then, the general ansatz of this type of theories comes from:

    \[ \boxed{\left[\hat{x}^\mu,\hat{x}^\nu\right]=i\dfrac{\theta^{\mu\nu}}{\Lambda_{NC}^2}}\]


where \theta^{\mu\nu} are the components of an antisymmetric Lorentz-like tensor whose components are of order one. The fundamental scale of non-commutativity \Lambda_{NC} is supposed to be of the order of the Planck scale. However, there are models with large extra dimensions that induce non-commutative spacetime models with a scale near the TeV scale! This is interesting from the phenomenological side as well, not only from the theoretical viewpoint. Indeed, we can investigate in the following whether astrophysical observations are able to constrain a certain class of models with noncommutative spacetimes which are broken at the TeV scale or higher.

However, due to the antisymmetric character of the noncommutative tensor, we need a magnetic and electric background field in order to study this kind of models (generally speaking, we need some kind of field inducing/producing antisymmetric field backgrounds); otherwise, the dispersion relation for photons remains the same as in a commutative spacetime. Furthermore, there is no photon energy dependence of the dispersion relation. Consequently, the time-of-flight experiments are inappropriate, because they rely on an energy-dependent dispersion. Therefore, we suggest the next alternative scenario: suppose there exists a strong magnetic field (for instance, from a star or a cluster of stars) on the path of the photons emitted by a light source (e.g. gamma-ray bursts). Then, analogously to gravitational lensing, the photons experience a deflection and/or a change in time-of-arrival, compared to the same path without a magnetic background field. Some estimations for several known objects/examples are shown in this final table:


In summary:

1st. Vacuum Cherenkov and related effects modifying the dispersion relations of special relativity are natural in many scenarios beyond the Standard Relativity (BSR) and beyond the Standard Model (BSM).

2nd. Any theory allowing for superluminal propagation has to explain the null-results from the observation of the vacuum Cherenkov effect. Otherwise, they are doomed.

3rd. There are strong bounds coming from astrophysical processes and even neutrino oscillation experiments that severely constrain and kill many models. However, it is true that current MODRE bounds are far from being the most general bounds. We expect to improve these bounds with the next generation of experiments.

4th. Theories that can not pass these tests (SR obviously does) have to be banned.

5th. Superluminality has observable consequences, both in classical and quantum physics, both in standard theories and theories beyond standard theories. So, if you build a theory allowing superluminal stuff, you must be very careful with what kind of predictions it can and can not make. Otherwise, your theory is complete nonsense.

As a final closing, let me include some nice Cherenkov rings from Superkamiokande and MiniBooNE experiments. True experimental physics in action. And a final challenge…

FINAL CHALLENGE: Are you able to identify the kind of particles producing those beautiful figures? Let me know your guesses ( I do know the answer, of course).

Figure 1. Typical SuperKamiokande Ring. I dedicate this picture to the Japanese scientists I admire there. I really, really admire that country and its people, especially after disasters like the 2011 Earthquake and the Fukushima accident. If you are a Japanese reader/follower, you must know we support you from abroad. You were not, you are not and you shall not be alone.


Figure 2. Typical MiniBooNE ring. History: I used this nice picture in my Master Thesis first page, as the cover/title page main picture!


LOG#045. Fake superluminality.

Before becoming apparent superluminal readers, we are going to remember and review some elementary notation and concepts from the relativistic Doppler effect and the starlight aberration we have already studied in this blog.

Let us consider and imagine the next gedankenexperiment/thought experiment. Some moving object emits pulses of light during some time interval, denoted by \Delta \tau_e in its own frame. Its distance from us is very large, say

    \[ D>>c\Delta \tau_e\]

Question: Does it (light) arrive at time t=D/c? Suppose the object moves forming certain angle \theta according to the following picture


Time dilation means that a second pulse would be emitted with a time delay \Delta t_e=\gamma \Delta \tau_e after the previous pulse, and at that time the object would have travelled a distance \Delta x=v\Delta t_e\cos\theta farther away from us along the line of sight, so the second pulse would take an additional time \Delta x/c to arrive at its destination. The reception time between pulses would be:

    \[ \Delta t_r=\Delta t_e+\beta \Delta t_e\cos\theta=\gamma (1+\beta \cos\theta)\Delta \tau_e\]


    \[ \boxed{\Delta t_r=(1+\beta\cos\theta)\gamma \Delta \tau_e}\]

In the range 0\leq\theta\leq\pi/2, the time interval separation measured from both pulses in the rest frame on Earth will be longer than in the rest frame of the moving object. This analysis remains valid even if the 2 events are not light beams/pulses but successive packets or “maxima” of electromagnetic waves (electromagnetic radiation).

Astronomers define the dimensionless redshift

    \[ \boxed{(1+z)\equiv \dfrac{\Delta t_r}{\Delta \tau_e}=\gamma (1+\beta \cos\theta)}\]

where, as it is common in special relativity, \beta=v/c, \gamma^2=\dfrac{1}{1-\beta^2}

The 3 interesting limits of the above expression are:

1st. Receding emitter case. The moving object moves away from the receiver. Then, we have \theta=0 supposing a completely radial motion in the line of sight, and then a literal “redshift” ( lower frequencies than the proper frequencies)

    \[ (1+z)=\sqrt{\dfrac{1+\beta}{1-\beta}}\]

2nd. Approaching emitter case. The moving object approaches and goes closer to the observer. Then, we get \theta=\pi, or motion inward the radial direction, and then a “blueshift” ( higher frequencies than those of the proper frequencies)

    \[ (1+z)=\sqrt{\dfrac{1-\beta}{1+\beta}}\]

3rd. Tangential or transversal motion of the source (\theta=\pi/2). This is also called second-order redshift. It has been observed in extremely precise velocity measurements of pulsars in our Galaxy.

    \[ (1+z)=\gamma\]
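The three limits can be checked numerically against the general formula (1+z)=\gamma (1+\beta\cos\theta). This is a minimal sketch; the value \beta=0.5 is an arbitrary choice of mine.

```python
import math

def one_plus_z(beta, theta):
    """Redshift factor 1+z = gamma * (1 + beta*cos(theta)); theta=0 is pure recession."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return gamma * (1.0 + beta * math.cos(theta))

beta = 0.5
print(one_plus_z(beta, 0.0))          # receding emitter: sqrt((1+beta)/(1-beta))
print(one_plus_z(beta, math.pi))      # approaching emitter: sqrt((1-beta)/(1+beta))
print(one_plus_z(beta, math.pi / 2))  # transverse motion: gamma
```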

Furthermore, these redshifts have all been observed in different astrophysical observations and, in addition, they have to be taken into account for tracking the position via GPS, geolocating satellites and/or following their relative positions with respect to time or calculating their revolution periods around our planet.

Remark: Quantum Mechanics and Special Relativity would be mutually inconsistent IF we did not find the same formula for the ratios between energy and frequencies at different reference frames.

EXAMPLE: The emission line of the oxygen (II) [O(II)] is, in its rest frame, \lambda_0=3727\AA. It is observed in a distant galaxy to be at \lambda=9500\AA. What is the redshift z and the recession velocity of this galaxy?

Solution. From the definition of wavelength in electromagnetism, cT=\lambda and c\tau=\lambda_0. Then,

    \[ (1+z)=\dfrac{T}{\tau}=\dfrac{\lambda}{\lambda_0}=\dfrac{9500}{3727}=2.55\]

and thus z=1.55.

From the radial velocity hypothesis, we get

    \[ (1+z)=\sqrt{\dfrac{1+\beta}{1-\beta}}\]


    \[ \beta=\dfrac{(1+z)^2-1}{(1+z)^2+1}=0.73\]

and thus

    \[ \beta=0.73\]


    \[ v=0.73c\]
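The arithmetic of this example can be reproduced in a couple of lines. The sketch below just evaluates the two formulas used in the solution, under the same purely radial motion hypothesis.

```python
def redshift(lambda_obs, lambda_rest):
    """z from (1+z) = lambda_obs / lambda_rest."""
    return lambda_obs / lambda_rest - 1.0

def radial_beta(z):
    """beta from (1+z) = sqrt((1+beta)/(1-beta)), i.e. purely radial recession."""
    s = (1.0 + z) ** 2
    return (s - 1.0) / (s + 1.0)

z = redshift(9500.0, 3727.0)   # [O II] line, wavelengths in angstroms
print(f"z = {z:.2f}, beta = {radial_beta(z):.2f}")
```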

Note that this result follows from the hypothesis of the expansion of the Universe, and it holds in the relativistic theory of gravity, General Relativity, and it should also hold in extensions of it, even in Quantum Gravity somehow!

Remember: Stellar aberration causes the positions on the sky of the celestial objects to change as the Earth moves around the Sun. As the Earth’s velocity is about v_E\approx 30km/s, and then \beta_E\approx 10^{-4}, it implies an angular separation of about \Delta \theta\approx 10^{-4}rad. Anyway, it is worth mentioning that the astronomer Bradley observed this starlight aberration in 1729! A moving observer sees the light from stars at different positions with respect to an observer at rest, and the new position does not depend on the distance to the star. Thus, as the relative velocity increases, stars are “displaced” further and further towards the direction of observation.

Now, we are going to the main subject of the post. I decided to review these two important effects because it is useful to remember them and to understand that they are measured and they are real effects. They are not mere artifacts of the special theory of relativity masking some unknown reality. They are the reality in the sense they are measured. Alternative theories trying to understand these effects exist, but they are more complicated and they remind me of those people trying to defend the geocentric model of the Universe with that weird metaphenomenon known as epicycles, in order to defend what can not be defended from the experimental viewpoint.

In order to make our discussion visual and phenomenological, I am going to consider a practical example. A certain radio-galaxy, denoted by 3C 273, moves across the sky with an angular velocity

    \[ \omega=0.8 \mbox{ milliarcsec/yr}=4\cdot 10^{-9}\dfrac{\mbox{rad}}{\mbox{yr}}\]

Note that

    \[ 1 \mbox{ milliarcsec}=\left(\dfrac{10^{-3}}{3600}\right)^{\circ}\]

Knowing the expansion rate of the universe and the redshift of the radiogalaxy, its distance is calculated to be about 2.6\cdot 10^9 lyr. To obtain the relative tangential velocity, we simply multiply the angular velocity by the distance, i.e. v_{r\perp}=\omega D.

From the above data, we get that the apparent tangential velocity of our radiogalaxy would be about v_{r\perp}\approx 10c. Indeed, this observation is not isolated. There are even jets of matter flowing from some stars at apparent superluminal velocities. Of course, this is an apparent problem for SR. How can we explain it? How is it possible in the SR framework to obtain a superluminal velocity? It turns out that there is no contradiction with SR. The (fake and apparent) superluminal effect CAN BE EXPLAINED naturally in the SR framework in a very elegant way. Look at the following picture:
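The unit conversion and the product \omega D are simple enough to sketch in code. Since the distance is given in light-years and the angular velocity per year, the result comes out directly in units of c.

```python
import math

def mas_to_rad(mas):
    """Convert milliarcseconds to radians: 1 mas = (1e-3 / 3600) degrees."""
    return mas * 1.0e-3 / 3600.0 * math.pi / 180.0

def transverse_speed_in_c(omega_mas_per_yr, distance_lyr):
    """Apparent tangential speed omega * D; light-years per year equals units of c."""
    return mas_to_rad(omega_mas_per_yr) * distance_lyr

print(f"omega = {mas_to_rad(0.8):.2e} rad/yr")
print(f"v = {transverse_speed_in_c(0.8, 2.6e9):.1f} c")
```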


It shows:

-A moving object with velocity v=\vert \mathbf{v}\vert with respect to Earth, approaching Earth.

-There is some angle \theta in the direction of observation. And as it moves towards Earth, with our conventions, \theta\approx\pi=180^\circ

-The moving object emits flashes of light at two different points, A and B, separated by some time interval \Delta t_e in the Earth reference frame.

-The distance between those two points A and B, is very small compared with the distance object-Earth, i.e., d(A,B)<< D.

Question: What is the time separation \Delta t_r between the receptions of the pulses at the Earth surface?

The solution is very cool and intelligent. We get

A: arrival time t_A=\dfrac{D}{c} (taking the emission at A as t=0 in the Earth frame)

B: arrival time t_B=\Delta t_e+\dfrac{D+v\Delta t_e\cos\theta}{c}=t_A+\Delta t_e+\dfrac{v\Delta t_e\cos\theta}{c}

Note that \cos\theta<0!

From these equations, we get a combined equation for the time separation of pulses on Earth

    \[ \boxed{\Delta t_r=\Delta t_e (1+\beta \cos\theta)}\]

The tangential separation is defined to be

    \[ \Delta Y=Y_B-Y_A=v\Delta t_e\sin\theta\]

so, the apparent velocity of the source, seen from the Earth frame, is shown to be:

    \[ \boxed{v_a=\dfrac{\Delta Y}{\Delta t_r}=\dfrac{\beta\sin\theta}{1+\beta\cos\theta}c}\]

Remark (I): v_a>>c IFF \beta\approx 1 AND \cos\theta\approx -1!
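To see how the apparent velocity grows when \beta\approx 1 and \cos\theta\approx -1, here is a small numerical sketch of the boxed formula. It scans the angle on a grid instead of using any closed-form result, and the value \beta=0.99 is just an example of mine.

```python
import math

def apparent_speed(beta, theta):
    """Apparent transverse speed v_a/c = beta*sin(theta) / (1 + beta*cos(theta))."""
    return beta * math.sin(theta) / (1.0 + beta * math.cos(theta))

def scan_peak(beta, steps=10000):
    """Brute-force scan of theta in (0, pi); returns (theta, v_a/c) at the grid maximum."""
    best_theta, best_v = 0.0, 0.0
    for i in range(1, steps):
        theta = math.pi * i / steps
        v = apparent_speed(beta, theta)
        if v > best_v:
            best_theta, best_v = theta, v
    return best_theta, best_v

theta_peak, v_peak = scan_peak(0.99)
print(f"peak near theta = {math.degrees(theta_peak):.1f} deg, v_a = {v_peak:.2f} c")
```

Even though the source itself moves at 0.99c in this example, the apparent transverse speed at the peak is several times c, with the peak angle close to (but not equal to) 180 degrees.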

Remark (II): There are some other sources of fake superluminality in special relativity or general relativity (the relativistic theory of gravity). One example is that the phase velocity and the group velocity can indeed exceed the speed of light, since from the equation v_{ph}v_{g}=c^2 it is obvious that whenever one of those two velocities (group or phase velocity) is lower than the speed of light in vacuum, the other has to exceed the speed of light. That is not observable, but it has an important rôle in the de Broglie wave-particle portrait of the atom. Another important example of apparent and fake superluminal motion is caused by gravitational (micro)lensing in General Relativity. Due to the effect of intense gravitational fields (i.e., big concentrations of mass-energy), light beams from slow-moving objects can be magnified to make them, apparently, superluminal. In this sense, gravity acts in a way analogous to a lens, i.e., as if there were a refraction index modifying the propagation of the light emitted by the sources.

Remark (III): In spite of the appearance, I am not opposed to the idea of superluminal entities, if they don’t break established knowledge that we do know works. Tachyons have problems not completely solved and many physicists think (for good reasons) they are “unphysical”. However, my own experience working with theories beyond special/general relativity and allowing superluminal stuff (again, we should be careful with what we mean by superluminality and by “velocity” in general) has shown me that if superluminal objects do exist, they have observable consequences. And as it has been shown here, not every apparent superluminal motion is superluminal! Indeed, it can be handled in the SR framework. So, beware of crackpots claiming that there are superluminal jets of matter out there, that neutrinos are effectively superluminal entities (again, an observation refuted by OPERA, MINOS and ICARUS and in complete disagreement with the theory of neutrino oscillations and the real mass that neutrinos do have!) or even when they say there are superluminal protons and particles in the LHC or passing through the atmosphere without any effect that should be visible with current technology. It is simply not true, as every good astronomer, astrophysicist or theoretical physicist does know! Superluminality, if it exists, is a very subtle thing and it has observable consequences that we have not observed until now. As far as I know, there is no (accepted) observation of any superluminal particle, as every physicist does know. I have discussed the issue of neutrino time of flight here before:


Final challenge: With the data given above, what would the minimal value of \beta be in order to account for the observed motion and apparent (fake) superluminal velocity of the radiogalaxy 3C 273?

LOG#044. Hydrodynamics and SR.


Relativistic hydrodynamics is a branch of Relativity Theory that deals with fluids and/or molecules (“gases”) moving at relativistic speeds. Today, this area of Special Relativity has many applications. However, it has not always been so since, not so long ago, the question was:

Where could one encounter fluids or “gases” that would propagate with velocities close to the speed of light?

It was thought to be a question very far away from any realistic or practical use. At the present time, relativistic hydrodynamics IS an important part of Cosmology and of the theory of processes going on in the surrounding and ambient space of neutron stars (likely, of quark stars as well), compact massive objects and black holes. When the relativistic fluid flows at relativistic speeds under the strong gravitational fields existing in those extreme conditions, it leads to a big heating and X-ray emission, for instance. And then, a relativistic treatment of matter is inevitable there.

Caution note: I will use units with c=1 in this post, in general, without loss of generality.

Let me review a bit the non-relativistic hydrodynamics of ideal fluids and gases. Their dynamics is governed by the continuity equation (mass conservation) and the Euler equation:

    \[ \boxed{\mbox{Continuity equation:\;\;}\dfrac{\partial \rho}{\partial t}+\nabla \cdot(\rho \mathbf{v})=0}\]

    \[ \boxed{\mbox{Euler equation:\;\;}\rho \dfrac{d\mathbf{v}}{dt}+\nabla p=\mathbf{f}}\]

where the mass density and pressure of the fluid are respectively \rho and p. To complete the fluid equations, these equations need to be supplemented by an equation of state:

    \[ \boxed{\mbox{Equation of state:\;\;}p=p(\rho)}\]

The continuity equation expresses the fact that mass is an invariant in classical fluid theory. The Euler equation says how pressure changes and forces affect the velocity of the fluid, and finally the equation of state encodes the type of fluid we have at the macroscopic level, arising from the microscopic degrees of freedom that fluid theory itself cannot see.

On the other hand, it is worth mentioning that we cannot write the continuity equation in a covariant form \partial_\mu j^\mu=0 in such “a naive” way, with j^\mu=\rho (x)v^\mu. Why? It is pretty easy: it is a characteristic property of special relativity that the mass density \rho (x) does not satisfy such an equation, but we can derive a modified continuity equation that holds in SR. To build the right equations, we can proceed using an analogy with the electromagnetic field. Suppose we write a “stress-energy-momentum” tensor for an ideal fluid in the following way:

    \[ T_{\mu\nu}=\begin{pmatrix}\rho & 0 & 0& 0\\ 0 & p & 0 & 0\\ 0 & 0 & p & 0\\ 0 & 0 & 0 & p \end{pmatrix}\]

This tensor is written in the rest frame of the fluid. Note, then, that ideal fluids are characterized by the feature that their stress tensor T_{ij} contains no shear stresses (off-diagonal terms), and it is thus “proportional” to the Kronecker delta tensor \delta_{ij}.

The generalization of the above tensor (an ideal fluid) to an arbitrary reference frame, in which the fluid element moves with some 4-velocity components u^\mu is given by the next natural generalization:

    \[ T_{\mu\nu}=(\rho +p)u_\mu u_\nu-\eta_{\mu \nu}p\]

where again, \rho (x) represents the density, p(x) is the local fluid pressure field as measured in the rest frame of the fluid element, and \eta_{\mu\nu} is the Minkowski metric. Therefore, the equations of motion can be found, in the absence of external forces, from the conservation laws:

    \[ \boxed{\mbox{Relativistic continuity equation:\;\;}\partial_\nu T^{\mu\nu}=0}\]
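As a quick sanity check of the covariant form of the ideal-fluid tensor, here is a minimal Python sketch. It assumes the metric signature (+,-,-,-), units with c=1, and a fluid moving along the x-axis; the function names and the numerical values are purely illustrative. It boosts the rest-frame tensor diag(\rho,p,p,p) as a rank-2 tensor and compares the result with (\rho+p)u^\mu u^\nu-\eta^{\mu\nu}p (written with upper indices, which has the same structure).

```python
import math

# Minkowski metric, signature (+,-,-,-), units with c = 1 (assumed convention).
ETA = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, -1.0, 0.0, 0.0],
    [0.0, 0.0, -1.0, 0.0],
    [0.0, 0.0, 0.0, -1.0],
]


def boost_x(v):
    """Boost along x taking rest-frame components to a frame where the fluid moves with +v."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [
        [g, g * v, 0.0, 0.0],
        [g * v, g, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def boosted_rest_tensor(rho, p, v):
    """Transform diag(rho, p, p, p) as a rank-2 tensor: T'^{mn} = L^m_a L^n_a T^{aa}."""
    lam = boost_x(v)
    rest = [rho, p, p, p]  # diagonal entries of the rest-frame tensor
    return [
        [sum(lam[m][a] * lam[n][a] * rest[a] for a in range(4)) for n in range(4)]
        for m in range(4)
    ]


def ideal_fluid_tensor(rho, p, v):
    """Covariant expression (rho + p) u^m u^n - eta^{mn} p."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    u = [g, g * v, 0.0, 0.0]  # 4-velocity of the fluid element
    return [
        [(rho + p) * u[m] * u[n] - ETA[m][n] * p for n in range(4)]
        for m in range(4)
    ]


def max_difference(a, b):
    """Largest absolute component-wise difference between two 4x4 arrays."""
    return max(abs(a[m][n] - b[m][n]) for m in range(4) for n in range(4))


# Illustrative values: a radiation-like fluid (p = rho/3) moving at v = 0.6.
difference = max_difference(
    boosted_rest_tensor(1.0, 1.0 / 3.0, 0.6),
    ideal_fluid_tensor(1.0, 1.0 / 3.0, 0.6),
)
print(difference)
```

The two constructions should agree up to floating-point rounding, which is just the statement that the covariant expression is the boosted rest-frame tensor.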

Inserting the above tensor for the fluid, we get

    \[ \partial_\nu \left[ \left( \rho +p\right)u^\mu u^\nu\right]-\partial^\mu p=0\]

We can relate this equation to the non-relativistic Euler equation. The trick is easy: we first multiply by u_\mu and, after a short calculation, since u^\mu u_\mu=1 (with c=1 and the metric signature used for the tensor above) and (\partial_\nu u^\mu) u_\mu=0, it provides us with

    \[ \partial_\mu(\rho u^\mu)+p\partial_\mu u^\mu=0\]

This last equation shows, indeed, that the mass current \rho u^\mu is not conserved by itself. Recall that the spatial part of our relativistic fluid equations reads (the sum runs over all four spacetime indices):

    \[ \partial_\mu \left[ \left( \rho +p\right)\mathbf{v} u^\mu\right]+\nabla p=0\]

If we define the so-called “convective” or “comoving” derivative of an arbitrary tensor field T as the derivative along the flow:

    \[ \dot{T}=\dfrac{DT}{d\lambda}=u^\mu\partial_\mu T\]

we can rewrite the spatial part of the relativistic fluid dynamics as follows:

    \[ \boxed{\mbox{Relativistic Euler equation in 3d-space:\;}\dot{p}\mathbf{v}+(\rho+p)\dot{\mathbf{v}}+\nabla p=0}\]

We can check that it reduces to the classical Euler equation in the comoving frame, where u^\mu=(1,\mathbf{0})^T, except for an extra pressure term: the inertia \rho is replaced by \rho+p/c^2 once we reintroduce the speed of light. That pressure term is the hallmark of the generalized mass-energy density of relativistic hydrodynamics!

Remark: In the case of electromagnetic radiation, the pressure is p=\rho/3, due to the tracelessness of the electromagnetic stress-energy-momentum tensor.

Returning to our complete relativistic equation, we observe that the time component of that equation has NOT turned out to be the relativistic version of the continuity equation, as we warned before. The latter rather has to be postulated separately, using additional insights from elementary particle physics! In this way, instead of a mass density conservation, we do know, e.g., that the baryon density n(x) does satisfy an equation of continuity (at least from current knowledge of fundamental physics):

    \[ \partial_\mu (n u^\mu)=0\]

and it merely says that baryon number is conserved under reasonable conditions (of course, we do suspect at the current time that baryon number conservation is not a good symmetry in the early Universe, but today it holds with good accuracy). Similarly, it can be said that for an electron gas, the baryon density has to be replaced by the so-called lepton density in the equation of continuity, where we could consider a gas with electrons, muons, tauons and their antiparticles. We could treat the neutrino density as well, with suitable care. However, for photons and “mesons” there is NO continuity equation, since they can be created and annihilated arbitrarily.

The relationship between n, p and \rho can be obtained from the equation of state and basic thermodynamics from the definition of pressure:

    \[ p=-\dfrac{d(\mbox{Energy per baryon})}{d(\mbox{volume per baryon})}=-\dfrac{d(\rho/n)}{d(1/n)}=n\dfrac{d\rho}{dn}-\rho\]

Separating variables and integrating, we get

    \[ \int \dfrac{d\rho}{p(\rho)+\rho}=\int \dfrac{dn}{n}\]
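To see how this integral fixes n(\rho) in practice, here is a minimal Python sketch. It assumes a linear equation of state p=w\rho (the constant w is an assumption of this example, not something used elsewhere in the post), for which the integral gives n\propto \rho^{1/(1+w)}. The helper name and the numbers are illustrative.

```python
import math


def baryon_density_ratio(rho0, rho1, pressure, steps=20000):
    """n(rho1)/n(rho0) = exp( integral of d(rho) / (p(rho) + rho) ), midpoint rule."""
    h = (rho1 - rho0) / steps
    integral = 0.0
    for k in range(steps):
        rho = rho0 + (k + 0.5) * h  # midpoint of the k-th sub-interval
        integral += h / (pressure(rho) + rho)
    return math.exp(integral)


w = 1.0 / 3.0  # radiation-like fluid, p = rho/3 (illustrative choice)
numeric = baryon_density_ratio(1.0, 8.0, lambda rho: w * rho)
closed_form = 8.0 ** (1.0 / (1.0 + w))  # n ~ rho^(1/(1+w)) for p = w*rho
print(numeric, closed_form)
```

For dust (w=0) the same routine returns n\propto\rho, as expected when the internal energy is negligible.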

Thus, with these equations, we can know the density n(\rho). The mass density \rho and the baryon density n differ by the density n\epsilon of the inner energy ( we have defined \epsilon as the specific inner energy or equivalently the inner energy per baryon):

    \[ \rho=n(1+\epsilon)\]

The inner energy is negative if energy is released in the formation of the state \rho, e.g. the binding energy of a nucleus, and it is positive if energy has to be spent, e.g. if we do compressional work on the state.

Moreover, the specific entropy s=S/B or entropy per baryon and the temperature are defined by postulating the thermodynamical equilibrium and, with that integrating factor 1/T coming from elementary thermodynamics, we can write

    \[ ds=\dfrac{1}{T}\left(d\epsilon+pd\left(\dfrac{1}{n}\right)\right)\]

since we have a specific volume equal to v=1/n. Entropy is constant along a stream line of the ideal fluid, and it follows from

    \[ \dot{p}=\partial_\mu\left[(p+\rho)u^\mu\right]=\partial_\mu\left[(n+\epsilon n+p)u^\mu\right]=n\dot{\epsilon}+p\partial_\mu u^\mu+\dot{p}\]

so that n\dot{\epsilon}+p\partial_\mu u^\mu=0. Using \partial_\mu(nu^\mu)=0, i.e., \partial_\mu u^\mu=-\dot{n}/n, and dividing by n, we get

    \[ T\dot{s}=\dot{\epsilon}+p\dfrac{d}{d\lambda}\left(\dfrac{1}{n}\right)=0\]

In conclusion, we deduce that the time component of the relativistic conservation law of fluid dynamics tells us that, in the case of an ideal fluid, no energy is converted into heat, and the entropy itself remains constant! Isn’t it amazing? Entropy rocks!

Final remark (I): For non-ideal fluids, the ansatz for the energy-momentum-stress tensor has to be modified into this one

    \[ T_{\mu\nu}=(\rho+p)u_\mu u_\nu+(q_\mu u_\nu+q_\nu u_\mu)-\eta_{\mu\nu}p-\pi_{\mu\nu}\]

Final remark (II): Relativistic hydrodynamics can be generalized to charged fluids and plasmas, and in that case, it is called relativistic magnetohydrodynamics. One simply adds the stress-energy-momentum tensor of the electromagnetic field to the tensor of the fluid in such a way it is conserved as well, i.e., with zero divergence. Thus, we would get an extra Lorentz-like force into the relativistic generalization of the Euler equation!

Final remark (III): The measurement of thermodynamical quantities like pressure, entropy or temperature, and their treatment with classical thermodynamics, presuppose that the thermodynamical equilibrium state has been reached. Please, note that if that were not the case, we should handle the problem with the tools of non-equilibrium thermodynamics or some other type of statistical mechanics that could describe out-of-equilibrium states (there are some suggestions of such a generalization of thermodynamics, and/or statistical mechanics, from the works of I. Prigogine, C. Tsallis, C. Beck and many other physicists).

LOG#043. Tachyons and SR.


“(…)Suppose that someone studying the distribution of population on the Hindustan Peninsula cocksuredly believes that there are no people north of the Himalayas, because nobody can pass through the mountain ranges! That would be an absurd conclusion. The inhabitants of Central Asia have been born there; they are not obliged to be born in India and then cross the mountain ranges. The same can be said about superluminal particles(…)”  This is a quote by George Sudarshan.

I had the honour to meet George Sudarshan (his full name is Ennackal Chandy George Sudarshan, http://en.wikipedia.org/wiki/Sudarshan,_E._C._George) some years ago, in Jaca (Huesca), during the Sudarshanfest celebrating his 75th birthday. I also met some really cool people like Susumu Okubo (yes, the man behind the Gell-Mann-Okubo mass formula for hadrons, see http://en.wikipedia.org/wiki/Gell-Mann-Okubo_mass_formula!). I really enjoy getting to know (Japanese) scientists when they are so kind and generous. Mr. Okubo indeed gave me a copy of his wonderful book about non-associative stuff. On the other hand, Indian scientists are also fascinating because they tend to be very uncommon people, and some of them, like Ramanujan in Mathematics, have exceptional gifts and talents.

George Sudarshan has made contributions in many branches of Physics. He originally proposed as well the V-A nature of the electroweak interactions, which recovers the Fermi theory of weak interactions in the low energy limit. In addition to this, he has developed in the field of optical coherence the so-called Sudarshan-Glauber representation in Quantum Optics, he invented the theory of the Quantum Zeno effect, he has worked on open quantum systems and on the spin-statistics relationship in general QFT contexts, and he is one of the defenders of the existence of tachyons, via an interpretation of relativity including those faster than light particles created by Feinberg and studied by E. Recami, Feinberg, M. Pavsic, Gregory Benford (yes, the writer and physicist/astronomer) and other physicists. His works about relativity with tachyons are usually labelled under the name “metarelativity”. I would like to dedicate this post to him.

According to Wikipedia, a tachyon or tachyonic particle is a hypothetical particle that, a priori, always moves faster than light. The word comes from the Greek word ταχύς or tachys, meaning “swift, quick, fast, rapid”, and was coined by Gerald Feinberg in a 1967 paper. Feinberg proposed that tachyonic particles could be quanta of a quantum field with negative squared mass. However, it was soon realized that excitations of such imaginary mass fields do not in fact propagate faster than light, but instead represent an instability known as tachyon condensation.  Nevertheless, they are still commonly known as “tachyons”, and have come to play an important role in modern physics; for instance, their role in string theory is still being studied. I will not discuss advanced topics like tachyon condensation in this post; I expect to do it some day in the near future.

Superluminality and velocity are subtle concepts in SR. I have to discuss more about (apparent) superluminality in SR, but the goal of this post is somewhat more modest. I am going to introduce you to tachyonic particles and some of their curious features.

First point. From the addition theorem of velocities in special relativity:

    \[ V=\dfrac{v_1+v_2}{1+\dfrac{v_1 v_2}{c^2}}\]

we can see that, a priori, the range of velocities is not restricted by that particular formula. The real issue with faster than light particles in special relativity comes from the relativistic expression of energy:

    \[ E=Mc^2=m\gamma c^2=\dfrac{mc^2}{\sqrt{1-\dfrac{v^2}{c^2}}}\]

This formula blows up at v=c for any finite value of the rest mass m! Indeed, we can say that particles slower than light, also called “tardyons”, have that energy-mass relation, while for photons (or luxons), we do know that E=pc and they are massless particles. What happens if we forget our prejudices and we allow for v>c velocities while keeping the SR formula for mass-energy? Well, it is easy to realize that E becomes an imaginary number! Imaginary numbers are complex numbers with no real part, whose square is a negative number. For instance, the solution to the equation x^2=-1 is the imaginary unit x=i=\sqrt{-1}. Don’t forget that complex numbers are more general numbers of the form z=a+bi, with magnitude \vert z \vert ^2=a^2+b^2.

If we plug v>c in the SR formula for mass-energy, we get a negative number inside the square root. After some easy algebra, for v>c we obtain:

    \[ E=\dfrac{mc^2}{\sqrt{1-\dfrac{v^2}{c^2}}}=\dfrac{mc^2}{i\sqrt{\dfrac{v^2}{c^2}-1}}=\dfrac{\overline{m}c^2}{\sqrt{\dfrac{v^2}{c^2}-1}}\]

and where we have defined the imaginary mass quantity \overline{m}=\dfrac{m}{i}=-im. We could also have defined the “barred” gamma factor as

    \[ \boxed{\gamma \equiv -i\overline{\gamma}=\dfrac{1}{\sqrt{1-\dfrac{v^2}{c^2}}}}\leftrightarrow \boxed{\overline{\gamma}=\dfrac{1}{\sqrt{\dfrac{v^2}{c^2}-1}}}\]

so, for tachyons,

    \[ \boxed{E=\overline{m}\overline{\gamma}c^2=-im\overline{\gamma}c^2=\dfrac{\overline{m}c^2}{\sqrt{\dfrac{v^2}{c^2}-1}}}\]

Moreover, with this imaginary gamma/boost factor, we would define metarelativity or Lorentz transformations for tachyons:

    \[ x'=-i\overline{\gamma}(x-vt)\]

    \[ y'=y\]

    \[ z'=z\]

    \[ t'=-i\overline{\gamma}\left(t-\dfrac{v}{c^2}x\right)\]

and so on with any other transformation including imaginary boosts. Thus, we have extended the special relativity to the imaginary realm and we call this theory metarelativity since it includes superluminal transformations. For superluminal transformations, we also have the same invariants than those for usual relativity. For instance:

    \[ c^2t^{'2}-x^{'2}=c^2t^2-x^2=c^2\overline{t'}^2-\overline{x'}^2\]

The price is that we have imaginary position and time coordinates (or imaginary boosts, if you prefer that idea).

A curious property of the relationships \overline{m}=-im and \overline{\gamma}=i\gamma is that

    \[ \overline{m}\overline{\gamma}=m\gamma\]

This equation shows that a tachyonic imaginary mass and an imaginary gamma factor are equivalent, when they are multiplied, to the real valued common relativistic expression for the relativistic mass. So, we could handle tachyons with imaginary masses and imaginary gamma factors, at least in principle, in the same operational way we handle normal particles in SR, except for the sign inside the square root and the strange inertial properties of those tachyonic particles.

In addition to this fact, the energy of tachyonic particles has some interesting properties:

1st. E=\overline{m}\overline{\gamma}c^2 decreases as v increases! That is, tachyons are less energetic and so “more stable” when they have higher velocities. This behaviour is very different from the common inertial properties of normal matter. Unlike ordinary particles, the speed of a tachyon increases as its energy decreases.

2nd. A bradyon, also known as a tardyon or ittyon, is a particle that travels slower than light. The term “bradyon”, from Greek word βραδύς (bradys, “slow”), was coined to contrast with the name of the tachyon. Just as bradyons are forbidden to break the light-speed barrier, so too are tachyons forbidden from slowing down to below c, because infinite energy is required to reach the barrier from either above or below.

3rd. Einstein, Tolman and others noted that special relativity allowing for tachyons/superluminal transmissions would imply that they could be used to send signals backwards in time. This tool or use of tachyon pulses is called the tachyonic antitelephone device.
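The algebra of the barred quantities and the first property in the list above can be checked numerically. The following is a minimal Python sketch, in units with c=1; it assumes the principal branch of the complex square root (which reproduces \gamma=-i\overline{\gamma} for v>c) and an illustrative imaginary rest mass m=i, so that \overline{m}=-im=1. The function names are hypothetical helpers, not part of any library.

```python
import cmath
import math


def gamma_complex(beta):
    """Ordinary gamma factor continued to beta > 1 (principal square root)."""
    return 1.0 / cmath.sqrt(1.0 - beta * beta)


def gamma_bar(beta):
    """Real 'barred' gamma factor, defined only for beta > 1."""
    return 1.0 / math.sqrt(beta * beta - 1.0)


def tachyon_energy(m_bar, beta):
    """E = m_bar * gamma_bar * c^2, in units with c = 1."""
    return m_bar * gamma_bar(beta)


beta = 2.0
m = 1j            # imaginary rest mass (illustrative value)
m_bar = -1j * m   # barred mass, real by construction

# Check the identity m_bar * gamma_bar = m * gamma for this beta.
identity_gap = abs(m_bar * gamma_bar(beta) - m * gamma_complex(beta))

# Energy for increasing speeds: it should decrease monotonically towards zero.
energies = [tachyon_energy(1.0, b) for b in (1.1, 2.0, 5.0, 10.0)]
print(identity_gap, energies)
```

The list of energies decreases as the speed grows, and it blows up as v approaches c from above, mirroring the behaviour of ordinary particles below the light barrier.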

An electrically charged tachyon would lose energy as Cherenkov radiation, just as ordinary charged particles do when they exceed the local speed of light in a medium. A charged tachyon traveling in a vacuum therefore undergoes a constant proper time acceleration and, by necessity, its worldline forms a hyperbola in space-time.

However,  if we reduce the tachyon’s energy, then it increases its speed, so that the single hyperbola formed is of two oppositely charged tachyons with opposite momenta (same magnitude, opposite sign) which annihilate each other when they simultaneously reach infinite speed at the same place in space. (At infinite speed the two tachyons have no energy each and finite momentum of opposite direction, so no conservation laws are violated in their mutual annihilation. The time of annihilation is frame dependent.)

Even an electrically neutral tachyon would be expected to lose energy via gravitational Cherenkov radiation, at least in theory, because it has a gravitational mass, and therefore to increase in speed as it travels, as described above. However, we have not detected gravitational Cherenkov radiation, as far as I know. If the tachyon interacts with any other particles, it can also radiate Cherenkov energy into those particles. Neutrinos interact with the other particles of the Standard Model and, recently,  Andrew Cohen and Sheldon Glashow used this to argue that the neutrino anomaly seen by the OPERA experiment could not be explained by making neutrinos propagate faster than light. Indeed, we do know that neutrinos have a non-zero REAL mass from neutrino oscillation experiments.

Coming back to tachyons, we have seen that SR allows them if you allow for “imaginary energies”. A tachyon has the strange feature that its mass has the value

    \[ \mbox{TACHYON MASS}= \mbox{(SOMETHING)}\times i\]

so, while I might weigh something like 70 kg, a tachyon clone of me would weigh 70i kg. You can wonder what the imaginary mass means in terms of inertia with the SR equation above, but, of course, it is a weird result after all. And there are more “problems” and weird results for tachyons. For instance, if tachyons do gravitate according to Newton’s gravitation equation:

    \[ F_N=-G\dfrac{M_1 M_2}{d^2}\]

then they would experience “antigravitation”/antigravity. You can “deduce” this from the gravitational force between two tachyons with masses M_1=M_2=i separated by a distance of 1m: since i^2=-1, the gravitational pull between those tachyons would be repulsive, because the sign of the Newtonian gravitational force would be positive instead of negative! Is it not amazing? Yes, you can wonder about the Dark Energy enigma, mysterious stuff out there, but there are quantum problems related to “superluminal” tachyons. I will discuss them in the future, I promise. So, it is not easy at all to associate a tachyonic field/mass origin to the Dark Energy. And of course, this hypothesis of antigravitating tachyons faces problems when we think about what an imaginary gravitational force between a tachyon and a tardyon would mean. It shows that the mysteries of tachyons are not yet completely understood, and they are connected with the theory of scalar fields and the phenomenon, previously mentioned, of tachyon condensation. Let me know if you understand them better!

Moreover, the transversal length contraction of a tachyon, and the time dilation of a tachyon in metarelativity are imaginary quantities as well. It is an easy exercise to derive the following relationships:

    \[ L_{\updownarrow}=iL_0\overline{\gamma}^{-1}\]

    \[ \tau=i\tau_0 \overline{\gamma}\]

A second post about tachyons and metarelativity is coming, but before that, you will have to wait for a while. I have other topics in my current agenda to be published, previously, to more tachyonic posts. I suppose I am not being beamed with tachyons from the future.

Let the tachyons be with you ;)!

LOG#042. Pulley with variable mass.


This interesting problem was recently found in a certain public examination. I will solve it here since I found it fascinating and useful, as it is a problem with “variable mass”.

A pulley with negligible mass is given (see the figures above). Its radius is R. We tie two masses, m_1 and m_2 to the extremes of the pulley with the aid of a string. The string has a linear mass density \lambda and it has a total length equal to L. Calculate:

a) The forces in the extremes of the pulley over the masses and the acceleration when we release the system from the rest.

b) The velocity of the masses and the position of the masses as a function of time.

Remark: We have the constraint q_2+\pi R+q_1=L. L is the total length of the string, R is the radius of the pulley, and q_2 and q_1 are the lengths of string hanging from the pulley on the sides of the masses m_2 and m_1, respectively. We can suppose that initially q_1=0 and q_2=l. Note that, due to this constraint, \dot{q_1}+\dot{q_2}=0, where the dot denotes the derivative with respect to time, i.e., the velocities of the two masses.


a) Suppose m_1>m_2. Thus, the forces acting on the masses are:

\boxed{F_1=m_1 g+\lambda gx-T=\overline{m_1}a=(m_1+\lambda x)a}

\boxed{F_2=T-m_2 g-\lambda g (l-x)=\overline{m_2}a=(m_2+(L-x)\lambda) a}

Adding the two equations, we get

m_1g-m_2g+2\lambda g x-\lambda g l=(m_1+m_2+L\lambda)a

and then the acceleration will be

\boxed{a=\dfrac{m_1g-m_2g-\lambda gl+2\lambda gx}{m_1+m_2+\lambda L}=\dfrac{(m_1-m_2)g-l\lambda g+2g\lambda x}{m_1+m_2+\lambda L}}

b) Firstly, we have to derive the differential equations for the system with variable mass.
The equations for both masses are, respectively, using Newton’s laws:

(m_1+\lambda x)x''=m_1 g+\lambda gx-T

(m_2+(L-x)\lambda)x''=T-m_2g-\lambda g(L-x)

Adding these two equations we get the differential equation to be solved:

\boxed{a=x''=\dfrac{m_1g-m_2g-\lambda g L+2g\lambda x}{m_1+m_2+\lambda L}}

with x(0)=x'(0)=0

We can solve the differential equation

x''=\dfrac{(m_1-m_2)g-L\lambda g+2g\lambda x}{m_1+m_2+\lambda L}

with the initial conditions v(0)=x'(0)=0 and x(0)=0 easily, to obtain

\boxed{v(t)=x'(t)=\dfrac{((m_1-m_2)-\lambda L)\sqrt{\lambda g}}{\lambda \sqrt{2(m_1+m_2+\lambda L)}}\sinh \left(\sqrt{\dfrac{2\lambda g}{m_1+m_2+\lambda L}}t\right)}

\boxed{x(t)=\dfrac{(m_1-m_2-\lambda L)}{\lambda}\sinh^2 \left(\sqrt{\dfrac{\lambda g}{2(m_1+m_2+\lambda L)}}t\right)}
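The boxed solutions can be cross-checked by integrating the differential equation numerically. Here is a minimal Python sketch that uses a hand-written fourth-order Runge-Kutta step (my choice of method for the check, not part of the original solution) and compares it with the closed forms. The parameter values are illustrative only, chosen so that m_1-m_2-\lambda L>0 and x stays below L.

```python
import math


def acceleration(x, m1, m2, lam, L, g):
    """Right-hand side of the equation of motion x'' = a(x)."""
    return ((m1 - m2) * g - lam * g * L + 2.0 * g * lam * x) / (m1 + m2 + lam * L)


def closed_form(t, m1, m2, lam, L, g):
    """Boxed solutions x(t), v(t) for the initial conditions x(0) = x'(0) = 0."""
    total = m1 + m2 + lam * L
    amp = m1 - m2 - lam * L
    x = amp / lam * math.sinh(math.sqrt(lam * g / (2.0 * total)) * t) ** 2
    v = (amp * math.sqrt(lam * g) / (lam * math.sqrt(2.0 * total))
         * math.sinh(math.sqrt(2.0 * lam * g / total) * t))
    return x, v


def integrate_rk4(t_end, steps, m1, m2, lam, L, g):
    """Integrate x'' = a(x) from rest with a classical RK4 scheme."""
    x, v = 0.0, 0.0
    h = t_end / steps
    for _ in range(steps):
        k1x, k1v = v, acceleration(x, m1, m2, lam, L, g)
        k2x, k2v = v + 0.5 * h * k1v, acceleration(x + 0.5 * h * k1x, m1, m2, lam, L, g)
        k3x, k3v = v + 0.5 * h * k2v, acceleration(x + 0.5 * h * k2x, m1, m2, lam, L, g)
        k4x, k4v = v + h * k3v, acceleration(x + h * k3x, m1, m2, lam, L, g)
        x += h * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        v += h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
    return x, v


# Illustrative parameters (SI units): masses in kg, lam in kg/m, L in m.
params = dict(m1=3.0, m2=1.0, lam=0.5, L=2.0, g=9.81)
x_num, v_num = integrate_rk4(0.5, 1000, **params)
x_exact, v_exact = closed_form(0.5, **params)
print(x_num, x_exact, v_num, v_exact)
```

Both pairs of numbers should coincide to many digits as long as x remains smaller than the available length of string.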

As the mass of the string depends on the linear density, the usual procedure to obtain the variation of the kinetic energy as the work done by the potential forces is somewhat more complicated:

\Delta E_c=\Delta W (x)=\int_0^x F(x)dx

It can be shown that

\dfrac{1}{2}(m_1+m_2+\lambda L)v^2=(m_1-m_2+\lambda x-\lambda l)g x

v=\sqrt{\dfrac{2gx(m_1-m_2+\lambda x-\lambda l)}{m_1+m_2+\lambda L}}

Please, note that in this case, we have to be very careful with l and L. They are related by a constraint, but they are not the same thing.

LOG#041. Muons and relativity.



QUESTION: Is the time dilation real or is it an artifact of our current theories?

There are solid arguments why time dilation is not an apparent effect but a macroscopic measurable effect. Today, we are going to discuss the “reality” of time dilation with a well known result:

Muon detection experiments!

Muons are enigmatic elementary particles from the second generation of the Standard Model with the following properties:

1st. They are created in the upper atmosphere at altitudes of about 9000 m, when cosmic rays hit the Earth, and they are a common secondary product in the showers created by those still mysterious cosmic rays.
2nd. The average life span is 2\times 10^{-6}s= 2\mu s.
3rd. Typical speed is 0.998c or very close to the speed of light.
So we would expect that they could only travel at most d=0.998c\times 2 \times 10^{-6}\approx 600m
However, surprisingly at first sight, they can be observed at ground level! SR provides a beautiful explanation of this fact. In the rest frame S of the Earth, the lifespan of a traveling muon experiences time dilation. Let us define

A) t= half-life of muon with respect to Earth.

B) t’=half-life of the moving muon (in its rest frame S’, in motion with respect to Earth).

C) According to SR, the time dilation means that t=\gamma t', since the S’ frame is moving with respect to the ground, so its ticks are “longer” than those on Earth.

A typical dilation factor \gamma for the muon is about 15-100, although the value is quite variable among the observed muons. For instance, if the muon has v=0.998c then \gamma \approx 15. Thus, in the Earth’s reference frame, a typical muon lives about 2×15=30\mu s, and it travels with respect to Earth a distance

d'=0.998c\times 30\mu s\approx 9000m.

If the gamma factor is bigger, the distance d’ grows and so, we can detect muons on the ground, as we do observe indeed!

Remark:  In the traveling muon’s reference frame, it is at rest and the Earth is rushing up to meet it at 0.998c. The distance between it and the Earth thus is shorter than 9000m by length contraction. With respect to the muon, this distance is therefore 9000m/15 = 600m.
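The numbers above are easy to reproduce. Here is a minimal Python sketch using the values quoted in the text (v=0.998c, a life span of 2\times 10^{-6} s and a production altitude of 9000 m); the variable names are illustrative. Note that the exact gamma factor for v=0.998c is slightly larger than the rounded value 15 used above, so the distances come out a bit larger too.

```python
import math

C = 299792458.0  # speed of light in m/s


def lorentz_gamma(beta):
    """Time dilation factor for a speed beta = v/c < 1."""
    return 1.0 / math.sqrt(1.0 - beta * beta)


beta = 0.998        # muon speed in units of c (value from the text)
lifetime = 2.0e-6   # muon life span in its rest frame, in seconds
altitude = 9000.0   # production altitude in metres

gamma = lorentz_gamma(beta)
range_without_sr = beta * C * lifetime           # no time dilation
range_earth_frame = gamma * range_without_sr     # dilated lifetime, Earth frame
contracted_altitude = altitude / gamma           # atmosphere thickness seen by the muon

print(gamma, range_without_sr, range_earth_frame, contracted_altitude)
```

Both descriptions agree: in the Earth frame the muon lives long enough to cover the 9000 m, and in the muon frame the contracted atmosphere is short enough to be crossed within the proper life span.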

An alternative calculation, with approximate numbers:

Suppose muons decay into other particles with a half-life of about 0.00000156 s. Cosmic ray muons have a speed of about v = 0.99995 c.
Without special relativity, a muon would travel

d= 0.99995 \times 300000 km/s\times 0.00000156s=0.47 km only!

Few would reach the earth’s surface in that case. If we use special relativity, then plugging in the corresponding gamma for v=0.99995c, i.e.,  \gamma =100, the muons’ “ticks” run slower and muons live 100 times longer. Then, the traveled distance becomes

d'=100\times 0.99995\times 300000000 m/s\times 0.00000156s\approx 47000m

Conclusion: a lot of muons reach the earth’s surface. And we can detect them! For instance, with the detectors on colliders, the cosmic rays detectors, and some other simpler tools.

LOG#040. Relativity: Examples(IV).

Example 1. Compton effect.

Let us define as  “a” a photon of frequency \nu. Then, it hits an electron “b” at rest, changing its frequency into \nu', we denote “c” this new photon, and the electron then moves after the collision in certain direction with respect to the line of observation. We define that direction with \theta.

We use momenergy conservation, where “d” denotes the electron after the collision:

P^\mu_a+P^\mu_b=P^\mu_c+P^\mu_d

We multiply this equation by P_{\mu c} to deduce that

P^\mu_a P_{\mu c}+P^\mu_{b}P_{\mu c}=P^\mu_c P_{\mu c}+P^\mu_d P_{\mu c}

Using that the photon momenergy squared is zero, we obtain:

P^\mu_a P_{\mu c}+P^\mu_bP_{\mu c}=P^\mu_dP_{\mu c}

P^\mu _a=\left(\dfrac{h\nu}{c},\dfrac{h\nu}{c},0,0\right)



Remembering the definitions \dfrac{c}{\lambda}=\nu and \dfrac{c}{\lambda'}=\nu' and inserting the values of the momenta into the respective equations, we get



\boxed{\Delta \lambda\equiv \lambda'-\lambda=\dfrac{h}{mc}\left(1-\cos\theta\right)}




\boxed{\dfrac{\omega'}{\omega}=\mbox{Energy transfer}=\left[1+\dfrac{\hbar \omega}{mc^2}\left(1-\cos\theta\right)\right]^{-1}}

The so-called Compton wavelength of the electron is generally defined as:


\lambda_C=\dfrac{h}{mc}\approx 2.43\cdot 10^{-12}m, while the reduced Compton wavelength is \bar{\lambda}_C=\dfrac{\hbar}{mc}\approx 3.86\cdot 10^{-13}m
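To get a feeling for the sizes involved, here is a minimal Python sketch that evaluates the Compton shift \Delta\lambda=\dfrac{h}{mc}(1-\cos\theta) and the corresponding energy ratio including the angular factor (1-\cos\theta). The constants are the usual SI values; the incoming wavelength is an illustrative X-ray value chosen for this example and is not taken from the text.

```python
import math

H = 6.62607015e-34       # Planck constant, J s
M_E = 9.1093837015e-31   # electron mass, kg
C = 299792458.0          # speed of light, m/s


def compton_shift(theta):
    """Wavelength shift (in metres) for a photon scattered through angle theta."""
    return H / (M_E * C) * (1.0 - math.cos(theta))


def energy_ratio(photon_energy, theta):
    """E'/E for a photon of energy photon_energy (in joules) scattered through theta."""
    return 1.0 / (1.0 + photon_energy / (M_E * C ** 2) * (1.0 - math.cos(theta)))


wavelength = 7.1e-11     # illustrative incoming X-ray wavelength, in metres
theta = math.pi / 2.0    # scattering at a right angle

shifted = wavelength + compton_shift(theta)
ratio_from_wavelengths = wavelength / shifted
ratio_from_energy = energy_ratio(H * C / wavelength, theta)
print(compton_shift(theta), ratio_from_wavelengths, ratio_from_energy)
```

At a right angle the shift equals h/(mc), and the two ways of computing E'/E agree because E=hc/\lambda.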

Remark: There are some current discussions and speculative ideas trying to use the Compton effect as a tool to define the kilogram in an invariant and precise way.

Example 2. Inverse Compton effect.

Imagine a photon “a” that is hit head-on by an electron “b” moving “to the left” with velocity -u_b. The photon changes its frequency, turning into another photon “c”, and the electron changes its motion; the angle with respect to the initial direction of motion is \theta.

The momenergy reads


P^\mu_b=\left(\gamma_b mc,-\gamma_b m u_b,0,0\right)


Using the same conservation of momenergy as above

\dfrac{2EE'}{c^2}+\gamma_b mE'-\gamma_b m\dfrac{u_b}{c}E'=\gamma_b m E+\gamma_b \dfrac{mu_b E}{c}

Supposing that u_b\approx c, and then 1-u_b/c\approx \dfrac{1}{2}\left(1+\dfrac{u_b}{c}\right)\left(1-\dfrac{u_b}{c}\right)=\dfrac{1}{2}\left(1-\dfrac{u_b^2}{c^2}\right)=\dfrac{1}{2}\dfrac{1}{\gamma_b^2}


\dfrac{2EE'}{c^2}+\dfrac{mE'}{2\gamma_b}=2\gamma_b mE

\dfrac{E'}{E}=\dfrac{2\gamma_b m}{\dfrac{2E}{c^2}+\dfrac{m}{2\gamma_b}}=\dfrac{4\gamma_b^2}{1+\dfrac{4\gamma_b E}{mc^2}}

This inverse Compton effect is of importance in Astronomy. Photons of the cosmic microwave background radiation (CMB), with a very low energy of the order of E\approx 10^{-3}eV, are struck by very energetic electrons (rest energy mc²=511 keV). For typical values of \gamma_b >>10^8, the second term in the denominator dominates, giving

E'\approx \gamma_b\times 511keV

Therefore, the inverse Compton effect can increase the energy of a photon in a spectacular way. If we don’t plug in u_b\approx c, we would get from the equation:

\dfrac{2EE'}{c^2}+\gamma_b mE'-\gamma_b m\dfrac{u_b}{c}E'=\gamma_b m E+\gamma_b \dfrac{mu_b E}{c}

\gamma_b m E'\left(1-\dfrac{u_b}{c}+\dfrac{2E}{\gamma_b mc^2}\right)=\gamma_b m E\left(1+\dfrac{u_b}{c}\right)

\boxed{\dfrac{E'}{E}=\dfrac{1+\dfrac{u_b}{c}}{1-\dfrac{u_b}{c}+\dfrac{2E}{\gamma_b mc^2}}}
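The exact head-on formula and its ultra-relativistic approximation E'/E\approx 4\gamma_b^2/(1+4\gamma_b E/mc^2) can be compared numerically. Here is a minimal Python sketch with energies in eV; the function names are illustrative. To avoid round-off when \beta is extremely close to 1, it computes 1-\beta as 1/(\gamma^2(1+\beta)), which is an implementation detail of this sketch.

```python
import math

ME_C2_EV = 510998.95  # electron rest energy in eV


def exact_ratio(photon_energy_ev, gamma):
    """E'/E from the boxed head-on formula."""
    beta = math.sqrt(1.0 - 1.0 / gamma ** 2)
    one_minus_beta = 1.0 / (gamma ** 2 * (1.0 + beta))  # stable form of 1 - beta
    return (1.0 + beta) / (one_minus_beta + 2.0 * photon_energy_ev / (gamma * ME_C2_EV))


def approx_ratio(photon_energy_ev, gamma):
    """E'/E in the limit u_b ~ c."""
    return 4.0 * gamma ** 2 / (1.0 + 4.0 * gamma * photon_energy_ev / ME_C2_EV)


cmb_photon = 1.0e-3  # eV, typical CMB photon energy quoted in the text

# Moderate gamma: the boost is close to 4*gamma^2.
moderate = exact_ratio(cmb_photon, 1.0e3)
moderate_approx = approx_ratio(cmb_photon, 1.0e3)

# Extreme gamma: the photon carries away almost the whole electron energy.
extreme_energy = cmb_photon * exact_ratio(cmb_photon, 1.0e10)
fraction_of_electron_energy = extreme_energy / (1.0e10 * ME_C2_EV)
print(moderate, moderate_approx, fraction_of_electron_energy)
```

For moderate gamma factors the boost is essentially 4\gamma_b^2, while for extreme ones the scattered photon energy saturates near \gamma_b mc^2, as stated above.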

If we suppose that the incident electron arrives at a certain angle \alpha_i and is scattered through an angle \alpha_f, then we would obtain the general inverse Compton formula:

\boxed{\dfrac{E'_f}{E'_i}=\dfrac{1-\beta_i\cos\alpha_i}{1-\beta_i\cos\alpha_f+\dfrac{E'_i}{\gamma_i mc^2}\left(1-\cos\theta\right)}}


In the case of \alpha_f \approx 1/\gamma<<1, i.e., \cos\alpha_f\approx 1, and then

\dfrac{E'}{E}\approx \dfrac{1-\beta_i\cos\alpha_i}{1-\beta_i}\approx \left(1-\beta_i\cos\alpha_i\right)2\gamma_i^2

In conclusion, there is an energy transfer proportional to \gamma_i^2. There are some interesting “maximal boosts”, depending on the initial energy (frequency). For instance, for \gamma_i\approx 10^3-10^5, taking \gamma_i\approx 10^3, the estimate E_f\approx \gamma_i^2 E_i provides:

a) In the radio branch: 1GHz=10^9Hz, a maximal boost to 10^{15}Hz. It corresponds to a wavelength of about 300nm (in the UV band).

b) In the optical branch: 4\times 10^{14}Hz, a maximal boost to 4\times 10^{20}Hz\approx 1.6MeV. It corresponds to photons in the gamma ray band of the electromagnetic spectrum.

Example 3. Bremsstrahlung.

An electron (a) with rest mass m_a arrives from the left with velocity u_a and it hits a nucleus (b) at rest with mass m_b. After the collision, the cluster “c” moves with speed u_c, and a photon is emitted (d) to the left. That photon is considered “a radiation” due to the recoil of the nucleus.

The equations of momenergy are now:









\boxed{E=\dfrac{(\gamma_a-1)m_am_bc^2}{\gamma_a m_a(1+\beta_a)+m_b}}

In clusters of galaxies, typical temperatures of T\sim 10^7-10^8K provide a kinetic energy of protons and electrons of about 1.3-13keV. The relativistic kinetic energy is E_k=(\gamma_a-1)m_ac^2, and it yields \gamma_a\sim 1.0025-1.025 for electrons hitting hydrogen nuclei (i.e., protons p^+). If \gamma_am_a(1+\beta_a)<<m_b, then we have E\approx (\gamma_a-1)m_ac^2=(\gamma_a-1)\times 511keV. Then, the electron kinetic energy is almost completely turned into radiation (bremsstrahlung). In particular, bremsstrahlung is X-ray radiation with E\sim 1.3-13keV.
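These estimates can be reproduced with a few lines. The following minimal Python sketch assumes that the thermal kinetic energy is (3/2)k_BT (my reading of the quoted 1.3-13 keV range) and that the nucleus is a proton; energies are in keV and the helper names are illustrative. It evaluates the boxed formula for the photon energy E.

```python
import math

K_B_EV = 8.617333262e-5   # Boltzmann constant in eV/K
ME_C2_KEV = 510.99895     # electron rest energy in keV
MP_C2_KEV = 938272.088    # proton rest energy in keV


def thermal_kinetic_energy_kev(temperature):
    """Mean kinetic energy (3/2) k_B T in keV (assumed estimate)."""
    return 1.5 * K_B_EV * temperature / 1.0e3


def photon_energy_kev(gamma_a, ma_c2, mb_c2):
    """Boxed bremsstrahlung formula, with rest energies in keV."""
    beta_a = math.sqrt(1.0 - 1.0 / gamma_a ** 2)
    return (gamma_a - 1.0) * ma_c2 * mb_c2 / (gamma_a * ma_c2 * (1.0 + beta_a) + mb_c2)


kinetic = thermal_kinetic_energy_kev(1.0e7)   # electron kinetic energy at T = 10^7 K
gamma_a = 1.0 + kinetic / ME_C2_KEV           # from E_k = (gamma - 1) m c^2
photon = photon_energy_kev(gamma_a, ME_C2_KEV, MP_C2_KEV)
print(kinetic, gamma_a, photon, photon / kinetic)
```

The ratio printed last is very close to 1, which is the numerical counterpart of the statement that the electron kinetic energy is almost completely turned into radiation.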

LOG#039. Relativity: Examples(III).

Example 1. Absorption of a photon by an atom.

In this process, we have from momenergy conservation:

P^\mu_a P_{a\mu}+2P^\mu_a P_{b\mu}+P^\mu_b P_{b\mu}=P^\mu_c P_{c\mu}

If we work in the rest frame of the atom, before absorption we get

P^\mu_a =(m_a c,0,0,0)


Description: an atom “a” at rest with mass m_a absorbs a photon “b” propagating in the x-direction turning itself into an excited atom “c”, moving in the x-axis (suffering a “recoil” after the photon hits it).

The atom after absorption has, in its own rest frame,

P^\mu_c=(m_c c,0,0,0)

Therefore, since the photon verifies:

P^\mu_b P_{b\mu}=\left(\dfrac{h\nu}{c}\right)^2-\left(\dfrac{h\nu}{c}\right)^2=0

and this is true in every inertial frame. Thus,

(m_a c)^2+2m_a c\dfrac{h\nu}{c}+0=(m_c c)^2


\boxed{m_c=\sqrt{m_a^2+2m_a\dfrac{h\nu}{c^2}}=m_a\sqrt{1+2\dfrac{h\nu}{m_a c^2}}}

Expanding the square root

m_c\approx m_a\left[ 1+\dfrac{h\nu}{m_a c^2}-\dfrac{1}{2}\left(\dfrac{h\nu}{m_a c^2}\right)^2+\mathcal{O}(h^3)\right]

In this case,

m_c\approx m_a+\dfrac{h\nu}{ c^2}-\dfrac{1}{2}m_a\left(\dfrac{h\nu}{m_a c^2}\right)^2

The atom’s rest mass increases by an amount \dfrac{h\nu}{c^2} at first order in Planck’s constant, and it decreases at second order in h by a quantity \dfrac{1}{2}m_a\left(\dfrac{h\nu}{m_a c^2}\right)^2 due to motion (“recoil”). Therefore,

\dfrac{1}{2}m_a\left(\dfrac{h\nu}{m_a c^2}\right)^2=\dfrac{1}{2}m_a\left( \dfrac{u_c}{c}\right)^2

In this way, identifying terms: u_c=\dfrac{h\nu}{m_a c}

In the laboratory frame, the excited atom velocity is calculated by momenergy conservation. It is simple:

\dfrac{h\nu}{c}=\gamma_c m_c u_c

m_a c^2+h\nu=\gamma_c m_c c^2

\dfrac{h\nu}{u_c c}=m_a+\dfrac{h\nu}{c^2}

Then, we obtain that:

u_c=\dfrac{h\nu c}{m_a c^2+h\nu}=\dfrac{c}{\left(\dfrac{m_a c^2}{h\nu}\right)+1}\approx \dfrac{h\nu}{m_a c}

where we have used in the last step m_ac^2\gg h\nu.
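The exact results of this example can be compared numerically with the approximations. The following Python sketch uses deliberately exaggerated, hypothetical values (m_ac^2=1000 and h\nu=10 in arbitrary energy units) so that the recoil correction is visible; the function name absorb_photon is an assumption made for illustration.

```python
import math


def absorb_photon(rest_energy_a, photon_energy):
    """Return (m_c c^2, u_c / c) for an atom at rest absorbing a photon.

    Both arguments are energies in the same unit: m_a c^2 and h nu.
    """
    total_energy = rest_energy_a + photon_energy   # energy conservation
    momentum_c = photon_energy                     # p c of the photon goes to the atom
    rest_energy_c = math.sqrt(total_energy**2 - momentum_c**2)
    beta_c = momentum_c / total_energy             # u_c / c = p c / E
    return rest_energy_c, beta_c


# Hypothetical, exaggerated values in arbitrary energy units.
m_a_c2 = 1000.0
h_nu = 10.0

m_c_c2, beta_c = absorb_photon(m_a_c2, h_nu)
closed_form = m_a_c2 * math.sqrt(1.0 + 2.0 * h_nu / m_a_c2)   # boxed formula
second_order = m_a_c2 + h_nu - 0.5 * h_nu**2 / m_a_c2         # expansion to order h^2

print(m_c_c2, closed_form, second_order)
print(beta_c, h_nu / m_a_c2)   # exact u_c / c versus the approximation h nu / (m_a c^2)
```

The first line compares the excited rest energy from conservation with the boxed formula and its second-order expansion. The second line compares the exact recoil velocity with its approximation.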

Example 2. Emission of a photon by an atom.

An atom c at rest, with m_c the rest mass, emits a photon b with frequency \dfrac{E_b}{h}=\nu in the x-direction, turning itself into a non-excited atom “a”, with m_a. What is the energy shift \Delta E=E_c-E_a?

Contracting the momenergy conservation law P^\mu_a+P^\mu_b=P^\mu_c with P_{a\mu}, we get

P^\mu_a P_{a\mu}+P^\mu_a P_{b\mu}=P^\mu_a P_{c\mu}

P^\mu_a is P^\mu_a =(m_a c,0,0,0) in the rest frame of “a”.

P^\mu_a in the rest frame of “c” reads P^\mu_a =\left(\gamma_a m_a c, -\dfrac{E_b}{c},0,0\right)

P^\mu_b in the rest frame of “c” is P^\mu_b =\left(\dfrac{E_b}{c},\dfrac{E_b}{c},0,0\right)

P^\mu_c in the rest frame of “c” is P^\mu_c=(m_c c,0,0,0)

In this way, we have

(m_a c)^2+\gamma _a m_a E_b +\left( \dfrac{E_b}{c}\right)^2=\gamma _am_am_c c^2

\gamma_a m_a c^2+E_b=m_c c^2

m_a^2c^2+\dfrac{E_b}{c^2}\left(m_c c^2-E_b\right)+\left(\dfrac{E_b}{c}\right)^2=m_c^2c^2-m_c E_b

m_a^2c^2+m_cE_b=m^2_c c^2-m_c E_b


From these equations we deduce that

E_b=\dfrac{m_c^2 c^2-m_a^2c^2}{2m_c}=\dfrac{E_c^2-E_a^2}{2E_c}=(E_c-E_a)\left(1-\dfrac{E_c-E_a}{2E_c}\right)

And from the definition \Delta E=E_c-E_a we get E_b=\Delta E\left(1-\dfrac{\Delta E}{2E_c}\right)
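The two forms of the photon energy can be checked against each other, and against momenergy conservation, with a short Python sketch. The numbers are hypothetical and exaggerated (E_c=1000 and E_a=990 in arbitrary energy units), and the function names are illustrative assumptions.

```python
import math


def emitted_photon_energy(rest_energy_c, rest_energy_a):
    """Photon energy E_b = (E_c^2 - E_a^2) / (2 E_c) for an emitter at rest."""
    return (rest_energy_c**2 - rest_energy_a**2) / (2.0 * rest_energy_c)


def recoil_form(rest_energy_c, rest_energy_a):
    """The same quantity written as Delta E * (1 - Delta E / (2 E_c))."""
    delta_e = rest_energy_c - rest_energy_a
    return delta_e * (1.0 - delta_e / (2.0 * rest_energy_c))


# Hypothetical rest energies m_c c^2 and m_a c^2 in arbitrary units.
e_c, e_a = 1000.0, 990.0
e_b = emitted_photon_energy(e_c, e_a)

# Consistency check: the recoiling atom has energy E_c - E_b and momentum
# p c = E_b, so its invariant mass must come out as the rest energy of "a".
recoil_rest_energy = math.sqrt((e_c - e_b)**2 - e_b**2)

print(e_b, recoil_form(e_c, e_a), e_c - e_a)
print(recoil_rest_energy)
```

The photon energy comes out slightly below \Delta E, the difference being the kinetic energy taken by the recoiling atom.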

Note: the photon’s energy IS NOT equal to the difference of the atomic rest energies; it is smaller than that because of the emission process. This fact implies that the atom experiences “recoil”, and it gains kinetic energy at the expense of the photon. As a consequence, there is a good chance that the photon will not be absorbed by another atom of the same kind, so “resonance absorption” becomes problematic. The conditions under which recoilless resonant absorption occurs nonetheless, e.g., the reabsorption of gamma ray photons by nuclei of the same kind, were investigated by Mössbauer. The so-called Mössbauer effect has been important not only to atomic physics but also to verify the theory of general relativity. Furthermore, it is still used in materials research at the present time. In 1958, Rudolf L. Mössbauer reported the first recoilless gamma emission. It earned him the Nobel Prize in 1961.

Example 3. Decay of a particle at rest into two particles.

The process we are going to study is the reaction C\rightarrow AB

The particle C decays into A and B. It is the inverse process of the completely inelastic collision we studied in a previous example.

From the conservation of the tetramomentum

(m_A c)^2+2\gamma_A\gamma_Bm_Am_B(c^2-u_Au_B)+(m_Bc)^2=(m_C c)^2

Choose the frame in which the following equation holds


Let u_A,u_B be the velocities of A and B in the rest frame of “C”, which we take as the laboratory frame. Then, we deduce



P^\mu_C=(m_C c,0,0,0)

From these equations (m_Ac)^2+\gamma_A\gamma_Bm_Am_B(c^2-u_Au_B)=\gamma_Am_Am_C c^2







Therefore, the total kinetic energy of the two particles A,B is equal to the mass defect in the decay of the particle.
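The statement about the mass defect can be made explicit. The following is only a sketch in the rest frame of C; the symbols E_A=\gamma_Am_Ac^2, E_B=\gamma_Bm_Bc^2 and T_A,T_B (the kinetic energies) are introduced here for illustration. Conservation of energy and momentum reads

E_A+E_B=m_Cc^2,\qquad \mathbf{p}_A+\mathbf{p}_B=0

Since \mathbf{p}_A^2=\mathbf{p}_B^2, we have E_A^2-m_A^2c^4=E_B^2-m_B^2c^4, and inserting E_B=m_Cc^2-E_A we get

E_A=\dfrac{m_C^2+m_A^2-m_B^2}{2m_C}c^2,\qquad E_B=\dfrac{m_C^2+m_B^2-m_A^2}{2m_C}c^2

T_A+T_B=(\gamma_A-1)m_Ac^2+(\gamma_B-1)m_Bc^2=(m_C-m_A-m_B)c^2

so the decay is only possible if m_C\geq m_A+m_B.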

Example 4. Pair production by a photon.

Suppose the reaction \gamma \rightarrow e^+e^-, in which a single photon (\gamma) decays into a positron-electron pair.

That is, h\nu\rightarrow e^+e^-.


Squaring the momenergy in both sides:

P^\mu_a P_{\mu a}=P^\mu_c P_{\mu c}+2P^\mu_c P_{\mu d}+P^\mu_d P_{\mu d}

In the case of the photon: P^\mu_a P_{a\mu}=0

In the case of the electron and the positron: P^\mu_c P_{\mu c}=P^\mu_d P_{\mu d}=-(m_e c)^2=-(mc)^2

We calculate the components of momenergy in the center of mass frame:

P^\mu_c=\left( \dfrac{E_c}{c},p_{cx},p_{cy},p_{cz}\right)

P^\mu_d=\left( \dfrac{E_d}{c},-p_{dx},-p_{dy},-p_{dz}\right)

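with E_c=E_d\geq mc^2 and p_{cx}=p_{dx}=p_x (and similarly for the other components), since both particles have the same mass m. Therefore,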

2 P^\mu_c P_{\mu d}=-2\left( \dfrac{E_c^2}{c^2}+p_x^2+p_y^2+p_z^2\right)


-2(mc)^2-2\left( \dfrac{E_c^2}{c^2}+\mathbf{p}^2\right)=0

2(mc)^2+2\left( \dfrac{E_c^2}{c^2}+\mathbf{p}^2\right)=0

This equation has no solution for any positive value of the photon energy! This is logical: in vacuum, pair production requires the presence of another particle. For instance, \gamma \gamma \rightarrow e^+e^- is the typical process in a “photon collider”. Another alternative is that the photon is “virtual” (e.g., as in the QED reaction e^+e^-\rightarrow \gamma^\star\rightarrow e^+e^-). Suppose, alternatively, the reaction AB\rightarrow CDE. Solving this process is hard and tedious, but we can restrict our attention to the special case in which the three particles C,D,E stay together in a cluster C. In this way, the real process would be instead AB\rightarrow C. In the laboratory frame:


we get

P^\mu_A=\left( \dfrac{E_A}{c},\dfrac{E_A}{c},0,0\right)

P^\mu_B=\left( Mc,0,0,0\right)

and in the cluster reference frame


Squaring the momenergy:





In the absence of an extra mass M, i.e., when M\rightarrow 0, the energy E_A would be undefined, and it would become unphysical. The larger M is, the smaller the additional energy required for pair production. If M is the electron mass, M=m, the photon’s energy must be twice the rest energy of the pair, i.e., four times the rest energy of the electron. It means that we would obtain


and thus \gamma =4\rightarrow


\beta=\sqrt{\dfrac{15}{16}}=\dfrac{\sqrt{15}}{4}\approx 0.97

v=\dfrac{\sqrt{15}}{4}c\approx 0.97c
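The threshold energy discussed in this example can be recovered with a short computation. The following is only a sketch, under the assumption that at threshold the cluster C (the particle of mass M plus the pair, all moving together) has rest mass M+2m. Squaring the momenergy conservation law in the laboratory frame, where B is at rest:

P^\mu_A+P^\mu_B=P^\mu_C

P^\mu_AP_{\mu A}+2P^\mu_AP_{\mu B}+P^\mu_BP_{\mu B}=P^\mu_CP_{\mu C}

0-2E_AM-(Mc)^2=-\left[(M+2m)c\right]^2

E_A=\dfrac{(M+2m)^2-M^2}{2M}c^2=2mc^2\left(1+\dfrac{m}{M}\right)

With M=m this gives E_A=4mc^2, and E_A grows without bound when M\rightarrow 0, in agreement with the discussion above.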

In general, if m\neq M we would deduce:


Example 5. Pair annihilation of an electron-positron couple.

Now, the reaction is the annihilation of a positron-electron pair into two photons, turning mass completely into (field) energy of light quanta. The reaction e^+e^-\rightarrow \gamma \gamma implies the momenergy conservation

P^\mu_a+P^\mu_b=P^\mu_c+P^\mu_d

where “a” is the moving electron and “b” is a positron at rest. Squaring the identity, it yields

P^\mu_aP_{\mu a}+2P^\mu_aP_{\mu b}+P^\mu_bP_{\mu b}=P^\mu_cP_{\mu c}+2P^\mu_cP_{\mu d}+P^\mu_dP_{\mu d}

Then, we deduce


P^\mu_b=\left( m_e c,0,0,0\right)

while we do know that

P^\mu_aP_{\mu a}=P^\mu_bP_{\mu b}=-(m_e c)^2 for the electron/positron and

P^\mu_cP_{\mu c}=P^\mu_dP_{\mu d}=0 since they are photons. The left hand side is equal to -2(m_e c)^2-2E_am_e, and for the momenergy in the right hand side


P^\mu_d=\left( \dfrac{E_d}{c},p_{dx},0,0\right)

Combining both sides, we deduce

(m_e c)^2+E_a m_e=\dfrac{E_cE_d}{c^2}-p_{cx}p_{dx}

The right hand side is nonzero only if the two photons move in opposite directions, i.e., if we select p_{cx}=\pm \dfrac{E_c}{c} and p_{dx}=\mp\dfrac{E_d}{c}, plugging in values with DIFFERENT signs. In that case,

\boxed{(mc)^2+E_a m=\dfrac{2E_cE_d}{c^2}}

As in previous examples, we can also contract the conservation law with P_{\mu b}:

P^\mu_aP_{\mu b}+P^\mu_bP_{\mu b}=P^\mu_c P_{\mu b}+P^\mu_dP_{\mu b}

and we evaluate it in the laboratory frame to give

\boxed{E_a m+(mc)^2=E_c m+E_d m}

The last two boxed equations allow us to solve for E_d. From the second one,

E_d=E_a-E_c+mc^2
If we insert this equation into the first boxed equation of this example, we deduce that

(mc)^2+mE_a=\dfrac{2E_c}{c^2}\left( E_a-E_c+mc^2\right)


\dfrac{1}{2}mc^2\left(mc^2+E_a\right)=-E_c^2+E_c(E_a+mc^2)

Solving this last equation for E_c:


\boxed{E_c^{1,2}=E_d^{1,2}=\dfrac{\left(E_a+mc^2\right)\pm \sqrt{\left(E_a+mc^2\right)\left(E_a-mc^2\right)}}{2}}
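The boxed solution can be checked numerically against energy and momentum conservation. Here is a minimal Python sketch, assuming collinear photons along the x-axis as in the derivation above, the rounded electron rest energy 511keV, and a hypothetical electron energy E_a=3mc^2; the function name is an illustrative choice.

```python
import math


def annihilation_photon_energies(electron_energy, rest_energy):
    """Energies of the two photons for a moving electron hitting a positron at rest.

    Arguments are E_a (total electron energy) and m c^2, in the same unit.
    Returns the pair (larger energy, smaller energy).
    """
    total = electron_energy + rest_energy
    root = math.sqrt(total * (electron_energy - rest_energy))
    return (total + root) / 2.0, (total - root) / 2.0


mc2 = 511.0       # keV, rounded electron rest energy
e_a = 3.0 * mc2   # hypothetical total energy of the incoming electron

e_hi, e_lo = annihilation_photon_energies(e_a, mc2)

# Energy conservation: E_c + E_d = E_a + m c^2
print(e_hi + e_lo, e_a + mc2)
# First boxed relation: E_c E_d = (1/2) m c^2 (m c^2 + E_a)
print(e_hi * e_lo, 0.5 * mc2 * (mc2 + e_a))
# Momentum conservation: photons in opposite directions carry p_a c = E_c - E_d
print(e_hi - e_lo, math.sqrt(e_a**2 - mc2**2))
```

Each printed pair should agree up to rounding. The more energetic photon is the one emitted along the direction of the incoming electron.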