LOG#028. Rockets and relativity.


The second post in this special thread of 3, devoted to the memory of Neil Armstrong, has to do with rocketry.

Firstly, for completeness, we are going to study the motion of a rocket in “vacuum” according to classical physics. Then, we will deduce the relativistic rocket equation and its main properties.


The fundamental law of Dynamics, following Sir Isaac Newton, reads:

    \[ \mathbf{F}=\dfrac{d\mathbf{p}}{dt}\]

Suppose a rocket with initial mass M_i and initial velocity u_i=0. It ejects propellant “gas” of mass m_0 at a “gas speed” u_0 measured relative to the rocket (with respect to the observer at rest, the particles of gas move with velocity \mathbf{v_{gas}} while the rocket moves with velocity \mathbf{v}, and the relative velocity will be denoted by u_{rel}). Generally, this speed is also called “exhaust velocity” by engineers. The motion of a variable mass or rocket is given by the so-called Metcherski’s equation:

    \[ \boxed{M\dfrac{d\mathbf{v}}{dt}=-\mathbf{u_0}\dfrac{dM}{dt}+\mathbf{F}}\]

where -\mathbf{u_0}=\mathbf{v_{gas}}-\mathbf{v}. The Metcherski’s equation can be derived as follows: in a time dt the rocket changes its mass and velocity to M'=M+dM and V'=V+dV, so its new momentum is M'V'=(M+dM)(V+dV), to which we must add the momentum v_{gas}dm_{gas} carried by the ejected gas and subtract the initial momentum MV. Therefore, the total change in momentum is:

    \[ dP=Fdt=(M+dM)(V+dV)+v_{gas}dm_{gas}-MV\]

Neglecting second order differentials, and setting the conservation of mass (we are in the non-relativistic case)

    \[ dM+dm_{gas}=0\]

we recover

    \[ MdV=v_{rel}dM+Fdt\]

which, with the relative velocity v_{rel}=v_{gas}-V=-u_0 (take care with the sign), is the Metcherski equation we have written above.

Generally speaking, the “force” due to the change in “mass” is called thrust. With no external force (F=0), the remaining equation relating thrust and velocity, MdV=-u_0dM, can be easily integrated

    \[ u_f=-u_0\int_{M_i}^{M_f}\dfrac{dM}{M}\]

and thus we get the Tsiolkovski’s rocket equation:

    \[ \boxed{\mathbf{u_f}=\mathbf{u_0}\ln \dfrac{M_i}{M_f}}\]

Engineers usually speak about the so-called mass ratio R=\dfrac{M_f}{M_i} (be aware that the reciprocal definition is sometimes used for such a ratio), and in terms of this ratio the Tsiolkovski’s equation reads:

    \[ \boxed{\mathbf{u_f}=\mathbf{u_0}\ln \dfrac{1}{R}}\]

We can invert this equation as well, in order to get

    \[ \boxed{R=\dfrac{M_f}{M_i}=\exp\left(-\dfrac{u_f}{u_0}\right)}\]

Example: Calculate the fraction of the mass of a one-stage rocket that reaches Earth orbit. Typical values u_f=8km/s and u_0=4km/s show that the mass ratio is equal to R=e^{-2}\approx 0.14. Then, only 14\% of the initial mass reaches orbit, and the remaining 86\% is fuel.
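The example above can be reproduced with a few lines of Python. This is only a minimal sketch: the function names `mass_ratio` and `final_speed` are illustrative choices, and speeds are assumed to be given in km/s.

```python
import math

def mass_ratio(u_f, u_0):
    """Mass ratio R = M_f / M_i needed to reach speed u_f with exhaust speed u_0."""
    return math.exp(-u_f / u_0)

def final_speed(R, u_0):
    """Final speed from the classical rocket equation u_f = u_0 * ln(1/R)."""
    return u_0 * math.log(1.0 / R)

R = mass_ratio(8.0, 4.0)                      # values of the example, in km/s
print(f"R = {R:.3f}")                         # roughly 14% of the initial mass
print(f"u_f = {final_speed(R, 4.0):.1f} km/s")
```

Note how sensitive the result is to the exhaust speed: since R depends exponentially on u_f/u_0, a modest improvement in u_0 increases the payload fraction considerably.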

Multistage rockets offer a good example of how engineers' minds work. They have discovered that a multistage rocket is more effective than a one-stage rocket in terms of maximum attainable speed and mass ratios. For an n-stage launch system, the final velocity is the sum of the velocity gains of the individual stages, so we can obtain

    \[ \displaystyle{u_f=\sum_{i=1}^{n}u_i^f=u_1^f+\cdots+u_n^f}\]

After the i-th stage, the change in velocity reads

    \[ u_i^f=c_i\ln \dfrac{1}{R_i}\]

where the i-th mass ratio R_i is defined as the ratio between the final mass and the initial mass in the i-th stage, so we have

    \[ \displaystyle{u_f=\sum_i c_i\ln \dfrac{1}{R_i}}\]

and we define the total mass ratio:

    \[ \displaystyle{R_T=\prod_i R_i}\]

If the average effective rocket exhaust velocity is the same in every step/stage, i.e. c_i=c (here c denotes an exhaust velocity, not the speed of light), we get

    \[ \displaystyle{u_f=c\ln \left( \prod_{i=1}^{n} R_i^{-1}\right)}\]


    \[ \displaystyle{u_f=c \ln \left[ \left(\dfrac{M_0}{M_f}\right)_1\left(\dfrac{M_0}{M_f}\right)_2\cdots \left(\dfrac{M_0}{M_f}\right)_n\right]=c\ln \left[\left(\dfrac{M_0}{M_f}\right)_T\right]}\]
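The sum over stages and the product form of the total mass ratio can be compared numerically. The following is a minimal sketch: the three stage ratios are made-up illustrative numbers, not data for any real launcher, and `multistage_speed` and `total_mass_ratio` are hypothetical helper names.

```python
import math

def multistage_speed(exhaust_speeds, mass_ratios):
    """Sum of c_i * ln(1/R_i), with R_i = final mass / initial mass of stage i."""
    return sum(c * math.log(1.0 / R) for c, R in zip(exhaust_speeds, mass_ratios))

def total_mass_ratio(mass_ratios):
    """Total mass ratio R_T as the product of the stage ratios."""
    return math.prod(mass_ratios)

# Hypothetical three-stage rocket, same exhaust speed (km/s) in every stage
c = 4.0
ratios = [0.4, 0.4, 0.5]
u_sum = multistage_speed([c] * len(ratios), ratios)
u_total = c * math.log(1.0 / total_mass_ratio(ratios))
print(u_sum, u_total)  # both expressions agree up to rounding
```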

The influence of the number of steps, for a given exhaust velocity, in the final attainable velocity can be observed in the next plots:




We proceed now to the relativistic generalization of the previous rocketry. Total momentum is conserved, of course, and so, in the instantaneous rest frame of the rocket:

    \[ M'du'=-u'_0dM'\]

where du' is the velocity increase in the rocket with a rest mass M’ in the instantaneous reference frame of the moving rocket S’. It is NOT equal to its velocity increase measured in the unprimed reference frame, du. Due to the addition theorem of velocities in SR, we have

    \[ u+du=\dfrac{u+du'}{1+\dfrac{udu'}{c^2}}\]

where u is the instantaneous velocity of the rocket with respect to the laboratory frame S. We can perform a Taylor expansion of the denominator in the last equation, to first order in du', in order to obtain:

    \[ u+du=(u+du')\left(1-\dfrac{udu'}{c^2}\right)\]

and then

    \[ u+du=u+du'\left(1-\dfrac{u^2}{c^2}\right)\]

and finally, we get

    \[ du'=\dfrac{du}{1-\dfrac{u^2}{c^2}}=\gamma^2_u du\]

Plugging this equation into the above equation for mass (momentum), and integrating

    \[ \displaystyle{\int_{0}^{u_f}\dfrac{du}{1-\dfrac{u^2}{c^2}}=-u'_0\int_{M'_i}^{M'_f}\dfrac{dM'}{M'}}\]

we deduce that the relativistic version of the Tsiolkovski’s rocket equation, the so-called relativistic rocket equation, can be written as:

    \[ \dfrac{c}{2}\ln \dfrac{1+\dfrac{u_f}{c}}{1-\dfrac{u_f}{c}}=u'_0\ln\dfrac{M'_i}{M'_f}\]

We can suppress the primes if we remember that all the data are given in the S’-frame (instantaneously), and rewrite the whole equation in the more familiar way:

    \[ \boxed{u_f=c\dfrac{1-\left(\dfrac{M_f}{M_0}\right)^{\frac{2u_0}{c}}}{1+\left(\dfrac{M_f}{M_0}\right)^{\frac{2u_0}{c}}}=c\dfrac{1-R^{\frac{2u_0}{c}}}{1+R^{\frac{2u_0}{c}}}}\]

where the mass ratio is defined as before R=\dfrac{M_f}{M_i}. Now, comparing the above equation with the rapidity/maximum velocity in the uniformly accelerated motion:

    \[ u_f=c\tanh \left(\dfrac{g\tau}{c}\right)\]

we get that the relativistic rocket equation can also be written in the next manner:

    \[ u_f=c\tanh \left[ \dfrac{u_0}{c}\ln \left(\dfrac{1}{R}\right)\right]\]

or equivalently

    \[ u_f=c\tanh \left[ -\dfrac{u_0}{c}\ln R\right]\]

since we have in this case

    \[ \dfrac{g\tau}{c}=\dfrac{u_0}{c}\ln \left(\dfrac{1}{R}\right)=-\dfrac{u_0}{c}\ln R\]

and thus

    \[ R^{\frac{u_0}{c}}=\left(\dfrac{M_f}{M_i}\right)^{\frac{u_0}{c}}=\exp \left(-\dfrac{g\tau}{c}\right)\]
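As a quick numerical check, the boxed relativistic rocket equation can be compared with the classical Tsiolkovski result. This is a sketch under stated assumptions: the helper names are illustrative, and the second case (exhaust at half the speed of light, R=0.01) is a hypothetical choice made only to show the difference.

```python
import math

C = 299_792.458  # speed of light in km/s

def relativistic_final_speed(R, u_0, c=C):
    """Boxed relativistic rocket equation, with R = M_f / M_i."""
    x = R ** (2.0 * u_0 / c)
    return c * (1.0 - x) / (1.0 + x)

def classical_final_speed(R, u_0):
    """Classical rocket equation u_f = u_0 * ln(1/R)."""
    return u_0 * math.log(1.0 / R)

# Chemical rocket: both formulae agree to high accuracy
print(relativistic_final_speed(0.135, 4.0), classical_final_speed(0.135, 4.0))

# Hypothetical exhaust at 0.5c: the classical formula exceeds c, the relativistic one does not
print(relativistic_final_speed(0.01, 0.5 * C) / C, classical_final_speed(0.01, 0.5 * C) / C)
```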

If the propellant particles move at the speed of light, i.e., they are “photons” or ultra-relativistic particles that move close to the speed of light, we have the celebrated “photon rocket”. In that case, setting u_0=c, we would obtain that:

    \[ \boxed{u_f=c\dfrac{1-\left(\dfrac{M_f}{M_0}\right)^{2}}{1+\left(\dfrac{M_f}{M_0}\right)^{2}}=c\dfrac{1-R^{2}}{1+R^{2}}=c\tanh \ln \dfrac{1}{R}}\]

and where for the photon rocket (or the ultra-relativistic rocket) we have as well

    \[ \dfrac{g\tau}{c}=\ln \left(\dfrac{1}{R}\right)=-\ln R\]

Final remark: Instead of the mass ratio, it is sometimes more useful to study the ratio fuel mass/payload. In that case, we set M_f=m and M_0=m+M, where M is the fuel mass and m is the payload. So, we would write

    \[ R=\dfrac{m}{m+M}\]

so then the ratio fuel mass/payload will be (the last equality holds for the photon rocket, u_0=c)

    \[ \dfrac{M}{m}=R^{-1}-1=\exp \left(\dfrac{g\tau}{c}\right)-1\]

We are ready to study the interstellar trip with our current knowledge of Special Relativity and Rocketry. We will study the problem in the next and final post of this fascinating thread. Stay tuned!

LOG#027. Accelerated motion in SR.



Hi, everyone! This is the first article in a thread of 3 discussing accelerations in the background of special relativity (SR). They are dedicated to Neil Armstrong, first man on the Moon! Indeed, accelerated motion in relativity has some interesting and sometimes counterintuitive results, in particular those concerning interstellar journeys whenever their velocities are close to the speed of light (i.e. they “are approaching” c).

Special relativity is a theory based on the equivalence of every inertial frame (reference frames moving with constant relative velocity are said to be inertial frames), as should be clear by now, after my relativistic posts! So, in principle, nothing is said about the relativity of accelerations, since accelerations are not relative in special relativity (they are not relative even in Newtonian physics/Galilean relativity). However, this fact does not mean that we can not study accelerated motion in SR. The kinematical framework of SR itself allows us to solve that problem. Therefore, we are going to study uniformly (a.k.a. constantly) accelerating particles in SR in this post!

First question: What does “constant acceleration” mean in SR? In non-relativistic physics, a constant acceleration in the S-frame would give any particle/object a superluminal speed after a finite time! So, of course, it can not be the case in SR. And it is not, since we studied how accelerations transform according to SR! They transform in a non-trivial way! Moreover, a force growing without limit would be required for a “massive” particle (rest mass m\neq 0). Suppose this massive particle (e.g. a rocket, an astronaut, a vehicle,…) is at rest at the initial time t=t'=0, and it accelerates in the x-direction (to keep the analysis and the equations simple!). In addition, suppose there is an observer left behind on Earth (S-frame), while the S’-frame is attached to the moving particle. The main answer of SR to our first question is that we can only have a constant acceleration in the so-called instantaneous rest frame of the particle. We will call that acceleration “proper acceleration”, and we will denote it by the letter \alpha. In fact, in many practical problems, especially those studying rocket-ships, the acceleration is generally given the same magnitude as the gravitational acceleration on Earth (\alpha=g\approx 9.8ms^{-2}\approx 10 ms^{-2}).

Second question: What is the observed acceleration in the different frames? If the instantaneous rest frame S’ is an inertial reference frame during some tiny time dt', then at that moment it has the same velocity as the particle (rocket,…) in the S-frame, but it is not accelerated, so the velocity of the particle in the S’-frame vanishes at that time:

    \[ \mathbf{u}'=(0,0,0)\]

Since the acceleration of the particle is, in the S’-frame, the proper acceleration, we get:

    \[ \mathbf{a}'=(a'_x,0,0)=(\alpha,0,0)=(g,0,0)=\mbox{constant}\]

Using the transformation rules for accelerations in SR we have studied, we get that the instantaneous acceleration in the S-frame is given by

    \[ \mathbf{a}=(a_x,0,0)=\left(\dfrac{g}{\gamma^3},0,0\right)\]

Since the relative velocity between S and S’ is always equal to the velocity of the moving particle in the S-frame, the following equation holds

    \[ v=u_x\]

We do know that

    \[ a_x=\dfrac{du_x}{dt}=\left(1-\dfrac{u_x^2}{c^2}\right)^{3/2}g\]

Due to time dilation

    \[ dt'=dt/\gamma\]

so in the S-frame, the velocity of the particle changes according to

    \[ du_x=\left(1-\dfrac{u_x^2}{c^2}\right)^{3/2}g dt\]

We can now integrate this equation

    \[ \int_0^{u_x}\dfrac{1}{(c^2-u_x^2)^{3/2}}du_x=\dfrac{g}{c^3}\int_0^t dt\]

The final result is:

    \[ \boxed{u_x=\dfrac{g t}{\sqrt{1+\left(\dfrac{g t}{c}\right)^2}}}\]

We can check some limit cases from this relativistic result for uniformly accelerated motion in SR.

1st. Short time limit: gt<< c\longrightarrow u_x\approx gt=\alpha t. This is the celebrated nonrelativistic result, with initial speed equal to zero (we required that hypothesis in our discussion above).

2nd. Long time limit: t\rightarrow \infty. In this case, the number one inside the root is very tiny compared with the term depending on acceleration, so it can be neglected to get u_x\approx \dfrac{gt}{gt/c}=c. So, we see that you can not get a velocity higher than the speed of light with the SR framework at constant acceleration!

Furthermore, we can use the definition of relativistic velocity in order to integrate the associated differential equation, and to obtain the travelled distance as a function of t, i.e. x(t), as follows

    \[ u_x=\dfrac{dx}{dt}=\dfrac{gt}{\sqrt{1+\left(\dfrac{g t}{c}\right)^2}}\]

    \[ \int_0^x dx=\int_0^t\dfrac{gt dt}{\sqrt{1+\left(\dfrac{g t}{c}\right)^2}}=\int_0^t\dfrac{ctdt}{\sqrt{\dfrac{c^2}{g^2}+t^2}}\]

We can perform the integral with the aid of the following known result ( see,e.g., a mathematical table or use a symbolic calculator or calculate the integral by yourself):

    \[ \int \dfrac{ctdt}{\sqrt{\left(\dfrac{c}{g}\right)^2+t^2}}=c\sqrt{\left(\dfrac{c}{g}\right)^2+t^2}+\mbox{constant}=c\sqrt{\left(\dfrac{c}{g}\right)^2+t^2}+C\]

From this result, and the previous equation, we get the so-called relativistic path-time law for uniformly accelerated motion in SR:

    \[ x=c\sqrt{\left(\dfrac{c}{g}\right)^2+t^2}-\dfrac{c^2}{g}\]

or equivalently

    \[ \boxed{x=x(t)=\dfrac{c^2}{g}\left(\sqrt{1+\left(\dfrac{gt}{c}\right)^2}-1\right)}\]

For consistency, we observe that in the limit of short times, the square root inside the big brackets approaches 1+\frac{1}{2}\left(\frac{gt}{c}\right)^2, so we obtain the nonrelativistic path-time relationship x\approx \frac{1}{2}gt^2 with g=a_x. In the limit of long times, the terms inside the brackets can be approximated by gt/c, and then the final result becomes x\approx ct. Note that the velocity is never equal to the speed of light: this result is only a good approximation whenever the time is “big enough”, i.e., it only works asymptotically for “long times”!

And finally, we can write out the transformations of acceleration between the two frames in an explicit way:

    \[ a_x=\left[1-\dfrac{\left(\dfrac{gt}{c}\right)^2}{1+\left(\dfrac{gt}{c}\right)^2}\right]^{3/2}g\]

that is

    \[ \boxed{a_x=\dfrac{1}{\left[1+\left(\dfrac{gt}{c}\right)^2\right]^{3/2}}g}\]

Check 1: For short times, a_x\approx g=\mbox{constant}, i.e., the non-relativistic result, as we expected!

Check 2: For long times, a_x\approx \dfrac{c^3} {g^2t^3}\rightarrow 0. As we could expect, the velocity increases in such a way that it “saturates” its own rate of increase and the speed of light is not surpassed. The fact that the speed of light can not be surpassed or exceeded is the unifying “theme” throughout special relativity, and it rests on the “noncompact” nature of the Lorentz group due to the \gamma factor, since it would become infinite at v=c for massive particles.

It is inevitable: as time passes, a relativistic treatment is indispensable, as the next figures show




The next table is also remarkable (it can be easily built with the formulae we have seen till now with any available software):


Let us review the 3 main formulae until this moment

    \[ \boxed{a_x=\dfrac{1}{\left[1+\left(\dfrac{gt}{c}\right)^2\right]^{3/2}}g}\]

    \[ \boxed{u_x=\dfrac{g t}{\sqrt{1+\left(\dfrac{g t}{c}\right)^2}}}\]

    \[ \boxed{x=x(t)=\dfrac{c^2}{g}\left(\sqrt{1+\left(\dfrac{gt}{c}\right)^2}-1\right)}\]
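The three boxed formulae are easy to tabulate with any software. Below is a minimal Python sketch, assuming units of light-years and years with c=1 ly/yr and g≈1.03 ly/yr² (the value of 9.8 m/s² in these units). The chosen times are arbitrary, and the output is not meant to reproduce any particular table.

```python
import math

G = 1.03  # proper acceleration in ly/yr^2 (about 9.8 m/s^2)
C = 1.0   # speed of light in ly/yr

def velocity(t, g=G, c=C):
    """u_x(t) measured in the S-frame."""
    return g * t / math.sqrt(1.0 + (g * t / c) ** 2)

def position(t, g=G, c=C):
    """x(t) measured in the S-frame, starting from rest at x = 0."""
    return (c * c / g) * (math.sqrt(1.0 + (g * t / c) ** 2) - 1.0)

def acceleration(t, g=G, c=C):
    """a_x(t) measured in the S-frame."""
    return g / (1.0 + (g * t / c) ** 2) ** 1.5

print("t [yr]    u/c     x [ly]     a/g   Newtonian u/c")
for t in (0.1, 0.5, 1.0, 2.0, 5.0, 10.0):
    print(f"{t:6.1f} {velocity(t) / C:7.4f} {position(t):9.3f} "
          f"{acceleration(t) / G:7.4f} {G * t / C:10.3f}")
```

The last column shows the Newtonian value gt/c, which exceeds 1 after about one year, while the relativistic velocity stays below c.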

We have calculated these results in the S-frame, it is also important and interesting to calculate the same stuff in the S’-frame of the moving particle. The proper time \tau=t' is defined as:

    \[ \boxed{d\tau=dt\sqrt{1-\left(\dfrac{u_x}{c}\right)^2}}\]


    \[ d\tau=dt\left[1-\dfrac{\left(\dfrac{gt}{c}\right)^2}{1+\left(\dfrac{gt}{c}\right)^2}\right]^{1/2}\]

We can perform the integral as before

    \[ \displaystyle{\int_0^\tau d\tau=\int_0^t\dfrac{dt}{\sqrt{1+\left(\dfrac{gt}{c}\right)^2}}}\]

and thus

    \[ \tau=\dfrac{c}{g}\int_0^t\dfrac{dt}{\sqrt{\left(\dfrac{c}{g}\right)^2+t^2}}=\dfrac{c}{g}\ln \left(\dfrac{gt}{c}+\sqrt{\left(\dfrac{gt}{c}\right)^2+1}\right)\bigg|_0^t\]

Finally, the proper time(time measured in the S’-frame) as a function of the elapsed time on Earth (S-frame) and the acceleration is given by the very important formula:

    \[ \boxed{\tau=\dfrac{c}{g}\ln \left(\dfrac{gt}{c}+\sqrt{1+\left(\dfrac{gt}{c}\right)^2}\right)}\]

And now, let us set z=gt/c, therefore we can write the above equation in the following way:

    \[ \dfrac{g\tau}{c}=\ln \left( z+\sqrt{1+z^2}\right)\]

Remember now, from our previous math survey, that

    \[ \sinh^{-1}z=\ln \left( z+\sqrt{1+z^2}\right)\]

so we can invert the equation in order to obtain t as a function of the proper time, since:

    \[ \boxed{\tau=\dfrac{c}{g}\sinh^{-1}\left(\dfrac{gt}{c}\right)}\]

    \[ \boxed{t=\dfrac{c}{g}\sinh \left(\dfrac{g\tau}{c}\right)}\]

Inserting this last equation in the relativistic equation path-time for the uniformly accelerated body in SR, we obtain:

    \[ x=x(\tau)=\dfrac{c^2}{g}\left(\sqrt{1+\sinh^2\left(\dfrac{g\tau}{c}\right)}-1\right)\]


    \[ \boxed{x=x(\tau)=\dfrac{c^2}{g}\left[\cosh \left(\dfrac{g\tau}{c}\right)-1\right]}\]

Similarly, we can calculate the velocity-proper time law. Previous equations yield

    \[ u_x=\dfrac{c\sinh\left(\dfrac{g\tau}{c}\right)}{\sqrt{1+\sinh^2\left(\dfrac{g\tau}{c}\right)}}=\dfrac{c\sinh \left(\dfrac{g\tau}{c}\right)}{\cosh \left(\dfrac{g\tau}{c}\right)}\]

and thus the velocity-proper time law becomes

    \[ \boxed{u_x=c\tanh \left(\dfrac{g\tau}{c}\right)}\]

Remark: this last result is compatible with a rapidity factor \varphi= \left(\dfrac{g\tau}{c}\right).


    \[ a_x=\dfrac{du_x}{dt}=\left(1-\dfrac{u_x^2}{c^2}\right)^{3/2}g=\left(1-\tanh^2\left(\dfrac{g\tau}{c}\right)\right)^{3/2}g=\dfrac{1}{\cosh^3\left(\dfrac{g\tau}{c}\right)}g\]


From this, we can see why we said before that “constant acceleration” is meaningless unless we specify the frame: the acceleration is constant (equal to g) only in the instantaneous rest frame S’, and once we select a proper time \tau, this last relationship gives us the acceleration observed from the S-frame after the transformation. Of course, from the S-frame, as this function shows, the acceleration is not constant; it decreases as the proper time grows. We have to take care in relativity with the meaning of the words. Mathematics is easy and clear and, generally speaking, more precise than words; common language is generally fuzzy unless we explain what we mean!

As the final part of this log entry, let us summarize the time-proper time, velocity-proper time, acceleration-proper time-proper acceleration and distance- proper time laws for the S’-frame:

    \[ \boxed{t=\dfrac{c}{g}\sinh \left(\dfrac{g\tau}{c}\right)}\]

    \[ \boxed{u_x=c\tanh \left(\dfrac{g\tau}{c}\right)}\]

    \[ \boxed{a_x=\dfrac{1}{\cosh^3\left(\dfrac{g\tau}{c}\right)}g}\]

    \[ \boxed{x=x(\tau)=\dfrac{c^2}{g}\left[\cosh \left(\dfrac{g\tau}{c}\right)-1\right]}\]
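The proper-time laws summarized above can be cross-checked against the S-frame formulae. The sketch below assumes the same units as before (c=1 ly/yr, g≈1.03 ly/yr²); the function names are illustrative only.

```python
import math

G = 1.03  # proper acceleration in ly/yr^2
C = 1.0   # speed of light in ly/yr

def earth_time(tau, g=G, c=C):
    """Elapsed S-frame time t for a given proper time tau."""
    return (c / g) * math.sinh(g * tau / c)

def proper_time(t, g=G, c=C):
    """Proper time tau for a given S-frame time t (inverse of earth_time)."""
    return (c / g) * math.asinh(g * t / c)

def velocity_tau(tau, g=G, c=C):
    return c * math.tanh(g * tau / c)

def position_tau(tau, g=G, c=C):
    return (c * c / g) * (math.cosh(g * tau / c) - 1.0)

def acceleration_tau(tau, g=G, c=C):
    return g / math.cosh(g * tau / c) ** 3

for tau in (1.0, 2.0, 5.0, 10.0):
    print(f"tau = {tau:5.1f} yr  t = {earth_time(tau):12.2f} yr  "
          f"u/c = {velocity_tau(tau) / C:.6f}  x = {position_tau(tau):12.2f} ly")
```

Because of the hyperbolic functions, a few years of proper time on board correspond to much longer times and distances in the Earth frame.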

My last paragraph in this post expresses the acceleration g\approx 10ms^{-2} in a system of units where space is measured in light-years (we take c=300000km/s) and time in years (we take 1yr=365 days). It will be useful in the next 2 posts:

    \[ g=10\dfrac{m}{s^2}\dfrac{1ly}{9.46\cdot 10^{15}m}\dfrac{9.95\cdot 10^{14}s^2}{1yr^2}=1.05\dfrac{ly}{yr^2}\approx 1\dfrac{ly}{yr^2}\]

Another choice you can make is

    \[ g=9.8\dfrac{m}{s^2}=1.03\dfrac{ly}{yr^2}\approx 1\dfrac{ly}{yr^2}\]

so there is no big difference between these two cases with terrestrial-like gravity/acceleration.
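The unit conversion can be verified with a short script, using the same conventions as in the text (c=300000 km/s and 1 yr = 365 days). The helper name `to_ly_per_yr2` is just an illustrative choice.

```python
C_M_PER_S = 3.0e8                  # c = 300000 km/s, as assumed in the text
YEAR_S = 365 * 24 * 3600           # 1 yr = 365 days, in seconds
LIGHT_YEAR_M = C_M_PER_S * YEAR_S  # one light-year in metres

def to_ly_per_yr2(g_si):
    """Convert an acceleration from m/s^2 to ly/yr^2."""
    return g_si * YEAR_S ** 2 / LIGHT_YEAR_M

print(f"{to_ly_per_yr2(10.0):.3f} ly/yr^2")
print(f"{to_ly_per_yr2(9.8):.3f} ly/yr^2")
```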

LOG#026. Boosts, rapidity, HEP.


In euclidean two dimensional space, rotations are easy to understand in terms of matrices and trigonometric functions. A plane rotation is given by:

    \[ \boxed{\begin{pmatrix}x'\\ y'\end{pmatrix}=\begin{pmatrix}\cos \theta & \sin \theta \\ -\sin \theta & \cos \theta \end{pmatrix}\begin{pmatrix}x\\ y\end{pmatrix}}\leftrightarrow \boxed{\mathbb{X}'=\mathbb{R}(\theta)\mathbb{X}}\]

where the rotation angle is \theta, and it is parametrized by 0\leq \theta \leq 2\pi.

Interestingly, in minkovskian two dimensional spacetime, the analogue does exist and it is written in terms of matrices and hyperbolic trigonometric functions. A “plane” rotation in spacetime is given by:

    \[ \boxed{\begin{pmatrix}ct'\\ x'\end{pmatrix}=\begin{pmatrix}\cosh \varphi & -\sinh \varphi \\ -\sinh \varphi & \cosh \varphi \end{pmatrix}\begin{pmatrix}ct\\ x\end{pmatrix}}\leftrightarrow \boxed{\mathbb{X}'=\mathbb{L}(\varphi)\mathbb{X}}\]

Here, \varphi = i\psi is the so-called hyperbolic rotation angle, pseudorotation, or more commonly, the rapidity of the Lorentz boost in 2d spacetime. Rapidity turns out to be a very useful parameter for calculations in Special Relativity. Indeed, it is easy to check that

    \[ \mathbb{L}(\varphi_1+\varphi_2)=\mathbb{L}(\varphi_1)\mathbb{L}(\varphi_2)\]

So, at least in the 2d spacetime case, rapidities are “additive” in the written sense.
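The additivity of rapidities can be checked numerically by multiplying two boost matrices. This is a minimal sketch with plain Python lists; `boost` and `matmul` are illustrative helpers and the two rapidities are arbitrary.

```python
import math

def boost(phi):
    """2x2 Lorentz boost acting on (ct, x), parametrized by the rapidity phi."""
    ch, sh = math.cosh(phi), math.sinh(phi)
    return [[ch, -sh], [-sh, ch]]

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

phi1, phi2 = 0.3, 1.1
lhs = boost(phi1 + phi2)
rhs = matmul(boost(phi1), boost(phi2))
print(lhs)
print(rhs)  # same entries as lhs, up to rounding
```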

Firstly, we are going to guess the relationship between rapidity and velocity in a single lorentzian spacetime boost. From the above equation we get:

    \[ ct'=ct\cosh \varphi -x\sinh \varphi \]

    \[ x'=-ct\sinh \varphi +x\cosh \varphi\]

Multiplying the first equation by \cosh \varphi and the second one by \sinh \varphi, we add the resulting equation to obtain:

    \[ ct'\cosh\varphi+x'\sinh \varphi =ct\cosh^2 \varphi -ct\sinh^2 \varphi =ct\]

that is

    \[ ct'\cosh\varphi+x'\sinh \varphi =ct\]

From this equation (or the boxed equations), we see that \varphi=0 corresponds to x'=x and t'=t. Setting x'=0, we deduce that

    \[ x'=0=-ct\sinh \varphi +x\cosh \varphi\]

and thus

    \[ ct\tanh \varphi =x\]


    \[ x=ct\tanh\varphi\]


Since t\neq 0, and the pseudorotation seems to have a “pseudovelocity” equal to V=x/t, the rapidity is then defined through the equation:

    \[ \boxed{\tanh \varphi=\dfrac{V}{c}=\beta}\leftrightarrow\mbox{RAPIDITY}\leftrightarrow\boxed{\varphi=\tanh^{-1}\beta}\]

If we remember what we have learned in our previous mathematical survey, that is,

    \[ \tanh^{-1}z=\dfrac{1}{2}\ln \dfrac{1+z}{1-z}=\ln \sqrt{\dfrac{1+z}{1-z}}\]

We set z=\beta in order to get the next alternative expression for the rapidity:

    \[ \varphi=\ln \sqrt{\dfrac{1+\beta}{1-\beta}}=\dfrac{1}{2}\ln \dfrac{1+\beta}{1-\beta}\leftrightarrow \exp \varphi=\sqrt{\dfrac{1+\beta}{1-\beta}}\]
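A short numerical check of the two expressions for the rapidity, and of the exponential form, could look as follows. It is only a sketch; the sample values of \beta are arbitrary.

```python
import math

def rapidity(beta):
    """Rapidity from beta = V/c, using the inverse hyperbolic tangent."""
    return math.atanh(beta)

def rapidity_log(beta):
    """Same rapidity, written with the logarithmic formula."""
    return 0.5 * math.log((1.0 + beta) / (1.0 - beta))

for beta in (0.1, 0.5, 0.9, 0.99):
    phi = rapidity(beta)
    print(f"beta = {beta:4.2f}  phi = {phi:.6f}  log form = {rapidity_log(beta):.6f}  "
          f"exp(phi) = {math.exp(phi):.6f}")
```

For small \beta the rapidity is close to \beta itself, while it grows without bound as \beta approaches 1.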

In experimental particle physics, in general 3+1 spacetime, the rapidity definition is extended as follows. Writing, from the previous equations above,

    \[ \sinh \varphi=\dfrac{\beta}{\sqrt{1-\beta^2}}\]

    \[ \cosh \varphi=\dfrac{1}{\sqrt{1-\beta^2}}\]

and using these two last equations, we can also write momenergy components using rapidity in the same fashion. Suppose that for some particle(object), its  mass is m, its energy is E, and its (relativistic) momentum is \mathbf{P}. Then:

    \[ E=mc^2\cosh \varphi\]

    \[ \lvert \mathbf{P} \lvert =mc\sinh \varphi\]

From these equations, it is trivial to guess:

    \[ \varphi=\tanh^{-1}\dfrac{\lvert \mathbf{P} \lvert c}{E}=\dfrac{1}{2}\ln \dfrac{E+\lvert \mathbf{P} \lvert c}{E-\lvert \mathbf{P} \lvert c}\]

This is the completely general definition of rapidity used in High Energy Physics (HEP), with a further detail. In HEP, physicists usually take the direction of the momentum along the direction of the colliding beam particles! Suppose we select some orientation, e.g. the z-axis. Then, \lvert \mathbf{P} \rvert is replaced by p_z and rapidity is defined in that beam direction as:

    \[ \boxed{\varphi_{hep}=\tanh^{-1}\dfrac{\lvert \mathbf{P}_{beam} \lvert c}{E}=\dfrac{1}{2}\ln \dfrac{E+p_z c}{E-p_z c}}\]

In 2d spacetime, rapidities add linearly, \varphi_{1+2}=\varphi_1+\varphi_2, while velocities add nonlinearly according to the celebrated relativistic addition rule:

    \[ \beta_{1+2}=\dfrac{\beta_1+\beta_2}{1+\beta_1\beta_2}\]

Indeed, Lorentz transformations do commute in 2d spacetime, since we boost along the same direction x, and we get:

    \[ L_1^xL_2^x-L_2^xL_1^x=0\]


    \[ L_1^x=\begin{pmatrix}\gamma_1 & -\gamma_1\beta_1\\ -\gamma_1\beta_1 &\gamma_1 \end{pmatrix}\]

    \[ L_2^x=\begin{pmatrix}\gamma_2 & -\gamma_2\beta_2\\ -\gamma_2\beta_2 &\gamma_2 \end{pmatrix}\]

This commutativity is lost when we go to higher dimensions. Indeed, in spacetime with more than one spatial direction that result is not true in general. If we build a Lorentz transformation with two boosts in different directions V_1=(v_1,0,0) and V_2=(0,v_2,0), the Lorentz matrices are ( remark for experts: we leave one direction in space untouched, so we get 3×3 matrices):

    \[ L_1^x=\begin{pmatrix}\gamma_1 & -\gamma_1\beta_1 &0\\ -\gamma_1\beta_1 &\gamma_1 &0\\ 0& 0& 1\end{pmatrix}\]

    \[ L_2^y=\begin{pmatrix}\gamma_2 & 0&-\gamma_2\beta_2\\ 0& 1& 0\\ -\gamma_2\beta_2 & 0&\gamma_2 \end{pmatrix}\]

and it is easily checked that

    \[ L_1^xL_2^y-L_2^yL_1^x\neq 0\]
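The commutator can be evaluated explicitly for sample velocities. The sketch below uses the same 3×3 matrices as above (coordinates ct, x, y); the values of \beta are arbitrary and the helper names are illustrative.

```python
import math

def gamma(beta):
    return 1.0 / math.sqrt(1.0 - beta * beta)

def boost_x(beta):
    g = gamma(beta)
    return [[g, -g * beta, 0.0], [-g * beta, g, 0.0], [0.0, 0.0, 1.0]]

def boost_y(beta):
    g = gamma(beta)
    return [[g, 0.0, -g * beta], [0.0, 1.0, 0.0], [-g * beta, 0.0, g]]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    """Matrix commutator ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(len(a))] for i in range(len(a))]

for row in commutator(boost_x(0.5), boost_y(0.5)):
    print(row)  # some entries are clearly nonzero
for row in commutator(boost_x(0.3), boost_x(0.6)):
    print(row)  # all entries vanish up to rounding
```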

Finally, there is another quantity related to rapidity that experimentalists even prefer to rapidity. It is called: PSEUDORAPIDITY!

Pseudorapidity, often denoted by \eta, describes the angle of a particle relative to the beam axis. Mathematically speaking, it is:

    \[ \boxed{\eta=-\ln \tan \dfrac{\theta}{2}}\leftrightarrow \mbox{PSEUDORAPIDITY}\leftrightarrow \boxed{\exp (\eta)=\dfrac{1}{\tan\dfrac{\theta}{2}}}\]

where \theta is the angle between the particle momentum \mathbf{P}  and the beam axis. The above relation can be inverted to provide:

    \[ \boxed{\theta=2\tan^{-1}(e^{-\eta})}\]

The pseudorapidity in terms of the momentum is given by:

    \[ \boxed{\eta=\dfrac{1}{2}\ln \dfrac{\vert \mathbf{P}\vert +P_L}{\vert \mathbf{P}\vert -P_L}}\]

Note that, unlike rapidity, pseudorapidity depends only on the polar angle of its trajectory, and not on the energy of the particle.
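The three boxed pseudorapidity relations can be checked against each other with a few lines of Python. This is a sketch only; angles are assumed to be in radians and the function names are illustrative.

```python
import math

def pseudorapidity(theta):
    """eta from the polar angle theta (radians) with respect to the beam axis."""
    return -math.log(math.tan(theta / 2.0))

def polar_angle(eta):
    """Inverse relation: polar angle theta from eta."""
    return 2.0 * math.atan(math.exp(-eta))

def pseudorapidity_from_momentum(p, p_L):
    """eta from the momentum modulus p and its longitudinal component p_L."""
    return 0.5 * math.log((p + p_L) / (p - p_L))

for deg in (90.0, 45.0, 10.0, 1.0):
    theta = math.radians(deg)
    print(f"theta = {deg:5.1f} deg  eta = {pseudorapidity(theta):.4f}  "
          f"from momentum = {pseudorapidity_from_momentum(1.0, math.cos(theta)):.4f}")
```

A particle emitted perpendicularly to the beam has \eta close to 0, and \eta grows quickly as the direction approaches the beam axis.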

In hadron collider physics,  and other colliders as well, the rapidity (or pseudorapidity) is preferred over the polar angle because, loosely speaking, particle production is constant as a function of rapidity. One speaks of the “forward” direction in a collider experiment, which refers to regions of the detector that are close to the beam axis, at high pseudorapidity \eta.

The rapidity as a function of pseudorapidity is provided by the following formula:

    \[ \boxed{\varphi=\ln\dfrac{\sqrt{m^2+p_T^2\cosh^2\eta}+p_T\sinh \eta}{\sqrt{m^2+p_T^2}}}\]

where p_T is the momentum transverse to the beam axis and m is the invariant mass of the particle.
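The formula relating rapidity and pseudorapidity can be compared with the definition in terms of E and p_z. The sketch below assumes natural units (c=1), as the formula itself does, and uses made-up values for m, p_T and \eta.

```python
import math

def rapidity_from_eta(m, p_T, eta):
    """Rapidity from mass, transverse momentum and pseudorapidity (c = 1)."""
    num = math.sqrt(m * m + (p_T * math.cosh(eta)) ** 2) + p_T * math.sinh(eta)
    return math.log(num / math.sqrt(m * m + p_T * p_T))

def rapidity_from_E_pz(m, p_T, eta):
    """Rapidity from the definition (1/2) ln[(E + p_z)/(E - p_z)] (c = 1)."""
    p_z = p_T * math.sinh(eta)
    E = math.sqrt(m * m + p_T * p_T + p_z * p_z)
    return 0.5 * math.log((E + p_z) / (E - p_z))

# Hypothetical particle: m = 0.938, p_T = 2.0 (same energy units), eta = 1.5
print(rapidity_from_eta(0.938, 2.0, 1.5), rapidity_from_E_pz(0.938, 2.0, 1.5))
# For a massless particle, rapidity and pseudorapidity coincide
print(rapidity_from_eta(0.0, 2.0, 1.5))
```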

Remark: The difference in the rapidity of two particles is independent of the Lorentz boosts along the beam axis.

Collider experiments describe physical momenta in terms of the transverse momentum p_T (rather than the momentum in the direction of the beam axis, the longitudinal momentum P_L=p_z), the azimuthal angle in the transverse plane (generally denoted by \phi) and the pseudorapidity \eta. To obtain cartesian momenta (p_x,p_y,p_z) (with the z-axis defined as the beam axis), the following transformations are used:

    \[ p_x=p_T\cos\phi\]

    \[ p_y=p_T\sin\phi\]

    \[ p_z=p_T\sinh\eta\]

Thus, we get the also useful relationship

    \[ \vert \mathbf{P} \vert=p_T\cosh\eta\]

This quantity is an observable in the collision of particles, and it can be measured as the main image of this post shows.
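The change of variables from (p_T, \phi, \eta) to cartesian momenta is simple to code. The following is a minimal sketch with arbitrary sample values; `cartesian_momentum` is an illustrative name.

```python
import math

def cartesian_momentum(p_T, phi, eta):
    """(p_x, p_y, p_z) from transverse momentum, azimuthal angle and pseudorapidity."""
    return (p_T * math.cos(phi), p_T * math.sin(phi), p_T * math.sinh(eta))

p_x, p_y, p_z = cartesian_momentum(20.0, 0.7, 1.2)
p_modulus = math.sqrt(p_x ** 2 + p_y ** 2 + p_z ** 2)
print(p_modulus, 20.0 * math.cosh(1.2))  # both values agree up to rounding
```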

LOG#025. Minkovski diagrams.


“(…)The views of space and time which I wish to lay before you have sprung from the soil of experimental physics, and therein lies their strength. They are radical. Henceforth space by itself, and time by itself, are doomed to fade away into mere shadows, and only a kind of union of the two will preserve an independent reality.(…)”
— Hermann Minkowski, in ‘Space And Time’, a translation of an address delivered at the 80th Assembly of German Natural Scientists and Physicians, at Cologne, 21 Sep 1908. In H.A. Lorentz, H. Weyl, H. Minkowski, et al., The Principle of Relativity: A Collection of Original Memoirs on the Special and General Theory of Relativity (1952), 74.

In Special Relativity, via Lorentz transformations, space and time are intrinsically united. Therefore, it makes sense to call the abstract set of space and time together spacetime, as we have seen before in our relativistic thread. Hermann Minkovski, a former professor of Albert Einstein, devised a quite simple and pictorial representation of “events” in “space-time”. That representation shares many similarities with usual cartesian geometry in the plane, but with some differences. Let me explain it better. Generally, the position of a point P in the real plane is expressed by Cartesian (also called parallel or rectangular) coordinates P(x_0,y_0). But one could just as well use two oblique-angled axes, x' and y', instead. In order to determine the new coordinates x'_0,y'_0 of any point P in such a coordinate system, we draw parallels to the oblique axes through P, and then we can read off x'_0,y'_0 at the intersection points with the new axes. This result is completely general, and the units of length on the new axes don’t need to be the same as those we used in the old coordinates. See the following diagram:


Lorentz transformations are what mathematicians call “affine transformations”, i.e., a one-to-one (a.k.a. bijective) mapping of a plane on itself that preserves parallelism and rectilinearity of straight lines! For a 2D Minkovski spacetime transformation, being general, we have

    \[ ct'=x'_0=L_{00}ct+L_{01}x+b_0\]

    \[ x'=L_{10}ct+L_{11}x+b_1\]

Setting b_0=b_1=0 fixes a common origin of coordinates (i.e. an unchanged point of origin). It is very simple to calculate the direction and units of length of the new primed axes in the coordinate system of the unprimed (old) coordinates. We have to set

    \[ ct'=0, x'=1\]

into the inverse 2d Lorentz transformations, and we obtain the expected result ct=\beta\gamma, x=\gamma for the unit point of the x’-axis (similarly, setting ct'=1, x'=0 provides ct=\gamma, x=\beta\gamma for the ct’-axis). It is a common practice to represent the time on the vertical axis, and the position on the horizontal axis. Curiously, it is the opposite practice to the more conventional plots of motion with which we are more familiar, where we usually represent the time horizontally and the position of some object vertically. Of course, that is only a conventional choice; we could draw the axes as we wanted to. We only have to establish the “rules”. And we will follow the common practices here. Furthermore, a Lorentz transformation does not rotate both coordinate axes in the same sense, as an ordinary rotation of the coordinate frame around its origin would do: the two axes rotate in opposite senses, towards each other! Elementary trigonometry helps us now (we reviewed it in a previous log), and the angle between the primed (new) and the unprimed (old) axes is shown to be:

    \[ \boxed{\tan \delta=\beta=\dfrac{v}{c}}\]

The units of length, L, on the new reference frame in spacetime can be calculated with a simple application of the Pythagoras theorem:

    \[ L=\sqrt{\gamma^2+\beta^2\gamma^2}\]

and thus

    \[ \boxed{L=\sqrt{\dfrac{1+\beta^2}{1-\beta^2}}}\]

This stuff can be reviewed in the following cool diagram:


Example: the relative velocity between S and S’ is c/2. It provides \tan \delta=\beta=1/2 and L=\sqrt{5/3}\approx 1.29. Therefore, coming back to the new reference frame, the spacing of the “tick” marks (the scale) is stretched by a factor of 1.29. Suppose in addition that the units of length are given in light-years (ly). Then, time coordinates will be given in light-years too, since we have ct, ct' on the temporal axes. Finally, an important remark to relativity beginners: the obliquity of the x’-axis in a Minkovski spacetime diagram DOES NOT IMPLY that there is an angle between the x and the x’ axes IN SPACE. The angles between frames are angles between space-time axes in general; it is not necessary that the x-x’ axes form an angle in space a priori.
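The tilt angle and the scale factor of the primed axes can be computed for any \beta with a short script. This is a sketch with illustrative function names; it reproduces the numbers of the example for \beta=1/2.

```python
import math

def axis_angle_deg(beta):
    """Angle delta (in degrees) between primed and unprimed axes, tan(delta) = beta."""
    return math.degrees(math.atan(beta))

def scale_factor(beta):
    """Length L of the primed unit, measured with the unprimed tick marks."""
    return math.sqrt((1.0 + beta ** 2) / (1.0 - beta ** 2))

for beta in (0.1, 0.5, 0.9):
    print(f"beta = {beta:.1f}  delta = {axis_angle_deg(beta):6.2f} deg  "
          f"L = {scale_factor(beta):.3f}")
```

As \beta approaches 1 the angle tends to 45 degrees, so both primed axes close in on the light cone, while the scale factor grows without bound.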

Some practical exercises will help us to experience the power of Minkovski diagrams in solving common examples in Special Relativity.

Exercise 1. Events in space-time.

Observe the following Minkovski diagram


The event E is located 5 years in the future and 3 ly on the right of the origin (0,0), as it is given in the S frame. The S’ frame moves with velocity \beta=0.5 to the right. The question is: when and where does event E happen in the S’ frame? A simple calculation shows it:

    \[ \beta=0.5\rightarrow \gamma = \dfrac{2}{\sqrt{3}}\]

Therefore, the coordinates of the event E in the primed frame will be:

    \[ ct'=\gamma (ct-\beta x)=\dfrac{2}{\sqrt{3}}(5ly-0.5\cdot 3ly)\approx 4.04ly\]

    \[ x'=\gamma(x-\beta c t)=\dfrac{2}{\sqrt{3}}(3ly-0.5\cdot 5ly)\approx 0.58ly\]

In the primed reference frame S’, the event E happens about 4.04 years in the future and about 0.58 ly to the right of the origin! Fascinating and elementary, from our limited knowledge at least. Consider a variation of this exercise, where the event E is located 2 years in the past and 2 ly to the left of the origin, as given in the S-frame. The S’-frame moves to the left, with velocity \beta=-0.6. Again, the question is: when and where does the event E happen in the S’-frame? Looking at the new Minkowski diagram:


We can proceed with a simple calculation, as we did before:

    \[ \beta=-0.6\rightarrow \gamma =\dfrac{5}{4}\]

And thus, the new coordinates will be boosted by Lorentz transformations:

    \[ ct'=\gamma(ct-\beta x)=\dfrac{5}{4}(-2ly-(-0.6)(-2ly))=-4ly\]

    \[ x'=\gamma (x-\beta ct)=\dfrac{5}{4}(-2ly-(-0.6)(-2ly))=-4ly\]

In the primed reference frame S’, the event is located 4 years in the past and 4 light-years (ly) on the left of the common origin of both frames.
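Both variants of Exercise 1 can be checked with a few lines of Python. This is a minimal sketch, assuming units with c=1 (so ct and x are both measured in ly); the helper name `boost` is hypothetical and just implements the two transformation formulas used above. As an extra check, it also prints the interval (ct)^2-x^2, which should be the same in both frames.

```python
import math

def boost(ct, x, beta):
    """Lorentz-boost the event (ct, x) to a frame moving with velocity beta (units of c)."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    ct_new = gamma * (ct - beta * x)
    x_new = gamma * (x - beta * ct)
    return ct_new, x_new

# The two events of Exercise 1: (ct, x, beta)
for ct, x, beta in [(5.0, 3.0, 0.5), (-2.0, -2.0, -0.6)]:
    ct_p, x_p = boost(ct, x, beta)
    # The interval (ct)^2 - x^2 must agree in both frames.
    print(f"beta={beta:+.1f}: ct'={ct_p:.2f} ly, x'={x_p:.2f} ly, "
          f"interval S={ct**2 - x**2:.2f}, interval S'={ct_p**2 - x_p**2:.2f}")
```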

Exercise 2. Simultaneity in action.

We will suppose now that \beta=0.5. All the events that are located on a straight line parallel to the x-axis (dashed in the next Minkowski diagram) are simultaneous in S.


Likewise, all the events located on a straight line parallel to the x’-axis (dotted in our Minkowski diagram) are simultaneous in S’. It is then evident that two events, E_1, E_2, that are simultaneous in one inertial frame but happen at different positions along the direction of motion are NOT simultaneous in another inertial frame moving relative to the first. This fact drives many of the “paradoxes” of Special Relativity. They are not really paradoxes in general, since we can usually resolve the apparent contradictions by using proper language and the right physical insight.

The relativity of simultaneity can also be understood with another Minkowski diagram for different events:


Exercise 3. The world lines in space-time.

Choose any inertial reference frame, e.g., the S-frame, and place some object in space with spacetime coordinates E(x,ct)=(x_0,ct_0). This point of the spacetime will be denoted as E_0. At a later moment, the object has moved to some other arbitrary location, e.g., x=x_1 and ct=ct_1. We define the event E_1(x,ct)=(x_1,ct_1).

If we track the object through the spacetime, we obtain a sequence of events! This sequence of events is called “worldline”.

The worldline of any particle/object is nothing but a path-time diagram, i.e., the length-time diagram of the particle, with the path plotted to the right and the time plotted towards the top. Moreover, the worldline of any object AT REST in the S-frame is parallel to the ct-axis. The faster the object is in the S-frame, the flatter its worldline is. If the object moves at the speed of light, the worldline is a straight line whose slope can only be +1 or -1, since it moves one unit in the x-direction (1 ly) per unit of the ct-direction (1 year times the speed of light). No material object can move faster than the speed of light in Special Relativity (i.e., under the hypotheses of Special Relativity, any body is restricted to move with velocities lower than “c”). In addition, worldlines intersecting the x-axis at angles less than 45º are thus excluded (unless we allow the notion of tachyons, to be discussed in the near future in this blog).

These notions can be understood with the following diagram, sketching the worldline:

Imagine a light flash propagating from the event E(ct,x)=(2ly,2ly) as a spherical shell expanding in all directions of space. In the one-dimensional representation of space, given by the following Minkowski spacetime diagram, the spherical shell is, at any instant of time, reduced to two points (the intersection of a line with the spherical shell). As time flows, the two points trace two diverging worldlines, which we drew dashed. We can read from the Minkowski diagram that the light reaches the observer at t=4yr in the S-frame, and at t'\approx 2.3 yrs in the S’-frame (\beta=0.5). The calculation for S is trivial, but for S’ a somewhat harder calculation has to be done.
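The “harder calculation” for S’ can be sketched in Python. I am assuming here that “the observer” sits at the spatial origin of each frame (x=0 in S, x'=0 in S’) and that c=1; the helper names `boost` and `light_arrival_at_origin` are hypothetical. The idea is to boost the emission event to S’ and then add the light travel time to the origin of that frame.

```python
import math

def boost(ct, x, beta):
    """Lorentz-boost the event (ct, x) to a frame moving with velocity beta (units of c)."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return gamma * (ct - beta * x), gamma * (x - beta * ct)

def light_arrival_at_origin(ct_emit, x_emit):
    """Value of ct at which a flash emitted at (ct_emit, x_emit) reaches the spatial origin."""
    # Light covers one unit of x per unit of ct, so it needs |x_emit| extra units of ct.
    return ct_emit + abs(x_emit)

ct_S = light_arrival_at_origin(2.0, 2.0)                   # observer at x = 0 in S
ct_Sp = light_arrival_at_origin(*boost(2.0, 2.0, 0.5))     # observer at x' = 0 in S'
print(f"S: ct = {ct_S:.2f} ly, S': ct' = {ct_Sp:.2f} ly")
```

With these assumptions the S’ value is 4/\sqrt{3}\approx 2.3 ly, in agreement with the number read from the diagram.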


Exercise 4. Time dilation via diagrams.

Again we take \beta=0.5. The event P'(ct',x')=(3ly,0ly) is given in the oblique coordinate axes of S’. In the cartesian frame at rest, S, we have ct\approx 3.5 ly. Explanation: a process extending from the origin to P’ takes 3 years in the S’-frame, while it will take about 3.5 years in the S-frame. The observer at rest in S concludes that the “clocks” of the S’-observer are slower than his own clocks.

The point Q(ct,x)=(3ly,0ly) in the cartesian S-frame has the time coordinate ct'\approx 3.5 ly in the S’-frame. Meaning: a process extending from the origin to Q takes 3 years in the S-frame and about 3.5 years in the S’-frame. The S’-observer claims that the clocks of the S-observer are slower than his own clocks.

The diagram is:


Remember: there is no contradiction in the fact that each observer “measures” the other’s clocks to be slower than his own, since, in the end, the events P’ and Q do not happen at the same point in spacetime!

Exercise 5. Length contraction via diagrams.

Once again, \beta=0.5, and the Minkowski diagram is now this one


We put a rod extending from the origin to the point P'(ct',x')=(0ly,3.5ly), initially at rest in the S’-frame. Its length in that frame equals 3.5 ly. The position of its left end as a function of time, i.e., its worldline, IS the ct’-axis, while the worldline of its right end is the parallel to the ct’-axis through P’ (marked as a dotted line). At any instant t’ in the S’-frame, the rod lies on a line parallel to the x’-axis.

On the other hand, in the S-frame the rod extends from the worldline of the left end (the ct’-axis) to the worldline of the right end (dotted line in the diagram). However, at any moment t in the S-frame, the rod lies on a line parallel to the x-axis. At time t=0, the rod extends from the origin to the point P(ct,x)=(0,x\approx 3). Thus, its length is approximately 3 ly instead of 3.5 ly, and the rod is contracted.

In a similar way, a rod at rest in the S-frame, extending from the origin to the point Q(ct,x)=(0ly,3.5ly), will be contracted in the S’-frame to the distance from the origin to the point Q'(ct',x')=(0ly,x'\approx 3ly).

Remark: the phenomenon of length contraction has to do with simultaneity as well. Length measurements of moving objects are meaningful only if the positions of both ends are measured simultaneously! Since S and S’ don’t agree on simultaneity, they cannot agree on the results of their length measurements.
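The values read from the diagrams in Exercises 4 and 5 can be compared with the usual closed formulas. The short Python sketch below assumes the standard results (elapsed time multiplied by \gamma, rest length divided by \gamma); the function name `gamma` is my own choice.

```python
import math

def gamma(beta):
    """Lorentz factor for a velocity beta given in units of c."""
    return 1.0 / math.sqrt(1.0 - beta**2)

beta = 0.5
# Exercise 4: a clock at rest in S' ticks ct' = 3 ly; the elapsed ct in S is gamma times larger.
ct_in_S = gamma(beta) * 3.0
# Exercise 5: a rod of rest length 3.5 ly is shortened by a factor 1/gamma in the other frame.
length_in_S = 3.5 / gamma(beta)
print(f"ct = {ct_in_S:.2f} ly, contracted length = {length_in_S:.2f} ly")
```

Both numbers agree with the approximate values 3.5 ly and 3 ly quoted in the exercises.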

Exercise 6. The past, the future and causality.
The next Minkowski diagram shows the coordinate axes of three different reference frames S, S’, and S”. The beta parameters are respectively \beta=0, \beta '=-0.6, \beta ''=0.6. There are two worldlines of light rays (flashes of light) passing through the origin (the dashed bisectors). The projections of the events P, Q and R on the different time axes are dotted and clearly distinguished from one another:


We observe and distinguish:

1st. P is located in the future, corresponding to the S’-frame ct’>0, in the present according to the S-frame since ct=0, and in the past in the S”-frame because of the relationship ct”<0.

2nd. Q is located in the future in every frame, R is located in the past in every frame.

We can replace the above diagram with a simpler Minkowski diagram


The 2 sectors on the left and right, in medium-gray colour, comprise zones whose events, depending on the frame, can be regarded as past, present or future. Every worldline is at least as steep as the worldlines of light (dashed here), and every worldline starting from any event therein has to intersect the ct-axis above the origin. Some worldlines, starting at P, have been added to the figure. No event in the medium-gray zones can affect the origin. This is a notion of causality. Furthermore, no event in the medium-gray zones can be reached or influenced by a worldline from the origin. These points outside the lightcone are said to be out of the causal influence of the past and future.

Events contained in the darker zone are regarded as future by every frame. A world line from the origin can reach any event therein. In this way, we can call this zone “absolute future”. In the same fashion, events in the lower zone (the lighter gray) are past for every frame. They are absolute past and any event happening inside that zone can be connected to the origin by a worldline as well.

Remark: it is impossible, a priori, for a worldline to run from the upper to the lower zone, i.e., it is impossible (according to SR) for events like Q to influence R. By no means can Q be the cause of R. However, R CAN be the cause of Q. The sequence of cause and effect, causality, cannot be inverted within the special theory of relativity. If it happened, it would be an argument against the theory.

Exercise 7. Faster than the speed of light?

Consider the next gedanken experiment (thought or imaginary experiment). Let us suppose that information can be transmitted at a speed faster than light. Observe then the following diagram and once again set \beta=0.5


And next, imagine the next experiment:

1st. A transmission tower is at rest in the origin of the S-frame and a relay station is at rest in the S’-frame at x'=x'_0.

2nd. At the time t=t'=0, a signal will be transmitted from the common origin of the two frames to the relay station at speed 10c, as measured from the S-frame (square dots in the diagram).

3rd. It is being received by the relay station in the event A, and being re-emitted after a short period of time to the source at speed of -10c, according to the S’-frame (round dots in the diagram).

4th. The signal is being received in the event B at a time t<0, i.e., at the source, but BEFORE the signal was emitted there.

Conclusion: If the signal carries a destructive energy, it can destroy the transmitter. Cause and effect would occur in an inverted sequence!
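The gedanken experiment can also be followed numerically. The sketch below is only an illustration under explicit assumptions: c=1, the relay re-emits the signal instantly (the “short period of time” is neglected), and the names `boost` and `round_trip_return_time` are hypothetical. It finds event A as the intersection of the outgoing signal with the relay worldline in S, boosts it to S’, finds event B as the intersection of the reply with the transmitter worldline in S’, and boosts B back to S.

```python
import math

def boost(ct, x, beta):
    """Lorentz-boost the event (ct, x) to a frame moving with velocity beta (units of c)."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return gamma * (ct - beta * x), gamma * (x - beta * ct)

def round_trip_return_time(beta, w, x0_prime):
    """Return (ct, x) in S of event B, where the reply reaches the transmitter at x = 0.

    w is the signal speed in units of c: +w in S on the way out, -w in S' on the way back.
    """
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    # Event A: outgoing signal x = w*ct meets the relay worldline x = x0'/gamma + beta*ct.
    ct_a = x0_prime / (gamma * (w - beta))
    x_a = w * ct_a
    ct_a_p, x_a_p = boost(ct_a, x_a, beta)      # A seen from S' (x_a_p equals x0')
    # Event B: reply x' = x0' - w*(ct' - ct'_A) meets the transmitter worldline x' = -beta*ct'.
    ct_b_p = (x_a_p + w * ct_a_p) / (w - beta)
    x_b_p = -beta * ct_b_p
    # Back to S with the inverse boost (velocity -beta).
    return boost(ct_b_p, x_b_p, -beta)

ct_b, x_b = round_trip_return_time(beta=0.5, w=10.0, x0_prime=1.0)
print(f"superluminal signal (10c): reply arrives at ct = {ct_b:.3f} ly")
ct_b_light, _ = round_trip_return_time(beta=0.5, w=1.0, x0_prime=1.0)
print(f"light signal (1c):         reply arrives at ct = {ct_b_light:.3f} ly")
```

With the 10c signal the arrival time comes out negative, i.e., before the emission at t=0, while an ordinary light signal returns at a positive time, as it should.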

LOG#024. Strange derivative.



I have been fascinated by Mathematics (perhaps I am in love with it too) since I was a child. As a teenager in High School, I was a very curious student (indeed, I still am) and I tried to understand some weird results I got in the classroom. This entry is devoted to one of those problems that made me wonder and think a lot out of class, at home. It took me some years, and learning complex variable function theory, to understand it. 3 years to understand a simple derivative! Yes, it is too much time. However, understanding stuff deeply takes time. Sometimes more, sometimes less.

This is the problem. A very simple problem indeed! The following “weird” (real) trigonometric-like function has a null derivative:

    \[ g(x)= \tan^{-1}((1+x)/(1-x))-\tan ^{-1} (x)\]

where the \tan^{-1} (x) is the arctangent function \arctan (x)( the inverse of the tangent function \tan (x)).


    \[ g'(x) = \dfrac{dg}{dx}=\dfrac{\dfrac{(1-x)+(1+x)}{(1-x)^2}}{1+\left(\dfrac{(1+x)}{(1-x)}\right)^2}-\dfrac{1}{1+x^2}\]

    \[ g'(x) = \dfrac{\dfrac{2}{(1-x)^2}}{1+\left(\dfrac{1+x}{1-x}\right)^2}-1/(1+x^2)\]

    \[ g'(x) = \dfrac{2}{\left(1-x\right)^2+\left(1+x\right)^2}-1/(1+x^2)\]

    \[ g'(x) = \dfrac{2}{2+2x^2}-1/(1+x^2)\]

    \[ g'(x) = \dfrac{1}{1+x^2}-\dfrac{1}{1+x^2}\]

    \[ g'(x)=0 \]


Since the derivative g'(x) is zero, the two functions must differ by a constant (on any interval not containing x=1, where the first arctangent is not defined). We can find that constant for x<1 with usual real variable calculus. It’s quite simple:

    \[ g(x)=\text{constant}\Rightarrow g(0)= \text{constant} = \tan^{-1}((1+0)/(1-0))-\tan ^{-1} (0)\]

    \[ g(0)= \tan^{-1}(1)-\tan ^{-1} (0)=\pi /4\]

So, the difference is \pi/ 4. However, this calculation does not explain why the two functions differ by a constant. The secret lies in the complex function origin of the arctangent. In complex variable function theory, it can be proved that

    \[ \tan^{-1}(x)=\dfrac{1}{2i} \ln ((1+ix)/(1-ix))\]


    \[ \tan(x)=y=\dfrac{\left[\dfrac{1}{2i} (\exp (ix)-\exp (-ix))\right]}{\left[\dfrac{1}{2} \left(\exp (ix)+\exp (-ix)\right)\right]}=\dfrac{1}{i}\dfrac{\exp (ix)-\exp (-ix)}{\exp (ix)+\exp (-ix)}\]

then iy=\left(\exp (2ix)-1\right)/\left(\exp (2ix)+1\right), and thus \exp (2ix)=(1+iy)/(1-iy), and therefore x=\dfrac{1}{2i} \ln \left((1+iy)/(1-iy)\right). Q.E.D.

Now, we can calculate the functions in terms of complex variables:

    \[ \tan^{-1}\left((1+x)/(1-x)\right)=\arctan \left((1+x)/(1-x)\right)=\dfrac{1}{2i} \ln \left(\dfrac{1+i\left(\frac{1+x}{1-x}\right)}{1-i\left(\frac{1+x}{1-x}\right)}\right)\]

We can make some algebra inside the logarithm function to get:

    \[ \arctan \left((1+x)/(1-x)\right)=\dfrac{1}{2i} \ln \left( \dfrac{(1-x)+i(1+x)}{(1-x)-i(1+x)} \right)= \dfrac{1}{2i} \ln \left( \dfrac{(1+ix)+i(1+ix)}{(1-ix)-i(1-ix)} \right)\]

On the other hand, we also have

    \[ \tan^{-1}(x)=\arctan (x)=\dfrac{1}{2i} \ln \left( \dfrac{1+ix}{1-ix} \right)\]


Therefore,

    \[ g(x) = \arctan ((1+x)/(1-x)) - \arctan (x) = \dfrac{1}{2i} \ln \left [ \dfrac{\dfrac{(1+ix)+i(1+ix)}{(1-ix)-i(1-ix)}}{ \dfrac{1+ix}{1-ix}} \right]\]

i.e., the terms depending on x cancel, leaving a pure number! The number is

    \[ g(x)=number=\dfrac{1}{2i} \ln \left( \dfrac{1+i}{1-i}\right)\]

We have to calculate the logarithm of the complex number

    \[ z=\dfrac{1+i}{1-i}=\dfrac{(1+i)(1+i)}{(1-i)(1+i)}=\dfrac{2i}{2}=i=\exp(i\pi/2)\]

Then, \ln ( i ) = i\pi/2 (on the principal branch). Of course, that is what we were expecting to get, since in this case

    \[ g(x)=\dfrac{1}{2i} \dfrac{i\pi}{2}=\dfrac{\pi}{4}\]

as before! That is, we recover the phase difference we also got with real calculus. The origin of the cancellation was in the complex origin of the arctangent function! Beautiful mathematics! The complex world is fascinating, and also very enlightening even for real functions!
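A quick numerical experiment supports the whole story. The following Python sketch (the names `g` and `atan_via_log` are mine, and the logarithm is the principal branch provided by `cmath.log`) evaluates g(x) at a few points with x<1 and compares the logarithmic formula for the arctangent with the real one. It also evaluates g at a point with x>1, where the argument (1+x)/(1-x) becomes negative and the constant is a different one.

```python
import cmath
import math

def g(x):
    """g(x) = arctan((1+x)/(1-x)) - arctan(x), defined for x != 1."""
    return math.atan((1 + x) / (1 - x)) - math.atan(x)

def atan_via_log(x):
    """arctan(x) for real x from (1/2i) ln((1+ix)/(1-ix)), principal branch."""
    return (cmath.log((1 + 1j * x) / (1 - 1j * x)) / 2j).real

for x in (-3.0, 0.0, 0.5, 0.99):
    print(f"x={x:5.2f}  g(x)={g(x):.6f}  "
          f"log formula - atan = {atan_via_log(x) - math.atan(x):.1e}")

print(f"pi/4     = {math.pi / 4:.6f}")
# For x > 1 the constant is no longer pi/4.
print(f"g(2)     = {g(2.0):.6f}")
print(f"-3*pi/4  = {-3 * math.pi / 4:.6f}")
```

So the statement “g is constant” holds separately on x<1 and on x>1, with a jump of \pi at x=1.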

LOG#023. Math survey.


What is a triangle? It is a question of definition in Mathematics. Of course you could disagree, but it is true. Look at the three “triangles” above. Euclidean geometry is based on the first one. The second “triangle” is commonly found in special relativity, especially through the hyperbolic functions. The third one is related to spherical/elliptical geometry.

Today’s summary: some basic concepts in arithmetics, complex numbers and functions. We are going to study and review the properties of some elementary and well known functions. We are doing this in order to prepare a better background for the upcoming posts, in which some special functions will appear. Maybe, this post can be useful for understanding some previous posts too.

First of all, let me remind you that elementary arithmetic is based on seven basic “operations”: addition, subtraction, multiplication, division, powers/roots, exponentials and logarithms. You are familiar with the first 4 operations, and you likely also know about powers and roots, but exponentials and logarithms are the last kind of elementary operations taught at school (high school, if they are ever explained at all!).

Let me begin with addition/subtraction of real numbers (it would be also valid for complex numbers z=a+bi or even more general “numbers”, “algebras”, “rings” or “fields”, with suitable extensions).

    \[ a+b=b+a\]

    \[ (a+b)+c=a+(b+c)\]

    \[ a+(-a)=0\]

    \[ a+0=a\]

Multiplication is a harder operation. We have to be careful with the axioms since there are many places in physics where multiplication is generalized, losing some of the following properties:

    \[ kA=\underbrace{A+..+A}_\text{k-times}\]

    \[ AB=BA\]

    \[ (AB)C=A(BC)=ABC\]

    \[ 1A=A1=A\]

    \[ (A+B)C=AC+BC\]

    \[ A(B+C)=AB+AC\]

    \[ A^{-1}A=AA^{-1}=1\]

Indeed, the last rule can be understood as the “division” rule, provided A\neq 0 (since in mathematics or physics it makes no sense to “divide by zero”), as follows.

    \[ A^{-1}=\dfrac{1}{A}\]

Now, we are going to review powers and roots.

    \[ x^a=\underbrace{x\cdots x}_\text{a-times}\]

    \[ (x^a)^b=x^{ab}\]

    \[ x^{-a}=\dfrac{1}{x^a}\]

    \[ x^ax^b=x^{a+b}\]

    \[ \sqrt[n]{x}=x^{1/n}\]

Note that the identity x^a+y^a=(x+y)^a is not true in general. Moreover, if x\neq 0 then x^0=1 as well, as it can be easily deduced from the previous axioms. Now, the sixth operation is called exponentiation. It reads:

    \[ \exp (a+b)=\exp (a)\exp (b)\]

Sometimes you can read e^{a+b}=e^{a}e^{b}, where \displaystyle{e=\lim_{n\to\infty}\left(1+\dfrac{1}{n}\right)^n} is the so-called “e” number. The definition is even more general, since the previous property is the key feature of any exponential. I mean that,

    \[ a^x=\underbrace{a\cdots a}_\text{x-times}\]

    \[ a^xb^x=(ab)^x\]

    \[ \left(\dfrac{a}{b}\right)^x=\dfrac{a^x}{b^x}\]

We also get that for any x>0, 0^x=0. Finally, the 7th operation. Likely, the most mysterious for the layman. However, it is very useful in many different places. Recall the definition of the logarithm in a certain base “a”:

    \[ \log_{(a)} x=y \leftrightarrow x=a^y\]

Please, note that this definition has nothing to do with the “deformed” logarithm of my previous log-entry. Notations are subtle, but you must always be careful about what you are talking about!

Furthermore, there are more remarks:

1st. Sometimes you write \log_e x=\ln x. Be careful, some books use other notations for Napier’s logarithm/natural logarithm. You can find out there \log_e=L or even \log_e=\log.

2nd. Whenever you are using a calculator, you can generally find \log_e=\ln and \log_{(10)}=\log. Please, note that in this case \log is not the natural logarithm, it is the decimal logarithm.

Logarithms (caution: logarithms of real numbers, since the logarithms of  complex numbers are a bit more subtle) have some other cool properties:

    \[ \log_{(a)}(xy)=\log_{(a)}x+\log_{(a)}y\]

    \[ \log_{(a)}\dfrac{x}{y}=\log_{(a)}x-\log_{(a)}y\]

    \[ \log_{(a)}x^y=y\log_{(a)}x\]

    \[ \log_{(a)}x=\dfrac{\log_{(b)}x}{\log_{(b)}a}\]

Common values of the logarithm are:

    \[ \ln 0^+=-\infty;\; \ln 1=0;\; \ln e=1;\; \ln e^x=e^{\ln x}=x\]
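These rules are easy to test on a computer. As a small illustration (the function name `log_base` is hypothetical), the change-of-base rule lets us build the logarithm in any base a from the natural logarithm alone, and the product and power rules can then be checked on concrete numbers:

```python
import math

def log_base(x, a):
    """Logarithm of x in base a, via the change-of-base rule with natural logarithms."""
    return math.log(x) / math.log(a)

print(log_base(8, 2))                                        # log_2(8)
print(log_base(6, 10), log_base(2, 10) + log_base(3, 10))    # product rule
print(log_base(2**5, 3), 5 * log_base(2, 3))                 # power rule
```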

Indeed, logarithms are also famous due to a remarkable formula by Dirac to express any number in terms of 2’s as follows:

    \[ \displaystyle{N=-\log_2\log_2 \sqrt{\sqrt{\underbrace{\cdots}_\text{(N-1)-times}2}}=-\log_2\log_2\sqrt{\underbrace{\cdots}_\text{(N)-times}2}}\]

However, it is quite a joke, since it is even easier to write N=\log_a a^N, or even N=\log_{(1/a)}a^{-N}
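Joke or not, Dirac's formula can be tested numerically. The sketch below (with a hypothetical function name `dirac_number`) takes N nested square roots of 2 and applies -\log_2\log_2 to the result. Floating-point arithmetic limits how large N can be before rounding errors show up, so I only try small values.

```python
import math

def dirac_number(n):
    """Evaluate -log2(log2(sqrt(sqrt(...sqrt(2))))) with n nested square roots."""
    value = 2.0
    for _ in range(n):
        value = math.sqrt(value)    # after n steps, value = 2**(2**-n)
    return -math.log2(math.log2(value))

for n in range(1, 8):
    print(n, dirac_number(n))
```

Each printed value should reproduce n up to rounding errors.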

Are we finished? NO! There are more interesting functions to review. In particular, the trigonometric functions are the most important functions you can find in the practical applications.


Triangles are cool! Let me draw the basic triangle in euclidean trigonometry.


The trigonometric ratios/functions you can define from this figure are:

i)The function (sin), defined as the ratio of the side opposite the angle to the hypotenuse:

    \[ \sin A=\dfrac{\textrm{opposite}}{\textrm{hypotenuse}}=\dfrac{a}{\,c\,}\]

ii) The function (cos), defined as the ratio of the adjacent leg to the hypotenuse.

    \[ \cos A=\dfrac{\textrm{adjacent}}{\textrm{hypotenuse}}=\dfrac{b}{\,c\,}\]

iii) The function (tan), defined as the ratio of the opposite leg to the adjacent leg.

    \[ \tan A=\dfrac{\textrm{opposite}}{\textrm{adjacent}}=\dfrac{a}{\,b\,}=\dfrac{\sin A}{\cos A}\]

The hypotenuse is the side opposite to the 90 degree angle in a right triangle; it is the longest side of the triangle, and one of the two sides adjacent to angle A. The “adjacent leg” is the other side that is adjacent to angle A. The “opposite side” is the side that is opposite to angle A. The terms “perpendicular” and “base” are sometimes used for the opposite and adjacent sides respectively. Many English speakers find it easy to remember which sides of the right triangle enter the sine, cosine, or tangent by memorizing the word SOH-CAH-TOA (a mnemonic rule whose derivation and meaning is left to the reader).

The multiplicative inverse or reciprocals of these functions are named the cosecant (csc or cosec), secant(sec), and cotangent (cot), respectively:

    \[ \csc A=\dfrac{1}{\sin A}=\dfrac{c}{a} \]

    \[ \sec A=\dfrac{1}{\cos A}=\dfrac{c}{b}\]

    \[ \cot A=\dfrac{1}{\tan A}=\dfrac{\cos A}{\sin A}=\dfrac{b}{a}\]

The inverse trigonometric functions/inverse functions are called the arcsine, arccosine, and arctangent, respectively. These functions are what in common calculators are given by \sin^{-1},\cos^{-1},\tan^{-1}. Don’t confuse them with the multiplicative inverse trigonometric functions.

There are arithmetic relations between these functions, which are known as trigonometric identities.  The cosine, cotangent, and cosecant are so named because they are respectively the sine, tangent, and secant of the complementary angle abbreviated to “co-“. From the goniometric circle (a circle of radius equal to 1) you can read the Fundamental Theorem of (euclidean) Trigonometry:

    \[ \cos^2\theta+\sin^2\theta=1\]

Indeed, from the triangle above, you can find out that the pythagorean theorem implies

    \[ a^2+b^2=c^2\]

or equivalently

    \[ \dfrac{a^2}{c^2}+\dfrac{b^2}{c^2}=\dfrac{c^2}{c^2}\]

So, the Fundamental Theorem of Trigonometry is just a dressed form of the pythagorean theorem!
The fundamental theorem of trigonometry can be rewritten too as follows:

    \[ \tan^2\theta+1=\sec^2\theta\]

    \[ \cot^2\theta+1=\csc^2\theta\]

These equations can be easily derived geometrically from the goniometric circle:


The trigonometric ratios are also related geometrically to this circle, as can be seen in the next picture:


Other trigonometric identities are:

    \[ \sin (x\pm y)=\sin x \cos y\pm \sin y\cos x\]

    \[ \cos (x\pm y)=\cos x \cos y \mp\sin x \sin y\]

    \[ \tan (x\pm y)=\dfrac{\tan x\pm \tan y}{1\mp \tan x \tan y}\]

    \[ \cot (x\pm y)=\dfrac{\cot x \cot y\mp 1}{\cot y\pm \cot x}\]

    \[ \sin 2x=2\sin x \cos x\]

    \[ \cos 2x=\cos^2 x-\sin^2 x\]

    \[ \tan 2x=\dfrac{2\tan x}{1-\tan^2 x}\]

    \[ \sin \dfrac{x}{2}=\pm\sqrt{\dfrac{1-\cos x}{2}}\]

    \[ \cos \dfrac{x}{2}=\pm\sqrt{\dfrac{1+\cos x}{2}}\]

    \[ \tan \dfrac{x}{2}=\pm\sqrt{\dfrac{1-\cos x}{1+\cos x}}\]

    \[ \sin x\sin y=\dfrac{\cos(x-y)-\cos(x+y)}{2}\]

    \[ \sin x\cos y=\dfrac{\sin(x+y)+\sin(x-y)}{2}\]

    \[ \cos x\cos y=\dfrac{\cos (x+y)+\cos(x-y)}{2}\]
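Signs are easy to get wrong in this kind of table, so a numerical spot check is a healthy habit. The Python sketch below (the names `cot` and `identity_residuals` are mine) evaluates “left-hand side minus right-hand side” for some of the identities at one arbitrary pair of angles, with the usual sign conventions \cos(x\pm y)=\cos x\cos y\mp\sin x\sin y and \cot(x\pm y)=(\cot x\cot y\mp 1)/(\cot y\pm\cot x). A single point is of course not a proof, only a sanity test.

```python
import math

def cot(t):
    """Cotangent as the reciprocal of the tangent."""
    return 1.0 / math.tan(t)

def identity_residuals(x, y):
    """Return left-hand side minus right-hand side for a few identities."""
    s, c = math.sin, math.cos
    return {
        "cos(x+y)": c(x + y) - (c(x) * c(y) - s(x) * s(y)),
        "cos(x-y)": c(x - y) - (c(x) * c(y) + s(x) * s(y)),
        "cot(x+y)": cot(x + y) - (cot(x) * cot(y) - 1) / (cot(y) + cot(x)),
        "cot(x-y)": cot(x - y) - (cot(x) * cot(y) + 1) / (cot(y) - cot(x)),
        "sin x sin y": s(x) * s(y) - (c(x - y) - c(x + y)) / 2,
        "sin x cos y": s(x) * c(y) - (s(x + y) + s(x - y)) / 2,
        "cos x cos y": c(x) * c(y) - (c(x + y) + c(x - y)) / 2,
    }

for name, residual in identity_residuals(0.7, 0.3).items():
    print(f"{name:12s} residual = {residual:.1e}")
```

All residuals should vanish up to floating-point rounding.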

The above trigonometric identities are also valid for complex numbers, with enough care. Let us write a complex number either in binomial form z=a+bi or in trigonometric (polar) form z=re^{i\theta}. The famous Euler identity:

    \[ e^{i\theta}=\cos \theta + i\sin \theta\]

allows us to relate both expressions for a complex number since

    \[ z=r(\cos \theta + i\sin \theta)\]

implies that a=r\cos\theta and b=r\sin\theta. The Euler formula is also useful to recover the identities for the sin and cos of a sum/difference, since e^{iA}e^{iB}=e^{i(A+B)}

The complex conjugate of a complex number is \bar{z}=a-bi, and the modulus \vert z\vert satisfies

    \[ z\bar{z}=\vert z\vert^2=r^2\]

with \theta =\arctan \dfrac{b}{a}, \;\; \vert z \vert=\sqrt{a^2+b^2}

Moreover, \overline{\left(z_1\pm z_2\right)}=\bar{z_1}\pm\bar{z_2}, \vert \bar{z}\vert=\vert z\vert, and if z_2\neq 0, then

    \[ \overline{\left(\dfrac{z_1}{z_2}\right)}=\dfrac{\bar{z_1}}{\bar{z_2}}\]

We also have the so-called de Moivre’s formula

    \[ z^n=r^n(\cos n\theta+i\sin n\theta)\]

and for the complex roots of complex numbers with w^n=z the identity:

    \[ w=z^{1/n}=r^{1/n}\left(\cos\left(\dfrac{\theta +2\pi k}{n}\right)+i\sin\left(\dfrac{\theta +2\pi k}{n}\right)\right),\quad \forall k=0,1,\ldots,n-1\]

The complex logarithm (and hence the complex power) is a multivalued function (be aware!):

    \[ \ln( re^{i\theta})=\ln r +i(\theta +2\pi k),\forall k\in \mathbb{Z}\]
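The multivaluedness of roots and logarithms is easy to see with Python's `cmath` module. In this sketch the helper names `nth_roots` and `log_branches` are hypothetical; they implement the root formula above and the branches \ln r+i(\theta+2\pi k) of the logarithm, and the loops verify that w^n and \exp(\ln z) give back z for every branch.

```python
import cmath
import math

def nth_roots(z, n):
    """All n complex n-th roots of z, from the polar form z = r*exp(i*theta)."""
    r, theta = cmath.polar(z)
    return [cmath.rect(r ** (1.0 / n), (theta + 2 * math.pi * k) / n) for k in range(n)]

def log_branches(z, ks):
    """Values ln r + i*(theta + 2*pi*k) of the complex logarithm for the given integers k."""
    r, theta = cmath.polar(z)
    return [complex(math.log(r), theta + 2 * math.pi * k) for k in ks]

z = 8j
for w in nth_roots(z, 3):
    print("root:", w, " cube:", w ** 3)
for value in log_branches(z, (-1, 0, 1)):
    print("log:", value, " exp:", cmath.exp(value))
```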

The introduction of complex numbers and complex values of trigonometric functions is fun. You can check that

    \[ \cos z=\dfrac{\exp (iz)+\exp (-iz)}{2}\]


    \[ \sin z=\dfrac{\exp (iz)-\exp (-iz)}{2i}\]


    \[ \tan z=\dfrac{\exp (iz)-\exp (-iz)}{i(\exp (iz)+\exp(-iz))}\]

thanks to the Euler identity.

In special relativity, the geometry is “hyperbolic”, i.e., it is non-euclidean. Let me review the so-called hyperbolic trigonometry. More precisely, we are going to review the hyperbolic functions related to special relativity now.



We define the functions sinh, cosh and tanh ( sometimes written as sh, ch, th):

    \[ \sinh x=\dfrac{\exp (x) -\exp (-x)}{2}\]

    \[ \cosh x=\dfrac{\exp (x) +\exp (-x)}{2}\]

    \[ \tanh x=\dfrac{\exp (x) -\exp (-x)}{\exp (x)+\exp (-x)}\]

The fundamental theorem of hyperbolic trigonometry is

    \[ \cosh^2 x-\sinh^2 x=1\]

The hyperbolic triangles are objects like this:


The hyperbolic inverse functions are

    \[\sinh^{-1} x=\ln (x+\sqrt{x^2+1})\]

    \[ \cosh^{-1} x=\ln (x+\sqrt{x^2-1})\]

    \[ \tanh^{-1} x=\dfrac{1}{2}\ln\dfrac{1+x}{1-x}\]

Two especially useful formulae in Special Relativity (related to the gamma factor, the velocity and a parameter called rapidity) are:

    \[ \boxed{\sinh \tanh^{-1} x=\dfrac{x}{\sqrt{1-x^2}}}\]

    \[ \boxed{\cosh \tanh^{-1} x=\dfrac{1}{\sqrt{1-x^2}}}\]

In fact, we also have:

    \[ \exp(x)=\sinh x+\cosh x\]

    \[ \exp(-x)=-\sinh x+\cosh x\]

    \[ \sec\mbox{h}^2x+\tanh^2 x=1\]

    \[ \coth ^2 x-\csc\mbox{h}^2 x=1\]

There are even more identities to be known. The most remarkable and important are likely to be:

    \[ \sinh (x\pm y)=\sinh (x)\cosh (y)\pm\sinh (y)\cosh (x)\]

    \[ \cosh (x\pm y)=\cosh (x)\cosh (y)\pm\sinh (x)\sinh (y)\]

    \[ \tanh (x\pm y)=\dfrac{\tanh x\pm \tanh y}{1\pm \tanh x\tanh y}\]

    \[ \coth (x\pm y)=\dfrac{\coth x\coth y\pm 1}{\coth y\pm \coth x}\]
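To see why these formulae matter in Special Relativity, here is a hedged Python sketch. It assumes the standard definition of the rapidity, \varphi=\tanh^{-1}\beta, and the function names are my own. With that definition the boxed formulae give \gamma=\cosh\varphi and \beta\gamma=\sinh\varphi, and the addition formula for \tanh becomes the relativistic composition of collinear velocities: rapidities simply add.

```python
import math

def gamma_from_beta(beta):
    """Lorentz factor as cosh(atanh(beta)) = 1/sqrt(1 - beta^2)."""
    return math.cosh(math.atanh(beta))

def beta_gamma_from_beta(beta):
    """beta*gamma as sinh(atanh(beta)) = beta/sqrt(1 - beta^2)."""
    return math.sinh(math.atanh(beta))

def add_velocities(beta1, beta2):
    """Compose two collinear velocities by adding their rapidities."""
    return math.tanh(math.atanh(beta1) + math.atanh(beta2))

print(gamma_from_beta(0.6), 1 / math.sqrt(1 - 0.6**2))
print(beta_gamma_from_beta(0.6), 0.6 / math.sqrt(1 - 0.6**2))
print(add_velocities(0.5, 0.5), (0.5 + 0.5) / (1 + 0.5 * 0.5))
```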

You can also relate euclidean trigonometric functions with hyperbolic trigonometric functions with the aid of complex numbers. For instance, we get

    \[ \sinh x=-i\sin ix\]

    \[ \cosh x=\cos ix\]

    \[ \tanh x= -i\tan ix\]

and so on. The hyperbolic models of geometry/trigonometry are also well known in the arts. Escher’s drawings are very beautiful and famous:


or the colorful variation of this theme


I love Escher’s drawings. And I also love Mathematics, Physics, Physmatics, and Science. Equations are cool. And hyperbolic functions, and other functions we have reviewed here today, will arise naturally in the next posts.

LOG#022. Kaniadakis and relativity.

Hello, I am back! After some summer rest/study/introspection! And after an amazing month of July with the Higgs discovery by ATLAS and CMS. After an amazing month of August with the arrival on Mars of the Curiosity rover, MSL (Mars Science Laboratory). After a hot summer in my home town…I have written lots of drafts these days…And I will be publishing all of them step by step.

Today we will discuss an interesting remark studied by Kaniadakis. He is known for his works on relativistic physics, condensed matter physics, and especially for his work on a cool function related to non-extensive thermodynamics. Indeed, Kaniadakis himself has proved that his entropy is also related to the mathematics of special relativity. Ultimately, his remarks suggest:

1st. Dimensionless quantities are the true fundamental objects in any theory.

2nd. A relationship between information theory and relativity.

3rd. The important role of deformation parameters and deformed calculus in contemporary Physics, and more and more in the future maybe.

4th. Entropy could be more fundamental than thought before, in the sense that non-extensive generalizations of entropy play a more significant role in Physics.

5th. Non-extensive entropies are more fundamental than the conventional entropy.

The fundamental object we are going to find is related to the following function:

    \[ \exp_\kappa (x)=\left( \sqrt{1+\kappa^2x^2}+\kappa x\right)^{1/\kappa}\]

Let me first imagine two identical particles ( of equal mass) A and B, whose velocities, momenta and energies are, in certain frame S:

    \[ v_A, p_A=p(v_A), E_A=E(v_A)\]

    \[ v_B, p_B=p(v_B), E_B=E(v_B)\]

In the rest frame of particle B, S’, we have

    \[ p'_B=0\]

    \[ p'_A=p_A-p_B\]

If we define a dimensionless momentum parameter

    \[ q=\dfrac{p}{p^\star}\]

    \[ \dfrac{p'_A}{p^\star}=\dfrac{p_A}{p^\star}-\dfrac{p_B}{p^\star}\]

we get, after the usual exponentiation,

    \[ \exp (q'_A)=\exp (q_A)\exp (-q_B)\]


Galilean relativity says that the laws of Mechanics are unchanged when passing from a frame at rest to a reference frame in uniform motion. Equivalently, galilean relativity in our context means the invariance under the change q'_A\leftrightarrow q_A, and it implies the invariance under the change q_B\rightarrow -q_B. In turn, plugging these into the previous equation, we get the known relationship

    \[ \exp (q)\exp (-q)=1\]

Wonderful, isn’t it? It is for me! Now, we will move to Special Relativity. In the S’ frame where B is at rest, we have:

    \[ v'_B=0, p'_B=0, E'_B=mc^2\]

and from the known relativistic transformations for energy and momentum

    \[ v'_A=\dfrac{v_A-v_B}{1-\dfrac{v_Av_B}{c^2}}\]

    \[ p'_A=\gamma (v_B)p_A-\dfrac{v_B\gamma (v_B)E_A}{c^2}\]

    \[ E'_A=\gamma (v_B)E_A-v_B\gamma (v_B)p_A\]

where of course we define

    \[ \gamma (v_B)=\dfrac{1}{\sqrt{1-\dfrac{v_B^2}{c^2}}}\]

    \[ p_B=m \gamma (v_B) v_B\]

    \[ E_B=m \gamma (v_B) c^2\]

After this introduction, we can parallel what we did for galilean relativity. After some easy algebra, we can write the previous transformation equations in the equivalent form

    \[ p'_A=p_A\dfrac{E_B}{mc^2}-E_A\dfrac{p_B}{mc^2}\]

    \[ E'_A=E_AE_B\dfrac{1}{mc^2}-\dfrac{p_Ap_B}{m}\]

Now, we can introduce dimensionless variables instead of the triple (v, p, E), defining instead the adimensional set (u, q, \epsilon):

    \[ \dfrac{v}{u}=\dfrac{p}{mq}=\sqrt{\dfrac{E}{m\epsilon}}=\vert \kappa \vert c=v_\star<c\]

Note that the so-called deformation parameter \kappa is indeed related (equal) to the beta parameter in relativity. Again, from the special relativity requirement \vert \kappa \vert c<c we obtain, as we expected, that -1< \kappa <+1. Classical physics, the galilean relativity we know from our everyday experience, is recovered in the limit c\rightarrow \infty, or equivalently, if \kappa \rightarrow 0. In the dimensionless variables, the transformation of energy and momentum we wrote above can be shown to be:

    \[ q'_A=\kappa^2q_A\epsilon_B-\kappa^2q_B\epsilon_A\]

    \[ \epsilon'_A=\kappa^2\epsilon_A\epsilon_B-q_Aq_B\]

In the rest frame of a particle, we get of course the result E(0)=mc^2, or in the new variables \epsilon (0)=\dfrac{1}{\kappa^2}. The energy-momentum dispersion relationship from special relativity, p^2c^2-E^2=-m^2c^4, becomes:

    \[ q^2-\kappa^2\epsilon^2=-\dfrac{1}{\kappa^2}\]


    \[ \kappa^4\epsilon^2-\kappa^2q^2=1\]

Moreover, we can rewrite the equation

    \[ q'_A=\kappa^2q_A\epsilon_B-\kappa^2q_B\epsilon_A\]

in terms of the dimensionless energy-momentum variable

    \[ \epsilon_\kappa (q)=\dfrac{\sqrt{1+\kappa^2q^2}}{\kappa^2}\]

and we get the analogue of the galilean addition rule, now for the dimensionless momenta

    \[ q'_A =q_A\sqrt{1+\kappa^2q_B^2}-q_B\sqrt{1+\kappa^2q_A^2}\]

Note that the classical limit is recovered again sending \kappa\rightarrow 0. Now, we have to define some kind of deformed exponential function. Let us define:

    \[ \exp_\kappa (q) =\left(\sqrt{1+\kappa^2q^2}+\kappa q\right)^{1/\kappa}\]

Applying this function to the above last equation, we observe that

    \[ \exp_\kappa (q'_A)=\exp_\kappa (q_A) \exp_\kappa (-q_B)\]

Again, relativity means that observers in uniform motion with respect to each other should observe the same physical laws, and so we should obtain invariant equations under the exchanges q'_A\leftrightarrow q_A and q_B\rightarrow -q_B. Plugging these conditions into the last equation implies that the following condition holds (and it can easily be checked from the definition of the deformed exponential).

    \[ \exp_\kappa (q)\exp_\kappa (-q)=1\]
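The two identities above are easy to check numerically. The following is a minimal sketch, not part of the original formalism: the function names exp_kappa and kappa_difference, and the sample values of \kappa, q_A and q_B, are arbitrary choices made only for illustration.

```python
import math

def exp_kappa(q, kappa):
    """Deformed exponential (sqrt(1 + k^2 q^2) + k q)^(1/k)."""
    return (math.sqrt(1.0 + (kappa * q) ** 2) + kappa * q) ** (1.0 / kappa)

def kappa_difference(q_a, q_b, kappa):
    """Deformed difference q'_A = q_A sqrt(1+k^2 q_B^2) - q_B sqrt(1+k^2 q_A^2)."""
    return (q_a * math.sqrt(1.0 + (kappa * q_b) ** 2)
            - q_b * math.sqrt(1.0 + (kappa * q_a) ** 2))

# Arbitrary illustrative values (assumptions, not taken from the text).
kappa, q_a, q_b = 0.3, 1.7, 0.6
q_prime = kappa_difference(q_a, q_b, kappa)

lhs = exp_kappa(q_prime, kappa)                        # exp_k(q'_A)
rhs = exp_kappa(q_a, kappa) * exp_kappa(-q_b, kappa)   # exp_k(q_A) exp_k(-q_B)
unit = exp_kappa(q_a, kappa) * exp_kappa(-q_a, kappa)  # should equal 1

print(lhs, rhs, unit)
```

Both sides of the composition law agree to floating-point precision, and the product \exp_\kappa (q)\exp_\kappa (-q) comes out equal to one.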

One interesting question is what the inverse of this deformed exponential is (the name q-exponential or \kappa-exponential is often found in the literature). It has to be some kind of deformed logarithm. And it is! The deformed logarithm, inverse to the deformed exponential, is the following function:

    \[ \ln_\kappa (q)=\dfrac{q^{\kappa}-q^{-\kappa}}{2\kappa}\]

Indeed, this function is related (in units with Boltzmann’s constant set to unity, k_B=1) to the so-called Kaniadakis entropy!

    \[ S_{K}=-\dfrac{q^{\kappa}-q^{-\kappa}}{2\kappa}\]

Furthermore, the equation \exp_\kappa (q)\exp_\kappa (-q)=1 also implies that

    \[ \ln_\kappa \left(\dfrac{1}{q}\right)=-\ln_\kappa (q)\]
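As a quick sanity check, one can verify numerically that the deformed logarithm undoes the deformed exponential and that it is odd under q\rightarrow 1/q. This is only a sketch: the helper names and the sample values are assumptions chosen for illustration.

```python
import math

def exp_kappa(q, kappa):
    """Deformed exponential (sqrt(1 + k^2 q^2) + k q)^(1/k)."""
    return (math.sqrt(1.0 + (kappa * q) ** 2) + kappa * q) ** (1.0 / kappa)

def ln_kappa(x, kappa):
    """Deformed logarithm (x^k - x^(-k)) / (2 k), defined for x > 0."""
    return (x ** kappa - x ** (-kappa)) / (2.0 * kappa)

kappa = 0.3   # arbitrary illustrative value
q = 1.25
x = 2.5

round_trip = ln_kappa(exp_kappa(q, kappa), kappa)        # should give back q
reflection = ln_kappa(1.0 / x, kappa) + ln_kappa(x, kappa)  # should vanish
ordinary_limit = ln_kappa(x, 1.0e-4)                     # close to math.log(x)

print(round_trip, reflection, ordinary_limit, math.log(x))
```

For a very small \kappa the deformed logarithm approaches the ordinary natural logarithm, which is the classical limit mentioned above.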

The gamma parameter of special relativity is also recast as

    \[ \gamma =\dfrac{1}{\sqrt{1-\kappa^2}}\]

More generally, in fact, the deformed exponentials and logarithms develop a complete calculus based on:

    \[ \exp_\kappa (q_A)\exp_\kappa (q_B)=\exp_\kappa (q_A\oplus q_B)\]

and the differential operators

    \[ \dfrac{d}{d_\kappa q}=\sqrt{1+\kappa^2q^2}\dfrac{d}{dq}\]

so that, e.g.,

    \[ \dfrac{d}{d_\kappa q}\exp_\kappa (q)=\exp_\kappa (q)\]
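The eigenfunction property can be checked with a finite-difference derivative. The sketch below is illustrative only: kappa_derivative is a hypothetical helper that applies the factor \sqrt{1+\kappa^2q^2} to a central-difference estimate of d/dq, and the step size h is an arbitrary choice.

```python
import math

def exp_kappa(q, kappa):
    """Deformed exponential (sqrt(1 + k^2 q^2) + k q)^(1/k)."""
    return (math.sqrt(1.0 + (kappa * q) ** 2) + kappa * q) ** (1.0 / kappa)

def kappa_derivative(f, q, kappa, h=1.0e-6):
    """Deformed derivative sqrt(1 + k^2 q^2) * df/dq, with df/dq by central differences."""
    ordinary = (f(q + h) - f(q - h)) / (2.0 * h)
    return math.sqrt(1.0 + (kappa * q) ** 2) * ordinary

kappa, q = 0.3, 0.8   # arbitrary illustrative values
value = exp_kappa(q, kappa)
deriv = kappa_derivative(lambda s: exp_kappa(s, kappa), q, kappa)

print(value, deriv)
```

Within the accuracy of the finite-difference estimate, the deformed derivative of \exp_\kappa reproduces \exp_\kappa itself.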

This Kaniadakis formalism is useful, for instance, in generalizations of Statistical Mechanics. It is becoming a power tool in High Energy Physics. At low energy, classical statistical mechanics gives a Maxwell-Boltzmann exponential factor for the distribution function:

    \[ f\propto \exp(-\beta E)=\exp (-\kappa E)\]

At high energies, in the relativistic domain, the Kaniadakis approach predicts that the distribution function departs from the classical exponential and becomes a power law:

    \[ f\propto E^{-1/\kappa}\]
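Both regimes can be seen directly in the deformed exponential. The sketch below assumes, for illustration only, that the distribution has the form \exp_\kappa (-x) with x a dimensionless energy. It compares the small-x value with the ordinary exponential and the large-x value with the power law (2\kappa x)^{-1/\kappa}, which is what \exp_\kappa (-x) tends to when \kappa x\gg 1.

```python
import math

def exp_kappa(q, kappa):
    """Deformed exponential (sqrt(1 + k^2 q^2) + k q)^(1/k)."""
    return (math.sqrt(1.0 + (kappa * q) ** 2) + kappa * q) ** (1.0 / kappa)

kappa = 0.2       # arbitrary illustrative deformation parameter
small_x = 0.01    # "low energy" sample point (assumption)
large_x = 1.0e4   # "high energy" sample point (assumption)

low_deformed = exp_kappa(-small_x, kappa)
low_classical = math.exp(-small_x)

high_deformed = exp_kappa(-large_x, kappa)
high_power_law = (2.0 * kappa * large_x) ** (-1.0 / kappa)

print(low_deformed, low_classical)
print(high_deformed, high_power_law)
```

The tail falls off as x^{-1/\kappa}, in agreement with the power law quoted above, while the low-energy end is indistinguishable from the Boltzmann factor.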

There are other approaches and entropies that could be interesting for additional deformations of special relativity. This formalism is also useful in the foundations of Physics, in the Information Theory approach that surrounds the subject nowadays. And of course, it is full of incredibly beautiful mathematics!

We can start from deformed exponentials and logarithms in order to get the special theory of relativity (reversing the order in which I have introduced this topic here). Aren’t you surprised?

LOG#021. Journey to the West (East).

I have chosen a very literary title for my article today. I hope you will forgive my weakness for exotic arts. The Journey to the West is one of the masterpieces of Chinese literature. Its main character is the Monkey King, Sun Wukong. He is a fun character, as you can see in these two artistic portraits:



For geeks only: Sun Wukong was the inspiration for the manga/anime masterpiece Dragon Ball/Dragon Ball Z, by A. Toriyama. Son Goku is indeed based on the legendary Sun Wukong.


The Journey to the West is Sun Wukong’s epic adventure. It is one of the Four Great Classical Novels of Chinese literature. These four novels are: Romance of the Three Kingdoms, Water Margin, Journey to the West, and The Plum in the Golden Vase (also called Golden Lotus). More recently, The Dream of the Red Chamber has replaced The Plum in the Golden Vase as a “great classical novel” in China.

Journey to the West was written by Wu Cheng’en in the 16th century, during the Ming Dynasty. It is about the voyage of the Monkey King, Sun Wukong, to India and his fantastic and heroic acts. Wikipedia says it clearly:

“(…)The novel is a fictionalised account of the legendary pilgrimage to India of the Buddhist monk Xuanzang, and loosely based its source from the historic text Great Tang Records on the Western Regions and traditional folk tales. The monk travelled to the “Western Regions” during the Tang Dynasty, to obtain sacred texts (sutras). The bodhisattva Avalokiteśvara (Guanyin) on instruction from the Buddha, gives this task to the monk and his three protectors in the form of disciples, namely Sun Wukong, Zhu Bajie and Sha Wujing, together with a dragon prince who acts as Xuanzang’s steed, a white horse. These four characters have agreed to help Xuanzang as an atonement for past sins(…).”

Western people have an analogous story in the historical (legendary?) voyages of Marco Polo to China… Travelling to the West and to the East is indeed a nice topic for the time dilation in special relativity and beyond (general relativity and non-inertial frames are also important). We will focus on the SR effect, although we will also make some remarks concerning important additional effects from rotating systems and general relativity.

In 1971, a very beautiful and inexpensive experiment was carried out to check the time dilation predicted by special relativity. The experiment is nowadays called the Hafele-Keating experiment, and the idea was pretty simple: you can use scheduled airplane flights to carry two very precise atomic clocks around the Earth, close to the equator, one on eastbound flights and the other one on westbound flights. This can be sketched in the following picture:


The calculations for the relativistic dilation factor require two velocity variables:

1st. The velocity of the Earth’s rotation at the equator (or very close to it). We denote it by v_E. Moreover, we know it is equal to v_E=40000km/24 h= 463 m/s.

2nd. The velocity of the airplanes carrying the atomic clocks, measured relative to the Earth’s surface, our S-frame.

In addition to these velocities, we have also to distinguish 4 different frames:

Frame A. Hypothetical observer moving due west at velocity v_E relative to the Earth’s surface, i.e., against the sense of the Earth’s rotation. For him, the sun is always at its zenith and the whole Earth is rotating underneath him. We can consider that he is at rest in a good inertial reference frame, because he does not participate in the circular motion around the Earth’s axis. The Earth’s motion around the sun is neglected for this frame.

Frame B. This is an observer on the ground. He is moving with v_E relative to the frame A.

Frame C. This observer rests in a plane heading east. He is moving at velocity v_E+v relative to A.

Frame D. This observer is placed in a plane heading west. He is moving at velocity v_E-v relative to A.

Obviously, the observers B, C, and D are not at rest in inertial frames! Therefore, we have to perform the relativistic calculations in A’s reference frame. Let us simplify the physical situation with some additional assumptions. Suppose that the two airplanes start simultaneously at the same point, for instance the point B above, and suppose that they are back at B simultaneously after they have completed the round trip around the Earth. In A’s frame, the flight time is equal to \Delta t. The proper times measured by B, C, and D are defined to be \tau_B, \,\tau_C,\,\tau_D and they are related to the flight time in the following way:

    \[\Delta t= \dfrac{\tau_B}{\sqrt{1-\left(\dfrac{v_E}{c}\right)^2}}=\dfrac{\tau_C}{\sqrt{1-\left(\dfrac{v_E+v}{c}\right)^2}}=\dfrac{\tau_D}{\sqrt{1-\left(\dfrac{v_E-v}{c}\right)^2}}\]

Since the velocities are small compared with the speed of light, we can make a Taylor expansion of the roots. After that expansion, we get the next formulae for the proper times:

    \[ \tau_B \approx \Delta t\left[1-\dfrac{1}{2}\left( \dfrac{v_E}{c}\right)^2\right]\]

    \[ \tau_C \approx \Delta t\left[1-\dfrac{1}{2}\left( \dfrac{v_E+v}{c}\right)^2\right]\]

    \[ \tau_D \approx \Delta t\left[1-\dfrac{1}{2}\left( \dfrac{v_E-v}{c}\right)^2\right]\]

These equations are enough to calculate the differences between the proper time elapsed in each airplane (eastbound and westbound) and the proper time on the ground:

    \[ \Delta \tau^{\mbox{east}}=\tau_C-\tau_B\approx -\dfrac{2v_Ev+v^2}{2c^2}\Delta t\]

    \[ \Delta \tau^{\mbox{west}}=\tau_D-\tau_B\approx +\dfrac{2v_Ev-v^2}{2c^2}\Delta t\]

If we impose a reasonable velocity for the airplane, say v=800km/h=222m/s, the flight in the A frame takes a time:

    \[ \Delta t\approx \tau_B=\dfrac{40000km}{800km/h}=50h=1.8\cdot 10^5s\]

This number provides, inserted into the previous expressions, the next time delays:

    \[ \Delta \tau^{\mbox{east}}=-255ns\]

    \[ \Delta \tau^{\mbox{west}}=+156ns\]
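The arithmetic behind these two numbers can be reproduced with a few lines. This is a minimal sketch using the values quoted in the text (40000 km circumference, 24 h day, 800 km/h airplanes); the variable names and the use of the exact value of c instead of a rounded one are choices made here for illustration.

```python
C = 299_792_458.0        # speed of light, m/s
V_EARTH = 4.0e7 / 86_400.0   # equatorial rotation speed, 40000 km / 24 h, m/s
V_PLANE = 800.0 / 3.6        # airplane ground speed, 800 km/h in m/s
FLIGHT_TIME = 50.0 * 3600.0  # 50 h in seconds

# Kinematic (special relativistic) differences with respect to the ground clock.
delta_east = -(2.0 * V_EARTH * V_PLANE + V_PLANE ** 2) / (2.0 * C ** 2) * FLIGHT_TIME
delta_west = +(2.0 * V_EARTH * V_PLANE - V_PLANE ** 2) / (2.0 * C ** 2) * FLIGHT_TIME

east_ns = delta_east * 1.0e9
west_ns = delta_west * 1.0e9
print(f"east: {east_ns:.1f} ns, west: {west_ns:.1f} ns")
```

Up to rounding of the input values, the output reproduces the delays of about -255 ns and +156 ns given above.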

Some conclusions are straightforward now:

1. On the eastbound flight, time does not elapse as fast as on the ground, as expected. Equivalently, the clock ticks are slower on the eastbound flight compared with the ticks on the ground.

2. On the westbound flight, time elapses faster. It is logical, since from the A frame viewpoint the airplane heading west has a lower velocity than the observer on the ground B!

I would like to add some additional cool stuff. There are two additional non-negligible corrections to be accounted for in this experiment:

1st. General relativistic corrections, i.e., the so called gravitational time dilation. Although we have not studied General Relativity, we can understand it from the viewpoint of the non-inertial frames. Gravity itself introduces an extra purely gravitational time delay when you move in the gravitational field. This gravitational time delay reads:

    \[ \Delta t_g=\dfrac{\tau}{\sqrt{1-\dfrac{2G_NM}{rc^2}}}\]

where \tau is the proper time, G_N is the gravitational constant, M is the Earth mass and r=R_T+h is the distance to the centre of the Earth. When the root is close to unity (weak gravitational fields), we can Taylor expand in order to get:

    \[ \Delta t_g=\dfrac{g}{c^2}(h-h_0) \Delta \tau\]

where g is the surface gravity and h_0 is the reference height (here, the ground).

2nd. The Sagnac effect. A rotating non-inertial frame suffers an extra time dilation correction:

    \[ t_1=\dfrac{\tau}{1-\dfrac{\omega \cdot \mathbf{r}}{c}}=\dfrac{2\pi r}{c}\dfrac{1}{1-\dfrac{\omega \cdot \mathbf{r}}{c}}\]

    \[ t_2=\dfrac{\tau}{1+\dfrac{\omega \cdot \mathbf{r}}{c}}=\dfrac{2\pi r}{c}\dfrac{1}{1+\dfrac{\omega \cdot \mathbf{r}}{c}}\]

Thus, we have

    \[ \Delta t=t_2-t_1=\Delta t_{Sagnac}=-\dfrac{4\pi r\omega\cdot \mathbf{r}}{c^2}\dfrac{1}{1-\dfrac{(\omega \cdot \mathbf{r})^2}{c^2}}\]

i.e. if \dfrac{\omega\cdot \mathbf{r}}{c}\rightarrow 0, expanding the scalar “dot” product, we finally get

    \[ \Delta t_{Sagnac}\approx -\dfrac{4\pi r^2\omega \cos\phi}{c^2}= -\dfrac{4\pi R^2\omega \cos^2\phi}{c^2}\]

Thus, General Relativity predicts an extra time delay due to the gravitational potential: the deeper a clock is positioned in the gravitational field of the source, the slower its time elapses. Supposing a typical cruising altitude of about 10000 m, we get an additional 196ns of elapsed time in the airplanes in the course of 50h, relative to the fixed ground. Therefore:

    \[ \Delta \tau^{east}=-255ns+196ns=-59ns\]

    \[ \Delta \tau^{west}=+156ns+196ns=+352ns\]
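The gravitational contribution and the two totals follow from the weak-field formula \dfrac{g}{c^2}(h-h_0)\Delta \tau. The sketch below assumes g=9.81 m/s^2 (a value not stated in the text) and takes the kinematic delays of -255 ns and +156 ns as given.

```python
C = 299_792_458.0            # speed of light, m/s
G_SURFACE = 9.81             # surface gravity, m/s^2 (assumed value)
ALTITUDE = 1.0e4             # cruising altitude above the ground, m
FLIGHT_TIME = 50.0 * 3600.0  # 50 h in seconds

# Weak-field gravitational time gain of the airborne clocks.
grav_ns = G_SURFACE * ALTITUDE * FLIGHT_TIME / C ** 2 * 1.0e9

KINEMATIC_EAST_NS = -255.0   # kinematic delay quoted in the text
KINEMATIC_WEST_NS = +156.0   # kinematic delay quoted in the text

net_east_ns = KINEMATIC_EAST_NS + grav_ns
net_west_ns = KINEMATIC_WEST_NS + grav_ns
print(f"gravitational: {grav_ns:.1f} ns, east: {net_east_ns:.1f} ns, west: {net_west_ns:.1f} ns")
```

The gravitational term comes out at roughly 196 ns, so the eastbound clock ends up losing about 59 ns while the westbound clock gains about 352 ns.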

From the exact flight data, Hafele and Keating themselves obtained the predictions:

    \[ \Delta \tau^{east}_{exp}=(-40\pm 23)ns\]

    \[ \Delta \tau^{west}_{exp}=(+275\pm 21)ns\]

The four atomic clocks measured

    \[ \Delta \tau^{east}_{clock}=(-59\pm 10)ns\]

    \[ \Delta \tau^{west}_{clock}=(+273\pm 7)ns\]

and hence, theory and experiment are in good agreement.

A last remark is also important. A full treatment of the Hafele-Keating experiment, including several paths (or sections) of flight, requires the use of the complete time delay (including all the relativistic and non-inertial corrections):

    \[ \Delta t_{HK}=\Delta \tau_{SR}+\Delta \tau_{g}+\Delta \tau_{Sagnac}\]


    \[ \displaystyle{\Delta \tau_{SR}=-\dfrac{1}{2c^2}\sum_{i=1}^{k}v_i^2\Delta \tau_i}\]

    \[ \displaystyle{\Delta \tau_{g}=\dfrac{g}{c^2}\sum_{i=1}^{k}(h_i-h_0)\Delta \tau_i}\]

    \[ \displaystyle{\Delta \tau_{Sagnac}=-\dfrac{\omega}{c^2}\sum_{i=1}^{k}r_i^2\cos^2 \phi_i\Delta \lambda_i}\]
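The three sums can be coded directly and checked against the simple one-segment estimate made before. The sketch below is only illustrative: the Segment class, the function hk_delay and the single equatorial segment are assumptions, r_i is approximated by the Earth radius, and the kinematic term is taken with the factor 1/2 that the Taylor expansion of the square roots produces.

```python
import math
from dataclasses import dataclass

C = 299_792_458.0                  # speed of light, m/s
G_SURFACE = 9.81                   # surface gravity, m/s^2 (assumed value)
OMEGA = 2.0 * math.pi / 86_400.0   # Earth's angular speed for a 24 h day, rad/s
R_EARTH = 4.0e7 / (2.0 * math.pi)  # radius from a 40000 km circumference, m

@dataclass
class Segment:
    speed: float       # ground speed v_i, m/s
    height: float      # altitude h_i - h_0 above the ground clock, m
    latitude: float    # phi_i, rad
    dlongitude: float  # Delta lambda_i, rad (positive eastward)
    duration: float    # Delta tau_i, s

def hk_delay(segments):
    """Return the (SR, gravitational, Sagnac) delays in seconds."""
    sr = -sum(s.speed ** 2 * s.duration for s in segments) / (2.0 * C ** 2)
    grav = G_SURFACE * sum(s.height * s.duration for s in segments) / C ** 2
    sagnac = -OMEGA * sum(
        (R_EARTH * math.cos(s.latitude)) ** 2 * s.dlongitude for s in segments
    ) / C ** 2
    return sr, grav, sagnac

# One idealized equatorial circuit in each direction (illustrative flight plan).
east = [Segment(800.0 / 3.6, 1.0e4, 0.0, +2.0 * math.pi, 50.0 * 3600.0)]
west = [Segment(800.0 / 3.6, 1.0e4, 0.0, -2.0 * math.pi, 50.0 * 3600.0)]

east_ns = sum(hk_delay(east)) * 1.0e9
west_ns = sum(hk_delay(west)) * 1.0e9
print(f"east: {east_ns:.1f} ns, west: {west_ns:.1f} ns")
```

For this idealized circuit the Sagnac term plays the role of the cross term 2v_Ev in the simple estimate, and the totals land close to the -59 ns and +352 ns obtained above. A realistic analysis would feed the function with one Segment per leg of the actual flights.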

LOG#020. e=mc². Notions of mass.


My article today is dedicated to the most celebrated equation in Physics. Strictly speaking, it is not ONE single equation, but 3 or 4 different equations, despite the fact that the concept and physical idea behind their simple look ARE “the same”. The popular and biased teaching of physical concepts by some authors, and the iconic figure of Einstein himself, have led to some very common and unfortunate misconceptions about what the equation means in several contexts. Of course, graduate students of Physics, physicists and experts in the theory of relativity are usually aware of these subtle issues, but not always. They generally know what they are doing, but even Einstein himself wondered about these concepts, so it can feel strange when you do not know what kind of mass people are talking about. Take care and think about it: the idea of mass is not a completely understood concept even in the 21st century!

My second purpose will be to explain some of the different notions of mass in classical physics and to introduce the issue of mass, its concept, as the essence of every current theory, either classical or quantum. I will not talk about the whole problem of mass in quantum theories, but I am trying to provide a broad perspective about one of the deepest concepts in Physics since the emergence of modern Physics: mass and inertia.

Let me go back to previous lessons I gave here. Special relativity, in the framework of Geometry, via spacetime vectors, merges the concepts of space and time into spacetime, and the concepts of momentum and energy into momenergy. Indeed, it is just a way of rethinking the classical concepts of space and time, momentum and energy, within a larger formalism. It does not say, a priori, anything new. It only explains that what we thought were independent objects are really related ideas. What did we learn? Well, we did learn that

    \[ \mathbb{P}=(E/c,\mathbf{P})\]


    \[ \mathbb{P}\cdot\mathbb{P}=-(mc)^2\]

Moreover, we found that this last equation says that:

    \[ E^2=(mc^2)^2+(\mathbf{P}c)^2\]

and that the total relativistic energy and the relativistic rest energy are given by:

    \[ E=Mc^2\]


    \[ e=mc^2\]

where the last equation is sometimes written as E_0=m_0c^2. However, since the squared rest mass IS the truly invariant quantity up to a multiplicative constant, Okun and other people have remarked that the most correct equation relating mass and energy is E_0=mc^2! The equation E=mc^2 is “confusing”, misleading, and not completely correct from the relativistic viewpoint. So, beware of the propaganda of the media! Motto: the “invariant mass” IS the fundamental and right notion of “mass” in special relativity, in spite of all you have read about this topic out there. It is NOT right that mass is velocity-dependent. That is wrong. Relativistic momentum IS NOT the rest mass times velocity in special relativity, it is something more complicated. However, special relativity says that there is a well defined notion of mass, an invariant quantity that, unlike the mass of classical mechanics, is not conserved in general.

Finally, we also learnt that the relativistic kinetic energy, defined as the difference between the total energy and the rest energy, is given by

    \[ K=\Delta E=\delta mc^2=(M-m)c^2\]

So we can write these 3/4 equations together in order to compare them a bit better:

    \[ \boxed{E=Mc^2}\]

    \[ \boxed{e=mc^2}\]

    \[ \boxed{K=\Delta E=\delta mc^2=(M-m)c^2}\]

    \[ \boxed{E^2=(mc^2)^2+(\mathbf{P}c)^2}\]

Note: Once again, the only two invariant relationships you really need in special relativity are

1st. The constitutive dispersion relationship E=E(\mathbf{P},m):

    \[ (\mathbf{P}c)^2-E^2=-m^2c^4\]

Some theories beyond SR could change this equation.

2nd. The constitutive relationship momentum-velocity \mathbf{P}=\mathbf{P}(\mathbf{v}, E):

    \[ \mathbf{P}=m\gamma \mathbf{v}\leftrightarrow \dfrac{\mathbf{P}c^2}{E}=\mathbf{v}\leftrightarrow \mathbf{P}=\mathbf{v}\dfrac{E}{c^2}\]

Again, some kinds of theories could change this last equation or its equivalent expressions.
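The two constitutive relationships are easy to test numerically in standard special relativity. The sketch below is illustrative: the function name energy_momentum, the approximate electron mass and the speed 0.6c are choices made here, not values taken from the text.

```python
import math

C = 299_792_458.0  # speed of light, m/s

def energy_momentum(mass, speed):
    """Return (E, P) for a particle of invariant mass `mass` moving at `speed`."""
    gamma = 1.0 / math.sqrt(1.0 - (speed / C) ** 2)
    return gamma * mass * C ** 2, gamma * mass * speed

mass = 9.109e-31   # approximate electron mass in kg (illustrative choice)
speed = 0.6 * C

energy, momentum = energy_momentum(mass, speed)

# 1st relationship: (Pc)^2 - E^2 = -m^2 c^4
dispersion = (momentum * C) ** 2 - energy ** 2
# 2nd relationship: v = P c^2 / E
recovered_speed = momentum * C ** 2 / energy

print(dispersion, -(mass * C ** 2) ** 2)
print(recovered_speed, speed)
```

The same check works for any mass and any speed below c, which is just another way of saying that m is the invariant quantity.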

A caution deserves special attention: in many textbooks, especially some old-fashioned books and papers, and in a really large number of popular books about physics and relativity, you find the controversial concept of “relativistic mass”:

    \[ M=\dfrac{m}{\sqrt{1-\dfrac{v^2}{c^2}}}\]

You observe that M gets bigger as the speed approaches the speed of light. The mass of a massive particle in motion, as seen from an inertial frame, is then increased. Be aware that I was deliberately “clever” with my notation, since some people use “m” instead of a different name for M. If you are not careful, you can confuse M with m. Some books do in fact make the distinction clearer by writing M=M(v,m) instead of M or m. Anyway, the important keypoint is that M(v,m)=M\neq m!

It is essential to understand what this stuff really means. Perhaps the first thing we should recall is what we called “rest mass” and “rest energy”. Indeed, I think it is the easiest concept to teach to anyone who has in mind Einstein’s mass-energy equation, or the concept of mass-energy equivalence that Special Relativity introduces. Simply speaking, energy and mass are two related entities, and the proportionality factor is the square of the speed of light. Equivalently, any particle with a non-zero mass has a rest energy given by:

    \[ e=mc^2 \leftrightarrow E_0=mc^2 \leftrightarrow \mbox{Rest mass-Rest energy equivalence/proportion}\]

Recall that the square of the speed of light is a big number: c^2\approx 9\cdot 10^{16}m^2/s^2. Therefore, a single kilogram has about 9\cdot 10^{16}J of rest energy!

On the other hand, remember that the rest mass is itself an invariant as well (it is frame independent) since \mathbb{P}\cdot\mathbb{P}=-(mc)^2. Let us continue. What happens in other inertial frames? I mean, how are mass and energy observed from an S-frame if you have a massive particle at rest in an S’-frame that moves with constant speed relative to S? Well, as we studied before, the relativistic momentum and the total energy are NOT independently invariant. They are boosted, i.e., they are transformed under a Lorentz transformation! In this way, the relativistic mass M=m\gamma means that, from the S-frame viewpoint, masses (or energies) in motion are “bigger” than the rest mass. Please, note that from the point of view of S’, its mass is m, not M, in accordance with the relativity principle, as we would expect. Even more, if S’ measures another mass that is in motion relative to S’ (for instance, one at rest in S), then from the S’ viewpoint that mass is the larger one. There is no contradiction between these views since the objects are different and they are placed in different frames. Therefore, the equation known as the relativistic mass expresses only the fact that the measures of mass are not frame independent. Indeed, that is the reason why the idea of invariant mass as the square of the momenergy is very important. The invariant mass IS invariant in any frame. The relativistic mass IS not. It is frame dependent! However, the total energy can be calculated in the SR framework and, perhaps surprisingly, it is equal to:

    \[ E=Mc^2=m\gamma c^2=\dfrac{mc^2}{\sqrt{1-\dfrac{v^2}{c^2}}}\]

Note that this equation has a different physical meaning than that of the rest mass-rest energy equivalence. This equation means that, in Special Relativity, the total energy, including possible potential energies and the rest energy mc^2, equals the relativistic mass times the square of the speed of light. It is highly non-trivial! Beyond that, the striking similarity with the previous rest mass-rest energy equivalence equation tends to confuse people who are not familiar with these details. These are subtle and important keypoints, and the popular spreading of the theory of relativity has caused lots of confusion about them among common people. Let me explain this issue in more detail. Imagine a system of two bodies, A and B, which interact via the electromagnetic field only. Their rest energies are E_A=m_Ac^2 and E_B=m_Bc^2. If the particles are in motion with respect to another inertial frame S, and their speeds with respect to that frame are u_A, u_B, their energies will be:

    \[ E_A=M_Ac^2=m_A\gamma_A c^2=\dfrac{m_Ac^2}{\sqrt{1-\dfrac{u_A^2}{c^2}}}\]

    \[ E_B=M_Bc^2=m_B\gamma_B c^2=\dfrac{m_Bc^2}{\sqrt{1-\dfrac{u_B^2}{c^2}}}\]

When the potential energy is added, the system total relativistic energy reads:

    \[ E=E_A+E_B+E_{pot}=\dfrac{m_Ac^2}{\sqrt{1-\dfrac{u_A^2}{c^2}}}+\dfrac{m_Bc^2}{\sqrt{1-\dfrac{u_B^2}{c^2}}}+E_{pot}\]

Take as an easy and natural example the hydrogen atom. It is made from a single proton and an electron (in its simplest isotope, of course). We could separate the energy into several terms

    \[ E=m_pc^2+E_{kin,p}+m_ec^2+E_{kin,e}+E_{pot}\]

or, rearranging terms,

    \[ E=(m_p+m_e)c^2+(E_{kin,p}+E_{kin,e}+E_{pot})=(\mbox{Total rest energy}+\mbox{Binding energy})\]


    \[ E=E_0^{total}+U_{bind}\]

In this case, the ground state of atomic hydrogen has the well-known binding energy of -13.6eV. The proton’s and the electron’s rest energies are, respectively, 938 MeV and 511keV. Then, the hydrogen atom’s total energy (equivalently, its total mass multiplied by the square of the speed of light) is slightly smaller than the sum of the rest energies (respectively, rest masses) of its two components! This phenomenon is called, quite conveniently, mass defect. And it explains the third equation

    \[ \Delta E= \delta mc^2\]

In our example, for the hydrogen isotope _1^1H, it amounts to about

    \[ \dfrac{13.6}{938\cdot 10^6+511\cdot 10^3}=1.45\cdot 10^{-6}\%\]
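This fraction takes one line to reproduce. The sketch below simply uses the rounded values quoted in the text (13.6 eV, 938 MeV and 511 keV); the variable names are illustrative.

```python
BINDING_ENERGY_EV = 13.6       # magnitude of the hydrogen ground-state binding energy
PROTON_REST_EV = 938.0e6       # proton rest energy, eV (rounded as in the text)
ELECTRON_REST_EV = 511.0e3     # electron rest energy, eV

# Relative mass defect of the hydrogen atom, as a percentage.
fraction = BINDING_ENERGY_EV / (PROTON_REST_EV + ELECTRON_REST_EV)
percent = 100.0 * fraction
print(f"relative mass defect: {percent:.3e} %")
```

The result is about one part in seventy million, which is why the mass defect is completely negligible in chemistry and atomic physics.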

This tiny fraction is compensated by the emission of a photon when the hydrogen atom is formed by the recombination of a single proton and an electron. This phenomenon is larger when heavy nuclei are formed from protons and neutrons. The reason is that not only electromagnetism matters. The nuclear strong interaction is important as well, and it is stronger and more important than the electromagnetic interaction. Indeed, consider the _2^4He nucleus, the nucleus of the most common helium atom. The sum of the rest masses of its constituents, two protons and two neutrons, is:

    \[ 2(m_p+m_n)=2\cdot (1.6726+1.6749)\cdot 10^{-27}kg=6.6950\cdot 10^{-27}kg\]

However, the true rest mass of the helium atom is lower: 6.6467\cdot 10^{-27}kg

The relative mass defect in this case is

    \[ \dfrac{\delta m}{m}=\dfrac{6.6950-6.6467}{6.6950}=0.72\%\]
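The same numbers give the energy released when the helium nucleus is assembled. The sketch below uses only the masses quoted in the text, reading the second mass, 1.6749\cdot 10^{-27}kg, as the neutron mass. The conversion factor from joules to MeV is the standard one, and the variable names are illustrative.

```python
C = 299_792_458.0                  # speed of light, m/s
JOULE_PER_MEV = 1.602176634e-13    # 1 MeV in joules

PROTON_MASS = 1.6726e-27           # kg, as quoted in the text
NEUTRON_MASS = 1.6749e-27          # kg, as quoted in the text
HELIUM_MASS = 6.6467e-27           # kg, measured value quoted in the text

constituents = 2.0 * (PROTON_MASS + NEUTRON_MASS)
defect = constituents - HELIUM_MASS            # delta m, kg
fraction_percent = 100.0 * defect / constituents
energy_mev = defect * C ** 2 / JOULE_PER_MEV   # delta m c^2 in MeV

print(f"defect: {defect:.3e} kg ({fraction_percent:.2f} %), energy: {energy_mev:.1f} MeV")
```

With these rounded inputs the released energy is of the order of a few tens of MeV per nucleus, millions of times larger than the 13.6 eV of the hydrogen atom.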

In general, the process of FUSION, the formation of light nuclei from protons and neutrons, and/or the formation of heavier nuclei from lighter nuclei, is associated with energy release. The new nuclear species created by this mechanism have a lower total energy (or total mass) than the sum of those of their constituents. Moreover, exothermic nuclear reactions can be found in many stars like our sun (turning hydrogen into helium, helium into carbon,…), in hydrogen bombs and tokamaks, and they will of course be the goal of the forthcoming International Thermonuclear Experimental Reactor (ITER).

The composition of heavy nuclei through endothermic nuclear reactions happens in Nature only when supernovae explode. Heavy nuclei formed by nuclear reactions in Nature (inside the stars) can be decomposed into lighter nuclei, a process associated with energy release. This is the known process called FISSION. A release of energy also happens in radioactive substances, which are unstable and decay into more stable and lighter nuclei. Thus, the mass-energy equivalence helped to explain the mysterious behaviour of radioactive elements/substances like those discovered by Pierre and Marie Curie. The nuclear fission of uranium and plutonium is used today in nuclear reactors all over the world, and nuclear energy is also a power supply for space probes like the New Horizons probe, or powerful robotic explorer labs like Curiosity (to land on Mars in August, 2012). As an additional curiosity, the energy released in the Hiroshima bomb was equivalent to a mass defect of as little as about 1 single gram!

In summary, there are three concepts behind the 3 equations above. Of course, experts know about this fact. Physics is about concepts, not about Mathematics: even when we use advanced high level Mathematics, the 3 equations mean different things. However, generally speaking, in spite of representing different ideas, the hidden concept is the same: the mass-energy equivalence in Special Relativity.

    \[ \boxed{E=Mc^2\leftrightarrow \mbox{Total relativistic energy}}\]

    \[ \boxed{e=mc^2\leftrightarrow \mbox{Relativistic rest energy}}\]

    \[ \boxed{K=\Delta E=\delta mc^2=(M-m)c^2\leftrightarrow \mbox{Mass defect}}\]

    \[ U=mc^2+K+E_p=mc^2+B.E.\leftrightarrow \mbox{Internal energy=rest mass+binding energy }\]


    \[ B.E.=K+E_p=(\mbox{Kinetic Energy+Potential Energy})\]


    \[ \boxed{E^2=(mc^2)^2+(\mathbf{P}c)^2\leftrightarrow \mbox{Total energy-Rest energy-Momentum equivalence}}\]

These equations represent one of the most impressive results of contemporary physics. Don’t forget that if you use cell phones, batteries or electricity, part of the energy you consume every day comes from commercial nuclear reactors. Nuclear energy has its dangers, but mastering energy production has caused a technological advance in our society. And it has not finished yet.

Well, now we will face a harder problem and issue in Modern Physics. What is the definition of mass? Relativity says only that it is equivalent to energy. But, what is energy? What is mass? Galileo and Newton wondered about this deep issue. Let me review the different notions of mass in classical physics and relativity:

    \[ \mbox{Classical mass}\begin{cases}\mbox{Inertial mass:} \;\; \mathbf{F}=m_i\mathbf{a} \\ \mbox{Momentum:}\;\; \mathbf{p}=m_i\mathbf{v}\;\;\rightarrow \mathbf{F}=\dfrac{d\mathbf{p}}{dt}\\ \mbox{Gravitational mass}\begin{cases} \mbox{Active:}\;\; \phi=\oint \mathbf{g}\cdot d\mathbf{S}=-4\pi G_NM_g^{active}\\ \mbox{Passive:}\;\; \mathbf{F}_g=m_g^{passive}\mathbf{g}\end{cases}\end{cases}\]

    \[ \mbox{Mass and relativity}\begin{cases}\mbox{Mass-Energy equivalence:}\;\; E=Mc^2\;\; \\ \mbox{Generalized Newton's law:}\;\; \mathbb{F}=m\mathbb{A}\;\\ \mbox{Momentum:}\;\; \mathbf{P}=M\mathbf{v}=m\gamma \mathbf{v}\;\;\rightarrow\mathbb{F}=\dfrac{d\mathbb{P}}{d\tau} \\ \mbox{Invariant mass:}\; m \rightarrow \;\;\mathbb{P}^2=-m^2c^2\;\; \rightarrow e=mc^2\\ \mbox{Relativistic mass:}\;\; M \begin{cases}\mbox{Transversal:}\; M_T=m\gamma \\ \mbox{Longitudinal:}\; M_L=\gamma^2M_T=\gamma^3m \end{cases}\end{cases}\]

Mass in Galilean and Newtonian physics is an interesting concept. Galileo’s law of inertia is indeed one of the celebrated Newton’s laws of Dynamics. Inertia is caused by mass, and hence it is called inertial mass m_i. This scalar magnitude or number is the multiplicative constant before the velocity in the definition of classical (linear) momentum. Therefore, according to the fundamental law of Dynamics, inertial mass is a measure of an object’s resistance to changing its state of motion when an external force is applied. This mass can be experimentally determined by applying some force to an object and measuring the acceleration caused by that force. For instance, an object with small inertial mass will accelerate more than an object with large inertial mass when acted upon by the same force. The greater the (inertial) mass, the greater the inertia. Mathematically speaking, this is expressed as the well known equation

    \[ F=m_ia\]

Active gravitational mass M_g^{active} is a measure of the strength of an object’s gravitational flux (the gravitational flux is equal to the surface integral of the gravitational field over an enclosing surface, i.e., the number of field lines that pass through a given test surface). The gravitational field can be measured by allowing a small test object/probe to fall freely and measuring its free-fall acceleration. For example, an object in free-fall near the Moon (or Mars) will experience a weaker gravitational field, and hence it accelerates more slowly than the same object would if it were in free-fall near the Earth. The gravitational field near the Moon (Mars) is weaker because the Moon (Mars) has less active gravitational mass.

    \[ \phi=\oint \mathbf{g}\cdot d\mathbf{S}=-4\pi G_NM_g^{active}\]

Passive gravitational mass is a measure of the strength of an object’s interaction with the gravitational field. It is something different from the previous concept. Passive gravitational mass is the proportionality constant between the object’s weight (i.e. the gravitational force) and its free-fall acceleration/gravitational field. Mathematically speaking,

    \[ \mbox{Weight}=\mathbf{F}_g=\mathbf{P}=m_g^{passive}\mathbf{g}\]

Two objects within the same gravitational field will experience the same acceleration. We will discuss this empirical fact when we study General Relativity in forthcoming articles. However, the object with a smaller passive gravitational mass will experience a smaller force (less weight) than the object with a larger passive gravitational mass. Please, do not confuse the weight’s symbol \mathbf{P}=\mathbf{F}_g with the momentum. Latin (or even Greek) letters are finite, so we sometimes run into trouble with the notation. Fortunately, once we know what the symbols represent, the context is generally enough to avoid this kind of mistake. Be aware anyway.

There are three additional notions of mass from the previous tables. In Special Relativity, we found the equivalence of mass and energy. Moreover we distinguished three extra new notions of mass: invariant mass, transversal mass (or relativistic mass) and longitudinal mass.

The invariant mass m is related to the squared momenergy vector. The transversal mass M_T=M=m\gamma is the mass that an observer at rest measures when he observes a massive particle (with rest mass m) moving at constant speed relative to him and the force acts in the direction perpendicular to the motion. When the force acts in the direction parallel to the velocity, the observer measures instead a longitudinal mass M_L=\gamma^2 M=\gamma^3 m. Furthermore, due to the mass-energy equivalence in SR, there are a lot of physical processes (like pair production, nuclear fusion, nuclear fission or the gravitational bending of light) showing how mass and energy can be exchanged or released in high energy (high velocity) experiments. Finally, although we have not studied the phenomenon yet, photons and other pure energy particles (with rest mass equal to zero) are shown to exhibit a behaviour similar to passive gravitational mass, i.e., photons are affected by gravitational fields as well.
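To get a feeling for the sizes involved, the transversal and longitudinal masses can be evaluated for a given speed using M_T=m\gamma and M_L=\gamma^3m from the table above. The sketch below is illustrative: the function name and the sample speed 0.8c are choices made here.

```python
import math

def relativistic_masses(rest_mass, beta):
    """Return (gamma, M_T, M_L) for a particle moving at speed beta * c."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    transversal = gamma * rest_mass         # M_T = m gamma
    longitudinal = gamma ** 3 * rest_mass   # M_L = gamma^3 m
    return gamma, transversal, longitudinal

# Unit rest mass and v = 0.8 c (illustrative values).
gamma, m_t, m_l = relativistic_masses(1.0, 0.8)
print(gamma, m_t, m_l)
```

At v=0.8c the longitudinal mass is already almost three times larger than the transversal one, while the invariant mass m stays the same for every observer.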

Let me put an example about “invariant mass” to clarify this important mass concept in Special Relativity.

Suppose a very heavy and massive particle A that decays to two very lightweight (possibly massless) particles B and C. What you do in order to find the mass of A is to measure the energies E_B and E_C, and the angle \theta_{BC} between the directions of motion of the two particles. Then, you can calculate this quantity

    \[ 2 E_B E_C (1 -\cos \theta_{BC})\]

and then you take the square root, and divide by the speed of light squared

    \[ m_A=\dfrac{\sqrt{2 E_B E_C (1 -\cos \theta_{BC})}}{c^2}\]

This answer is called the invariant mass of particles B and C, and it equals the mass of particle A. Note that it is not a trivial sum of the rest masses of the particles B and C, since B and C are created with some momentum (velocity).  Therefore, you can calculate the mass of particle A simply by knowing E_B, E_C,\theta_{BC}, and plugging their values into the above formula!

If particles B and C are neither massless nor extremely lightweight (as, e.g., neutrinos are), a more precise answer is obtained by replacing the previous quantity with this one:

    \[ (m_B)^2 c^4 + (m_C)^2 c^4 + 2 E_B E_C \left(1 - \dfrac{v_B v_C}{c^2} \cos \theta_{BC}\right)\]

where m_B and v_B are the mass and velocity of particle B (and similarly for C), and c is the speed of light.  Again, the square root of this quantity, divided by c^2, is the invariant mass, and this will be the mass of the original/mother particle A. Mathematically speaking:

    \[ m_A=\sqrt{(m_B)^2+(m_C)^2+2\dfrac{E_B E_C}{c^2}\left( 1-\dfrac{v_Bv_C}{c^2} \cos\theta_{BC}\right)}\]
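As a quick numerical check of the formula above, here is a minimal Python sketch. The function name `invariant_mass` and the choice of units with c = 1 (energies in GeV, masses in GeV/c^2) are my own assumptions for illustration. It uses the fact that, with c = 1, the product E_B E_C v_B v_C is just the product of the two momenta.

```python
import math

def invariant_mass(e_b, e_c, theta_bc, m_b=0.0, m_c=0.0):
    """Invariant mass of the pair (B, C) in GeV/c^2, in units with c = 1.

    e_b, e_c: energies in GeV; m_b, m_c: rest masses in GeV/c^2;
    theta_bc: angle between the two directions of motion, in radians.
    """
    # With c = 1, |p| = sqrt(E^2 - m^2) and E_B E_C v_B v_C = p_B p_C.
    p_b = math.sqrt(e_b**2 - m_b**2)
    p_c = math.sqrt(e_c**2 - m_c**2)
    m_squared = m_b**2 + m_c**2 + 2.0 * (e_b * e_c - p_b * p_c * math.cos(theta_bc))
    return math.sqrt(m_squared)

# Hypothetical input: two massless particles of 63 GeV each, back-to-back.
mass_a = invariant_mass(63.0, 63.0, math.pi)
print(mass_a)
```

For two massless particles the function reduces to the simpler expression with 2 E_B E_C (1 - \cos\theta_{BC}), and for two particles at rest it returns the plain sum of their rest masses.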

Finally, in order to make the ultimate claims about mass, momentum and energy in special relativity, I am going to borrow a nice example from physicist Matt Strassler. Suppose a particle 1 decays into two particles, called 2 and 3 (note that 1, 2 and 3 are mere labels to identify the three particles). Suppose we have 3 different observers seeing that decay: Amy, Bob, and Cid. Amy is moving down with respect to particle 1, Bob is moving to the left and Cid is stationary. What do they observe/measure? Moreover, we suppose the following conditions:

1st. Particle 1 has a mass (measured in its own reference frame) of about 126 GeV/c^2. Particles 2 and 3 are photons and their rest masses are zero.

2nd. Amy moves with v=4/5c. Then \gamma = 5/3.

3rd. Bob moves to the left with constant v=4/5c as well.

First Reference Frame. Cid’s frame, particle 1 is stationary in this frame.

Cid observes two photons with E_2=E_3=63GeV=E_1/2.

Cid observes particle 1 at rest, i.e., particle 1 is stationary, so p_1=0 and then the photon momenta are p_2=p_3=63GeV/c, and the photons move in opposite directions (it is sometimes said that they are back-to-back in this case), i.e., as vectors, p_2=-p_3.

Energy and momentum are conserved, but the mass of the objects is not, since the photons are massless and particle 1 was not. What about the mass of the system? What is the mass of the system of two photons? It isn’t zero. In fact it is obvious what it is. Just as for particle 1 itself (which initially made up the entire system), the system of two photons has the same energy and momentum as particle 1 did to start with,

    \[ E(total)=E_2+E_3 = 63 GeV + 63 GeV = 126 GeV\]

    \[ p(total)=p_2(up)+p_3(down) = 63 GeV/c-63 GeV/c = 0\]

And since p(total) = 0 for Cid,

    \[ m(total)=E(total)/c^2 =126GeV/c^2\]

which is the particle 1 mass. The total (or system’s) mass did not change during the decay, as we would expect with our physical intuition.

Second Reference Frame. Amy’s frame, particle 1 is not stationary in this frame.

Cid says particle 1 has p_1=0, E_1=126 GeV. What about Amy? She says (primed quantities are the ones Amy measures) that particle 1 has

    \[ p'_1=\gamma \dfrac{v}{c^2} E_1= \left(\dfrac{5}{3}\right)\left(\dfrac{4}{5}\right)\dfrac{E_1}{c}=168 GeV/c\]


    \[ E'_1=\gamma E_1=\dfrac{5}{3} E_1=210 GeV\]

Moreover, Amy observes in the decay that the photons have:

    \[ E'_2=\gamma (1+v/c)E_2=189 GeV\]


    \[ p'_2=E'_2/c=189 GeV/c\]

moving upward.

    \[ E'_3=\gamma (1-v/c)E_3=21 GeV\]


    \[ p'_3=E'_3/c=21 GeV/c\]

moving downward.

Conservation of energy and momentum works again since, for Amy:

    \[ E'_1=210 GeV\]


    \[ E'_1=E'_2+E'_3=189+21=210 GeV\]


    \[ p'_1=p'_2+p'_3=189+(-21)=168GeV/c\]


Note that the vector character of the momentum is important in order to get the right answer for the momentum conservation!

And the mass of the system is equal to the particle 1 mass both before and after the decay, because both before and after the decay we get

    \[ E(total)=210 GeV\]


    \[ p(total)=168GeV/c\]

with the total momentum pointing upward.

To obtain the invariant mass of particle 1, Amy has to use the square of the 4-momentum she measures:

    \[ 210^2-168^2=126^2=\mbox{invariant mass squared}\]
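Amy's numbers can be reproduced with a few lines of Python. This is only a sketch under my own conventions: units with c = 1, "up" as the positive direction, and a hypothetical helper `boost` that applies a Lorentz boost along the line of motion to Cid's values.

```python
import math

def boost(energy, momentum, beta):
    """Lorentz boost of (E, p) along the momentum axis, in units with c = 1.

    beta is the velocity of the original frame as seen by the new observer.
    """
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return gamma * (energy + beta * momentum), gamma * (momentum + beta * energy)

BETA = 4.0 / 5.0  # Amy moves down at 4/5 c, so Cid's frame moves up for her

# Cid's values (GeV and GeV/c) for particle 1 and for the two photons.
particle_1 = boost(126.0, 0.0, BETA)
photon_up = boost(63.0, 63.0, BETA)
photon_down = boost(63.0, -63.0, BETA)

# Totals for the two-photon system and its invariant mass.
total_energy = photon_up[0] + photon_down[0]
total_momentum = photon_up[1] + photon_down[1]
system_mass = math.sqrt(total_energy**2 - total_momentum**2)

print(particle_1, photon_up, photon_down, system_mass)
```

The signed momentum of the downward photon is what makes the totals match those of particle 1, which is the point made above about the vector character of momentum.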

Remark: interestingly, the sum of the so-called rest masses of the individual particles IS NOT conserved in the decay: for both Cid and Amy, it is 126 GeV/c^2 before the decay and zero after it.

Third Reference Frame. Bob’s frame, particle 1 is not stationary in this frame.

Now the same calculation that we did for Amy tells us that, for Bob, the particle 1 energy is E_1= 210 GeV and its momentum is p_1= 168 GeV/c but, unlike for Amy, for whom particle 1 is moving upward, for Bob the momentum of particle 1 points to the right. We have to work in “components” (i.e., we have to use vector calculus more directly, although I will keep things simple). In the following equations, unprimed quantities are those measured by Bob and primed quantities are those measured by Cid:

i) up-down part of p_1 = up-down part of p'_1

ii) \mbox{right-left part of}\,p_1= \gamma\left(\left(\mbox{right-left part of}\, p'_1\right) + v E'_1/c^2\right)

iii) E_1=\gamma (E'_1+v(\mbox{right-left part of}\,p'_1))

And these equations are going to be simpler than they look, because from Cid’s point of view, p' has no right-left part; all the momentum is either up or down. So Bob sees particle 1 having the following components:

i) up-down part of p_1=up-down part of p'_1=0

ii) right-left part of p_1=\gamma v E'_1/c^2=\left(\dfrac{5}{3}\right)\left(\dfrac{4}{5}\right)126GeV/c=168GeV/c, rightward.

iii) E_1=\gamma E'_1=(5/3)126GeV=210GeV, as we mentioned before.

In the same fashion, Bob calculates that the upward-going photon has

i) up-down part of p_2=up-down part of p'_2=63GeV/c upward.

ii) right-left part of p_2=\gamma v E'_2/c^2=\left(\dfrac{5}{3}\right)\left(\dfrac{4}{5}\right)63GeV/c=84GeV/c rightward.

iii) E_2=\gamma E'_2=\dfrac{5}{3}63GeV=105GeV.

with the formulas for the second photon being the same except that its up-down part points downward. Notice that E = p c for massless photons, and that if we use the Pythagorean theorem for the size p of each photon’s momentum, we obtain the consistent result

    \[ p_2^2=(\mbox{upward part of}\,p_2)^2+(\mbox{rightward part of}\,p_2)^2\]

Plugging in the numbers: (105GeV/c)^2=(63 GeV/c)^2+(84 GeV/c)^2. Please note that this is just the well-known Pythagorean rule for combining the components of a vector in order to get its modulus.

So again, Bob observes completely different energies and momenta than Amy and Cid. But Bob still observes that energy and momentum are conserved. Bob also sees that the system of two photons has a mass equal to the mass of particle 1. Why? The total up-down part of the system’s momentum is zero; it cancels between the two photons. The left-right part of the system’s momentum is 168 GeV/c; the total energy of the system is 210 GeV; and that’s just what Amy saw, with the only difference that she had the system’s momentum going up instead of to the right. So, like Amy, Bob also sees that the mass of the system of two photons is 126GeV/c^2, the invariant mass of the original particle.
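Bob's component-by-component bookkeeping can be sketched in Python as well. The helper `boost_x` is a hypothetical name of mine, the units have c = 1, and "right" and "up" are taken as the positive directions. The boost mixes the energy only with the right-left component and leaves the up-down component untouched.

```python
import math

def boost_x(energy, p_right, p_up, beta):
    """Boost along the right-left axis (c = 1); the up-down part is unchanged."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return (gamma * (energy + beta * p_right),
            gamma * (p_right + beta * energy),
            p_up)

BETA = 4.0 / 5.0  # Bob moves left at 4/5 c, so Cid's frame moves right for him

# Cid's values for the two photons: (E, right-left part, up-down part).
photon_up = boost_x(63.0, 0.0, 63.0, BETA)
photon_down = boost_x(63.0, 0.0, -63.0, BETA)

# Component-wise totals for the two-photon system, and its invariant mass.
total = tuple(a + b for a, b in zip(photon_up, photon_down))
system_mass = math.sqrt(total[0]**2 - total[1]**2 - total[2]**2)

# Each photon is massless, so its energy equals the size of its momentum.
photon_up_p = math.hypot(photon_up[1], photon_up[2])

print(photon_up, total, system_mass, photon_up_p)
```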


It is important to write some conclusions. The 3 observers, Amy, Bob and Cid:

1. Disagree about how much energy and momentum the particle 1 has and they also disagree about how much energy and momentum each of the two photons has.

2. Agree that energy and momentum are conserved in the decay, and that the mass of the system is conserved in the decay. They also agree that the invariant mass of the 2 photons system is 126 GeV/c^2.

3. Interestingly, they agree moreover that the sum of the masses of the objects in the system was not conserved; it has decreased to zero from 126 GeV/c^2.

This is no accident. Einstein knew that energy and momentum were conserved according to previous experiments, so he sought (and found) equations that would preserve this feature of the world. What is behind all this is the symmetry of the Poincaré group. This topic will be discussed in the near future in my blog too.

There are even two more notions of “mass” we have not discussed or sketched in previous schemes:

1st. Curvature of spacetime.

Spacetime has an intrinsic geometry and thus it has curvature. Curvature itself is a general relativistic manifestation of the existence of mass. Spacetime curvature is extremely weak and hard to measure. It was not discovered until after it was predicted by Einstein himself with his theory of general relativity, a relativistic theory of gravitation indeed. Extremely precise atomic clocks on the surface of the Earth are found to measure less time (run slower) than similar clocks in space (where clocks run faster). This difference in elapsed time is a manifestation of curvature called gravitational time dilation. Other forms of curvature have been measured using some experiments (like the Gravity Probe B satellite). Even worse, Einstein himself invented something called the cosmological constant. Roughly speaking, the cosmological constant can be understood as the energy/mass of “vacuum” spacetime. Imagine a 1 x 1 x 1 meter cube. Erase with some clever procedure every mass/energy from that cube. Then, the energy density of what remains there is not zero but equal to:

    \[ \rho_\Lambda=\dfrac{\Lambda c^4}{8\pi G}\]

The vacuum “weighs” something!
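To get a feeling for the size of this vacuum energy, here is a rough Python estimate. Beware that the value of the cosmological constant used below is an assumption of mine (of the order suggested by cosmological observations) and is not given anywhere in this post; the function name is hypothetical too.

```python
import math

C = 2.99792458e8   # speed of light, m/s
G = 6.674e-11      # Newton's constant, m^3 kg^-1 s^-2
LAMBDA = 1.1e-52   # ASSUMED cosmological constant, m^-2 (not from the text)

def vacuum_energy_density(cosmological_constant):
    """Energy density rho_Lambda = Lambda c^4 / (8 pi G), in J/m^3."""
    return cosmological_constant * C**4 / (8.0 * math.pi * G)

rho_vacuum = vacuum_energy_density(LAMBDA)
mass_in_cube = rho_vacuum / C**2   # equivalent mass (kg) inside a 1 m^3 cube

print(rho_vacuum, mass_in_cube)
```

With this assumed input, the emptied 1 x 1 x 1 meter cube would still hold an energy well below a nanojoule, i.e., an equivalent mass of only a few proton masses.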

2nd. Quantum mass.

Finally, we want to explain briefly how mass arises in quantum (and likely relativistic) theories. There are two contemporary notions of “quantum mass”. The first idea uses the Compton effect (to be explained here in the future) and the second explains mass as an interaction with the so-called Higgs field. In the first case, without entering into further details at this moment, quantum mass manifests itself as a difference between an object’s quantum frequency and its wave number. In this fashion, the quantum mass of an elementary particle, say an electron, is related to the Compton wavelength, can be determined through various forms of spectroscopy, and is also connected to the Rydberg constant, the Bohr radius and the classical electron radius. The quantum mass of larger objects can be directly measured using a watt balance and, in relativistic Quantum Mechanics, mass is one of the irreducible representation labels (quantum numbers) of the Poincaré group.
The second notion of mass in Quantum theories arises in (relativistic) Quantum Field Theories. There is a “bare” mass and a “dressed” mass as well. We won’t discuss these two masses here. We are going deeper. Mass terms coupled to fields are not naturally gauge invariant, and we are forced to introduce the so-called Higgs mechanism in order to give a non-zero mass to the gauge bosons W^+, W^-, Z and the fundamental fermions (leptons and quarks) while keeping gauge invariance safe. In the framework of the Higgs mechanism one gets:

    \[ M_W=\dfrac{1}{2}gv\approx g\cdot 123 GeV\]

    \[ M_Z=\dfrac{1}{2}g_Zv\approx g_Z\cdot 123 GeV\]

    \[ m_f=vg_f\dfrac{\sqrt{2}}{2}\approx g_f \cdot 174 GeV\]

    \[ M_H=v\sqrt{2}\lambda\approx \lambda \cdot 348GeV\]


    \[ v=\dfrac{1}{\sqrt{\sqrt{2}G_F}}\approx 246GeV\]

Here v, given by the last equation, is the so-called v.e.v. (vacuum expectation value) of the Higgs field (v is related to the Higgs boson mass), and g, g_Z, g_f are the coupling constants of the gauge bosons W, Z and the fermions to the Higgs boson, respectively. \lambda is the Higgs boson self-coupling. However, the gluons and the photon remain massless. The Higgs boson itself gets a mass due to non-linear self-interaction in the Standard Model framework. Therefore, there is no explanation of the Higgs boson mass in the SM, the currently accepted theory of the fundamental electroweak and strong interactions. Hence, the SM is an incomplete (or effective) theory and we do not understand the ultimate origin of mass. We have only shifted its final understanding to the yet mysterious origin of the Higgs field (and its Higgs particle constituents).
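The numerical prefactors in the relations above follow from the Fermi constant alone. A minimal Python sketch, assuming the input value G_F ≈ 1.1663787 x 10^{-5} GeV^{-2} (this number is my own input, it is not quoted in the post), and with variable names chosen just for illustration:

```python
import math

G_FERMI = 1.1663787e-5   # Fermi constant in GeV^-2 (ASSUMED input value)

# Higgs vacuum expectation value, v = 1 / sqrt(sqrt(2) G_F), in GeV.
vev = 1.0 / math.sqrt(math.sqrt(2.0) * G_FERMI)

w_prefactor = vev / 2.0                    # M_W = g * w_prefactor
fermion_prefactor = vev / math.sqrt(2.0)   # m_f = g_f * fermion_prefactor
higgs_prefactor = vev * math.sqrt(2.0)     # M_H = lambda * higgs_prefactor

print(vev, w_prefactor, fermion_prefactor, higgs_prefactor)
```

The actual masses then depend on the values of the couplings g, g_Z, g_f and \lambda, which I do not fix here.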

This is not the whole story, since, for instance, protons (or neutrons) get their masses not only through the constituent masses of their quarks. Quantum Chromodynamics (QCD) is the gauge theory of interacting quarks and gluons, and it is a non-abelian gauge theory. Thus, some part of the proton mass is due to nonlinear QCD interactions, not exclusively (indeed not mainly) to the interactions of quarks and Higgs bosons! We leave this discussion here, including a final graph with some concepts of mass we discussed here:


PS: A final test to see if you understood and learnt what the mass-energy equivalence in Special Relativity is. Okun himself created a test about which is the truly fundamental mass-energy relationship in special relativity, and he gave 4 options (note that Okun uses another notation for M and m, but there is only one invariant mass!):

a) E=mc^2




Hint: Being purist and consistent with the full Lorentz invariance of special relativity, only one expression is right. It is the one that allows us to recover the correct Newtonian limit and to preserve the notion of “invariant mass” without using confusing, obsolete and misleading ideas like “relativistic mass” or “velocity dependent mass”. Indeed, the answer was my original motivation in the selection of the first image of this article.

Scroll to see the answer…



The answer is c, E_0=e=mc^2. Note that the subscript zero is irrelevant and it only confuses things, just as the two notions of mass do if you are not careful enough.

LOG#019. Triangle mnemonics.

Today, we are going to learn some interesting mnemonic tricks using the celebrated Pythagorean Theorem from your early school years. We will be using some simple triangles to remember some of the wonderful formulae of Special Relativity. It is quite nice and surprising that basic Euclidean trigonometry tools are very helpful in the abstract realm of relativity, a theory based on non-Euclidean geometry! This is one of the great powers of Mathematics: its amazing ability to model big things with simple pictures and equations. There is no other language like the mathematical language and, despite looking terrible sometimes, it is awesome and beautiful most of the time. Of course, as happens in the Arts, mathematical beauty can be very subjective. So, you have to be trained in order to admire and love its great features.

The first triangle we are going to study is this one:


If we apply the Pythagorean Theorem to this triangle, we get:

    \[ X^2+(c\tau)^2=(ct)^2\]

This equation is indeed related to the square or “norm” of the spacetime vector (a.k.a., spacetime event)

    \[ \mathbb{X}\cdot \mathbb{X}=-c^2\tau^2=\mathbf{X}^2-c^2t^2\]

and thus, in complete agreement with the above triangle

    \[ \mathbf{X}^2+(c\tau)^2=(ct)^2\]

And it can be easily stated with words or concepts as:

    \[ \boxed{(\mbox{SPACE})^2+(\mbox{PROPER TIME})^2=(\mbox{TIME})^2}\]
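The mnemonic can be tried at once with the classic 3-4-5 triangle. In this small Python sketch the function name `proper_time` and the numbers of the example are my own choices; times are in seconds and distances in light-seconds, so that c = 1.

```python
import math

def proper_time(t, x, c=1.0):
    """Proper time tau from the triangle X^2 + (c tau)^2 = (c t)^2."""
    return math.sqrt((c * t)**2 - x**2) / c

# Assumed example: a traveller covers 3 light-seconds in 5 seconds (v = 0.6 c).
T_COORD = 5.0
X_SPACE = 3.0
tau = proper_time(T_COORD, X_SPACE)
gamma = T_COORD / tau   # time dilation factor t / tau

print(tau, gamma)
```

The traveller's own clock advances 4 seconds while the clock of the observer at rest advances 5 seconds.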

You also get:

    \[ \sin \varphi= \dfrac{c\tau}{ct}=\dfrac{\tau}{t}=\dfrac{1}{\gamma}\]

    \[ \cos \varphi=\dfrac{X}{ct}=\dfrac{v}{c}=\beta\]

    \[ \tan \varphi = \dfrac{c\tau}{X}=\dfrac{1}{\beta\gamma}\]


    \[ \boxed{\cos \varphi=\beta}\]


    \[ \gamma^2=\dfrac{1}{1-\beta^2}\]

Elementary trigonometry yields:

    \[ 1-\cos ^2\varphi=\sin^2\varphi\]

    \[ \boxed{\gamma^2=\dfrac{1}{\sin^2\varphi}}\]

In the same way, we can draw another cool triangle:


Now, we can relate the Pythagorean theorem with the squared momenergy vector:

    \[ \mathbb{P}\cdot \mathbb{P}=-(mc)^2=\mathbf{P}^2-\dfrac{E^2}{c^2}\]

    \[ (mc^2)^2+(\mathbf{P}c)^2=E^2\]

or, in agreement with the last triangle:

    \[ (mc^2)^2+(Pc)^2=E^2\]

In words, it means simply that

    \[ \boxed{(\mbox{MASS})^2+(\mbox{MOMENTUM})^2=(\mbox{ENERGY})^2}\]

There, K(v) is the relativistic kinetic energy we have studied before in this blog. From this triangle, simple trigonometry provides:

    \[ \sin \phi = \dfrac{1}{\gamma}\]

    \[ \cos \phi=\dfrac{Pc}{E}\]

    \[ \tan \phi=\dfrac{mc}{P}\]

    \[ \dfrac{E}{mc^2}=\gamma=\dfrac{1}{\sin \phi}\]

    \[ \dfrac{K}{mc^2}=\gamma - 1=\dfrac{1-\sin \phi}{\sin \phi}\]

    \[\dfrac{P}{mc}=\dfrac{1}{\tan \phi}\]
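These trigonometric relations can be checked numerically with the same 3-4-5 trick. A minimal Python sketch, in units with c = 1; the helper name `energy_triangle` and the values m = 3, P = 4 are assumptions made only for illustration:

```python
import math

def energy_triangle(m, p, c=1.0):
    """Return (E, phi, gamma, K) for rest mass m and momentum p."""
    rest_energy = m * c**2
    energy = math.hypot(rest_energy, p * c)   # hypotenuse of the triangle
    phi = math.atan2(rest_energy, p * c)      # angle opposite the mc^2 leg
    gamma = energy / rest_energy
    kinetic = energy - rest_energy            # relativistic kinetic energy
    return energy, phi, gamma, kinetic

# Assumed example in units with c = 1: m = 3 and P = 4.
energy, phi, gamma, kinetic = energy_triangle(3.0, 4.0)

print(energy, math.sin(phi), math.cos(phi), math.tan(phi), gamma, kinetic)
```

Here the angle is measured opposite to the mc^2 leg, which is the convention implied by \sin \phi = 1/\gamma.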

A more refined version of the above triangle is given by the following drawing (the relativistic kinetic energy is now written as T instead of K):


In summary, triangles are cool!

Remark(I): The relationship (mc^2)^2+(\mathbf{P}c)^2=E^2=E^2(P,m) is also called the dispersion relation in SR. Some theories beyond the Standard Model of Particle Physics and/or “extended” relativities can modify this equation. However, every known experiment seems to be consistent with SR and this dispersion relation, as far as I know.

Remark(II): Massless particles with m=0, i.e., particles with rest mass equal to zero, satisfy

    \[ \boxed{E=Pc}\]

They are called ultra-relativistic particles. This is the case of massless gauge bosons like photons, gluons and likely gravitons. It was thought that the neutrinos were massless too. In recent years, though, we have obtained conclusive evidence that neutrinos are not massless and that they have a very tiny mass. They are still ultra-relativistic particles, since we can indeed still use the equation E=Pc to a great degree of precision, but they are ultimately MASSIVE particles and, for them, the exact mass-energy equation reads

    \[ \boxed{E=\sqrt{(mc^2)^2+(\mathbf{P}c)^2}}\]

This equation is of course the energy-mass general rule for any massive particle like neutrinos, massive fermions, massive gauge bosons, the Higgs particle, and so on.
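To see how good the approximation E=Pc is for a very light particle, here is a small Python sketch. The rest energy of 0.1 eV and the momentum of 1 MeV/c are purely illustrative assumptions of mine, not measured neutrino values, and the function name is hypothetical. The code uses the identity E - Pc = (mc^2)^2/(E + Pc), which follows from the exact equation and avoids subtracting two nearly equal numbers.

```python
import math

def energy_excess_over_pc(m_c2, p_c):
    """Return E - Pc for rest energy m c^2 and momentum times c (same units).

    Uses E - Pc = (m c^2)^2 / (E + Pc) instead of a direct subtraction.
    """
    energy = math.hypot(m_c2, p_c)   # E = sqrt((m c^2)^2 + (P c)^2)
    return m_c2**2 / (energy + p_c)

# ASSUMED illustrative values: rest energy 0.1 eV, momentum 1 MeV/c.
M_C2_EV = 0.1
P_C_EV = 1.0e6
relative_excess = energy_excess_over_pc(M_C2_EV, P_C_EV) / P_C_EV

print(relative_excess)
```

For these assumed numbers the relative difference between E and Pc is close to (mc^2)^2/(2P^2c^2), far below what a direct energy measurement could resolve.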