## LOG#005. Lorentz transformations (I).

For physicists working with objects moving close to the speed of light, e.g., dealing with electromagnetic waves, the use of special relativity is essential.

The special theory of relativity is based on two simple postulates:

1st) Covariance or invariance of all physical laws (Mechanics, Electromagnetism, …) for all inertial observers (i.e., those moving with constant velocity relative to each other).

It means that there is no preferred frame or observer; only “relative” motion is meaningful when speaking of motion with respect to a certain observer or frame. Unfortunately, this generated a misnomer and a misconception in “popular” Physics when talking about relativity (“Everything is relative”). What is relative, then? The relative motion between inertial observers and its description using certain “coordinates” or “reference frames”. However, the true “relativity” theory introduces just about the opposite view: physical laws and principles are “invariant” and “universal” (not relative!). Einstein himself was aware of this, even though he contributed to the initial spreading of the name “special relativity”, understood as a generalized Galilean invariance that also contains the electromagnetic phenomena derived from Maxwell’s equations.

2nd) The speed of light is independent of the motion of the source or of the observer. Equivalently, the speed of light in vacuum is constant everywhere in the Universe.

No matter how fast you run, the speed of light is universal and invariant. Massive particles can never move at the speed of light, and two beams of light approaching each other do not exceed the speed of light either. Therefore, the usual rule for the addition of velocities is not exact. Special relativity provides the new rule for adding velocities.
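As a preview, the new rule (derived later in this thread, stated here without derivation; it is the standard special-relativistic composition law, and the function name below is mine) can be sketched in a few lines:

```python
# Relativistic composition of two collinear velocities:
# w = (u + v) / (1 + u*v/c**2).  Units: c = 1, so velocities
# are expressed as fractions of the speed of light.

def add_velocities(u, v, c=1.0):
    """Standard special-relativistic addition of collinear velocities."""
    return (u + v) / (1.0 + u * v / c**2)

# Two beams closing on each other at 0.9c each do NOT close at 1.8c:
w = add_velocities(0.9, 0.9)
print(w)  # ~0.9945, still below c

# Composing anything with c itself returns c, as the second postulate demands:
print(add_velocities(0.5, 1.0))  # 1.0
```

For small velocities the denominator is close to 1 and the Galilean rule $w=u+v$ is recovered, as expected.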

In this post, the first of a whole thread devoted to special relativity, I will review one of the easiest ways to derive the Lorentz transformations. There are many ways to “guess” them, but I think it is very important to keep the mathematics as simple as possible. And here simple means basic (undergraduate) Algebra and some basic Physics concepts from electromagnetism, Galilean physics and the use of reference frames. Also, we will limit ourselves here to 1D motion in the x-direction.

Let me begin! We assume we have two different observers and frames, denoted by S and S’. The observer in S is at rest, while the observer in S’ is moving at speed $v$ with respect to S. Classical Physics laws are invariant under the Galilean group of transformations:

$x'=x-vt, \qquad t'=t$

We know that the Maxwell equations for electromagnetic waves are not invariant under Galilean transformations, so we have to search for some deformation and generalization of the above Galilean invariance. This general and “special” transformation must reduce to the Galilean transformation whenever the relative velocity is tiny compared with the speed of light (of electromagnetic waves). Mathematically speaking, we are searching for transformations of the form:

$x'=\gamma (x-vt)$

and

$x=\gamma (x'+vt')$

for the inverse transformation. Here $\gamma=\gamma(c,v)$ is a function of the speed of light (denoted by c, and constant in every frame!) and of the relative velocity $v$ of S’ with respect to S. The requirement that special relativity reduce to Galilean relativity at small velocities imposes the condition:

$\displaystyle{\lim_{v \to 0} \gamma (c,v) =1}$

On the other hand, according to the second postulate of special relativity, light speed is constant in every reference frame. Therefore, the distance a light beam (or wave packet) travels in each frame is:

$x=ct$ in S, or equivalently $x^2=c^2t^2$

and

$x'=ct'$ in S’, or equivalently $x'^2=c^2t'^2$

Subtracting both equations, the squared spatial separations in the two frames are related to the squared time separations by

$x^2-x'^2=c^2(t^2-t'^2)$

Squaring the modified Galilean transformations, we obtain:

$x'^2=\gamma ^2(x-vt)^2 \rightarrow x'^2=\gamma ^2 (x^2+v^2t^2-2xvt) \rightarrow x'^2-\gamma ^2x^2+2\gamma ^2xvt=\gamma ^2 v^2t^2$

$x^2=\gamma ^2 (x'+vt')^2 \rightarrow x^2-\gamma ^2x'^2-2\gamma ^2x'vt'=\gamma ^2v^2t'^2$

The only “weird” term in the last two equations is the mixed term with “xvt” (or the x’vt’ term). So we have to perform some algebraic trickery to rewrite it. Fortunately for us, we do know that $x'=\gamma(x-vt)$, so

$x'=\gamma x -\gamma vt \rightarrow \gamma x'=\gamma ^2 x-\gamma ^2 vt \rightarrow \gamma xx'=\gamma ^2 x^2-\gamma ^2 xvt$

and thus

$2\gamma xx'=2 \gamma ^2 x^2-2\gamma ^2 xvt \rightarrow 2\gamma ^2 xvt =2\gamma ^2x^2-2\gamma xx'$

In the same way, we proceed with the inverse transformation:

$x=\gamma x'+\gamma vt' \rightarrow \gamma x=\gamma ^2x'+\gamma ^2vt' \rightarrow \gamma xx'=\gamma ^2x'^2+\gamma ^2x'vt'$

and thus

$2\gamma xx'=2\gamma^2x'^2+2\gamma^2x'vt' \rightarrow 2\gamma^2x'vt'=2\gamma xx'-2\gamma^2x'^2$

We got it! We can now substitute the mixed x-v-t and x’-v-t’ terms using the last expressions. In this way, we get the following equations:

$x'^2=\gamma ^2(x-vt)^2 \rightarrow x'^2=\gamma ^2(x^2+v^2t^2-2xvt) \rightarrow x'^2-\gamma ^2x^2+2\gamma ^2x^2-2\gamma xx'=\gamma ^2v^2t^2 \rightarrow x'^2+\gamma ^2x^2-2\gamma xx'=\gamma ^2v^2t^2$

$x^2=\gamma ^2(x'+vt')^2 \rightarrow x^2=\gamma ^2(x'^2+v^2t'^2+2x'vt') \rightarrow x^2-\gamma ^2x'^2+2\gamma ^2x'^2-2\gamma xx'=\gamma ^2v^2t'^2 \rightarrow x^2+\gamma ^2x'^2-2\gamma xx'=\gamma ^2v^2t'^2$

And now, the final stage! We subtract the first equation from the second one in the above pair:

$x^2-x'^2+\gamma ^2(x'^2-x^2)=\gamma ^2v^2(t'^2-t^2) \rightarrow (x'^2-x^2)(\gamma ^2-1)= \gamma ^2v^2(t'^2-t^2)$

But we know that $x^2-x'^2=c^2(t^2-t'^2)$, and so

$(x'^2-x^2)(\gamma ^2-1)= \gamma ^2v^2(t'^2-t^2) \rightarrow c^2(x'^2-x^2)(\gamma ^2-1)= \gamma ^2v^2(x'^2-x^2)$

then

$c^2(\gamma ^2-1)= \gamma ^2v^2 \rightarrow -c^2= -\gamma ^2c^2+\gamma ^2v^2 \rightarrow \gamma ^2=\dfrac{c^2}{c^2-v^2}$

or, as it is more commonly written:

$\gamma ^2=\dfrac{1}{1-\dfrac{v^2}{c^2}}$

and therefore

$\gamma =\dfrac{1}{\sqrt{1-\dfrac{v^2}{c^2}}}$

Moreover, we usually define the beta (or boost) parameter to be

$\beta = \dfrac{v}{c}$

To obtain the time transformation we need only recall that $x'=ct'$ and $x=ct$ for light signals; then, for the time coordinate, we obtain:

$x'=\gamma (x-vt) \rightarrow t' =x' /c= \gamma (x/c-vt/c)=\gamma ( t- vx/c^2)$

Finally, we put everything together to define the Lorentz transformations and their inverse for 1D motion along the x-axis:

$x'=\gamma (x-vt)$

$y'=y$

$z'=z$

$t'=\gamma \left( t-\dfrac{vx}{c^2}\right)$

and for the inverse transformations

$x=\gamma (x'+vt')$

$y=y'$

$z=z'$

$t=\gamma \left( t'+\dfrac{vx'}{c^2}\right)$
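As a quick numerical sanity check (my own sketch, not part of the original derivation), the transformations above can be coded directly. The boost leaves the quadratic form $x^2-c^2t^2$ invariant, and the inverse boost undoes the direct one:

```python
import math

def gamma(v, c=299_792_458.0):
    """Lorentz factor for a 1D boost along the x-axis."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)

def boost(x, t, v, c=299_792_458.0):
    """Lorentz transformation (x, t) -> (x', t')."""
    g = gamma(v, c)
    return g * (x - v * t), g * (t - v * x / c**2)

def inverse_boost(xp, tp, v, c=299_792_458.0):
    """Inverse Lorentz transformation (x', t') -> (x, t)."""
    g = gamma(v, c)
    return g * (xp + v * tp), g * (tp + v * xp / c**2)

c = 299_792_458.0
v = 0.6 * c          # gamma(0.6c) = 1.25
x, t = 1.0e8, 0.25   # an arbitrary event

xp, tp = boost(x, t, v)
# The quadratic form x^2 - c^2 t^2 is left invariant by the boost;
# the two printed numbers agree up to floating-point rounding:
print(x**2 - c**2 * t**2, xp**2 - c**2 * tp**2)

# The inverse transformation recovers the original event:
print(inverse_boost(xp, tp, v))  # ≈ (1e8, 0.25)
```

For $v\ll c$ the factor `gamma(v)` is indistinguishable from 1 and the Galilean transformation is recovered, matching the limit condition imposed above.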

ADDENDUM: THE EASIEST, FASTEST AND SIMPLEST DEDUCTION of $\gamma$ (that I know of).

If you don’t like those long calculations, there is a trick to simplify the above “derivation”. The principle of Galilean relativity, enlarged to include electromagnetic phenomena, implies the structure:

$x'=\gamma (x-vt)$

and

$x=\gamma (x'+vt')$

for the inverse.

Now, the second postulate of special relativity says that light signals travel in such a way that the light speed in vacuum is constant, so $t=x/c$ and $t'=x'/c$. Inserting these times into the last two equations:

$x'=\gamma (1-v/c)x$

and

$x=\gamma (1+v/c)x'$

Multiplying these two equations, we get:

$x'x =\gamma ^2(1+v/c)(1-v/c)xx'$.

If we consider any event beyond the initial tick (i.e., if we suppose $t\neq 0$ and $t'\neq 0$), the product $xx'$ will be different from zero, and we can cancel the factors on both sides to get what we know and expect:

$\gamma^2(1-v^2/c^2)=1$

i.e.

$\gamma = \dfrac{1}{\sqrt{1-\dfrac{v^2}{c^2}}}$

## LOG#004. Feynmanity.

The dream of every theoretical physicist, perhaps the most ancient dream of every scientist, is to reduce the Universe (or the Polyverse, if you believe we live in a Polyverse, also called the Multiverse by some quantum theorists) to a single set of principles and/or equations. Principles should be intuitive and meaningful, while equations should be as simple as possible, but no simpler, to describe every possible phenomenon in the Universe/Polyverse.

What is the most fundamental equation? What is the equation of everything? Does it exist? Indeed, this question was already formulated by Feynman himself in his wonderful Lectures on Physics! Long ago, Feynman gave us another example of his physical and intuitive mind facing the First Question in Physics (and no, the First Question is NOT “(…)Dr. Who?(…)”, despite the many Doctors who have faced it at different moments of Human History).

Today, we will travel through this old issue and the modest but intriguing and fascinating answer (perhaps too simple and general) that R. P. Feynman found.

Well, what does it look like? What is the equation of the Universe? Feynman’s idea is indeed very simple: a nullity condition! I call this a Feynman nullity, or feynmanity (a portmanteau), for short. The Feynman equation for the Universe is a feynmanity:

$\boxed{U=0}$

Impressed? Indeed, it is very simple. What is the problem then? As Feynman himself said, the problem is really a question of “order”, a “relational” one; a question of what theoretical physicists call “unification”. You can always put equations together, but when they are truly related, they “mix” somehow through suitable mathematical structures. Gluing “different” pieces and objects is not easy. I mean, if you collect every equation and recast them as feynmanities, you will realize that there is no a priori relation between them. However, it cannot be so in a truly unified theory. Think about electromagnetism. In 3 dimensions, we have 4 laws written in vectorial form, plus the gauge condition and electric charge conservation through a current. However, in 4D you realize that they are indeed simpler. The 4D viewpoint helps to understand electric and magnetic fields as the two sides of the same “coin” (the coin is a tensor). And thus, you can see the origin of the electric and magnetic fields through the Faraday-Maxwell tensor $F_{\mu \nu}$. Therefore, a higher-dimensional picture simplifies equations (something that has been remarked by physicists like Michio Kaku or Edward Witten) and helps you to understand the origin of the electric and magnetic fields from a second-rank tensor on an equal footing.

You can take every equation describing the Universe and set it equal to zero. But of course, this does not explain the origin of the Universe (if any), quantum gravity (yet to be discovered) or whatever. However, the remarkable fact is that every important equation can be recast as a feynmanity! Let me give some simple examples:

Example 1. The Euler identity in Mathematics. The most famous formula in complex analysis is a feynmanity: $e^{i\pi}+1=0$, or $e^{i\tau}-1=0$ if you prefer the constant $\tau=2\pi$.
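This feynmanity can be checked numerically in a couple of lines (floating-point arithmetic, so “zero” here means zero up to rounding error):

```python
import cmath
import math

# Euler's feynmanity e^{i*pi} + 1 = 0, checked numerically:
z = cmath.exp(1j * math.pi) + 1
print(abs(z))  # tiny residue of order 1e-16, i.e. zero in floating point

# The tau form e^{i*tau} - 1 = 0, with tau = 2*pi:
tau = 2 * math.pi
print(abs(cmath.exp(1j * tau) - 1))
```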

Example 2. The Riemann hypothesis. The most important unsolved problem in Mathematics (and number theory; Physics?) is the solution of the equation $\zeta (s)=0$, where $\zeta(s)$ is the celebrated Riemann zeta function of the complex variable $s=\kappa + i \lambda$, $\kappa, \lambda \in \mathbb{R}$. The trivial zeroes are placed on the real axis at $s=-2n$, $\forall n=1,2,3,...,\infty$. The Riemann hypothesis is the statement that every non-trivial zero of the Riemann zeta function lies on a line parallel to the imaginary axis, with real part equal to 1/2. That is, the Riemann hypothesis says that the feynmanity $\zeta(s)=0$ has non-trivial solutions iff $s=1/2\pm i\lambda _n$, $\forall n=1,2,3,...,\infty$, so that

$\displaystyle{\lambda_{1}=14.134725, \lambda_{2}= 21.022040, \lambda_{3}=25.010858, \lambda _{4}=30.424876, \lambda_{5}=32.935062, ...}$

I generally prefer to write the Riemann hypothesis in a more symmetrical and “projective” form. Non-trivial zeroes have the form $s_n=\dfrac{1\pm i \gamma _n}{2}$, with $\gamma_n=2\lambda_n$, so that, for me, non-trivial true zeroes are derived from projective-like operators $\hat{P}_n=\dfrac{1\pm i\hat{\gamma} _n}{2}$, $\forall n=1,2,3,...,\infty$. Thus

$\gamma_1 =28.269450, \gamma_2= 42.044080, \gamma_3=50.021216, \gamma _4=60.849752, \gamma_5=65.870124,...$

Example 3. Maxwell equations in special relativity. Maxwell equations have been formulated in many different ways along the history of Physics. Using tensor calculus, they can be written as 2 equations:

$\partial _\mu F^{\mu \nu}-j^\nu=0$

and

$\epsilon ^{\sigma \tau \mu \nu} \partial _\tau F_{\mu\nu}=\partial _\tau F_{\mu \nu}+ \partial _\nu F_{\tau \mu}+\partial_\mu F_{\nu \tau}=0$

Using differential forms:

$dF=0$

and

$d\star F-J=0$

Using Clifford algebra (Clifford calculus/geometric algebra, although some people prefer to talk about the “Kähler form” of Maxwell equations) Maxwell equations are a single equation: $\nabla F-J=0$ where the geometric product is defined as $\nabla F=\nabla \cdot F+ \nabla \wedge F$.

Indeed, in the Lorentz gauge $\partial_\mu A^\mu=0$, the Maxwell equations reduce to the spin-one field equations:

$\square ^2 A^\nu=0$

where we defined

$\square ^2=\square \cdot \square = \partial_\mu \partial ^\mu =\dfrac{\partial^2}{\partial x^i \partial x_i}-\dfrac{1}{c^2}\dfrac{\partial ^2}{\partial t^2}$

Example 4. Yang-Mills equations. The non-abelian generalization of electromagnetism can also be described by 2 feynmanities:

The current equation for YM fields is $(D^{\mu}F_{\mu \nu})^a-J_\nu^a=0$

The Bianchi identities are $(D _\tau F_{\mu \nu})^a+( D _\nu F_{\tau \mu})^a+(D_\mu F_{\nu \tau})^a=0$

Example 5. Noether’s theorems for rigid and local symmetries. Emmy Noether proved that when an r-parametric Lie group leaves the lagrangian quasi-invariant and the action invariant, a global conservation law (or first integral of motion) follows. It can be summarized as:

$D_iJ^i=0$ for suitable (generally differential) operators $D_i,J^i$ depending on the particular lagrangian (or lagrangian density), $\forall i=1,...,r$.

Moreover, she proved another theorem. The second Noether theorem applies to infinite-dimensional Lie groups. When the lagrangian is invariant (quasi-invariant is more precise) and the action is invariant under an infinite-dimensional Lie group parametrized by a set of arbitrary (gauge) functions (gauge transformations), then some identities between the equations of motion follow. They are called Noether identities and take the form:

$\dfrac{\delta S}{\delta \phi ^i}N^i_\alpha=0$

where the gauge transformations are defined locally as

$\delta \phi ^i= N^i_\alpha \epsilon ^\alpha$

with $N^i_\alpha$ certain differential operators depending on the fields and their derivatives up to a certain order. Noether’s theorems are so general that they can easily be generalized to groups more general than those of Lie type. For instance, Noether’s theorem for supersymmetric theories (involving Lie “supergroups”) and many other more general transformations can easily be built. That is one of the reasons theoretical physicists love Noether’s theorems. They are fully general.

Example 6. The Euler-Lagrange equations for a variational principle in Dynamics take the form $\hat{E}(L)=0$, where $\hat{E}$ is the so-called Euler operator for the considered physical system and L is the lagrangian (for a particle or a system of particles with finitely many degrees of freedom) or a lagrangian “density” $\mathcal{L}$ in the more general field-theory framework (where we have infinitely many degrees of freedom). Even the classical (and quantum) theory of (super)strings follows from a lagrangian (or, more precisely, a lagrangian density). Classical actions for extended objects do exist, and so do their “lagrangians”. The quantum theory of p-branes, $p=2,3,...$, is not yet built, but it surely exists, like M-theory, whatever it is.

Example 7. The variational approach to Dynamics or Physics implies a minimum (or, more generally, a “stationary”) condition for the action. Then the feynmanity for the variational approach to Dynamics is simply $\delta S=0$. Every known fundamental force can be described through a variational principle.

Example 8. The Schrödinger equation in Quantum Mechanics, $H\Psi-E\Psi=0$, for a certain hamiltonian operator H. Note that the feynmanity becomes $H=0$ itself when special relativity is studied in the hamiltonian formalism. Even more, in Loop Quantum Gravity, one important unsolved problem is the solution of the general hamiltonian constraint on the gauge “Wilson-like” loop variables, $\hat{H}\Psi=0$.

Example 9. The Dirac equation $(i\gamma ^\mu \partial_\mu - m) \Psi =0$ describes free spin-1/2 fields. It can also be easily generalized to interacting fields and even curved space-time backgrounds. The Dirac equation admits a natural extension, when the spinor describes a neutral particle that is its own antiparticle, through the Majorana equation

$i\gamma^\mu\partial_\mu \Psi -m\Psi_c=0$

Example 10. Klein-Gordon’s equation for spin 0 particles: $(\square ^2 +m^2 )\phi=0$.

Example 11. The Rarita-Schwinger spin-3/2 field equation: $\gamma ^{\mu \nu \sigma}\partial_{\nu}\Psi_\sigma+m\gamma^{\mu\nu}\Psi_\nu=0$. If $m=0$, with the general conventions for gamma matrices, it can alternatively be written as

$\gamma ^\mu (\partial _\mu \Psi_\nu -\partial_\nu\Psi_\mu)=0$

Note that antisymmetric gamma matrices verify:

$\gamma ^{\mu \nu}\partial_{\mu}\Psi_\nu=0$

More generally, every local (and non-local) field-theory equation for spin s can be written as a feynmanity, even for a theory that contains interacting fields of different spins (s=0, 1/2, 1, 3/2, 2, …). Thus, field equations have the general structure of a feynmanity (even with interactions and a potential energy U), and they are given by $\Lambda (\Psi)=0$ (where I don’t write the indices explicitly). I will not discuss here the quantum and classical consistency of higher-spin field theories (like those existing in Vasiliev’s theory), but field equations for arbitrary spin fields can be built!

Example 12. SUSY charges. Supersymmetry charges can be considered as operators that satisfy the conditions $\hat{Q}^2=0$ and $\hat{Q}^{\dagger 2}=0$. Note that Grassmann numbers, also called grassmannian variables (or anticommuting c-numbers), are “numbers” satisfying $\theta ^2=0$ and $\bar{\theta}^2=0$.
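A minimal toy illustration (mine, not a field-theoretic supercharge): the smallest matrix realization of a nilpotent operator, mimicking the condition $\hat{Q}^2=0$:

```python
# A 2x2 nilpotent matrix Q with Q^2 = 0, an illustrative toy model
# of the feynmanity satisfied by a SUSY charge.

def matmul(A, B):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

Q = [[0, 1],
     [0, 0]]          # raising-operator-like nilpotent matrix

print(matmul(Q, Q))   # [[0, 0], [0, 0]]  ->  Q^2 = 0
```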

Feynman’s conjecture that everything in a fundamental theory can be recast as a feynmanity seems very general, perhaps even silly, but it is quite accurate for the current state of Physics. In spite of the fact that the list of equations can seem unordered or unrelated, the simplicity of the general feynmanity (another of the relatively unknown, never-ending Feynman contributions to Physics)

$something =0$

is so great that it will likely remain forever in the future of Physics. Mathematics is so elegant and general that the feynmanity will survive further advances, unless a Feynman “inequality” (that we could perhaps call an unfeynmanity?) shows itself to be more important and fundamental than an identity. Of course, there are many important results in Physics, like the uncertainty principle or the second law of thermodynamics, that are not feynmanities (since they are inequalities).

Do you know more examples of important feynmanities?

Do you know any other fundamental physical laws or principles that cannot be expressed as feynmanities, and are therefore important unfeynmanities?

## LOG#003. Entropy.

“I propose to name the quantity S the entropy of the system, after the Greek word [τροπη trope], the transformation. I have deliberately chosen the word entropy to be as similar as possible to the word energy: the two quantities to be named by these words are so closely related in physical significance that a certain similarity in their names appears to be appropriate.”  Clausius (1865).

Entropy is one of the strangest and most wonderful concepts in Physics. It is essential for the whole of Thermodynamics and for understanding thermal machines. It is essential for Statistical Mechanics and the atomic structure of molecules and fundamental particles. From the Microcosmos to the Macrocosmos, entropy is everywhere: from the kinetic theory of gases to information theory, as we learned from the previous post; it is also relevant in the realm of General Relativity, where equations of state for relativistic and non-relativistic particles arise too. And even more, entropy arises in Black Hole Thermodynamics in a most mysterious form that nobody understands yet.

On the other hand, in Quantum Mechanics, entropy arises in the (von Neumann) density-matrix approach, the quantum incarnation of the classical version of entropy, ultimately related to the notion of quantum entanglement. I have no knowledge of any other concept in Physics that appears in such different branches of Physics. The true power of the concept of entropy is its generality.

There are generally three foundations for entropy, three roads to the meaning of entropy that physicists have:

– Thermodynamical Entropy. In Thermodynamics, entropy arises after integrating the heat with an integrating factor that is nothing but the inverse of the temperature. That is:

$\boxed{dS=\dfrac{\delta Q}{T}\rightarrow \Delta S= \dfrac{\Delta Q}{T}}$

The studies of thermal machines that arose as a logical consequence of the Industrial Revolution during the 19th century created the first definition of entropy. Indeed, following Clausius, the entropy change $\Delta S$ of a thermodynamic system absorbing a quantity of heat $\Delta Q$ at absolute temperature T is simply the ratio between the two, as the above formula shows! Armed with this definition and concept, Clausius was able to recast Carnot’s statement, that steam engines cannot exceed a specific theoretical optimum efficiency, into a much grander principle we know as the “2nd law of Thermodynamics” (sometimes called the Maximum Entropy, MAX-ENT, principle by other authors):

$\boxed{\mbox{The entropy of the Universe tends to a maximum}}$

The problem with this definition and this principle is that it leaves unanswered the most important question: what is really the meaning of entropy? Indeed, the answer to this question had to await the revival of the atomic theories of matter at the end of the 19th century.

– Statistical Entropy. Ludwig Boltzmann was the scientist who provided a fundamental theoretical basis for the concept of entropy. His key observation was that absolute temperature is nothing more than the average energy per molecular degree of freedom. This strongly implies that the Clausius ratio between absorbed energy and absolute temperature is nothing more than the number of molecular degrees of freedom. That is, Boltzmann’s greatest idea was indeed very simply put into words:

$\boxed{S=\mbox{Number of microscopical degrees of freedom}= N_{dof}}$

We can see a difference with respect to the thermodynamical picture of entropy: Boltzmann was able to show that the number of degrees of freedom of a physical system can be easily linked to the number of microstates $\Omega$ of that system. And it comes with a relatively simple expression from the mathematical viewpoint (using the 7th elementary arithmetical operation, the logarithm, beyond the better-known addition, subtraction, multiplication, division, powers and roots):

$\boxed{S \propto \log \Omega}$

The base of the logarithm is really absolutely conventional. Generally, the natural base is used (or the binary base; see below).

Why does it work? Why is the number of degrees of freedom related to the logarithm of the total number of available microscopical states? Imagine a system with one simple binary degree of freedom: a coin. Clone/copy it up to N of those systems. Then, we have got a system of N coins showing heads or tails. Each coin contributes one degree of freedom that can take two distinct values. So in total we have N (binary, i.e., head or tail) degrees of freedom. Simple counting tells us that each coin (each degree of freedom) contributes a factor of two to the total number of distinct states the system can be in. In other words, $\Omega = 2^N$. Taking the base-2 logarithm of both sides of this equation yields the logarithm of the total number of states to equal the number of degrees of freedom: $\log_2 \Omega = N$.

This argument can be made completely general. The key argument is that the total number of states  $\Omega$ follows from multiplying together the number of states for each degree of freedom. By taking the logarithm of  $\Omega$, this product gets transformed into an addition of degrees of freedom. The result is an additive entropy definition: adding up the entropies of two independent subsystems provides us the entropy of the total system.
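The coin-counting argument can be replayed literally in a few lines of Python (an illustrative sketch; brute-force enumeration is only feasible for small N):

```python
import math
from itertools import product

N = 10                                    # ten coins, one binary dof each
states = list(product("HT", repeat=N))    # enumerate every microstate
omega = len(states)

print(omega)                  # 2**10 = 1024
print(math.log2(omega))       # 10.0 -> one bit of entropy per coin

# Additivity: for independent subsystems Omega = Omega1 * Omega2, so the
# logarithm turns the product of state counts into a sum of entropies:
o1, o2 = 2**4, 2**6
print(math.log2(o1 * o2), math.log2(o1) + math.log2(o2))  # 10.0 10.0
```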

– Information Entropy.
Time machine to the 20th century. In 1948, Claude Shannon, an electrical engineer at Bell Telephone Laboratories, managed to mathematically quantify the concept of “information”. The key result he derived is that describing the precise state of a system that can be in states labelled by numbers $1,2,...,n$, with probabilities $p_1, p_2,...,p_n$, requires a well-defined minimum number of bits. In fact, the best one can do is to assign $\log_2 (1/p_i)$ bits to the event with state $i$. The result: statistically speaking, the minimum number of bits one needs to specify the system, regardless of its precise state, will be

$\displaystyle{\mbox{Minimum number of bits} = \sum_{i=1}^{n}p_i\log_2 (1/p_i) = -p_1\log_2 p_1-p_2\log_2 p_2-...-p_n\log_2 p_n}$

When applied to a system that can be in $\Omega$ states, each with equal  probability $p= 1/\Omega$, we get that

$\mbox{Minimum number of bits} = \log_2 \Omega$
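A small Python sketch of Shannon’s bit count (the helper name min_bits is mine):

```python
import math

def min_bits(probs):
    """Shannon's expected minimum number of bits, sum_i p_i*log2(1/p_i)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Uniform distribution over Omega states: minimum number of bits = log2(Omega)
omega = 8
uniform = [1.0 / omega] * omega
print(min_bits(uniform))            # 3.0 == log2(8)

# A biased distribution needs fewer bits on average:
print(min_bits([0.5, 0.25, 0.25]))  # 1.5
```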

We got it. A full century after the thermodynamic and statistical research, we are led to the simple conclusion that the Boltzmann expression $S = \log \Omega$ is nothing more than an alternative way to express:

$S = \mbox{number of bits required to define some (sub)system}$

Entropy is therefore a simple bit (or trit, quartit, pentit, …, p-it) count of your system: the number of bits required to completely determine the actual microscopic configuration among the total number of allowed microstates. In these terms, the second law of thermodynamics tells us that closed systems tend to be characterized by a growing bit count. Does it work? Yes, it does. Very well, as far as we know… Even in quantum information theory you have an analogue with the density matrix. It even works in GR, and it strangely works in Black Hole Thermodynamics, where entropy is the area of the horizon, temperature is the surface gravity at the horizon, and, mysteriously, BH entropy is proportional not to the volume, as one could expect from conventional thermodynamics (where entropy scales with the volume of the container), but to the area of the horizon. Incredible, isn’t it? That scaling of Black Hole entropy with the area was the origin of the holographic principle. But that is far from where my current post wants to go today.

Indeed, there is a subtle difference between the statistical and the informational entropy: a minus sign in the definition. (Thermodynamical) entropy can be understood as “missing” information:

$\boxed{\mbox{Entropy} = - \mbox{Information}}$

or mathematically

$S= - I$, or do you prefer $I+S=0$?

That is, entropy is the same thing as information, except for a minus sign! So, if you add the same thing to its opposite, you get zero.

The question that we naturally face in this entry is the following one: what is the most general mathematical formula/equation for “microscopic” entropy? Well, as with many other great problems in Physics, it depends on the axiomatics and your assumptions! Let’s follow Boltzmann during the 19th century. He cleverly suggested a deep connection between thermodynamical entropy and the microscopical degrees of freedom of the considered system: a connection between the entropy S of a thermodynamical system and the probability $\Omega$ of a given thermodynamical state. How can the functional relationship between S and $\Omega$ be found? Suppose that $S=f(\Omega)$. In addition, suppose that we have a system that can be divided into two pieces, with their respective entropies and probabilities $S_1,S_2,\Omega_1,\Omega_2$. If we assume that the entropy is additive, meaning that

$S_\Omega=S_1(\Omega_1)+S_2(\Omega_2)$

with the additional hypothesis that the subsystems are independent, i.e., $\Omega=\Omega_1\Omega_2$, then we can fix the functional form of the entropy in a very easy way: $S(\Omega)=f(\Omega_1\Omega_2)=f(\Omega_1)+f(\Omega_2)$. Do you recognize this functional equation from High School? Yes! Logarithms are involved in it. If you are not convinced, you can do the work with simple calculus, following the Fermi lectures on Thermodynamics. Let $x=\Omega_1$, $y=\Omega_2$:

$f(xy)=f(x)+f(y)$

Write now $y=1+\epsilon$; then $f(x+\epsilon x)=f(x)+f(1+\epsilon)$, where $\epsilon$ is a tiny infinitesimal quantity of first order. Thus, Taylor expanding both sides and neglecting terms beyond first-order infinitesimals, we get

$f(x)+\epsilon x f'(x)=f(x)+f(1)+\epsilon f'(1)$

For $\epsilon=0$ we obtain $f(1)=0$, and therefore $xf'(x)=f'(1)=k=\mbox{constant}$, where k is a constant, nowadays called Boltzmann’s constant. We integrate the differential equation

$f'(x)=k/x$ in order to obtain the celebrated Boltzmann equation for entropy: $S=k\log \Omega$. To be precise, $\Omega$ is not the probability, but the number of microstates compatible with the given thermodynamical state. To obtain the so-called Shannon-Gibbs-Boltzmann entropy, we must divide $\Omega$ by the number of possible dynamical states that agree with the microstate. The Shannon entropy functional form is then generally written as follows:

$\displaystyle{\boxed{S=-k \sum_i p_i\log p_i}}$

It reaches its maximum value when $p_i=1/\Omega$, i.e., when the probability distribution is uniform. There is a subtle issue related to the additive constant obtained from the above argument that is important in classical and quantum thermodynamics, but we will discuss that in the future. Now, we could be happy with this entropy functional, but indeed, the real issue is that we derived it from some a priori axioms that may look natural, yet they are not the most general set of axioms. And then our fascinating trip continues here today!

The previous considerations have been, more or less formally, carried out according to the so-called “Khinchin axioms” of information theory. That is, the Khinchin axioms are enough to derive the Shannon-Gibbs-Boltzmann entropy we wrote before. However, as happened with the axioms of Euclidean geometry, we can modify our axioms in order to obtain more general “geometries”, here more general “statistical mechanics”. We are going now to explore some of the best-known generalizations of the Shannon entropy. In what follows, for simplicity, we set Boltzmann’s constant to one (i.e., we work in a system of units with k=1). Is the above definition of entropy/information the only one that is interesting from the physical viewpoint? No. Indeed, there has been increasing activity on “generalized entropies” in the past years. Note, however, that we should recover the basic and simpler entropy (that of Shannon-Gibbs-Boltzmann) in some limit. I will review here some of the most studied entropic functionals of the last decades.
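A quick numerical illustration (my own sketch) that the Shannon-Gibbs-Boltzmann functional is maximized by the uniform distribution:

```python
import math

def shannon(probs, k=1.0):
    """Shannon-Gibbs-Boltzmann entropy S = -k * sum_i p_i ln p_i."""
    return -k * sum(p * math.log(p) for p in probs if p > 0)

uniform = [0.25] * 4
skewed  = [0.7, 0.1, 0.1, 0.1]

print(shannon(uniform))  # ln(4) ≈ 1.3863, the maximum over 4 states
print(shannon(skewed))   # ≈ 0.9404, strictly smaller
```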

The Rényi entropy.

It is a set of uniparametric entropies, now becoming more and more popular in works on entanglement and thermodynamics, with the following functional form:

$\displaystyle{ \boxed{S_q^R=\dfrac{1}{1-q}\ln \sum_i p_{i}^{q}}}$

where the sum extends over every microstate with non-zero probability $p_i$. It is quite easy to see that in the limit $q\rightarrow 1$ the Rényi entropy becomes the Shannon-Gibbs-Boltzmann entropy (it can be checked with a perturbative expansion around $q=1+\epsilon$ or using L’Hôpital’s rule).
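The limit can also be checked numerically rather than perturbatively; a short sketch (function names are mine):

```python
import math

def renyi(probs, q):
    """Rényi entropy S_q^R = ln(sum_i p_i^q) / (1 - q), for q != 1."""
    return math.log(sum(p ** q for p in probs)) / (1.0 - q)

def shannon(probs):
    """Shannon-Gibbs-Boltzmann entropy, the q -> 1 limit."""
    return -sum(p * math.log(p) for p in probs if p > 0)

p = [0.5, 0.3, 0.2]
for q in (2.0, 1.1, 1.01, 1.001):
    print(q, renyi(p, q))          # values creep toward the Shannon entropy
print("Shannon limit:", shannon(p))
```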

The Tsallis entropy.

Tsallis entropies, also called q-entropies by some researchers, are the uniparametric family of entropies defined by:

$\displaystyle{ \boxed{S_{q}^{T}=\dfrac{1}{1-q}\left( \sum_{i} p_{i}^{q}-1\right)}}$.

Tsallis entropy is related to Rényi's entropy through a nice equation:

$\boxed{S_q^T=\dfrac{1}{q-1}\left(1-e^{(1-q)S_q^R}\right)}$

and again, taking the limit $q\rightarrow 1$, Tsallis entropies reduce to the Shannon-Gibbs-Boltzmann entropy. Why consider a Statistical Mechanics based on Tsallis entropy and not on Rényi's? Without entering into mathematical details, the properties of Tsallis entropy make it more suitable for a generalized Statistical Mechanics of complex systems (in particular, due to the concavity of Tsallis entropy), as the seminal work of C. Tsallis showed. Indeed, Tsallis entropies had already appeared, unnoticed, in an unexpected place before Tsallis rediscovered them. In a paper, Havrda and Charvát introduced the so-called "structural $\alpha$-entropy", related to some cybernetical problems in computing.
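Since $\sum_i p_i^q=e^{(1-q)S_q^R}$, the Tsallis and Rényi entropies determine each other. A minimal numerical check, with a made-up distribution:

```python
import math

def tsallis(p, q):
    """Tsallis entropy S_q^T = (sum_i p_i^q - 1) / (1 - q)."""
    return (sum(pi ** q for pi in p) - 1.0) / (1.0 - q)

def renyi(p, q):
    """Rényi entropy S_q^R = ln(sum_i p_i^q) / (1 - q)."""
    return math.log(sum(pi ** q for pi in p)) / (1.0 - q)

p, q = [0.5, 0.3, 0.2], 2.0                      # toy values
lhs = tsallis(p, q)
# sum_i p_i^q = exp((1 - q) S_q^R), so Tsallis follows from Rényi:
rhs = (1.0 - math.exp((1.0 - q) * renyi(p, q))) / (q - 1.0)
print(abs(lhs - rhs) < 1e-12)                    # True
```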

Interestingly, Tsallis entropies are non-additive, meaning that they satisfy a “pseudo-additivity” property:

$\boxed{S_{q}^{\Omega}=S_q^{\Omega_1}+S_q^{\Omega_2}-(q-1)S_q^{\Omega_1}S_q^{\Omega_2}}$

This means that a Statistical Mechanics built on the Tsallis entropy is itself non-additive: the entropy of two independent subsystems is generally not the sum of their entropies. However, these entropies are usually called "non-extensive". Why? The definition of extensivity is different: the entropy of a given system is extensive if, in the so-called thermodynamic limit $N\rightarrow \infty$, $S\propto N$, where N is the number of elements of the given thermodynamical system. Therefore, additivity depends only on the functional relation between the entropy and the probabilities, but extensivity depends not only on that, but also on the nature of the correlations between the elements of the system. Testing entropic additivity is quite trivial, but checking extensivity for a specific system can be complex and very complicated. Indeed, Tsallis entropies can be additive for certain systems, and for some correlated systems they can become extensive, like the usual Thermodynamics/Statistical Mechanics. However, in the broader sense, they are generally non-additive and non-extensive. And it is the latter feature, their thermodynamical behaviour in the thermodynamic limit, from which the name "non-extensive" Thermodynamics arises.
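The pseudo-additivity property is easy to verify numerically for two independent subsystems, where the joint probabilities factorize. A small sketch, with arbitrary toy distributions:

```python
def tsallis(p, q):
    """Tsallis entropy with k = 1."""
    return (sum(pi ** q for pi in p) - 1.0) / (1.0 - q)

pA = [0.6, 0.4]                      # toy subsystem 1
pB = [0.7, 0.2, 0.1]                 # toy subsystem 2
q = 1.5
# Independent subsystems: joint probabilities factorize.
joint = [a * b for a in pA for b in pB]
lhs = tsallis(joint, q)
rhs = (tsallis(pA, q) + tsallis(pB, q)
       - (q - 1.0) * tsallis(pA, q) * tsallis(pB, q))
print(abs(lhs - rhs) < 1e-12)        # True
```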

Landsberg-Vedral entropy.

They are also called "normalized Tsallis entropies". Their functional form is the uniparametric family of entropies:

$\displaystyle{ \boxed{S_q^{LV} =\dfrac{1}{1-q} \left( 1-\dfrac{1}{\sum_i p_{i}^{q}}\right)}}$

They are related to Tsallis entropy through the equation:

$\displaystyle{ S_q^{LV}= \dfrac{S_q^T}{\sum_i p_i ^q}}$

It explains their alternative name of "normalized" Tsallis entropies. They satisfy a modified "pseudo-additivity" property:

$S_q^\Omega=S_q^{\Omega_1}+S_q^{\Omega_2}+(q-1)S_q^{\Omega_1}S_q^{\Omega_2}$

That is, in the case of normalized Tsallis entropies the rôle of $(q-1)$ and $-(q-1)$ is exchanged, i.e., $-(q-1)$ becomes $(q-1)$ in the transition from the Tsallis to the Landsberg-Vedral entropy.

Abe entropy.

This kind of uniparametric entropy is very symmetric. It is also related to some issues in quantum groups and fractal (non-integer) analysis. It is defined by the $q$ versus $1/q$ symmetric entropic functional:

$\displaystyle{ \boxed{S_q^{Abe}=-\sum_i \dfrac{p_i^q-p_i^{q^{-1}}}{q-q^{-1}}}}$

Abe entropy can be obtained from Tsallis entropy as follows:

$\boxed{S_q^{Abe}=\dfrac{(q-1)S_q^T-(q^{-1}-1)S_{q^{-1}}^{T}}{q-q^{-1}}}$

Abe entropy is also concave from the mathematical viewpoint, like the Tsallis entropy. It has some kind of "duality" or mirror symmetry due to its invariance under swapping q and 1/q.
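Both the $q\leftrightarrow 1/q$ mirror symmetry and the construction of the Abe entropy from two Tsallis entropies (at $q$ and at $1/q$) can be checked numerically. A small sketch with a toy distribution:

```python
def tsallis(p, q):
    return (sum(pi ** q for pi in p) - 1.0) / (1.0 - q)

def abe(p, q):
    """Abe entropy, symmetric under q <-> 1/q."""
    return -sum((pi ** q - pi ** (1.0 / q)) / (q - 1.0 / q) for pi in p)

p, q = [0.5, 0.3, 0.2], 2.0
# Mirror symmetry:
print(abs(abe(p, q) - abe(p, 1.0 / q)) < 1e-12)          # True
# Abe entropy built from two Tsallis entropies, at q and 1/q:
combo = ((q - 1.0) * tsallis(p, q)
         - (1.0 / q - 1.0) * tsallis(p, 1.0 / q)) / (q - 1.0 / q)
print(abs(abe(p, q) - combo) < 1e-12)                    # True
```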

Another well-known uniparametric entropic family is the Kaniadakis entropy, or $\kappa$-entropy. Related to relativistic kinematics, it has the functional form:

$\displaystyle{ \boxed{S_\kappa^{K}=-\sum_i \dfrac{p_i^{1+\kappa }-p_i^{1-\kappa}}{2\kappa}}}$

In the limit $\kappa \rightarrow 0$, the Kaniadakis entropy becomes the Shannon entropy. Also, writing $q=1+\kappa$ and $\dfrac{1}{q}=1-\kappa$, the Kaniadakis entropy becomes the Abe entropy. The Kaniadakis entropy, in addition to being concave, has further subtle properties, like being Lesche-stable. See the references below for details!
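A quick numerical illustration of the $\kappa\rightarrow 0$ limit, again with a made-up toy distribution:

```python
import math

def shannon(p):
    """Shannon-Gibbs-Boltzmann entropy (k = 1)."""
    return -sum(pi * math.log(pi) for pi in p)

def kaniadakis(p, kappa):
    """Kaniadakis kappa-entropy."""
    return -sum((pi ** (1.0 + kappa) - pi ** (1.0 - kappa)) / (2.0 * kappa)
                for pi in p)

p = [0.5, 0.3, 0.2]                        # toy distribution
for kappa in (0.5, 0.1, 0.01, 0.001):
    print(kappa, kaniadakis(p, kappa))     # tends to shannon(p) as kappa -> 0
print("Shannon:", shannon(p))
```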

Sharma-Mittal entropies.

Finally, we end our tour along entropy functionals with a biparametric family of entropies called Sharma-Mittal entropies. They have the following definition:

$\displaystyle{ \boxed{S_{\kappa,r}^{SM}=-\sum_i p_i^{r}\left( \dfrac{p_i^{1+\kappa}-p_i^{1-\kappa}}{2\kappa}\right)}}$

It can be shown that this entropy family contains many entropies as special subtypes. For instance, the Tsallis entropy is recovered if $r=\kappa$ and $q=1+2\kappa$. The Kaniadakis entropy is obtained by setting $r=0$. The Abe entropy is the subcase with $\kappa=\frac{1}{2}(q-q^{-1})$ and $r=\frac{1}{2}(q+q^{-1})-1$. Isn't it wonderful? There is an alternative expression of the Sharma-Mittal entropy, taking the following form:

$\displaystyle{ \boxed{S_{r,q}^{SM}=\dfrac{1}{1-r}\left[\left(\sum_i p_i^q\right)^{\frac{1-r}{1-q}}-1\right]}}$

In this functional form, the SM entropy recovers the Rényi entropy for $r\rightarrow 1$, and it becomes the Tsallis entropy if $r\rightarrow q$. Finally, when both parameters approach 1, i.e., $r,q\rightarrow 1$, we recover the classical Shannon-Gibbs-Boltzmann entropy. It is left as a nice exercise for the reader to relate the above two SM entropy functional forms and to derive the Kaniadakis, Abe and Landsberg-Vedral entropies for particular values of $r,q$ from the second definition of the SM entropy.
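As a sketch of the exercise, here is a numerical check of the $r\rightarrow 1$ and $r\rightarrow q$ limits, using the second functional form (the deformed power acting on the whole sum $\sum_i p_i^q$) and a made-up distribution:

```python
import math

def sharma_mittal(p, r, q):
    """Second form of the SM entropy (the power acts on the whole sum)."""
    s = sum(pi ** q for pi in p)
    return (s ** ((1.0 - r) / (1.0 - q)) - 1.0) / (1.0 - r)

def renyi(p, q):
    return math.log(sum(pi ** q for pi in p)) / (1.0 - q)

def tsallis(p, q):
    return (sum(pi ** q for pi in p) - 1.0) / (1.0 - q)

p, q = [0.5, 0.3, 0.2], 2.0
print(sharma_mittal(p, 0.999, q), renyi(p, q))        # r -> 1: Rényi
print(sharma_mittal(p, q + 1e-3, q), tsallis(p, q))   # r -> q: Tsallis
```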

However, entropy as a concept is yet very mysterious. Indeed, it is not clear yet if we have exhausted every functional form for entropy!

Non-extensive Statistical Mechanics and its applications are becoming more and more important and well-known among theoretical physicists. It has a growing number of uses in High-Energy Physics, condensed matter, Quantum Information and general Physics. The Nobel laureate Murray Gell-Mann has dedicated his last years of research to the world of non-extensive entropy. At least since his book The Quark and the Jaguar, Murray Gell-Mann has progressively moved into this fascinating topic. In parallel, this field has also produced some other interesting approaches to Statistical Mechanics, such as the so-called "superstatistics". Superstatistics, a kind of superposition of statistics, was introduced by the physicist Christian Beck.

The latest research on the foundations of entropy functionals is related to something called "group entropies", the transformation group of superstatistics, and the rôle of group transformations on non-extensive entropies. It provides feedback between different branches of knowledge: group theory, number theory, Statistical Mechanics, and Quantum Statistics… And even a connection with the classical Riemann zeta function arises!

WHERE DO I LEARN ABOUT THIS STUFF and MORE if I am interested in it? You can study these topics in the following references:

This entry is mainly based on the following article by Christian Beck:

1) Generalized information and entropy measures in physics by Christian Beck. http://arXiv.org/abs/0902.1235v2

If you get interested in Murray Gell-Mann works about superstatistics and its group of transformations, here is the place to begin with:

2) Generalized entropies and the transformation group of superstatistics. Rudolf Hanel, Stefan Thurner, Murray Gell-Mann.

http://arxiv.org/abs/1103.0580

If you want to see a really nice paper on group entropies and zeta functions, you can read this really nice paper by P.Tempesta:

3) Group entropies, correlation laws and zeta functions. http://arxiv.org/abs/1105.1935

4) C. Tsallis himself maintains a nice bibliography related to non-extensive entropies on his web page.

The “Khinchin axioms” of information/entropy functionals can be found, for instance, here:

5) Mathematical Foundations of Information Theory, A. Y. Khinchin. Dover. Pub.

Some questions to be answered by current and future scientists:

A) What is the most general entropy (entropy functional) that can be built from microscopic degrees of freedom? Are they classical/quantum, or is that distinction irrelevant for the ultimate substrate of reality?

B) Is every fundamental interaction related to some kind of entropy? How and why?

C) If entropy is "information loss" or "information" (only a minus sign makes the difference), and the current and modern interpretation of Quantum Mechanics says that it is about information, is there some hidden relationship between mass-energy, information and entropy? Could it be used to build Relativity and QM from a common framework? Are QM and (General) Relativity then emergent, and likely the two sides of a more fundamental theory based only on information?

## LOG#002. Information and noise.

As a teenager I enjoyed that old game in which a message is whispered into your ear and you pass it on to another person, that one to another, and so on. Today you can see it at big scale on Twitter. Hey! The final message is generally very different from the original one! This simple example illustrates the other side of communication or information transmission: "noise" (or, equivalently, inefficiency). The storage or transmission of information is generally not completely efficient: you can lose information. Roughly speaking, every amount of information carries some quantity of noise that depends on how you transmit the information (you can include noiseless transmission as a subtype of information process in which no information is lost). Indeed, this is also why we age. Our DNA, which is continuously replicating itself thanks to the metabolism (ultimately possible thanks to sunlight), gets progressively corrupted by free radicals and different "chemicals" that make our cellular replication more and more inefficient. Doesn't this remind you of something you know from High School? Yes! I am thinking about Thermodynamics. Indeed, the reason why Thermodynamics has been a main topic from the 19th century until now is simple: the quantity of energy is constant, but its quality is not. Therefore, we must be careful to build machines/engines that are energy-efficient with the available energy sources.

Before going into further details, you are likely wondering what information is! It is a set of symbols, signs or objects with some well-defined order. That is what information is. For instance, the word ORDER is giving you information. A random permutation of those letters, like ORRDE or OERRD, is generally meaningless. I said information was "something", but I didn't go any further! Well, here is where Mathematics and Physics appear. Don't run far away! The beauty of Physics and Maths, or as I like to call them, Physmatics, is that concepts, intuitions and definitions, rigorously made, are well enough to satisfy your general requirements. Something IS a general object, or a set of objects with a certain order. It can be a certain DNA sequence coding how to produce a certain substance (e.g., a protein) our body needs. It can be a simple or complex message hidden in a highly advanced cryptographic code. It is whatever you are recording on your DVD (a new OS, a movie, your favourite music, …) or any other storage device. It can also be what your brain is learning how to do. That is "something", or really whatever. You may say this is an obscure and weird definition. Really, it is! It can also be what electromagnetic waves transmit. Is it magic? Maybe! It has always seemed magic to me how you can browse the internet thanks to your Wi-Fi network! Of course, it is not magic. It is Science. Digital or analog information can be seen as long ordered strings of 1's and 0's, making "bits" of information. We will not discuss bits in this log. Future logs will…

Now, we have to introduce the concepts through some general ideas we have mentioned and know from High School. Firstly, Thermodynamics. As everybody knows, and as you have experienced, energy cannot be completely turned into useful "work". There is a quality in energy. Heat is the most degraded form of energy. When you turn on your car and burn fuel, you know that some of the energy is transformed into mechanical energy and a lot of energy is dissipated as heat into the atmosphere. I will not talk about the details of the different cycles engines can realize, but you can learn more about them in the references below. Symbolically, we can state that

$\begin{pmatrix} AVAILABLE \\ENERGY\end{pmatrix}=\begin{pmatrix}TOTAL \;\;ENERGY \\SUPPLIED\end{pmatrix} - \begin{pmatrix}UNAVAILABLE \\ENERGY\end{pmatrix}$

The great thing is that an analogue relation in information theory  does exist! The relation is:

$\boxed{\mbox{INFORMATION} = \mbox{SIGNAL} - \mbox{NOISE}}$

Therefore, there is some subtle analogy, and likely some deeper idea, behind all this stuff. How do physicists play this game? It is easy. They invent a "thermodynamic potential"! A thermodynamic potential is a gadget (mathematically, a function) that relates a set of different thermodynamic variables. For all practical purposes, we will focus here on the so-called Gibbs "free energy". It allows us to measure how useful a "chemical reaction" or "process" is. Moreover, it also gives a criterion of spontaneity for processes at constant pressure and temperature. But that is not important for the present discussion. Let's define the Gibbs free energy G as follows:

$G= H - TS$

where H is called enthalpy, T is the temperature and S is the entropy. You can identify these terms with the previous concepts. Can you see the similarity between this equation and the one written above in terms of energy and communication concepts? Information is something like "free energy" (do you like freedom? Sure! You will love free energy!). Thus, noise is related to entropy and temperature, to randomness, i.e., to something that does not store "useful information".
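To make the analogy concrete, here is a toy computation of G. The numbers are invented purely for illustration, not real thermodynamic data:

```python
# Gibbs free energy G = H - T*S, with invented illustrative numbers:
H = 100.0    # enthalpy, in kJ (hypothetical)
T = 300.0    # temperature, in K (hypothetical)
S = 0.25     # entropy, in kJ/K (hypothetical)
G = H - T * S
print(G)     # 25.0 kJ of "useful" (free) energy; T*S = 75.0 kJ is unavailable
```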

The Internet is also a source of information and noise. There are lots of good readings, but there is also spam. Spam is not really useful for you, is it? Recalling our thermodynamic analogy, since the first law of thermodynamics says that the "quantity of energy" is constant and the second law says something like "the quality of energy, in general, decreases", we have to be aware of information/energy processing. You find that there are signals and noise out there. This is also important, for instance, in High Energy Physics or Particle Physics. In a collision process, you have to distinguish the events that are a "signal" from a generally big "background".

We will learn more about information(or entropy) and noise in my next log entries. Hopefully, my blog and microblog will become signals and not noise in the whole web.

Where could you get more information? 😀 You have some good ideas and suggestions in the following references:

1) I found the analogy between Thermodynamics and Information many years ago in this cool book (easy to read even for non-experts):

Applied Chaos Theory: A paradigm for complexity. Ali Bulent Cambel. Academic Press; 1st edition (November 19, 1992).

Unfortunately, in those times, as an undergraduate student, my teachers were not very interested in this subject. What a pity!

2) There are some good books on Thermodynamics, I love (and fortunately own) these jewels:

Concepts in Thermal Physics, by Stephen Blundell, OUP. 2009.

A really self-contained book on Thermodynamics, Statistical Physics and topics not included in standard books. I really like it very much. It includes some issues related to the global warming and interesting Mathematics. I enjoy how it introduces polylogarithms in order to handle closed formulae for the Quantum Statistics.

Thermodynamics and Statistical Mechanics. (Dover Books on Physics & Chemistry). Peter T. Landsberg

A really old-fashioned and weird book. But it has some insights to make you think about the foundations of Thermodynamics.

Thermodynamics, Dover Pub. Enrico Fermi

This really tiny book is delicious. I learned a lot of fun stuff from it. Basic, concise and completely original, as Fermi himself was. Are you afraid of him? Me too! E. Fermi was a really exceptional physicist and lecturer. Don't lose the opportunity to read his lectures on Thermodynamics.

Mere Thermodynamics. Don S. Lemons. Johns Hopkins University Press.

Other  great little book if you really need a crash course on Thermodynamics.

Introduction to Modern Statistical Physics: A Set of Lectures. Zaitsev, R.O. URSS publishings.

I have read and learned some extra stuff from URSS ed. books like this one. Russian books on Science are generally great and uncommon. And I enjoy some very good, poorly known books written by generally unknown Russian scientists. Of course, you have always known about the Landau and Lifshitz books, but there are many other Russian authors who deserve your attention.

3) Information Theory books. Classical information theory books for your curious minds are

An Introduction to Information Theory: Symbols, Signals and Noise. Dover Pub. 2nd Revised ed. 1980.   John. R. Pierce.

A really nice and basic book about classical Information Theory.

An introduction to Information Theory. Dover Books on Mathematics. F.M.Reza. Basic book for beginners.

The Mathematical Theory of Communication. Claude E. Shannon and W. Weaver. Univ. of Illinois Press.

A classical book by one of the fathers of information and communication theory.

Mathematical Foundations of Information Theory. Dover Books on Mathematics. A.Y.Khinchin.

A “must read” if you are interested in the mathematical foundations of IT.

## LOG#001. A brand new blog.

Hello, world! Hello, blogosphere!

You are surely invited  to share my digital Odyssey through the Neverending Story of Science…