Updated Mw-Mt Higgs search plot from Sven May 14, 2008

Posted by dorigo in mathematics, science.

Since I am currently preparing the slides of the talk I will give next week at PPC2008, a conference being held in Albuquerque on the interconnection between particle physics and cosmology, I have my hands full with material that would be perfect for this blog. The talk is a review of new results from the CDF experiment, and there is literally a ton of them! The hard part for me is sorting out the stuff that is really the very best and most worthy of being shown at this particular event.

I am however not posting here direct CDF results, but rather a plot that my friend Sven Heinemeyer was kind to produce according to my directives, which I meant to allow me to summarize in my talk the status of electroweak fits by separating the main contributions of experimental measurements. In the graph below, showing the dependence of the Higgs boson mass on the value of W boson and top quark masses, you can see several different regions highlighted with black, blue, and magenta lines. The black lines bracket the LEP II determination of the W mass; the blue ellipse describes the Tevatron measurements of the two parameters, and the magenta hatched “wing profile” area shows the allowed values of the two quantities according to electroweak fits performed using LEP I and SLD determinations of electroweak parameters.

Also shown in the plot is the SM-allowed range (in red), where the Higgs boson has a mass varying between the lower LEP II limit of 114.4 GeV (upper border of the red hatched area) and 400 GeV (lower border), and the SUSY allowed region (hatched green), which shows the zone allowed by different choices of some of the many SUSY parameters, in particular the mass of supersymmetric particles.

Now, let me make a few points on the plot above.

  1. Although precise, the indirect experimental input shown in the plot is still incapable of discriminating between SM and SUSY - and it probably never will by itself, since LHC will soon rule out or find SUSY before it shrinks the ellipse sizably (ok, ok, I am neglecting the possibility of split SUSY…)
  2. The celebrated LEP I / SLD data looks obsolete from this particular vantage point, in light of the more recent direct measurements; this would however be an unfair interpretation, given that electroweak fits have many more parameters than just W and top quark masses.
  3. The LEP I / SLD data is obsolete as far as the top quark is concerned: in the plot it does not even appear to constrain it if compared with the ultra-precise (+-0.8%) Tevatron determination!
  4. The top quark mass has been bouncing up and down a bit, although always well within errors, in the last 5 years, from 178 to 170 to 172.4 GeV. This has slightly moved up and down the preferred value of fit Higgs mass in the SM. However, as the ellipse shrinks, this is becoming less of an issue. In fact, to justify the effort of producing the best possible top mass measurement, we used to say that a 1 GeV precision on the top mass was equivalent to a 7 MeV precision on the W mass as far as the knowledge we would obtain on MH was concerned, based on the slope of the Higgs contours in the plot above. Now that the error on top mass is well below 2 GeV, however, it becomes clear that we will not gain much knowledge by increasing the precision much further. The W mass has become one of the main players in the game of precision SM fits now!
  5. The ellipse includes 68% of the area of the two-dimensional gaussian centered on the Mw-Mt determination, just as much as the black bars do, but being two-dimensional it is deceiving: the single most precise determination of the W boson mass is in fact from CDF, and Tevatron and LEP II are basically at the same level of precision on that quantity!

Why is the comparison deceiving ? Because if you have one single quantity, you determine the 68% interval by integrating a gaussian distribution from its center outwards, until you “cover” 68% of its total integral (from -inf to +inf). If you add a dimension to your single-variable gaussian, and make it a two-dimensional gaussian shape, the 68% bounds remain the same unless you integrate by expanding an ellipse, rather than a band, around the center. The ellipse encompasses values of the 2-dimensional distribution which have the same “probability”, but in so doing it “cuts the corners”, and to total a 68% of the 2-dim integral it now has to extend past the one-dimensional 68% boundaries in each of the two variables. A sketch will clarify matters:

Well, not exactly “clarified”… But I have no time to make the graph easier to understand. The point is that the ellipse “cuts” only a part of the band in each direction, and so the integral of the 2-dimensional distribution it encloses is much smaller than that of the band. To make the ellipse include 68% of the 2-dimensional distribution constructed with the two gaussian curves, one has to widen it to roughly double its area (about 1.5 times along each axis).

So, paradoxically: if LEP II had a determination of the top quark mass too, the band bracketed by the two black lines in the plot by Heinemeyer would convert into an ellipse which would be about as wide in the vertical direction as the Tevatron blue ellipse.

Not convinced ? Oh well. Think of it this way: with a single measurement of the W mass, you say “the probability that the mass is between 80.35 and 80.45 GeV is 68%, because I determined it to be 80.4 and I have an error of 0.05 GeV”. Fine: the gaussian distribution, if integrated from -1-sigma to +1-sigma, provides 68% of its total normalization. The same goes if you claim that, having measured the top mass at $172.4 \pm 1.4$ GeV, there is a 68% chance that it lies in the interval 171-173.8 GeV. However, if you ask what is the probability that the W mass is between 80.35 and 80.45 GeV AND the top mass is between 171 and 173.8 GeV, this is much smaller than 68%, because independent probabilities multiply each other: it is, in fact, only 46.2% (0.68 squared); but this corresponds to the square drawn around the circle in the graph! The probability that the two values lie in the ellipse with semi-axes equal to the 1-sigma errors is smaller still: about 39% ($1 - e^{-1/2}$ for uncorrelated gaussian errors).
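For those who like to check numbers, here is a minimal Python sketch of the three probabilities discussed above. It assumes two uncorrelated gaussian variables; the function names are mine, chosen only for illustration. With the unrounded 68.27% the rectangle comes out at 46.6% rather than 46.2%.

```python
import math

def band_prob(n_sigma=1.0):
    """Probability that a single gaussian variable lies within +-n_sigma."""
    return math.erf(n_sigma / math.sqrt(2.0))

def rectangle_prob(n_sigma=1.0):
    """Probability that two independent gaussian variables BOTH lie within +-n_sigma."""
    return band_prob(n_sigma) ** 2

def ellipse_prob(n_sigma=1.0):
    """Content of the ellipse whose semi-axes are n_sigma times the two errors
    (uncorrelated case: chi-square with two degrees of freedom)."""
    return 1.0 - math.exp(-0.5 * n_sigma ** 2)

def ellipse_scale_for(prob):
    """Factor by which the 1-sigma semi-axes must grow for the ellipse to contain prob."""
    return math.sqrt(-2.0 * math.log(1.0 - prob))

if __name__ == "__main__":
    print("band     : %.4f" % band_prob())
    print("rectangle: %.4f" % rectangle_prob())
    print("ellipse  : %.4f" % ellipse_prob())
    print("scale for a 68%% ellipse: %.3f" % ellipse_scale_for(band_prob()))
```

The last number is the linear factor by which a "1-sigma" ellipse must be stretched along each axis to hold the same 68% as a 1-sigma band.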

The bottom line? Whenever you look at a plot with two measurements, one described by an ellipse and the other by a band, always regard the ellipse with more respect than it seems to deserve!

Guest post - Jeff Wyss: The Relativistic Train April 30, 2008

Posted by dorigo in Blogroll, mathematics, physics, science.
11 comments

Jeff is a physics professor at the University of Cassino, and a long-time colleague and friend of mine. He worked in the SLD and CDF collaborations as a particle physicist, but later moved on to study radiation damage on silicon detectors for particle and astroparticle applications.

Besides admiring him for his wicked sense of humor, which he uses to make the workplace around him always a pleasant place, I have the highest esteem of Jeff as a professor, because he is quite skilled in explaining physics concepts in simple terms. He always looks for the most intuitive way to understand things, as you might appreciate in the contribution he offers below.

The following describes a very elegant and simple derivation of the relativistic formula for the addition of velocities, w = (u+v)/(1 + uv/c^2).

It is due to David Mermin. I fell in love with it and have been telling it for the past four years now to the students of my general physics course. The students are first year telecommunications and electrical engineering students. Before sitting in on my course all of them have heard about Einstein and most of them have heard the expression “the velocity of light is constant”. I do not have the time to discuss special relativity in detail. My course is quite traditional. I discuss reference frames, inertial frames, Galilean transformations and covariance of Newton’s laws. I then point out that when describing mechanical waves the frame that is stationary with respect to the medium is a special reference! In particular the wave motion can be made to disappear by moving with respect to the medium with a velocity equal to that of the wave. It is clear at this point that the constancy of the velocity of light cannot be understood by assuming Newton’s laws and then modeling light as a mechanical wave in a medium (the ether). I then restate the constancy of the velocity of light and begin Mermin’s derivation.

The derivation uses:

  • only one reference frame (no use of Lorentz transformations),
  • simple kinematics (always good to brush up on),
  • the constancy of the velocity of light (something that every telecommunications and electrical engineering student should know),
  • the idea that some things are invariant; i.e. while many quantities are relative, observers will agree on some absolutes.

Consider a train of length L moving along the x-axis at a constant velocity v with respect to an inertial frame of reference (the observer watching the events unfold). At the trailing end of the train a loaded gun is aimed in the forward direction and fired at time t=0: the bullet and flash of light emerge and travel in the forward direction with different speeds: w the velocity of the bullet, c the velocity of light. A mirror at the front end of the train reflects the light back towards the advancing bullet. Let f be the fraction of the length of train that the reflected light travels before meeting up with the bullet. The constancy of the velocity of light (Einstein’s dictum) tells us that the velocity of light in the forward direction is equal to the velocity of light in the backward direction; i.e. c_F = c_B = c.

The space-time plot looks like this:

Let t_F be the time for the light flash to reach the forward-going mirror and t_B be the time the reflected light needs to return from the mirror and meet up with the forward-moving bullet. Simple kinematics allows us to label the space-time plot:

Simple algebra:

It is important to note that the expression for f we just obtained is valid if the velocity of light in the forward and backward direction are equal. Note:

  • A classical pre-Einstein physicist would say this expression is valid only if the observer is stationary with respect to the ether frame.
  • On the other hand Einstein says that any inertial observer would use the same velocity of light; i.e. Einstein tells us that this expression is valid for any observer (generic inertial frame).

Following Einstein we consider a particular observer (frame), one that is moving along with the train. For this observer the velocity of the train is v = 0. For clarity let us use the symbol u to indicate the velocity of the bullet with respect to this observer; i.e. with respect to the train.

Suppose the train has 10 windows and the reflected light and the bullet meet up at the third window from the front (f=0.3). It is important to realize that all observers will agree on the value of f. The fraction f is an invariant!

The constancy of the velocity of light allows us to impose the invariance of f the following way:

Q.E.D. !
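For readers who want the algebra spelled out, here is a sketch reconstructed from the definitions in the text (L, v, w, u, c, t_F, t_B, f). The grouping of the steps is my own assumption and need not match the original figures.

```latex
\begin{align*}
% the flash catches up with the mirror at the front of the moving train
c\,t_F &= L + v\,t_F \;\Rightarrow\; t_F = \frac{L}{c-v}, \\
% the reflected flash meets the bullet a time t_B later
w\,(t_F + t_B) &= c\,(t_F - t_B) \;\Rightarrow\; \frac{t_B}{t_F} = \frac{c-w}{c+w}, \\
% distance between the front of the train and the meeting point, in units of L
f\,L &= (c+v)\,t_B \;\Rightarrow\; f = \frac{(c+v)(c-w)}{(c-v)(c+w)}, \\
% same f for the observer on the train, for whom v = 0 and the bullet speed is u
\frac{(c+v)(c-w)}{(c-v)(c+w)} &= \frac{c-u}{c+u} \;\Rightarrow\; w = \frac{u+v}{1+uv/c^2}.
\end{align*}
```

The last step is just solving a linear equation in w; the result is the addition formula quoted at the start of the post.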

Correcting the CMS momentum scale April 29, 2008

Posted by dorigo in mathematics, personal, physics, science.

I have wanted to write some version of the present post for a while, and so it is a relief to publish it at last. In fact, it is rather strange to have completely avoided discussing in my blog the problem in which I have invested the best part of my research time over the last three months -plus a fair share of last year’s thinking-, and it was high time that I filled that void somehow.

Unfortunately, strange as it may seem, there are topics in my research activities that are hard to explain in simple terms. The problem I have been working on is not difficult to state, nor too difficult to solve, but it is extremely complicated and varied, so that a comprehensive description is challenging. However, I want to make an attempt…

The problem I have been dealing with, together with a small and focused group of bright colleagues (Sara Bolognesi, Marco De Mattia, and Chiara Mariotti: lads from Padova and ladies from Torino University) is the one of calibrating the momentum of charged tracks detected by the CMS experiment at CERN.

After being produced in a proton-proton collision in the core of CMS, charged particles have their position measured in a dozen layers of silicon detector before they hit the calorimeter system; the few penetrating ones surviving the encounter with trillions of heavy nuclei are also detected by the large set of muon chambers situated outside them. With the information provided by the silicon detectors -and, for muon candidates, by the muon system- a very performant and refined software algorithm reconstructs and fits the trajectory of the track, providing a measurement of the five parameters describing the helical trajectory; most notably, the curvature \rho inside the solenoid, which yields a precise determination of transverse momentum through the formula P_t = 0.3 B / \rho (where B is the magnetic field intensity -about 4 Tesla- and P_t is transverse momentum).

There are a number of reasons why a precise determination of the momentum of charged tracks is crucial. Let me just flash a few:

  1. Charged particles are measured with a better precision than neutral ones, and a careful determination of their momentum allows us to calibrate in turn other parts of the detector.
  2. Some physics measurements such as the mass of the W boson rely heavily on track momentum.
  3. The identification of a high-mass resonance -say a new Z’ boson- may require the reconstruction of its Z' \to \mu \mu decay, and a scale error on the momentum of those high-energy tracks translates into a worse resolution in the Z’ mass, and a diminished discovery reach.
  4. B-physics crucially needs charged tracks to be precisely reconstructed in order for exclusive B decays to be extracted from backgrounds.

So how do we do it ?

We use resonances. A few neutral particles -vector mesons and the Z boson- decay to pairs of muons, and they can thus be extracted with small backgrounds from the data (events with two muons are easy to collect with CMS, and muons have the benefit that they are “perfect” tracks in several ways). We know the mass of these particles with great accuracy, thanks to previous experiments:

  • The Z boson mass is known to be 91.1876 \pm 0.0021 GeV, a 0.0023% precision.
  • The Y(1S), the ground state of the (b \bar b) vector meson family, has its mass known as 9460.30 \pm 0.26 MeV, a 0.0027% measurement.
  • The Y(2S) mass is 10.02326 \pm 0.00031 GeV, a 0.0031% measurement.
  • The Y(3S) mass is 10.3552 \pm 0.0005 GeV, a 0.005% measurement.
  • The J/Psi, the ground state of the (c \bar c) vector meson family, has its mass known as 3096.916 \pm 0.011 MeV, a 0.0004% measurement.
  • The Psi(2S) has mass 3686.093 \pm 0.034 MeV, a 0.001% measurement.

All the above particles are easy to trigger on, collect, reconstruct, and measure. With CMS we expect to collect thousands of these decays every day of running. Their mass can be measured on an event-by-event basis by reconstructing the momentum of the two muons they decayed into, using the relativistic equation

M = \sqrt{ (\Sigma E)^2 - (\Sigma \vec{P})^2}

where M is the resonance mass, E is the muon energy, and P is the muon momentum vector.

By comparing the average mass of each reconstructed resonance to the reference values above, we get to know the scale of our momentum measurement, S = M_{true}/M; every time we measure a momentum P we then do P' = SP, forget P, use P’, and we are done. Easy enough, wouldn’t you agree ?
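The simple recipe above can be sketched in a few lines of Python. This is only an illustration: the (pt, eta, phi) parametrization of a muon and all function names are my assumptions, not the CMS software.

```python
import math

MUON_MASS = 0.1056584  # GeV

def four_momentum(pt, eta, phi, mass=MUON_MASS):
    """Return (E, px, py, pz) in GeV from transverse momentum, pseudorapidity, azimuth."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return (energy, px, py, pz)

def invariant_mass(p1, p2):
    """M = sqrt((sum E)^2 - (sum P)^2) for two four-momenta."""
    e, px, py, pz = (a + b for a, b in zip(p1, p2))
    m2 = e * e - (px * px + py * py + pz * pz)
    return math.sqrt(max(m2, 0.0))  # guard against tiny negative rounding errors

def scale_factor(measured_masses, m_true):
    """S = M_true / <M>, using the average reconstructed mass."""
    return m_true * len(measured_masses) / sum(measured_masses)

def corrected_pt(pt, scale):
    """P' = S * P."""
    return scale * pt

if __name__ == "__main__":
    mu_plus = four_momentum(45.0, 0.3, 0.1)
    mu_minus = four_momentum(44.0, -0.5, 0.1 + math.pi)
    print("dimuon mass: %.3f GeV" % invariant_mass(mu_plus, mu_minus))
```

A single multiplicative factor works because, neglecting the muon mass, the dimuon mass scales linearly with the two momenta.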

Sure. Easy enough. But actually kind of lame. With the millions of dimuon resonances we collect, can’t we do something better ? Our detector is, in fact, a quite complicated set of devices. The momentum scale -or, to be precise, the bias on the momentum measurement- depends on very subtle effects, such as tiny distortions in the magnetic field generated by the 4-Tesla solenoid, occasional mis-alignment (by a few microns, that is) of one of the thousands of silicon sensors, erratic behavior of the reconstruction algorithm in very particular regions of the detector. We can, and we must, check the bias on our measured momentum more closely, because it in turn gives us a chance to verify the B field map, check the alignments, validate the reconstruction code.

In the simplified formulas described above to determine a corrected momentum P’, you might have noticed that we used the invariant mass of the two muons making the resonance, rather than each muon separately. Indeed, the decayed particle is not produced at rest in the laboratory frame of reference, so we cannot expect that the two muons share evenly their parent’s energy, M/2 each. Only by combining their momenta can we get a number to compare to the reference value. Or is there a smarter way ?

There is a smarter way. Strangely enough, to my knowledge it has not been used in the past for this application. Let me explain in short what it is. I will try to make this as simple as possible, but not simpler - in Einstein’s style.

The formula for the invariant mass above involves the energy and momentum of each muon -or better, if you allow a slip into special relativity jargon, its four-momentum. We can, in purely symbolic terms, write:

M = f [P_1(x_1, x_2, ..., x_i), P_2(x_1, x_2, ..., x_i)]

where we have made explicit the fact that the computed invariant mass is a function f of the four-momenta P_1, P_2 of the two muons, and that each of the two four-momenta is in turn a function of many (i, in the formula) other variables, collected in two i-dimensional vectors x. These variables are the measured characteristics of the track: its angles, the region of the detector it crosses, its electric charge, you name them.

Still here ? Ok. The next step is to realize that what we really would love to have is a measurement of the momentum as a function of the particular characteristics x of the track, and not just P=0.3 B/\rho, which only depends on the curvature \rho. Through a knowledge of P=P(x_i) we could get sensitive to the effects mentioned above -B field distortions, alignment errors, reconstruction biases.

There is a simple way: we can compute the probability that we observe a mass M, if the reference value is M_{true}, as a function of the measured quantities x_i of each muon, by assuming a functional form for the way the momentum P depends on the parameters. So let us write:

M = f [g_1(\vec{x};\vec{\alpha}), g_2(\vec{x}; \vec{\alpha})]

where the new function g( ) describes how the momenta vary with the vector of measured track parameters \vec{x}, and \vec{\alpha} is a vector of unknown variables describing the function g( ).

(To let you understand what the heck I am talking about, assume that your detector measures a track momentum with a bias depending on momentum itself:

P = g(\vec{x};\vec{\alpha}) = x_1 \times (\alpha_1 + \alpha_2 \times x_1),

with x_1=P, and \alpha_1 = 0.998, \alpha_2 = 0.0002. This function describes momenta which are underestimated by 0.2% for small P, correctly estimated for P=10 GeV, and overestimated by 1% for every additional increase of P by 50 GeV. )
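The numbers in this example are easy to verify. In the sketch below I read x_1 as the true momentum in GeV, which is my interpretation of the formula above; the function names are illustrative.

```python
def biased_momentum(p_true, alpha1=0.998, alpha2=0.0002):
    """Measured momentum g(x; alpha) = x1 * (alpha1 + alpha2 * x1), with x1 = p_true in GeV."""
    return p_true * (alpha1 + alpha2 * p_true)

def relative_bias(p_true, alpha1=0.998, alpha2=0.0002):
    """Fractional over- or under-estimate of the momentum."""
    return biased_momentum(p_true, alpha1, alpha2) / p_true - 1.0

if __name__ == "__main__":
    for p in (0.5, 10.0, 60.0, 110.0):
        print("P = %6.1f GeV   bias = %+.2f%%" % (p, 100.0 * relative_bias(p)))
```

The bias is linear in P, so it crosses zero at 10 GeV and grows by one percent for every further 50 GeV.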

Using the parametrization, we compute for each event the measured mass as a function of the variables \alpha. With these numbers we finally form a likelihood function:

L = -\Sigma \log [Prob(M(\vec{x};\vec{\alpha}))]

which of course implicitly depends on the functional form we have chosen for g. By minimizing L as a function of the parameters (L is the negative logarithm of the likelihood, so its minimum is the likelihood maximum), we obtain their most likely values, and we are done: we get to know how our track momentum depends on its characteristics \vec {x}.
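To make the idea concrete, here is a toy version of such a fit in Python. It is a deliberately stripped-down sketch and not the actual fitter: a single scale parameter alpha, a pure gaussian probability with a fixed, assumed resolution SIGMA, no Breit-Wigner shape, no background, and a brute-force scan instead of a proper minimizer.

```python
import math
import random

M_Z = 91.1876   # GeV, reference Z mass
SIGMA = 2.0     # GeV, assumed gaussian mass resolution (illustrative value)

def make_toy_masses(n, true_scale, seed=1):
    """Toy dimuon masses, reconstructed too low by a factor 1/true_scale."""
    rng = random.Random(seed)
    return [rng.gauss(M_Z / true_scale, SIGMA) for _ in range(n)]

def neg_log_likelihood(alpha, masses):
    """L = -sum(log Prob), with Prob a gaussian in the corrected mass alpha * m."""
    norm = math.log(SIGMA * math.sqrt(2.0 * math.pi))
    return sum(0.5 * ((alpha * m - M_Z) / SIGMA) ** 2 + norm for m in masses)

def fit_scale(masses, lo=0.98, hi=1.02, steps=400):
    """Scan alpha on a grid and return the value that minimizes L."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return min(grid, key=lambda a: neg_log_likelihood(a, masses))

if __name__ == "__main__":
    masses = make_toy_masses(1000, true_scale=1.01)
    print("fitted scale: %.4f (generated with 1.01)" % fit_scale(masses))
```

In a real fit alpha becomes a vector, the corrected mass is recomputed from the two corrected muon momenta in each event, and the scan is replaced by a numerical minimizer.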

In the discussion above I have not given much emphasis to the fact that the true form of the “bias function” g( ) is not known. One can in fact test different hypotheses with the data, and the value of the likelihood will be a measure of how well they describe the experimental situation. There’s more: the likelihood can be studied as a function of each of the components of the vector x, allowing us to spot biases which require a more subtle parametrization.

The above discussion is a simplified view of the problem: In reality, things are much more complicated. Here is a short list of details I hid under the carpet above:

  • We model the probability to observe a given mass in the likelihood function by convolving a Lorentzian function (the Breit-Wigner, which is the true form of the mass distribution of the resonance) with a gaussian resolution function; the gaussian has parameters \vec {\beta} which also get fit simultaneously with the bias parameters \vec {\alpha}. The figure below shows the probability distribution function of a measurement of mass M and resolution \sigma for a Z boson: for each point in the plane, defined by the two values (M, \sigma), the probability is the height of the surface. Notice how the probability grows as the resolution increases, for values of mass very far from the true resonance mass M_Z=91 GeV (for instance, for a mass of 71 GeV, the left boundary of the surface), while the opposite happens for values of mass close to it.

  • the fitter also assumes a functional form for the background (which is unavoidably included in the dataset containing the resonances), and fits it together with the bias and resolution parameters;
  • Each of the six considered resonances can be fit individually, or all together. The window around the peaks defining events used or not used in the computation requires an optimization;
  • The fitter iterates several times the whole procedure: after bias parameters are extracted, momenta get corrected, and a new parameter extraction must return values which are compatible with no bias.
  • And so on…

The algorithm is indeed quite complicated. I spent the last three months implementing the fitting of resolution and background, and the algorithm is not yet complete but it now works well. It is particularly satisfactory to be able to launch the program on a set of resonances, and extract all at once not just the parameters that allow momenta to be corrected, but also a precise estimate of momentum resolution as a function of track kinematics - something that would once require detailed studies with simulations. All is now squeezed out of the data!

The work is far from over. With the help of my colleagues, we will test the code on a very large sample of simulated events in the next few months, to be ready for the data which will hopefully start pouring in this fall… But the work will only be started then: we plan to fit chunks of data on a monthly basis, checking the stability of the detector and the track reconstruction, and producing a correction function to be used by all analyses in need of a precise momentum measurement… It really is a long-term plan!

The Corfu 2005 proceedings online April 10, 2008

Posted by dorigo in astronomy, books, games, humor, internet, language, mathematics, music, news, personal, physics, politics, science, travel.

Just a note to post here the permanent link to the proceedings of a conference I attended in Corfu (Greece) three years ago. This is a long (32 pages) report on “High-P_T Physics: from the Tevatron to the LHC“, now published in the Journal of Physics: Conference Series [Tommaso Dorigo 2006 J. Phys.: Conf. Ser. 53 163-194]. I think I did post a draft of the paper on this blog a couple of years ago, but then I forgot to post the final version as well.

The paper is a bit dated in some parts, where the most recent (back then) results from the Tevatron are discussed; however, some parts -especially a discussion of the usefulness of Tevatron data for LHC physics- are still readable IMHO. Also worth noting is the fact that the acknowledgments section mentions the late Riqie Arneberg, a friend who passed away last fall, who had accepted the offer I had made to all readers of this blog to proofread the manuscript, and contributed in several places to the clarity of the text.

The publisher has now made available online all its 100 open access volumes through the JPCS home page. Of course I salute this contribution to the free diffusion of science with enthusiasm.

The say of the week March 10, 2008

Posted by dorigo in Blogroll, games, humor, mathematics, physics, science.
12 comments

“At least 99% of the 10^500 possible vacua are complete garbage and can be ruled out easily. Thus, the regions of the landscape for which realistic vacua may arise is limited.”

Eric (string theory enthusiast)

Multiple interactions at LHC: an exercise in elementary statistics February 13, 2008

Posted by dorigo in mathematics, physics, science.
7 comments

The LHC will start running towards the end of this year, at the design energy and with a bunch crossing time of 25 ns. That means 40 million intersections per second between two proton packets in the core of CMS and ATLAS [things are a bit more complicated -some bunches are empty- but this has no relevance for my point].

We expect that the beams will contain few protons in this initial phase: low luminosity, that is. That’s because a high energy beam requires a lot of tuning before it can accommodate a large number of particles. Charged particles, in fact, have the nasty tendency to repel each other, and squeezing them into a narrow corner of phase space -knee to knee, all together as a single man- is a tremendously hard task, requiring successive approximations. Moreover, as the beams travel through the LHC tunnel, each making over ten thousand turns a second, they generate strong induced currents on the machine’s hardware. This electromagnetic interplay is impossible to compute beforehand, and a trial and error procedure by the machinists is unavoidable.

Luminosity is a function of the number of protons spinning in the two directions. Basically one can compute it from the number of particles circulating in the two directions by taking their product N_1 N_2, multiplying it by the revolution frequency, and dividing by the transverse section of the beam. One obtains a number whose units are inverse area (the beam size) times inverse time in seconds (the frequency). The LHC will start at L=10^{30} cm^{-2} s^{-1}, but we expect it to reach the design value of L=10^{34} cm^{-2} s^{-1} in a couple of years.

Luminosity is not just a number with which machinists boast about their gadget. With it, you can compute the rate of production of any given process, if you know its cross section.

Cross section, a number labeled with the greek letter \sigma carrying units of area, basically tells you the effective area a proton must hit in another in order to give rise to a given reaction. The total cross section for proton-proton collisions at the LHC energy is \sigma_{tot} = 8 \times 10^{-26} cm^2: more or less like a circle with a radius of 1.6 millionths of a billionth of a meter - the “size” of a proton seen by another colliding with it head-on. But the total pp cross section is huge! Compare it with the cross section for producing a top quark pair: \sigma_{t \bar t} = 8 \times 10^{-34} cm^2, or a hundred million times smaller. It is as if the incoming proton had to hit the other one “just right there”, to produce a top pair.

With a knowledge of what a cross section is we can answer questions. What is the total rate of proton collisions at the LHC if the luminosity is L=10^{33} cm^{-2} s^{-1} - the one we will have in the “low luminosity” phase ? Simply,

N = \sigma_{tot} L.

With \sigma_{tot} as quoted above, we get a rate N=8 \times 10^7 of proton collisions: eighty million collisions per second! How many per bunch crossing ? Well, if all proton bunches contain the same number of particles, we get on average two interactions per bunch crossing, since the crossing rate is 4 \times 10^7. Easy, huh ?
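The arithmetic is short enough to fit in a few lines of Python; the function names below are mine and only serve this illustration.

```python
BUNCH_CROSSING_RATE = 4.0e7  # crossings per second (one every 25 ns)

def interaction_rate(cross_section_cm2, luminosity_cm2_s):
    """N = sigma * L, in events per second."""
    return cross_section_cm2 * luminosity_cm2_s

def mean_per_crossing(rate_per_second, crossing_rate=BUNCH_CROSSING_RATE):
    """Average number of interactions in one bunch crossing."""
    return rate_per_second / crossing_rate

if __name__ == "__main__":
    sigma_tot = 8e-26   # cm^2, total pp cross section quoted above
    lumi = 1e33         # cm^-2 s^-1, "low luminosity" phase
    rate = interaction_rate(sigma_tot, lumi)
    print("collisions per second  : %.1e" % rate)
    print("collisions per crossing: %.1f" % mean_per_crossing(rate))
```

Swapping in any other cross section and luminosity gives the corresponding rate.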

Well, things in reality are just a bit more complicated. The probability of events that may come incoherently in integer numbers follows the rules of Poisson statistics. Poisson statistics allows us to compute the probability that a bunch crossing will contain no collisions, or one, or two, or N, given the average \mu=2 as computed above. The formula looks awful, but it is quite benign:

P(N) = \frac {\mu^N e^{-\mu}} {N!} ( the exclamation mark indicates taking the factorial of N).

We need a pocket calculator, but other than that there is nothing that should scare you out of this post. Keep reading if you want to use what you just learned to get some insight in the inner workings of the LHC experiments!

With the formula for the probability of N collisions, we have gained power - knowledge, they say, is just that. The power to make wonderful calculations. If I tell you that the cross section for producing an event with two energetic jets (say, energy above 30 GeV each) is \sigma_2 = 2 \times 10^{-28} cm^2, or 200 \mu b  (we prefer to use microbarns -labeled \mu b- for the area of 10^{-30} cm^2, a quite convenient unit),  how many such events will be produced in a single bunch crossing at the full LHC luminosity of 10^{34} cm^{-2} s^{-1}, on average ?

Easy. Use the formula N = \sigma L, and you get a rate N=2 \times 10^6 s^{-1}, that is,  2 MHz. Then, by dividing by 40 million bunch crossings per second, you get the rate per bunch crossing, 0.05: one in twenty. If instead we had taken the cross section for producing four energetic jets, \sigma_4 = 3 \mu b, we would have obtained a rate of 30 kHz, and a bunch crossing frequency of 7.5 in ten thousand. Mind you, the cross sections I quote are approximate - I estimated them with some back-of-the-envelope calculation. But let’s not be distracted by details and let me get to the point.

Those computed above are average rates. What happens if I ask you what is the chance that two, or more, separate proton collisions, each producing two jets like the ones above, occur in the same bunch crossing?

Now, that might sound like a weird question, devoid of any practical importance. Quite the contrary. Let me compute it for you before making my point. We use the Poisson probability formula, with \mu = 0.05 and N>=2. Instead of computing P(2), P(3), P(4)… and then adding them together, we use the fact that the sum of all P(N) is one: a nice property of probability, indeed! Here is the computation:

P(0) = e^{-\mu} = 0.95123,

P(1) = e^{-\mu} \mu^1 /1! = 0.04756, and so

P(>=2) = 1 - P(0) - P(1) = 1-0.95123-0.04756=0.00121.

Interesting! The chance of two distinct dijet events in a single bunch crossing is not that negligible… If we cannot distinguish where the jets come from (i.e., if the two proton collisions happen too close to each other), we will interpret the event as one with four energetic jets!

Now compare this 0.00121 with the number computed above with \sigma_4, the rate of collisions producing four jets in the final state from a single proton-proton interaction: we discover that at the LHC, there are instances when two separate collisions may conspire to mimic rarer processes! If I am looking for four-jet events, I will find 1.21 every thousand bunch crossings coming from two 2-jet “multiple interaction” events, while only an additional 0.75 every thousand will be genuine 4-jet events. I have a background to consider which lower luminosity machines would never have to care about!
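If you do not have a pocket calculator at hand, the whole exercise can be redone with a short Python sketch. The cross sections are the approximate ones quoted above, and the function names are my own.

```python
import math

BUNCH_CROSSING_RATE = 4.0e7   # crossings per second
LUMINOSITY = 1e34             # cm^-2 s^-1, full LHC luminosity
MICROBARN = 1e-30             # cm^2

def poisson(n, mu):
    """P(N) = mu^N exp(-mu) / N!"""
    return mu ** n * math.exp(-mu) / math.factorial(n)

def prob_at_least(k, mu):
    """P(N >= k), computed as one minus the sum of P(0) ... P(k-1)."""
    return 1.0 - sum(poisson(n, mu) for n in range(k))

def mean_per_crossing(cross_section_cm2):
    """Average number of events of a given kind in one bunch crossing."""
    return cross_section_cm2 * LUMINOSITY / BUNCH_CROSSING_RATE

if __name__ == "__main__":
    mu_dijet = mean_per_crossing(200 * MICROBARN)
    mu_fourjet = mean_per_crossing(3 * MICROBARN)
    print("two or more dijet collisions, per 1000 crossings: %.2f"
          % (1000 * prob_at_least(2, mu_dijet)))
    print("genuine four-jet collisions, per 1000 crossings : %.2f"
          % (1000 * mu_fourjet))
```

Changing LUMINOSITY shows how quickly the fake four-jet rate falls on a lower luminosity machine: it scales with the square of the luminosity, while the genuine rate scales linearly.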

The exercise is over. It is not an academic one: in the study of the very rare production of top pairs with higgs-strahlung, pp \to ttH, one gets to consider the collection of exceedingly rare events with up to eight energetic jets in the final state. The background from multiple interactions conjuring a multijet final state by adding different contributions is to be removed! We can do that by actually tracking the jets down to the space point where they originated: we only keep events where all eight jets originated from the same spot, and we are ok. We can do it, since we have such a wonderful silicon tracker (see picture)…

Explaining traffic jams February 7, 2008

Posted by dorigo in mathematics, news, science.
17 comments

I just finished browsing a paper by Gabor Orosz and Gabor Stepan, researchers respectively in nonlinear mathematics and applied mechanics. The pdf file had been resting on the virtual desktop of my laptop computer for a while, begging to be read like a hundred more, but with the advantage of not having been thrown into the darkness of my “papers to read” folder with all the others. And today, with some time to spend before the arrival of the next train to Venice, I finally ventured to read it.

After my quick read I am left with mixed feelings. The paper is not the kind of science that fits George Bernard Shaw’s definition, which I learned from Jeff a week ago: “Science is always wrong! It never solves a problem without creating ten more”. In fact, it does answer the question of how traffic jams are created from a uniform flow. The problem in this case is that the answer was already rather well known. Nonetheless, just thinking of the elegant math which is the ultimate cause of your anguish at the wheel when stuck on a highway makes it easier to accept the situation, and this is enough justification for the article. Besides, the study is indeed something of a breakthrough in modeling traffic jams.

Orosz and Stepan consider an idealized highway such as the one pictured on the right: vehicles are the points at coordinates x_i along a circumference, all moving in the same direction. They then analyze the nonlinearities that arise in a model of traffic flow in their toy highway when one introduces a realistic time delay in the response of drivers to the detection of an impact threat with the car preceding them. They find that the time delay is crucial in allowing one to model, with quite complicated formulas, the onset of backward-traveling “stop-and-go” waves, which interrupt the unstable solution of a well-behaved uniform flow of vehicles.

Apparently, the duality between uniform flow and “stop-and-go” waves has a name: it is an instance of a Hopf bifurcation. Now, since I had never heard of Hopf bifurcations before (or maybe I have, and have forgotten about them – oblivion is the privilege of a cultured man), I am not the best person to explain it here. So you can read about it on Wikipedia if you cannot stand your own ignorance (I accepted mine long ago). If you have lost your mouse and cannot click above, here is a quote:

In bifurcation theory a Hopf or Andronov-Hopf bifurcation is a local bifurcation in which a fixed point of a dynamical system loses stability as a pair of complex conjugate eigenvalues of the linearization around the fixed point cross the imaginary axis of the complex plane.

Everything is clear now, huh? Well, the math is really not for everybody, not even in the simplest case. And it turns out that the time delay introduced by Orosz and Stepan changes the description of the system from one with ordinary differential equations in a finite-dimensional dynamical space to one modeled by delay differential equations and infinite-dimensional phase spaces. Hugh.
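The quoted definition can be made a little more concrete with a toy example. The sketch below is not the delayed traffic model of the paper: it assumes the standard textbook normal form of a Hopf bifurcation, x' = \mu x - y - x(x^2+y^2), y' = x + \mu y - y(x^2+y^2), whose linearization around the fixed point at the origin is the matrix [[\mu, -1], [1, \mu]]. The function names are made up for illustration.

```python
import cmath

def eigenvalues_2x2(a, b, c, d):
    """Eigenvalues of the matrix [[a, b], [c, d]] from its trace and determinant."""
    trace = a + d
    det = a * d - b * c
    disc = cmath.sqrt(trace * trace - 4 * det)
    return (trace + disc) / 2, (trace - disc) / 2

def hopf_linearization_eigs(mu):
    """Eigenvalues of the linearization of the assumed normal form at the origin."""
    return eigenvalues_2x2(mu, -1.0, 1.0, mu)

def fixed_point_is_stable(mu):
    """The fixed point is linearly stable if all eigenvalues have negative real part."""
    return all(lam.real < 0 for lam in hopf_linearization_eigs(mu))

for mu in (-0.1, 0.0, 0.1):
    lam1, lam2 = hopf_linearization_eigs(mu)
    print(mu, lam1, lam2, "stable" if fixed_point_is_stable(mu) else "not stable")
```

The eigenvalues are the complex conjugate pair \mu ± i: as \mu goes from negative to positive, the pair crosses the imaginary axis and the fixed point loses stability, which is exactly the situation the definition describes.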

In any case, however complex the main body of the paper is, its conclusions are quite readable. The model of Orosz and Stepan demonstrates the onset of backward-traveling waves of traffic jams, and shows how a highway is basically a bistable system, with the linear flow easily affected by large enough “perturbations” (such as a truck changing lanes), which cause the onset of stop-and-go waves. All things we knew, but the formulas in the model do allow some planning: just a little decrease in the speed of cars approaching a backward-moving wave could significantly dampen it. Something we knew qualitatively, but we can now compute. A step in the right direction, towards the fulfilment of my highway dream.

I dream of highways where you enter with your car, and then let go of the wheel and the gas pedal and read a book. An electronic wireless system controls the speed of your car and its steering, and you get to your destination in the shortest time possible given the number of vehicles on the road. This is not science fiction: we have had the technology to do this for maybe ten years. Just imagine the amount of time saved for human beings, the decrease in pollution, and in the amount of stress… I know these systems are being studied, and I am rooting for those guys.

Scientific Bang for the Buck January 5, 2008

Posted by dorigo in computers, mathematics, news, physics, politics, science.
10 comments

A concept worth a preprint, specifically Bruce Knuteson’s “A Quantitative Measure of Experimental Scientific Merit”, physics.data-an/0712.3572v1. And certainly a preprint worth a look, if only for making up one’s mind on the scientific merit of working at MIT. It came out on Christmas day on the arXiv.

Jokes aside, I found the paper quite entertaining, and at times indeed surprising. While I find Bruce’s approach to the problem of assessing the scientific merit of a proposed experiment or analysis rather dangerous, and the explicit formulation of priors for the probability of discovering new physics in this or that experiment vaguely reactionary, I admit the paper brings home a point, which is however its premise rather than its thesis: review committees, as well as search committees, move in the dark. I am still in doubt on whether the exercise of endlessly debating over priors is a valid substitute for good old preconceptions and biases.

Bruce is quite up-front from the very beginning in stating the main purpose of his study:

“In the context of determining which research program to pursue, review committees often must decide the relative scientific merits of proposed experiments. Within large experiments, deciding which analyses to emphasize requires similar decisions”.

Which gets me to raise the first objection - or rather a comment: it is remarkably radical to talk about “which analyses to emphasize”. I find that the concept is, in fact, a bit too business-like a way of doing physics in a large experiment. At the Tevatron we certainly need to emphasize the top mass measurement, the B mixing, and the Higgs searches these days, but we do not need a computation of entropy decrease to know it; emphasizing some analyses (which means, please note, de-emphasizing others) because of some pre-arranged prior (the estimated probability that a gluino is there, for instance) smells of a covert way of depriving scientists working in the collaboration of their wonderful inventiveness, of their freedom to be guided by their nose, by their intuition.

It is no coincidence, it seems, that Knuteson is one of the authors of a complex automated machinery for new physics searches, a device producing hundreds of histograms of kinematical variables describing any combination of physics objects (high-Pt electrons and muons, jets, missing Et, photons, etcetera) in search of discrepancies with the standard model: is number-crunching winning its battle with scientific minds as much as it has won the chess challenge with our best grandmasters?

The paper starts with a definition of the surprise content of the result of an experiment. It does so by using information theory, arriving at the desired measure of the merit of an experimental result as the entropy decrease in the state of knowledge relative to the particular physics question investigated. Here is the synopsis of the discussion up to Section II, in Knuteson’s words:

“The essential thesis of this article is summarized in two sentences.

  • The appropriate quantification of scientific merit of a proposed experiment or analysis (before it is performed and its outcome is known) is the reduction in information entropy the experiment or analysis is expected to provide [...].
  • The appropriate quantification of scientific merit of an experiment or analysis after the result is known is the information gained from the result [...].”

Fair enough: if one knew the chance of the Tevatron discovering new physics in Run II, or of the LHC finding something beyond the Higgs, one would certainly be able to tell how well the money was spent in building those experiments. Using the reduction in information entropy is a principled way to quantify the appropriateness of the investments.

But here, in fact, comes the nice part: the paper goes on to delve into the question by specifically working out priors. In Section III, Knuteson uses priors derived in the Appendix to estimate the “scientific bang for the buck” (SBFB) of existing experiments, and even that of past experiments discovering the Psi, the W and Z bosons, and so on. One learns that the probability of the Tevatron Run II finding new physics is 20%, and that the probability that the LHC will see something new is 90%.

Using those numbers and the cost of the experiments, the SBFB of the LHC is computed at a mere 0.001, while the Tevatron stands a giant at 5.0! Also worth noting is the specific search for single top production at the Tevatron, which - due to the low surprise factor - has a SBFB of 0.00001. Ironically, in the same table Knuteson includes the SBFB of the experiment of flipping a coin: it is zero, not that different from the global search for new physics at the LHC! Although, to be fair, zero and 0.001 are indeed quite different when you take the logarithm.
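To get a feel for the information-theoretic ingredients, here is a minimal Python sketch that treats each search as a simple yes/no question ("new physics found" or "not found") with the 20% and 90% priors quoted above. This two-outcome simplification is an assumption made for illustration only: it does not include costs and does not attempt to reproduce the SBFB values of the paper, whose exact recipe is not spelled out in this post. The function names are made up.

```python
import math

def surprise_bits(p):
    """Information content, in bits, of an outcome that had prior probability p."""
    return -math.log2(p)

def binary_entropy_bits(p):
    """Shannon entropy, in bits, of a yes/no question with P(yes) = p."""
    if p in (0.0, 1.0):
        return 0.0  # a certain outcome carries no information
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))

priors = {"Tevatron Run II": 0.20, "LHC": 0.90}  # probabilities quoted in the post

for name, p in priors.items():
    print(f"{name}: prior entropy = {binary_entropy_bits(p):.3f} bits, "
          f"surprise if new physics is found = {surprise_bits(p):.3f} bits")
```

In this simplified picture a discovery that everybody expects carries little surprise, which is the qualitative reason why a 90% prior does not help a figure of merit based on information gain.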

As far as completed experiments go, one learns instead that the tau discovery stands at a SBFB of 5.0, soundly beating runner-up J/psi discovery at 0.2, with the top quark discovery at an amateurish 0.0004. The table is long, and you can search for your favorite HEP result, and judge for yourself on whether the Nobel Prize to Rubbia wasn’t indeed a bit hasty.

In earnest, the summary of Bruce’s paper is very direct in clarifying the rather limited scope of the proposed quantification method:

“Use of information content or information gain to evaluate the scientific merit of experiments requires the estimation of the probabilities of qualitatively different outcomes, and the reader may object that the problem of quantifying an experiment’s scientific merit has simply been reformulated in terms of the estimation of the probabilities of possible experimental outcomes. At worst, this reformulation significantly changes and focuses the discussion. The fact that there is not a well-developed literature to point to for the justification of these a priori probabilities emphasizes the fact that until now the importance of these probabilities has not been properly recognized [...]”

However, he argues that

“The reader may object to the very idea of constructing an explicit figure of merit [...] Such a reader misses the point that this is done (implicitly, if not explicitly) every time a decision of resource allocation is made. It is surely in the field’s best interest for such evaluations to be made in the sharpest, most open, most quantifiable, and scientifically best motivated framework possible”.

Which, to my biased ears, sounds like, “come on, we all know that the allocation of funding to science is made by fools, so let’s give ‘em some only partially random numbers to base their decisions upon and we will contain the damage”.

I do not mean to criticize the paper too much. It is a quite principled and tidy study of the problem. I think one cannot do much better in terms of finding a suitable figure of merit than what Knuteson did. I disagree with the very concept, though. But maybe I am too old-fashioned and I miss the point: scientific funds are not allocated wisely. On that, I think, we all agree.

Update: being away on vacation obviously does not help one stay in touch with what happens elsewhere on the web. I only now became aware of two other posts on this same topic: one at Superweak and one at Collider Blog. Backreaction also discusses it briefly.

Update 2: a detailed discussion of the statistical aspects of Knuteson’s paper is also available at Deep Thoughts and Silly Things.

Guest post: Tony Smith, “Visualizing E8 Physics” December 13, 2007

Posted by dorigo in internet, mathematics, physics, science.
13 comments

Tony Smith needs no introduction to the readers of this blog, since he often contributes to the discussion of physics posts here. His web site can be found at tony5m17h.net. I received yesterday, and am glad to publish, the following interesting discussion of the properties of the E8 group, which has attracted a lot of attention since the recent paper by Lisi. Enjoy!

Garrett Lisi at hep-th/0711.0770 describes a physics model based
on the 248-dimensional rank 8 exceptional Lie algebra E8 in which
each of the 240 root vectors of E8 is given a physical
interpretation.

The Lie algebra root vectors of E8 form a polytope (called the
Witting polytope) in 8-dimensional Euclidean space. 8 of the 248
generators of E8 are used to form the 8-dimensional root vector
space, and the remaining 248 - 8 = 240 generators of E8 correspond to
the 240 vertices of the E8 root vector Witting polytope.

Garrett Lisi shows a projection of the 240 vertices down into
2-dimensional space

A youtube movie based on a New Scientist article describes some of Garrett Lisi’s physical interpretations of the vertices, and shows how the patterns of vertices transform under rotations.

In this guest post, I want to describe an alternative set of physical interpretations of the 240 E8 root vector vertices and present a movie of how they transform under rotations, so that E8 physics might be more intuitively visualized. In this post, I will abuse notations by using E8 , Spin(16), etc., for both Lie algebra and Lie group, and I will not be careful about group / algebra distinctions, factors of Z2, and other technical matters that might get in the way of exposition.

Like Garrett Lisi’s E8 physics model, this E8 physics model is based on seeing E8 in terms of

248-dimensional E8(8) = EVIII = 120-dimensional adjoint Spin(16) + 128-dimensional half-spinor Spin(16)

and on seeing 120-dimensional Spin(16) as

120-dimensional Spin(16) = 28-dimensional D4 + 28-dimensional D4* + 64-dimensional 8v x 8g

and on seeing 128-dimensional half-spinor Spin(16) as

128-dimensional half-spinor Spin(16) = 64-dimensional 8s’ x 8g + 64-dimensional 8s” x 8g

and on seeing the 240 root vectors of E8 and the 120 - 8 = 112
root vectors of rank 8 Spin(16) as

240 E8 root vectors = 112 adjoint Spin(16) root vectors + 128 half-spinor Spin(16) root vectors =

= 24 D4 root vectors + 24 D4* root vectors + 64-dimensional 8v x 8g + 64-dimensional 8s’ x 8g + 64-dimensional 8s” x 8g
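The counting 240 = 112 + 128 can be cross-checked with a few lines of Python. The sketch below assumes the standard textbook coordinates for the E8 root system, which are not taken from this post: the 112 roots of Spin(16) are the vectors with two entries equal to ±1 and six zeros, and the 128 half-spinor roots are the vectors with all eight entries equal to ±1/2 and an even number of minus signs. The function name is made up for illustration.

```python
from itertools import combinations, product

def e8_roots():
    """Return (adjoint, spinor): the 112 + 128 roots of E8 in standard coordinates."""
    adjoint = []
    # Roots of D8 = Spin(16): +-e_i +- e_j for every pair i < j.
    for i, j in combinations(range(8), 2):
        for si, sj in product((1, -1), repeat=2):
            v = [0] * 8
            v[i], v[j] = si, sj
            adjoint.append(tuple(v))
    spinor = []
    # Half-spinor weights: all entries +-1/2, with an even number of minus signs.
    for signs in product((1, -1), repeat=8):
        if signs.count(-1) % 2 == 0:
            spinor.append(tuple(s / 2 for s in signs))
    return adjoint, spinor

adjoint, spinor = e8_roots()
roots = adjoint + spinor
print(len(adjoint), len(spinor), len(roots))
# Every root has the same squared length, so all vertices lie on one sphere.
print({sum(x * x for x in r) for r in roots})
```

Note that this only checks the counting and the common length of the root vectors; it says nothing about the physical interpretation assigned to each vertex.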

However, in this E8 physics model the physical interpretations of the 240 root vectors are not exactly the same as in Garrett Lisi’s model. Here is how they look in this model.

In this image of this model there are two sets of 24 vertices each:

24 yellow points correspond to the 24 root vectors of D4 which is used to construct Gravity by a generalized MacDowell-Mansouri mechanism based on the 15-dimensional D3 = A3 Conformal Group Spin(2,4) = SU(2,2). To help get started with visualization, here are the 24 yellow points

in the image. Note that the 24 yellow points form three sets:

6 near the top, in a 1 4 1 pattern corresponding to the 6 vertices of an octahedron;

12 in the middle, in a 4 4 4 pattern corresponding to the 12 vertices of a cuboctahedron;

6 near the bottom, in a 1 4 1 pattern corresponding to the 6 vertices of a second octahedron.

Note also that a 24-cell can be seen as being made up of a cuboctahedron and two octahedra as in this stereo image:

in which the cuboctahedron is green and the two octahedra are red and blue. So, it is clear that the 24 yellow points form a 24-cell, which is the root vector polytope of the D4 Lie algebra.
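The 6 + 12 + 6 splitting can also be checked numerically. The following Python sketch assumes the standard coordinates for the D4 roots (two entries equal to ±1 and two zeros in 4 dimensions) and slices them by the value of the first coordinate. This choice of slicing direction is an assumption made for illustration and need not coincide with the projection used in the images of this post.

```python
from collections import Counter
from itertools import combinations, product

def d4_roots():
    """Return the 24 roots +-e_i +- e_j of D4 in standard 4-dimensional coordinates."""
    roots = []
    for i, j in combinations(range(4), 2):
        for si, sj in product((1, -1), repeat=2):
            v = [0] * 4
            v[i], v[j] = si, sj
            roots.append(tuple(v))
    return roots

roots = d4_roots()
# Group the vertices of the 24-cell into layers of constant first coordinate.
layers = Counter(r[0] for r in roots)
print(len(roots), layers[1], layers[0], layers[-1])

# In the top layer the remaining three coordinates are +-e_2, +-e_3, +-e_4:
# the 6 vertices of an octahedron.
top = sorted(r[1:] for r in roots if r[0] == 1)
print(top)
```

With these coordinates the middle layer consists of the 12 vectors with two entries ±1 among the last three coordinates, which are the vertices of a cuboctahedron.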

24 purple points correspond to the 24 root vectors of D4* which is used to construct the SU(3) x SU(2) x U(1) Standard Model based on the 15-dimensional D3 = A3 group SU(4) and its 9-dimensional subgroup U(3) and the 6-dimensional SU(4) / U(3) = CP3 Twistor space, with the U(3) giving the SU(3) x U(1) of the Standard Model and the CP3 Twistor space giving (via relation to quaternionic structure) the SU(2) of the Standard Model. Note that the 24 purple points form a pattern similar to that of the 24 yellow points shown above.

Each of the remaining three sets of 64 vertices is of the form 8 x 8g, where 8g denotes the 8 Dirac gamma basis elements of the Dirac gammas of an 8-dimensional Kaluza-Klein spacetime.

64 blue points correspond to 8v x 8g, where 8v corresponds to the 8 basis elements of an 8-dimensional Kaluza-Klein spacetime, so that the 64 blue points correspond to an 8×8 matrix of the 8 spacetime basis elements with respect to 8 Dirac gammas.

64 red points correspond to 8s’ x 8g, where 8s’ corresponds to D4 +half-spinors and to the 8 first-generation fermion particles (electron, neutrino, red up quark, green up quark, blue up quark, red down quark, green down quark, blue down quark), so that the 64 red points correspond to an 8×8 matrix of the 8 first-generation fermion particles with respect to 8 Dirac gammas.

64 green points correspond to 8s” x 8g, where 8s” corresponds to D4 -half-spinors (mirror image to +half-spinors) and to the 8 first-generation fermion antiparticles, so that the 64 green points correspond to an 8×8 matrix of the 8 first-generation fermion antiparticles with respect to 8 Dirac gammas.

Note that the 24 yellow D4 + 24 purple D4* + 64 blue = 112 adjoint Spin(16) vertices are in some sense fundamentally bosonic, physically corresponding to gauge bosons or spacetime vectors, while the 64 red + 64 green = 128 half-spinor Spin(16) vertices are in some sense fundamentally fermionic, physically corresponding to fermion particles and antiparticles. This is characteristic of exceptional Lie algebras being constructed by combining adjoint-type and spinor-type representations.

To see how the 240 root vectors of E8 transform under rotation, I used a root vector rotation web applet by Carl Brannen, took a bunch of screen shots, and used them to make an image-sequence movie. There may be a little glitch about half-way through the 34-second movie (I may have messed up by hitting a reset button, or by taking screen shots a little off center, etc.), but to me it seems that, even so, the movie gives interesting visualization insights into how the 240 root vectors of E8 fit together to describe physics.

Click here to see the .mov movie.

Using the basic components described above, it is natural to construct a Lagrangian

with the 64 blue points (8-dimensional Kaluza-Klein spacetime) as base manifold

with the 24 D4 and 24 D4* yellow and purple points (Gravity and the Standard Model gauge groups) forming curvature terms

with the 64 red fermion particle and 64 green fermion antiparticle points forming fermion terms.

The blue 64 and red 64 and green 64 are related by Triality inherited from the Spin(8) triality among vectors, +half-spinors, and -half-spinors. Instead of using the triality for fermion generations, this model uses Triality to show a subtle supersymmetry between fermions and gauge bosons, seeing the gauge bosons as related to bivectors constructed from the blue 64 vectors, and using the Triality to relate them to the red 64 fermion particles and the green 64 fermion antiparticles.

In the interest of keeping this expository guest post somewhat simple, I will only mention in passing such things as that the 8-dimensional Kaluza-Klein spacetime is motivated by the work of Batakis, the second and third generations of fermions are composites of the first generation fermions, the Higgs mechanism comes from a geometric construction due to Meinhard Mayer, the force strengths and particle masses are calculated using structures related to bounded complex domains in the spirit of Armand Wyler, etc. For such details and more, as well as references, see my web page entitled E8, Cl(16) = Cl(8) (x) Cl(8), and Physics Calculation or the corresponding 82-page pdf version.

I will try to reply to comments here not only about the visualization movie, but also about any questions that might arise from the 82-page detailed paper.

The worst students, the best universities December 4, 2007

Posted by dorigo in mathematics, news, physics, science.
12 comments

An inflammatory title, but let me explain. Italy was recently ranked 35th in a list of countries ordered by the level of scientific knowledge of high school students. The list is led by Finland, with several eastern countries figuring very well, and Italy scoring quite unlike other European countries. Disappointing, to use a euphemism.

On the other hand, today I read very comforting news, only at first sight in contradiction with the former datum: as far as scientific studies are concerned, Italian universities figure very high in a ranking of excellence compiled by the Centre for Higher Education Development (CHE) in Gütersloh. In mathematics, the University of Rome-“Tor Vergata” is among the institutes “of excellence”, a very short list. Even better is the ranking of Physics institutes: here, among 24 chosen in the study, four are Italian: and among them, of course, Padova (with Firenze, Pisa, and Rome-“La Sapienza”).

The parameters which were used to rank institutes were the number of scientific publications in the period 1997-2004, the number of citations, the presence of researchers among the most cited in Europe, and the use of European funds from the “Marie Curie” program, a program to favor the mobility of researchers. Apparently, my presence in Padova did not negatively affect the outcome ;-)