
Variation found in a dimensionless constant! April 1, 2009

Posted by dorigo in cosmology, mathematics, news, physics, science.

I urge you to read this preprint by R. Scherrer (from Vanderbilt University), which appeared yesterday on the arXiv. It is titled “Time variation of a fundamental dimensionless constant”, and I believe it might have profound implications for our understanding of cosmology, as well as of theoretical physics. I quote the incipit of the paper below:

“Physicists have long speculated that the fundamental constants might not, in fact, be constant, but instead might vary with time. Dirac was the first to suggest this possibility [1], and time variation of the fundamental constants has been investigated numerous times since then. Among the various possibilities, the fine structure constant and the gravitational constant have received the greatest attention, but work has also been done, for example, on constants related to the weak and strong interactions, the electron-proton mass ratio, and several others.”

Many thanks to Massimo Pietroni for pointing out the paper to me this morning. I am now collecting information about the study, and will update this post shortly.

Ten photons per hour March 23, 2009

Posted by dorigo in astronomy, games, mathematics, personal, physics, science.

Every working day I walk about a mile in the morning from the train station to my physics department in Padova. I find it a healthy habit, but I sometimes fear it is also, in some sense, a waste of time: if I caught a bus, I could be at work ten minutes earlier. I hate losing time, so I sometimes use the walking time to set myself physics problems, trying to see whether I can solve them in my head. It is a way to exercise my mind while I exercise my body.

Today I was thinking about the night of stargazing I treated myself to last Saturday. I had gone to Casera Razzo, a secluded place in the Alps, and observed galaxies for four hours in a row with a 16″ Dobsonian telescope, in the company of four friends (and three other Dobs). One thing we had observed with amazement was a tiny speck of light coming from the halo of an interacting pair of galaxies in Ursa Major, the one pictured below.

The small speck of light shown in the upper left of the picture above, labeled as MGC 10-17-5, is actually a faint galaxy in the field of view of NGC3690. It has a visual magnitude of +15.7: this is a measure of its integrated luminosity as seen from the Earth. It is a really faint object, barely at the limit of visibility with the instrument I had. The question I ended up asking myself this morning was the following: how many photons per second did we get to see through the eyepiece from that faint galaxy?

This is a nice, simple question, but computing its answer in my head took me the best part of my walk. My problem was that I did not have a clue about the relationship between visual magnitudes and photon fluxes. So I turned to things I did know.

Some background is needed for those of you who do not know how visual magnitudes are computed, so I will make a small digression here. The scale of visual magnitude is a semi-empirical one, which sets the brightest stars at magnitude zero or so, and defines a decrease of luminosity by a factor of 100 for every five magnitudes of difference. The faintest stars visible with the naked eye on a moonless night are of magnitude +6, and that means they are about 250 times fainter than the brightest ones. On the other hand, Venus shines at magnitude -4.5 at its brightest -almost 100 times as bright as the brightest stars-, and our Sun shines at a visual magnitude of about -27, more than a billion times brighter than Venus. The magnitude difference between two objects is related to their brightness ratio by a power law: L_1/L_2 = 2.5^{M_2-M_1}; the factor 2.5 is an approximation of the fifth root of 100, and it corresponds to the brightness ratio of two objects that differ by one unit of visual magnitude.
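
If you want to play with the rule yourself, here is a minimal sketch of it in Python (the function and the example numbers are mine, just to illustrate the conversion):

    def brightness_ratio(m1, m2):
        # L1/L2 = 100**((m2 - m1)/5): five magnitudes correspond to a factor of 100
        return 100.0 ** ((m2 - m1) / 5.0)

    print(brightness_ratio(0.0, 6.0))     # brightest stars vs naked-eye limit: about 250
    print(brightness_ratio(-27.0, 15.7))  # Sun vs the +15.7 galaxy: about 10^17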

Ok, so we know how bright the Sun is. Now, if I could work out how many photons reach our eye from it every second, I would make some progress. I reasoned that I knew the value of the solar constant: that is, the power delivered by the Sun on an area of 1 square meter at the Earth's surface. I remembered a value of about 1 kilowatt (it is actually 1.366 kW, as I found out later on Wikipedia).

Now, how many photons of visible light arriving per second on that square meter of ground correspond to 1 kilowatt of power? I did not remember the energy of a single visible photon -I recalled it was in the electron-Volt range but I was not really sure- so I had to compute it.

The energy of a quantum of light is given by the formula E = h \nu, where h is Planck’s constant and \nu is the light frequency. However, all I knew was that visible light has a wavelength of about 500 nanometers (which is 5 \times 10^{-7} m), so I had to use the more involved formula E = hc/\lambda, where now c is the speed of light and \lambda is the wavelength. I remembered that h=6 \times 10^{-34} Js, and that c=3 \times 10^8 m/s, so with some effort I could get E=6 \times 10^{-34} \times 3 \times 10^8 / (5 \times 10^{-7}) = 4 \times 10^{-19} J, more or less.
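
The same back-of-the-envelope number takes a couple of lines of Python (with h rounded to 6.6 \times 10^{-34} Js):

    h = 6.6e-34        # Planck's constant, J*s
    c = 3.0e8          # speed of light, m/s
    lam = 500e-9       # wavelength of visible light, m
    E = h * c / lam    # energy of one visible photon, Joules
    print(E, E / 1.6e-19)   # about 4e-19 J, i.e. roughly 2.5 eV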

My brain was a bit strained by the simple calculation above, but I was relieved to get back an energy roughly equal to what I expected -in the eV range (one eV equals 1.6 \times 10^{-19} Joules, that much I do know).

Now, if the Sun delivers 1 kW of power, which is a thousand Joules per second, how many visible photons do we get? There is a subtlety here I did not even bother considering on my walk to the physics department: only about half of the power from the Sun is in the form of visible light, so one should divide that power by two. But I was unhindered by this in my order-of-magnitude walk-estimate. Of course, 1 kW divided by 4 \times 10^{-19} J makes 2.5 \times 10^{21} visible quanta of light per square meter per second.

Now, visual magnitude is expressed in terms of the amount of light hitting the eye. The pupil of a human eye has an area of about 20 square millimeters, which is 20 millionths of a square meter: so the number of photons you get by looking straight at the Sun (do not do it) is 1.2 \times 10^{14} per second. That’s a hundred trillion photons per second!

I was close to my goal now: the magnitude of the speck of galaxy I saw on Saturday is +15.7, the magnitude of the Sun is -27, so the difference is 43 magnitudes. This corresponds to 2.5^{43}, which you might throw up your hands at, until you realize that every 5 units of the exponent the number increases by 100, so you just do 100^{43/5} which is 100^{8.6} which is 10^{17.2}… Simple, isn’t it ?

Now, taking the number of photons reaching the eye from the Sun every second, and dividing by the ratio of apparent luminosities of the Sun and the galaxy, I could get N_{\gamma}=10^{14} / 10^{17} = 10^{-3}. One photon every thousand seconds!

Let me stress this: if you watch that patch of sky at night, the number of photons you get from that source alone is a few per hour! With my Dobsonian telescope, which collects almost 10,000 times more light, I could get a rate of a few tens of photons per second, and the detail was indeed detectable!

If you are interested in the exact number, which I worked out after reaching my office and the tables of constants in the PDG booklet, I computed a rate of N_{\gamma}=3.4 \times 10^{-3} photons per second with the unaided eye, and 22 per second through the eyepiece of the telescope. Without a telescope, that galaxy sends each of us about 10 photons per hour!

UPDATE: this post will remain as one clear example of how dangerous it is to compute in one's head! Indeed, somewhere in my order-of-magnitude conversions above I dropped a factor of 10^2 -which, mind you, is not horrible in numbers which have 20 digits or so; but when one wants to get back to reasonable estimates for reasonably small numbers, it does count a lot. So, after taking care of some other (more legitimate) approximations, if one computes things correctly, the number of photons from the galaxy seen with the unaided eye is more like two hundred per hour, and in the telescope it is about 350 per second.

What’s hot around February 10, 2009

Posted by dorigo in astronomy, Blogroll, cosmology, internet, italian blogs, mathematics, news, physics, science.

For lack of interesting topics to blog about, I refer you to a short list of bloggers who have produced readable material in the last few days:

  • The always witty Resonaances has produced an informative post on Quirks.
  • My friend David Orban describes the recently instituted Singularity University.
  • Stefan explains other types of singularities, those you can find in your kitchen!
  • Dmitry has an outstanding post out today about the physics of turbulence, with four mini-pieces on the Reynolds number, viscosity, universality and intermittency. Worth a visit, if only for the pics!
  • Marco discusses the long winter of the LHC. Sorry, in Italian.
  • Peter discusses the same issue in English.
  • Marni points out a direct explanation of the Pioneer anomaly with the difference between atomic clock time and astronomical time. Or, if you will, a change of the speed of light with time!

Guess the function! January 15, 2009

Posted by dorigo in mathematics, personal, physics, science.

I have a problem today -actually I’ve been fiddling with it for a couple of days now. Since it does not involve particles (at least not directly), I figured I’d bounce it off the mathematically inclined among you: maybe I will get an answer before I figure it out by myself!

The problem is simple: find a functional form that can be a good fit, with suitable parameters, to the following graph:

(These are the residuals of a Z lineshape fit to a relativistic Breit-Wigner function, by the way, but you need not bother with such unnecessary details).

As you can see, we have a negative asymptote and a positive asymptote with different values, and a central wiggle with different “widths” on the negative and positive sides. I have been trying several combinations like f(x) = atan(h(x))*g(x), where g(x) is a gaussian and h(x) is some kind of “warping factor” with a different slope on the negative and positive sides (with respect to x=91)… But I am getting nowhere. I am sure there is somebody out there with good advice, so please shoot!
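
(For concreteness, here is the kind of fitting I mean, sketched in Python/scipy rather than with my actual fitting code; the points below are only a toy stand-in, since the real residuals are not attached to this post:)

    import numpy as np
    from scipy.optimize import curve_fit

    # toy stand-in for the ~300 residual points: two different asymptotes plus a central wiggle
    rng = np.random.default_rng(0)
    x = np.linspace(71.0, 111.0, 300)
    toy = (0.03 * np.arctan(0.4 * (x - 91.0))
           - 0.05 * np.exp(-((x - 91.0) / 2.5) ** 2)
           + 0.01 + rng.normal(0.0, 0.002, x.size))

    def candidate(x, a, b, c, d, e):
        # an arctan for the two asymptotes plus a gaussian for the central wiggle
        return a * np.arctan(b * (x - 91.0)) + c * np.exp(-((x - 91.0) / d) ** 2) + e

    popt, pcov = curve_fit(candidate, x, toy, p0=[0.02, 0.5, -0.04, 3.0, 0.0])
    print(popt)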

UPDATE: Marius suggests a function in the comments thread below. I thank him for his input, but as is, the function f(x) = A atan(x) + B atan(x+C) + D does not work well: see the best fit below (parameters in the upper right legend are A, B, C, D as in the function suggested by Marius):

Maybe with suitable modifications this might work, though. Hmmmm…

UPDATE: Using the hint by Marius that the addition of another arctangent could account for the different height of the two asymptotes, I have cooked up a better fit:

This is better, but I am really not satisfied. The function has 11 free parameters -which is not too troublesome, since there are 300 points in the graph to fit anyway- but the function is UGLY:

p_0 atan[(p_1-x) e^{-((x-p_3)/p_4)^2}] + p_5 atan[p_6 (p_7-x)] + p_8 e^{-((x-p_9)/p_{10})^2}

Any further idea on how to improve it ?

Hmmm, and I should add that having 11 parameters is a curse for me, because what I am going to do after I have a reasonable functional form is to study the parameters as a function of Z rapidity (which modifies the original graph), and parameterize those 11 dependencies… I already have a headache!

UPDATE: Lubos makes a very good attempt with a simple ratio of polynomials in the comments thread, offering f(x) = p_0 (x-p_1)/(p_2 x^2 + p_3 x + p_4) (he even offers some eyeballed parameters). Nice try, but the problem is that the shape to be fitted seems to be quite irregular. If one fits the center region, Lubos’ function gives a good fit (see upper plot below); if one tries to extend the fit further out on the tails, however, the fit rapidly worsens (lower plot).

Despite the shortcomings, I think I will investigate some ways to fix the function offered by Lubos -it has the potential to describe the whole shape with few parameters, once tweaked a bit…

UPDATE: Lubos himself tried to mend the function he proposed above, by adding a hyperbolic tangent. The function fits the whole range better, but it still fails to catch the subtleties of the slopes… Here is a fit using his suggested parameters:

I think I will remove the hyperbolic tangent and work on some warping of the polynomial…

UPDATE: warping the x values above 91 GeV in Lubos’ polynomial with the function f(x)=-p_0+p_1 x +\sqrt{(p_1-1)^2 x^2 + p_0^2} seems to work. The result is below:

The fit is not extremely precise, but these are residuals from a Breit-Wigner, so I guess that multiplying this function by the original shape will give a more than adequate parametrization for my goals. Next up is obtaining 50 different fits like the one above, one for each interval in Z rapidity from 0 to 5.0, and parametrizing each of the seven parameters of the fits…
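
For reference, here is the warp function from the last update, in code. This is just a transcription of the formula given above; exactly how it gets composed with Lubos’ polynomial above 91 GeV is not spelled out here:

    import numpy as np

    def warp(x, p0, p1):
        # f(x) = -p0 + p1*x + sqrt((p1-1)^2 x^2 + p0^2): f(0)=0, slope ~ p1 near zero,
        # and f(x) ~ x - p0 far away (for p1 < 1), i.e. a smooth change of slope
        return -p0 + p1 * x + np.sqrt((p1 - 1.0) ** 2 * x ** 2 + p0 ** 2)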

Why is rapidity maximum at 0 ? May 15, 2008

Posted by dorigo in mathematics, physics, science.

In a recent post where I briefly discussed a rapidity distribution of muons produced in low-energy proton-proton collisions at the LHC, I formulated a conjecture: given parton densities in the colliding protons which are monotonically decreasing functions of the momentum fraction (and they are, for small fractions), the distribution of the rapidity y of the parton-parton collision has a maximum at y=0.

Of course, I was guided by experience: I know for a fact that unless you bias your data sample with requirements such as trigger selections or analysis cuts, any time you plot a rapidity distribution you obtain something symmetrical around zero. But it is one thing to have experience of the rule, and another to prove there are no exceptions.

So I set out to prove the rule. I was helped by a grad student in our group, Nicola Pozzobon… Without him, I would not have gotten past the long-forgotten rules of the delta function!

Demonstration: Take the parton distribution functions in the proton to be monotonically decreasing functions of the momentum fraction:

f(x), x \in [0,1[ , df/dx<0.

The rapidity of the collision can be defined as

y=0.5 \log (x_1/x_2)

and so to obtain the distribution of rapidity arising from a given distribution of the partons in the proton we have to integrate on x_1, x_2 as follows:

G(y) = \int dx_1 dx_2 f(x_1) f(x_2) \delta(y-0.5 \log(x_1/x_2)).

The delta function, which is zero everywhere except where its argument vanishes (and integrates to one), “picks up” the relevant values of f(x_1), f(x_2) capable of contributing to a given value of y.

Now, if we solve y=0.5 \log(x_1/x_2) for x_1 we find x_1=x_2 e^{2y}, so we can substitute in the expression for G and integrate in x_1, to find

G(y) = \int dx_2 f(x_2) f(x_2 e^{2y}) \cdot 2 x_2 e^{2y},

where we have used the property of the delta function which says that

\delta(h(x)) = \delta(x-x_0)/|h'(x_0)| (with x_0 the zero of h),

and the fact that

h(x_1, x_2)=y-0.5 \log(x_1/x_2)

from which

h'(x_1)=dh/dx_1=-1/(2x_1)=-1/(2x_2 e^{2y}),

or

h'(x_2)=dh/dx_2=1/(2x_2)=1/(2x_1 e^{-2y}).

If we had instead substituted x_2 we would have found

G(y) = \int dx_1 f(x_1) f(x_1 e^{-2y}) \cdot 2 x_1 e^{-2y},

so we can generically write, using x with no subscripts:

G(y) = \int dx f(x) f(x e^{\pm 2y}) \cdot 2 x e^{\pm 2y},

with the agreement that both expressions implied by the sign swaps have to be true simultaneously.

Upon differentiation with respect to y we find

G'(y) = \int dx f(x) [ \pm 4 x^2 e^{\pm 4y} f'(x e^{\pm 2y}) \pm 4 x e^{\pm 2y} f(x e^{\pm 2y})].

For y=0, the derivative of G is thus equal to

G'(0) = \pm 4 \int dx \; x f(x) [ x f'(x) + f(x)].

But these (the G’(0) with the upper sign and the G’(0) with the lower sign chosen) are two expressions that must simultaneously be correct: and since one is exactly the opposite of the other, they must both be zero. If the derivative of a function vanishes at a point, the function has an extremum (or at least a stationary point) there. QED.

We have thus proved that the rapidity distribution has an extremum at y=0 if the PDFs f(x) are monotonic functions of x. It would be easy to show that this is indeed a maximum, but I will be content with the above result for this post.
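
A quick numerical cross-check (a toy sketch, not part of the proof): draw x_1 and x_2 from a monotonically decreasing density on [0,1[ -here a Beta(1,3), i.e. f(x)=3(1-x)^2, chosen just for convenience- and histogram y = 0.5 \log(x_1/x_2). The most populated bin sits at y=0, and the histogram comes out symmetric around zero within fluctuations:

    import numpy as np

    rng = np.random.default_rng(0)
    x1 = rng.beta(1.0, 3.0, size=1_000_000)   # toy monotonically decreasing density on [0,1[
    x2 = rng.beta(1.0, 3.0, size=1_000_000)
    y = 0.5 * np.log(x1 / x2)

    counts, edges = np.histogram(y, bins=41, range=(-4.0, 4.0))
    imax = np.argmax(counts)
    print(0.5 * (edges[imax] + edges[imax + 1]))   # centre of the most populated bin: ~0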

So in conclusion, my conjecture is correct! If the parton distribution functions are taken to be monotonically decreasing, the rapidity distribution of the center-of-momentum of the collision does indeed have its maximum at zero… I usually prefer to use intuition, because the above demonstration took me more than an hour!

UPDATE: Hmmm, by thinking about the problem for a microsecond, rather than writing integrals and derivatives, I only now arrive at the following conclusion: since the rapidity distribution is symmetrical around zero – from its very definition – and since it must go to zero for y going to plus and minus infinity, it goes without saying that there is an extremum at zero, duh! However, it is still interesting to see how that arises mathematically. This also proves what I know for a fact: a bit of analysis is worth more than a megabyte of programming – and the same can be said of thinking versus grinding through formulas!

Updated Mw-Mt Higgs search plot from Sven May 14, 2008

Posted by dorigo in mathematics, science.

Since I am currently preparing the slides of the talk I will give next week at PPC2008, a conference being held in Albuquerque on the interconnection between particle physics and cosmology, I have my hands full with material that would be perfect for this blog. The talk is a review of new results from the CDF experiment, and there is literally a ton of them! What makes it hard for me is to sort out the stuff that is really the very best and most worthy of being shown at the particular event.

I am however not posting direct CDF results here, but rather a plot that my friend Sven Heinemeyer was kind enough to produce following my directions, meant to let me summarize in my talk the status of electroweak fits by separating the main contributions of the experimental measurements. In the graph below, showing the dependence of the Higgs boson mass on the values of the W boson and top quark masses, you can see several different regions highlighted with black, blue, and magenta lines. The black lines bracket the LEP II determination of the W mass; the blue ellipse describes the Tevatron measurements of the two parameters, and the magenta hatched “wing profile” area shows the allowed values of the two quantities according to electroweak fits performed using LEP I and SLD determinations of electroweak parameters.

Also shown in the plot is the SM-allowed range (in red), where the Higgs boson has a mass varying between the lower LEP II limit of 114.4 GeV (upper border of the red hatched area) and 400 GeV (lower border), and the SUSY allowed region (hatched green), which shows the zone allowed by different choices of some of the many SUSY parameters, in particular the mass of supersymmetric particles.

Now, let me make a few points concerning the plot above.

  1. Although precise, the indirect experimental input shown in the plot is still incapable of discriminating between the SM and SUSY – and it probably never will be by itself, since the LHC will rule out or find SUSY well before the ellipse shrinks sizably (ok, ok, I am neglecting the possibility of split SUSY…)
  2. the celebrated LEP I / SLD data looks obsolete from this particular vantage point, in light of the more recent direct measurements; this would however be an unfair interpretation, given that electroweak fits have many more parameters than just W and top quark masses.
  3. The LEP I / SLD data is obsolete as far as the top quark is concerned: in the plot it does not even appear to constrain the top mass, if compared with the ultra-precise (\pm 0.8%) Tevatron determination!
  4. The top quark mass has been bouncing up and down a bit, although always well within errors, in the last 5 years, from 178 to 170 to 172.4 GeV. This has slightly moved the preferred value of the fitted Higgs mass in the SM up and down. However, as the ellipse shrinks, this is becoming less of an issue. In fact, to justify the effort of producing the best possible top mass measurement, we used to say that a 1 GeV precision on the top mass was equivalent to a 7 MeV precision on the W mass as far as the knowledge we would obtain on M_H was concerned, based on the slope of the Higgs contours in the plot above. Now that the error on the top mass is well below 2 GeV, however, it becomes clear that we will not gain much knowledge by increasing the precision much further. The W mass has become one of the main players in the game of precision SM fits now!
  5. The ellipse includes 68% of the probability content of the two-dimensional gaussian centered on the M_W-M_t determination, just as the black bars do for their single variable, but being two-dimensional it is deceiving: the single most precise determination of the W boson mass is in fact from CDF, and the Tevatron and LEP II are basically at the same level of precision on that quantity!

Why is the comparison deceiving? Because if you have one single quantity, you determine the 68% interval by integrating a gaussian distribution from its center outwards, until you “cover” 68% of its total integral (from -inf to +inf). If you add a dimension and make it a two-dimensional gaussian shape, the 68% bounds only stay the same as long as you keep integrating over a band; if instead you integrate by expanding an ellipse around the center, they do not. The ellipse encompasses values of the 2-dimensional distribution which have the same “probability density”, but in so doing it “cuts the corners”, and to total 68% of the 2-dim integral it has to extend past the one-dimensional 68% boundaries in each of the two variables. A sketch will clarify matters:

Well, not exactly “clarified”… But I have no time to make the graph easier to understand. The point is that the ellipse “cuts” only a part of the band in each direction, and so the integral of the 2-dimensional distribution it comprises is much smaller than that contained in the band. To make the ellipse include 68% of the 2-dimensional distribution constructed with the two gaussian curves, one has to widen its axes by roughly 50%.

So, paradoxically: if LEP II had a determination of the top quark mass too, the band bracketed by the two black lines in the plot by Heinemeyer would convert into an ellipse which would be about as wide in the vertical direction as the Tevatron blue ellipse.

Not convinced? Oh well. Think of it this way: with a single measurement of the W mass, you say “the probability that the mass is between 80.35 and 80.45 GeV is 68%, because I determined it to be 80.4 and I have an error of 0.05 GeV”. Fine: the gaussian distribution, if integrated from -1 sigma to +1 sigma, provides 68% of its total normalization. The same goes if you claim that, having measured the top mass at 172.4 \pm 1.4 GeV, there is a 68% chance that it lies in the interval 171.0-173.8 GeV. However, if you ask what is the probability that the W mass is between 80.35 and 80.45 GeV AND the top mass is between 171.0 and 173.8 GeV, this is much smaller than 68%, because independent probabilities multiply: it is, in fact, only about 46%; and this corresponds to the square drawn around the circle in the graph! The probability that the two values lie in the ellipse whose semi-axes equal the 1-sigma half-widths is even smaller, about 39%.
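
If you want to check these numbers, a few lines of Python reproduce them for two independent gaussian measurements:

    import numpy as np
    from scipy.stats import norm

    one_d = norm.cdf(1.0) - norm.cdf(-1.0)        # ~0.683: the one-dimensional 68% band
    square = one_d ** 2                           # ~0.466: both variables inside their own bands
    ellipse = 1.0 - np.exp(-0.5)                  # ~0.393: inside the ellipse with 1-sigma semi-axes
    scale = np.sqrt(-2.0 * np.log(1.0 - one_d))   # ~1.5: factor by which the axes must grow to reach 68%
    print(one_d, square, ellipse, scale)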

The bottom line? Whenever you look at a plot with two measurements, one described by an ellipse and the other by a band, always regard the ellipse with more respect than it seems to deserve!

Guest post – Jeff Wyss: The Relativistic Train April 30, 2008

Posted by dorigo in Blogroll, mathematics, physics, science.

Jeff is a physics professor at the University of Cassino, and a long-time colleague and friend of mine. He worked in the SLD and CDF collaborations as a particle physicist, but later moved on to study radiation damage on silicon detectors for particle and astroparticle applications.

Besides admiring him for his wicked sense of humor, which he uses to make the workplace around him always a pleasant place, I have the highest esteem for Jeff as a professor, because he is quite skilled at explaining physics concepts in simple terms. He always looks for the most intuitive way to understand things, as you might appreciate in the contribution he offers below.

The following describes a very elegant and simple derivation of the relativistic formula for the addition of velocities, w = (u+v)/(1 + uv/c^2).

It is due to David Mermin. I fell in love with it and have been telling it for the past four years now to the students of my general physics course. The students are first-year telecommunications and electrical engineering students. Before sitting in on my course all of them have heard about Einstein and most of them have heard the expression “the velocity of light is constant”. I do not have the time to discuss special relativity in detail. My course is quite traditional. I discuss reference frames, inertial frames, Galilean transformations and the covariance of Newton’s laws. I then point out that when describing mechanical waves the frame that is stationary with respect to the medium is a special reference frame! In particular, the wave motion can be made to disappear by moving with respect to the medium with a velocity equal to that of the wave. It is clear at this point that the constancy of the velocity of light cannot be understood by assuming Newton’s laws and then modeling light as a mechanical wave in a medium (the ether). I then restate the constancy of the velocity of light and begin Mermin’s derivation.

The derivation uses:

  • only one reference frame (no use of Lorentz transformations),
  • simple kinematics (always good to brush up on),
  • the constancy of the velocity of light (something that every telecommunications and electrical engineering student should know),
  • the idea that some things are invariant; i.e. while many quantities are relative, observers will agree on some absolutes.

Consider a train of length L moving along the x-axis at a constant velocity v with respect to an inertial frame of reference (the observer watching the events unfold). At the trailing end of the train a loaded gun is aimed in the forward direction and fired at time t=0: the bullet and the flash of light emerge and travel in the forward direction with different speeds: w the velocity of the bullet, c the velocity of light. A mirror at the front end of the train reflects the light back towards the advancing bullet. Let f be the fraction of the length of the train that the reflected light travels before meeting up with the bullet. The constancy of light (Einstein’s dictum) tells us that the velocity of light in the forward direction is equal to the velocity of light in the backward direction; i.e. c_F = c_B = c.

The space-time plot looks like this:

Let t_F be the time for the light flash to reach the forward-going mirror and t_B be the time the reflected light needs to return from the mirror and meet up with the forward-moving bullet. Simple kinematics allows us to label the space-time plot:

Simple algebra:
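
(The labeled space-time plot and the algebra were shown as figures, which are not reproduced here; what follows is a reconstruction using only the quantities defined above.) The flash reaches the mirror when c t_F = L + v t_F, i.e. t_F = L/(c-v). At that moment the bullet trails the light by a gap (c-w) t_F, which is then closed at the rate c+w, giving t_B = t_F (c-w)/(c+w). Relative to the train, the reflected light moves backwards at c+v, so the fraction of the train length it covers before meeting the bullet is

f = (c+v) t_B / L = (c+v)(c-w) / [(c-v)(c+w)].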

It is important to note that the expression for f we just obtained is valid if the velocities of light in the forward and backward directions are equal. Note:

  • A classical pre-Einstein physicist would say this expression is valid only if the observer is stationary with respect to the ether frame.
  • On the other hand Einstein says that any inertial observer would use the same velocity of light; i.e. Einstein tells us that this expression is valid for any observer (generic inertial frame).

Following Einstein we consider a particular observer (frame), one that is moving along with the train. For this observer the velocity of the train is v = 0. For clarity let us use the symbol u to indicate the velocity of the bullet with respect to this observer; i.e. with respect to the train.

Suppose the train has 10 windows and the reflected light and the bullet meet up at the third window from the front (f=0.3). It is important to realize that all observers will agree on the value of f. The fraction f is an invariant!

The constancy of the velocity of light allows us to impose the invariance of f the following way:
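
(Again, the original equations were in a figure; here they are reconstructed from the definitions above.) For the observer riding the train, v=0 and the bullet moves at u, so the same expression gives f = (c-u)/(c+u). Equating the two expressions for f,

(c-u)/(c+u) = (c+v)(c-w) / [(c-v)(c+w)],

and solving for w yields w = (u+v)/(1 + uv/c^2).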

Q.E.D. !

Correcting the CMS momentum scale April 29, 2008

Posted by dorigo in mathematics, personal, physics, science.

I have wanted to write some version of the present post for a while, so it is a relief to publish it at last. In fact, it is rather strange to have completely avoided discussing on this blog the problem in which I have invested the best part of my research time over the last three months -plus a fair share of last year’s thinking- and it was due time that I filled that void somehow.

Unfortunately, strange as it may seem, there are topics in my research activities that are hard to explain in simple terms. The problem I have been working on is not difficult to state, nor too difficult to solve, but it is extremely complicated and varied, so that a comprehensive description is challenging. However, I want to make an attempt…

The problem I have been dealing with, together with a small and focused group of bright colleagues (Sara Bolognesi, Marco De Mattia, and Chiara Mariotti: lads from Padova and ladies from Torino University) is that of calibrating the momentum of charged tracks detected by the CMS experiment at CERN.

After being produced in a proton-proton collision in the core of CMS, charged particles have their position measured in a dozen layers of silicon detector before they hit the calorimeter system; the few penetrating ones surviving the encounter with trillions of heavy nuclei are also detected by the large set of muon chambers situated outside the calorimeters. With the information provided by the silicon detectors -and, for muon candidates, by the muon system- a very performant and refined software algorithm reconstructs and fits the trajectory of the track, providing a measurement of the five parameters describing its helix; most notably the curvature \rho inside the solenoid, which yields a precise determination of the transverse momentum through the formula P_t = 0.3 B / \rho (where B is the magnetic field intensity -about 4 Tesla- and P_t is the transverse momentum).

There are a number of reasons why a precise determination of the momentum of charged tracks is crucial. Let me just flash a few:

  1. Charged particles are measured with better precision than neutral ones, and a careful determination of their momentum allows us, in turn, to calibrate other parts of the detector.
  2. Some physics measurements such as the mass of the W boson rely heavily on track momentum.
  3. The identification of a high-mass resonance -say a new Z’ boson- may require the reconstruction of its Z' \to \mu \mu decay, and a scale error on the momentum of those high-energy tracks translates into a worse resolution on the Z’ mass, and a diminished discovery reach.
  4. B-physics crucially needs charged tracks to be precisely reconstructed in order for exclusive B decays to be extracted from backgrounds.

So how do we do it ?

We use resonances. A few neutral particles -vector mesons and the Z boson- decay to pairs of muons, and they can thus be extracted with small backgrounds from the data (events with two muons are easy to collect with CMS, and muons have the benefit that they are “perfect” tracks in several ways). We know the mass of these particles with great accuracy, thanks to previous experiments:

  • The Z boson mass is known to be 91.1876 \pm 0.0021 GeV, a 0.0023% precision.
  • The Y(1S), the ground state of the (b \bar b) vector meson family, has its mass known as 9460.30 \pm 0.26 MeV, a 0.0028% measurement.
  • The Y(2S) mass is 10.02326 \pm 0.00031 GeV, a 0.0031% measurement.
  • The Y(3S) mass is 10.3552 \pm 0.0005 GeV, a 0.005% measurement.
  • The J/Psi, the ground state of the (c \bar c) vector meson family, has its mass known as 3096.916 \pm 0.011 MeV, a 0.0004% measurement.
  • The Psi(2S) has mass 3686.093 \pm 0.034 MeV, a 0.001% measurement.

All the above particles are easy to trigger on, collect, reconstruct, and measure. With CMS we expect to collect thousands of these decays every day of running. Their mass can be measured on an event-by-event basis by reconstructing the momenta of the two muons they decay into, using the relativistic equation

M = \sqrt{ (\Sigma E)^2 - (\Sigma \vec{P})^2}

where M is the resonance mass, E is the muon energy, and P is the muon momentum vector.

By comparing the average mass of each reconstructed resonance to the reference values above, we get to know the scale of our momentum measurement, S = M_{true}/M; every time we measure a momentum P we then do P' = SP, forget P, use P’, and we are done. Easy enough, wouldn’t you agree?
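
In code, the whole recipe is just a few lines (a toy sketch: the muon momenta below are made up, in GeV, and the muon mass is neglected):

    import numpy as np

    M_REF = 91.1876                        # reference Z mass, GeV
    p1 = np.array([40.0, 20.0, 10.0])      # made-up muon momentum vectors, GeV
    p2 = np.array([-38.0, -22.0, 15.0])

    def dimuon_mass(p1, p2):
        E = np.linalg.norm(p1) + np.linalg.norm(p2)   # massless approximation: E = |p|
        p = p1 + p2
        return np.sqrt(E**2 - p.dot(p))

    S = M_REF / dimuon_mass(p1, p2)        # scale factor S = M_true / M
    p1_corr, p2_corr = S * p1, S * p2      # P' = S P: forget P, use P'
    print(S)                               # ~1.03 for these made-up momenta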

Sure. Easy enough. But actually kind of lame. With the millions of dimuon resonances we collect, can’t we do something better? Our detector is, in fact, a quite complicated set of devices. The momentum scale -or, to be precise, the bias on the momentum measurement- depends on very subtle effects, such as tiny distortions in the magnetic field generated by the 4-Tesla solenoid, occasional misalignment (by a few microns, that is) of one of the thousands of silicon sensors, or erratic behavior of the reconstruction algorithm in very particular regions of the detector. We can, and we must, check the bias on our measured momentum more closely, because doing so in turn gives us a chance to verify the B field map, check the alignments, and validate the reconstruction code.

In the simplified recipe described above to determine a corrected momentum P’, you might have noticed that we used the invariant mass of the two muons making up the resonance, rather than each muon separately. Indeed, the decaying particle is not produced at rest in the laboratory frame of reference, so we cannot expect the two muons to share their parent’s energy evenly, M/2 each. Only by combining their momenta can we get a number to compare to the reference value. Or is there a smarter way?

There is a smarter way. Strangely enough, to my knowledge it has not been used in the past for this application. Let me explain in short what it is. I will try to make this as simple as possible, but not simpler – in Einstein’s style.

The formula for the relativistic mass above involves the energy and momentum -or better, if you allow a slip into special relativity jargon, the four-momentum. We can, in purely symbolic terms, write:

M = f [P_1(x_1, x_2, ..., x_i), P_2(x_1, x_2, ..., x_i)]

where we have made explicit the fact that the computed invariant mass is a function f of the four-momenta P_1, P_2 of the two muons, and that each of the two four-momenta is in turn a function of many (i, in the formula) other variables, collected in two i-dimensional vectors x. These variables are the measured characteristics of the track: its angles, the region of the detector it crosses, its electric charge, you name it.

Still here? Ok. The next step is to realize that what we would really love to have is a measurement of the momentum as a function of the particular characteristics x of the track, and not just P=0.3 B/\rho, which only depends on the curvature \rho. Through a knowledge of P=P(x_i) we would become sensitive to the effects mentioned above -B field distortions, alignment errors, reconstruction biases.

There is a simple way: we can compute the probability that we observe a mass M, if the reference value is M_{true}, as a function of the measured quantities x_i of each muon, by assuming a functional form for the way the momentum P depends on the parameters. So let us write:

M = f [g_1(\vec{x};\vec{\alpha}), g_2(\vec{x}; \vec{\alpha})]

where the new function g( ) describes how the momenta vary with the vector of measured track parameters \vec{x}, and \vec{\alpha} is a vector of unknown variables describing the function g( ).

(To let you understand what the heck I am talking about, assume that your detector measures a track momentum with a bias depending on momentum itself:

P = g(\vec{x};\vec{\alpha}) = x_1 \times (\alpha_1 + \alpha_2 \times x_1),

with x_1=P, and \alpha_1 = 0.998, \alpha_2 = 0.0002. This function describes momenta which are underestimated by 0.2% for small P, correctly estimated for P=10 GeV, and overestimated by 1% for every additional increase of P by 50 GeV. )
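
In code, that toy bias is simply (a direct transcription of the example; the function name is of course arbitrary):

    def g(p, alpha1=0.998, alpha2=0.0002):
        # toy bias of the example above: the factor is exactly 1.000 at p = 10 GeV
        return p * (alpha1 + alpha2 * p)

    print(g(1.0))    # 0.9982: about 0.2% underestimate at small momentum
    print(g(10.0))   # 10.0: unbiased here
    print(g(60.0))   # 60.6: about 1% overestimate, 50 GeV above the crossing point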

Using the parametrization, we compute for each event the measured mass as a function of the variables \alpha. With these numbers we finally form a likelihood function:

L = -\Sigma \log[Prob(M(x,\alpha))]

which of course implicitly depends on the functional form we have chosen for g. By minimizing L (that is, maximizing the likelihood) as a function of the parameters, we obtain their most likely values, and we are done: we get to know how our track momentum depends on its characteristics \vec {x}.
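
To make the machinery concrete, here is a drastically simplified toy version in Python: a single overall scale factor in place of the full vector of bias parameters, a gaussian in place of the Breit-Wigner convolution, and made-up numbers throughout:

    import numpy as np
    from scipy.optimize import minimize_scalar

    M_Z, sigma = 91.1876, 2.0                    # reference mass and an assumed resolution, GeV
    rng = np.random.default_rng(1)
    true_scale = 0.998                           # injected momentum-scale bias
    masses = rng.normal(true_scale * M_Z, sigma, size=10_000)   # toy measured dimuon masses

    def nll(alpha):
        # negative log-likelihood of the measured masses for a scale hypothesis alpha
        return np.sum(0.5 * ((masses - alpha * M_Z) / sigma) ** 2)

    fit = minimize_scalar(nll, bounds=(0.9, 1.1), method="bounded")
    print(fit.x)                                 # comes out close to the injected 0.998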

In the discussion above I have not put much emphasis on the fact that the true form of the “bias function” g( ) is not known. One can in fact test different hypotheses with the data, and the value of the likelihood will be a measure of how well they describe the experimental situation. There’s more: the likelihood can be studied as a function of each of the components of the vector x, allowing one to spot biases which require a more subtle parametrization.

The above discussion is a simplified view of the problem: In reality, things are much more complicated. Here is a short list of details I hid under the carpet above:

  • We model the probability to observe a given mass in the likelihood function by convolving a Lorentzian function (the Breit-Wigner, which is the true form of the mass distribution of the resonance) with a gaussian resolution function; the gaussian has parameters \vec {\beta} which also get fit simultaneously with the bias parameters \vec {\alpha}. The figure below shows the probability distribution function of a measurement of mass M and resolution \sigma for a Z boson: for each point in the plane, defined by the two values (M, \sigma), the probability is the height of the surface. Notice how the probability grows as the resolution increases, for values of the mass very far from the true resonance mass M_Z=91 GeV (for instance, for a mass of 71 GeV, the left boundary of the surface), while the opposite happens for values of the mass close to it.

  • the fitter also assumes a functional form for the background (which is unavoidably included in the dataset containing the resonances), and fits it together with the bias and resolution parameters;
  • Each of the six considered resonances can be fit individually, or all together. The window around the peaks defining events used or not used in the computation requires an optimization;
  • The fitter iterates the whole procedure several times: after the bias parameters are extracted, momenta get corrected, and a new parameter extraction must return values which are compatible with no bias.
  • And so on…

The algorithm is indeed quite complicated. I spent the last three months implementing the fitting of resolution and background; it is not yet complete, but it now works well. It is particularly satisfactory to be able to launch the program on a set of resonances, and extract all at once not just the parameters that allow momenta to be corrected, but also a precise estimate of the momentum resolution as a function of track kinematics – something that would once have required detailed studies with simulations. All is now squeezed out of the data!

The work is far from over. With the help of my colleagues, we will test the code on a very large sample of simulated events in the next few months, to be ready for the data which will hopefully start pouring in this fall… But the work will only be started then: we plan to fit chunks of data on a monthly basis, checking the stability of the detector and the track reconstruction, and producing a correction function to be used by all analyses in need of a precise momentum measurement… It really is a long-term plan!

The Corfu 2005 proceedings online April 10, 2008

Posted by dorigo in astronomy, books, games, humor, internet, language, mathematics, music, news, personal, physics, politics, science, travel, Uncategorized.

Just a note to post here the permanent link to the proceedings of a conference I attended in Corfu (Greece) three years ago. This is a long (32-page) report on “High-P_T Physics: from the Tevatron to the LHC”, now published in the Journal of Physics: Conference Series [Tommaso Dorigo 2006 J. Phys.: Conf. Ser. 53 163-194]. I think I did post a draft of the paper on this blog a couple of years ago, but then I forgot to post the final version as well.

The paper is a bit dated in some parts, where the most recent (back then) results from the Tevatron are discussed; however, some parts -especially a discussion of the usefulness of Tevatron data for LHC physics- are still readable IMHO. Also worth noting is the fact that the acknowledgments section mentions the late Riqie Arneberg, a friend who passed away last fall, who had accepted the offer I had made to all readers of this blog to proofread the manuscript, and contributed in several places to the clarity of the text.

The publisher has now made available online all its 100 open access volumes through the JPCS home page. Of course I salute this contribution to the free diffusion of science with enthusiasm.

The say of the week March 10, 2008

Posted by dorigo in Blogroll, games, humor, mathematics, physics, science.

“At least 99% of the 10^500 possible vacua are complete garbage and can be ruled out easily. Thus, the regions of the landscape for which realistic vacua may arise is limited.”

Eric (string theory enthusiast)
