## Correcting the CMS momentum scale April 29, 2008

Posted by dorigo in mathematics, personal, physics, science.

I have wanted to write some version of the present post for a while, and so it is a relief to publish it at last. In fact, it is rather strange to have completely avoided discussing in my blog the problem I have invested the best part of my research time in the last three months -plus a fair share of last year’s thinking-, and it was due time that I filled that void somehow.

Unfortunately, strange as it may seem, there are topics in my research activities that are hard to explain in simple terms. The problem I have been working on is not difficult to state, nor too difficult to solve, but it is extremely complicated and varied, so that a comprehensive description is challenging. However, I want to make an attempt…

The problem I have been dealing with, together with a small and focused group of bright colleagues (Sara Bolognesi, Marco De Mattia, and Chiara Mariotti: lads from Padova and ladies from Torino University) is the one of calibrating the momentum of charged tracks detected by the CMS experiment at CERN.

After being produced in a proton-proton collision in the core of CMS, charged particles have their position measured in a dozen layers of silicon detector before they hit the calorimeter system; the few penetrating ones surviving the encounter with trillions of heavy nuclei are also detected by the large set of muon chambers situated outside them. With the information provided by the silicon detectors -and, for muon candidates, by the muon system- a very performant and refined software algorithm reconstructs and fits the trajectory of the track, providing a measurement of the five parameters describing the helical trajectory; most notably, the curvature $\rho$ inside the solenoid, which yields a precise determination of transverse momentum through the formula $P_t = 0.3 B / \rho$ (where B is the magnetic field intensity -about 4 Tesla- and $P_t$ is transverse momentum).

There are a number of reasons why a precise determination of the momentum of charged tracks is crucial. Let me just flash a few:

1. Charged particles are measured with a better precision than neutral ones, and a careful determination of their momentum allows to calibrate in turn other parts of the detector.
2. Some physics measurements such as the mass of the W boson rely heavily on track momentum.
3. The identification of a high-mass resonance -say a new Z’ boson- may require the reconstruction of its $Z' \to \mu \mu$ decay, and a scale error on the momentum of those high-energy tracks translates in a worse resolution in the Z’ mass, and a diminished discovery reach.
4. B-physics crucially needs charged tracks to be precisely reconstructed in order for exclusive B decays to be extracted from backgrounds.

So how do we do it ?

We use resonances. A few neutral particles -vector mesons and the Z boson- decay to pairs of muons, and they can thus be extracted with small backgrounds from the data (events with two muons are easy to collect with CMS, and muons have the benefit that they are “perfect” tracks in several ways). We know the mass of these particles with great accuracy, thanks to previous experiments:

• The Z boson mass is known to be $91.1876 \pm 0.0021 GeV$, a 0.023% precision.
• The Y(1S), the ground state of the $(b \bar b)$ vector meson family, has its mass known as $9460.30 \pm 0.26 MeV$, a 0.0028% measurement.
• The Y(2S) mass is $10.02326 \pm 0.00031 GeV$, a 0.0031% measurement.
• The Y(3S) mass is $10.3552 \pm 0.0005 GeV$, a 0.005% measurement.
• The J/Psi, the ground state of the $(c \bar c)$ vector meson family, has its mass known as $3096.916 \pm 0.011 MeV$, a 0.0004% measurement.
• The Psi(2S) has mass $3686.093 \pm 0.034 MeV$, a 0.001% measurement.

All the above particles are easy to trigger on, collect, reconstruct, and measure. With CMS we expect to collect thousands of these decays every day of running. Their mass can be measured on a event-by-event basis by reconstructing the momentum of the two muons they decayed into, using the relativistic equation

$M = \sqrt{ (\Sigma E)^2 - (\Sigma \vec{P})^2}$

where M is the resonance mass, E is the muon energy, and P is the muon momentum vector.

By comparing the average mass of each reconstructed resonance to the reference values above, we get to know the scale of our momentum measurement, $S = M_{true}/M$; every time we measure a momentum $P$ we then do $P' = SP$, forget P, use P’, and we are done. Easy enough, wouldnt’ you agree ?

Sure. Easy enough. But actually kind of lame. With the millions of dimuon resonances we collect, can’t we do something better ? Our detector is, in fact, a quite complicated set of devices. The momentum scale -or, to be precise, the bias on the momentum measurement- depends on very subtle effects, such as tiny distorsions in the magnetic field generated by the 4-Tesla solenoid, occasional mis-alignment (by a few microns, that is) of one of the thousands silicon sensors, erratic behavior of the reconstruction algorithm in very particular regions of the detector. We can, and we must, check the bias on our measured momentum more closely, because it in turn gives us a chance to verify the B field map, check the alignments, validate the reconstruction code.

In the simplified formulas described above to determine a corrected momentum P’, you might have noticed that we used the invariant mass of the two muons making the resonance, rather than each muon separately. Indeed, the decayed particle is not produced at rest in the laboratory frame of reference, so we cannot expect that the two muons share evenly their parent’s energy, M/2 each. Only by combining their momenta can we get a number to compare to the reference value. Or is there a smarter way ?

There is a smarter way. Strangely enough, to my knowledge it has not been used in the past for this application. Let me explain in short what it is. I will try to make this as simple as possible, but not simpler – in Einstein’s style.

In the formula for the relativistic mass above enters the energy and momentum -or better, if you allow a slip into special relativity jargon, quadrimomentum. We can, in purely symbolic terms, write:

$M = f [P_1(x_1, x_2, ..., x_i), P_2(x_1, x_2, ..., x_i)]$

where we have made explicit the fact that the computed invariant mass is a function f of the quadrimomenta $P_1, P_2$ of the two muons, and that each of the two quadrimomenta is in turn a function of many (i, in the formula) other variables, collected in two i-dimensional vectors x . These variables are the measured characteristics of the track: its angles, the region of the detector it crosses, its electric charge, you name them.

Still here ? Ok. The next step is to realize that what we really would love to have is a measurement of the momentum as a function of the particular characteristics x of the track, and not just $P=0.3 B/\rho$, which only depends on the curvature $\rho$. Through a knowledge of $P=P(x_i)$ we could get sensitive to the effects mentioned above -B field distorsions, alignment errors, reconstruction biases.

There is a simple way: we can compute the probability that we observe a mass M, if the reference value is $M_{true}$, as a function of the measured quantities $x_i$ of each muon, by assuming a functional form for the way the momentum P depends on the parameters. So let us write:

$M = f [g_1(\vec{x};\vec{\alpha}), g_2(\vec{x}; \vec{\alpha})]$

where the new function g( ) describes how the momenta vary with the vector of measured track parameters $\vec{x}$, and $\vec{\alpha}$ is a vector of unknown variables describing the function g( ).

(To let you understand what the heck I am talking about, assume that your detector measures a track momentum with a bias depending on momentum itself:

$P = g(\vec{x};\vec{\alpha}) = x_1 \times (\alpha_1 + \alpha_2 \times x_1)$,

with $x_1=P$, and $\alpha_1 = 0.998$, $\alpha_2 = 0.0002$. This function describes momenta which are underestimated by 0.2% for small P, correctly estimated for P=10 GeV, and overestimated by 1% for every additional increase of P by 50 GeV. )

Using the parametrization, we compute for each event the measured mass as a function of the variables $\alpha$. WIth these numbers we finally form a likelihood function:

$L = -\Sigma[log(Prob(M(x,\alpha))]$

which of course implicitly depends on the functional form we have chosen for g. By maximizing L as a function of the parameters, we obtain their most likely values, and we are done: we get to know how our track momentum depends on its characteristics $\vec {x}$.

In the discussioon above I have not given much emphasis on the fact that the true form of the “bias function” g( ) is not known. One can in fact test different hypotheses with the data, and the value of the likelihood will be a measure of how well they describe the experimental situation. There’s more: the likelihood can be studied as a function of each of the components of the vector x, allowing to spot biases which require a more subtle parametrization.

The above discussion is a simplified view of the problem: In reality, things are much more complicated. Here is a short list of details I hid under the carpet above:

• We model the probability to observe a given mass in the likelihood function by convoluting a Lorentzian function (the Breit-Wigner, which is the true form of the mass distribution of the resonance) with a gaussian resolution function; the gaussian has parameters $\vec {\beta}$ which also get fit simultaneously with the bias parameters $\vec {\alpha}$. The figure below shows the probability distribution function of a measurement of mass M and resolution $\sigma$ for a Z boson: for each point in the plane, defined by the two values $(M, \sigma)$, the probability is the height of the surface. Notice how the probability grows as the resolution increases, for values of mass very far from the true resonance mass $M_Z=91 GeV$ (for instance, for a mass of 71 GeV-the left boundary of the surface), while the opposite happens for values of mass close to it.

• the fitter also assumes a functional form for the background (which is unavoidably included in the dataset containing the resonances), and fits it together with the bias and resolution parameters;
• Each of the six considered resonances can be fit individually, or all together. The window around the peaks defining events used or not used in the computation requires an optimization;
• The fitter iterates several times the whole procedure: after bias parameters are extracted, momenta get corrected, and a new parameter extraction must return values which are compatible with no bias.
• And so on…

The algorithm is indeed quite complicated. I spent the last three months implementing the fitting of resolution and background, and the algorithm is not yet complete but it now works well. It is particularly satisfactory to be able to launch the program on a set of resonances, and extract all at once not just the parameters that allow momenta to be corrected, but also a precise estimate of momentum resolution as a function of track kinematics – something that would once require detailed studies with simulations. All is now squeezed out of the data!

The work is far from over. With the help of my colleagues, we will test the code on a very large sample of simulated events in the next few months, to be ready for the data which will hopefully start pouring in this fall… But the work will only be started then: we plan to fit chunks of data on a monthly basis, checking the stability of the detector and the track reconstruction, and producing a correction function to be used by all analyses in need of a precise momentum measurement… It really is a long-term plan!