## A top mass measurement technique for CMS and ATLAS

July 8, 2008

Posted by dorigo in news, physics, science.

The Tevatron experiments CDF and D0 have been producing more and more precise measurements of the top quark mass in the last few years, in an impressive display of technique, inventiveness, and completeness. The top quark mass, one of the fundamental parameters of the Standard Model, is now known with 0.8% precision – a value way beyond the most optimistic pre-Run II estimates of technical design reports and workshops.

What is interesting to me at this point is not so much how precise the Tevatron determination can ever get, but how long CMS and ATLAS (and I within CMS) will have to wrestle with their huge datasets if they are ever to surpass the Tevatron in this very particular race. The top pair production cross section at the LHC is a hundred times larger than it is at the Tevatron, so the statistical uncertainty on whatever observable quantity one cooks up to determine the top mass is going to become negligible very soon after the start-up of the new accelerator. Instead, the systematic uncertainty from the knowledge of the jet energy scale will be a true nightmare for the CERN experiments.

Was I a bit too cryptic in the last paragraph? Ok, let me explain. Measuring the top mass entails a series of tasks.

1 – One first collects events where top quarks were produced and decayed, by selecting those which most closely resemble the top signature; for instance, a good way to find top pair candidates is to require that the event contains a high-energy electron or muon and missing energy, plus three or four hadronic jets. The electron (or muon) and missing energy (which identifies an escaping neutrino) may have arisen from the decay of a W boson in the reaction $t \to W b \to e (\mu) \nu b$; the hadronic jets are then the result of the fragmentation of another W boson (as in $\bar t \to W^- b \to q \bar q' b$) plus the two b-quarks directly coming from top and antitop lines.

2 – Then, one establishes a procedure whereby from the measured quantities observed in those events – the energy of the particles and jets produced in the decay – a tentative value of the top mass can be derived. That procedure can be anything from a simple sum of all the particles’ energies, to a very complicated multi-dimensional technique involving kinematical fits, likelihoods, neural networks, whatchamacallits. What matters is that, at the end of the day, one manages to put together an observable number O which is, on average, proportional to the top mass M. This correlation between O and M is finally exploited to determine the most likely value of M, given the sample of measurements of O in the selected events. Of course, the larger the number of top pair candidate events, the smaller the statistical uncertainty on the value of M we are able to extract [and on this one count alone, as mentioned above, the LHC will soon have a 100:1 lead over the Tevatron].

3 – You think it is over? It is not. Any respectable measurement carries not only a statistical, but also a systematic uncertainty. The very method you chose to determine M is likely to have biased it in many different ways. Imagine, for instance, that the energies of your jets are higher than you think. That is, what you call a 100 GeV jet actually has, say, 102.5 GeV: if you rely on an observable quantity O which depends strongly on jet energies, you will be likely to underestimate the top mass M by about 2.5%! So, a careful study of all possible effects of this kind (systematic shifts that may affect your final result) is necessary, and it usually involves a lot of work.
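To make the jet energy example of point 3 concrete, here is a minimal sketch. It assumes the simplest possible observable, one strictly proportional to the sum of jet energies; the four jet energies, the 170 GeV target mass and the helper name `mass_estimate` are all hypothetical and chosen only for illustration.

```python
# Toy illustration: how a jet energy scale error biases a mass estimate.
# Assumption: the mass estimator is strictly proportional to the summed jet energy.

def mass_estimate(jet_energies, k):
    """Mass estimate from an observable O = sum of jet energies, with M = k * O."""
    return k * sum(jet_energies)

# Hypothetical true jet energies in GeV (not taken from any real event).
true_energies = [102.5, 82.0, 61.5, 41.0]

# The detector reads every jet 2.5% too low: a true 102.5 GeV jet is called 100 GeV.
scale_error = 1.025
measured_energies = [e / scale_error for e in true_energies]

# Hypothetical calibration constant, tuned so that the true energies give 170 GeV.
k = 170.0 / sum(true_energies)

m_true = mass_estimate(true_energies, k)
m_measured = mass_estimate(measured_energies, k)
relative_bias = (m_measured - m_true) / m_true

print(f"true mass estimate:     {m_true:.1f} GeV")
print(f"measured mass estimate: {m_measured:.1f} GeV")
print(f"relative bias:          {100 * relative_bias:.2f}%")
```

The bias is 1/1.025 - 1, or about -2.4%, which is the "about 2.5%" underestimate quoted in the text. A real estimator depends on jet energies in a less trivial way, so the propagation factor would in general differ from one.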

Unfriendly systematic uncertainties: the jet energy scale

The example I made with jet energies is not a random one: the jet energy scale (JES), the proportionality between the energy we measure and the true energy of the stream of hadrons, is the single largest source of uncertainty in most determinations of the top quark mass. Alas: that is because top quarks always produce at least one hadronic jet when they decay, and we usually cannot avoid using their energy in our recipe for O: we tend to think that the more information we use, the more accurate our final result will be. This is not always the case! In the limit of very high statistics, what matters is instead to use the measurements which carry the smallest possible systematic uncertainties.

Let us make a simple exercise. You have two cases.

Case 1. Imagine you know that the top mass is correlated with the Pt of a muon from its decay, such that on average $M=3 \times P_t + 20 GeV$, but the RMS (the root-mean-square, a measure of its width) of the distribution of M is 50% of its average: if you measure $P_t=50 GeV$ then $M=170 \pm85 GeV$. The correlation is only weak, and the wide range of possible top masses for each value of Pt makes the error bar large. Also, you have to bear in mind that your Pt measurement carries a 1% systematic uncertainty. So it is actually $P_t=50 \pm0.5 GeV$, and the complete measurement reads $M=170 \pm 85 \pm 1.5 GeV$ (where the latter number is three times 0.5, as dictated by the formula relating Pt and M above), from a single event.

Case 2. The top mass is also correlated with the sum S of the transverse energies of all jets in the event, such that on average $M=0.8 \times S-54 GeV$. In this case, the RMS of M for any given value of S is only 10%: if you measure $S=280 GeV$, then $M=0.8 \times 280-54=170 \pm 0.10 \times 170$, which turns out to be equal to $M=170 \pm 17 GeV$. This is remarkably more precise, thanks to the good correlation of S with M. You should also not forget the systematic uncertainty on the jet transverse energy determination: S is known with 2.5% precision, so in the end you get $M=170 \pm 17 \pm 0.025 \times 0.8 \times 280$, which equals $M=170 \pm 17 \pm 5.6 GeV$, for the event with $S=280 GeV$.

Now the question is: which method should you prefer if you had not just one top event, but 100 events? And what if you had 10,000? To answer the question you just need to know that the statistical error on the average decreases with the square root of the number of events, while the systematic error does not change.

With 100 events you expect that the muon Pt method will result in a statistical uncertainty of 8.5 GeV, while the systematic uncertainty remains the same: so $M=170 \pm 8.5 \pm 1.5 GeV$. Case 2 instead yields $M=170 \pm 1.7 \pm 5.6 GeV$, which is significantly the better determination. You should prefer method 2 in this case.

With 10,000 events, however, things change dramatically: Case 1 yields $M=170 \pm 0.85 \pm 1.5 GeV$, while Case 2 yields $M=170 \pm 0.17 \pm 5.6 GeV$, a three times larger error bar overall! This is what happens when systematic uncertainties are allowed to dominate the precision of a measurement method.
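The arithmetic of the two cases is easy to check with a few lines of Python. This is only a sketch of the toy exercise: it assumes that the statistical and systematic uncertainties are combined in quadrature, and the function names are mine.

```python
import math

def total_uncertainty(stat_single, syst, n_events):
    """Quadrature sum of the statistical error on the mean and a fixed systematic."""
    stat = stat_single / math.sqrt(n_events)  # statistical part shrinks as 1/sqrt(N)
    return math.sqrt(stat**2 + syst**2)       # systematic part does not shrink

def crossover_events(stat_a, syst_a, stat_b, syst_b):
    """Number of events at which methods a and b have the same total uncertainty."""
    return (stat_a**2 - stat_b**2) / (syst_b**2 - syst_a**2)

# (single-event statistical error, systematic error) in GeV, from Cases 1 and 2.
METHODS = {
    "Case 1 (lepton Pt)": (85.0, 1.5),
    "Case 2 (jet sum S)": (17.0, 5.6),
}

for n in (1, 100, 10_000):
    for name, (stat_single, syst) in METHODS.items():
        stat = stat_single / math.sqrt(n)
        total = total_uncertainty(stat_single, syst, n)
        print(f"N={n:>6}  {name}: +-{stat:.2f} (stat) +-{syst:.1f} (syst) = +-{total:.2f} GeV")

n_equal = crossover_events(85.0, 1.5, 17.0, 5.6)
print(f"The two methods are equally precise at about {n_equal:.0f} events")
```

With these toy numbers the lepton Pt method overtakes the jet-based one at a bit more than two hundred events, far below the 10,000 of the last example.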

Well, it turns out that in their Run II measurements of the top quark mass, CDF and D0 have almost reached the point where their jet energy measurement, something that took tens of dedicated physicists years of work to perfect, no longer helps the final result much. So large is the jet energy scale uncertainty compared to all the others that it makes sense to try alternative procedures which ignore the energy measurement of jets.

Now, I would get flamed if I did not point out here, before getting to the heart of this post, that there are indeed cunning procedures by means of which the jet energy scale uncertainty can be tamed: indeed, by imposing the constraint that the two jets produced by the $W \to q \bar q'$ decay make a mass consistent with that of the W boson, a large part of the uncertainty on jet energies can be eliminated. This alleviates, but ultimately does not totally solve, the problem with the JES uncertainty. In any case, until now the excellent results of the Tevatron experiments on the top quark mass have not been limited by the JES uncertainty. Still, it is something that will happen one day. If not there, then at CERN.

Let me make a simple example of one analysis in CDF that does not apply the “self-calibrating” procedure mentioned above: it is a recent result based on 1 inverse femtobarn of data, of which I need not discuss the details today. Here is the table with the systematic uncertainties:

The total result of that analysis is $M = 168.9 \pm 2.2 \pm 4.2 GeV$. Since uncertainties add up in quadrature (a somewhat arguable statement, but let’s move on), the total uncertainty on the measured top mass is 4.7 GeV. Without the JES uncertainty (3.9 GeV alone), it would be 2.7 GeV!
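The numbers above can be reproduced with a short sketch. It assumes that the JES component is uncorrelated with the other systematic sources, so that it can be subtracted in quadrature from the total systematic uncertainty; the helper name `quadrature` is mine.

```python
import math

def quadrature(*components):
    """Combine independent uncertainties by summing their squares."""
    return math.sqrt(sum(c * c for c in components))

stat = 2.2        # GeV, statistical uncertainty
syst_total = 4.2  # GeV, total systematic uncertainty
syst_jes = 3.9    # GeV, jet energy scale contribution alone

total = quadrature(stat, syst_total)

# Remove the JES term in quadrature (assumes it is independent of the other sources).
syst_without_jes = math.sqrt(syst_total**2 - syst_jes**2)
total_without_jes = quadrature(stat, syst_without_jes)

print(f"total uncertainty:           {total:.1f} GeV")
print(f"systematics without JES:     {syst_without_jes:.1f} GeV")
print(f"total uncertainty, no JES:   {total_without_jes:.1f} GeV")
```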

The new result

Ok, now that I have made my point about the need for measurement methods which make as little use as possible of the energy measurement of jets in future large-statistics experiments, let me briefly describe a new analysis by CDF, performed by Ford Garberson, Joe Incandela, Roberto Rossin, Sue Ann Koay (all from UCSB), and Chris Hill (Bristol U.). Their measurement uses previously employed techniques, combining them successfully in a JES-free determination.

[A note of folklore – I should maybe mention that Roberto Rossin was a student in Padova: I briefly worked with him at his PhD analysis, an intriguing search for $B_s \to J/\psi \eta$ decays in the $\mu \mu \gamma \gamma$ final state. On the right you can see a picture I took of him for my old QD blog three years ago… The two good bottles of Cabernet were emptied with the help of Luca Scodellaro, now at Cantabria University.]

They use two observable quantities which are sensitive to the top mass. One is the transverse decay length of B hadrons produced by b-quark jets: the b-quark emitted in top decay is boosted thanks to the energy released in the disintegration of the top quark, and the distance traveled by the long-lived B hadron (which contains the b-quark) before in turn decaying into lighter particles is correlated with the top mass. The second quantity is the transverse momentum of the charged lepton – we already discussed its correlation with the top mass in our examples above.

After a careful event selection collecting “lepton+jets” top pair candidate events in 1.9 inverse femtobarns of Run II data (ones with an electron or a muon, missing Et, and three or more hadronic jets, one of which containing a signal of b-quark decay), an estimate of all contributing backgrounds is performed with simulations. The table on the right shows the number of events observed and the sample composition as interpreted in the analysis. Note that for the L2d variable there are more entries than events, since a single event may contribute twice if both b-quark jets contain a secondary vertex with a measurable decay length.

The mean value of $P_t$ and $L_{2d}$ is then computed and compared to the expectations for different values of the top mass. Below you can see the distribution of the two variables for the selected data, compared to the mixture of processes already detailed in the table above.
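The idea of comparing an observed mean with expectations at different masses can be sketched as follows. This is not the CDF procedure: the expected means are generated from the toy relation $M=3 \times P_t + 20 GeV$ of Case 1 rather than from simulation, the observed mean of 51 GeV is hypothetical, and simple linear interpolation between mass points is my own assumption.

```python
def expected_mean_pt(mass):
    """Toy expectation: the Case 1 relation M = 3*Pt + 20 GeV, solved for Pt."""
    return (mass - 20.0) / 3.0

def interpolate_mass(observed_mean, template_masses, template_means):
    """Find the mass whose expected mean matches the observed one (linear interpolation)."""
    pairs = sorted(zip(template_means, template_masses))
    for (x0, m0), (x1, m1) in zip(pairs, pairs[1:]):
        if x0 <= observed_mean <= x1:
            fraction = (observed_mean - x0) / (x1 - x0)
            return m0 + fraction * (m1 - m0)
    raise ValueError("observed mean lies outside the range covered by the templates")

# Hypothetical grid of generated top masses (GeV) and their expected mean lepton Pt.
template_masses = [160.0, 165.0, 170.0, 175.0, 180.0]
template_means = [expected_mean_pt(m) for m in template_masses]

observed_mean_pt = 51.0  # GeV, hypothetical
fitted_mass = interpolate_mass(observed_mean_pt, template_masses, template_means)
print(f"mean Pt = {observed_mean_pt} GeV  ->  M = {fitted_mass:.1f} GeV")
```

In a real analysis the expected means would come from simulated samples including backgrounds, and each would carry its own uncertainty.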

Of course, a number of checks are performed in control samples which are poor in top pair decay signals. This strengthens the confidence in the understanding of the kinematical distributions used for the mass measurement.

For instance, on the right you can see the lepton Pt distribution observed in events with only one jet accompanying the $W \to l \nu$ signal: this dataset is dominated by W+jets production, with some QCD non-W events leaking in at low lepton Pt (the green component): especially at low Pt, jets may sometimes mimic the charged lepton signature. All in all, the agreement is very good (similar plots exist for events with two jets, and for the L2d variable), and it confirms that the kinematic variables used in the top mass determination are well understood.

To fit for the top mass which best agrees with the observed distributions, the authors have used a technique which strongly reminds me of a method I perfected in the past for a totally different problem, the one I dubbed the “hyperball algorithm”. They only have two variables (I used as many as thirty in my work on the improvement of jet energy resolution), so theirs are balls – ellipses, actually – and there is little if anything hyper. In any case, the method works by comparing, in the plane of the L2d and lepton Pt variables, the “normalized distance” between observed data points and points distributed within ellipses in the plane, obtained from Monte Carlo simulations with different values of generated top quark mass: they compute the L2d and Pt distances $\delta L_{2d}=L_{2d}^{data} - L_{2d}^{MC}$, $\delta P_t = P_t^{data} - P_t^{MC}$, and using the RMS of the MC distributions $\sigma (L_{2d})$, $\sigma (P_t)$ they define the quantity

$D = \sqrt{ (\delta P_t/\sigma (P_t))^2 + ( \delta L_{2d}/\sigma (L_{2d}))^2}$.

D is then used in a fit which compares different hypotheses for the data -ones with different input top masses- to extract its most likely value.
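For concreteness, the definition of D translates into a few lines of Python. This is a sketch of the formula only, not of the fit; the function name and all the input values are hypothetical, picked so that the result is easy to verify by hand.

```python
import math

def normalized_distance(pt_data, l2d_data, pt_mc, l2d_mc, sigma_pt, sigma_l2d):
    """Distance D between a data point and a MC point, each axis scaled by its MC RMS."""
    delta_pt = pt_data - pt_mc
    delta_l2d = l2d_data - l2d_mc
    return math.hypot(delta_pt / sigma_pt, delta_l2d / sigma_l2d)

# Hypothetical values: the Pt term is 15/5 = 3 sigma, the L2d term is 2.0/0.5 = 4 sigma.
d = normalized_distance(
    pt_data=65.0, l2d_data=7.0,   # observed point
    pt_mc=50.0, l2d_mc=5.0,       # simulated point
    sigma_pt=5.0, sigma_l2d=0.5,  # RMS of the MC distributions
)
print(f"D = {d:.1f}")  # sqrt(3^2 + 4^2) = 5
```

Dividing each difference by the RMS makes the two terms dimensionless, so a momentum and a length can be added together in the same distance.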

Having worked for some time with distances in multi-dimensional spaces, I have to say I have a quick-and-dirty improvement to offer to the authors: since they know that the two variables used have a slightly different a-priori sensitivity to the top mass, they could improve the definition of D by weighting the two contributions accordingly: this simple addition would make the estimator D slightly more powerful in discriminating different hypotheses. Anyway, I like the simple, straightforward approach they have taken in this analysis: the most robust results are usually those with fewer frills.
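One possible reading of this suggestion, written out as a sketch: multiply each squared term of D by a weight reflecting the sensitivity of that variable to the top mass. The weights `w_pt` and `w_l2d`, the function name and the input values below are all hypothetical; how to derive the weights is not specified here.

```python
import math

def weighted_distance(delta_pt, delta_l2d, sigma_pt, sigma_l2d, w_pt=1.0, w_l2d=1.0):
    """Normalized distance with per-variable weights; unit weights give back plain D."""
    term_pt = w_pt * (delta_pt / sigma_pt) ** 2
    term_l2d = w_l2d * (delta_l2d / sigma_l2d) ** 2
    return math.sqrt(term_pt + term_l2d)

# Hypothetical differences and RMS values: 3 sigma in Pt, 4 sigma in L2d.
plain = weighted_distance(15.0, 2.0, 5.0, 0.5)
# Hypothetical weights that favour the lepton Pt term over the L2d term.
tilted = weighted_distance(15.0, 2.0, 5.0, 0.5, w_pt=1.5, w_l2d=0.5)

print(f"unweighted D = {plain:.2f}, weighted D = {tilted:.2f}")
```

With both weights equal to one the function reduces to the D defined above, so the weighting is a pure extension of the original estimator.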

Having said that, let me conclude: the final result of the analysis is a top mass $M = 175.3 \pm 6.2 \pm 3.0 GeV$, where the second uncertainty is systematic, and it is almost totally JES-uncertainty free. This result is not competitive with the other determinations of CDF (some of which have been extracted from the very same dataset); however, this will be a method of great potential at the LHC, where jets are harder to measure for a number of reasons, and where statistics will be large enough that the attention of experimenters will soon turn to the optimization of systematic uncertainties.