
Your contribution to a physics analysis: tt-Higgs search December 25, 2007

Posted by dorigo in games, personal, physics, science.

In a recent post I mentioned my full immersion in a search for the signal of Higgs radiation off a top quark pair in the 14-TeV proton-proton collisions produced in the CMS detector by the LHC. The process is p p \to t t H \to W^+ b W^- \bar b b \bar b  – the Higgs is therefore sought in its decay to a pair of b-quarks.

The ttH signal is really tough to extract because of its tiny cross-section – 0.67 picobarns, not including the branching probability of H \to b \bar b. Compared to inclusive top pair production (650 pb) and W+>=4jets production (about 300 pb) the task might not look forbidding, but generic QCD events with many jets in the final state (hundreds of nanobarns) remain a horrible background if one does not have powerful handles to kill them.

Normally, one tags at least one top quark decay by looking for a high energy lepton (electron or muon) produced in the chain t \to W b \to \mu \nu b or t \to W b \to e \nu b. QCD events then mostly disappear (only a relic of semileptonic b-quark decays to high-energy leptons may be an issue, as well as fake leptons mimicked by hadrons) and one only needs to fight the other electroweak signatures of high energy leptons.  

In the analysis that Marco (the PhD student I am tutoring, who is about to graduate), Mia (my freshman PhD student) and I have chosen, the top pair is instead tagged by looking for a neutrino. Neutrinos can only be seen by “not seeing” the energy they carry away: by requiring that the momentum of all observed bodies transverse to the beam direction balances, one infers whether a neutrino (one or more, that is) has carried away a part of it.
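For readers who prefer code to words, here is a minimal sketch of the momentum-balance idea. It assumes the event is described only by a list of jet four-momenta (px, py, pz, E) in GeV; in a real detector the sum runs over all measured energy deposits, not just jets, and the function name `missing_et` is purely illustrative.

```python
import math

def missing_et(jets):
    """Return (Ex, Ey, magnitude) of the missing transverse energy.

    `jets` is a list of (px, py, pz, E) tuples in GeV. The missing Et
    vector is minus the vector sum of the observed transverse momenta.
    """
    ex = -sum(j[0] for j in jets)
    ey = -sum(j[1] for j in jets)
    return ex, ey, math.hypot(ex, ey)

# Two toy jets that do not balance in the transverse plane
toy_jets = [(50.0, 10.0, 20.0, 60.0), (-20.0, -40.0, -5.0, 50.0)]
ex, ey, met = missing_et(toy_jets)
```

Note that pz never enters: only the two transverse components can be balanced.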

The philosophy is thus to neglect leptons (which other analyses may tag), and concentrate on a multijet plus missing energy signature. To reduce the QCD background, the missing energy is required to be significant: the missing Et measurement divided by its estimated error has to be larger than three units. Then, five or more jets are required to be contained in the central part of the detector and have a large transverse energy (30 GeV or more). After a trigger simulation and those cuts, the signal-to-background ratio is below one part in a hundred thousand!
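The cuts just described can be sketched as a simple event filter. This is only an illustration under stated assumptions: jets are dictionaries with hypothetical keys `et` and `eta`, and the pseudorapidity limit of 2.5 used to define "central" is my placeholder, since no number is quoted in the text.

```python
def passes_selection(jets, met, met_error, min_jets=5, min_et=30.0,
                     max_abs_eta=2.5, min_significance=3.0):
    """Apply the missing-Et significance cut and the jet multiplicity cut.

    jets      : list of dicts with keys "et" (GeV) and "eta"
    met       : missing transverse energy (GeV)
    met_error : estimated uncertainty on the missing Et (GeV)
    max_abs_eta is an assumed definition of "central", not a quoted value.
    """
    if met_error <= 0:
        return False
    significance = met / met_error
    central_jets = [j for j in jets
                    if j["et"] >= min_et and abs(j["eta"]) <= max_abs_eta]
    return significance > min_significance and len(central_jets) >= min_jets

# Toy event: five central 40 GeV jets, missing Et of 60 +- 15 GeV
toy_event = [{"et": 40.0, "eta": 0.5} for _ in range(5)]
selected = passes_selection(toy_event, met=60.0, met_error=15.0)
```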

Fortunately, the top quarks and the Higgs produce b-quarks, which in turn yield jets with tracks that point back to a spot displaced from the primary interaction vertex, because they are produced by B mesons which have traveled away from it. By requiring four B-tags the signal gets reduced quite a bit, but the QCD background gets hit hard! So do the W+jets electroweak background and other less probable processes. One then has to reckon mostly with inclusive top pair production, when two additional b-tagged jets are the result of QCD radiation of a gluon which materialized into two well-separated, energetic jets. The latter is called an irreducible background: its only difference from the H \to b \bar b signal is that the two b-jets have an arbitrary invariant mass, while the Higgs has a well-defined one.

To discriminate the remaining QCD background and inclusive top production from the ttH signal, one relies on kinematic variables defined with the quadrimomenta of the jets, the information on which ones are b-tagged, and the missing energy magnitude and direction. Here is an incomplete list of possible kinematic characteristics:

  • the total invariant mass of the first 6 jets in the event;
  • same, for all jets (often there are as many as 10);
  • the sum of jets transverse energy and missing transverse energy (a quantity called Ht);
  • the mass of the two highest Et b-jets in the event;
  • the mass of the two b-jets whose mass is closest to 120 GeV (the mass considered in our Monte Carlo simulation – we are targeting the search to a 120 GeV Higgs);
  • the mass of the triplet of jets containing exactly one b-tagged jet whose mass is closest to 172 GeV (there are many combinations, so this number always comes close to 170ish);
  • the mass of the pair of untagged jets in the triplet with mass closest to 172 GeV (this should be a W boson decay and should thus cluster at about 80 GeV);
  • the angle between missing energy and closest jet in the transverse plane;
  • the angle between missing energy and leading jet in the transverse plane;
  • the angle between the leading two jets in the transverse plane;
  • the centrality of the leading 6 jets, defined as the sum of their transverse energy divided by their total invariant mass;
  • the mass of all jets not b-tagged.

And so on.
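To make a few of the variables in the list concrete, here is a rough sketch of how they could be computed from jet four-momenta. It is not the analysis code: the function names are invented for illustration, jets are assumed to be (px, py, pz, E) tuples in GeV, and the transverse energy is taken as E times the sine of the polar angle.

```python
import math
from itertools import combinations

def invariant_mass(jets):
    """Invariant mass of the summed (px, py, pz, E) four-momenta, in GeV."""
    px = sum(j[0] for j in jets)
    py = sum(j[1] for j in jets)
    pz = sum(j[2] for j in jets)
    e = sum(j[3] for j in jets)
    m2 = e * e - px * px - py * py - pz * pz
    return math.sqrt(max(m2, 0.0))  # clamp small negative values from rounding

def transverse_energy(jet):
    """Et = E * sin(theta), with theta the polar angle of the jet."""
    px, py, pz, e = jet
    p = math.sqrt(px * px + py * py + pz * pz)
    return e * math.hypot(px, py) / p if p > 0 else 0.0

def ht(jets, met_x, met_y):
    """Sum of the jet transverse energies plus the missing Et."""
    return sum(transverse_energy(j) for j in jets) + math.hypot(met_x, met_y)

def centrality(jets):
    """Sum of jet Et divided by the invariant mass of the same jets."""
    mass = invariant_mass(jets)
    return sum(transverse_energy(j) for j in jets) / mass if mass > 0 else None

def bb_mass_closest_to(jets, btags, target=120.0):
    """Mass of the b-tagged jet pair closest to `target`; None if < 2 tags."""
    tagged = [j for j, tag in zip(jets, btags) if tag]
    masses = [invariant_mass(pair) for pair in combinations(tagged, 2)]
    return min(masses, key=lambda m: abs(m - target)) if masses else None

# Toy event with three massless, b-tagged jets
toy = [(50.0, 0.0, 0.0, 50.0), (-50.0, 0.0, 0.0, 50.0), (0.0, 60.0, 0.0, 60.0)]
m_bb = bb_mass_closest_to(toy, [True, True, True])
```

The triplet and W-candidate masses follow the same pattern, looping over combinations of three jets with exactly one b-tag.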

The study entails constructing histograms of these variables at different levels of selection, adding expected background histograms together, comparing the result to the histogram resulting from the signal simulation, and deciding whether the variable is discriminating the two or not. Once the best variables are spotted, a global relative likelihood can be constructed, and with it a sensitivity study can be performed.
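As an illustration of what "deciding whether the variable is discriminating" could look like numerically, the sketch below sums background histograms and compares their shape to the signal one. The separation measure used here (half the sum over bins of the squared difference divided by the sum, on unit-normalized histograms) is one common choice and is my assumption; it is not necessarily the figure of merit used in the analysis. It is 0 for identical shapes and 1 for shapes that do not overlap.

```python
def add_histograms(histograms):
    """Bin-by-bin sum of histograms given as equal-length lists of contents."""
    return [sum(bins) for bins in zip(*histograms)]

def normalize(hist):
    """Scale a histogram to unit area so that only the shape is compared."""
    total = float(sum(hist))
    if total <= 0:
        raise ValueError("cannot normalize an empty histogram")
    return [h / total for h in hist]

def separation(signal, background):
    """Shape separation between two histograms with identical binning."""
    s = normalize(signal)
    b = normalize(background)
    return 0.5 * sum((si - bi) ** 2 / (si + bi)
                     for si, bi in zip(s, b) if si + bi > 0)

# Toy example: two backgrounds summed, then compared with a signal shape
total_background = add_histograms([[40, 30, 20, 10], [10, 10, 10, 10]])
power = separation([1, 2, 3, 4], total_background)
```

Ranking candidate variables by such a number is only a first pass, since it ignores correlations between variables.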

To give you a view of the problem we are facing, I provide two graphs below (good lord, I hope my CMS colleagues will not be so picky as to have something to complain about my posting unnamed histograms with no y axis labels!). The first one shows the four main processes contributing to our sample after the request of five jets, at least two of them containing a b-tag. In red the QCD background stands out high over top pair production (green), electroweak W+jets (in cyan) and ttH (in blue). The black points show the sum of all processes. Mind you, this is a logarithmic plot! The x axis is one particularly discriminating kinematical variable.

In the second plot you can see what the additional requirements of missing transverse energy significance above 3.0 and two additional b-tags do for us: the QCD background is strongly reduced, and the main background has become the inclusive top production.

Ok, after this example let’s talk business. 

At this stage we are still finalizing the list of best kinematical characteristics capable of discriminating signal from backgrounds. Do you wish to contribute? Define a variable which is meaningful enough to trigger my interest, and propose it. I will compute it with my code for each of the simulations, and come up with some handwaving information – a single measure of the discriminating power of the variable – describing how good it is. I may be able to show histograms too, by suppressing some information – remember, the simulation goes through CMS private code and so I cannot divulge any result which is not explicitly approved by my collaboration. I can, though, provide generator-level information for the variables we end up discussing. In the end, if the variable you propose has some discriminating power, I will use it in the analysis.

You are supposed to use the following:

  1. px,py,pz,E of jets (there are at least 5 in the event, and I consider at most 8 – the most energetic); you can also choose to work with the alternative set of variables Et, \eta, and \phi of each jet.
  2. Ex, Ey components of the missing transverse energy; the arctangent of Ey/Ex is of course the phi angle in the transverse plane; also given is the missing Et significance, defined as missing Et divided by its error;
  3. B-tag flag of each jet from 1 to 8 (mind you, the selection requires 4 b-tags, but it is quite unlikely that more than 4 are tagged). This is a true/false bit for each jet: if you are looking for the Higgs, you may want to study the mass of two b-tagged jets!
  4. From the above information of course one can define invariant masses of jets, sums, triplet masses, four-body masses, angles between combined quadrimomenta, in the lab frame or in the center-of-mass frame of any sub-parton system. You name them!
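If you want to play with these inputs, the following sketch shows how the two jet descriptions relate and how an angle in the transverse plane can be computed. It assumes massless jets when converting from (Et, eta, phi), and all function names are illustrative. `math.atan2` is used instead of a plain arctangent of Ey/Ex so that the angle lands in the correct quadrant.

```python
import math

def to_cartesian(et, eta, phi):
    """Convert (Et, eta, phi) to (px, py, pz, E), assuming a massless jet."""
    return (et * math.cos(phi), et * math.sin(phi),
            et * math.sinh(eta), et * math.cosh(eta))

def met_phi(ex, ey):
    """Azimuthal angle of the missing Et vector, in (-pi, pi]."""
    return math.atan2(ey, ex)

def delta_phi(phi1, phi2):
    """Signed azimuthal difference phi1 - phi2, wrapped into (-pi, pi]."""
    d = (phi1 - phi2) % (2.0 * math.pi)
    return d - 2.0 * math.pi if d > math.pi else d

def min_delta_phi(ex, ey, jets):
    """Smallest |delta phi| between missing Et and any (px, py, pz, E) jet."""
    phi_miss = met_phi(ex, ey)
    return min(abs(delta_phi(phi_miss, math.atan2(j[1], j[0]))) for j in jets)

# A central 50 GeV jet along the x axis, and missing Et pointing along -x
jet = to_cartesian(50.0, 0.0, 0.0)
angle = min_delta_phi(-30.0, 0.0, [jet])
```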

Ok, if you are not into particle physics and you are still reading, you deserve some more information.

  • What is likely to discriminate the production of high-mass objects from QCD multi-jet production is, for instance, the mass of the objects, or the centrality of the produced jets.
  • Missing Et flags a neutrino, but in QCD events it is the result of a jet which was badly measured – so the angle in the transverse plane (remember, the missing energy vector is undefined in the z axis because the total momentum of the projectiles participating in the collision along the beam axis is unknown) between a jet and missing Et may be an indication, but in a multijet environment this is not so easy.
  • A ttH event with missing Et also contains a lepton, but we miss that information. If the lepton was an electron or a tau, it has usually been mistaken for a jet; if it was a muon, it got lost and may have resulted in additional missing Et. Because of these characteristics, a ttH event of the kind we are after has nominally six partons in the final state plus a neutrino and an undetected lepton. Four of these six partons are b-quarks, which produce b-jets.
  • The inclusive top pair background has produced missing Et, four partons, a charged lepton which went undetected, and may have other jets from radiation. These additional jets may be less central than the others.
  • Since we have a missing neutrino – whose momentum z-component we do not know – and also a missing or otherwise unidentified charged lepton, there is no chance we can reconstruct the full event kinematics. However, one might rely on techniques which have been refined for the top quark mass measurement in the dilepton final state, where again two neutrinos are missing, a circumstance not too different from the one we are facing here.

The above, I believe, is sufficient information to start thinking… A good idea might be worthy of a mention in our future analysis note on the search, so take this as a chance to see your name printed on a scientific document!


1. Bryan Bishop - December 25, 2007

My background in physics is currently sketchy. Are there any papers you would recommend on arxiv that could help out?

– Bryan

2. dorigo - December 25, 2007

Hi Bryan,

take a look at this preprint:

I think it is quite complete… See the first few introductory pages for production and decay, and p.27 for the tth signature.


3. Nikita Nikolaev - December 25, 2007

This is very detailed and very interesting!
I’m still reading it, but I thought I’d stop to write a compliment before I forget =)

4. dorigo - December 25, 2007

Hi Nikita,

thank you… and nice blog btw. I’ll keep an eye on it.


5. Nikita Nikolaev - December 26, 2007

Oh, thank you so much!
Now I have more reasons to keep writing😀

Thanks again

6. Andrea Giammanco - December 26, 2007

Are you already considering topological variables like circularity, aplanarity, Fox-Wolfram moments, etc.?

Be aware that this analysis is very very similar to an analysis of the TDR which had a completely different goal: finding an evidence of SUSY in the inclusive ttbar+MET final state (with fully hadronic ttbar). Of course some discriminating variables will be different (only 2 b-tags in their case, and no missed lepton, and they could do a kinematic fit to the two tops) but I’m sure you can get inspiration to a large extent.

7. luca salasnich - December 29, 2007

Mmm, very interesting this subject: to find significant signals from high-energy data. This is a job for Italians: one truly needs fantasy and creativity.

Buon natale e buon nuovo anno.

P.S. I like rabbits.

8. dorigo - December 29, 2007

Hi Andrea,

we did look at aplanarity, sphericity, and such, but I did not construct Fox-Wolfram moments yet. I guess I am not that desperate 🙂 But I will give it a try when I return from the holidays. And I will take a look at the SUSY search.

Luca, the whole world is into those searches these days… And I do not see the connection with rabbits. What do you mean? Anyway, happy new year to you too.

9. Fred - December 29, 2007

Mr. T,

A recent article from ‘Physics World’ titled “US physics suffers budget setbacks” states:


“… Oddone has a two-fold goal: To ensure that the Tevatron accelerator continues to chase the Higgs boson until its scheduled closure in 2009, and to guarantee that Fermilab will continue to make significant contributions…”

Does this mean you only have this remaining time to search for the Higgs creature until the sands run out in 2009, and if so, will you transfer all of your research from Illinois over to Europe? It sounds like one of the cliffhanger endings in a ‘Batman’ TV episode.

Holy Neutrinos, sci-phy fans! Will our caped researchers find the Invisible Mr. Higgs before he disappears forever with the melting budget due to global dumbing? Can the M&M Twins, Marco and Mia, find true love and the happiness of lab life while hacking through the wicked web of QCD background events? Or will the Masked Dorigo resort to sacrificial decaying by injecting himself into the 14-TeV proton-proton collisions and thus become part of the signal of Higgs radiation off a top quark pair? Tune in tomorrow, folks – same Jet-time, same Jet-channel!

10. dorigo - December 30, 2007

Hi Fred,

More prosaically stated, the Tevatron will run until the end of 2009, and probably into 2010. That does not mean that by that time physicists in CDF and D0 will move on to something else… They will spend more time perfecting their searches on a dataset that by then will no longer be growing. Of course, many will in the meantime start working on something else too.

I think the Tevatron’s chances to find the Higgs are really, really slim. They can, however, set some interesting limits with their datasets – and I expect this to be a rather ironic turn of events: the Tevatron setting limits on Higgs mass ranges just before the LHC experiments have a chance at discovering it there.

