A trick for QCD Multijet generation in CMS
September 21, 2006
Posted by dorigo in computers, news, physics, science.
During the last couple of years I have been following a PhD student, Marco, in a study of a possible extension of the CMS trigger to include signals from the silicon pixel detector. This is an upgrade under study for a future run of the "super LHC", a scenario with the same accelerating machine but ten times more protons colliding.
The trigger is a hardware system that collects the electronic signals from the detector in real time, and decides whether the collision which originated them is worth storing or not. Since the collision rate is 40 MHz and the storage capability for the CMS detector output signals is only about 100 Hz, the trigger is quite an important device. Adding silicon pixel input to the trigger information would improve its decision, effectively increasing the data collection efficiency for rare processes of interest. In fact, the tiny pixels provide very precise position measurements for the trajectories of charged particles, by detecting the ionization these leave behind.
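To get a feeling for how selective the trigger has to be, here is a minimal sketch of the arithmetic, using only the two rates quoted above. The variable names are mine, chosen for illustration.

```python
# Rates quoted in the text.
collision_rate_hz = 40e6   # bunch crossings per second (40 MHz)
storage_rate_hz = 100.0    # events per second that can be stored (about 100 Hz)

# Fraction of crossings the trigger may keep, and the corresponding rejection factor.
keep_fraction = storage_rate_hz / collision_rate_hz
rejection_factor = collision_rate_hz / storage_rate_hz

print(f"keep fraction:    {keep_fraction:.1e}")
print(f"rejection factor: {rejection_factor:,.0f}")
```

In other words, the trigger chain as a whole can keep only a few crossings per million.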
Marco is investigating the possible use of pixel hit information on the CMS first level trigger in a specific search which could benefit from the enhancement: the collection of Higgs bosons in events with many hadronic jets.
A simple use of pixel information is in fact to discriminate where tracks come from: that is, whether they all originate from the same point along the beamline, as is the case for a single proton-proton collision, or from two or more closely spaced points, as happens when several independent collisions occur in the same 25-nanosecond interval. This allows one to determine whether all the identified jets (collimated streams of particles originated when a high-energy quark is emitted from the collision) come from the same collision or not.
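The logic of that discrimination can be sketched in a few lines. This is only a toy, not the algorithm under study for the trigger: it assumes each track already comes with a reconstructed origin along the beamline (`track_z_cm`), and it uses a hypothetical separation threshold `max_gap_cm` that I picked for illustration.

```python
def count_vertices(track_z_cm, max_gap_cm=0.1):
    """Group track origins along the beamline.

    Tracks are sorted in z; a gap larger than max_gap_cm (an illustrative,
    hypothetical threshold) between neighbours starts a new vertex.
    """
    if not track_z_cm:
        return 0
    zs = sorted(track_z_cm)
    n_vertices = 1
    for prev, cur in zip(zs, zs[1:]):
        if cur - prev > max_gap_cm:
            n_vertices += 1
    return n_vertices

# Made-up track origins (cm): one collision versus three separate ones.
single = [0.01, 0.02, -0.01, 0.00]
pileup = [0.01, 0.02, 3.10, 3.12, -5.40]

print(count_vertices(single), count_vertices(pileup))
```

If the tracks belonging to the selected jets fall into more than one group, the jets did not all come from the same collision.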
When the machine luminosity is as high as the Large Hadron Collider promises to achieve, tens of collisions can occur during the same crossing of two packets of protons in the center of the detector. Under these circumstances it becomes critical to understand whether one hard collision produced several jets while the other, soft ones each created just a few low-energy particles (a good event worth storing), or whether two or three collisions conspired to produce the observed jets (an event to discard). The pixels can help a lot.
Marco studied the efficiency of collecting events where a top-antitop quark pair is produced together with a Higgs boson (a tremendously rare process, yielding as many as eight jets) if the trigger is modified suitably. Of course, when you propose to modify the data acquisition chain, you need to prove that the accept rate will not increase, since the 100 Hz output rate cannot be exceeded.
To determine the rate of a given trigger selection we use Monte Carlo simulations. These are simulated collisions, whose products are followed as they travel through each detector component, determining the detector response. Unfortunately, simulating events costs computing time, and to understand trigger rates one has to produce lots and lots of them. The more stringent a selection is (and to select such a rare process as ttH production, the selection really needs to be severe), the more events are needed.
Marco simulated several hundred thousand QCD events (events which produce hadronic jets, the most common interactions an LHC collision produces), and it took him months to do it. With them, it is possible to determine the accept rate of a trigger that selects one event every ten thousand or so, as is the case for multijet triggers in CMS. But not much more can be done: the accepted events are too few to be used for a meaningful analysis of the backgrounds resulting from the trigger selection, in the hope of demonstrating how further analysis cuts (done off-line, when the data has been written to tape) can indeed put the tiny ttH signal in evidence.
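A quick back-of-the-envelope check shows why the sample runs out so fast. The sample size below is an assumption standing in for "several hundred thousand"; the accept fraction is the one-in-ten-thousand figure quoted above.

```python
import math

n_generated = 300_000    # assumption: stands in for "several hundred thousand"
accept_fraction = 1e-4   # "one event every ten thousand or so"

# Expected number of simulated events surviving the trigger selection.
n_accepted = n_generated * accept_fraction

# Relative statistical (Poisson) uncertainty on that count.
rel_uncertainty = 1.0 / math.sqrt(n_accepted)

print(f"accepted events:      {n_accepted:.0f}")
print(f"relative uncertainty: {rel_uncertainty:.0%}")
```

With these numbers only a few tens of events pass the trigger. That is enough for a rough rate estimate, but any further analysis cut quickly leaves nothing to study.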
What to do? Generating more Monte Carlo is out of the question: Marco already spent months producing the existing sample.
I had an idea. Since the characteristics of the background events we need to study are just jet energies, their number above a certain energy, and the amount of missing energy observed in the event, something smart can be done.
Missing energy, I need to explain here, is a measure of how unbalanced the energy flowing out of the collision is, in the plane transverse to the beams. You expect that particles will flow in all directions, and that their total momentum transverse to the beam will be the same as that of the originating protons: zero. That follows from the cherished law of momentum conservation. If, however, an energetic neutrino is produced by a top quark decay, you expect lots of missing energy: neutrinos traverse the detector unseen, and carry away energy which leaves an imbalance. Unfortunately, generic jet production events can also produce missing energy, due to fluctuations in the jet energy measurement in the calorimeters. So whenever you select events with large missing energy, you end up with a few events producing fancy neutrinos (the good ones) and many ordinary events where the source of the imbalance is instrumental (your background).
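As a small illustration of the definition, the sketch below computes the missing transverse energy from a list of jets, each described only by its transverse energy and azimuthal angle. It is a simplification I am assuming for clarity: a real measurement uses all calorimeter deposits, not just jets, and the function name and example values are made up.

```python
import math

def missing_et(jets):
    """Return (met, met_phi) for a list of (et, phi) jets.

    The missing transverse energy is the magnitude of the negative
    vector sum of the jet transverse energies.
    """
    sum_x = sum(et * math.cos(phi) for et, phi in jets)
    sum_y = sum(et * math.sin(phi) for et, phi in jets)
    met = math.hypot(sum_x, sum_y)
    met_phi = math.atan2(-sum_y, -sum_x)
    return met, met_phi

# Two back-to-back jets, perfectly measured: no imbalance.
balanced = [(100.0, 0.0), (100.0, math.pi)]
# Same event, but the second jet fluctuates low by 20 GeV.
mismeasured = [(100.0, 0.0), (80.0, math.pi)]

print(missing_et(balanced)[0])
print(missing_et(mismeasured)[0])
```

The second event contains no neutrino at all, yet it shows about 20 GeV of missing energy pointing along the under-measured jet. That is the instrumental background in a nutshell.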
The idea is that the instrumental source of missing energy (fluctuations of the calorimetric measurement) can be parametrized by studying a large sample of simulated jets, determining the calorimeter response function as a function of jet energy and location in the detector. If you get to know the response function well enough, you can sample it for each jet you measure, and re-compute a modified missing energy from the vector sum of resampled jet energies. From a given simulated QCD event, originated from a given simulated hard collision, you can obtain hundreds, or even thousands, of different events, with the proper relative frequency, modified jet energies, and modified missing energy.
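Here is a minimal sketch of the resampling step, under several assumptions that are mine and not part of the actual study: the response is taken to be Gaussian around the jet energy, the width in `resolution` is a made-up parametrization with illustrative numbers (a real one would be fitted to simulated jets), and each copy is given a weight of 1/N so that all copies together count as one original event.

```python
import math
import random

def resolution(et, eta):
    """Hypothetical response width in GeV, depending on jet energy and location."""
    stochastic = 1.0 if abs(eta) < 1.5 else 1.5   # illustrative values only
    return stochastic * math.sqrt(et)

def missing_et(jets):
    """Magnitude of the vector sum of (et, eta, phi) jets in the transverse plane."""
    sum_x = sum(et * math.cos(phi) for et, eta, phi in jets)
    sum_y = sum(et * math.sin(phi) for et, eta, phi in jets)
    return math.hypot(sum_x, sum_y)

def resample_event(jets, n_copies, rng):
    """Return n_copies smeared versions of one event, each with weight 1/n_copies."""
    copies = []
    for _ in range(n_copies):
        # Draw a new energy for every jet from the assumed response function.
        smeared = [(max(0.0, rng.gauss(et, resolution(et, eta))), eta, phi)
                   for et, eta, phi in jets]
        copies.append({"jets": smeared,
                       "met": missing_et(smeared),
                       "weight": 1.0 / n_copies})
    return copies

rng = random.Random(12345)  # fixed seed so the example is reproducible
# One made-up four-jet event: (et in GeV, eta, phi).
event = [(120.0, 0.3, 0.1), (90.0, -1.1, 2.6), (60.0, 2.0, -2.2), (40.0, 0.8, -0.9)]
copies = resample_event(event, 1000, rng)

# Weighted fraction of copies passing an example missing-energy cut.
passing = sum(c["weight"] for c in copies if c["met"] > 30.0)
print(f"weighted fraction with MET > 30 GeV: {passing:.3f}")
```

Each copy costs only a handful of random numbers, against a full detector simulation for every genuinely new event.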
It is amazing how simple and yet how powerful the method is. With virtually no CPU expense we multiplied the size of our background sample by a large factor, and Marco has been able to understand what the prospects of the search for ttH events in multijet-triggered data actually are, both using the existing trigger settings and the improved ones which will become available with pixel information.
We plan to write an internal CMS note describing the technique. As far as I know, nobody has used it before, but it can be useful for other analyses which study events with a missing energy signature.