Notes on the new Higgs boson search by DZERO
March 2, 2009
Posted by dorigo in news, physics, science.
Tags: D0, Higgs boson, neural networks, standard model, Tevatron
Three weeks ago the DZERO collaboration published new results of their low-mass Higgs boson search. The search targets the production of a Higgs boson in association with a W boson, with the subsequent decay of the Higgs particle to a pair of b-quark jets, and the decay of the W to an electron-electron neutrino or a muon-muon neutrino pair: in symbols, what I mean is $p\bar{p} \to WH \to e \nu_e b\bar{b}$, or $p\bar{p} \to WH \to \mu \nu_\mu b\bar{b}$. I wish to describe this important new analysis today, but first let me make a point about the reaction above.
In order to make this blog more accessible than it would otherwise be, I frequently write things inaccurately: precision is usually pedantic and distracting. But here I beg you to note a detail I will not gloss over for once: to be accurate, one should write …, with X standing for anything else produced in the collision, because what we care for is inclusive production of the boson pair. If we omit the X, strictly speaking we are implying that the proton and the antiproton annihilated into the two bosons, with exactly nothing else coming out of the collision. Such a reaction would be ridiculously rare at best: in fact, the annihilation into ZH is possible, while the one into WH does not conserve electric charge and is strictly forbidden. Anyway, carrying along a symbol to remind ourselves that our projectiles are like garbage bags, which fill our detectors with debris when we throw them at one another, is cumbersome and annoying, however accurate. I hope you realize, though, that this is an important detail: Higgs bosons at a hadron collider are always accompanied by debris from the dissociating projectiles.
Two words on associated WH production and its merits
The associated production of the Higgs together with a W boson is the “golden” signature for low-mass Higgs hunters at the Tevatron collider. Producing the Higgs together with another heavy object is not effortless: the collision must involve more energetic quarks inside the colliding proton and antiproton, and this makes the production less frequent. But the W boson pays back with extra dividends by producing a very clean signature in its leptonic decay, which allows the event to be spotted easily by the online triggering system and collected with high efficiency by the data acquisition.
If you compare the collection of WH events to the collection of directly produced Higgs bosons ($p\bar{p} \to H + X$, where again I prefer accuracy by specifying the X), you immediately see the advantage of the former. Their production rate is four times smaller and the leptonic W decay only occurs 20% of the time, but this 0.25 × 0.2 = 0.05 = 1/20 reduction factor is a small price to pay, given the trouble one would have triggering on direct-production events: the decay to a pair of b quarks is the dominant one for low Higgs boson masses, but b-jets are so common in hadronic collisions that the decay is unobservable on its own.
Higgs decays to b-quark pairs produced alone simply cannot be triggered on in hadronic collisions, because they are immersed in a background whose rate is six orders of magnitude higher, namely the production of bottom-antibottom quark pairs by strong interactions. Even assuming that the online triggering system of DZERO were capable of spotting b-quark jet pairs with 100% purity (which is already a steep hypothesis), the trigger would have to accept a million background events in order to collect just one fine signal event!
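To make the arithmetic of the last two paragraphs explicit, here is a minimal Python sketch. It uses only the round numbers quoted above (a rate four times smaller, a 20% leptonic fraction, six orders of magnitude of background); the variable names are invented for the illustration and carry no official meaning.

```python
# Round numbers quoted in the text; the variable names are illustrative only.
wh_to_direct_rate_ratio = 0.25   # WH production is about four times rarer than direct production
leptonic_w_fraction = 0.20       # fraction of W decays to an electron or muon plus neutrino

# Price paid for choosing the WH channel with a leptonic W decay
reduction_factor = wh_to_direct_rate_ratio * leptonic_w_fraction

# Direct H -> bb: background six orders of magnitude above the signal,
# even for a hypothetical trigger with 100% b-jet purity
background_events_per_signal_event = 10 ** 6

print(f"WH reduction factor: {reduction_factor:.2f} (1 in {round(1 / reduction_factor)})")
print(f"Direct production: {background_events_per_signal_event:,} background events per signal event")
```

A loss of one event in twenty on one side, a million background events per signal event on the other: that is the whole argument in two numbers.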
Yes, life is tough for hadronic signatures at a hadron collider. Even finding the $Z \to b\bar{b}$ signal, which is a thousand times more frequent, is a tough business: it took CDF years to find a reasonable sample of those decays, while DZERO has not yet published anything on the matter. But the Tevatron experiments cannot ignore the fact that, if a low Higgs mass is hypothesized, the $H \to b\bar{b}$ decay is the most frequent: the Higgs boson likes to decay into the heaviest pair of particles it can produce. If a pair of W bosons or Z bosons is too heavy for it, the next-heaviest pair of decay products is b-quarks. This dictates the need to search for $H \to b\bar{b}$, and the trouble of triggering on such a process in turn makes the associated WH (or ZH) production the most viable signal.
The DZERO analysis
The new analysis by DZERO studies a total integrated luminosity of 2.7 inverse femtobarns. This corresponds to about 150 trillion proton-antiproton collisions, but DZERO has already collected almost twice as much data, and it is only a matter of time before those too get included in this search. So one has to bear in mind that the statistical power of the analysis is soon going to increase by about 40%: doubling the data improves the precision by the square root of two, a factor of 1.41.
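Both numbers in this paragraph can be checked with a few lines of Python. The sketch below is only an illustration: the 55 mb proton-antiproton cross-section is an assumption, a round value that is not quoted in this post and was picked because it reproduces the collision count given above. The function names are also made up for the example.

```python
import math

# Unit conversion: 1 mb = 1e12 fb, hence 1 fb^-1 = 1e12 mb^-1
MB_INV_PER_FB_INV = 1e12

def n_collisions(luminosity_fb_inv, cross_section_mb):
    """Number of collisions = integrated luminosity x cross-section."""
    return luminosity_fb_inv * MB_INV_PER_FB_INV * cross_section_mb

def precision_gain(data_increase_factor):
    """Statistical precision scales with the square root of the dataset size."""
    return math.sqrt(data_increase_factor)

# ASSUMPTION: 55 mb is an illustrative cross-section, not a number from the post
collisions = n_collisions(2.7, 55.0)
gain = precision_gain(2.0)

print(f"collisions in 2.7 fb^-1: {collisions:.3g}")
print(f"precision gain from doubling the data: {gain:.2f}")
```

The square-root law is also why doubling the dataset buys only about 40% in sensitivity rather than a factor of two.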
DZERO selects events which have an electron or a muon with high energy (the tag of a leptonic decay of the W boson), missing transverse energy, and two or three hadronic jets. The presence of an energy imbalance in the plane transverse to the beam direction is a comparatively clean signature of the escape of the energetic neutrino produced together with the charged lepton by the W decay, and two jets are expected from the decay of the Higgs boson to a pair of b quarks. However, you might well ask, quid opus fuit tertium?
No, I bet you would not ask it that way: for some reason, a reminiscence of Latin sprang up in my mind. Quid opus fuit tertium: what is the matter with the third one? The third jet is not specifically a signature of any one of the decay products of the WH pair we are after. However, if you remember what I mentioned above, we are searching for inclusive production of a WH pair: that means we accept the fact that the two projectiles also produced an additional energetic stream of hadrons in the final state. That possibility is by no means rare, and in fact it amounts to about 20% of the Higgs production events. By selecting events with two or three jets, DZERO increases its acceptance of signal events sizably.
A technique which has become commonplace in the hunt for elusive subnuclear particles is to slice and dice the data: categorizing events in disjoint classes is a powerful analysis strategy. By taking two-jet events on one side, and three-jet events on the other, DZERO can study them separately, and appreciate the different nuisances of each class. In fact, they further divide the data into subsets where either one jet or two jets were tagged as originating from a b-quark.
They also keep the electron+jets and the muon+jets events separate: this makes sense too, since the experimental signatures of electrons and muons are slightly different, as are the resulting energy resolutions. In total, one has eight disjoint classes, depending on the number of jets, the number of b-tags, and the lepton species.
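The count of eight classes is just two jet multiplicities times two b-tag categories times two lepton species. A minimal Python sketch of the bookkeeping follows; the labels are illustrative strings, not DZERO's own channel names.

```python
from itertools import product

# The three ways the sample is split, as described in the text
jet_multiplicities = (2, 3)
b_tag_categories = ("single b-tag", "double b-tag")
lepton_species = ("electron", "muon")

# Every combination is one disjoint analysis class
classes = list(product(jet_multiplicities, b_tag_categories, lepton_species))

for n_jets, tag, lepton in classes:
    print(f"{lepton} + {n_jets} jets, {tag}")

print(f"total number of classes: {len(classes)}")
```

Because each event has exactly one jet count, one tag category and one lepton species, no event can land in two classes, which is what makes it legitimate to combine them statistically later on.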
In order to decide whether there is a hint of Higgs bosons in any of the classes, backgrounds are studied using Monte Carlo simulations of all the Standard Model processes which could contribute to the eight selected signatures. These include the production of a W boson plus hadronic jets (“W+jets”) as well as the production of top quark pairs: both these processes produce energetic leptons in the final state. Another background is due to events which do not actually contain a lepton, and where a hadronic jet was mistaken for one. The latter is called “QCD background”, highlighting its origin in strong interaction processes yielding just hadronic jets: despite the rarity of a jet faking an energetic lepton, the huge rate of QCD events makes this background sizable.
Among the characteristics that can separate the WH signal from the above backgrounds, the identity of the parton originating the hadronic jets is a powerful one: b-jets are rarer than light-quark ones, but there must be two of them in a $H \to b\bar{b}$ decay. DZERO uses a neural network which employs seven discriminating variables to select jets with a likely b-quark content.
The good thing about a neural-network b-tagger is that the cut on the output of the network can be dialed to choose its purity. And in fact, DZERO does exactly that. They start with a loose selection which has a rate of “false positives” of 1.5% (light-quark jets that are classified as b-tagged). If two jets have such a loose b-tag, the event is classified as a “double b-tag”; otherwise, the NN output requirement is made tighter, and “single b-tag” events are collected by requiring that the b-tag has a better purity, with a “false positive” rate of 0.5%. These cuts have been optimized for their combined sensitivity to the Higgs signal.
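The two-step logic just described can be sketched in Python as follows. This is a toy under stated assumptions, not DZERO's code: the threshold values 0.5 and 0.8 are hypothetical placeholders standing in for the cuts that give 1.5% and 0.5% false-positive rates, and the function name and the "untagged" label are invented for the example.

```python
# HYPOTHETICAL cut values on the b-tagging NN output (not DZERO's actual numbers)
LOOSE_CUT = 0.5   # stands in for the cut with a 1.5% false-positive rate
TIGHT_CUT = 0.8   # stands in for the cut with a 0.5% false-positive rate

def classify_event(jet_nn_outputs):
    """Assign an event to a b-tag category from the NN outputs of its jets."""
    n_loose = sum(1 for output in jet_nn_outputs if output > LOOSE_CUT)
    if n_loose >= 2:
        # Two loosely tagged jets: purity comes from requiring the pair
        return "double b-tag"
    if any(output > TIGHT_CUT for output in jet_nn_outputs):
        # A lone tag must pass the tighter requirement
        return "single b-tag"
    return "untagged"

print(classify_event([0.6, 0.7]))   # two loose tags
print(classify_event([0.9, 0.1]))   # one tight tag
print(classify_event([0.6, 0.1]))   # one loose tag only
```

The design choice is sensible: a fake double tag needs two mistakes at once, so each tag can afford to be looser, while a lone tag has to carry all the purity by itself.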
Apart from b-tags, the signal displays different kinematics from all backgrounds. Again, seven variables are used as inputs to a neural network, and they now describe the event kinematics: they include the transverse energy of the second-leading jet, the angle between jets, the dijet invariant mass, and a matrix-element discriminant, which is computed by comparing the probability density of the four-momenta of the objects produced in the decay in a WH event to that of backgrounds. In the figure above, the matrix-element discriminant is shown for all the processes contributing to the class of W+2jet events with two b-tags. The output of the neural network shows that Higgs events fall on the right side of the distribution, while backgrounds pile up mostly on the left, as can be seen in the figure below.
Results of the search
Since no signal is observed in the NN output distribution seen in the data, DZERO proceeds to set upper limits on the signal cross-section. For the 2-jet event classes they use the NN output, while for the 3-jet event classes they use the dijet mass distribution. No justification is provided in their paper for this choice, which looks slightly odd to me, but I imagine they have done some optimization studies before taking this decision. However, I would imagine that the NN output is in principle always more discriminating than just one of the variables on which the network is constructed… Maybe somebody from DZERO could clarify this point in the comments thread, to the benefit of the other readers?
At the end of the day, DZERO obtains limits on the cross section of the signal they searched for, which are still above the standard model predictions whatever the Higgs mass: therefore, they do not provide an exclusion of mass values yet. These results, however, once combined with other results from CDF and DZERO, will one day directly imply that a SM Higgs cannot exist if its mass is in a specified range. In the graph below you can see the limit set by this analysis on the WH production cross-section as a function of Higgs mass.
The black curve shows the 95% exclusion, while the hatched red curve shows the result that DZERO was expecting to find, based on pseudoexperiments. The comparison of the two curves is not terribly informative, but it does show that there were no surprises from the data.
The result can also be shown in the standard “LLR plot” above, which shows, again as a function of the Higgs boson mass, the log-likelihood ratio of two hypotheses: the “background only” and the “signal+background” one. Let me explain what that is. For each mass value on the x-axis, imagine the Higgs is there. Then, with large statistics, the data would show a propensity for the “signal plus background” hypothesis, and the LLR would be large and negative. If, instead, the Higgs did not exist at any mass value, the LLR would be large and positive. The two hypotheses can be run on pseudo-data of the same statistical power as the data really collected, thus producing the red and black hatched lines in the plot. The two curves are different, but the red one does not manage to depart from the green band constructed around the black hatched one: that means that the data size and the algorithms used in the analysis do not have enough power to discriminate the two hypotheses, not even at 1-sigma level (which is the meaning of the width of the green band, while the yellow one shows two-sigma contours). The full black line shows the behavior of real data: they have a propensity to confirm the background-only hypothesis at low mass, and a slight penchant for the signal+background one at about 130 GeV. But this is a really, really small fluctuation, well within the one-sigma band!
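For readers who want to see the sign convention at work, here is a minimal Python sketch of a log-likelihood ratio for binned event counts. It is a simplification under stated assumptions, not the DZERO procedure: each bin is treated as an independent Poisson count, systematic uncertainties are ignored, the yields are invented numbers, and instead of throwing pseudoexperiments the code evaluates the LLR on two "typical" datasets whose counts equal the expectation under each hypothesis. With Poisson bins, LLR = -2 ln[L(s+b)/L(b)] reduces to 2 Σ [s_i - n_i ln(1 + s_i/b_i)].

```python
import math

def llr(observed, signal, background):
    """Statistics-only LLR = -2 ln[L(s+b) / L(b)] for independent Poisson bins."""
    total = 0.0
    for n, s, b in zip(observed, signal, background):
        total += 2.0 * (s - n * math.log(1.0 + s / b))
    return total

# HYPOTHETICAL expected yields in three bins of a discriminant (not DZERO numbers)
signal = [0.5, 1.0, 2.0]
background = [100.0, 40.0, 10.0]

# "Typical" datasets: observed counts set equal to the expectation of each hypothesis
background_only_data = background
signal_plus_background_data = [s + b for s, b in zip(signal, background)]

llr_background_only = llr(background_only_data, signal, background)
llr_signal_plus_background = llr(signal_plus_background_data, signal, background)

print(f"LLR for background-like data:        {llr_background_only:+.3f}")
print(f"LLR for signal+background-like data: {llr_signal_plus_background:+.3f}")
```

Background-like data give a positive LLR and signal-like data a negative one, as described above. With yields this small the two values sit close to zero and close to each other, which is the toy version of the red curve failing to leave the green band.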
I think the LLR plot is a great way to describe the results of the search visually. It tells you at once the power of the analysis and of the available data, and the outcome on the real events collected. Now, it takes twenty thick lines of text to explain it, but once you’ve grasped its meaning…