Update on Higgs searches: CDF, WH -> lv bb
September 17, 2007
Posted by dorigo in news, physics, science.
Enough time has passed since the recent summer conferences to have allowed the main authors of Higgs searches in CDF and D0 to be the first to show off their results in public. Therefore I do not feel the need for further restraint, and will now proceed to distribute a simplified account of the analyses, together – of course – with my very own notes.
Today I am going to describe the summer 2007 update (1.7/fb) of the CDF search for associated production of a Higgs boson and a W boson – a process that represents a good compromise between event rate and background reduction potential, and is thus one of the main weapons available at the Tevatron to catch the boson by the tail.
Before I deal with some analysis details, I wish to pay tribute to the most disciplined among my non-informed readers by offering some easy notes on the phenomenology of Higgs boson production and decay in proton-antiproton collisions, something which will surely help to understand what is done at the Tevatron and why. Those in a hurry are instead advised to just scroll down to the description of the analysis method and results below, in section two of this longish post.
1 – HIGGS PRODUCTION AND DECAY AT THE TEVATRON
The Higgs boson couples to all matter fields: the quarks that make up you and me, as well as the leptons – among which you are familiar with the electron, but which also include all neutrinos and the heavier cousins of the electron called muon and tau. It also couples, of course, to W and Z bosons, the carriers of the electroweak interaction. A particle is said to couple to another when they can interact directly, without anything mediating their exchange of energy and momentum. In the picture on the right, Anna Kournikova’s tennis racket couples directly with the tennis ball, while its coupling with her opponent’s racket is only indirect, regardless of the violence of their exchanges (although in reality, the rackets only feel photons objecting to atoms in their strings interpenetrating with the rubber of the ball!)
By the way, quarks and leptons are collectively called fermions, because they share the property of having half-integer intrinsic spin. The W, the Z, photons, gluons, and H particles are instead bosons, since they have integer intrinsic spin. Fermions and bosons are as different as fire and sea: you can destroy or create one of the latter at will, but you are prevented from creating a single fermion out of the blue: any vertex, joining three particle lines, will have either zero or two fermion lines. That is because angular momentum is quantized, among other reasons. But I digress.
Pictorially, it all means you can draw a line representing the propagation in space-time of one of these particles, make a kink at one point, and attach there another line coming out, representing another particle being created. You can thus draw Higgs bosons being emitted by a quark or lepton line, or by a W or Z boson line (in the diagram on the left, an electron emits a photon).
The vertex where the Higgs is being created by the propagating particle possesses a property, a number representing how strong the interaction is. Stronger means more probable: more frequent, that is. The Higgs-quark and Higgs-lepton vertices have a strength proportional to the mass of the particle coupling to the Higgs field, so that the corresponding probabilities grow with the square of that mass: it is exactly the property that “gives a mass” to the matter fields. The vertices representing the coupling of W and Z bosons to a Higgs particle carry a different strength – but they are not weak at all.
Now, when a proton and an antiproton collide in the core of the CDF or D0 detectors, there is a tiny but non-zero chance that one quark in the proton annihilates with an antiquark of the same species (same flavor, same colour: its genuine antiparticle) in the antiproton, producing a Higgs boson. Which quarks will be the most likely to produce H particles? Top quarks, for sure: their mass is forty times larger than that of their next-heaviest brothers, b-quarks. Things are not so easy, because top quarks are not found easily inside a proton, but we will leave a discussion of the mechanism giving rise to top quark loop-mediated interactions for another post. What I feel urged to point out is that you can imagine a top quark from the proton emitting a Higgs boson and bouncing back, traveling backwards in time: and a particle traveling backwards in time is nothing but its antiparticle traveling forwards. The process is shown in the top left panel below. The other graphs show other production mechanisms: fusion of top quarks (top right), higgstrahlung, also known as associated production (bottom left), and vector boson fusion (bottom right). In the remainder of this post, we will be discussing the associated production process.
Through the process described by the top left graph above you get single, direct production of H bosons alone. Let us now consider instead the associated production of H and a W boson. What happens is the following: a quark and an antiquark of different flavor annihilate, producing a W boson (W bosons in fact change the flavor of quarks as they couple to them), and the collision happens to make available much more energy than that required to pay for the mass of the produced W: the produced boson is off mass-shell, and is eager to let go of the excess energy, becoming a regular W. If its excess energy is large enough to pay for the mass of a Higgs boson, it becomes possible to give rise to the process depicted below: associated production of WH pairs, by means of off-shell W production.
It takes some practice with quantum field theory to compute the probability that a proton-antiproton collision gives rise to the two production processes discussed above. A graph of their cross sections (i.e., event rates for a given running condition of the accelerator) is shown on the left as a function of the unknown mass of the Higgs (also shown is associated production with a Z boson, in green). You then get to see that direct production (in red) is much more frequent than associated production (in blue). What matters, though, is that associated production is very important for the Higgs search at the Tevatron, because when the Higgs chooses to yield a quark pair, the surplus W may decay into a lepton-neutrino pair, $W \to l\nu$, in which case the charged lepton provides an easy identification handle which is otherwise missing in the event. This allows us to search for Higgs decay products in the decay $H \to b\bar{b}$.
What I wrote above reminds me we have not discussed Higgs decay yet. Why is the decay $H \to b\bar{b}$ so important? If the Higgs couples to all fermions (quarks and leptons) with a probability proportional to their masses squared, one would conclude that $H \to t\bar{t}$ is way more important as a decay mechanism: the coupling to top quarks is larger, so the diagram weighs more in determining the possible decay modes. Unless, of course, it is energetically forbidden: and indeed, in the range of Higgs boson masses where we are searching for a signal, the Higgs cannot decay to a top quark pair, because $M_H < 2 m_t$.
So, b-quarks are the heaviest things the Higgs can decay to, if its mass is lower than twice the W boson mass. That is the reason behind looking for $H \to b\bar{b}$ decays, despite the large backgrounds mimicking that signature. That is also why the associated production is so important at the Tevatron: direct production of a Higgs decaying to two b-jets is affected by a prohibitively large background from generic production of $b\bar{b}$ pairs by quantum chromodynamics – at rates of a million to one.
2 – THE SEARCH FOR ASSOCIATED PRODUCTION EVENTS
CDF has been searching for associated production of W and H bosons in their data for more than ten years now, and yet the analysis techniques have not crystallized. That is because of the incredible complexity of putting together the best possible kinematical selection, the most efficient and precise b-quark identification algorithm, the best possible lepton identification strategy, the best possible correction of the energy of hadronic jets, and then re-optimizing the whole thing such that the most favourable mixture of signal and backgrounds is achieved in the selected sample.
Things are converging though: the present instance of the search, used on a dataset of 1.7 inverse femtobarns of collisions (that is, almost exactly 100 trillion of them, among which about 30 million W bosons were produced alone, and 300 together with a H if the mass of the latter is close to 120 GeV), is a very refined effort. CDF is squeezing their lepton-triggered dataset to the utmost, in the knowledge that the CMS and ATLAS experiments are still a long way from taking over the business.
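The counts quoted above follow from the usual relation between event yield, cross section and integrated luminosity, N = σ × L. As a rough sketch, one can invert that relation to see which cross sections those counts imply for 1.7/fb. The function and variable names below are illustrative only, and the inputs are the rounded figures from the text, so the results are indicative rather than precise.

```python
# Rough check of N = sigma * L, inverted to sigma = N / L.
# Inputs are the rounded counts quoted in the text; names are illustrative.

FB_INV_TO_B_INV = 1e15  # 1 inverse femtobarn = 1e15 inverse barns


def implied_cross_section_barn(n_events: float, lumi_fb_inv: float) -> float:
    """Cross section (in barns) implied by n_events seen in lumi_fb_inv of data."""
    return n_events / (lumi_fb_inv * FB_INV_TO_B_INV)


LUMI_FB_INV = 1.7  # integrated luminosity of the dataset, in inverse femtobarns

counts = {
    "all collisions": 100e12,
    "W bosons": 30e6,
    "WH (M_H near 120 GeV)": 300,
}

for name, n_events in counts.items():
    sigma = implied_cross_section_barn(n_events, LUMI_FB_INV)
    print(f"{name}: about {sigma:.2e} barn")
```

With these inputs the implied cross sections come out at roughly 59 millibarns for all collisions, 18 nanobarns for W production, and 0.18 picobarns for WH: eleven orders of magnitude separate the signal from the bulk of what the detector sees.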
Events are originally selected online by a fast system spotting the signal of a high-energy electron or muon. The resulting datasets are rich with $W \to l\nu$ decays, which can be extracted with high purity if one requires the absence of significant accompanying jet activity. In a WH search, however, two energetic jets are expected from H decay; after jet reconstruction, events with two jets are thus selected. This makes backgrounds harder to deal with: not only do events where the lepton signal is a fake become more frequent as jet activity increases, but the annoying background from top quark pairs also starts becoming an issue. However, there is no alternative: the Higgs boson has to be reconstructed from the two jets it produces.
Indeed, the next step sees the top quark background increase in importance. That is because the two jets are searched for a signal of b-quark decay: a secondary decay point precisely reconstructed from charged tracks in the jet cone, or even just indirect evidence of the b-decay from tracks not pointing back to the interaction point. Top pair production may indeed yield a W boson signal together with four hadronic jets, two of which come from b-quark decay, and if the non-b jets get lost (either because they are emitted at small angle with respect to the beam, or because their energy is insufficient for a clear identification) top events contaminate the sample. Another possibility is that both top quarks yield a leptonic W boson decay and a b-jet: then it is sufficient to lose one lepton, and the event will look like a genuine WH event.
Finally, the event kinematics is exploited to sort out the most Higgs-like events by a Neural Network classifier. The dijet mass (shown above, for events with two identified b-jets) is of course the most important variable to study, but others have a non-negligible impact, so the combined use of all information yields the most promising discrimination. The times of good-old bump hunting seem to have come and gone in very high-energy collisions: what you get to look at, in the end, is a likelihood plot or a Neural Network output. Make no mistake here: CDF put a tremendous effort into achieving the best possible resolution on the invariant mass of dijet systems, and the payoff is a 20-30% increase in the discovery reach – the equivalent of running the Tevatron for one full year more. Everybody likes to see a mass plot, but in the case of Higgs searches, where the signal is tiny, the most statistically advanced methods are just unavoidable, and the dijet mass disappears in the soup, only to emerge as a large weight in a Neural Network output.
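For readers who want to see what the dijet mass mentioned above actually is, here is a minimal sketch of the textbook computation: sum the four-momenta of the two jets and take the invariant mass of the sum. This is not CDF's reconstruction code, which also applies the jet energy corrections discussed in this post; the function names and the jet values are made up for illustration, and energies are in GeV with c = 1.

```python
import math


def jet_four_vector(pt, eta, phi, mass=0.0):
    """Return (E, px, py, pz) from transverse momentum, pseudorapidity and azimuth."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px**2 + py**2 + pz**2 + mass**2)
    return (energy, px, py, pz)


def dijet_mass(jet_a, jet_b):
    """Invariant mass of the system formed by two jet four-vectors."""
    energy, px, py, pz = (a + b for a, b in zip(jet_a, jet_b))
    mass_squared = energy**2 - px**2 - py**2 - pz**2
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(mass_squared, 0.0))


# Two hypothetical massless jets, back to back in the transverse plane.
jet1 = jet_four_vector(pt=60.0, eta=0.0, phi=0.0)
jet2 = jet_four_vector(pt=60.0, eta=0.0, phi=math.pi)
print(f"dijet mass: {dijet_mass(jet1, jet2):.1f} GeV")
```

Any mismeasurement of the two jet energies propagates directly into this number, which is why the resolution matters so much.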
So be it. The figure below shows CDF data (black points) compared to the expected background mix in a plot of the NN output. WH events (shown as a red line, normalized to 10 times the expected SM yield) should populate the rightmost bins, but you expect very few of them there with the statistics integrated so far.
Finally, the room left by the mixture of backgrounds to a possible WH signal is computed from the distributions shown in the plot above, and cross section limits are extracted at 95% confidence level. One then proceeds to compute the ratio between the observed limit and the expected standard model cross section for the sought process: as both depend on the unknown Higgs boson mass, the ratio does too – and indeed it is a growing curve, reflecting the combination of two effects. First, as the Higgs boson mass increases, the $H \to b\bar{b}$ decay becomes less and less probable, leaving room to the $H \to WW$ one. Second, there are just fewer Higgses produced if their mass is larger. You can see the ratio plotted below: the black curve is the measured 95% limit, while the red curve with the yellow band represents the limit CDF expected to set given the strength of its analysis and the size of the dataset used. At 115 GeV, CDF excludes the existence of a Higgs boson produced at 10 times the standard model cross section.
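The ratio described above is simple enough to spell out in a few lines. In this sketch the function names are mine, and the two input numbers are placeholders chosen only so that the ratio comes out at the factor of 10 quoted for 115 GeV; they are not the measured limit or the predicted cross section.

```python
def limit_ratio(limit_95cl: float, sm_cross_section: float) -> float:
    """Ratio of the 95% CL upper limit to the standard model prediction.

    Both arguments must be expressed in the same units.
    """
    if sm_cross_section <= 0:
        raise ValueError("the predicted cross section must be positive")
    return limit_95cl / sm_cross_section


def is_excluded(ratio: float) -> bool:
    """A standard model Higgs at this mass is excluded only if the ratio drops below 1."""
    return ratio < 1.0


# Placeholder inputs (arbitrary common units), NOT measured values.
ratio = limit_ratio(limit_95cl=1.0, sm_cross_section=0.1)
print(f"limit / SM = {ratio:.1f}, excluded: {is_excluded(ratio)}")
```

A ratio of 10 therefore means the analysis would need to become ten times more sensitive, by itself, before it could say anything about the standard model Higgs at that mass.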
Of course, this is just one – albeit an important one – of many different searches performed by CDF for the Higgs boson: other searches study different final states, different production mechanisms, and use other techniques. In fact, while the limit above only stops at 10 times the predicted Higgs cross section for $M_H=115 GeV$, the combination of all CDF searches is actually getting close to excluding some mass values. Together with D0, the Tevatron is in fact excluding a Higgs boson whose production is 1.4 times the SM one if $M_H=160 GeV$.
3 – MY COMMENTS AND SUGGESTIONS
Now, before I leave the very few aficionados who survived this far, let me say what I believe could still get slightly but noticeably improved in the analysis described above. After being convinced by Steve Kuhlmann’s excellent studies of jet resolution that CDF excels in jet energy measurement if jets are reconstructed in a cone 0.7 radians wide, I think it is high time to stop doing what we have been doing for far too long: id est, reconstructing jets with a 0.4-radian-wide cone.
The latter option, introduced in CDF for the top quark searches about 15 years ago (I have no shame in declaring that I, a summer student then, had convinced myself the smaller cone was way better for top searches well before it was adopted), is good when one reconstructs high jet-multiplicity events, where it is just too probable that two R=0.7 jet cones will overlap significantly, making the decay kinematics hard to figure out.
However, when events contain fewer than three or four jets, their overlap becomes less of an issue. The advantage of a wider cone is that it allows a direct measurement of radiation emitted at large angle with respect to the direction of the original fragmenting parton. Usually, when using the standard R=0.4 cone, CDF applies an average correction for that “out-of-cone” energy. But an average correction will never be as good as a direct measurement, capable of capturing the fluctuations in the quantum process of final state radiation!
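The overlap argument can be made concrete with a purely geometric sketch: two cones of radius R overlap when the distance between the jet axes in the pseudorapidity-azimuth plane is smaller than 2R. Real cone algorithms also include merging and splitting steps that this ignores, and the function names and jet directions below are made up for illustration.

```python
import math


def delta_phi(phi1: float, phi2: float) -> float:
    """Azimuthal difference wrapped into the interval (-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    return dphi


def delta_r(eta1: float, phi1: float, eta2: float, phi2: float) -> float:
    """Distance between two jet axes in the eta-phi plane."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


def cones_overlap(eta1, phi1, eta2, phi2, cone_radius: float) -> bool:
    """True if two cones of the given radius around the two axes intersect."""
    return delta_r(eta1, phi1, eta2, phi2) < 2.0 * cone_radius


# Two hypothetical jets separated by one unit of pseudorapidity.
for radius in (0.4, 0.7):
    overlap = cones_overlap(0.0, 0.0, 1.0, 0.0, cone_radius=radius)
    print(f"R={radius}: cones overlap -> {overlap}")
```

For this pair of jets the R=0.4 cones stay apart while the R=0.7 ones intersect, which is the price of the wider cone in busy events and much less of a concern when only two jets are expected.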
QCD experts know very well that the R=0.7 cone is also the most theoretically-motivated choice (well, they will say they prefer another clustering algorithm, but that is a secondary issue for the reconstruction of decay topologies of heavy objects). That, in fact, is the reason why CDF still uses R=0.7 for measurements of QCD observables (inclusive jet cross sections, angular distributions, and the like). But jet resolution is way, way more important for Higgs searches at low mass!
You might well ask: am I alone in thinking this? Why then would CDF not switch to a wider cone?
The reason is that it is a non-adiabatic change in a very well tuned and oiled machinery. B-tagging has been studied in painstaking detail with R=0.4 jets. The event kinematics, the backgrounds – everything depends on that choice. It will probably take of the order of 10 man-years to obtain a similar accuracy in the tuning of all the necessary tools with the wider cone. But still: if CDF, if the Tevatron is to have a chance at finding the Higgs, as opposed to just snatching away a small chunk of the allowed mass interval, everything – everything – must be optimized.
Recently, I was part of the group (led by Julien Donini) which measured the b-jet energy scale in CDF by reconstructing the signal of 6300 $Z \to b\bar{b}$ decays and fitting its true position, comparing to simulations. The signal can be used also for jet resolution studies, and has been extracted by using the R=0.7 cone: the sound choice for a dijet decay signal. I sincerely hope CDF will use the opportunity of the 6300 isolated Z decays to test better jet measurement algorithms, running with the 0.7 cone. I do not expect much: maybe a further 10% improvement. But these are times when even a 1% improvement is worth attention.