My criticism of the D0 MSSM result
February 9, 2007. Posted by dorigo in news, personal, physics, science.
In the last post (see below, I am too lazy to link it here!), I described the recent result obtained by D0 in the search for minimal supersymmetric neutral Higgs bosons. To summarize the result in a single plot, one can refer to the mass distribution shown below:
The black crosses are experimental data, and the stacked histograms are the expected backgrounds; the expected Higgs signal for a mass of 160 GeV is also shown in grey as an example.
The plot above is the result of several man-years of work, and since I hold my D0 colleagues in high esteem, I need to make clear from the outset that I believe they did a careful job and that the results they found are sound.
An untrained eye can look at this plot and say, hey, the points follow the histogram quite well. A more malicious eye, though, will notice that the y axis is logarithmic, and will immediately focus on the two highest points in the distribution, the two bins each containing about 250 events. It is not quite evident because of the logarithmic axis, but those two points are well below the predictions!
You may become confused here. What do I care, you could say, if there is a downward fluctuation in those two bins? After all, I am looking for an excess of events.
That is right, an excess over predictions. If the predictions were overestimated, any possible excess would become harder to spot, and the computed limit in the MSSM parameter space (described by M(A) and tan(beta), as explained in the former post) would become stricter: D0 would be able to rule out a larger chunk of parameter space (see the former post for the parameter space plot). And since I am excessively malicious, I could think that an experimenter might be unconsciously driven toward overestimating his backgrounds, if he was poised to set a limit on a new particle rather than attempting to find one…
Let me then discuss in detail how the background prediction is computed by D0, to arrive at the stack of histograms shown above. They use a technique which is quite standard in high-energy physics these days: they compute the rate of physical processes that yield real tau leptons with Monte Carlo, and estimate the rate of processes that produce fake tau leptons with suitable control samples of experimental data.
The total number of events in the plot is 1144, while the sum of backgrounds amounts to 1287 ± 130 events. These two numbers are compatible within the quoted error, so one could again raise an eyebrow and say, “why are you complaining about those two high points? Overall, the agreement is quite good!”.
Prediction and observation match well, indeed. Those two points I mentioned are maybe only slightly off. So why bother?
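To put a number on "compatible within the quoted error", here is a minimal sketch of the arithmetic using only the two totals quoted above. The function name and the option to add the Poisson fluctuation of the count in quadrature are my own choices for illustration, not something taken from the D0 note.

```python
import math

def pull(observed, expected, expected_err, include_poisson=True):
    """Observed minus expected, in units of the uncertainty on the difference."""
    variance = expected_err ** 2
    if include_poisson:
        # Statistical (Poisson) variance of the count, evaluated at the prediction
        variance += expected
    return (observed - expected) / math.sqrt(variance)

n_obs = 1144                   # events in the first plot
bkg, bkg_err = 1287.0, 130.0   # sum of backgrounds and its quoted error

# Using the quoted background error alone: about -1.1 standard deviations
print(round(pull(n_obs, bkg, bkg_err, include_poisson=False), 2))
# Adding the Poisson fluctuation of the count in quadrature: about -1.06
print(round(pull(n_obs, bkg, bkg_err), 2))
```

Either way the deficit is about one standard deviation, which is why the inclusive count alone would not make anybody lose sleep.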
I do not care about those two bins in particular, but they ring a bell in my head: they remind me that the physical backgrounds have been estimated with the Monte Carlo, rather than normalized to the data in some orthogonal sample. This is perfectly legal, and quite possibly the best way to go, but imagine what would have happened if D0 had instead chosen to normalize the Z->tau tau contribution (the largest by far, at 1163 ± 127 events by itself) such that the sum of all histograms best fits the data in the region of reconstructed mass below 100 GeV: the total background expectation would have gone down by maybe 10 to 20% in most of the mass spectrum, and the extracted Higgs limits would have been less strict.
I would not be writing this post if I had not looked a bit further. Again, please forgive me if I sound like I know the D0 analysis inside out. Indeed, I have only read their public documentation and looked critically at a few plots. But this is just a blog, so I will exercise my imperium here. Let us then have a look at another plot D0 posted in their public note.
The second plot above shows a superset of the data contained in the first plot. It is the distribution of a quantity called the transverse W mass M(W). It is complicated for me to explain exactly what all this means, but suffice it to say that it represents a discriminating variable on which the final selection is based: the 1144 events of the first plot belong to the first five bins of the distribution above, those for which M(W)<20 GeV. By cutting like that, D0 make sure that the data is devoid of real W boson decays to muon-neutrino pairs (the yellow histogram, which by the way is normalized to the data!).
My point is that the experimental data above 20 GeV in the second plot sizably undershoot the sum of backgrounds until M(W) reaches 60 GeV, just as much as (if not more than) they do in the first plot (or equivalently, in the first five bins below 20 GeV). What that tells me is that the Z->tau tau contribution (the red histogram, which is still the largest contributor to the data for M(W) below 60 GeV) is quite likely to have been overestimated by the Monte Carlo.
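As a rough cross-check of that "10 to 20%", here is a back-of-the-envelope sketch. A loud caveat: I do not have the bin contents below 100 GeV, so the code rescales the Z->tau tau yield to match the inclusive count of 1144 events instead. That is an assumption standing in for the fit described above, not the fit itself, and the function name is made up for the occasion.

```python
def z_scale_factor(n_data, total_bkg, z_bkg):
    """Factor by which the Z->tau tau yield must be multiplied so that
    the sum of all backgrounds equals the observed count."""
    other_bkg = total_bkg - z_bkg          # everything that is not Z->tau tau
    return (n_data - other_bkg) / z_bkg

n_data = 1144        # observed events in the first plot
total_bkg = 1287.0   # quoted sum of backgrounds
z_bkg = 1163.0       # quoted Z->tau tau contribution

k = z_scale_factor(n_data, total_bkg, z_bkg)
new_total = (total_bkg - z_bkg) + k * z_bkg

print(f"Z scale factor: {k:.3f}")
print(f"reduction of total background: {100 * (1 - new_total / total_bkg):.1f}%")
```

With these inputs the Z contribution shrinks by roughly 12% and the total background by roughly 11%, at the lower end of the range I guessed by eye.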
There. If the Monte Carlo simulation overestimates the Z contribution, and if the Higgs limits are obtained by reference to an overestimated prediction, the limits quoted by D0 are too strict.
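To see the mechanism at work, consider a toy counting experiment. The numbers below (10 events observed, a background of 10 versus an inflated 12) are purely hypothetical and have nothing to do with the real D0 inputs, and the classical Poisson upper limit used here is a textbook simplification, not necessarily the procedure D0 followed.

```python
import math

def poisson_cdf(n, mu):
    """P(N <= n) for a Poisson variable of mean mu."""
    term = math.exp(-mu)
    total = term
    for k in range(1, n + 1):
        term *= mu / k
        total += term
    return total

def upper_limit(n_obs, bkg, cl=0.95):
    """Classical upper limit on the signal yield s: the value of s for which
    P(N <= n_obs | s + bkg) drops to 1 - cl, found by bisection."""
    lo, hi = 0.0, 10.0 * (n_obs + 10)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if poisson_cdf(n_obs, mid + bkg) > 1.0 - cl:
            lo = mid   # this signal is still allowed, so the limit is higher
        else:
            hi = mid
    return 0.5 * (lo + hi)

n_obs = 10                                   # hypothetical observed count
limit_true = upper_limit(n_obs, bkg=10.0)    # hypothetical correct background
limit_over = upper_limit(n_obs, bkg=12.0)    # same data, background inflated by 20%

print(f"limit with b=10: {limit_true:.2f} events")
print(f"limit with b=12: {limit_over:.2f} events")
```

The inflated background yields a lower, that is stricter, limit on the signal from exactly the same data, which is the whole point of my complaint.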
Any D0 folks out there who are ready to flame me or call me names? 😉
Of course, I am just another HEP soldier, and not a particularly smart one. I may have misunderstood something by reading the D0 documents, or plainly failed to grasp that the limit is based on a shape analysis rather than a counting experiment, or who knows what… There are indeed ten different ways that I may have exposed myself to ridicule. So there, I am naked here, take a shot! Where is it that my reasoning fails?
The bottom line, until I am proven wrong, is the following: the D0 limit might (just might) be too tight. Given the evidence that their Z Monte Carlo was overestimating the yield, they should have normalized it to the data in the second plot, but they did not. The CDF excess of H->tau tau decays at 160 GeV is not ruled out yet in my opinion.
Of course, if you ask me whether I believe the 2.1-sigma excess seen by CDF is a real Higgs signal, I will have to answer with a resounding “NO”. Besides, I already discussed how likely it is to see a bump in a smoothly falling distribution, if you do not search for a particular mass value before looking at the data, in a series of posts.
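For those who want the gist of that argument in numbers, here is a minimal sketch. It converts a local 2.1-sigma excess into a one-sided p-value and then asks how often at least one such fluctuation would show up among several independent mass windows. The choice of 10 windows is a hypothetical number picked for illustration, not a property of the CDF search.

```python
import math

def local_p_value(n_sigma):
    """One-sided Gaussian tail probability for an excess of n_sigma."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))

def global_p_value(p_local, n_windows):
    """Chance that at least one of n_windows independent places
    fluctuates up by as much as the local excess."""
    return 1.0 - (1.0 - p_local) ** n_windows

p_loc = local_p_value(2.1)
p_glob = global_p_value(p_loc, n_windows=10)   # 10 windows is an assumption

print(f"local p-value:  {p_loc:.4f}")
print(f"global p-value: {p_glob:.3f}")
```

A local probability of a bit less than 2% becomes something like one chance in six once you admit you would have been equally happy with a bump anywhere else.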
Update: I realize I failed to link the D0 document. You can get a pdf version here.