
My criticism of the D0 MSSM result February 9, 2007

Posted by dorigo in news, personal, physics, science.

In the last post (see below, I am too lazy to link it here!), I described the recent result obtained by D0 in the search for minimal supersymmetric neutral Higgs bosons. To summarize the result in a single plot, one can refer to the mass distribution shown below:

The black crosses are experimental data, and the stacked histograms are the expected backgrounds; the expected Higgs signal for a mass of 160 GeV is also shown in grey as an example.

The plot above is the result of several man-years, and since I hold my D0 colleagues in high esteem, I need to make clear from the outset that I believe they did a careful job and that the results they found are sound.

However.

An untrained eye can look at this plot and say, hey, the points follow the histogram quite well. A more malicious eye, though, will notice that the y axis is logarithmic, and will immediately focus on the two highest points in the distribution, the two bins each containing about 250 events. It is not quite evident because of the logarithmic axis, but those two points are well below the predictions!

You may be confused here. What do I care, you could say, if there is a downward fluctuation of those two bins? After all, I am looking for an excess of events.

That is right, an excess over predictions. If predictions were overestimated, any possible excess would become harder to spot, and the computed limit in the MSSM parameter space (described by M(A) and tan(beta), as explained in the former post) would become stricter: D0 would be able to rule out a larger chunk of parameter space (see the former post for the parameter space plot). And since I am excessively malicious, I could think that an experimenter might be unconsciously driven toward overestimating his backgrounds, if he set out to obtain a limit on a new particle rather than attempting to find one…
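Just to show the mechanism with numbers: below is a toy counting experiment of my own invention (it has nothing to do with the actual D0 machinery, and all yields are made up), illustrating that for a fixed observed count, a larger assumed background translates into a stricter 95% CL upper limit on a possible signal.

```python
# Toy counting experiment (not the D0 procedure): for the same observed count,
# a larger assumed background yields a stricter (smaller) 95% CL upper limit
# on the signal. The background uncertainty is ignored for simplicity.
from scipy.stats import poisson
from scipy.optimize import brentq

def upper_limit(n_obs, bkg, cl=0.95):
    """Classical upper limit on the signal mean s, defined by
    P(N <= n_obs | bkg + s) = 1 - cl, with the background taken as exactly known."""
    f = lambda s: poisson.cdf(n_obs, bkg + s) - (1.0 - cl)
    return brentq(f, 0.0, 10.0 * (n_obs + 10))

n_obs = 100                       # hypothetical observed yield
for bkg in (90.0, 100.0, 110.0):  # hypothetical background estimates
    print(f"background = {bkg:5.1f}  ->  95% CL signal limit = {upper_limit(n_obs, bkg):5.1f}")
```

The same data, confronted with a background claimed to be 110 events rather than 90, allows one to exclude a much smaller signal.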

Let me then discuss in detail how the background prediction is computed by D0, to arrive at the stack of histograms shown above. They use a technique which is quite standard in high-energy physics these days: they compute the rate of physical processes that yield real tau leptons with Monte Carlo, and estimate the rate of processes that produce fake tau leptons with suitable control samples of experimental data.
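The public note does not spell out the exact recipe, so take the following as a generic sketch of that two-component estimate; the function names, the fake-rate formula, and every number in it are my own illustration, not D0's code.

```python
# Generic sketch of a two-component background estimate (my own illustration,
# not D0's actual procedure): real-tau backgrounds are taken from Monte Carlo
# scaled to the integrated luminosity, while fake-tau backgrounds come from a
# control sample of data weighted by a measured fake rate.

def mc_background(n_passing, n_generated, cross_section_pb, luminosity_invpb):
    """Expected events from a simulated process: sigma * L * selection efficiency."""
    return cross_section_pb * luminosity_invpb * (n_passing / n_generated)

def fake_background(n_loose_not_tight, fake_rate):
    """Fake-tau estimate a la fake-rate method: events passing a loose but
    failing the tight tau selection, weighted by f / (1 - f)."""
    return n_loose_not_tight * fake_rate / (1.0 - fake_rate)

# Hypothetical inputs, for illustration only
z_tautau = mc_background(4600, 1_000_000, 250.0, 1000.0)  # dominant real-tau source
fakes    = fake_background(800, 0.15)                     # data-driven component
print(f"Z->tautau (MC): {z_tautau:.0f} events,  fakes (data-driven): {fakes:.0f} events")
```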

The total number of events in the plot is 1144, while the sum of backgrounds amounts to 1287 ± 130 events. These two numbers are compatible within the quoted error, so one could again raise an eyebrow and say, “why are you complaining about those two high points? Overall, the agreement is quite good!”.
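For what it is worth, the arithmetic behind “compatible within the quoted error” is simple enough to show; note that the Poisson error on the data is something I add myself, it is not part of the D0 quote.

```python
# Back-of-the-envelope compatibility check between the observed count and the
# background prediction quoted above. The 130-event uncertainty is the one
# quoted by D0; the Poisson error on the data is added in quadrature.
from math import sqrt

n_obs, n_exp, sigma_exp = 1144, 1287, 130
sigma_obs = sqrt(n_obs)                      # about 34 events of Poisson error
pull = (n_obs - n_exp) / sqrt(sigma_exp**2 + sigma_obs**2)
print(f"deficit: {n_obs - n_exp} events, pull = {pull:.2f} sigma")
```

The deficit amounts to roughly one standard deviation, hence the agreement looks fine when taken as a whole.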

Prediction and observation match well, indeed. Those two points I mentioned are maybe only slightly off. So why bother?

I do not care about those two bins in particular, but they ring a bell in my head: they remind me that the physical backgrounds have been estimated with the Monte Carlo, rather than normalized to the data in some orthogonal sample. This is perfectly legal, and quite possibly the best way to go, but imagine what would have happened if D0 had instead chosen to normalize the Z->tau tau contribution (the largest by far, at 1163 ± 127 events by itself) such that the sum of all histograms best fit the data in the region of reconstructed mass below 100 GeV: the total background expectation would have gone down by maybe 10 to 20% in most of the mass spectrum, and the extracted Higgs limits would have been less strict.
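To be explicit about what “normalizing to the data” would mean in practice, here is a sketch with invented bin contents; I do not have the numbers behind the plot, so the inputs below are purely illustrative.

```python
# Sketch of the alternative normalization described above (numbers invented):
# scale the Z->tautau template so that the total prediction matches the data
# in the reconstructed-mass region below 100 GeV, then apply that scale everywhere.

def z_scale_factor(data_counts, z_counts, other_bkg_counts):
    """Scale factor for the Z template from a control region:
    (data - other backgrounds) / Z prediction, summed over the region."""
    return (sum(data_counts) - sum(other_bkg_counts)) / sum(z_counts)

# Hypothetical per-bin yields for reconstructed mass below 100 GeV
data      = [120, 210, 260, 190, 110]
z_mc      = [115, 215, 280, 210, 125]
other_bkg = [15, 25, 30, 20, 12]

sf = z_scale_factor(data, z_mc, other_bkg)
print(f"Z->tautau scale factor: {sf:.2f}")   # below 1.0 if the MC overshoots
```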

I would not be writing this post if I had not looked a bit further. Again, please forgive me if I sound like I know the D0 analysis in and out. Indeed, I have only read their public documentation and looked critically at a few plots. But this is just a blog, so I will exercise my imperium here. Let us then have a look at another plot D0 posted in their public note.

The second plot above shows a superset of the data contained in the first plot. It is the distribution of a quantity called the transverse W mass, M(W). It is complicated for me to explain exactly what all this means, but suffice it to say that it represents a discriminating variable on which the final selection is based: the 1144 events of the first plot belong to the first five bins of the distribution above, those for which M(W) < 20 GeV. By cutting like that, D0 make sure that the data is devoid of real W boson decays to muon-neutrino pairs (the yellow histogram, which by the way is normalized to the data!).
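For the curious: the transverse mass is the standard hadron-collider variable built from the muon transverse momentum and the missing transverse energy. In code (my own notation, not taken from the D0 note) it reads:

```python
# Standard definition of the W transverse mass, built from the muon transverse
# momentum and the missing transverse energy. Real W -> mu nu decays populate
# the region near the W mass, while Z -> tau tau events sit at low values.
from math import sqrt, cos

def transverse_mass(pt_mu, met, dphi):
    """M_T = sqrt(2 * pT(mu) * MET * (1 - cos(dphi))), momenta in GeV,
    dphi = azimuthal angle between the muon and the missing energy."""
    return sqrt(2.0 * pt_mu * met * (1.0 - cos(dphi)))

# A W -> mu nu candidate: hard muon, large MET, nearly back-to-back in phi
print(transverse_mass(40.0, 40.0, 3.0))   # close to 80 GeV, W-like
# A Z -> tau tau candidate: softer muon, little MET
print(transverse_mass(25.0, 8.0, 1.0))    # well below 20 GeV, passes the cut
```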

My point is that the experimental data above 20 GeV in the second plot sizably undershoot the sum of backgrounds until M(W) reaches 60 GeV, just as much as (if not more than) they do in the first plot (or equivalently, in the first five bins below 20 GeV). What that tells me is that the Z->tau tau contribution (the red histogram, which is still the largest contributor to the data for M(W) below 60 GeV) is quite likely to have been overestimated by the Monte Carlo.

There. If the Monte Carlo simulation overestimates the Z contribution, and if the Higgs limits are obtained by reference to an overestimated prediction, the limits quoted by D0 are too strict.

Any D0 folks out there who are ready to flame me or call me names? 😉

Of course, I am just another HEP soldier, and not too smart a one. I may have misunderstood something by reading the D0 documents, or plainly failed to grasp that the limit is based on a shape analysis rather than a counting experiment, or who knows what… There are indeed ten different ways in which I may have exposed myself to ridicule. So there, I am naked here, take a shot! Where is it that my reasoning fails?

The bottom line, until I am proven wrong, is the following: the D0 limit might (just might) be too tight. Given the evidence that their Z Monte Carlo overestimates the yield, they should have normalized it to the data in the second plot, but they did not. The CDF excess of H->tau tau decays at 160 GeV is, in my opinion, not ruled out yet.

Of course, if you ask me whether I believe the 2.1-sigma excess seen by CDF is a real Higgs signal, I will have to answer with a resounding “NO”. Besides, I already discussed in a series of posts how likely it is to see a bump in a smoothly falling distribution, if you do not search for a particular mass value before looking at the data.
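For readers who did not follow that series, the point is easy to make with a toy experiment, something like the sketch below: pure background-only pseudo-experiments on a flat spectrum (for simplicity), nothing to do with the real CDF data.

```python
# Toy illustration of the "look-elsewhere" effect: how often does the
# background alone produce a local 2-sigma (or larger) excess *somewhere* in
# a spectrum, if no mass value is chosen in advance?
import numpy as np

rng = np.random.default_rng(42)
n_bins, expected_per_bin, n_toys = 30, 100.0, 10000

n_with_bump = 0
for _ in range(n_toys):
    observed = rng.poisson(expected_per_bin, size=n_bins)
    # local significance of the highest bin, in the Gaussian approximation
    max_sigma = (observed.max() - expected_per_bin) / np.sqrt(expected_per_bin)
    if max_sigma >= 2.0:
        n_with_bump += 1

print(f"fraction of background-only toys with a >= 2 sigma bin: {n_with_bump / n_toys:.2f}")
```

With thirty bins to look at, a local two-sigma bump somewhere turns out to be more likely than not.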

Update: I realize I failed to link the D0 document. You can get a PDF version here.

Comments

1. Kea - February 10, 2007

I’m a bit confused, Tommaso. Even if you are right about the D0 limit, there doesn’t seem to be anything in exactly the same spot as the CDF signal – or am I missing something?

2. dorigo - February 10, 2007

Hi Kea,

No, you are not missing anything big. I was just pointing out a slight inconsistency in this analysis by D0.

I do not know whether the same plot you see above, once the Z were rescaled to match the data outside the M(W) < 20 GeV region, would show an excess, match the prediction, or still show a deficit.

My point was just a nitpick about the procedure… I do not believe in the CDF signal of H->tau tau anyway!

Cheers,
T.

3. Andrea Giammanco - February 10, 2007

Your conclusion sounds perfectly plausible, and I would take a “socio-scientific” lesson from this: we are now trained to be extremely careful about unconscious biases before we claim a discovery, but our trained skepticism tends to be relaxed when we are doing the opposite, i.e. setting limits on novelties.
Another potential source of insanity is the fact that the competition among experimental collaborations is very often crudely quantified, in terms of precision on standard parameters (particle masses, mixing angles, etc.) and in terms of strict limits on new phenomena.
The latter is particularly unhealthy, since there is the paradox that if you are observing an excess over your background, your limits are much less strict than if you don’t have any excess.
So, let’s say, for the sake of example, that there IS a new particle, and collaboration A has a genuine 2-sigma excess while B has just a 1-sigma one (due to a downward statistical fluctuation, to a lesser intrinsic sensitivity of the analysis, or to plain mistakes leading to an overestimate of the uncertainty): collaboration B will be rewarded by these stupid comparisons, since their limits will be “better”. It’s uncomfortably easy to suspect that an unconscious bias exists towards such a cheap way to be regarded as “the best”.

4. dorigo - February 10, 2007

Ciao Andrea,

you are right, it is a rather embarrassing thing that the most successful analysis in the search for a new particle usually ends up being the one which saw fewer events. We are seeing several examples of this these days, particularly in Higgs boson searches.

The usual workaround is to quote the expected limit along with the actual one, as D0 indeed does in their plot (see the previous post). So in principle one can still compare the two experiments’ reach without luck having any role in determining which analysis was more powerful.

Despite the quoting of expected limits, I suspect that in the end it is the actual result that one carries home. I have a clear example in mind: D0 and CDF have been competing fiercely lately in the search for the first signal of single top production. Well, the CDF analysis was better in principle, since its expected sensitivity to the signal was higher. Nonetheless, it was D0 which saw evidence of that process first… Now, CDF can boast about their “expected significance” as much as they like, but it is D0 who won the race!

Cheers,
T.


