
Disclaimers in scientific plots and the D0 single top signal December 16, 2006

Posted by dorigo in mathematics, physics, science.

How often does it happen that you see a plot sending a deceiving message? To me it happens all the time. Maybe because I am a perfectionist, or because I worked for too long in CDF (where the blessing of a plot may involve ten iterations in which fonts, labels, colors, anything is changed from its original version), or maybe because I have this nasty habit of trying to understand a plot when it is shown to me, rather than just looking at it… Whatever it is, when I am shown scientific data in a graph I always find something I would have done differently, or something to criticize. Mine is the typical professorial look, if you wish – only, I am not a professor yet.

Our job as experimental physicists is difficult per se, but we are trained to perform it well and we usually do. However, the presentation of scientific data is something we are not trained well enough on, despite it being one of the most important duties we have. If we make a discovery, or just a measurement, the data it is based upon is worth billions of dollars. So we should be very careful in the way we present our results.

Now, most of the time things are easy – one just needs to specify a few things carefully, and disallow the broadcasting of a nude result stripped of its qualifications. Say you measure top quark pair production at the Tevatron: you can quote a cross section of 6 +- 1 pb, and that is fine, but your measurement in fact depends on the assumed value of the top quark mass Mt. Since the signal acceptance increases linearly with Mt, the measured cross section decreases by a tenth of a picobarn for every GeV above 175, so to be precise you should write it as sigma(tt) = [6 - 0.1 (Mt - 175)/GeV +- 1] pb, with Mt in GeV.
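
To make the dependence concrete, here is a minimal Python sketch – my own illustration, not code from any published analysis – evaluating the parameterization above for a few assumed values of the top mass:

    # Minimal sketch: how the quoted ttbar cross section shifts with the
    # assumed top quark mass, following the parameterization in the text.
    # The central value, slope and reference mass come from the text;
    # everything else here is illustrative.
    def sigma_tt(mt_gev, central=6.0, slope=-0.1, mt_ref=175.0):
        """Central value of the measured cross section, in pb."""
        return central + slope * (mt_gev - mt_ref)

    for mt in (170.0, 175.0, 180.0):
        print(f"Mt = {mt:.0f} GeV -> sigma(tt) = {sigma_tt(mt):.1f} +- 1.0 pb")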

Such specifications, alas, are usually neglected for brevity. But brevity is a close friend to imperfection. You put out a paper with the measurement and with every explanation of the Mt dependence, and you think you have done your homework, but when you plot your result together with others, you omit the precious information. Soon your result is quoted everywhere without that Mt dependence, and averaged with other results which used another central value for the top mass in their evaluation. Imperfections add up quadratically… Entropy is everywhere.

Most failures to specify important details in the broadcasting of scientific data happen in plots; and plots also lend themselves to being deceptive in their own right. Take the plot on the left, for instance: it shows three determinations of single top production by the D0 collaboration, compared with the theoretical predictions. All looks well and clean, but I have several reservations:

  1. The plot does not say that the three determinations are strongly correlated with each other (I estimate about 90%). They look like three independent determinations, and taking the points at face value one would say the theoretical model is probably underestimating the cross section. Wrong! The data is actually compatible with the model at the level of 1.1 sigma or so (see the sketch after this list).
  2. The theory prediction is computed for a particular value of the top quark mass, and that value is indeed printed in the plot. Of course, the same applies to the experimental determinations, which (one hopes) have at least used the same top mass value as the theory band…
  3. The plot does not explain a critical point in the interpretation of the data: the expected sensitivity of the most sensitive of the three analyses – the one with the smallest error bars, the “DT” method (decision trees) – was 2.1 sigma, while the data showed a 3.4 sigma effect. That means this plot would hardly have been distributed (an experimental bias due to the procedure by which we decide on the publication of our results) if the number of observed signal events had fluctuated low rather than high: by fluctuating low, the “evidence” would not have been such, and the plot would not have made it to your desk. So if you look at these points with error bars, be advised that they stood no chance of lying to the left of the blue band. It is only because they are to the right of the band that the plot reached you!
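
To see how much the correlation in point 1 matters, here is a minimal Python sketch – with illustrative placeholder numbers, not the actual D0 inputs – that combines three correlated measurements with the standard BLUE (best linear unbiased estimate) weights and computes the deviation from theory:

    # Sketch of point 1: the apparent significance of a deviation from
    # theory depends strongly on the correlation between the measurements.
    # All numbers below are placeholders, NOT the real D0 values.
    import numpy as np

    meas = np.array([4.9, 4.6, 5.0])  # hypothetical measured cross sections (pb)
    errs = np.array([1.4, 1.8, 1.9])  # hypothetical uncertainties (pb)
    theory = 2.9                      # hypothetical theory prediction (pb)
    rho = 0.90                        # assumed pairwise correlation

    # Covariance matrix with a common correlation coefficient rho.
    cov = rho * np.outer(errs, errs)
    np.fill_diagonal(cov, errs ** 2)

    # BLUE combination: weights proportional to the inverse covariance.
    cinv = np.linalg.inv(cov)
    weights = cinv.sum(axis=1) / cinv.sum()
    combined = weights @ meas
    comb_err = (1.0 / cinv.sum()) ** 0.5

    print(f"combined = {combined:.2f} +- {comb_err:.2f} pb, "
          f"deviation from theory = {(combined - theory) / comb_err:.1f} sigma")
    # Rerun with rho = 0 and the same three points look like independent
    # confirmations, making the deviation appear considerably larger.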

These details easily escape most observers, but they are important. In the case of the third objection above, there is little one can do to avoid the problem. Only by deciding, before looking at the data, that the analysis would be made public once a given amount of data had been collected would one be saved from the bias. But that is not what typically happens… And so we have to live with rare processes being first measured high, and then slowly drifting back to their real cross section. It happened with the top quark evidence in 1994 too, and that time it was CDF who did it… And let me predict here that when the Higgs is discovered, it will show an abnormally high cross section too! (In fact, it almost happened already, with the LEP II evidence…)
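
The bias itself is easy to demonstrate with a toy Monte Carlo. The following sketch (mine, with made-up numbers unrelated to the D0 analysis) “publishes” a pseudo-measurement only when it crosses an evidence threshold, and the published average duly sits above the true value:

    # Toy demonstration of the broadcast bias: if results are only shown
    # when they cross an "evidence" threshold, the shown results are
    # biased high. All numbers are illustrative.
    import numpy as np

    rng = np.random.default_rng(0)
    true_sigma = 2.9   # true cross section (pb), hypothetical
    err = 1.4          # measurement uncertainty (pb), hypothetical
    threshold = 3.0    # "evidence" requires a 3-sigma excess over zero

    trials = rng.normal(true_sigma, err, size=100_000)  # pseudo-experiments
    published = trials[trials / err > threshold]        # only evidence-level results

    print(f"true value:      {true_sigma:.2f} pb")
    print(f"all experiments: {trials.mean():.2f} pb")
    print(f"published only:  {published.mean():.2f} pb "
          f"({len(published) / len(trials):.0%} of trials)")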

Comments

1. Fred - December 16, 2006

Would you institute a permanent condition in the working process of experimental physicists that continually addresses the proper presentation of scientific data, so that the experiments’ results may be understood by those who are going to make critical decisions based on the given information?

“Most failures to specify important details in the broadcasting of scientific data happen in plots; and plots also lend themselves to being deceptive in their own right.”
Should most plots be presented automatically with a caveat, or is this already assumed?

Sometime in the far future …
Will HAL 9000 (or CARL for some) be able to make the observations and assessments like you have concerning the plot shown above and offer the three reservations you subsequently stated?

“So if you look at these points with error bars, be advised that they stood no chance of lying to the left of the blue band. It is only because they are to the right of the band that the plot reached you!”
Is this some sort of political crack or what!? lol

2. dorigo - December 16, 2006

Hi Fred,
well, I think that experimenters, who are the owners of the results they produce, should be the ones taking these decisions. And that is what usually happens. However, at times things are a bit on the sloppy side anyway… But those making critical decisions should be chosen among the smart minds, and they are expected to read the documentation rather than just looking at plots!
As for caveat lines accompanying the plots, CDF has indeed a procedure by which critical plots have to be presented with an accompanying caption which is decided by the collaboration.
And the final sentence you mention is quite serious. If D0 had seen a 0.8-sigma effect (1.3 units of sigma below what they expected to get) rather than a 3.4-sigma one (1.3 units above), they would not have put that plot out in the form you see it. So I mean what I said: those points stood no chance of being anywhere other than where they are.

Cheers,
T.

3. Andrea Giammanco - December 17, 2006

Point 3 is of paramount importance, and often neglected. It’s a good idea to point it out to your readers.
But, after all, CDF published the single top results at 0.9 fb-1 without waiting for a signal to show up, and in fact one of the two CDF analyses has a downward fluctuation instead of an upward one.
I don’t think there are serious reasons to suspect such a bias in the D0 procedure behind this publication, since, after all, everybody after the publication of the two CDF results was eagerly waiting for D0 to settle the situation with its independent data. I can easily imagine the D0 single top people being pushed to publish a complete result as soon as possible, in order to answer this legitimate question. If this is really what happened, of course some error could have leaked into the result due to the rush, but in a random way (with flat probability density)🙂
And, to make things even more complicated, this is one of the rare cases in which the exciting result would be the LACK of a signal instead of a signal. So, one might even be justified in suspecting that a clear incompatibility with the SM (i.e. a lack of signal) would benefit from the kind of bias you are describing. And, by coincidence (eheh), the very first single top result to be published with 0.9 fb-1 has been the only one which excludes the SM (the likelihood-ratio CDF analysis).

4. gordonwatts - December 18, 2006

Thanks for pointing out our sloppy work!🙂 As is usual when something like this happens, the problem isn’t that we were trying to be sloppy. Instead, we (in D0) were using this plot for one thing and everyone else in the outside world is using it for another.😦

The origin of this plot, at least for me, is CDF, actually. As Andrea mentioned, CDF had a result that looked interesting, and another that didn’t look like much. We wanted to point out that all three methods show the same thing.

A few other points:
– These analyses aren’t 90% correlated – it is less, I’m pretty sure. We are working hard to combine them, and when we release that we’ll have a final number.
– In point 2 you talk about top mass values. Ahem. 175 is the mass used for the band. The background we use is 175 as well. We have looked at what the effect would be with a smaller top mass by reweighting our events (such as to the world average, now around 171) — not much.
– Even if we hadn’t seen the 3.4 sigma result, we would still have released these results and you would have seen them. You might not have seen them as quickly as you have, but they would have been out. I also doubt we would aim for a PRL if we’d not seen it. Indeed, PRL has made it clear in the past that they aren’t interested in these results unless we either see it, or don’t see it and should have. Apparently, the SM is too boring for them.😉

One big problem is that we don’t have other documentation out. Frankly, I don’t think the PRL will be very useful to you — the analysis is quite complex. Something like a PRD, or a long conference note, or a seminar that you can shoot at is what you really need. Or a blog you can ask questions on. Hmm….

5. dorigo - December 18, 2006

Hi Gordon,

no, no sloppy work… I had to comment on that plot (posted originally by kea on her blog) and I came up with the idea of posting about misleading information…

Anyway, I do believe you would still publish your data even with a downward signal fluke. But in that case, it would have made less sense to show the data the way the plot shows it… In fact, CDF did not produce such a plot, even if we were the inspiration for D0’s plot…

cheers
T.

6. James Graber - December 18, 2006

Hi, Tommaso,
Speaking of misleading plots, I found the red-green top mass W mass plot in one of your previous posts to be very misleading.
At a glance it seems to say the data prefer a light Higgs for the SM and a heavy Higgs for the MSSM. A closer look shows that the red band spans Higgs masses mH from 114 GeV to 400 GeV, while the green band varies from light SUSY to heavy SUSY.
Reading page 29 of hep-ph/0604147, I discover that the whole green band wants a light Higgs.

7. anon - December 19, 2006

“Only by deciding, before looking at the data, that the analysis would be made public once a given amount of data had been collected would one be saved from the bias.”

This is exactly what should be done. Physicists who otherwise are extremely scientifically honest are committing a grave ‘sin’ when they do not publish results just because they do not fall in line with what is expected. This is also a problem at the major journals (in terms of what they accept for publication).

8. dorigo - December 19, 2006

Andrea, yes, CDF put out two separate analyses which are inconsistent (albeit only at the 2-sigma level) with each other. I guess my point is that D0’s plot would have been crafted differently, and would have gotten far less air time, had it not been possible to endow it with the label “evidence”.

Cheers,
T.

9. dorigo - December 19, 2006

Hi James,
Hmmm, I need to have a look… I forgot the details of that particular plot.
T.

Dear anon, I think everybody agrees with that… But there are reasons why a sane practice is at times waived. If, as in the case of the CDF single top searches, two analyses which should have most things in common find contradictory results, this forces the collaboration into a major review of the analyses, and that takes time, delaying the publication of results which would otherwise be much faster to broadcast. But there are other good reasons why things have a different path to publication depending on the claims they make.

Cheers,
T.

10. Tony Smith - December 20, 2006

I can understand why, in accelerator physics, a collaboration building a machine must strictly insist on a single consensus view, because each part of the machine must function with the others in order for the machine to work.

However, I do not understand why data analysis must be forced by the collaboration into a single consensus view.

It seems to me that, subject to the data and analytical procedures being correct, it would be interesting and useful to see varying analytical techniques, especially if they produced different results.

Unfortunately (in my opinion) the collaborations which are put together with constructing a machine in mind tend to apply the same rules to data analysis, even though what is good for the physics of accelerator construction may be bad for the physics of data analysis.

Maybe the bureaucracy that builds the accelerator should have no authority over the data analysis?

Tony Smith
http://www.valdostamuseum.org/hamsmith/

11. dorigo - December 20, 2006

Well, you see, these big experiments are huge endeavours, with 500 people busy for O(10y) building the apparatus, and then some more years collecting the data. Everybody “owns” the data, so to speak, because everybody put so much effort into building the detector. So everybody needs to sign the papers. So there has to be a consensus.

That syllogism is hard to sidestep. But there is more. There is a strong consensus among physicists that if you have some data, you cannot, by performing two different pieces of magic (two analyses), reach two different, conflicting conclusions. You can certainly test different hypotheses, but if you have one hypothesis to test and two or more analyses to test it with, then the results may give slightly different interpretations, but they cannot conflict with each other: that is a signal that one or more of the analyses is wrong.

I tend to believe that there is nothing wrong in publishing a wrong result from time to time. Sure, we have a “reputation”, but we do physics to discover new things, not to build a stronger reputation among our peers. So, if two analyses find quite different, conflicting results, then good: let’s sit down and review them carefully. But if we do not reach a consensus, too bad, let’s let the world know. Much worse is to stop a result from “coming out” – that is, from being known by 10,000 physicists rather than being confined to the knowledge of 500 – because one does not understand it.
Unfortunately, that is exactly what happened a couple of times in CDF.

Cheers,
T.
