95% confidence level? Watch your language!
March 18, 2007. Posted by dorigo in mathematics, physics, science.
The lingo of particle physicists is a dangerous thing, liable to backfire. While standard expressions like “we exclude the production of a Higgs boson with a mass lower than 114.4 GeV at 95% confidence level” allow high-energy physicists to communicate easily and share information quickly, bystanders should not have to listen to that language, since it only gives them the impression that their chances of really understanding the issue are zilch.
The problem is that it is quite hard for a scientist to translate in real time from lingo to English. One frequently forgets how hard it is for outsiders to follow an explanation, and slips occur quite often.
Take the sentence quoted at the beginning: what does “95% confidence level” really mean? Mind you, there is a reason to try to explain it: the searches for the Standard Model Higgs boson are in full swing at the Tevatron, and the LHC will soon join and rapidly take over. And as of now, things revolve around the meaning of those words.
In fact, if we want to discuss the existing information on the mass of the Higgs, we are left with two things: the fixed lower limit set with direct searches by the LEP II experiments (M(higgs)>114.4 GeV, at 95% confidence level, that is) and the indirect upper limit from global electroweak fits, which moves around as more precise measurements of the top and W masses are produced. The upper limit is now sitting at M(higgs)<144 GeV (again, at 95% CL), and it has gotten there by a continuous decrease as more data were included during the last year.
So, what is a lower limit at 95% confidence level? If you try to measure a physical quantity whose value is still unknown, two things may happen. One, you may be lucky and measure it: no need for explicitly probabilistic language there, you quote the value and the estimated error (but notice, the error hides gory probabilistic details at times). Two, you only manage to come up with a range within which the true value may lie.
When number two happens, you have to be careful with the statement you make. To quantify in a standard way the range within which you measured the physical quantity to lie, it is customary to quote the point below which (for a lower limit) or above which (for an upper limit) the odds of finding the true value are only one in twenty. That is a 5% chance, and the remaining odds are 95%: that is what one means when one says “at 95% confidence level the value is higher than…”.
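To make the one-in-twenty recipe concrete, here is a minimal sketch that takes the informal reading above at face value. The Gaussian shape and the numbers (a made-up quantity centered at 10 with a spread of 2) are assumptions chosen purely for illustration; they are not Higgs data, and a real limit is derived with considerably more care.

```python
from statistics import NormalDist

# Hypothetical knowledge about some quantity x: a Gaussian centered at 10
# with standard deviation 2. These numbers are invented for illustration.
belief = NormalDist(mu=10.0, sigma=2.0)

# One-sided 95% lower limit: the point that leaves only 5% (one in twenty)
# of the probability below it.
lower_limit = belief.inv_cdf(0.05)

# Sanity check: the probability of lying below the limit should be 5%.
prob_below = belief.cdf(lower_limit)

print(f"x > {lower_limit:.2f} at 95% CL")
print(f"chance that x is below the limit: {prob_below:.3f}")
```

With these made-up inputs the limit lands at about 6.71, that is, 1.645 standard deviations below the central value.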
Now, this post is not about explaining what we mean by that language. That is, it is, but I want to take it to a different level now. Because, dear reader, saying “x>blah at 95% CL” is a very, very, very loose piece of information. Indeed, it only quantifies the integral of a probability distribution (oh, lingo again! But I am resigned to the idea of losing a few readers this deep in the post) at one single point x.
When the LEP II experiments made their final analysis of their data public, they not only quoted the 95% CL. They actually produced a plot, the one shown below.
In the plot, you see the function which caused the lower limit to be at 114.4 GeV. It is the black curve, which shows CLs as a function of the Higgs mass, as found by LEP. I will not deal with the details of the definition of CLs, but you see a function which rises at blitzing speed from 10^(-6) at 111 GeV to 0.05 at 114.4 GeV. What that means is that the “95% CL limit” is quite a strong one: ok, there is a one-in-twenty chance that we missed a 114 GeV Higgs, but (look at the plot) the chances that we missed a 112 GeV Higgs are one in fifty thousand!
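As a rough cross-check of those figures, one can interpolate between the two points quoted from the plot. The sketch below assumes the curve is a straight line on a logarithmic scale between 111 and 114.4 GeV; that is my simplification, not something the LEP analysis states, and the function name is made up for the example.

```python
import math

# The two points quoted from the plot: (mass in GeV, CLs value).
m_lo, cls_lo = 111.0, 1e-6
m_hi, cls_hi = 114.4, 0.05

def cls_log_linear(mass):
    """Interpolate CLs, ASSUMING a straight line in log10(CLs) versus mass."""
    slope = (math.log10(cls_hi) - math.log10(cls_lo)) / (m_hi - m_lo)
    return 10 ** (math.log10(cls_lo) + slope * (mass - m_lo))

cls_112 = cls_log_linear(112.0)
print(f"CLs at 112 GeV ~ {cls_112:.1e}, about one in {1 / cls_112:,.0f}")
```

Under that assumption the value at 112 GeV comes out at a few times 10^(-5), the same order of magnitude as the one-in-fifty-thousand read off the curve by eye.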
This has to be compared with the “indirect, global SM fits” upper limit at 144 GeV I was discussing above. That limit is still a one-in-twenty chance, but its flavor is quite different: the chance that the Higgs is 144 GeV or heavier is 1/20, but it is not dramatically smaller for 150 or 160 GeV! The probability distribution, that is, is much flatter for the indirect limit. When compared with the skyrocketing function plotted above, it is as flat as Illinois.
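The difference between a steep and a flat limit can be put in numbers with a toy comparison. The two Gaussians below are hypothetical stand-ins, not the real LEP or electroweak-fit distributions: both are tuned to put exactly 5% of their probability above 144, and the widths (1 and 30) are arbitrary choices meant only to contrast a sharp edge with a gentle slope.

```python
import math
from statistics import NormalDist

LIMIT = 144.0                      # the common 95% CL upper limit
Z95 = NormalDist().inv_cdf(0.95)   # about 1.645 standard deviations

def mean_for_limit(sigma):
    """Mean of a Gaussian of width sigma whose 95% upper limit is LIMIT."""
    return LIMIT - Z95 * sigma

def tail_above(mass, sigma):
    """Probability above `mass`, via erfc so that tiny tails stay accurate."""
    z = (mass - mean_for_limit(sigma)) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2))

STEEP, FLAT = 1.0, 30.0            # hypothetical widths, for illustration only

for mass in (144.0, 150.0, 160.0):
    print(f"P(M > {mass:.0f}): steep = {tail_above(mass, STEEP):.1e}, "
          f"flat = {tail_above(mass, FLAT):.1e}")
```

Both toy distributions give the same statement at 95% CL, yet beyond the limit the steep one drops by many orders of magnitude within a few GeV, while the flat one still leaves a chance at the percent level at 150 and 160.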
The above distinction, quite stark in the case I cited, is often overlooked, but it is of the utmost importance. Although the indirect global fits point at a Higgs lighter than 80 GeV, we know the Higgs must be above 114 GeV, unless we declare we do not understand the first thing about LEP data. Hell, chances that it is below 110 GeV are less than one in a million! The indirect limit, besides the caveat that it is, well, indirect (and so less reliable, because it is slightly model-dependent and relies on some assumptions), is a much weaker constraint.