
More speculations on non-175 GeV single top from D0 data December 13, 2006

Posted by dorigo in internet, physics, science.

Minutes after posting about D0’s recent evidence of single top production (to see it scroll down two posts, too lazy to link it here!) I got a message from Tony Smith (http://www.valdostamuseum.org/hamsmith/) containing many questions on the D0 analysis and their decision tree discriminant. I answered his questions, and then thought they might be of interest to two or three of the many of you who visit this blog erratically… So here is an amended version of the mail exchange.


Hi Tony,

as usual, you’re welcome, and as usual my answers have a fair chance of
being only partial answers to your questions. However I will try to do
my homework.

> It may be that my questions are too naive to be useful
> because I don’t have much intuition about what DT means physically,
> so please feel free to tell me if that is the case,
> and in that case just ignore the questions asked below in this message.
> On the other hand, if you think that the questions might be useful,
> feel free to post this message including images on your blog entry.

I don’t know D0’s decision tree well enough myself, but I know the list
of variables which are fed into the trees, from slide 24 of the talk [see link in previous post]. There, you can see that they put in the “best” top mass as a kinematical selection variable. Moreover, many other variables which are directly correlated with the top mass itself are fed into the DT. It is a perfectly legal thing to do, but once you do it, you have to be careful in interpreting the results. In particular, a single top production process with a mass different from the one with which you built your trees (your “signal”) will be treated as a background if the mass difference is large enough to make the branches split regular top and low-mass top differently.
What would tell us if that is really the case would be the relative weight that the final trees give to each of the variables. If the top mass is one of the variables given most weight in determining how to classify the event, then any top signal with mass significantly different from 175 GeV would be washed out.
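
To make the washing-out concrete, here is a toy sketch in Python. It is not D0's analysis: a single mass-window cut stands in for a tree branch trained on 175 GeV signal, and the 20 GeV mass resolution, the 25 GeV half-width, and all function names are assumptions chosen only for illustration.

```python
import random

random.seed(1)

def reco_mass(true_mass, resolution=20.0):
    # Gaussian-smeared reconstructed mass in GeV (resolution is an assumed toy value)
    return random.gauss(true_mass, resolution)

def passes_window(mass, centre=175.0, half_width=25.0):
    # Stand-in for a branch that learned to keep masses near the training value
    return abs(mass - centre) < half_width

def efficiency(true_mass, n=20000):
    # Fraction of toy events with this true mass that survive the selection
    kept = sum(passes_window(reco_mass(true_mass)) for _ in range(n))
    return kept / n

eff_175 = efficiency(175.0)
eff_145 = efficiency(145.0)
print(f"efficiency for a 175 GeV top: {eff_175:.2f}")
print(f"efficiency for a 145 GeV top: {eff_145:.2f}")
```

In this toy the 145 GeV sample survives the selection much less often than the 175 GeV one. A real tree with many mass-correlated inputs could be more or less severe than this single cut.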

Be careful: “weight” is not a very well-defined quantity here. Some decision tree algorithms have a built-in way to determine a posteriori (i.e. once the trees are built) how much weight a variable had in separating signal from backgrounds. Others don’t. I would not be surprised if, by asking D0 what weight the “best” top mass has in their DT, you got a perplexed look in return, or worse, a layman’s explanation that the DT is not a neural network. But they might also answer with a number straight away 🙂
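
As an illustration of one such a posteriori measure, the sketch below computes the largest Gini impurity decrease a single cut can achieve on each variable, which is the quantity CART-style trees sum up to rank their inputs. Whether D0's trees provide anything like it is not something I know. The toy distributions and the names `mass` and `other` are assumptions.

```python
import random

random.seed(2)

def gini(n_sig, n_bkg):
    # Gini impurity of a node holding n_sig signal and n_bkg background events
    total = n_sig + n_bkg
    if total == 0:
        return 0.0
    p = n_sig / total
    return 2.0 * p * (1.0 - p)

def best_gini_decrease(values, labels):
    # Scan every threshold on one variable; return the largest impurity decrease
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    tot_sig = sum(labels)
    tot_bkg = n - tot_sig
    parent = gini(tot_sig, tot_bkg)
    best = 0.0
    left_sig = left_bkg = 0
    for i, (_, is_sig) in enumerate(pairs[:-1], start=1):
        left_sig += is_sig
        left_bkg += 1 - is_sig
        child = (i * gini(left_sig, left_bkg)
                 + (n - i) * gini(tot_sig - left_sig, tot_bkg - left_bkg)) / n
        best = max(best, parent - child)
    return best

n = 2000
labels = [1] * n + [0] * n  # 1 = signal, 0 = background
# Toy inputs: a mass-like variable that separates well, and a weak one
mass = ([random.gauss(175, 20) for _ in range(n)]
        + [80 + random.expovariate(1 / 40) for _ in range(n)])
other = ([random.gauss(0.3, 1) for _ in range(n)]
         + [random.gauss(0.0, 1) for _ in range(n)])

imp = {"mass": best_gini_decrease(mass, labels),
       "other": best_gini_decrease(other, labels)}
for name, value in imp.items():
    print(f"{name}: best Gini decrease = {value:.3f}")
```

A variable with a large decrease is one the tree will lean on heavily, so it is also the one whose shape will be sculpted most after a cut on the tree output.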

In any event, I have the answer myself. If you look at the plots, they speak to you. [refer to plots below – only two of the three in the presentation are shown]. The three distributions at DT<0.3, intermediate, and DT>0.6 are VERY different in the “best” top mass. AND, the high-DT data have a perfectly coincident distribution for all backgrounds and for single top. That is to say, that variable has been totally “squeezed” for its discriminating power by the classifier. In other words, what one can gather from that plot is only the relative normalization of the expected contributions to the data points, since shapes will be coincident.
A point of relevance: the relative normalization of the various colors tells you indeed that the high-DT data favors the SM single top with respect to backgrounds, as it should. But it does so based on the top mass itself, and therefore that variable is no longer a very good one to display the final result! In fact, one would prefer to keep the most discriminant variable aside, and train one’s classifier with the others, being careful to avoid variables that are correlated with the most powerful one: that way, one would retain discriminating power in the best variable after a cut on the classifier’s output. That is the strategy adopted for Higgs searches at low mass in CDF, where the Higgs mass is left aside, being very discriminant by itself.

So, to summarize:

– ask a D0 person about the weight the DT gives to the top mass, but be
  aware he could start telling you things you don’t want to hear.
– maybe better, ask what the inputs to the matrix element method are.
  if they are the same as the DT method (I suspect so since there is
  no mention of those in the talk), your fancied low-mass top is still
– ask them if they would be willing to do the exercise of
  attempting to set a limit on an SM-like single top at 145 GeV.

> Looking at the  image showing Tquark mass
> for DT less than 0.3 it seems to me that the high data points for singleT events
> are in the bins for 100-125 GeV and 150-175 GeV.
> However, I guess that low DT might mean that not many singleT events
> are expected, because the low DT histogram shows very little of the blue
> or cyan colors that correspond to expectation of singleT events,
> so maybe the low DT data is not very significant ?

Not necessarily. Low DT means low probability of a 175 GeV top, given that a lot
of final-state four-momenta AND a three-object mass are fed into the tree. So a lower mass top quark might get a low grade and end up there. By the way, have you noticed the tell-tale dip at 175 of the W+jets background (the green stuff)? That is the sign that events with that mass are preferentially high-DT ones, if there are no more striking characteristics telling them apart from the single top hypothesis. For instance, ttbar does not get such a void at 175 because there are more useful variables to discriminate it from single top, and it clusters at 175 GeV anyway…

> Is there some physical reason that low DT sees events in 150-175 GeV,
> while the higher DT sees a deficiency of events in 150-175 GeV ?

Not necessarily physical – statistical probably. If systematic, then maybe it is connected to their way of training trees with so many variables correlated to each other. Usually, decision trees may get “overtrained” in such circumstances, and a way to avoid that is to do a random sampling of the variables used at each branching, and grow a huge number of trees rather than a single one, then ask the trees to “vote” for a hypothesis. The random forests algorithm is such a delicious thing… I have a post about it (https://dorigo.wordpress.com/2006/03/03/random-forests/) which links to an informative site on that particular algorithm if you want more information.
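
The voting idea can be sketched in a few lines of Python. This is a deliberately simplified stand-in, not the full random forests algorithm: each "tree" is a single cut on one randomly chosen variable, trained on a bootstrap resample. The three correlated toy variables and all names are assumptions.

```python
import random

random.seed(3)

def make_event(is_signal):
    # Three toy variables correlated through a shared term (assumed model)
    shared = random.gauss(2.0 if is_signal else 0.0, 1.0)
    return [shared + random.gauss(0, 0.5) for _ in range(3)], is_signal

def train_stump(sample, var):
    # One-level tree: cut halfway between the class means of one variable
    sig = [x[var] for x, y in sample if y]
    bkg = [x[var] for x, y in sample if not y]
    mean_sig = sum(sig) / len(sig)
    mean_bkg = sum(bkg) / len(bkg)
    return var, 0.5 * (mean_sig + mean_bkg), mean_sig > mean_bkg

def grow_forest(train, n_trees=101):
    forest = []
    for _ in range(n_trees):
        boot = [random.choice(train) for _ in train]  # bootstrap resampling
        var = random.randrange(3)                     # random variable for this tree
        forest.append(train_stump(boot, var))
    return forest

def vote(forest, x):
    # Fraction of trees that classify the event as signal
    yes = sum((x[var] > cut) == sig_above for var, cut, sig_above in forest)
    return yes / len(forest)

train = [make_event(i % 2 == 0) for i in range(400)]
test = [make_event(i % 2 == 0) for i in range(400)]
forest = grow_forest(train)
accuracy = sum((vote(forest, x) > 0.5) == y for x, y in test) / len(test)
print(f"majority-vote accuracy on the test sample: {accuracy:.2f}")
```

Because each tree sees a different resample and a different variable, no single tree can memorize the fluctuations of the full training sample, and that is what keeps the ensemble from overtraining.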

> Is it reasonable to expect that more data at both CDF and D0 will
> answer these questions ?

I think more data always helps, provided you are willing to let a good hypothesis go if the data disprove it. But it is good to be stubborn for a while longer, especially since nobody has really done a search focused on low-mass single tops…


1. James Graber - December 13, 2006

Hi Tommaso,
Instead of speculations about the Top mass, this is primarily speculations about the Higgs mass.
I have been struggling for several days, trying to put together a coherent set of questions about the Higgs mass and its effect on your bet with Watts and Distler. Originally, I was going to reply to your post about the blinded W mass result.
However, I don’t seem to be making very fast progress.
So I am forwarding you this rather disjointed draft with many questions, in the hope you will see what I’m looking for.
Perhaps you can answer the questions which make sense and ignore the rest.
Jim Graber

How soon is the W mass unblinding likely to occur?
Does a higher or lower value favor your side of the bet with Watts and Distler?

Assuming the Higgs is not an MSSM Higgs, but rather “just” or “only” a standard model Higgs, what mass do you expect it will have?

Assuming no Higgs is seen below 130 GeV, do you see that as a big plus for your side of the bet?

Even if we see a fairly light Higgs, i.e. below 130 GeV, we still don’t know it’s a supersymmetric Higgs, rather than a standard model Higgs. Correct?
Other than seeing a superpartner, or a second higher mass Higgs, are there indirect ways of supporting supersymmetry that we are likely to see?
If so what are they?

What sort of 5 sigma non Standard model effect is Jacques likely to be thinking of?
How about a disagreement between the masses of the Higgs, top, W, and/or Z?

How do you expect to win your bet with Watts and Distler?

Suppose we see something that looks like a standard model Higgs but the mass is way high.
Does that count as a 5 sigma discrepancy?

I have a poll up on the mass of the Higgs over at Physics Forums.
Would you care to come over and vote or even comment?
Jim Graber

2. Unblinding the most precise W mass measurement today « A Quantum Diaries Survivor - December 14, 2006

[…] As for the significance of a low or high measurement with respect to the current world average (80403 ± 29 MeV), I will speak about it in the next post, where I intend to answer a long list of questions by a reader who left a comment in https://dorigo.wordpress.com/2006/12/13/more-speculations-on-non-175-gev-single-top-from-d0-data/ . […]

3. gordonwatts - December 18, 2006

All of our D0 analyses assume that we have a 175 GeV (or so) top mass. The DTs, for example, are trained with single top Monte Carlo with a top mass of 175. So 175 is built in. The matrix element analysis uses 4-vectors as input, so you might think you were safe. But, in the end, you are not, because the matrix elements we use contain the top mass, which we’ve set to 175 GeV.

As we get more and more data we will start relaxing various conditions. We would love to measure the top mass in this channel. That means the first thing we need to do is construct a selection that has less bias in top mass, and then repeat the process and measurement.

If you are curious about how much weight the top mass has… Tommaso is right. That question doesn’t really make a lot of sense in the context of a DT, unfortunately.

Finally, I agree with Tommaso: the statistics are pretty poor right now. We need more data!

I hope this is helpful!

4. dorigo - December 18, 2006

It sure is, thank you Gordon for your comment!

5. Alejandro Rivero - October 30, 2007

About comment #5 above, it seems that comment spam robots are becoming intelligent creatures. (Well, most probably it analyzed comment #1 for keywords)

6. dorigo - October 30, 2007

They are indeed, Alejandro… Some folks fond of the singularity theory say that the first intelligence will arise in spam filters 🙂


