
Events at a particle collider March 15, 2006

Posted by dorigo in computers, physics, science.

Helge wrote:

If you talk so much about speed, what kind of computers is this all running on? If it would take 1 computer 3 years, why not use 300?

Yes, in fact CMS and ATLAS are putting together a worldwide GRID of computers, which users can exploit to analyze the huge amounts of data the experiments will collect. And indeed, we would rather use Monte Carlo simulations produced somewhere else and just reprocess them, but downloading the data is still not very efficient, the system is often down, etcetera. If instead we produce the simulations at home, we run on a plain 3 GHz machine. This is for a very low-profile analysis on which only Marco and I are working right now - larger groups have more CPU available.

The second thing, that will be a lot more technical is: What is exactly an event you are getting? Do I have to imagine this as some kind of path, I would know from bubble chamber stuff. Or is this something different?

[Image: CERN bubble chamber]

It is different. When particles hit a bubble chamber, they collide with nuclei at rest, and all of the produced bodies are emitted in the direction of the incident particle: you then get the typical “flower-like” structure such as the one in the picture on the right (where the incident bodies are 16 GeV pions coming from the left).

Quite the opposite happens when a proton and an antiproton (at the Tevatron; at the LHC it will be protons from both sides) hit head-on in the center of CDF or D0 (CMS or ATLAS at the LHC). The total momentum is zero, and particles are produced at all angles with respect to the beam. The particles that interest us most are those with the largest momentum transverse to the beam: a large transverse momentum signals a large acceleration of the outgoing bodies, which is of course the result of a high-strength interaction. And it is these hardest, most energetic interactions that tell us the most about the shortest distance scales: these are the events that a higher energy accelerator may produce but lower energy ones cannot.

[Image: COT display]

The picture on the left shows a CDF event which is most likely due to top pair production, with the subsequent decay of one top quark to a b-jet and a W, which in turn produces an electron (highlighted by the box) and a neutrino (which escapes, leaving an imbalance in the transverse energy of the event - the red arrow), while the other top quark decays to three jets of hadrons. As you see, there are particles flowing in all directions. The view is orthogonal to the beam (which enters the screen at the center of the circle), because the transverse component of the particles' momentum is the most important one. What can be seen in the diagram are only electrically charged particles, the ones that ionize the medium (Argon-Isobutane) filling the tracking detector. Farther out, all particles - both charged and neutral ones - interact with a calorimeter: a detector that measures the energy of particles by destroying them, through strong interaction with heavy nuclei.
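To make the idea of "imbalance in the transverse energy" concrete, here is a minimal sketch of how it could be computed from the visible objects in an event. The function names and the toy event are made up for illustration (they are not CDF software), and the detected objects are treated as simple (px, py) pairs in GeV.

```python
import math

def transverse_momentum(px, py):
    """Magnitude of the momentum component perpendicular to the beam axis."""
    return math.hypot(px, py)

def missing_transverse_energy(visible):
    """Return (magnitude, azimuthal angle) of the transverse imbalance.

    `visible` is a list of (px, py) pairs in GeV for the detected objects.
    Since the total transverse momentum is zero before the collision, the
    imbalance is minus the vector sum of everything that was detected.
    """
    sum_px = sum(px for px, _ in visible)
    sum_py = sum(py for _, py in visible)
    met_x, met_y = -sum_px, -sum_py
    return math.hypot(met_x, met_y), math.atan2(met_y, met_x)

# Toy event (invented numbers): three visible objects that do not balance.
visible = [(30.0, 40.0), (-30.0, 0.0), (0.0, -10.0)]
met, phi = missing_transverse_energy(visible)
print(f"missing Et = {met:.1f} GeV, pointing at phi = {phi:.2f} rad")
```

In this toy event the visible objects carry a net 30 GeV along +y, so the "red arrow" has a length of 30 GeV and points along -y: that is where the undetected neutrino would have gone.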

Tevatron vs LHC March 15, 2006

Posted by dorigo in computers, physics, science.

Pietro wrote:

I don’t remember if you have already posted about the following topic, but it would be interesting if a physicist like you, involved in both CDF and LHC, would write something about the perspectives of the next (and last of his life?) few years of CDF before LHC starts to produce a significative amount of data

I do not remember it either… So it makes sense to write it here.

CDF and D0 are doing excellent physics with the dataset they have collected so far during Run II at the Tevatron. It is 10 times more data than what was available at the end of Run I, in 1996 - a sample that produced about 250 physics papers from each experiment (and typically they were good measurements!).

To give an idea of the physics we are doing: the top quark mass is now known with a 1.7% total error (the Run II goal was 2%, but we will collect at least four times more data!). Bs meson mixing has finally been observed (this is news of a few days ago!). Several exquisite results with Bs mesons (a prerogative of the Tevatron, since B factories do not produce these particles) have been produced. The W mass measurements are returning to challenge LEP2 results, and will contribute significantly to shrinking the total error (but we must remember that the current 0.05% uncertainty in the W mass is two times less informative than the 1.7% error on the top mass as far as checks of consistency in the electroweak parameters are concerned). Tight limits have been placed on the existence of several supersymmetric particles… And there still is a chance to observe the Higgs boson if the Tevatron continues to deliver data.

Of course, once the LHC starts colliding protons at 7 times higher energy and 10 times higher rate, most of the above measurements are bound to be superseded. But how quickly? When a new hadron collider starts and new detectors are turned on, experience suggests that it takes quite a while to make sense of the data. And the LHC has all the right ingredients to create a nightmarish start: a huge interaction rate, obnoxious neutron fluxes, unknown trigger rates, wonderfully complex detectors.

I am not a Cassandra, but a realist. I think it will take time before the LHC steals the scene from the Tevatron. In the meantime, there will be lots of space for young physicists eager to play with real data - lots of it - and become experts in the subtle art of hadron collider data analysis.

That said… Pietro, I think after your master's thesis you should think about joining CMS, or even earlier!

Answer to Fred: tiny signal and huge backgrounds March 15, 2006

Posted by dorigo in computers, physics, science.

Fred left a comment to my post below (“Programming mood”) and I feel tidy enough to answer him in a separate post, where my effort can be more readily available to anybody else interested…
Fred writes:

My question of the day: In layman’s terms, what is the purpose of needing to determine the effectiveness of a selection aimed at isolating a tiny signal in a huge background of QCD events? What would be a practical application attained from the results of this endeavor?

The CMS experiment at the LHC supercollider will only start taking physics data in 2008. We want to be ready for it, and we need to justify our existence as software specialists while our fellow hardwarists work their butts off trying to put together these battleship-sized, multi-million-electronic-channel, billion-dollar detectors.

Therefore, we study what we can do with the data even before we collect it. We simulate the interesting processes and their competing backgrounds with sophisticated computer programs that combine our theoretical knowledge of the physics of subatomic particles with a detailed blueprint of the would-be detector and its response to the passage of radiation. The goal is an output as similar as possible to the one we will get when we start colliding real protons. A simulation.

Now for the answer to your question. Tiny signals are the most interesting, because they are the ones not yet unearthed by previous experiments, which collided particles at lower energy, and fewer of them. Higher collision energy and a higher number of collisions work together to allow us to study rarer and rarer processes, where yet unknown particles are produced.

Take the top quark as an example.

The top quark was sought by the CERN SppS collider experiments in the eighties, when the total collision energy was 630 GeV (about 670 proton masses) and the number of collisions per second was in the tens of thousands. The experiments may well have produced a few top quarks, but for sure they could not detect them: too few to make a significant signal, buried in a large background. Two top quarks (they are produced in pairs) make a total mass of 350 GeV, and thus require that the collision does almost nothing else but produce them: very improbable, very rare, very few events. No discovery.

Then came the Tevatron: it started to collide protons and antiprotons in 1987, at three times higher energy (1800 GeV). Top pairs are less rare here! Moreover, the collision rate was in the hundreds of thousands per second. But still, no discovery in a year of data taking… It took three more years of data, collected between 1992 and 1994, to get a significant signal (a handful of top pair production events) and claim discovery.

Still, it was a tiny signal in a huge background! When CDF announced the first evidence for top in 1994, it was based on about ten events, selected amidst a dataset of several millions.

To find a new particle, you have to be smarter and smarter the smaller the signal is. And the more we progress in our understanding of particle physics, the deeper we want to go - so the Higgs boson, the long sought particle thought responsible for the mass of all others, is produced at the Tevatron at the rate of one per hour (if it exists and has the characteristics we foresee), while competing processes are produced at a rate of 3 million per second!
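To put those two rates on the same footing, here is a quick back-of-the-envelope calculation. It uses only the two numbers quoted above (one Higgs per hour, 3 million competing events per second); the variable names are just for illustration.

```python
SECONDS_PER_HOUR = 3600

higgs_per_hour = 1                   # quoted rate, if the Higgs exists as foreseen
background_per_second = 3_000_000    # quoted rate of competing processes

# Express the background in the same unit as the signal.
background_per_hour = background_per_second * SECONDS_PER_HOUR

# Number of background events produced for every Higgs boson.
background_per_higgs = background_per_hour / higgs_per_hour

print(f"background events per Higgs: {background_per_higgs:.2e}")
```

That is roughly ten billion uninteresting collisions for each Higgs boson produced, before any selection is applied.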

Now, the specific selection I was talking about in my previous post was not about finding the Higgs at the Tevatron, but about the much simpler task of selecting top pair production events at the LHC - there, the top quark is not so rare (well, still sort of - one pair produced every 100,000,000 collisions). Top pairs are worth collecting even at the LHC, which will have million-event samples of top decays, because associated with the top quark we can sometimes find a Higgs boson! It is sort of like digging ore from a gold mine, with the aim of extracting the metal with some chemical or mechanical process thereafter.

I hope I have somehow answered Fred’s question here… If any or all of the above is confusing, please let me know…

Programming mood March 15, 2006

Posted by dorigo in computers, mathematics, physics, science.

In the last few years I have seen the amount of time I spend on writing code slowly but steadily decrease, as my commitments toward organizational issues and tutorship of students increased.

However, in the last weeks there has been an outburst of coding activity. It is due to my realization that I need to ramp up my commitment to the CMS experiment at CERN, and in so doing help a PhD student working with me, Marco, in his project.

CMS has software that is not well optimized for speed. This makes it very hard to produce large amounts of Monte Carlo simulated events - something we direly need to determine the effectiveness of a selection aimed at isolating a tiny signal in a huge background of QCD events.

To make things clear, if you want to study a signal whose cross section (the probability of its production) is 10,000 times smaller than competing processes, then to have a balanced analysis you should generate 10,000 background events for every signal event - and you need many of the latter to characterize them and study their properties.

The figures above refer to producing top-antitop pairs at the LHC (cross section of 800 pb) and background from QCD (cross section of about 8,000,000 pb). We have a simulation of 5000 top-antitop events. Should we embark on generating 50 million QCD events? Of course not: it would take three years with the resources we currently have.
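The arithmetic behind those numbers can be spelled out in a few lines. The cross sections, the signal sample size and the three-year estimate are the ones quoted above; the time per event is only what those figures imply, assuming (my assumption) that the machine runs around the clock.

```python
sigma_ttbar_pb = 800          # top-antitop cross section, in pb
sigma_qcd_pb = 8_000_000      # QCD background cross section, in pb
n_signal = 5000               # simulated top-antitop events we have

# Background events needed per signal event for a balanced analysis.
background_per_signal = sigma_qcd_pb // sigma_ttbar_pb

# Size of the QCD sample that would match our signal sample.
n_background = n_signal * background_per_signal

# Implied processing time per event if the job takes three years
# of uninterrupted running (assumption: 24 hours a day, no downtime).
seconds_in_three_years = 3 * 365 * 24 * 3600
seconds_per_event = seconds_in_three_years / n_background

print(f"background events needed: {n_background:,}")
print(f"implied time per event:   {seconds_per_event:.1f} s")
```

So the three-year estimate corresponds to something like a couple of seconds of CPU per simulated event, which is why generating the full sample by brute force is not an option.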

The outburst of coding stems from an idea I had on how to clone QCD background events in a smart way. All we care about in QCD events for our analysis is the number of jets, their transverse energy (Et), and the missing transverse energy (MEt) they produce. So, by properly determining resolution functions for the MEt and the jet Et, we can resample these functions and obtain events similar to the original ones, but in much larger numbers.
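A rough sketch of what such a cloning step might look like is given below. It is not the actual analysis code: the Gaussian shape of the resolution functions, the sqrt(Et) scaling of the jet resolution, the 10 GeV MEt width and all function names are placeholders chosen for illustration, since the real resolution functions would have to be determined from simulation and data.

```python
import random

def jet_et_sigma(et, stochastic_term=1.0):
    """Hypothetical jet resolution: sigma = k * sqrt(Et), in GeV."""
    return stochastic_term * et ** 0.5

def clone_event(event, rng, met_sigma=10.0):
    """Make one smeared copy of an event.

    The number of jets is kept fixed; each jet Et and the MEt are redrawn
    from an assumed Gaussian centred on the original value. Negative draws
    are clamped (jets) or reflected (MEt) to keep the values physical,
    which slightly distorts the Gaussian near zero.
    """
    jets = [max(0.0, rng.gauss(et, jet_et_sigma(et))) for et in event["jet_et"]]
    met = abs(rng.gauss(event["met"], met_sigma))
    return {"jet_et": jets, "met": met}

def clone_sample(events, copies, seed=0):
    """Produce `copies` smeared clones of every event in `events`."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    return [clone_event(ev, rng) for ev in events for _ in range(copies)]

# Two invented QCD events turned into 200 clones.
originals = [
    {"jet_et": [50.0, 30.0], "met": 20.0},
    {"jet_et": [80.0, 45.0, 25.0], "met": 5.0},
]
clones = clone_sample(originals, copies=100)
print(len(clones), "cloned events")
```

The clones are of course not statistically independent of the originals, so whether the enlarged sample reproduces the true distributions has to be checked against real events.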

The beauty of this is that we can use CDF events in data and Monte Carlo (and in CDF we have LARGE Monte Carlo samples of QCD events to compare to data) to show whether our procedure is correctly extrapolating event characteristics.

I will post more about this idea when we have first results…