Neural network approach to parton distributions fitting

https://doi.org/10.1016/j.nima.2005.11.206

Abstract

We show an application of neural networks to the extraction of information on the structure of hadrons. A Monte Carlo sampling of the experimental data is performed to correctly reproduce the data errors and correlations. A neural network is then trained on each Monte Carlo replica via a genetic algorithm. Results for the proton and deuteron structure functions and for the nonsinglet parton distribution are shown.

Introduction

The requirements of precision physics at hadron colliders have recently led to a rapid improvement in the techniques for the determination of the structure of the nucleon. In this context, factorization is a crucial issue: it ensures that we can extract the parton structure of the nucleon from a process with only one initial proton (say, Deep Inelastic Scattering (DIS) at HERA), and then use it as an input for a process where two initial protons are involved (Drell–Yan at the LHC). In the QCD-improved parton model the DIS structure function of the nucleon can be written as
$$F_2(x,Q^2)=x\left[\sum_{q=1}^{n_f}e_q^2\,C_q\otimes q_q(x,Q^2)+2n_f\,C_g\otimes g(x,Q^2)\right],$$
where $Q^2=-q^2=-(k-k')^2$, $x=Q^2/(2p\cdot q)$, and $p$, $k$ and $k'$ are the momenta of the initial nucleon, the incoming lepton and the scattered lepton, respectively; the $C_i$ are perturbatively calculable coefficient functions, while $q_q(x,Q^2)$ and $g(x,Q^2)$ are the quark and gluon distributions that describe the nonperturbative dynamics, the so-called Parton Distribution Functions (PDFs).
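Here $\otimes$ denotes the standard Mellin convolution of a coefficient function with a parton distribution,
$$\left(C\otimes q\right)(x,Q^2)=\int_x^1\frac{dy}{y}\,C\!\left(\frac{x}{y},\alpha_s(Q^2)\right)q(y,Q^2).$$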

The extraction of a PDF from experimental data is not trivial, even though it is a well-established task. In order to do it, one has to evolve the PDFs to the scale of the data, perform the $x$-convolution, add theoretical uncertainties (resummation, nuclear corrections, higher twists, heavy quark thresholds, etc.), and then deconvolute in order to obtain a function of $x$ at a common scale $Q^2$.

Recently, it has been pointed out that the uncertainty associated with a PDF set is crucial [1], [2], [3]. The uncertainty on a PDF is given by the probability density $\mathcal{P}[f]$ in the space of functions $f(x)$, that is, the measure used to perform the functional integral that gives the expectation value
$$\left\langle\mathcal{F}[f(x)]\right\rangle=\int\!\mathcal{D}f\,\mathcal{F}[f(x)]\,\mathcal{P}[f(x)],$$
where $\mathcal{F}[f]$ is an arbitrary functional of $f(x)$. Thus, when we extract a PDF we want to determine an infinite-dimensional object (a function) from a finite set of data points, and this is a mathematically ill-posed problem.
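In the Monte Carlo approach described in the following sections, this functional integral is estimated by an ensemble average: if $f^{(k)}$ denotes the fit to the $k$-th data replica, then, schematically,
$$\left\langle\mathcal{F}[f]\right\rangle\simeq\frac{1}{N_{\rm rep}}\sum_{k=1}^{N_{\rm rep}}\mathcal{F}\!\left[f^{(k)}\right].$$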

The standard approach is to choose a simple functional form with enough free parameters, $q(x,Q_0^2)=x^{\alpha}(1-x)^{\beta}P(x)$, and to fit the parameters by minimizing the $\chi^2$. Several difficulties arise: errors and correlations of the parameters require at least a fully correlated analysis of the data errors; error propagation to observables is difficult, since many observables are nonlinear/nonlocal functionals of the parameters; and the theoretical bias due to the choice of parametrization is difficult to assess (its effects can be large if the data are imprecise or hardly compatible).
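As a rough illustration of this standard approach, the following sketch fits the toy form $x^{\alpha}(1-x)^{\beta}$ (with $P(x)$ set to 1) to invented pseudo-data by minimizing an uncorrelated $\chi^2$; the data values and errors are placeholders, not taken from any of the analyses cited here.

```python
import numpy as np
from scipy.optimize import minimize

# Toy pseudo-data: (x, value, uncorrelated error) -- illustrative only.
x_dat = np.array([0.05, 0.1, 0.3, 0.5, 0.7])
f_dat = np.array([0.60, 0.55, 0.35, 0.15, 0.04])
sigma = np.array([0.05, 0.04, 0.03, 0.02, 0.01])

def q(x, alpha, beta):
    """Standard functional form q(x) = x^alpha (1-x)^beta, with P(x) = 1."""
    return x**alpha * (1.0 - x)**beta

def chi2(params):
    alpha, beta = params
    return np.sum(((q(x_dat, alpha, beta) - f_dat) / sigma) ** 2)

res = minimize(chi2, x0=[0.5, 3.0], method="Nelder-Mead")
print("best-fit (alpha, beta):", res.x, " chi2/Ndat:", res.fun / len(x_dat))
```

A realistic fit would instead minimize a fully correlated $\chi^2$ built from the experimental covariance matrix, which is precisely where the error-propagation difficulties mentioned above enter.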

Here we present an alternative approach to this problem. First, we show our technique applied to the determination of structure functions. This is the easiest case, since no evolution is required, only data fitting; it is therefore a good testing ground for the technique. Then we show how this approach can be extended to the determination of the PDFs.

Structure functions

The strategy presented in Refs. [4], [5] to address the problem of parametrizing deep inelastic structure functions $F(x,Q^2)$ is a combination of two techniques: a Monte Carlo sampling of the experimental data and neural network training on each data replica.

The Monte Carlo sampling of the experimental data is performed by generating $N_{\rm rep}$ replicas of the original $N_{\rm dat}$ experimental points,
$$F_i^{({\rm art})(k)}=\left(1+r_N^{(k)}\sigma_N\right)F_i^{(\exp)}+r_i^{s,(k)}\sigma_i^{\rm stat}+\sum_{l=1}^{N_{\rm sys}}r^{l,(k)}\sigma_i^{{\rm sys},l},$$
where $i=1,\ldots,N_{\rm dat}$ and the $r$ are Gaussian random numbers with …
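A minimal sketch of this replica generation, assuming a toy dataset with a single normalization uncertainty and a single systematic (all numbers are placeholders):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy experimental data (placeholders): central values and uncertainties.
F_exp = np.array([0.45, 0.38, 0.22])          # F_i^(exp)
sig_stat = np.array([0.02, 0.02, 0.01])       # sigma_i^stat
sig_sys = np.array([[0.01], [0.02], [0.01]])  # sigma_i^(sys,l), N_sys = 1
sig_norm = 0.02                               # normalization error sigma_N

def make_replica():
    """One artificial replica F_i^(art)(k) following the formula above."""
    r_norm = rng.standard_normal()                 # r_N^(k), shared by all points
    r_stat = rng.standard_normal(len(F_exp))       # r_i^(s,(k)), one per point
    r_sys = rng.standard_normal(sig_sys.shape[1])  # r^(l,(k)), one per source
    return ((1.0 + r_norm * sig_norm) * F_exp
            + r_stat * sig_stat
            + sig_sys @ r_sys)

N_rep = 1000
replicas = np.array([make_replica() for _ in range(N_rep)])
print("replica mean:", replicas.mean(axis=0))  # ~ reproduces F_exp
print("replica std :", replicas.std(axis=0))   # ~ reproduces the total error
```

Averages and variances over the replica ensemble then reproduce the central values and total errors of the original data, which is the consistency check usually applied to the generated sample.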

Parton distributions

The strategy presented in the previous section can be used to parametrize parton distributions as well, provided one now takes Altarelli–Parisi QCD evolution into account.
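In the nonsinglet sector discussed in the abstract, Altarelli–Parisi evolution takes the familiar form
$$Q^2\frac{\partial}{\partial Q^2}\,q_{\rm NS}(x,Q^2)=\frac{\alpha_s(Q^2)}{2\pi}\int_x^1\frac{dy}{y}\,P_{qq}\!\left(\frac{x}{y},\alpha_s(Q^2)\right)q_{\rm NS}(y,Q^2),$$
shown here as a reminder of the ingredient that distinguishes this case from the structure-function fits above.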

Now neural networks are used to parametrize the PDF at a reference scale. We choose an architecture with two inputs, $(x,\log x)$, two hidden layers with two neurons each, and one output, $q(x,Q_0^2)$. The training on each replica is performed with the genetic algorithm only, since the function to be minimized is now nonlocal (see Eqs. (1…
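A minimal sketch of genetic-algorithm minimization for the architecture just described (two inputs $(x,\log x)$, two hidden layers of two sigmoid neurons, one linear output); the population size, mutation width and toy replica values are illustrative assumptions, and the $\chi^2$ here is uncorrelated for brevity.

```python
import numpy as np

rng = np.random.default_rng(1)

# 2-2-2-1 feed-forward net: weights and biases flattened into one vector.
SHAPES = [(2, 2), (2,), (2, 2), (2,), (2, 1), (1,)]
N_PAR = sum(np.prod(s) for s in SHAPES)

def net(params, x):
    """Evaluate the network on inputs (x, log x) with sigmoid hidden layers."""
    a = np.column_stack([x, np.log(x)])
    i = 0
    for W_shape, b_shape in zip(SHAPES[::2], SHAPES[1::2]):
        W = params[i:i + np.prod(W_shape)].reshape(W_shape); i += np.prod(W_shape)
        b = params[i:i + b_shape[0]]; i += b_shape[0]
        a = a @ W + b
        if i < N_PAR:                    # sigmoid on hidden layers only
            a = 1.0 / (1.0 + np.exp(-a))
    return a.ravel()

# Toy data replica to fit (placeholder values).
x_dat = np.array([0.05, 0.1, 0.3, 0.5, 0.7])
f_rep = np.array([0.58, 0.52, 0.33, 0.17, 0.05])
sigma = np.array([0.05, 0.04, 0.03, 0.02, 0.01])

def chi2(params):
    return np.sum(((net(params, x_dat) - f_rep) / sigma) ** 2)

# Genetic algorithm: keep the best individual and mutate it (elitist strategy).
pop = rng.standard_normal((40, N_PAR))
for gen in range(2000):
    scores = np.array([chi2(p) for p in pop])
    best = pop[scores.argmin()]
    pop = best + 0.1 * rng.standard_normal(pop.shape)  # mutated clones
    pop[0] = best                                      # elitism
print("final chi2/Ndat:", chi2(best) / len(x_dat))
```

Gradient-based minimizers rely on local derivative information, so a derivative-free population search of this kind is a natural choice when the figure of merit is a nonlocal functional of the network parameters.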

References (13)

  • A. Djouadi et al., Phys. Lett. B (2004)
  • S. Frixione et al., JHEP (2004)
  • W.K. Tung, AIP Conference Proceedings, vol. 753, 2005, p. ...
  • S. Forte et al., JHEP (2002)
  • L. Del Debbio et al., JHEP (2005)
  • C. Peterson, T. Rognvaldsson, LU-TP-91-23, Lectures given at 1991 CERN School of Computing, Ystad, ...