End-to-End Physics Event Classification with CMS Open Data: Applying Image-Based Deep Learning to Detector Data for the Direct Classification of Collision Events at the LHC

  • Original Article
  • Published: 2020
Computing and Software for Big Science

Abstract

This paper describes the construction of novel end-to-end image-based classifiers that directly leverage low-level simulated detector data to discriminate signal and background processes in proton–proton collision events at the Large Hadron Collider at CERN. To better understand what end-to-end classifiers are capable of learning from the data and to address a number of associated challenges, we distinguish the decay of the standard model Higgs boson into two photons from its leading background sources using high-fidelity simulated CMS Open Data. We demonstrate the ability of end-to-end classifiers to learn from the angular distribution of the photons recorded as electromagnetic showers, their intrinsic shapes, and the energy of their constituent hits, even when the underlying particles are not fully resolved, delivering a clear advantage in such cases over purely kinematics-based classifiers.
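To make the phrase "image-based" concrete, the following is a minimal sketch of how low-level calorimeter hits could be binned into a two-dimensional energy image before being passed to a classifier. It is an illustration only, not the authors' pipeline: the function name `hits_to_image`, the grid size, the pseudorapidity range, and the toy hits are all assumptions chosen for readability.

```python
import math

def hits_to_image(hits, n_eta=8, n_phi=16, eta_max=1.5):
    """Bin (eta, phi, energy) hits into an n_eta x n_phi energy image.

    Grid size and eta range are illustrative assumptions, not detector values.
    """
    image = [[0.0] * n_phi for _ in range(n_eta)]
    for eta, phi, energy in hits:
        if abs(eta) >= eta_max:
            continue  # hit lies outside the assumed acceptance
        # Map eta in (-eta_max, eta_max) onto row indices 0 .. n_eta - 1.
        i_eta = min(int((eta + eta_max) / (2 * eta_max) * n_eta), n_eta - 1)
        # Wrap phi into [0, 2*pi) so the image is periodic in azimuth.
        phi_wrapped = phi % (2 * math.pi)
        i_phi = int(phi_wrapped / (2 * math.pi) * n_phi) % n_phi
        # Hits falling into the same cell have their energies summed.
        image[i_eta][i_phi] += energy
    return image

# Toy hits: two nearby deposits (an unresolved pair), one isolated deposit,
# and one hit outside the assumed acceptance.
hits = [(0.10, 0.20, 30.0), (0.12, 0.21, 25.0), (-0.90, 3.00, 40.0), (2.5, 1.0, 5.0)]
image = hits_to_image(hits)
total = sum(sum(row) for row in image)
```

In this toy example the two nearby deposits land in the same cell, so their energies add. A real detector image would use a much finer grid, where the shape of each shower spreads over several cells and becomes available to the classifier.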

Acknowledgements

We thank the entire CMS Collaboration for successfully recording LHC proton–proton collision data as well as producing and releasing the high-quality simulated data used in this paper. We also congratulate all members of the CERN accelerator departments for the excellent performance of the LHC and thank the technical and administrative staffs at CERN and at other CMS institutes for their contributions to the success of the CMS effort. In addition, we gratefully acknowledge the computing centres and personnel of the Worldwide LHC Computing Grid for delivering so effectively the computing infrastructure essential to CMS analyses. We also acknowledge the enduring support for the construction and operation of the LHC and the CMS detector. We thank the CERN Open Data group for releasing their simulated data under an open access policy. We strongly support initiatives to provide such high-quality simulated data sets, which can encourage the development of novel but also realistic algorithms, especially in the area of machine learning. We believe that their continued availability will be of great benefit to the high-energy physics community in the long run. Finally, M.A. and M.P. are supported by the Office of High Energy Physics of the U.S. Department of Energy (DOE) under award DE-SC0010118.

On behalf of all authors, M. Andrews states that there is no conflict of interest.

Author information

Corresponding author

Correspondence to M. Andrews.

About this article

Cite this article

Andrews, M., Paulini, M., Gleyzer, S. et al. End-to-End Physics Event Classification with CMS Open Data: Applying Image-Based Deep Learning to Detector Data for the Direct Classification of Collision Events at the LHC. Comput Softw Big Sci 4, 6 (2020). https://doi.org/10.1007/s41781-020-00038-8

  • DOI: https://doi.org/10.1007/s41781-020-00038-8
