Abstract
This paper describes the construction of novel end-to-end image-based classifiers that directly leverage low-level simulated detector data to discriminate signal and background processes in proton–proton collision events at the Large Hadron Collider at CERN. To better understand what end-to-end classifiers are capable of learning from the data and to address a number of associated challenges, we distinguish the decay of the standard model Higgs boson into two photons from its leading background sources using high-fidelity simulated CMS Open Data. We demonstrate the ability of end-to-end classifiers to learn from the angular distribution of the photons recorded as electromagnetic showers, their intrinsic shapes, and the energy of their constituent hits, even when the underlying particles are not fully resolved, delivering a clear advantage in such cases over purely kinematics-based classifiers.
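As a rough illustration of what "image-based" means here, the sketch below accumulates per-hit energies into a two-dimensional grid indexed by angular position. It is a minimal sketch, not the authors' pipeline: the function name `hits_to_image`, the `(i_eta, i_phi, energy)` hit layout, and the 4×4 grid size are all hypothetical choices made only for this example.

```python
def hits_to_image(hits, n_eta, n_phi):
    """Sum hit energies into an n_eta x n_phi grid (hypothetical helper).

    Each hit is an (i_eta, i_phi, energy) tuple of integer cell indices
    and a deposited energy. This layout is an assumption for illustration.
    """
    image = [[0.0] * n_phi for _ in range(n_eta)]
    for i_eta, i_phi, energy in hits:
        # Ignore hits that fall outside the chosen grid.
        if 0 <= i_eta < n_eta and 0 <= i_phi < n_phi:
            # Hits landing in the same cell add their energies.
            image[i_eta][i_phi] += energy
    return image


# Toy input: two hits share a cell, one hit lies outside the grid.
toy_hits = [(1, 2, 1.5), (1, 2, 0.5), (2, 3, 2.0), (9, 9, 4.0)]
image = hits_to_image(toy_hits, n_eta=4, n_phi=4)
```

A grid built this way keeps the spatial pattern of the energy deposits, including shower shape and per-cell energy, which is the kind of low-level information the abstract says the classifiers learn from.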
Acknowledgements
We thank the entire CMS Collaboration for successfully recording LHC proton–proton collision data as well as producing and releasing the high-quality simulated data used in this paper. We also congratulate all members of the CERN accelerator departments for the excellent performance of the LHC and thank the technical and administrative staffs at CERN and at other CMS institutes for their contributions to the success of the CMS effort. In addition, we gratefully acknowledge the computing centres and personnel of the Worldwide LHC Computing Grid for delivering so effectively the computing infrastructure essential to CMS analyses. We also acknowledge the enduring support for the construction and operation of the LHC and the CMS detector. We would like to thank the CERN Open Data group for releasing their simulated data under an open access policy. We strongly support initiatives to provide such high-quality simulated data sets, which can encourage the development of novel but also realistic algorithms, especially in the area of machine learning. We believe that their continued availability will be of great benefit to the high-energy physics community in the long run. Finally, M.A. and M.P. are supported by the Office of High Energy Physics of the U.S. Department of Energy (DOE) under award DE-SC0010118.
On behalf of all authors, M. Andrews states that there is no conflict of interest.
About this article
Cite this article
Andrews, M., Paulini, M., Gleyzer, S. et al. End-to-End Physics Event Classification with CMS Open Data: Applying Image-Based Deep Learning to Detector Data for the Direct Classification of Collision Events at the LHC. Comput Softw Big Sci 4, 6 (2020). https://doi.org/10.1007/s41781-020-00038-8
DOI: https://doi.org/10.1007/s41781-020-00038-8