DOI: 10.1145/3429210.3429220
research-article

Classification of Protein Crystallization Images using EfficientNet with Data Augmentation

Published: 20 November 2020

ABSTRACT

In this paper, we applied EfficientNet, a scalable deep convolutional neural network, with a custom data augmentation stage to MARCO, a public protein crystallization image dataset. The MARCO dataset contains 493,214 protein crystallization images collected from several well-known institutions. In our experiments, EfficientNet exceeded the accuracies reported in previous studies, reaching an overall accuracy of 96.71% on the testing data and 91.33% on the validation data. EfficientNet also achieved 97.23% crystal detection accuracy on the testing data, a significant improvement over existing studies.

    • Published in

      CSBio '20: Proceedings of the Eleventh International Conference on Computational Systems-Biology and Bioinformatics
      November 2020
      110 pages
      ISBN: 9781450388238
      DOI: 10.1145/3429210

      Copyright © 2020 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Qualifiers

      • research-article
      • Research
      • Refereed limited

      Acceptance Rates

      Overall Acceptance Rate: 23 of 37 submissions, 62%
