Elsevier

Computers & Geosciences

Volume 171, February 2023, 105284

Impact of dataset size and convolutional neural network architecture on transfer learning for carbonate rock classification

https://doi.org/10.1016/j.cageo.2022.105284
Under a Creative Commons license
open access

Highlights

  • Top-performing CNNs are compared for application to geological classification tasks

  • We present results using the largest dataset of carbonate core images to date

  • Most geological studies of deep learning use datasets smaller than 10,000 points

  • Even transfer learning methods overfit on datasets smaller than 100,000 data points

  • Different architectures are more appropriate depending on the size of dataset used

Abstract

Modern geological practices, in both industry and academia, rely largely on a legacy of observational data at a range of scales. However, widespread ambiguities in the petrographic description of rock facies reduce the reliability of descriptive data. Previous studies have demonstrated great potential for convolutional neural networks (CNNs) in the classification of facies from digital images; however, it remains to be determined which of the available CNN architectures performs best on a geological classification task. We evaluate the ability of top-performing CNNs to classify carbonate core images using transfer learning, systematically comparing the performance of these architectures on a complex geological dataset. We created three datasets that differ in size by more than an order of magnitude (7000–104,000 samples), each containing images across seven classes from the modified Dunham classification for carbonate rocks. After training nine different CNNs from four architectures on these datasets, we find the Inception-v3 architecture to be best suited to this classification task, achieving 92% accuracy when trained on the largest dataset. Furthermore, we show that even with transfer learning, dataset size plays a key role in model performance: models trained on the smaller datasets show a strong tendency to overfit. This has direct implications for the application of deep learning in the geosciences, as many currently published papers use very small datasets of fewer than 5000 samples. The framework developed in this research could aid future deep-learning-based carbonate classification, and it could be readily modified to classify cores from different formations and lithologies.
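The tendency to overfit noted in the abstract is commonly checked by comparing training accuracy with validation accuracy. The following is a minimal sketch of that check, not the authors' evaluation procedure: the function names, the 0.10 threshold, and the accuracy histories are all hypothetical and chosen only for illustration.

```python
from typing import Sequence


def generalization_gap(train_acc: Sequence[float], val_acc: Sequence[float]) -> float:
    """Return final-epoch training accuracy minus final-epoch validation accuracy."""
    if len(train_acc) != len(val_acc) or not train_acc:
        raise ValueError("histories must be non-empty and of equal length")
    return train_acc[-1] - val_acc[-1]


def is_overfitting(train_acc: Sequence[float], val_acc: Sequence[float],
                   threshold: float = 0.10) -> bool:
    """Flag a run whose gap exceeds `threshold` (an arbitrary illustrative cut-off)."""
    return generalization_gap(train_acc, val_acc) > threshold


# Made-up per-epoch accuracy histories, for illustration only.
# These are NOT results from the paper.
small_run = {"train": [0.60, 0.85, 0.99], "val": [0.55, 0.62, 0.63]}
large_run = {"train": [0.60, 0.82, 0.93], "val": [0.58, 0.80, 0.91]}

for name, run in [("small", small_run), ("large", large_run)]:
    gap = generalization_gap(run["train"], run["val"])
    flagged = is_overfitting(run["train"], run["val"])
    print(f"{name}: gap={gap:.2f}, overfitting={flagged}")
```

A large positive gap means the model fits its training images far better than unseen ones, which is the pattern the abstract attributes to models trained on the smaller datasets. In practice the threshold would be chosen per task rather than fixed in advance.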

Keywords

Deep learning
Machine learning
Dunham classification
Geological images

Data availability

Data will be made available on request.
