DOI: 10.1145/3573428.3573788

SA-CNN: Application to text categorization issues using simulated annealing-based convolutional neural network optimization

Published: 15 March 2023

ABSTRACT

Convolutional neural networks (CNNs) are a representative class of deep learning algorithms built around convolutional computation, performing translation-invariant classification of input data through their hierarchical architecture. However, classical CNN training relies on the steepest-descent algorithm, and learning performance depends heavily on the initial weights of the convolutional and fully connected layers, so the network must be re-tuned to perform well under different model structures and datasets. Exploiting the strength of the simulated annealing algorithm in global search, we propose applying it to the hyperparameter search process to improve the effectiveness of CNNs. In this paper, we introduce SA-CNN, a neural network for text classification tasks built on the Text-CNN architecture, and implement simulated annealing for its hyperparameter search. Experiments demonstrate that SA-CNN achieves higher classification accuracy than earlier manually tuned models, and that the savings in search time and effort relative to manual tuning are substantial.
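To make the approach concrete, below is a minimal, illustrative sketch of simulated annealing applied to hyperparameter search for a Text-CNN-style classifier. The search space, cooling schedule, and the evaluate() stub are assumptions introduced here for illustration; they are not the paper's actual ranges or settings, and evaluate() would have to be replaced with real training and validation of the Text-CNN.

```python
import math
import random

# Hypothetical search space for a Text-CNN-style classifier; the ranges are
# illustrative assumptions, not the values used in the paper.
SEARCH_SPACE = {
    "num_filters":   [50, 100, 150, 200],
    "kernel_sizes":  [(2, 3, 4), (3, 4, 5), (4, 5, 6)],
    "dropout":       [0.3, 0.4, 0.5, 0.6],
    "learning_rate": [1e-4, 5e-4, 1e-3, 5e-3],
}

def random_config():
    """Sample one hyperparameter configuration uniformly at random."""
    return {k: random.choice(v) for k, v in SEARCH_SPACE.items()}

def neighbor(config):
    """Perturb a single hyperparameter to obtain a neighboring configuration."""
    new = dict(config)
    key = random.choice(list(SEARCH_SPACE))
    new[key] = random.choice([v for v in SEARCH_SPACE[key] if v != config[key]])
    return new

def evaluate(config):
    """Placeholder objective: train a Text-CNN with `config` and return its
    validation accuracy. This stub returns a synthetic score so the sketch
    runs end to end; replace it with real training and evaluation."""
    return random.uniform(0.80, 0.95)

def simulated_annealing(steps=100, t_start=1.0, t_end=0.01):
    current = random_config()
    current_acc = evaluate(current)
    best, best_acc = current, current_acc
    for i in range(steps):
        # Exponential cooling schedule from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (i / max(steps - 1, 1))
        candidate = neighbor(current)
        acc = evaluate(candidate)
        # Always accept improvements; accept worse configurations with
        # probability exp(delta / T) so the search can escape local optima.
        delta = acc - current_acc
        if delta > 0 or random.random() < math.exp(delta / t):
            current, current_acc = candidate, acc
            if acc > best_acc:
                best, best_acc = candidate, acc
    return best, best_acc

if __name__ == "__main__":
    config, acc = simulated_annealing()
    print("best config:", config, "validation accuracy:", round(acc, 4))
```

The acceptance rule keeps every improving configuration and occasionally accepts a worse one while the temperature is still high, which is what allows the search to escape local optima that a purely greedy tuner would get stuck in.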


  • Published in

    EITCE '22: Proceedings of the 2022 6th International Conference on Electronic Information Technology and Computer Engineering
    October 2022
    1999 pages
    ISBN:9781450397148
    DOI:10.1145/3573428

    Copyright © 2022 ACM


    Publisher: Association for Computing Machinery, New York, NY, United States



    Qualifiers

    • research-article
    • Research
    • Refereed limited

    Acceptance Rates

    Overall Acceptance Rate: 508 of 972 submissions, 52%
