Abstract
Selecting discriminative terms from the large number of terms in text documents helps improve the accuracy of text classification. To address this task, a deep learning based feature selection method built on the long short term memory (LSTM) network is proposed. A deep LSTM-based network is trained in an unsupervised manner to extract deep features from bag-of-words term frequency vectors. The deep features are integrated with term frequencies to evaluate the effectiveness of terms. By applying deep features to feature selection, the proposed method goes beyond the limitations of term frequency information alone. Experiments on nine public datasets show that the method selects discriminative terms better than the comparative methods.
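As a rough illustration of the input representation named in the abstract, the following minimal sketch builds bag-of-words term frequency vectors from raw text. It is not the authors' implementation: the function name `term_frequency_vectors`, the whitespace tokenizer, and the toy documents are all assumptions made for the example, and the unsupervised LSTM stage that would consume these vectors is not shown.

```python
from collections import Counter


def term_frequency_vectors(documents):
    """Build bag-of-words term frequency vectors (hypothetical helper).

    Returns (vocabulary, vectors), where vectors[i][j] is the number of
    times vocabulary[j] occurs in documents[i].
    """
    # Assumption: lowercase and split on whitespace; the paper's own
    # preprocessing (e.g. stemming) is not reproduced here.
    tokenized = [doc.lower().split() for doc in documents]

    # A sorted vocabulary gives every term a stable column index.
    vocabulary = sorted({term for tokens in tokenized for term in tokens})

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)  # missing terms count as 0
        vectors.append([counts[term] for term in vocabulary])
    return vocabulary, vectors


# Toy corpus, invented for illustration only.
docs = [
    "stock market falls",
    "market rally lifts stock prices",
    "team wins final match",
]
vocab, tf = term_frequency_vectors(docs)
```

Each row of `tf` is one document and each column one term, so a feature selection method of the kind described would score columns and keep the highest-ranked ones.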
Data availability
The datasets generated and analyzed during the current study are available from the corresponding author on reasonable request.
Acknowledgements
This research was supported by the Fundamental Research Funds for Guangdong Natural Science Foundation, Grant No. 2022A1515011848; Guangzhou Philosophy and Social Science, Grant No. 2020GZYB04; Guangdong Philosophy and Social Science, Grant No. GD22YYJ15.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Appendix
1.1 Results of classification accuracy analysis
Tables 3–11
1.2 Results of semantics analysis
The bold terms are manually selected terms related to the topics of the datasets (all terms are stemmed and converted to lowercase).
Tables 12–17
1.3 Results of sparsity analysis
Tables 18–26
About this article
Cite this article
Hong, M., Wang, H. Feature selection based on long short term memory for text classification. Multimed Tools Appl 83, 44333–44378 (2024). https://doi.org/10.1007/s11042-023-16990-7
DOI: https://doi.org/10.1007/s11042-023-16990-7