Abstract
Many facts change over time, a fundamental aspect of our physical environment. A user reading articles about a pandemic, for example, is interested not in the creation date of a document but in the facts and causes of the most recent pandemic. Knowing a document's temporal focus can also help combat fake news. Currently, neither the sequence of events nor the temporal focus is considered when retrieving news documents. Because the available datasets contain only a limited number of temporal aspects, it is difficult to test and evaluate a model's temporal conclusions. The goal of this work is to develop a temporal-focus news article retrieval model based on co-training, advancing research in semi-supervised learning. The dataset is mapped using (1) the evolving focus time of news articles and (2) a semi-supervised method based on co-occurrence contexts that learns low-dimensional continuous vectors for neural contrast embedding models, which generate focus-time-based queries over sequential news articles to facilitate temporal understanding. A diverse dataset of news articles is used to evaluate the effectiveness of the proposed method. With semi-supervised learning and lexicon expansion, the developed model achieves 89%. The method outperforms previous baselines and traditional machine learning models, with improvements of 12.65% and 4.7%, respectively.
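To make the distinction between a document's creation date and its focus time concrete, the following is a minimal sketch, not the co-training model described in the abstract. It assumes that explicit four-digit years appear in the text and uses a simple frequency heuristic; the function name `estimate_focus_year` and the sample article are hypothetical.

```python
import re
from collections import Counter

# Assumption: only explicit four-digit years (1500-2099) count as temporal evidence.
YEAR_PATTERN = re.compile(r"\b(1[5-9]\d{2}|20\d{2})\b")


def estimate_focus_year(text):
    """Return the most frequently mentioned year, or None if no year appears."""
    years = [int(match) for match in YEAR_PATTERN.findall(text)]
    if not years:
        return None
    # Counter.most_common orders by frequency; ties keep first-seen order.
    return Counter(years).most_common(1)[0][0]


# Hypothetical article: written in 2021, but mostly about events of 1918.
article = (
    "Published in 2021, this report revisits the 1918 influenza pandemic. "
    "In 1918 the virus spread worldwide, and by 1919 it had subsided."
)
focus_year = estimate_focus_year(article)
print(focus_year)
```

Here the article's creation year (2021) is mentioned once, while the year it is actually about is mentioned twice, so the heuristic selects the latter. A learned model would additionally need to handle implicit temporal cues that carry no explicit date, which this sketch ignores.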
Index Terms
- Automatically Temporal Labeled Data Generation Using Positional Lexicon Expansion for Focus Time Estimation of News Articles