Abstract
A new question representation method is proposed for automated question matching over an accumulated question-answer archive. The representation defines four kinds of question words for question analysis: question-type words, user-centered words, shareable-pattern words, and irrelevant words. These question words are further annotated with a semantic labeling ontology to enrich the semantic representation and reduce word ambiguity. We tested the matching precision on 5,000 questions with respect to various generators, and the results demonstrated the stability of the method. We further compared the method with Cosine similarity and WordNet-based semantic similarity as baselines on a standard TREC dataset containing 5,536 questions. The results showed that our method improved MRR by 8.6% and accuracy by 9.6% on average, indicating its effectiveness.
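To make the evaluation terms concrete, the following is a minimal sketch of a bag-of-words Cosine similarity baseline and of the MRR and accuracy measures. It is not the implementation used in the paper: the whitespace tokenization, the function names, and the reading of accuracy as the fraction of queries whose top-ranked candidate is correct are all assumptions made for illustration.

```python
import math
from collections import Counter


def cosine_similarity(q1: str, q2: str) -> float:
    """Cosine similarity of two questions as term-frequency vectors.

    Assumption: lowercase whitespace tokenization, no stemming or stop words.
    """
    v1, v2 = Counter(q1.lower().split()), Counter(q2.lower().split())
    dot = sum(v1[t] * v2[t] for t in v1.keys() & v2.keys())
    norm1 = math.sqrt(sum(c * c for c in v1.values()))
    norm2 = math.sqrt(sum(c * c for c in v2.values()))
    if norm1 == 0.0 or norm2 == 0.0:
        return 0.0  # an empty question matches nothing
    return dot / (norm1 * norm2)


def mean_reciprocal_rank(ranks):
    """Mean of 1/rank over all queries.

    ranks: 1-based rank of the first correct match for each query,
    or None when no correct match was returned (contributes 0).
    """
    if not ranks:
        return 0.0
    return sum(1.0 / r for r in ranks if r) / len(ranks)


def accuracy_at_1(ranks):
    """Fraction of queries whose top-ranked candidate is correct (assumed definition)."""
    if not ranks:
        return 0.0
    return sum(1 for r in ranks if r == 1) / len(ranks)


# Hypothetical ranks for four test questions; the third has no correct match.
example_ranks = [1, 2, None, 4]
mrr = mean_reciprocal_rank(example_ranks)   # (1 + 1/2 + 0 + 1/4) / 4
acc = accuracy_at_1(example_ranks)          # 1 of 4 queries correct at rank 1
```

With the hypothetical ranks above, MRR rewards the second and fourth queries partially for placing the correct match below rank 1, whereas accuracy counts only the first query.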
Acknowledgements
This work was supported by the National Natural Science Foundation of China (grant Nos. 61403088 and 61305094).
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Hao, T., Qiu, X., Jiang, S. (2015). Leveraging Semantic Labeling for Question Matching to Facilitate Question-Answer Archive Reuse. In: Huang, DS., Bevilacqua, V., Premaratne, P. (eds) Intelligent Computing Theories and Methodologies. ICIC 2015. Lecture Notes in Computer Science, vol 9225. Springer, Cham. https://doi.org/10.1007/978-3-319-22180-9_7
DOI: https://doi.org/10.1007/978-3-319-22180-9_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-22179-3
Online ISBN: 978-3-319-22180-9
eBook Packages: Computer Science (R0)