DOI: 10.1145/3269206.3272026
research-article

Towards Effective Extraction and Linking of Software Mentions from User-Generated Support Tickets

Published: 17 October 2018

ABSTRACT

Software support tickets contain short, noisy text written by customers. Software products are often referred to by various surface forms and informal abbreviations. Automatically identifying software mentions in support tickets and determining their official names and versions is helpful for many downstream applications, e.g., routing the support tickets to the right expert groups. In this work, we study the problem of software product name extraction and linking from support tickets. We first annotate and analyze sampled tickets to understand the language patterns. Next, we design features using local, contextual, and external information sources for the extraction and linking models. In experiments, we show that linear models with the proposed features deliver better and more consistent results than state-of-the-art baseline models, even on a dataset with sparse labels.


Published in

CIKM '18: Proceedings of the 27th ACM International Conference on Information and Knowledge Management
October 2018
2362 pages
ISBN: 9781450360142
DOI: 10.1145/3269206

Copyright © 2018 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States


Acceptance Rates

CIKM '18 Paper Acceptance Rate: 147 of 826 submissions, 18%
Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%
