Topic and Sentiment Words Extraction in Cross-Domain Product Reviews

Wireless Personal Communications

Abstract

Sentiment analysis is a widely studied task in natural language processing and text mining. Traditional sentiment analysis methods apply supervised or unsupervised classifiers within a single domain and achieve good results there. When the training data and test data come from different domains, however, these methods perform poorly. The central difficulty in cross-domain opinion analysis is that large tagged data sets are hard to obtain, and it is not feasible to tag all the data in every domain of interest. We propose a method, based on conditional random fields and syntactic structure, for extracting topic and sentiment words in order to analyze the sentiment orientation of Chinese product reviews. Given only one or a few tagged topic and sentiment words in the source domain, and no tagged information in the target domain, we aim to extract topic and sentiment words from the target domain and identify their sentiment orientation. Our experimental results show that the method is effective for cross-domain sentiment analysis.

Acknowledgements

This research is supported by the Specialized Research Fund for the Doctoral Program of Higher Education (Grant No. 20133718110014), the National Statistical Science Research (Grant No. 2016LZ12), the Science and Technology of Taian (Grant Nos. 2015GX2012 and 201630576), and the National Economy and Society Information Development Soft Science of Shandong Province (Grant No. 2015EI017). The authors would like to thank all the students and teachers for their efforts. We also thank the reviewers and editors for their valuable suggestions and comments, which improved this work.

Author information

Corresponding author

Correspondence to Ge Wang.

About this article

Cite this article

Wang, G., Pu, P. & Liang, Y. Topic and Sentiment Words Extraction in Cross-Domain Product Reviews. Wireless Pers Commun 102, 1773–1783 (2018). https://doi.org/10.1007/s11277-017-5235-7

  • DOI: https://doi.org/10.1007/s11277-017-5235-7
