An efficient and effective approach for multi-fact extraction from text corpus

Published in World Wide Web

Abstract

Relation extraction (RE) is a fundamental task with various real-world applications. Although significant progress has been achieved in this research field, it remains limited to single-fact extraction. In practice, however, people tend to describe multiple relations in a single sentence. Multi-fact extraction is therefore the more realistic setting, yet it is also more challenging because diverse information is mixed within one sentence. To address this issue, we introduce a novel syntax-based model for multi-fact extraction. Specifically, we propose a relational-expressiveness-based pruning strategy to refine the dependency parse tree of each sentence, and then incorporate the customized and simplified syntax information into sentence encoding via Graph Convolutional Networks. In addition, distance embeddings inform the extractor of the status of each word with respect to the different entity pairs in a sentence, based on its shortest dependency path to the entities of interest. We also explore a fine-grained pooling strategy that integrates various pieces of evidence so that the relation extractor can make accurate predictions. We conduct extensive experiments on publicly available datasets, and the experimental results verify the superiority of our model for multi-fact extraction in terms of both effectiveness and efficiency.
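To make two ideas from the abstract concrete, here is a minimal sketch that is not the authors' implementation. It assumes a hand-written toy dependency parse, interprets the distance feature as each token's hop count to the two entities of a pair, and uses a generic mean-aggregation graph convolution in place of the paper's exact layer. All names (`build_adjacency`, `hop_distances`, `pair_distance_features`, `gcn_layer`) and the example sentence are hypothetical.

```python
from collections import deque

def build_adjacency(heads):
    """heads[i] is the index of token i's head, or -1 for the root."""
    adj = [[] for _ in heads]
    for child, head in enumerate(heads):
        if head >= 0:
            # Treat dependency arcs as undirected edges.
            adj[child].append(head)
            adj[head].append(child)
    return adj

def hop_distances(adj, source):
    """Breadth-first search: number of arcs from `source` to every token."""
    dist = [-1] * len(adj)
    dist[source] = 0
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if dist[nxt] < 0:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist

def pair_distance_features(adj, subj, obj):
    """Per-token (distance to subject, distance to object) indices.
    In a neural model these indices would select rows of a learned
    distance-embedding table (assumption, not taken from the paper)."""
    return list(zip(hop_distances(adj, subj), hop_distances(adj, obj)))

def gcn_layer(adj, features, weight):
    """One generic graph-convolution step: average each token with its
    neighbours, apply a linear map, then ReLU."""
    out = []
    for i, neighbours in enumerate(adj):
        group = [i] + neighbours
        dim = len(features[i])
        mean = [sum(features[j][k] for j in group) / len(group) for k in range(dim)]
        projected = [sum(mean[k] * weight[k][m] for k in range(dim))
                     for m in range(len(weight[0]))]
        out.append([max(0.0, v) for v in projected])
    return out

# Toy parse (assumed) of: "Curie born in Warsaw won Prize"
tokens = ["Curie", "born", "in", "Warsaw", "won", "Prize"]
heads = [4, 0, 3, 1, -1, 4]
adj = build_adjacency(heads)

# The same token gets different features for different entity pairs.
born_for_curie_warsaw = pair_distance_features(adj, 0, 3)[1]  # (1, 1)
born_for_curie_prize = pair_distance_features(adj, 0, 5)[1]   # (1, 3)

# One convolution over 1-dimensional toy features with an identity weight.
features = [[1.0], [0.0], [0.0], [0.0], [0.0], [0.0]]
encoded = gcn_layer(adj, features, [[1.0]])
```

In this toy sentence, "born" lies close to both entities of the pair (Curie, Warsaw) but far from "Prize", so its distance features differ between the two pairs even though the sentence encoding is shared. That is the kind of pair-specific signal the abstract attributes to distance embeddings.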

Notes

  1. https://nlp.stanford.edu/projects/glove/

  2. https://github.com/stanfordnlp/CoreNLP

  3. https://github.com/UKPLab/emnlp2017-relation-extraction

  4. Available at https://github.com/UKPLab/emnlp2017-relation-extraction & https://github.com/thunlp/gp-gnn

  5. https://catalog.ldc.upenn.edu/LDC2006T06

Author information

Corresponding author

Correspondence to Dantong Ouyang.

About this article

Cite this article

Qu, J., Hua, W., Ouyang, D. et al. An efficient and effective approach for multi-fact extraction from text corpus. World Wide Web 25, 195–218 (2022). https://doi.org/10.1007/s11280-021-00982-4

  • DOI: https://doi.org/10.1007/s11280-021-00982-4
