
No-but-semantic-match: computing semantically matched xml keyword search results

World Wide Web

Abstract

Users are rarely familiar with the content of a data source they are querying, and therefore cannot avoid using keywords that do not exist in the data source. Traditional systems may respond with an empty result, causing dissatisfaction, while the data source in effect holds semantically related content. In this paper we study this no-but-semantic-match problem on XML keyword search and propose a solution which enables us to present the top-k semantically related results to the user. Our solution involves two steps: (a) extracting semantically related candidate queries from the original query and (b) processing candidate queries and retrieving the top-k semantically related results. Candidate queries are generated by replacement of non-mapped keywords with candidate keywords obtained from an ontological knowledge base. Candidate results are scored using their cohesiveness and their similarity to the original query. Since the number of queries to process can be large, with each result having to be analyzed, we propose pruning techniques to retrieve the top-k results efficiently. We develop two query processing algorithms based on our pruning techniques. Further, we exploit a property of the candidate queries to propose a technique for processing multiple queries in batch, which improves the performance substantially. Extensive experiments on two real datasets verify the effectiveness and efficiency of the proposed approaches.
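The two steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the function names (`candidate_queries`, `top_k`), the `knowledge_base` dictionary of (candidate keyword, similarity) pairs, the `search` callback, and the scoring rule (similarity multiplied by cohesiveness) are all assumptions made for illustration. The early-exit test is likewise only one simple form of pruning, valid under the assumption that cohesiveness never exceeds 1.

```python
import heapq
from itertools import product


def candidate_queries(query, data_vocab, knowledge_base):
    """Yield (candidate_query, similarity) pairs.

    Keywords found in data_vocab are kept as they are. Every non-mapped
    keyword is replaced by each knowledge-base candidate that does occur
    in the data. The similarity of a candidate query is taken to be the
    product of the per-keyword similarities (an assumption of this sketch).
    """
    options = []
    for keyword in query:
        if keyword in data_vocab:
            options.append([(keyword, 1.0)])
        else:
            options.append([(cand, sim)
                            for cand, sim in knowledge_base.get(keyword, [])
                            if cand in data_vocab])
    # If some keyword has no usable candidate, product() yields nothing.
    for combo in product(*options):
        similarity = 1.0
        for _, sim in combo:
            similarity *= sim
        yield tuple(word for word, _ in combo), similarity


def top_k(query, data_vocab, knowledge_base, search, k):
    """Return the k best (score, result) pairs, best first.

    search(candidate_query) is a hypothetical callback that returns
    (result, cohesiveness) pairs with cohesiveness in [0, 1].
    """
    if k <= 0:
        return []
    candidates = sorted(candidate_queries(query, data_vocab, knowledge_base),
                        key=lambda c: c[1], reverse=True)
    heap = []  # min-heap holding the current best k (score, result) pairs
    for words, similarity in candidates:
        # similarity is an upper bound on any score from this candidate,
        # so once it cannot beat the k-th best score we can stop.
        if len(heap) == k and similarity <= heap[0][0]:
            break
        for result, cohesiveness in search(words):
            item = (similarity * cohesiveness, result)
            if len(heap) < k:
                heapq.heappush(heap, item)
            elif item[0] > heap[0][0]:
                heapq.heapreplace(heap, item)
    return sorted(heap, reverse=True)
```

Processing candidate queries in descending order of similarity is what makes the early exit safe in this sketch: every remaining candidate has an upper bound no higher than the one just rejected.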




Acknowledgments

This work is supported by the Australian Research Council discovery grants DP140103499 and DP160102412.

Author information


Corresponding author

Correspondence to Mehdi Naseriparsa.


About this article


Cite this article

Naseriparsa, M., Islam, M.S., Liu, C. et al. No-but-semantic-match: computing semantically matched xml keyword search results. World Wide Web 21, 1223–1257 (2018). https://doi.org/10.1007/s11280-017-0503-8


  • DOI: https://doi.org/10.1007/s11280-017-0503-8
