ABSTRACT
To support keyword search on structured data, current solutions build large indexes that redundantly store subgraphs called neighborhoods. Moreover, exploring keyword search results requires loading large graphs into memory. We propose a solution that employs much more compact index structures for neighborhood lookups. Using these indexes, we reduce keyword search result exploration to the traditional database problem of top-k join processing, enabling results to be computed efficiently. In particular, this computation can be performed on data streams loaded successively from disk, i.e., it does not require the entire input to be loaded into memory at once. To support this, we propose a top-k procedure based on the rank join operator, which not only computes the k best results but also selects query plans in a top-k fashion during the process. In experiments using large real-world datasets, our solution reduced storage requirements and outperformed the state of the art in both performance and scalability.
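To make the idea of top-k join processing over streamed inputs concrete, the following is a minimal sketch of a generic two-way rank join with a threshold-based stopping rule. It is not the procedure proposed in the paper (in particular, it does no query plan selection). The function name `rank_join`, the `(join_key, score)` tuple layout, and the use of summed scores are assumptions made purely for illustration.

```python
import heapq
from collections import defaultdict
from itertools import count

INF = float("inf")


def rank_join(left, right, k):
    """Illustrative two-way rank join (hypothetical helper, not the paper's code).

    `left` and `right` are iterables of (join_key, score) pairs, each sorted
    by score in descending order. Returns up to k (total_score, join_key)
    pairs with the highest summed scores, reading as little input as needed.
    """
    inputs = (iter(left), iter(right))
    tables = (defaultdict(list), defaultdict(list))  # tuples seen so far, per side
    top = [None, None]    # first (highest) score read from each side
    last = [None, None]   # most recent (lowest) score read from each side
    done = [False, False]
    candidates = []       # max-heap of join results (scores are negated)
    tie = count()         # tie-breaker so heap entries never compare keys
    results = []
    side = 0

    def bound(i):
        # Upper bound on any result that uses a not-yet-read tuple from side i.
        j = 1 - i
        if done[i] or (done[j] and top[j] is None):
            return -INF
        if last[i] is None or top[j] is None:
            return INF
        return last[i] + top[j]

    while len(results) < k:
        threshold = max(bound(0), bound(1))
        # A candidate is safe to emit once no unseen result can beat it.
        if candidates and -candidates[0][0] >= threshold:
            neg_total, _, key = heapq.heappop(candidates)
            results.append((-neg_total, key))
            continue
        if done[0] and done[1]:
            break
        if done[side]:
            side = 1 - side
        try:
            key, score = next(inputs[side])
        except StopIteration:
            done[side] = True
            side = 1 - side
            continue
        if top[side] is None:
            top[side] = score
        last[side] = score
        tables[side][key].append(score)
        # Join the new tuple against everything already seen on the other side.
        for other_score in tables[1 - side][key]:
            heapq.heappush(candidates, (-(score + other_score), next(tie), key))
        side = 1 - side
    return results


left = [("a", 9), ("b", 8), ("c", 3)]
right = [("b", 10), ("c", 9), ("a", 2)]
print(rank_join(left, right, 2))
```

Because both inputs are consumed incrementally through iterators, they could in principle be streams read from disk. The threshold lets the loop stop as soon as the k best results are known, rather than after the full join has been computed.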
Index Terms
- Index structures and top-k join algorithms for native keyword search databases