Abstract
Supporting full-text queries in an XML mediator is difficult because most data sources provide neither keyword search nor ranking. In this paper, we report on the integration of the main functionalities of the emerging XQuery Text standard into XLive, a full XML/XQuery mediator. Our approach is to build a keyword index over the virtual documents defined in views. Selected virtual documents are mapped on demand to data source objects. The mediator's selection operator is thus efficiently extended to support full-text search on views. Keyword search and result ranking are integrated. We rank results using a relevance formula adapted to XPath, based on the number of keywords in elements and their distance from the searched nodes.
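The approach can be illustrated with a minimal sketch. It is not the XLive implementation: the abstract does not state the relevance formula, so the scoring below (keyword count damped by a factor `DECAY` for each level of distance from the searched node) is an assumption, as are the names `build_index`, `search`, `score`, and the sample `view`. The sketch indexes virtual documents by keyword, parses only the candidate documents the index selects, and ranks them.

```python
from collections import defaultdict
import xml.etree.ElementTree as ET

DECAY = 0.5  # assumed damping per level of distance; not taken from the paper


def keyword_hits(node, keywords, distance=0):
    """Yield (keyword_count, distance) for node and each descendant element.

    Only the text directly inside an element is considered (tail text is
    ignored), which is enough for this illustration.
    """
    words = (node.text or "").lower().split()
    count = sum(words.count(k) for k in keywords)
    if count:
        yield count, distance
    for child in node:
        yield from keyword_hits(child, keywords, distance + 1)


def score(node, keywords):
    """Assumed relevance: keyword counts damped by distance from node."""
    return sum(c * DECAY ** d for c, d in keyword_hits(node, keywords))


def build_index(view):
    """Map each keyword to the ids of the virtual documents containing it."""
    index = defaultdict(set)
    for doc_id, xml_text in view.items():
        for elem in ET.fromstring(xml_text).iter():
            for word in (elem.text or "").lower().split():
                index[word].add(doc_id)
    return index


def search(view, index, path, keywords):
    """Rank documents containing any keyword, scored at the nodes matching path."""
    keywords = [k.lower() for k in keywords]
    candidates = set().union(*(index.get(k, set()) for k in keywords))
    ranked = []
    for doc_id in candidates:
        # Only documents selected by the index are fetched and parsed,
        # standing in for the on-demand mapping to data source objects.
        root = ET.fromstring(view[doc_id])
        best = max((score(n, keywords) for n in root.findall(path)), default=0.0)
        if best > 0:
            ranked.append((doc_id, best))
    return sorted(ranked, key=lambda r: r[1], reverse=True)


# Hypothetical view: document id -> XML content held by a data source.
view = {
    "d1": "<book><title>xml mediator</title><chapter><para>xml query</para></chapter></book>",
    "d2": "<book><title>relational databases</title><chapter><para>xml</para></chapter></book>",
    "d3": "<book><title>graph theory</title></book>",
}
index = build_index(view)
print(search(view, index, ".", ["xml"]))
print(search(view, index, "title", ["xml"]))
```

Searching from the document root ranks `d1` above `d2`, because `d1` contains the keyword both more often and closer to the searched node. Restricting the search to `title` elements keeps only `d1`.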
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jamard, C., Gardarin, G. (2007). Extending an XML Mediator with Text Query. In: Filipe, J., Cordeiro, J., Pedrosa, V. (eds) Web Information Systems and Technologies. Lecture Notes in Business Information Processing, vol 1. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74063-6_9
DOI: https://doi.org/10.1007/978-3-540-74063-6_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74062-9
Online ISBN: 978-3-540-74063-6
eBook Packages: Computer Science