Selectivity Estimation for Exclusive Query Translation in Deep Web Data Integration

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5463)

Abstract

In Deep Web data integration, some Web database interfaces express exclusive predicates of the form Q_e = P_i (P_i ∈ {P_1, P_2, ..., P_m}), which permit only one predicate to be selected at a time. Accurately and efficiently estimating the selectivity of each Q_e is critical to optimal query translation. This paper focuses on selectivity estimation for infinite-value attributes, which is more difficult than for key attributes and categorical attributes. First, we compute the attribute correlation and retrieve approximately random attribute-level samples by submitting queries on the least correlated attribute to the actual Web database. Then we estimate a Zipf equation from the word ranks in the sample and the actual selectivity of several words obtained from the actual Web database. Finally, the selectivity of any word on the infinite-value attribute can be derived from the Zipf equation. An experimental evaluation of the proposed selectivity estimation method shows that its estimates are highly accurate.
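The last two steps of the abstract (fitting a Zipf equation from sample ranks plus a few actual selectivities, then deriving the selectivity of any word) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the pure power-law form s(r) = c / r^alpha, fits it by ordinary least squares in log-log space, and treats each sampled record as a whitespace-separated string. The function names rank_words, fit_zipf, and estimate_selectivity are hypothetical, and the sampling step against the Web database is not modeled.

```python
import math
from collections import Counter


def rank_words(sample_records):
    """Rank words by how many sampled records contain them (rank 1 = most frequent)."""
    counts = Counter(
        word for record in sample_records for word in set(record.lower().split())
    )
    # Break frequency ties alphabetically so the ranking is deterministic.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {word: rank for rank, (word, _) in enumerate(ordered, start=1)}


def fit_zipf(points):
    """Fit log(s) = log(c) - alpha * log(r) to (rank, selectivity) pairs.

    Assumption: a pure power law; the paper's exact Zipf equation may differ.
    """
    xs = [math.log(rank) for rank, _ in points]
    ys = [math.log(sel) for _, sel in points]
    n = len(points)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need probe words with at least two distinct ranks")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    alpha = -slope
    c = math.exp(mean_y - slope * mean_x)
    return c, alpha


def estimate_selectivity(rank, c, alpha):
    """Estimated selectivity of the word at the given sample rank."""
    return c / rank ** alpha


if __name__ == "__main__":
    # Hypothetical probe words: (rank in sample, actual selectivity from the database).
    probes = [(1, 0.2), (10, 0.02), (100, 0.002)]
    c, alpha = fit_zipf(probes)
    print(c, alpha, estimate_selectivity(1000, c, alpha))
```

Fitting in log-log space keeps the sketch dependency-free and needs only a handful of probe queries, which matches the abstract's description of using the actual selectivity of "several words". A fit that also estimates a rank offset (the Zipf-Mandelbrot form) would need a nonlinear solver and is left out here.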

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Jiang, F., Meng, W., Meng, X. (2009). Selectivity Estimation for Exclusive Query Translation in Deep Web Data Integration. In: Zhou, X., Yokota, H., Deng, K., Liu, Q. (eds) Database Systems for Advanced Applications. DASFAA 2009. Lecture Notes in Computer Science, vol 5463. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00887-0_53

  • DOI: https://doi.org/10.1007/978-3-642-00887-0_53

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-00886-3

  • Online ISBN: 978-3-642-00887-0

  • eBook Packages: Computer Science, Computer Science (R0)
