
S-MRST: a novel framework for indexing uncertain data

World Wide Web

Abstract

This paper studies the problem of probabilistic range queries over uncertain data. Although existing solutions support such queries, there is still room for improvement. We propose a novel index, called S-MRST, for indexing uncertain data. First, by bounding uncertain data with an irregular shape, it achieves stronger spatial pruning. Second, by taking the gradient of the probability density function into account, S-MRST also achieves strong probability pruning. More importantly, S-MRST is a general index that supports multiple types of probabilistic queries. Theoretical analysis and extensive experimental results demonstrate the effectiveness and efficiency of the proposed index.
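To make the query type concrete, the following is a minimal sketch of a generic filter-and-refine probabilistic range query. It is not the S-MRST index: it assumes each uncertain object is approximated by weighted sample points, uses an ordinary rectangular bounding box for spatial pruning (rather than the irregular shape the abstract mentions), and omits any gradient-based probability pruning. All names (`UncertainObject`, `prob_range_query`, and so on) are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


@dataclass
class UncertainObject:
    """Hypothetical uncertain object: weighted samples approximating its pdf."""
    name: str
    samples: List[Point]
    weights: List[float]  # assumed non-negative and summing to 1

    def mbr(self) -> Rect:
        """Axis-aligned bounding rectangle of all sample points."""
        xs = [p[0] for p in self.samples]
        ys = [p[1] for p in self.samples]
        return (min(xs), min(ys), max(xs), max(ys))


def intersects(a: Rect, b: Rect) -> bool:
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def contains(outer: Rect, inner: Rect) -> bool:
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


def appearance_probability(obj: UncertainObject, query: Rect) -> float:
    """Probability mass of the object's samples that falls inside the query."""
    xmin, ymin, xmax, ymax = query
    return sum(w for (x, y), w in zip(obj.samples, obj.weights)
               if xmin <= x <= xmax and ymin <= y <= ymax)


def prob_range_query(objects: List[UncertainObject], query: Rect,
                     threshold: float) -> List[str]:
    """Return names of objects lying in `query` with probability >= threshold.

    Assumes 0 < threshold <= 1.
    """
    result = []
    for obj in objects:
        box = obj.mbr()
        if not intersects(box, query):
            continue  # spatial pruning: appearance probability is 0
        if contains(query, box):
            result.append(obj.name)  # validated: appearance probability is 1
            continue
        # Refinement: compute the probability only for undecided objects.
        if appearance_probability(obj, query) >= threshold:
            result.append(obj.name)
    return result
```

In this sketch the bounding box decides the easy cases (disjoint or fully covered), so the probability computation runs only for objects that partially overlap the query. A tighter bounding shape would, in principle, leave fewer objects for the refinement step.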





Acknowledgment

This work is partially supported by the NSF of China for Outstanding Young Scholars under grant No. 61322208, the NSF of China for Key Program under grant No. 61532021, and the NSF of China under grant Nos. 61272178 and 61572122.

Author information

Corresponding author

Correspondence to Bin Wang.

About this article

Cite this article

Zhu, R., Wang, B., Luo, S. et al. S-MRST: a novel framework for indexing uncertain data. World Wide Web 20, 697–727 (2017). https://doi.org/10.1007/s11280-016-0409-x

  • DOI: https://doi.org/10.1007/s11280-016-0409-x
