Ranking distributed database in tuple-level uncertainty

AbdulAzeem, Yousry M.; Eldesouky, Ali I.; Ali, Hesham A.; Salem, Mofreh M.

doi:10.1007/s00500-014-1306-9

Methodologies and Application
Published: 06 May 2014

Volume 19, pages 965–980, (2015)

Soft Computing

Abstract

Ranking in uncertain database environments has recently gained considerable importance. Many techniques have been introduced to rank uncertain databases, and others to rank distributed certain databases. Unfortunately, few techniques address ranking in distributed uncertain databases. This paper proposes a framework that improves ranking processing for databases that are both uncertain and distributed. Within the proposed framework, two new communication- and computation-efficient algorithms are investigated for retrieving the top-k tuples from distributed sites. These algorithms are applied under tuple-level uncertainty. Their main concern is to reduce the number of communication rounds and the amount of data transmitted while still ranking efficiently. Experimental results show that both proposed algorithms substantially reduce communication cost. The results also show that the first algorithm is efficient when the number of sites is low, while the second performs better as the number of sites grows.
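
To make the setting concrete, the following is a minimal sketch of a naive one-round baseline for distributed top-k under tuple-level uncertainty. It is not either of the paper's algorithms, which the abstract does not describe. The ranking function (score multiplied by existence probability), the `UTuple` type, and all function names are assumptions chosen for illustration. Because this assumed key depends only on the tuple itself, the global top-k is always contained in the union of the local top-k lists, so one round with at most k tuples per site suffices.

```python
import heapq
from typing import List, NamedTuple, Tuple


class UTuple(NamedTuple):
    """Hypothetical uncertain tuple: a score plus an existence probability."""
    tid: str
    score: float
    prob: float  # probability that the tuple exists (tuple-level uncertainty)


def rank_key(t: UTuple) -> float:
    # Assumed ranking semantics: expected score. The paper may use another.
    return t.score * t.prob


def local_top_k(site: List[UTuple], k: int) -> List[UTuple]:
    # Each site ranks its own tuples and keeps only its k best.
    return heapq.nlargest(k, site, key=rank_key)


def distributed_top_k(sites: List[List[UTuple]], k: int) -> Tuple[List[UTuple], int]:
    """One communication round: every site ships its local top-k to a coordinator.

    Returns the global top-k and the number of tuples transmitted.
    """
    candidates: List[UTuple] = []
    transmitted = 0
    for site in sites:
        partial = local_top_k(site, k)
        transmitted += len(partial)
        candidates.extend(partial)
    return heapq.nlargest(k, candidates, key=rank_key), transmitted


sites = [
    [UTuple("a1", 90, 0.5), UTuple("a2", 60, 0.9), UTuple("a3", 20, 1.0)],
    [UTuple("b1", 80, 0.8), UTuple("b2", 70, 0.3)],
]
top, sent = distributed_top_k(sites, k=2)
print([t.tid for t in top], sent)
```

In this baseline the transmitted volume grows linearly with the number of sites (up to k tuples per site), which is the kind of communication cost the abstract says the proposed algorithms aim to reduce.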

Author information

Authors and Affiliations

Computer Engineering and Systems Department, Faculty of Engineering, Mansoura University, Mansoura, Daqahlia, Egypt
Yousry M. AbdulAzeem, Ali I. Eldesouky, Hesham A. Ali & Mofreh M. Salem

Corresponding author

Correspondence to Yousry M. AbdulAzeem.

Additional information

Communicated by A. Lotfi.

About this article

Cite this article

AbdulAzeem, Y.M., Eldesouky, A.I., Ali, H.A. et al. Ranking distributed database in tuple-level uncertainty. Soft Comput 19, 965–980 (2015). https://doi.org/10.1007/s00500-014-1306-9

Published: 06 May 2014
Issue Date: April 2015
DOI: https://doi.org/10.1007/s00500-014-1306-9
