Abstract
We examine how to apply the hash-join paradigm to spatial joins, and define a new framework for spatial hash-joins. Our spatial partition functions have two components: a set of bucket extents and an assignment function, which may map a data item into multiple buckets. Furthermore, the partition functions for the two input datasets may be different.

We have designed and tested a spatial hash-join method based on this framework. The partition function for the inner dataset is initialized by sampling the dataset, and evolves as data are inserted. The partition function for the outer dataset is immutable, but may replicate a data item from the outer dataset into multiple buckets. The method mirrors relational hash-joins in other aspects. Our method needs no pre-computed indices. It is therefore applicable to a wide range of spatial joins.

Our experiments show that our method outperforms current spatial join algorithms based on tree matching by a wide margin. Further, its performance is superior even when the tree-based methods have pre-computed indices. This makes the spatial hash-join method highly competitive both when the input datasets are dynamically generated and when the datasets have pre-computed indices.
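The framework described above can be illustrated with a minimal in-memory sketch. It is not the paper's implementation. It assumes axis-aligned rectangles stored as `(xmin, ymin, xmax, ymax)` tuples, seeds the bucket extents from a random sample of the inner dataset, and places each inner item in exactly one bucket chosen by least area enlargement. The least-enlargement rule, the single-bucket placement of inner items, and all function names are assumptions made for illustration; the abstract states only that the inner partition function is initialized by sampling and evolves as data are inserted. The outer dataset is then assigned against the final, fixed extents and may be replicated into several buckets.

```python
import random


def intersects(a, b):
    """True if rectangles a and b (xmin, ymin, xmax, ymax) overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def union(a, b):
    """Smallest rectangle covering both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def area(r):
    return (r[2] - r[0]) * (r[3] - r[1])


def partition_inner(inner, num_buckets, rng):
    """Seed extents from a sample, then grow them as inner items arrive.

    Assumption: each inner item goes to the one bucket whose extent
    needs the least area enlargement to cover it.
    """
    extents = list(rng.sample(inner, min(num_buckets, len(inner))))
    buckets = [[] for _ in extents]
    for rect in inner:
        best = min(
            range(len(extents)),
            key=lambda k: area(union(extents[k], rect)) - area(extents[k]),
        )
        extents[best] = union(extents[best], rect)  # the extent evolves
        buckets[best].append(rect)
    return extents, buckets


def partition_outer(outer, extents):
    """Assign outer items against fixed extents, replicating where needed."""
    buckets = [[] for _ in extents]
    for rect in outer:
        for k, extent in enumerate(extents):
            if intersects(rect, extent):
                buckets[k].append(rect)  # may land in several buckets
    return buckets


def spatial_hash_join(inner, outer, num_buckets=4, seed=0):
    """Return all (inner, outer) pairs of intersecting rectangles."""
    if not inner:
        return []
    extents, inner_buckets = partition_inner(inner, num_buckets, random.Random(seed))
    outer_buckets = partition_outer(outer, extents)
    result = []
    for inner_bucket, outer_bucket in zip(inner_buckets, outer_buckets):
        for i in inner_bucket:
            for o in outer_bucket:
                if intersects(i, o):
                    result.append((i, o))
    return result
```

Under these assumptions, each inner item lies inside the extent of its single bucket, so any outer item that intersects it also intersects that extent and is placed in the same bucket. Every matching pair is therefore found exactly once, and outer items that overlap no extent are filtered out before the join step. A disk-based method would write buckets to disk and join each pair of buckets in memory; the nested loops here stand in for that step.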
Index Terms
- Spatial hash-joins
Published in SIGMOD '96: Proceedings of the 1996 ACM SIGMOD International Conference on Management of Data.