doi:10.1016/j.cpc.2006.06.001
Copyright © 2006 Elsevier B.V. All rights reserved.
Collision-free spatial hash functions for structural analysis of billion-vertex chemical bond networks
Cheng Zhang (a), Bhupesh Bansal (a), Paulo S. Branicio (a, c), Rajiv K. Kalia (a), Aiichiro Nakano (a), Ashish Sharma (a, b) and Priya Vashishta (a)
(a) Collaboratory for Advanced Computing and Simulations, Department of Computer Science, Department of Physics & Astronomy, Department of Chemical Engineering & Materials Science, University of Southern California, Los Angeles, CA 90089-0242, USA
(b) Department of Biomedical Informatics, Ohio State University, Columbus, OH 43210, USA
(c) Departmento de Física, Universidade Federal de São Carlos, São Carlos, SP 13565, Brazil
Received 1 May 2006;
revised 7 June 2006;
accepted 13 June 2006.
Available online 20 July 2006.
Abstract
State-of-the-art molecular dynamics (MD) simulations generate massive datasets involving billion-vertex chemical bond networks, which makes data mining based on graph algorithms such as K-ring analysis a challenge. This paper proposes an algorithm that improves the efficiency of ring analysis of large graphs by exploiting properties of K-rings and spatial correlations of vertices in the graph. The algorithm uses dual-tree expansion (DTE) and spatial hash-function tagging (SHAFT) to optimize computation and memory access. Numerical tests show nearly perfect linear scaling of the algorithm with problem size. A parallel implementation of the DTE + SHAFT algorithm also achieves high scalability. The algorithm has been successfully employed to analyze large MD simulations involving up to 500 million atoms.
Keywords: Ring analysis; Topological network; Molecular dynamics simulation; Spatial hash function
PACS classification codes: 07.05.Kf; 07.05.Tp; 61.43.Bn; 61.72.Ff; 82.20.Wt; 89.20.Ff
Fig. 1. K-rings emanating from vertex x in a simple cubic structure, in which the edges are denoted by solid or dashed lines. Only the paths shown with solid lines in (a), (b) and (c) are considered as K-rings. The path in (d), though closed and without any cycles, is not a K-ring, because it is not the shortest path between the neighboring vertices P and R.
Fig. 2. K-ring statistics for vertex x in a network. Only the paths labeled a, b and c are K-rings. Paths b′ and c′, though closed and without any cycles, are not K-rings, because they are not the shortest paths between the corresponding neighboring vertices of x. Vertex x has three K-rings, a, b and c, of lengths 6, 4 and 5, respectively.
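The shortest-path criterion described in Figs. 1 and 2 can be illustrated with a minimal Python sketch. It is not the paper's DTE algorithm: it uses a plain breadth-first search, assumes the closing path between two neighbors may not pass through x itself, and reports only one ring length per neighbor pair even when several shortest paths exist. The function names `shortest_path_length` and `k_ring_sizes` and the adjacency-dictionary format are hypothetical choices made for this example.

```python
from collections import deque
from itertools import combinations


def shortest_path_length(adj, src, dst, excluded):
    """Edge count of the shortest src-dst path that never visits `excluded`.

    Returns None if dst cannot be reached. `adj` maps a vertex to its neighbors.
    """
    dist = {src: 0}
    queue = deque([src])
    while queue:
        v = queue.popleft()
        if v == dst:
            return dist[v]
        for w in adj[v]:
            if w != excluded and w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return None


def k_ring_sizes(adj, x):
    """Ring length for every pair of neighbors of x under the shortest-path rule.

    Assumption of this sketch: a ring is x plus a shortest path between two
    of its neighbors, so its length is (path edges) + 2.
    """
    sizes = {}
    for p, r in combinations(sorted(adj[x]), 2):
        d = shortest_path_length(adj, p, r, excluded=x)
        if d is not None:
            sizes[(p, r)] = d + 2
    return sizes


# A single hexagon: vertex 0 sits on exactly one ring, of length 6.
hexagon = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
print(k_ring_sizes(hexagon, 0))  # {(1, 5): 6}
```

Adding an edge between vertices 1 and 5 would shorten the path between the two neighbors of vertex 0, so the 6-member path would no longer count and a 3-member ring would be reported instead.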
Fig. 3. Comparison of the proposed dual-tree expansion (DTE) algorithm with the original algorithm using a 6-member ring as an example. The depth of search in the proposed algorithm is reduced by half.
Fig. 4. An example of the collision-free spatial hash function in 2D. Within any window no larger than the hash function's modulus, there will be no two identical numbers.
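The window property stated in Fig. 4 can be checked with a small Python sketch. The paper's actual hash function is not reproduced in this section, so the construction below (the tag of cell (i, j) is `(i mod m) + m * (j mod m)`) is only one assumed function with the same property; `spatial_tag` and `window_is_collision_free` are hypothetical names.

```python
def spatial_tag(i, j, m):
    """Tag for integer cell (i, j); values lie in the range 0 .. m*m - 1.

    Assumed construction for illustration, not the function from the paper.
    """
    return (i % m) + m * (j % m)


def window_is_collision_free(i0, j0, w, m):
    """True if all cells of the w-by-w window with corner (i0, j0) have distinct tags."""
    tags = {
        spatial_tag(i, j, m)
        for i in range(i0, i0 + w)
        for j in range(j0, j0 + w)
    }
    return len(tags) == w * w


m = 4
# Any window of side <= m is collision-free, wherever it is placed.
print(all(
    window_is_collision_free(i0, j0, w, m)
    for i0 in range(-5, 6)
    for j0 in range(-5, 6)
    for w in range(1, m + 1)
))  # True
# A window wider than the modulus must repeat a tag.
print(window_is_collision_free(0, 0, m + 1, m))  # False
```

The property holds because two cells inside a window of side at most m differ by less than m in each coordinate, so equal remainders modulo m force equal coordinates.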
Fig. 5. (a) Log–log plot of clock time vs. problem size, where DTE combined with SHAFT is compared against DTE alone and STE. DTE with SHAFT gives the best performance for large problem sizes and scales roughly linearly. Lines are linear fits with slopes 1.14, 1.21, and 1.03 for the STE, DTE, and DTE + SHAFT algorithms, respectively. (b) Log–log plot of the number of instructions vs. problem size for the three algorithms. Lines are least-squares fits with slopes 1.03, 1.01, and 1.01 for the STE, DTE, and DTE + SHAFT algorithms, respectively. (c) Log–log plot of cache misses vs. problem size for STE, DTE and DTE + SHAFT. Lines are least-squares fits with slopes 1.33, 1.26, and 1.06 for the STE, DTE, and DTE + SHAFT algorithms, respectively.
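The slopes quoted in Fig. 5 are scaling exponents: if a cost grows as c·N^s, a straight-line fit of log(cost) against log(N) has slope s. The sketch below shows such a least-squares fit in Python. The data are synthetic, generated from an exact power law with exponent 1.03 and an arbitrary prefactor; they are not the measurements from the paper, and `loglog_slope` is a hypothetical helper name.

```python
import math


def loglog_slope(sizes, costs):
    """Least-squares slope of log(cost) vs. log(size), i.e. the scaling exponent."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(c) for c in costs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Synthetic example data (not from the paper): cost = 2e-6 * N**1.03.
sizes = [10 ** k for k in range(3, 8)]
costs = [2.0e-6 * n ** 1.03 for n in sizes]
print(round(loglog_slope(sizes, costs), 2))  # 1.03
```

A slope close to 1 indicates near-linear scaling, while the larger slopes reported for cache misses of STE and DTE indicate super-linear growth.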
Fig. 6. Execution time of the DTE + SHAFT algorithm as a function of the number of computing nodes with a fixed problem size (5×10^5 vertices). The line is the least-squares fit with slope −1.09.
Fig. 7. (a) A thin slice of a 500 million-atom alumina target 40 nm in front of the projectile during hypervelocity impact simulation. Deviation in the number of 6-member rings from perfect crystalline atoms (blue) is color-coded using the gradient bar above. (b) The same plane colored by deviation in coordination number from perfect crystalline atoms (blue). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
Table 1. Dual-tree expansion algorithm

Table 2. Spatial hash-function tagging algorithm
