
O’Reach: Even Faster Reachability in Large Graphs

Published: 21 October 2022


Abstract

One of the most fundamental problems in computer science is the reachability problem: Given a directed graph and two vertices s and t, can s reach t via a path? We revisit existing techniques and combine them with new approaches to support a large portion of reachability queries in constant time using a linear-sized reachability index. Our new algorithm O’Reach can be easily combined with previously developed solutions for the problem or run standalone.

In a detailed experimental study, we compare a variety of algorithms with respect to their index-building and query times as well as their memory footprint on a diverse set of instances. Our experiments indicate that the query performance often depends strongly not only on the type of graph but also on the result, i.e., reachable or unreachable. Furthermore, we show that previous algorithms are significantly sped up when combined with our new approach in almost all scenarios. Surprisingly, due to cache effects, a higher investment in space doesn’t necessarily pay off: Reachability queries can often be answered even faster than single memory accesses in a precomputed full reachability matrix.


1 INTRODUCTION

Graphs are used to model problem settings of various different disciplines. A natural question that arises frequently is whether one vertex of the graph can reach another vertex via a path of directed edges. Reachability finds application in a wide variety of fields, such as program and dataflow analysis [24, 25], user-input dependence analysis [27], XML query processing [34], and more [40]. Another prominent example is the Semantic Web, which is composed of RDF/OWL data. These are often huge graphs with rich content. Here, reachability queries are often necessary to deduce relationships among the objects.

There are two straightforward solutions to the reachability problem: The first is to answer each query individually with a graph traversal algorithm, such as breadth-first search (BFS) or depth-first search (DFS), in worst-case \( \mathcal {O}(m+n) \) time and \( \mathcal {O}(n) \) space. The second is to precompute a full all-pairs reachability matrix in an initialization step and answer all ensuing queries in worst-case constant time. In return, this approach suffers from a space complexity of \( \mathcal {O}(n^2) \) and an initialization time of \( \mathcal {O}(n\cdot m) \) using the Floyd–Warshall algorithm [6, 7, 35] or starting a graph traversal at each vertex in turn. Alternatively, the initialization step can be performed in \( \mathcal {O}(n^\omega) \) via fast matrix multiplication, where \( \mathcal {O}(n^\omega) \) is the time required to multiply two \( n \times n \) matrices (\( 2 \le \omega \lt 2.38 \) [20]). With increasing graph size, however, both the initialization time and space complexity of this approach become impractical. We, therefore, strive for alternative algorithms that decrease these complexities whilst still providing fast query lookups.
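To make the first baseline concrete, the following sketch answers a single Query(\( s, t \)) with a plain BFS in \( \mathcal {O}(n+m) \) time and \( \mathcal {O}(n) \) space; the adjacency-list representation and function name are illustrative (C++, the language of the implementations compared later), and DFS would work the same way.

```cpp
#include <cstdint>
#include <queue>
#include <vector>

// Answer a single reachability query by a plain BFS from s (sketch).
bool reachableBFS(const std::vector<std::vector<uint32_t>>& out, uint32_t s, uint32_t t) {
    if (s == t) return true;                       // trivial query
    std::vector<bool> visited(out.size(), false);
    std::queue<uint32_t> q;
    visited[s] = true;
    q.push(s);
    while (!q.empty()) {
        uint32_t u = q.front(); q.pop();
        for (uint32_t v : out[u]) {
            if (v == t) return true;               // found a path s ->* t
            if (!visited[v]) { visited[v] = true; q.push(v); }
        }
    }
    return false;                                  // s cannot reach t
}
```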

Contribution. In this article, we study a variety of approaches that are able to support fast reachability queries. All of these algorithms perform some kind of preprocessing on the graph and then use the collected data to answer reachability queries in a timely manner. Based on simple observations, we provide a new algorithm, O’Reach, that can improve the query time for a wide range of cases over state-of-the-art reachability algorithms at the expense of some additional precomputation time and space, or be run standalone. Furthermore, we show that previous algorithms are significantly sped up when combined with our new approach in almost all scenarios. In addition, we show that the expected query performance of various algorithms depends not only on the type of graph but also on the ratio of successful queries, i.e., those with result reachable. Surprisingly, through cache effects and a significantly smaller memory footprint, especially unsuccessful reachability queries can be answered faster than single memory accesses in a precomputed reachability matrix.


2 PRELIMINARIES

Terms and Definitions. Let \( G=(V, E) \) be a simple directed graph with vertex set V and edge set \( E\subseteq V \times V \). As usual, \( n=|V| \) and \( m=|E| \). An edge \( (u, v) \) is said to be outgoing at u and incoming at v, and u and v are called adjacent. The out-degree \( \mathrm{deg}^{+}(u) \) (in-degree \( \mathrm{deg}^{-}(u) \)) of a vertex u is its number of outgoing (incoming) edges. A vertex without incoming (outgoing) edges is called a source (sink). The out-neighborhood \( \textsf {N}^{{+}}(u) \) (in-neighborhood \( \textsf {N}^{{-}}(u) \)) of a vertex u is the set of all vertices v such that \( (u, v) \in E \) (\( (v, u) \in E \)). The reverse of an edge \( (u, v) \) is the edge \( (v, u) = {(u, v)}^{{\mathrm{R}}} \). The reverse \( {G}^{{\mathrm{R}}} \) of a graph G is obtained by keeping the vertices of G, but substituting each edge \( (u, v) \in E \) by its reverse, i.e., \( {G}^{{\mathrm{R}}} = (V, {E}^{{\mathrm{R}}}) \).

A sequence of vertices \( s = v_0 \rightarrow \dots \rightarrow v_k = t \), \( k \ge 0 \), such that for each pair of consecutive vertices \( v_i \rightarrow v_{i+1} \), \( (v_i, v_{i+1})\in E \), is called an s-t path. If such a path exists, s is said to reach t and we write \( s \rightarrow ^*t \) for short, and \( s \not\rightarrow ^*t \) otherwise. The out-reachability \( \textsf {R}^{+}(u) = \lbrace v \mid u \rightarrow ^*v\rbrace \) (in-reachability \( \textsf {R}^{-}(u) = \lbrace v \mid v \rightarrow ^*u\rbrace \)) of a vertex \( u \in V \) is the set of all vertices that u can reach (that can reach u).

A weakly connected component (WCC) of G is a maximal set of vertices \( C \subseteq V \) such that \( \forall u, v \in C: u \rightarrow ^*v \) in \( G=(V, E \cup {E}^{{\mathrm{R}}}) \), i.e., also using the reverse of edges. Note that if two vertices \( u, v \) reside in different WCCs, then \( u \not\rightarrow ^*v \) and \( v \not\rightarrow ^*u \). A strongly connected component (SCC) of G denotes a maximal set of vertices \( S \subseteq V \) such that \( \forall u, v \in S: u \rightarrow ^*v \wedge v \rightarrow ^*u \) in G. Contracting each SCC S of G to a single vertex \( v_S \), called its representative, while preserving edges between different SCCs as edges between their corresponding representatives, yields the condensation \( {G}^{{\mathrm{C}}} \) of G. We denote the SCC a vertex \( v \in V \) belongs to by \( \mathcal {S}(v) \). A directed graph G is strongly connected if it only has a single SCC and acyclic if each SCC is a singleton, i.e., if G has n SCCs. Observe that G and \( {G}^{{\mathrm{R}}} \) have exactly the same WCCs and SCCs and that \( {G}^{{\mathrm{C}}} \) is a directed acyclic graph (DAG). WCCs of a graph can be computed in \( \mathcal {O}(n+m) \) time, e.g., via a BFS that ignores edge directions. The SCCs of a graph can be computed in linear time [29] as well.

A topological ordering \( \tau : V \rightarrow \mathbb {N}_0 \) of a DAG G is a total ordering of its vertices such that \( \forall (u, v) \in E: \tau (u) \lt \tau (v) \). Note that the topological ordering of G is not necessarily unique, i.e., there can be multiple different topological orderings. For a vertex \( u \in V \), the forward topological level is \( \mathcal {F}(u) = \min _\tau \tau (u) \), i.e., the minimum value of \( \tau (u) \) among all topological orderings \( \tau \) of G. Consequently, \( \mathcal {F}(u) = 0 \) if and only if u is a source. The backward topological level \( \mathcal {B}(u) \) of \( u \in V \) is the topological level of u with respect to \( {G}^{{\mathrm{R}}} \), and \( \mathcal {B}(u) = 0 \) if and only if u is a sink. A topological ordering, as well as the forward and backward topological levels, can be computed in linear time [6, 19, 30], see also Section 4.
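As an illustration, forward topological levels can be computed by a Kahn-style peeling of sources. The sketch below uses illustrative names and assumes the common convention that a vertex's level is one more than the maximum level of its in-neighbors; it is not necessarily the exact routine used later, but it has the properties relied on below: sources get level 0 and levels strictly increase along every edge. Running it on the reverse graph yields the backward levels.

```cpp
#include <cstddef>
#include <cstdint>
#include <queue>
#include <vector>

// Forward topological levels of a DAG via Kahn-style peeling of sources (sketch).
std::vector<uint32_t> forwardLevels(const std::vector<std::vector<uint32_t>>& out) {
    const std::size_t n = out.size();
    std::vector<uint32_t> indeg(n, 0), level(n, 0);
    for (std::size_t u = 0; u < n; ++u)
        for (uint32_t v : out[u]) ++indeg[v];
    std::queue<uint32_t> q;
    for (std::size_t v = 0; v < n; ++v)
        if (indeg[v] == 0) q.push(static_cast<uint32_t>(v));   // sources: level 0
    while (!q.empty()) {
        uint32_t u = q.front(); q.pop();
        for (uint32_t v : out[u]) {
            if (level[u] + 1 > level[v]) level[v] = level[u] + 1;  // one more than max in-neighbor
            if (--indeg[v] == 0) q.push(v);
        }
    }
    return level;  // for backward levels, run the same routine on the reverse graph
}
```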

A reachability query Query(\( s, t \)) for a pair of vertices \( s, t \in V \) is called positive and answered with true if \( s \rightarrow ^*t \), and otherwise negative and answered with false. Trivially, Query(\( v, v \)) is always true, which is why we only consider non-trivial queries between distinct vertices \( s \ne t \in V \) from here on. Let \( \mathcal {P} \) (\( \mathcal {N} \)) denote the set of all positive (negative) non-trivial queries of G, i.e., the set of all \( (s, t) \in V \times V \), \( s \ne t \), such that Query(\( s, t \)) is positive (negative). The reachability \( \rho \) in G is the ratio of positive queries among all non-trivial queries, i.e., \( \rho = \frac{|\mathcal {P}|}{n(n-1)} \). Note that, due to the restriction to non-trivial queries, \( 0 \le \rho \le 1 \). The Reachability problem, studied in this article, consists in answering a sequence of reachability queries for arbitrary pairs of vertices on a given input graph G.

Basic Observations. With respect to processing a reachability Query(\( s, t \)) in a graph G for an arbitrary pair of vertices \( s \ne t \in V \), the following basic observations are immediate and have in part also been noted elsewhere [22]:

(B1)

If s is a sink or t is a source, then \( s \not\rightarrow ^*t \).

(B2)

If s and t belong to different WCCs of G, then \( s \not\rightarrow ^*t \).

(B3)

If s and t belong to the same SCC of G, then \( s \rightarrow ^*t \).

(B4)

If \( \tau (\mathcal {S}(t)) \lt \tau (\mathcal {S}(s)) \) for any topological ordering \( \tau \) of \( {G}^{{\mathrm{C}}} \), then \( s \not\rightarrow ^*t \).

As mentioned above, the precomputations necessary for Observations (B2) and (B3) can be performed in \( \mathcal {O}(n+m) \) time. Note, however, that Observations (B3) and (B4) together are equivalent to asking whether \( s \rightarrow ^*t \): If \( s \rightarrow ^*t \) and \( \mathcal {S}(s) \ne \mathcal {S}(t) \), then for every topological ordering \( \tau \), \( \tau (\mathcal {S}(s)) \lt \tau (\mathcal {S}(t)) \). Otherwise, if \( s \not\rightarrow ^*t \), a topological ordering \( \tau \) with \( \tau (\mathcal {S}(t)) \lt \tau (\mathcal {S}(s)) \) can be computed by topologically sorting \( {G}^{{\mathrm{C}}} \cup \lbrace (\mathcal {S}(t),\mathcal {S}(s))\rbrace \). Hence, the precomputations necessary for Observation (B4) would require solving the Reachability problem for all pairs of vertices already. Furthermore, a DAG can have exponentially many different topological orderings. In consequence, weaker forms are employed, such as the following [22, 38, 39] (see also Section 4):

(B5)

If \( \mathcal {F}(\mathcal {S}(t)) \lt \mathcal {F}(\mathcal {S}(s)) \) w. r. t. \( {G}^{{\mathrm{C}}} \), then \( s \not\rightarrow ^*t \).

(B6)

If \( \mathcal {B}(\mathcal {S}(s)) \lt \mathcal {B}(\mathcal {S}(t)) \) w. r. t. \( {G}^{{\mathrm{C}}} \), then \( s \not\rightarrow ^*t \).

Assumptions. Following the convention introduced in the preceding work [3, 22, 38, 39] (cf. Section 3), we only consider Reachability on DAGs from here on and implicitly assume that the condensation, if necessary, has already been computed and Observation (B3) has been applied. For better readability, we also drop the use of \( \mathcal {S}(\cdot) \).


3 RELATED WORK

A large amount of research on reachability indices has been conducted. Existing approaches can roughly be put into three categories: compression of the transitive closure [2, 13, 14, 15, 32, 34], hop-labeling-based algorithms [4, 5, 16, 26, 37], as well as pruned search [18, 22, 28, 31, 33, 36, 38, 39]. As Merz and Sanders [22] noted, the first category gives very good query times for small networks but does not scale well to large networks (which are the focus of this work). Therefore, we do not consider approaches based on this technique more closely. Hop labeling algorithms typically build paths from labels that are stored for each vertex. For example, in 2-hop labeling, each vertex stores two sets containing vertices it can reach in the given graph as well as in the reverse graph. A query can then be reduced to the set intersection problem. Pruned-search-based approaches precompute information to speed up queries by pruning the search.
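To illustrate the 2-hop idea (a generic sketch with illustrative names, not the exact labeling of PPL or TF): assuming a valid 2-hop cover, i.e., each vertex v stores a sorted label \( L_{out}(v) \) of vertices it can reach and a sorted label \( L_{in}(v) \) of vertices that reach it such that \( s \rightarrow ^*t \) if and only if \( L_{out}(s) \cap L_{in}(t) \ne \emptyset \), a query reduces to a merge-style set intersection test.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Query side of a generic 2-hop reachability labeling (sketch): given sorted
// labels L_out(s) and L_in(t), report whether they share a common hop vertex.
bool hopQuery(const std::vector<uint32_t>& outLabelS,   // sorted L_out(s)
              const std::vector<uint32_t>& inLabelT) {  // sorted L_in(t)
    std::size_t i = 0, j = 0;
    while (i < outLabelS.size() && j < inLabelT.size()) {
        if (outLabelS[i] == inLabelT[j]) return true;   // common hop vertex found
        if (outLabelS[i] < inLabelT[j]) ++i; else ++j;
    }
    return false;
}
```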

Due to the sheer volume of previous work, it is impossible to compare against all of it. We mostly follow the methodology of Merz and Sanders [22] and focus on five recent techniques. The two most recent hop-labeling-based approaches are TF [3] and PPL [37]. In the pruned search category, the three most recent approaches are PReaCH [22], IP [36], and BFL [28]. We now go into more detail:

TF. The work by Cheng et al. [3] uses a data structure called topological folding. On the condensation DAG, the authors define a topological structure that is obtained by recursively folding the structure in half each time. Using this topological structure, the authors create labels that help to quickly answer reachability queries.

PPL. Yano et al. [37] use pruned landmark labeling and pruned path labeling as labels for their reachability queries. In general, the method follows the 2-hop labeling technique mentioned above, which stores sets of vertices for each vertex v and reduces queries to the set intersection problem. Their techniques are able to reduce the size of the stored labels and hence improve query time and space consumption.

PReaCH. Merz and Sanders [22] apply the approach of contraction hierarchies [9, 10], known from shortest-path queries, to the reachability problem. The method first tries to answer queries by using pruning and precomputed information such as topological levels (Observations (B5) and (B6)). It adopts and improves techniques from GRAIL [38, 39] for that task, which is distinctly outperformed by PReaCH in the subsequent experiments. Should these techniques not answer the query, PReaCH instead performs a bidirectional BFS using the computed hierarchy, i.e., for a Query(\( s, t \)), the BFS only considers neighboring vertices with larger topological levels and proceeds along the contraction hierarchy. The overall approach is simple and guarantees linear space and near-linear preprocessing time.

IP. Wei et al. [36] use a randomized labeling approach by applying independent permutations on the labels. Contrary to other labeling approaches, IP checks for set containment instead of set intersection. It tries to answer negative queries by checking whether at least one vertex is contained in only one of the two sets, where each set can consist of at most \( k_{\texttt {IP}} \) vertices. If this test fails, IP checks another label, which contains precomputed reachability information from the \( h_{\texttt {IP}} \) vertices with largest out-degree, and otherwise falls back to DFS.

BFL. Su et al. [28] propose a labeling method which is based on IP, but additionally uses Bloom filters for storing and comparing labels, which are then used to answer negative queries. As parameters, BFL accepts \( s_{\texttt {BFL}} \) and \( d_{\texttt {BFL}} \), where \( s_{\texttt {BFL}}{} \) denotes the length of the Bloom filters stored for each vertex and \( d_{\texttt {BFL}}{} \) controls the false positive rate. By default, \( d_{\texttt {BFL}}{} = 10\cdot s_{\texttt {BFL}}{} \).

Table 1 summarizes the time and space complexities of the new algorithm O’Reach that we introduce in Section 4 as well as of all algorithms mentioned in this article except for TF, whose theoretical complexity expressions are bulky and quite complex themselves.

Table 1. Time and Space Complexity of Reachability Algorithms

Algorithm | Initialization Time | Index Size (\( {\text{Byte}} \)) | Query Time | Query Space
BFS/DFS | \( \mathcal {O}(1) \) | 0 | \( \mathcal {O}(n+m) \) | \( \mathcal {O}(n) \)
Full matrix | \( \mathcal {O}(n \cdot (n+m)) \) | \( n^2/8 \) | \( \mathcal {O}(1) \) | \( \mathcal {O}(1) \)
PPL [37] | \( \mathcal {O}(n\log n+m) \) | \( \mathcal {O}(n \log n) \) | \( \mathcal {O}(\log n) \) | \( \mathcal {O}(\log n) \)
PReaCH [22] | \( \mathcal {O}(m+n\log n) \) | \( 56n \) | \( \mathcal {O}(1) \) / \( \mathcal {O}(n+m) \) | \( \mathcal {O}(n) \)
IP(\( k_{\texttt {IP}} \), \( h_{\texttt {IP}} \)) [36] | \( \mathcal {O}((k_{\texttt {IP}} + h_{\texttt {IP}})(n+m)) \) | \( \mathcal {O}((k_{\texttt {IP}} + h_{\texttt {IP}})n) \) | \( \mathcal {O}(k_{\texttt {IP}}) \) / \( \mathcal {O}(k_{\texttt {IP}}\cdot n \cdot \rho ^2) \) | \( \mathcal {O}(n) \)
BFL(\( s_{\texttt {BFL}} \)) [28] | \( \mathcal {O}(s_{\texttt {BFL}}\cdot (n+m)) \) | \( 2\lceil \frac{s_{\texttt {BFL}}}{8}\rceil n \) | \( \mathcal {O}(s_{\texttt {BFL}}) \) / \( \mathcal {O}(s_{\texttt {BFL}}\cdot n + m) \) | \( \mathcal {O}(n) \)
O’Reach(\( d, k, p \)) (Section 4) | \( \mathcal {O}((d+kp)(n+m)) \) | \( (12 + 12 d+ 2 \lceil \frac{k}{8}\rceil)n \) | \( \mathcal {O}(k+ d+ 1) \) / \( \mathcal {O}(n+m) \) | \( \mathcal {O}(n) \)

Parameters: \( k_{\texttt {IP}} \): #permutations, \( h_{\texttt {IP}} \): #vertices with precomputed \( \textsf {R}^{+}(\cdot) \), \( s_{\texttt {BFL}} \): size of Bloom filter (bits), \( \rho \): reachability in G, d: #topological orderings, k: #supportive vertices, p: #candidates per supportive vertex.


4 O’REACH: FASTER REACHABILITY VIA OBSERVATIONS

In this section, we propose our new algorithm O’Reach, which is based on a set of simple yet powerful observations that enable us to answer a large proportion of reachability queries in constant time, and which brings together techniques from both hop labeling and pruned search. Unlike regular hop-labeling approaches, however, its initialization time is linear. As a further plus, our algorithm is configurable via multiple parameters and extremely space-efficient: the most space-saving configuration that uses all features and could handle all instances used in Section 5 requires an index of only \( 38n \,{\text{Byte}} \).

Overview. The hop labeling technique used in our algorithm is inspired by a recent result for experimentally faster reachability queries in a dynamic graph by Hanauer et al. [11]. The idea is to speed up reachability queries based on a selected set of so-called supportive vertices, for which complete out- and in-reachability is maintained explicitly. This information is used in three simple observations, which allow matching queries to be answered in constant time. In our algorithm, we transfer this idea to the static setting. We further increase the ratio of queries answerable in constant time by a new perspective on topological orderings and their conflation with DFS, which provides additional reachability information. In case we cannot answer a query via an observation, we fall back to either a pruning bidirectional BFS or one of the existing algorithms.

In the following, we switch the order and first discuss topological orderings in depth, followed by our adaptation of supportive vertices. For both parts, consider a reachability Query(\( s, t \)) for two vertices \( s, t \in V \) with \( s \ne t \).

4.1 Extended Topological Orderings

Taking up the observation that topological orderings can be used to answer a reachability query decisively negative, we first investigate how Observation (B4) can be used most effectively in practice. Before we dive deeper into this subject, let us briefly review some facts concerning topological orderings and reachability in general.

Theorem 4.1.

Let \( \mathcal {N}(\tau) \subseteq \mathcal {N} \) denote the set of negative queries a topological ordering \( \tau \) can answer, i.e., the set of all \( (s, t) \in \mathcal {N} \) such that \( \tau (t) \lt \tau (s) \), and let \( \rho ^{{-}}(\tau) = |\mathcal {N}(\tau)| / |\mathcal {N}| \) be the answerable negative query ratio.

(i)

The reachability in any DAG is at most 50%. If it is exactly 50%, the topological ordering is unique.

(ii)

Any topological ordering \( \tau \) witnesses the non-reachability between exactly 50% of all pairs of distinct vertices. Therefore, \( \rho ^{{-}}(\tau) \ge 50\% \).

(iii)

Every topological ordering of the same DAG can answer the same ratio of all negative queries via Observation (B4), i.e., for two topological orderings \( \tau \), \( \tau ^{\prime } \): \( \rho ^{{-}}(\tau) = \rho ^{{-}}(\tau ^{\prime }) \).

(iv)

For two different topological orderings \( \tau \ne \tau ^{\prime } \) of a DAG, \( \mathcal {N}(\tau) \ne \mathcal {N}(\tau ^{\prime }) \).

Proof.

Let G be a DAG.

(i)

As G is acyclic, there is at least one topological ordering \( \tau \) of G. Then, for every edge \( (u, v) \) of G, \( \tau (u) \lt \tau (v) \), which implies that each vertex u can reach at most all those vertices \( w \ne u \) with \( \tau (u) \lt \tau (w) \). Consequently, a vertex u with \( \tau (u) = i \) can reach at most \( n-i-1 \) other vertices (note that \( i \ge 0 \)). Thus, the reachability in G is at most \( \frac{1}{n(n-1)}\sum _{i=0}^{n-1} (n-i-1) = \frac{1}{n(n-1)}\sum _{j=0}^{n-1} j = \frac{n(n-1)}{n(n-1)\cdot 2} = \frac{1}{2} \). Conversely, assume that the reachability in G is \( \frac{1}{2} \). Then, each vertex u with \( \tau (u) = i \) reaches exactly all \( n-i-1 \) other vertices ordered after it, which implies that there exists no other topological ordering \( \tau ^{\prime } \) with \( \tau ^{\prime }(u) \gt \tau (u) \). By induction on i, the topological ordering of G is unique.

(ii)

Let \( \tau \) be an arbitrary topological ordering of G. Then, each vertex u with \( \tau (u) = i \) certainly cannot reach those vertices v with \( \tau (v) \lt \tau (u) \). Hence, \( \tau \) witnesses the non-reachability of exactly \( \sum _{i=1}^{n-1} i = \frac{n(n-1)}{2} \) pairs of distinct vertices.

(iii)

As Observation (B4) corresponds exactly to the non-reachability between those pairs of vertices witnessed by the topological ordering, the claim follows directly from (ii).

(iv)

As \( \tau \ne \tau ^{\prime } \), there is at least one \( i \in \mathbb {N}_0 \) such that \( \tau (u) = i = \tau ^{\prime }(v) \) and \( u \ne v \). Let \( j = \tau (v) \). If \( j \gt i \), the number of non-reachabilities from v to another vertex witnessed by \( \tau \) exceeds the number of those witnessed by \( \tau ^{\prime } \), and falls behind it otherwise. In both cases, the difference in numbers immediately implies a difference in the set of vertex pairs, which proves the claim. \( \qedhere \)

In consequence, it is pointless to look for one particularly good topological ordering. Instead, to get the most out of Observation (B4), we need topological orderings whose sets of answerable negative queries differ greatly, such that their union covers a large fraction of \( \mathcal {N} \). Note that both forward and backward topological levels each represent the set of topological orderings that can be obtained by ordering the vertices in blocks grouped by their level and arbitrarily permuting the vertices in each block. Different algorithms [6, 19, 29] for computing a topological ordering in linear time have been proposed over the years, with Kahn’s algorithm [19] in combination with a queue being one that always yields a topological ordering represented by forward topological levels. We, therefore, complement the forward and backward topological levels by stack-based approaches, as in Kahn’s algorithm [19] in combination with a stack or Tarjan’s DFS-based algorithm [29] for computing the SCCs of a graph, which as a by-product also yields a topological ordering of the condensation. To diversify the set of answerable negative queries further, we additionally randomize the order in which vertices are processed in case of ties and also compute topological orderings on the reverse graph, in analogy to backward topological levels.

We next show how, with a small extension, the stack-based topological orderings mentioned above can be used to additionally answer positive queries. To keep the description concise, we concentrate on Tarjan’s algorithm [29] in the following and reduce it to the part relevant for obtaining a topological ordering of a DAG. In short, the algorithm starts a DFS at an arbitrary vertex \( s \in S \), where \( S \subseteq V \) is a given set of vertices to start from. Whenever it visits a vertex v, it marks v as visited and recursively visits all unvisited vertices in its out-neighborhood. On return, it prepends v to the topological ordering. A loop over \( S = V \) ensures that all vertices are visited. Note that although the vertices are visited in DFS order, the topological ordering is different from a DFS numbering as it is constructed “from back to front” and corresponds to a reverse sorting according to what is also called finishing time of each vertex.

To answer positive queries, we exploit the invariant that when visiting a vertex v, all yet unvisited vertices reachable from v will be prepended to the topological ordering prior to v being prepended. Consequently, v can certainly reach all vertices in the topological ordering between v and, exclusively, the vertex w that was at the front of the topological ordering when v was visited. Let x denote the vertex preceding w in the final topological ordering, i.e., the vertex with the largest index that was reached recursively from v. For a topological ordering \( \tau \) constructed in this way, we call \( \tau (x) \) the high index of v and denote it with \( \tau _H(v) \). Furthermore, v may be able to also reach w and vertices beyond, which occurs if \( v \rightarrow ^*y \) for some vertex y, but y had already been visited earlier. We, therefore, additionally track the max index, the largest index of any vertex that v can reach, and denote it with \( \tau _{X}(v) \). Figure 1(a) shows how to compute an extended topological ordering with both high and max indices in pseudo-code and highlights our extensions. Compared to Tarjan’s original version [29], the running time remains unaffected by our modifications and is still in \( \mathcal {O}(n+m) \).

Fig. 1. (a) Extended Topological Sorting. (b) Three extended topological orderings of two graphs: The labels correspond to the order in the start set S. If the label is empty, the vertex need not be in S or can have any larger number. The brackets to the left show the range \( [\tau (v), \tau _H(v)] \), the braces to the right the range \( [\tau (v), \tau _{X}(v)] \).

Note that neither max nor high indices yield an ordering of V: Every vertex that is visited recursively starting from v and before vertex x with \( \tau (x) = \tau _H(v) \), inclusively, has the same high index as v, and the high index of each vertex in a graph consisting of a single path, e.g., would be \( n-1 \). In particular, neither max nor high index forms a DFS numbering, and both differ in definition and use from the DFS finishing times \( \hat{\phi } \) used in PReaCH, where a vertex v can certainly reach vertices with DFS number up to \( \hat{\phi } \) and certainly none beyond. Conversely, v may be able to also reach vertices with a smaller DFS number than its own, which cannot occur in a topological ordering.

If ExtendedTopSort is run on the reverse graph, it yields a topological ordering \( \tau ^{\prime } \) and high and max indices \( \tau ^{\prime }_H \) and \( \tau ^{\prime }_{X} \), such that reversing \( \tau ^{\prime } \) yields again a topological ordering \( \tau \) of the original graph. Furthermore, \( \tau _L(v) := n - 1 - \tau ^{\prime }_H(v) \) is a low index for each vertex v, which denotes the smallest index of a vertex in \( \tau \) that can certainly reach v, i.e., the out-reachability of v is replaced by in-reachability. Analogously, \( \tau _{N}(v) := n - 1 - \tau ^{\prime }_{X}(v) \) is a min index in \( \tau \) and no vertex u with \( \tau (u) \lt \tau _{N}(v) \) can reach v.

The following observations show how such an extended topological ordering \( \tau \) can be used to answer both positive and negative reachability queries:

(T1)

If \( \tau (s) \le \tau (t) \le \tau _H(s) \), then \( s \rightarrow ^*t \).

(T2)

If \( \tau (t) \gt \tau _{X}(s) \), then \( s \not\rightarrow ^*t \).

(T3)

If \( \tau (t) = \tau _{X}(s) \), then \( s \rightarrow ^*t \).

(T4)

If \( \tau _L(t) \le \tau (s) \le \tau (t) \), then \( s \rightarrow ^*t \).

(T5)

If \( \tau (s) \lt \tau _{N}(t) \), then \( s \not\rightarrow ^*t \).

(T6)

If \( \tau (s) = \tau _{N}(t) \), then \( s \rightarrow ^*t \).
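A small sketch of how a single extended topological ordering answers a query in constant time via Observations (B4) and (T1)–(T6); each vertex carries three integers per ordering as described in Section 4.3, field names are illustrative, and which group of tests applies depends on whether the ordering was built on the forward or on the reverse graph.

```cpp
#include <cstdint>

enum class Tri { Yes, No, Unknown };

// Per-vertex data of one extended topological ordering (illustrative layout).
// Forward-built ordering: aux1 = tau_H, aux2 = tau_X.
// Reverse-built ordering:  aux1 = tau_L, aux2 = tau_N.
struct OrderEntry { uint32_t tau, aux1, aux2; };

inline Tri forwardOrderTest(const OrderEntry& s, const OrderEntry& t) {
    if (t.tau < s.tau)    return Tri::No;   // (B4)
    if (t.tau <= s.aux1)  return Tri::Yes;  // (T1): tau(s) <= tau(t) <= tau_H(s)
    if (t.tau > s.aux2)   return Tri::No;   // (T2): tau(t) >  tau_X(s)
    if (t.tau == s.aux2)  return Tri::Yes;  // (T3): tau(t) == tau_X(s)
    return Tri::Unknown;
}

inline Tri reverseOrderTest(const OrderEntry& s, const OrderEntry& t) {
    if (t.tau < s.tau)    return Tri::No;   // (B4)
    if (t.aux1 <= s.tau)  return Tri::Yes;  // (T4): tau_L(t) <= tau(s) <= tau(t)
    if (s.tau < t.aux2)   return Tri::No;   // (T5): tau(s) <  tau_N(t)
    if (s.tau == t.aux2)  return Tri::Yes;  // (T6): tau(s) == tau_N(t)
    return Tri::Unknown;
}
```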

Recall that by definition, \( \tau (s) \le \tau _H(s) \le \tau _{X}(s) \) and \( \tau _{N}(t) \le \tau _L(t) \le \tau (t) \). Figure 1(b) depicts three examples of extended topological orderings. In contrast to negative queries, not every extended topological ordering is equally effective in answering positive queries, and it can be arbitrarily bad, as shown in the extremes on the left (worst) and at the center (best) of Figure 1(b):

Theorem 4.2.

Let \( \mathcal {P}(\tau) \subseteq \mathcal {P} \) be the set of positive queries an extended topological ordering \( \tau \) can answer and let \( \rho ^{{+}}(\tau) = |\mathcal {P}(\tau)| / |\mathcal {P}| \) be the answerable positive query ratio. Then, \( 0 \le \rho ^{{+}}(\tau) \le 1 \).

Instead, the effectiveness of an extended topological ordering depends positively on the size of the ranges \( \left[\tau (v), \tau _H(v)\right] \) and \( \left[\tau _L(v), \tau (v)\right] \), and negatively on \( \left[\tau _H(v), \tau _{X}(v)\right] \) and \( \left[\tau _{N}(v), \tau _L(v)\right] \) which in turn depend on the recursion depths during construction and the order of recursive calls. The former two can be maximized if the first, non-recursive call to Visit() in line 4 in ExtendedTopSort always has a source as its argument, i.e., if the algorithm’s parameter S corresponds to the set of all sources. Clearly, this still guarantees that every vertex is visited.

In addition to the forward and backward topological levels, O’Reach thus computes a set of d extended topological orderings starting from sources, where d is a tuning parameter, and \( d/2 \) of them are obtained via the reverse graph. It then applies Observation (B4) as well as Observations (T1)–(T6) to all extended topological orderings.

4.2 Supportive Vertices

We now show how to apply and improve the idea of supportive vertices in the static setting. A vertex v is supportive if the sets of vertices that v can reach and that can reach v, \( {R^{+}}(v) \) and \( {R^{-}}(v) \), respectively, have been precomputed and membership queries can be performed in sublinear time. We can then answer reachability queries using the following simple observations [11]:

(S1)

If \( s\in {R^{-}}(v) \) and \( t\in {R^{+}}(v) \) for any \( v \in V \), then \( s \rightarrow ^*t \).

(S2)

If \( s\in {R^{+}}(v) \) and \( t\not\in {R^{+}}(v) \) for any \( v \in V \), then \( s \not\rightarrow ^*t \).

(S3)

If \( s\not\in {R^{-}}(v) \) and \( t\in {R^{-}}(v) \) for any \( v \in V \), then \( s \not\rightarrow ^*t \).

To apply these observations, our algorithm selects a set of k supportive vertices during the initialization phase. In contrast to the original use scenario in the dynamic setting, where the graph changes over time and it is difficult to choose “good” supportive vertices that can help to answer many queries, the static setting leaves room for further optimizations here: With respect to Observation (S1), we consider a supportive vertex v “good” if \( |{R^{+}}(v)| \cdot |{R^{-}}(v)| \) is large as it maximizes the possibility that \( s \in {R^{-}}(v) \wedge t \in {R^{+}}(v) \). With respect to Observation (S2) and (S3), we expect a “good” supportive vertex to have out- or in-reachability sets, respectively, of size close to \( \frac{n}{2} \), i.e., when \( |{R^{+}}(v)|\cdot |V \setminus {R^{+}}(v)| \) or \( |{R^{-}}(v)|\cdot |V \setminus {R^{-}}(v)| \), respectively, are maximal. Furthermore, to increase total coverage and avoid redundancy, the set of queries Query(\( s, t \)) covered by two different supportive vertices should ideally overlap as little as possible.

O’Reach takes a parameter k specifying the number of supportive vertices to pick. Intuitively speaking, we expect vertices in the topological “mid-levels” to be better candidates than those at the ends, as their out- and in-reachabilities (or non-reachabilities) are likely to be more balanced. Furthermore, if all vertices on one forward (backward) level i were supportive, then every Query(\( s, t \)) with \( \mathcal {F}(s) \lt i \lt \mathcal {F}(t) \) (\( \mathcal {B}(t) \lt i \lt \mathcal {B}(s) \)) could be answered using only Observation (S1). As finding a “perfect” set of supportive vertices is computationally expensive and we strive for linear preprocessing time, we experimentally evaluated different strategies for the selection process. Due to page limits, we only describe the most successful one: A forward (backward) level i is called central, if \( \frac{1}{5}L_{\max } \le i \le \frac{4}{5}L_{\max } \), where \( L_{\max } \) is the maximum topological level. A level i is called slim if there are at most h vertices having this level, where h is a parameter to O’Reach. We first compute a set of candidates of size at most \( k\cdot p \) that contains all vertices on the slim forward or backward levels, arbitrarily discarding vertices as soon as the threshold \( k\cdot p \) is reached. p is another parameter to O’Reach and together with k controls the size of the candidate set. If the threshold is not reached, we fill up the set of candidates by picking the missing number of vertices uniformly at random from all other vertices whose forward level is central. In the next step, the out- and in-reachabilities of all candidates are obtained and the k vertices v with the largest \( |{R^{+}}(v)| \cdot |{R^{-}}(v)| \) are chosen as supportive vertices. This strategy primarily optimizes for Observation (S1), but worked better in experiments than strategies that additionally tried to optimize for Observations (S2) and (S3). The time complexity of this process is in \( \mathcal {O}(kp(n+m) + kp\log (kp)) \).
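The final selection stage can be sketched as follows, assuming the out- and in-reachability sizes of the candidates have already been obtained (e.g., by one forward and one backward BFS per candidate) and are stored index-aligned with the candidate list; names are illustrative.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Keep the k candidates maximizing |R^+(v)| * |R^-(v)| (sketch).
std::vector<uint32_t> pickSupportive(const std::vector<uint32_t>& candidates,
                                     const std::vector<uint64_t>& outReachSize,  // |R^+(v)| per candidate
                                     const std::vector<uint64_t>& inReachSize,   // |R^-(v)| per candidate
                                     std::size_t k) {
    std::vector<std::pair<uint64_t, uint32_t>> scored;
    for (std::size_t i = 0; i < candidates.size(); ++i)
        scored.push_back({ outReachSize[i] * inReachSize[i], candidates[i] });
    std::sort(scored.begin(), scored.end(),
              [](const auto& a, const auto& b) { return a.first > b.first; });
    std::vector<uint32_t> chosen;
    for (std::size_t i = 0; i < std::min(k, scored.size()); ++i)
        chosen.push_back(scored[i].second);
    return chosen;
}
```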

We remark that this is a general-purpose approach that has shown to work well across different types of instances, albeit possibly at the expense of an increased initialization time. It seems natural that more specialized routines for different graph classes can improve both running time and coverage.

4.3 The Complete Algorithm

Given a graph G and a sequence of queries Q, we summarize in the following how O’Reach proceeds. During initialization, it performs the following steps:

Step 1:

Compute the WCCs.

Step 2:

Compute forward/backward topological levels.

Step 3:

Obtain d random extended topological orderings.

Step 4:

Pick k supportive vertices, compute \( \textsf {R}^{+}(\cdot) \) and \( \textsf {R}^{-}(\cdot). \)

Steps 1 and 2 run in linear time. As shown in Sections 4.1 and 4.2, the same applies to Steps 3 and 4, assuming that all parameters are constants. The required space is linear for all steps. The reachability index consists of the following information for each vertex v: one integer for the WCC, one integer each for \( \mathcal {F}(v) \) and \( \mathcal {B}(v) \), three integers for each of the d extended topological orderings \( \tau \) (\( \tau (v), \tau _H(v)/\tau _L(v), \tau _{X}(v)/\tau _{N}(v) \)), and two bits for each of the k supportive vertices, indicating its reachability to/from v. For graphs with \( n \le 2^{32} \), \( 4 \,{\text{Byte}} \) per integer suffice. Furthermore, we group the bits encoding the reachabilities to and from the supportive vertices, respectively, and represent them each by one suitably sized integer, e.g., using uint8_t (\( 8 \,\rm bit \)) for \( k\le 8 \) supportive vertices. As the smallest integer has at least \( 8 \,\rm bit \) on most architectures, we store \( {12 + 12 d{} + 2 \cdot \lceil \frac{k}{8}\rceil } \,{\text{Byte}} \) per vertex.
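For the configuration \( d= 4 \), \( k= 16 \) used in Section 5, the per-vertex index can be pictured as the following 64-Byte record (an illustrative layout with assumed field names, not necessarily the exact memory layout of the implementation).

```cpp
#include <cstdint>

// Per-vertex reachability index for d = 4, k = 16: 12 + 12d + 2*ceil(k/8) = 64 Byte.
struct VertexIndex {
    uint32_t wcc;            //  4 Byte: weakly connected component id
    uint32_t flevel, blevel; //  8 Byte: forward/backward topological level
    struct { uint32_t tau, aux1, aux2; } ord[4];  // 48 Byte: tau, tau_H/tau_L, tau_X/tau_N per ordering
    uint16_t rPlus;          //  2 Byte: bit i set iff supportive vertex i reaches this vertex
    uint16_t rMinus;         //  2 Byte: bit i set iff this vertex reaches supportive vertex i
};
// Holds on common ABIs without padding:
static_assert(sizeof(VertexIndex) == 64, "64 Byte per vertex for d = 4, k = 16");
```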

For each query Query(\( s, t \)), O’Reach tries to answer it using the observations in the order given below, which on the one hand has been optimized in preliminary experiments on a small subset of benchmark instances (see Section 5 for details) and on the other hand strives for a fair alternation between “positive” and “negative” observations to avoid overfitting. Note that all observation-based tests run in constant time. As soon as one of them can answer the query decisively, the result is returned immediately.

Test 1:

\( s = t \)?

Test 2:

topological levels (B5), (B6).

Test 3:

k supportive vertices, positive (S1).

Test 4:

first topological ordering (B4), (T1), (T2), (T3).

Test 5:

k supportive vertices, negative (S2), (S3).

Test 6:

remaining \( d-1 \) topological orderings (B4), (T1)/(T4), (T2)/(T5), (T3)/(T6).

Test 7:

different WCCs (B2).

Observe that the tests for Observations (S1)–(S3) can each be implemented easily using boolean logic, which allows for a concurrent test of all supportive vertices whose reachability information is encoded in one accordingly-sized integer: For Observation (S1), it suffices to test whether \( r^-(s) \wedge r^+(t) \gt 0 \), and \( r^+(s) \wedge \lnot r^+(t) \gt 0 \) and \( \lnot r^-(s) \wedge r^-(t) \gt 0 \) for Observations (S2) and (S3), where \( r^+ \) and \( r^- \) hold the respective forward and backward reachability information in the same order for all supportive vertices. Each test hence requires at most one comparison of two integers plus at most two elementary bit operations. Also, note that Observation (B1) is implicitly tested by Observations (B5) and (B6). Using the data structure described above, our algorithm requires at most one memory transfer for s and one for t for each Query(\( s, t \)) that is answerable by one of the observations. Note that there are more observations that allow identifying a negative query than a positive one, which is why we expect a more pronounced speedup for the former. However, as stated in Theorem 4.1, the reachability in DAGs is at most 50%, which justifies a bias towards an optimization for negative queries.
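A minimal sketch of these bit-parallel tests for \( k\le 16 \) supportive vertices, with illustrative names: \( r^-(x) \) has bit i set iff x can reach supportive vertex i, and \( r^+(x) \) has bit i set iff supportive vertex i can reach x.

```cpp
#include <cstdint>

enum class Tri { Yes, No, Unknown };

// Bit-parallel tests (S1)-(S3) over all k <= 16 supportive vertices at once (sketch).
inline Tri supportTest(uint16_t rMinusS, uint16_t rPlusS,
                       uint16_t rMinusT, uint16_t rPlusT) {
    if (rMinusS & rPlusT)                          return Tri::Yes; // (S1): s ->* v_i ->* t
    if (rPlusS & static_cast<uint16_t>(~rPlusT))   return Tri::No;  // (S2): some v_i reaches s but not t
    if (static_cast<uint16_t>(~rMinusS) & rMinusT) return Tri::No;  // (S3): t reaches some v_i that s does not
    return Tri::Unknown;                                            // undecided, continue with further tests
}
```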

If the query cannot be answered using any of these tests, we instead fall back to either another algorithm or a bidirectional BFS with pruning, which uses these tests for each newly encountered vertex v in a subquery Query(\( v, t \)) (forward step) or Query(\( s, v \)) (backward step). If a subquery can be answered decisively positive by a test, the bidirectional BFS can immediately answer Query(\( s, t \)) positively. Otherwise, if a subquery is answered decisively negative by a test, the encountered vertex v is no longer considered (pruning step). If the subquery could not be answered by a test, the vertex v is added to the queue as in a regular (bidirectional) BFS.
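A sketch of this fallback, where test(u, v) stands for the combined constant-time observation tests above and returns a decisive answer or "unknown"; the graph is assumed to be given as forward and backward adjacency lists, and names are illustrative.

```cpp
#include <cstdint>
#include <deque>
#include <functional>
#include <vector>

enum class Tri { Yes, No, Unknown };  // result of the constant-time observation tests

// Pruning bidirectional BFS fallback (sketch).
bool prunedBiBFS(const std::vector<std::vector<uint32_t>>& out,  // forward adjacency lists
                 const std::vector<std::vector<uint32_t>>& in,   // backward adjacency lists
                 uint32_t s, uint32_t t,
                 const std::function<Tri(uint32_t, uint32_t)>& test) {
    std::vector<uint8_t> seenF(out.size(), 0), seenB(out.size(), 0);
    std::deque<uint32_t> qF{ s }, qB{ t };
    seenF[s] = seenB[t] = 1;

    // Expands one vertex of one search direction; returns true iff s ->* t was established.
    auto step = [&](std::deque<uint32_t>& q, std::vector<uint8_t>& seenSelf,
                    std::vector<uint8_t>& seenOther,
                    const std::vector<std::vector<uint32_t>>& adj, bool forward) -> bool {
        uint32_t u = q.front(); q.pop_front();
        for (uint32_t v : adj[u]) {
            if (seenSelf[v]) continue;
            if (seenOther[v]) return true;                 // the two searches met
            Tri sub = forward ? test(v, t) : test(s, v);   // subquery Query(v,t) or Query(s,v)
            if (sub == Tri::Yes) return true;              // decisively positive: done
            if (sub == Tri::No) continue;                  // decisively negative: prune v
            seenSelf[v] = 1;
            q.push_back(v);
        }
        return false;
    };

    while (!qF.empty() && !qB.empty()) {
        if (step(qF, seenF, seenB, out, true)) return true;
        if (step(qB, seenB, seenF, in, false)) return true;
    }
    return false;  // one frontier exhausted without meeting the other: s cannot reach t
}
```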


5 EXPERIMENTAL EVALUATION

We evaluated our new algorithm O’Reach as a preprocessor to various recent state-of-the-art algorithms, listed below, against running these algorithms on their own. Furthermore, we use the pruned bidirectional BFS (pBiBFS) as an additional fallback solution. Our experimental study follows the methodology in [22] and comprises the algorithms PPL [37], TF [3], PReaCH [22], IP [36], and BFL [28]. Moreover, our evaluation is the first that directly relates IP and BFL to PReaCH and studies the performance of IP and BFL separately for successful (positive) and unsuccessful (negative) reachability queries. For reasons of comparison, we also assess the query performance of a full reachability matrix by computing the transitive closure of the input graph entirely during initialization, storing it in a matrix using \( 1 \,\rm bit \) per pair of vertices, and answering each query by a single memory lookup. We refer to this algorithm simply as Matrix. As the reachability in DAGs is small and cache locality can influence lookup times, we also experimented with various hash set implementations. However, none was faster or more memory-efficient than Matrix.

5.1 Setup and Methodology

We implemented O’Reach in C++14 with pBiBFS as a built-in fallback strategy. For PPL, TF, PReaCH, IP, and BFL, we used the original C++ implementation in each case. All source code was compiled with GCC 7.5.0 and full optimization (-O3). The experiments were run on a Linux machine under Ubuntu 18.04 with kernel 4.15 on four AMD Opteron 6174 CPUs clocked at \( 2.2 \,\mathrm{G}\mathrm{Hz} \) with \( 512 \,\mathrm{k}{\rm B} \) and \( 6 \,\mathrm{M}{\rm B} \) L2 and L3 cache, respectively, and 12 cores per CPU. Overall, the machine has 48 cores and a total of \( 256 \,\mathrm{G}{\rm B} \) of RAM. Unless indicated otherwise, each experiment was run sequentially and exclusively on one processor and its local memory. As non-local memory accesses incur a much higher cost, an exception to this rule was only made for Matrix, where we would otherwise have been able to run only twelve instead of 29 instances. We also parallelized the initialization phase for Matrix, where the transitive closure is computed, using 48 threads. However, all queries were processed sequentially.

To counteract artifacts of measurement and accuracy, we ran each algorithm five times on each instance and in general use the median for the evaluation. As O’Reach uses randomization during initialization, we instead report the average running time over five different seeds. For IP and BFL, which are randomized in the same way, but do not accept a seed, we just give the average over five repetitions. We note that also taking the median instead or increasing the number of repetitions or seeds does not change the overall picture.

Instances. To facilitate comparability, we adopt the instances used in the articles introducing PReaCH [22] and TF [3], which overlap with those used to evaluate IP [36] and BFL [28], and which are available either from the GRAIL code repository or the Stanford Network Analysis Platform SNAP [21]. Furthermore, we extended the set of benchmark graphs by further instance sizes and Delaunay graphs. Table A.4 provides a detailed overview. As we only consider DAGs, all instances are condensations of their respective originals, if they were not acyclic already. We also adopt the grouping of the instances as in [22, 39] and provide only a short description of the different sets in the following.

Kronecker. These instances were generated by the RMAT generator for the Graph500 benchmark [23] and oriented acyclically from smaller to larger node ID. The name encodes the number of vertices \( 2^i \) as kron_logni.

Random. Graphs generated according to the Erdős–Rényi model \( G(n, m) \) and oriented acyclically from smaller to larger node ID. The name encodes \( n=2^i \) and \( m=2^j \) as randni-j.

Delaunay. Delaunay graphs from the 10th DIMACS Challenge [1, 8]. delaunay_ni is a Delaunay triangulation of \( 2^{i} \) random points in the unit square.

Large real. Introduced in [39], these instances represent citation networks (citeseer.scc, citeseerx, cit-Patents), a taxonomy graph (go-uniprot), as well as excerpts from the RDF graph of a protein database (uniprotm22, uniprotm100, uniprotm150).

Small real dense. Among these instances, introduced in [17], are again citation networks (arXiv, pubmed_sub, citeseer_sub), a taxonomy graph (go_sub), as well as one obtained from a semantic knowledge database (yago_sub).

Small real sparse. These instances were introduced in [18] and represent XML documents (xmark, nasa), metabolic networks (amaze, kegg), or originate from pathway and genome databases (all others).

SNAP. The e-mail network graph (e-mail-EuAll), peer-to-peer network (p2p-Gnutella31), social network (soc-LiveJournal1), web graph (web-Google), as well as the communication network (wiki-Talk) are part of SNAP and were first used in [3].

Queries. Following the methodology of [22], we generated three sets of 100,000 queries each: positive, negative, and random. Each set consists of random queries, which were generated by picking two vertices uniformly at random and filtering out negative or positive queries for the positive and negative query sets, respectively. The fourth query set, mixed, is a randomly shuffled union of all queries from positive and negative and hence contains 200,000 pairs of vertices. As the order of the queries within each set had an observable effect on the running time due to caching effects and memory layout, we randomly shuffled every query set five times and used a different permutation for each repetition of an experiment to ensure equal conditions for all algorithms.

5.2 Experimental Results

We ran O’Reach with \( k= 16 \) supportive vertices, picked from 1,200 candidates (\( p= 75 \), \( h=8 \)) and \( d= 4 \) extended topological orderings. We ran IP with the two configurations used also by the authors [36] and refer to the resulting algorithms as IP(s) (sparse, \( h_{\texttt {IP}}{}= k_{\texttt {IP}}{} = 2 \)) and IP(d) (dense, \( h_{\texttt {IP}}{}= k_{\texttt {IP}}{} = 5 \)). Similarly, we evaluated BFL [28] with configuration sparse as BFL(s) (\( s_{\texttt {BFL}}{}=64 \)) and dense as BFL(d) (\( s_{\texttt {BFL}}{}=160 \)), following the presets given by the authors.

Average query times. Table A.5 lists the average time per query for the query sets negative and positive. All missing values are due to a memory requirement of more than \( 32 \,\mathrm{G}{\rm B} \) (TF) or \( 256 \,\mathrm{G}{\rm B} \) (Matrix). For each instance and query set, the running time of the fastest algorithm is printed in bold. If Matrix was fastest, the running time of the second-best algorithm is also highlighted. Besides Matrix, the table shows the running times of PReaCH, PPL, IP(d), and BFL(d) alone as well as multiple versions of O’Reach: one with a pruned bidirectional BFS as fallback (O’R +pBiBFS) as well as one per competitor (O’R +...), where O’Reach was run without fallback and the queries left unanswered were fed to the competitor. Analogously, the running times for IP(s), BFL(s), and TF alone and as a fallback for O’Reach are given in Table A.7.

Our results by and large confirm the performance comparison of PReaCH, PPL, and TF conducted by Merz and Sanders [22]. PReaCH was the fastest on three out of five Kronecker graphs for the negative query set, once beaten by O’R +PReaCH and O’R +PPL each, whereas PPL and O’R +PPL dominated all others on the positive query set in this class as well as on three of the five random graphs, while O’R +TF was slightly faster on the other two. In contrast to the study in [22], TF is outperformed slightly by PPL on random instances for the positive query set. PReaCH was also the dominating approach on the small real sparse and SNAP instances in the aforementioned study [22]. By contrast, it was outperformed on these classes here by O’Reach with almost any fallback on all instances for the positive query set, and by either IP(d) or BFL(s) on almost all instances for the negative query set. On the Delaunay and large real instances, BFL(s) often was the fastest algorithm on the set of negative queries. The results also reveal that BFL and in particular IP have a weak spot in answering positive queries. On average over all instances, O’R +PPL had the fastest average query time both for negative and positive queries.

Notably, Matrix was outperformed quite often, especially for queries in the set negative. This correlates with the fact that a large portion of these queries could be answered by constant-time observations (see also the detailed analysis of observation effectiveness below) and with the larger memory footprint of Matrix. Across all instances and seeds, more than \( 95 \,\% \) of all queries in this set could be answered by O’Reach directly. On the set positive, the average query time for Matrix was in almost all cases less than on the negative query set, which is explained by the small reachability of the instances and a resulting higher spatial locality and better cacheability of the few and naturally clustered one-entries in the matrix. Consequently, this effect was distinctly reduced for the mixed query set, as shown in Table A.6.

There are some instances where O’Reach had a fallback rate of over \( 90 \,\% \) for the positive query set, e.g., on cit-Patents, which is clearly reflected in the running time. Except for PPL, all algorithms had difficulties with positive queries on this instance. Conversely, the fallback rate on all uniprotenc_* instances and citeseer.scc, e.g., was \( 0 \,\% \). On average across all instances and seeds, O’Reach could answer over \( 70 \,\% \) of all positive queries by constant-time observations.

The results on the query sets random and mixed are similar and listed in Tables A.6 and A.8. Once again, O’R +PPL showed the fastest query time on average across all instances for both query sets. As the reachability in a DAG is low in general (see also Theorem 4.1) and particularly in the benchmark instances, the average query times for random resemble those for negative. On the other hand, the results for the mixed query set are more similar to those for the positive query set, as the relative differences in performance among the algorithms are more pronounced there. Table 2 compactly shows the average query time over all instances for each query set. Only PPL and O’R +PPL achieved an average query time of less than \( 1 \,\mathrm{\mu }\mathrm{s} \) (and even less than \( 0.35 \,\mathrm{\mu }\mathrm{s} \)).

Table 2. Average Query Time Per Algorithm and Query Set

Speedups by O’Reach. We next investigate the relative speedup of O’Reach with different fallback solutions over running only the fallback algorithms. Table A.9 lists the ratios of the average query time of each competitor algorithm run standalone divided by the average query time of O’Reach plus that algorithm as fallback, for all four query sets. A compact version is also given in Table 3. In the large majority of cases, using O’Reach as a preprocessor resulted in a speedup, except in case of negative or random queries for BFL and partially IP on the large real instances as well as for PReaCH and partially again IP on the small real sparse and SNAP instances. The largest speedup of around 105 could be achieved for BFL on kegg for random queries. The mean (geometric) speedup is at least \( 1.29 \) for all fallback algorithms on the query sets positive, random, and mixed, where the maximum was reached for IP(s) on positive queries with a factor of \( 4.21 \). Only for purely negative queries were IP(d) and BFL(s) a bit faster on their own in the mean values. Figure 2 gives some more insight into the distribution of the values and shows that the combination with O’Reach led to distinct speedups for all algorithms on a large majority of the instances on the positive and mixed query sets, and also for random. For the negative query set, the combination with O’Reach could in particular speed up the average query time for PPL on all instances, for TF on more than \( 75 \,\% \) of the instances, and for PReaCH and BFL(d) still on around half of the instances. In summary, given that the algorithms are often already faster than single memory lookups, the speedups achieved by O’Reach are quite high.

Fig. 2. Speedups achieved if O’Reach is combined with an algorithm. The boxes extend from the first to the third quartile, the whiskers show additional values beyond the box and within \( 1.5 \) times the interquartile range. Inside each box, the median is shown as a horizontal black bar. For positive, random, and mixed, where the maximum speedup was over 100 (positive, random) or over 45 (mixed), outliers are omitted for better readability (see also Table A.9). A red, dashed horizontal line marks a speedup of 1. Values above this line show where the combination with O’Reach makes the algorithm faster, values below this line show where the combination makes the algorithm slower. Note the different range on the y-axis for each query set.

Table 3. Mean Speedups with O’Reach Plus Fallback Over Pure Fallback Algorithm

Values greater than \( 1.00 \) are highlighted.

Initialization Times (Table A.10). On all graphs, BFL(s) had the fastest initialization time, followed by BFL(d) and PReaCH. For O’Reach, the overhead of computing the comparatively large out- and in-reachabilities of all 1,200 candidates for \( k=16 \) supportive vertices is clearly reflected in the running time on denser instances and can be reduced greatly if lower parameters are chosen, albeit at the expense of a slightly reduced query performance, e.g., for \( k=8 \). PPL often consumed a lot of time in this step, especially on denser instances, with a maximum of \( 2.6 \,\mathrm{h} \) on randn20-23.

Based on the average query time per instance, the minimum number of random queries necessary to amortize the additional investment in initialization time if O’Reach is run as preprocessor is between \( 9.6 \) thousand (O’R +BFL(d)) and 499 thousand (O’R +PReaCH). Counting cases where O’Reach could not achieve a speedup in the average query time as infinity, the median number of random queries required for amortization is between \( 2.5 \) million (O’R +BFL(d)) and 101 million (O’R +IP(d)). For the on average fastest algorithm, O’R +PPL, the initialization cost is recovered after 210 thousand (nasa) to \( 6.15 \) billion (kron_logn21) random queries, which equals about \( 0.77 \,\% \) (nasa) and \( 0.14 \,\% \) (kron_logn21) of all vertex pairs, respectively.

Effectiveness of Observations. We collected a vast amount of statistical data to perform an analysis of the effectiveness of the different observations used in O’Reach. To make the analysis more robust, we added additional seeds and increased the number to 25 here. For each observation, we maintained a separate counter, which was increased whenever a query could be answered by an observation. If an observation included multiple tests, as in case of those based on topological orderings ((B4), (T1)–(T6)) or on supportive vertices ((S1)–(S3)), the counter was only increased once per observation, even if, e.g., (B4) applied for two topological orderings or (S1) applied for multiple supportive vertices. We then obtained the average effectiveness for each observation as the mean ratio of the counter value over the number of queries we want to consider, taken over all seeds and all instances. The results are also shown in Table A.12.

First, we look only at fast queries, i.e., those queries that could be answered without a fallback. We increased the counter for all observations that could answer a query for this analysis, not just the first in order, which is why there may be overlaps (one query can be answered multiple times). Across all query sets, the most effective observation was the negative basic observation on topological orderings, (B4), which could answer \( 54 \,\% \) of all fast queries. As the average reachability in the random query set is very low, negative queries predominate in the overall picture. It thus does not come as a surprise that the most effective observation is a negative one. On the negative query set, it could answer even \( 84 \,\% \) of all fast queries. The negative observations second to (B4) in effectiveness were those looking at the forward and backward topological levels, Observations (B5) and (B6), which could answer around \( 74.5 \,\% \) each on the negative query set and around \( 47.5 \,\% \) of all fast queries. The observations using the max and min indices of extended topological orderings, (T2) and (T5), could answer \( 26 \,\% \) and \( 19 \,\% \) of the fast queries in the negative query set, and the observations based on supportive vertices, (S2) and (S3), \( 19 \,\% \) and \( 12 \,\% \), respectively.

After lowering the number of topological orderings from \( d= 4 \) to \( d= 2 \), (B4) was equally effective as (B5) and (B6), each of which could answer around \( 48 \,\% \) of all fast queries and \( 75 \,\% \) of those in the negative query set. Observe that decreasing d negatively affects the number of fast queries, which in turn leads to slightly increased ratios for (B5) and (B6). For Observations (T2) and (T5), the effectiveness was reduced to \( 21 \,\% \) and \( 16 \,\% \) on the negative query set, and to \( 13 \,\% \) and \( 10 \,\% \) across all query sets.

The most effective positive observation, and the second-best among all query sets, was the supportive-vertices-based Observation (S1), which could answer around \( 25 \,\% \) of all fast queries and \( 66 \,\% \) of those in the positive query set. It was followed by the observations using the high and low indices, (T1) and (T4), with \( 21 \,\% \) and \( 23 \,\% \) effectiveness on the positive query set and around \( 7.5 \,\% \) across all query sets. The remaining two, (T3) and (T6), could answer \( 10 \,\% \) and \( 5 \,\% \) on the positive set.
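
The supportive-vertices-based checks can be illustrated as follows. The sketch assumes that, for each supportive vertex v, the index stores one bit per vertex u encoding “u reaches v” and one encoding “v reaches u”; the names and the storage layout are illustrative assumptions, not the exact encoding used by O’Reach.

```cpp
#include <vector>

// (S1)-style positive check for a single supportive vertex v: if s reaches v
// and v reaches t, concatenating the two paths shows that s reaches t.
bool supportive_positive(int s, int t,
                         const std::vector<bool>& reaches_v,        // u -> v
                         const std::vector<bool>& reached_from_v) { // v -> u
  return reaches_v[s] && reached_from_v[t];
}

// Negative checks in the spirit of (S2)/(S3) follow from the contrapositive:
// if v reaches s but not t, then s cannot reach t; likewise, if t reaches v
// but s does not, then s cannot reach t.
bool supportive_negative(int s, int t,
                         const std::vector<bool>& reaches_v,
                         const std::vector<bool>& reached_from_v) {
  return (reached_from_v[s] && !reached_from_v[t]) ||
         (reaches_v[t] && !reaches_v[s]);
}
```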

Reducing the number of supportive vertices from \( k= 16 \) to \( k= 8 \) led to a small decrease in the effectiveness of Observation (S1) to around \( 64.5 \,\% \) on the set of positive queries, both when the number of candidates to choose from was kept equal (\( p= 150 \)) and when it was reduced analogously (\( p= 75 \)). Reducing the number of topological orderings to \( d= 2 \) caused a slight deterioration for (T1) and (T4) to \( 19 \,\% \) and \( 21 \,\% \) on the positive query set, and to around \( 5 \,\% \) across all query sets.

Among all fast queries that could be answered by exactly one observation, the most effective observation was the positive supportive-vertices-based Observation (S1), with \( 38 \,\% \) for all query sets and \( 65 \,\% \) for the positive query set, followed by the negative basic observation using topological orderings, (B4), with around \( 29 \,\% \) for all query sets and \( 63 \,\% \) for the negative query set.

Looking now at the entire query sets, our statistics show that \( 95 \,\% \) of all queries on the negative set could be answered via an observation. In \( 70 \,\% \) of all cases, (B5), the second test in the order, which uses forward topological levels, could already answer the query. In a further \( 16 \,\% \) of all cases, the observation based on backward topological levels, (B6), was successful. On the positive query set, the fallback rate was around \( 29 \,\% \) and hence higher than on the negative query set. Here, \( 52 \,\% \) of all queries could be answered by the supportive-vertices-based Observation (S1), and the high and low indices of extended topological orderings, (T1) and (T4), accounted for another \( 7 \,\% \) and \( 4 \,\% \), respectively. Observe that in this analysis, the first observation in the order that can answer a query “wins the point”, i.e., the reported effectiveness depends on the order and contains no overlaps.
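
The order dependence mentioned above stems from the way queries are answered: the precomputed checks are tried in a fixed sequence, the first conclusive one determines the result, and only otherwise is the fallback algorithm invoked. The sketch below captures this control flow; passing the checks as abstract predicates is an assumption made purely for illustration.

```cpp
#include <functional>
#include <optional>
#include <vector>

// A check returns true (reachable) or false (unreachable) if it is
// conclusive for the pair (s, t), and std::nullopt otherwise.
using Check = std::function<std::optional<bool>(int, int)>;

bool answer_query(int s, int t,
                  const std::vector<Check>& checks_in_order,
                  const std::function<bool(int, int)>& fallback) {
  for (const auto& check : checks_in_order)
    if (auto result = check(s, t))
      return *result;     // the first conclusive observation "wins the point"
  return fallback(s, t);  // e.g., a pruned graph search or a label lookup
}
```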

Memory Consumption. Table A.11 lists the memory each algorithm used for its reachability index. As O’Reach was configured with \( k= 16 \) and \( d= 4 \), its index size is \( 64n \,{\text{Byte}} \). Consequently, the reachability indices of O’Reach, PReaCH, PPL, IP, BFL, and, with one exception, TF fit into the L3 cache of \( 6 \,\mathrm{M}{\rm B} \) for all small real instances. For Matrix, this was only the case for the four smallest instances from the small real sparse set, three of the small real dense ones, and the smallest Kronecker graph, which is clearly reflected in its average query times on the negative, random, and, to a slightly lesser extent, mixed query sets. Whereas for O’Reach, PReaCH, and Matrix, the index size depends solely on the number of vertices, IP, BFL, PPL, and TF consumed more memory the larger the density \( \frac{m}{n} \). IP(s) was usually the most space-efficient and never used more than \( 395 \,\mathrm{M}{\rm B} \), followed by BFL(s) (\( 429 \,\mathrm{M}{\rm B} \)), IP(d) (\( 440 \,\mathrm{M}{\rm B} \)), BFL(d) (\( 754 \,\mathrm{M}{\rm B} \)), PReaCH (\( 1.3 \,\mathrm{G}{\rm B} \)), O’Reach (\( 1.5 \,\mathrm{G}{\rm B} \)), and PPL (\( 4.4 \,\mathrm{G}{\rm B} \)). All these algorithms are hence suitable for graphs with several million vertices even on hardware with relatively little memory by current standards. TF used up to \( 3.8 \,\mathrm{G}{\rm B} \) (randn23-25), but required more than \( 64 \,\mathrm{G}{\rm B} \), at least during initialization, on all instances for which data is missing in the table.
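
As a rough sanity check of the cache argument above: with an index size of \( 64n \,{\text{Byte}} \), the O’Reach index fits into an L3 cache of \( 6 \,\mathrm{M}{\rm B} \) whenever

\[ n \le \frac{6 \cdot 10^{6}}{64} \approx 9.4 \times 10^{4} , \]

i.e., for graphs with up to roughly \( 10^{5} \) vertices, which is consistent with the observation that the indices of all small real instances fit.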

6 CONCLUSION

In this article, we revisited existing techniques for the static reachability problem and combined them with new approaches to support a large portion of reachability queries in constant time using a linear-sized reachability index. Our extensive experimental evaluation shows that in almost all scenarios, combining any of the existing algorithms with the new techniques implemented in O’Reach speeds up the query time by several factors. In particular, supportive vertices have proven effective for answering positive queries quickly. As a further plus, O’Reach is flexible: memory usage, initialization time, and expected query time can be controlled directly via three parameters, which allow trading space for time or initialization time for query time. Moreover, our study demonstrates that, due to cache effects, a high investment in space does not necessarily pay off: Reachability queries can often be answered significantly faster than single memory accesses in a precomputed full reachability matrix.

The on average fastest algorithm across all instances and types of queries was the combination of O’Reach and PPL, with an average query time of less than \( 0.35 \,\mathrm{\mu }\mathrm{s} \). As the initialization time of PPL is relatively high, we also recommend O’Reach combined with PReaCH as a less expensive alternative with respect to initialization time, and partially also memory, which still achieved an average query time of at most \( 11.1 \,\mathrm{\mu }\mathrm{s} \) on all query sets.

APPENDIX

A TABLES AND FIGURES

Table A.4. Instances Used in Our Experiments (read /1\( \times 10^{3} \): in thousands)

Table A.5. Average Query Times in \( \mathrm{\mu }\mathrm{s} \) for 100,000 Negative (Left) and Positive Queries (Right)

Table A.6. Average Query Times in \( \mathrm{\mu }\mathrm{s} \) for 100,000 Random (Left) and 200,000 Mixed Queries (Right)

Table A.7. Average Query Times in \( \mathrm{\mu }\mathrm{s} \) for 100,000 Negative (Left) and Positive Queries (Right)

Table A.8. Average Query Times in \( \mathrm{\mu }\mathrm{s} \) for 100,000 Random (Left) and 200,000 Mixed Queries (Right)

Table A.9. Speedups with O’Reach Plus Fallback Over Pure Fallback Algorithm

Table A.10. Median Initialization Time in \( \mathrm{m}\mathrm{s} \) in Five Repetitions

Table A.11. Real Index Size in Memory (in \( \mathrm{M}{\rm B} \))

Table A.12. Effectiveness of Each Observation as the Number of Times the Observation Could Answer a Query Over the Total Number of Considered Queries, in Percent, for \( k= 16 \), \( p= 75 \), \( d= 4 \) (Top) and Other Configurations (Bottom)

Footnotes

1. Otherwise, \( \frac{1}{n} \le \rho \).
2. Source code and instances are available from https://oreach.taa.univie.ac.at.
3. Provided directly by the authors.
4. https://github.com/fiji-flo/preach2014/tree/master/original_code.
5. https://github.com/datourat/IP-label-for-graph-reachability.
6. https://github.com/BoleynSu/bfl.
7. https://code.google.com/archive/p/grail/.
8. The statistics were obtained in a slightly different way in [12].

REFERENCES

[1] Bader D., Kappes A., Meyerhenke H., Sanders P., Schulz C., and Wagner D. 2014. Benchmarking for graph clustering and partitioning. In Encyclopedia of Social Network Analysis and Mining. Springer.
[2] Chen Yangjun and Chen Yibin. 2008. An efficient algorithm for answering graph reachability queries. In Proceedings of the 24th International Conference on Data Engineering, Alonso Gustavo, Blakeley José A., and Chen Arbee L. P. (Eds.). IEEE Computer Society, 893–902.
[3] Cheng James, Huang Silu, Wu Huanhuan, and Fu Ada Wai-Chee. 2013. TF-label: A topological-folding labeling scheme for reachability querying in a large graph. In Proceedings of the ACM SIGMOD International Conference on Management of Data, Ross Kenneth A., Srivastava Divesh, and Papadias Dimitris (Eds.). ACM, 193–204.
[4] Cheng Jiefeng, Yu Jeffrey Xu, Lin Xuemin, Wang Haixun, and Yu Philip S. 2006. Fast computation of reachability labeling for large graphs. In Advances in Database Technology - EDBT 2006, 10th International Conference on Extending Database Technology, Ioannidis Yannis E., Scholl Marc H., Schmidt Joachim W., Matthes Florian, Hatzopoulos Michael, Böhm Klemens, Kemper Alfons, Grust Torsten, and Böhm Christian (Eds.). Lecture Notes in Computer Science, Vol. 3896, Springer, 961–979.
[5] Cohen Edith, Halperin Eran, Kaplan Haim, and Zwick Uri. 2003. Reachability and distance queries via 2-hop labels. SIAM Journal on Computing 32, 5 (2003), 1338–1355.
[6] Cormen T. H., Leiserson C. E., Rivest R. L., and Stein C. 2009. Introduction to Algorithms (3rd ed.). MIT Press, Chapter Elementary Data Structures.
[7] Floyd R. W. 1962. Algorithm 97: Shortest path. Communications of the ACM 5, 6 (1962), 345.
[8] Funke Daniel, Lamm Sebastian, Sanders Peter, Schulz Christian, Strash Darren, and von Looz Moritz. 2018. Communication-free massively distributed graph generation. In Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium.
[9] Geisberger Robert, Sanders Peter, Schultes Dominik, and Delling Daniel. 2008. Contraction hierarchies: Faster and simpler hierarchical routing in road networks. In Proceedings of the International Workshop on Experimental and Efficient Algorithms. Springer, 319–333.
[10] Geisberger Robert, Sanders Peter, Schultes Dominik, and Vetter Christian. 2012. Exact routing in large road networks using contraction hierarchies. Transportation Science 46, 3 (2012), 388–404.
[11] Hanauer Kathrin, Henzinger Monika, and Schulz Christian. 2020. Faster fully dynamic transitive closure in practice. In Proceedings of the 18th International Symposium on Experimental Algorithms, Faro Simone and Cantone Domenico (Eds.). Leibniz International Proceedings in Informatics, Vol. 160, Schloss Dagstuhl–Leibniz-Zentrum für Informatik, Dagstuhl, Germany, 14:1–14:14.
[12] Hanauer Kathrin, Schulz Christian, and Trummer Jonathan. 2021. O’Reach: Even faster reachability in large graphs. In Proceedings of the 19th International Symposium on Experimental Algorithms, Coudert David and Natale Emanuele (Eds.). Schloss Dagstuhl–Leibniz-Zentrum für Informatik, 13:1–13:24.
[13] Jagadish H. V. 1990. A compression technique to materialize transitive closure. ACM Transactions on Database Systems 15, 4 (1990), 558–598.
[14] Jin Ruoming, Ruan Ning, Dey Saikat, and Yu Jeffrey Xu. 2012. SCARAB: Scaling reachability computation on large graphs. In Proceedings of the ACM SIGMOD International Conference on Management of Data, Candan K. Selçuk, Chen Yi, Snodgrass Richard T., Gravano Luis, and Fuxman Ariel (Eds.). ACM, 169–180.
[15] Jin Ruoming, Ruan Ning, Xiang Yang, and Wang Haixun. 2011. Path-tree: An efficient reachability indexing scheme for large directed graphs. ACM Transactions on Database Systems 36, 1 (2011), 7:1–7:44.
[16] Jin Ruoming and Wang Guan. 2013. Simple, fast, and scalable reachability oracle. Proceedings of the VLDB Endowment 6, 14 (2013), 1978–1989.
[17] Jin Ruoming, Xiang Yang, Ruan Ning, and Fuhry David. 2009. 3-HOP: A high-compression indexing scheme for reachability query. In Proceedings of the 2009 ACM SIGMOD International Conference on Management of Data. Association for Computing Machinery, New York, NY, 813–826.
[18] Jin Ruoming, Xiang Yang, Ruan Ning, and Wang Haixun. 2008. Efficiently answering reachability queries on very large directed graphs. In Proceedings of the ACM SIGMOD International Conference on Management of Data, Wang Jason Tsong-Li (Ed.). ACM, 595–608.
[19] Kahn A. B. 1962. Topological sorting of large networks. Communications of the ACM 5, 11 (1962), 558–562.
[20] Le Gall F. 2014. Powers of tensors and fast matrix multiplication. In Proceedings of the International Symposium on Symbolic and Algebraic Computation, Nabeshima K., Nagasaka K., Winkler F., and Szántó Á. (Eds.). ACM, 296–303.
[21] Leskovec Jure and Krevl Andrej. 2014. SNAP Datasets: Stanford Large Network Dataset Collection. Retrieved Feb 1, 2021 from http://snap.stanford.edu/data.
[22] Merz F. and Sanders P. 2014. PReaCH: A fast lightweight reachability index using pruning and contraction hierarchies. In Proceedings of the European Symposium on Algorithms, Schulz A. S. and Wagner D. (Eds.). Springer, Berlin, 701–712.
[23] Murphy Richard C., Wheeler Kyle B., Barrett Brian W., and Ang James A. 2010. Introducing the Graph 500. Cray Users Group 19 (2010), 45–74.
[24] Reps Thomas. 1998. Program analysis via graph reachability. Information and Software Technology 40, 11–12 (1998), 701–726.
[25] Reps Thomas, Horwitz Susan, and Sagiv Mooly. 1995. Precise interprocedural dataflow analysis via graph reachability. In Proceedings of the 22nd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. 49–61.
[26] Schenkel Ralf, Theobald Anja, and Weikum Gerhard. 2004. HOPI: An efficient connection index for complex XML document collections. In Advances in Database Technology - EDBT 2004, 9th International Conference on Extending Database Technology, Bertino Elisa, Christodoulakis Stavros, Plexousakis Dimitris, Christophides Vassilis, Koubarakis Manolis, Böhm Klemens, and Ferrari Elena (Eds.). Lecture Notes in Computer Science, Vol. 2992, Springer, 237–255.
[27] Scholz B., Zhang C., and Cifuentes C. 2008. User-input dependence analysis via graph reachability. In Proceedings of the 2008 8th IEEE International Working Conference on Source Code Analysis and Manipulation. 25–34.
[28] Su Jiao, Zhu Qing, Wei Hao, and Yu Jeffrey Xu. 2017. Reachability querying: Can it be even faster? IEEE Transactions on Knowledge and Data Engineering 29, 3 (2017), 683–697.
[29] Tarjan Robert. 1972. Depth-first search and linear graph algorithms. SIAM Journal on Computing 1, 2 (1972), 146–160.
[30] Tarjan Robert Endre. 1976. Edge-disjoint spanning trees and depth-first search. Acta Informatica 6, 2 (1976), 171–185.
[31] Trißl Silke and Leser Ulf. 2007. Fast and practical indexing and querying of very large graphs. In Proceedings of the ACM SIGMOD International Conference on Management of Data, Chan Chee Yong, Ooi Beng Chin, and Zhou Aoying (Eds.). ACM, 845–856.
[32] van Schaik Sebastiaan J. and de Moor Oege. 2011. A memory efficient reachability data structure through bit vector compression. In Proceedings of the ACM SIGMOD International Conference on Management of Data, Sellis Timos K., Miller Renée J., Kementsietsidis Anastasios, and Velegrakis Yannis (Eds.). ACM, 913–924.
[33] Veloso Renê Rodrigues, Cerf Loïc, Meira Wagner, and Zaki Mohammed J. 2014. Reachability queries in very large graphs: A fast refined online search approach. In Proceedings of EDBT. 511–522.
[34] Wang Haixun, He Hao, Yang Jun, Yu Philip S., and Yu Jeffrey Xu. 2006. Dual labeling: Answering graph reachability queries in constant time. In Proceedings of the 22nd International Conference on Data Engineering, Liu Ling, Reuter Andreas, Whang Kyu-Young, and Zhang Jianjun (Eds.). IEEE Computer Society, 75.
[35] Warshall S. 1962. A theorem on Boolean matrices. Journal of the ACM 9, 1 (1962), 11–12.
[36] Wei Hao, Yu Jeffrey Xu, Lu Can, and Jin Ruoming. 2018. Reachability querying: An independent permutation labeling approach. The VLDB Journal 27, 1 (2018), 1–26.
[37] Yano Yosuke, Akiba Takuya, Iwata Yoichi, and Yoshida Yuichi. 2013. Fast and scalable reachability queries on graphs by pruned labeling with landmarks and paths. In Proceedings of the 22nd ACM International Conference on Information and Knowledge Management, He Qi, Iyengar Arun, Nejdl Wolfgang, Pei Jian, and Rastogi Rajeev (Eds.). ACM, 1601–1606.
[38] Yildirim Hilmi, Chaoji Vineet, and Zaki Mohammed J. 2010. GRAIL: Scalable reachability index for large graphs. Proceedings of the VLDB Endowment 3, 1–2 (2010), 276–284.
[39] Yıldırım Hilmi, Chaoji Vineet, and Zaki Mohammed J. 2012. GRAIL: A scalable index for reachability queries in very large graphs. The VLDB Journal 21, 4 (2012), 509–534.
[40] Yu Jeffrey Xu and Cheng Jiefeng. 2010. Graph reachability queries: A survey. In Managing and Mining Graph Data, Aggarwal Charu C. and Wang Haixun (Eds.). Advances in Database Systems, Vol. 40. Springer, 181–215.
