1 Introduction

The problem to cluster objects based on their pairwise similarities has arisen from applications both in computational biology [5] and machine learning [4]. In the language of graph theory, as an input we are given a graph where vertices correspond to objects, and two objects are connected by an edge if they are observed to be similar. The goal is to transform the graph into a cluster graph (a disjoint union of cliques) using a minimum number of modifications.

The set of allowed modifications depends on the particular problem variant and the application under consideration. Probably the most studied variant is the Cluster Editing problem, known also as Correlation Clustering, where we seek for a minimal number of edge edits to obtain a cluster graph. The Cluster Editing problem is APX-hard [9], and admints constant-factor approximations [3, 9]. Also, no subexponential time algorithm exists [13, 18] unless exponential time hypothesis fails. From the parameterized perspective, currently the fastest algorithm runs in \(\mathcal {O}(1.62^{k}+n+m)\) time [6], and both a small [10] and efficient [19] polynomial kernels. For more references on the Cluster Editing problem, see [7].

The main principle of parameterized complexity is that we seek algorithms that are efficient if the considered parameter is small. However, the distance measure in Cluster Editing, the number of edge edits, may be quite large in practical instances, and, in the light of recent lower bounds refuting the existence of subexponential FPT algorithms for Cluster Editing [13, 17], it seems reasonable to look for other distance measures (see, e.g., Komusiewicz’s PhD thesis [17]) and/or different problem formulations.

In 2003, Gramm et al. [14] initiated the parameterized study of the Cluster Vertex Deletion problem (ClusterVD for short). Here, the allowed modifications are vertex deletions.

figure a

In terms of motivation, we want to refute as few objects as possible to make the set of observations completely consistent. Since a vertex deletion removes as well all its incident edges, we may expect that this new editing measure may be significantly smaller in practical applications than the edge-editing distance.

It can be shown that a graph is a cluster graph if and only if it contains no induced P 3s (paths on 3 vertices). As ClusterVD can be equivalently stated as the problem of hitting, with minimum number of vertices, all induced P 3s in the input graph, ClusterVD can be solved in \(\mathcal {O}(3^{k} (n+m))\) time by a straightforward branching algorithm [8], where n and m denote the number of vertices and edges of G, respectively. The dependency on k has been improved by considering more elaborate case distinction in the branching algorithm, first to \(\mathcal {O}(2.26^{k}+nm)\) using problem-specific rules by Gramm et al. [14], and later to \(\mathcal {O}(2.179^{k}+nm)\) via a general algorithm by Fernau for the 3-Hitting Set problem [11]. The \(\mathcal {O}(2.0755^{k}+mn)\) running time can be achieved if instead of the latter one applies an algorithm for 3-Hitting Set described in the PhD thesis of Wahlström [21]. Hüffner et al. [15] provided an elegant \(\mathcal {O}(2^{k}k^{9} +nm)\)-time algorithm for ClusterVD, using the iterative compression principle [20] and a reduction to the weighted maximum matching problem. This algorithm quickly became one of the textbook examples of an application of the iterative compression technique.

In our work we pick up this line of research and obtain the fastest algorithm for (unweighted) ClusterVD.

Theorem 1

Cluster Vertex Deletion can be solved in \(\mathcal {O}(1.9102^{k} (n+m))\) time and polynomial space on an input (G, k) with |V(G)| = n and |E(G)| = m.

The source of the exponential 2k factor in the time complexity of the algorithm of [15] comes from enumeration of all possible intersections of the size-k solution we are looking for with a solution of size k + 1 obtained from the previous phase of the iterative compression process. As the next step in each subcase is a reduction to the weighted maximum matching problem (with a definitely nontrivial polynomial-time algorithm), it seems hard to break the 2k-barrier using the approach of [15]. Hence, in the proof of Theorem 1 we go back to the bounded search tree approach. However, to achieve the promised time bound, and at the same time avoiding very extensive case analysis, we do not follow the general 3-Hitting Set approach. Instead, our methodology is to carefully investigate the structure of the graph and an optimum solution around a vertex already guessed to be not included in the solution. We note that a somehow similar approach has been used in [15] to cope with a variant of ClusterVD where we restrict the number of clusters in the resulting graph.

More precisely, the main observation in the proof of Theorem 1 is that, if for some vertex v we know that there exists a minimum solution S not containing v, then in the neighbourhood of v the ClusterVD problem reduces to Vertex Cover. Let us define N 1 and N 2 to be the vertices at distance 1 and 2 from v, respectively, and define the auxiliary graph H v to be a graph on N 1N 2 having an edge for each edge of G between N 1 and N 2 and for each non-edge in G[N 1]. In other words, two vertices are connected by an edge in H v if, together with v, they form a P 3 in G. We observe that a minimum solution S not containing v needs to contain a vertex cover of H v . Moreover, one can show that we may greedily choose a vertex cover with inclusion-wise maximal intersection with N 2, as deleting vertices from N 2 helps us resolve the remaining part of the graph.

Branching to find the ‘correct’ vertex cover of H v is very efficient, with worst-case (1, 2) (i.e., golden-ratio) branching vector. However, we do not have the vertex v beforehand, and branching to obtain such a vertex is costly. Our approach is to get as much gain as possible from the vertex cover-style branching on the graph H v , to be able to balance the loss from some inefficient branches used to obtain the initial vertex v. Consequently, we employ involved analysis of properties and branching algorithms for the auxiliary graph H v .

Note that the algorithm of Theorem 1 can be pipelined with the kernelization algorithm of 3-Hitting Set [1], yielding the following corollary.

Corollary 2

Cluster Vertex Deletion can be solved in \(\mathcal {O}(1.9102^{k} k^{4} + nm)\) time and polynomial space on an input (G, k) with |V(G)| = n and |E(G)| = m.

However, due to the \(\mathcal {O}(nm)\) summand in the complexity of Corollary 2, for a wide range of input instances the running time bound of Theorem 1 is better than the one of Corollary 2. In fact, the advantage of our branching approach is that the obtained dependency on the graph size in the running time is linear, whereas with the approach of [15], one needs to spend at least quadratic time either on computing weighted maximum matching or on kernelizing the instance.

We also analyse the co-cluster setting, where one aims at obtaining a co-cluster graph instead of a cluster one, and show that the linear dependency on the size of the input can be maintained also in this case. In other words, ClusterVD problem can be also solved in \(\mathcal {O}(1.9102^{k} (n+\bar {m}))\) if input graph is represented by its complement, which has \(\bar {m}=\left (\begin {array}{l}n\\2\end {array}\right ) - m\) edges. Such representation leads to a more time- and space-efficient solution for very dense input graphs. In particular, this would be a better choice if one expects most of the vertices of the resulting cluster graph to form a single clique.

The paper is organised as follows. We give some preliminary definitions and notation in Section 2. In Section 3 we analyse the structural properties of the auxiliary graph H v . Then, in Section 4 we prove Theorem 1, with the main tool being a subroutine branching algorithm finding all relevant vertex covers of H v . In Section 5 we analyse the co-cluster setting. Section 6 concludes the paper. The Appendix contains a Python script for computing the worst-case complexity of the algorithm.

2 Preliminaries

We use standard graph notation. All our graphs are undirected and simple. For a graph G, by V(G) and E(G) we denote its vertex- and edge-set, respectively. For vV(G), the set N G (v) = {uu vE(G)} is the neighbourhood of v in G and N G [v] = N G (v) ∪ {v} is the closed neighbourhood. We extend these notions to sets of vertices XV(G) by \(N_{G}[X] = \bigcup _{v \in X} N_{G}[v]\) and N G (X) = N G [X]∖X. We omit the subscript if it is clear from the context. For a set XV(G) we also define G[X] to be the subgraph induced by X and GX is a shorthand for G[V(G)∖X]. An even cycle is a cycle with an even number of edges, and an even path is a path with an even number of edges. A set XV(G) is called a vertex cover of G if GX is edgeless. By VC(G) we denote the size of the minimum vertex cover of G.

In all further sections, we assume we are given an instance (G, k) of Cluster Vertex Deletion, where G = (V, E). That is, we use V and E to denote the vertex- and edge-set of the input instance G.

A P 3 is an ordered set of 3 vertices (u, v, w) such that u v, v wE and u wE. A graph is a cluster graph if and only if it does not contain any P 3; hence, in ClusterVD we seek for a set of at most k vertices that hits all P 3s. We note also the following.

Lemma 3

Let G be a connected graph which is not a clique. Then, for every vV(G), there is a P 3 containing v.

Proof

Consider N(v). If there exist vertices u, wN(v) such that u wE(G) then we have a P 3(u, v, w). Otherwise, since N[v] induces a clique, we must have wN(N[v]) such that u wE(G) for some uN(v). Thus we have a P 3, (v, u, w) involving v. □

If at some point a vertex v is fixed in the graph G, we define sets N 1 = N 1(v) and N 2 = N 2(v) as follows: N 1 = N G (v) and N 2 = N G (N G [v]). That is, N 1 and N 2 are sets of vertices at distance 1 and 2 from v, respectively. For a fixed vV, we define an auxiliary graph H v with V(H v ) = N 1N 2 and

$$E(H_{v}) = \{uw \mid u,w \in N_{1}, uw \notin E\} \cup \{uw \mid u \in N_{1}, w \in N_{2}, uw \in E \}. $$

Thus, H v consists of the vertices in N 1 and N 2 along with non-edges among vertices of N 1 and edges between N 1 and N 2. Note that N 2 is an independent set in H v . Observe the following.

Lemma 4

For u, w ∈ N 1N 2, we have uw ∈ E(H v ) if and only if u, w and v form a P 3 in G.

Proof

For every u wE(H v ) with u, wN 1, (u, v, w) is a P 3 in G. For u wE(H v ) with uN 1 and wN 2, (v, u, w) forms a P 3 in G. In the other direction, for any P 3 in G of the form (u, v, w) we have u, wN 1 and u wE, thus u wE(H v ). Finally, for any P 3 in G of the form (v, u, w) we have uN 1, wN 2 and u wE, hence u wE(H v ). □

We call a subset SV a solution when GS is a cluster graph, that is, a collection of disjoint cliques. A solution with minimal cardinality is called a minimum solution.

Our algorithm is a typical branching algorithm, that is, it consists of a number of branching steps. In a step (A 1, A 2, … , A r ), A 1, A 2, … , A r V, we independently consider r subcases. In the i-th subcase we look for a minimum solution S containing A i : we delete A i from the graph and decrease the parameter k by |A i |. If k becomes negative, we terminate the current branch and return a negative answer from the current subcase.

The branching vector for a step (A 1, A 2, … , A r ) is (|A 1|, |A 2|, … , |A r |). It is well-known (see e.g. [12]) that the number of final subcases of a branching algorithm is \(\mathcal {O}(c^{k})\), where c is the largest positive root of the equation \(1 = {\sum }_{i=1}^{r} x^{-|A_{i}|}\) among all branching steps (A 1, A 2, … , A r ) in the algorithm.

At some places, the algorithm makes a greedy (but optimal) choice of including a set AV into the constructed solution. We formally treat it as length-one branching step (A) with branching vector (|A|).

3 The Auxiliary Graph H v

In this section we investigate properties of the auxiliary graph H v . Hence, we assume that a ClusterVD input (G, k) is given with G = (V, E), and a vertex vV is fixed.

3.1 Basic Properties

First, note that an immediate consequence of Lemma 4 is the following.

Corollary 5

Let S be a solution such that v ∉ S. Then S contains a vertex cover of H v .

In the other direction, the following holds.

Lemma 6

Let X be a vertex cover of H v . Then, in G∖X, the connected component of v is a clique.

Proof

Suppose the connected component of v in GX is not a clique. Then by Lemma 3, there is a P 3 involving v. Such a P 3 is also present in G. However, by Lemma 4, as X is a vertex cover of H v , X intersects such a P 3, a contradiction. □

Lemma 7

Let S be a solution such that v ∉ S. Denote by X the set S ∩ V(H v ). Let Y be a vertex cover of H v . Suppose that X ∩ N 2 ⊆ Y ∩ N 2 . Then T := (S∖X) ∪ Y is also a solution.

Proof

Since Y (and hence, TV(H v )) is a vertex cover of H v and vT, we know by Lemma 6 that the connected component of v in GT is a clique. If T is not a solution, then there must be a P 3 contained in ZT, where Z = V∖({v} ∪ N 1). But since SZTZ, GS would also contain such a P 3. □

Lemma 7 motivates the following definition. For vertex covers of H v , X and Y, we say that Y dominates X if |Y| ≤ |X|, YN 2XN 2 and at least one of these inequalities is sharp. Two vertex covers X and Y are said to be equivalent if XN 2 = YN 2 and |XN 1| = |YN 1|. We note that the first aforementioned relation is transitive and strongly anti-symmetric, whereas the second is an equivalence relation.

As a corollary of Lemma 7, we have:

Corollary 8

Let S be a solution such that v ∉ S. Suppose Y is a vertex cover of H v which either dominates or is equivalent to the vertex cover X = S ∩ V(H v ). Then T := (S∖X) ∪ Y is also a solution with |T| ≤ |S|.

3.2 Special Cases of H v

We now carefully study the cases where H v has small vertex cover or has a special structure, and discover some possible greedy decisions that can be made.

Lemma 9

Suppose X is a vertex cover of H v . Then there is a minimum solution S such that either v ∉ S or |X∖S| ≥ 2.

Proof

Suppose S is a minimum solution such that vS and |XS| ≤ 1. We are going to convert S to another minimum solution T that does not contain v.

Consider T := (S∖{v}) ∪ X. Clearly, |T| ≤ |S|. Since T contains X, a vertex cover, by Lemma 6, the connected component of v in GT is a clique. Thus, there is no P 3 containing v. Since any P 3 in GT which does not include v must also be contained in GS, contradicting the fact that S is a solution, we obtain that T is also a solution. Hence, T is a minimum solution. □

Corollary 10

If VC(H v ) = 1, then there is a minimum solution S such that v ∉ S.

Lemma 11

Let C be the connected component of G containing v, and assume that neither C nor C∖{v} is a cluster graph. If X = {w 1, w 2} is a minimum vertex cover of H v , then there exists a connected component \(\widehat {C}\) of G∖{v} that is not a clique and \(\widehat {C} \cap \{w_{1},w_{2}\} \neq \emptyset \).

Proof

Consider a component \(\widehat C\) of C∖{v} which is not a clique. Since v must be adjacent to each connected component of C∖{v}, \(\widehat C \cap N_{1}\) must be non-empty. For any \(w \in \widehat C \cap N_{1}\), we have that w 1, w 2w and w w 1, w w 2E(G), since otherwise the result follows. If u wE(G) with uN 2, then, as {w 1, w 2} is a vertex cover of H v we must have u = w 1 or u = w 2. We would then have w 1 or w 2 contained in a non-clique \(\widehat C\), contradicting our assumption. Hence u wE(G) ⇒ uN 1. Thus \(\widehat C \subseteq N_{1}\). As w 1 and w 2 are not contained in \(\widehat C\) and they cover all edges in H v , \(\widehat C\) must be an independent set in H v . In G∖{v}, therefore, \(\widehat C\) must be a clique, a contradiction. □

We now investigate the case when H v has a very specific structure. The motivation for this analysis will become clear in Section 4.3.

A seagull is a connected component of H v that is isomorphic to a P 3 with middle vertex in N 1 and endpoints in N 2. The graph H v is called an s-skein if it is a disjoint union of s seagulls and some isolated vertices.

Lemma 12

Let v ∈ V. Suppose that H v is an s-skein. Then there is a minimum solution S such that v ∉ S.

Proof

Let H v consist of seagulls (x 1, y 1, z 1),(x 2, y 2, z 2),…,(x s , y s , z s ) and some isolated vertices. That is, the middle vertices y i are in N 1, while the endpoints x i and z i are in N 2. If s = 1, {y 1} is a vertex cover of H v and Corollary 10 yields the result. Henceforth, we assume s ≥ 2.

Let X = {y 1, y 2, … , y s }. Clearly, X is a vertex cover of H v . Thus, we may use X as in Lemma 9 and obtain a minimum solution S. If vS we are done, so let us assume |XS| ≥ 2. Take an arbitrary i such that y i XS. As |XS| ≥ 2, we may pick another ji, y j XS. The crucial observation from the definition of H v is that (y j , y i , x i ) and (y j , y i , z i ) are P 3s in G. As y i , y j S, we have x i , z i S. Hence, since the choice of i was arbitrary, we infer that for each 1 ≤ is either y i S or x i , z i S, and, consequently, S contains a vertex cover of H v . By Lemma 6, S∖{v} is also a solution in G, a contradiction to the minimality of S. □

4 Algorithm

In this section we show our algorithm for ClusterVD, proving Theorem 1. The algorithm is a typical branching algorithm, where at each step we choose one branching rule and apply it. In each subcase, a number of vertices is deleted, and the parameter k drops by this number. If k becomes negative, the current subcase is terminated with a negative answer. On the other hand, if k is non-negative and G is a cluster graph, the vertices deleted in this subcase form a solution of size at most k.

4.1 Preprocessing

At each step, we first preprocess simple connected components of G.

Lemma 13

For each connected component C of G, in linear time, we can:

  1. 1.

    conclude that C is a clique; or

  2. 2.

    conclude that C is not a clique, but identify a vertex w such that C∖{w} is a cluster graph; or

  3. 3.

    conclude that none of the above holds.

Proof

On each connected component C, we perform a depth-first search. At every stage, we ensure that the set of already marked vertices induces a clique.

When we enter a new vertex, w, adjacent to a marked vertex v, we attempt to maintain the above invariant. We check if the number of marked vertices is equal to the number of neighbours of w which are marked; if so then the new vertex w is marked. Since w is adjacent to every marked vertex, the set of marked vertices remains a clique. Otherwise, there is a marked vertex u such that u wE(G), and we may discover it by iterating once again over edges incident to w. In this case, we have discovered a P 3(u, v, w) and C is not a clique. At least one of u, v, w must be deleted to make C into a cluster graph. We delete each one of them, and repeat the algorithm (without further recursion) to check if the remaining graph is a cluster graph. If one of the three possibilities returns a cluster graph, then (2) holds. Otherwise, (3) holds.

If we have marked all vertices in a component C while maintaining the invariant that marked vertices form a clique, then the component C is a clique. □

For each connected component C that is a clique, we disregard C. For each connected component C that is not a clique, but C∖{w} is a cluster graph for some w, we may greedily delete w from G: we need to delete at least one vertex from C, and w hits all P 3s in C. Thus, henceforth we assume that for each connected component C of G and for each vV(C), C∖{v} is not a cluster graph. In other words, we assume that we need to delete at least two vertices to solve each connected component of G.

4.2 Accessing H v in Linear Time

Let us now fix a vertex vV and let C be its connected component in G. Note that, as H v contains parts of the complement of G, it may have size superlinear in the size of G. Therefore we now develop a simple oracle access to H v that allows us to claim linear dependency on the graph size in the time bound.

Lemma 14

Given a designated vertex v ∈ V, one can, in time linear in the size of G, either compute a vertex w of degree at least 3 in H v , together with its neighbourhood in H v , or explicitly construct the graph H v .

Proof

First, mark vertices of N 1 and N 2. Second, for each vertex of G compute its number of neighbours in N 1 and N 2. This information, together with |N 1|, suffices to compute degrees of vertices in H v . Hence, we may identify a vertex of degree at least 3 in H v , if it exists. If we find such a vertex, say w, then computing \(N_{H_{v}}(w)\) takes time linear in the size of G. If no such vertex w exists, the complement of G[N 1] has size linear in |N 1| and we may construct H v in linear time in a straightforward manner. □

In the algorithm of Theorem 1, we would like to make a decision depending on the size of the minimum vertex cover of H v . By the preprocessing step, C is not a clique, and by Lemma 3, H v contains at least one edge, thus VC(G) ≥ 1. We now note that we can find a small vertex cover of G in linear time.

Lemma 15

In linear time, we can determine whether H v has a minimum vertex cover of size 1, of size 2, or of size at least 3. Moreover, in the first two cases we can find the vertex cover in the same time bound.

Proof

We use Lemma 14 to find, in linear time, a vertex w with degree at least 3, or generate H v explicitly. In the latter case, H v has vertices of degree at most 2, and it is straightforward to compute its minimum vertex cover in linear time.

If we find a vertex w of degree at least 3 in H v , then w must be in any vertex cover of size at most 2. We proceed to delete w and restart the algorithm of Lemma 14 on the remaining graph to check if H v in Gw has a vertex cover of size 0 or 1. We perform at most 2 such restarts. Finally, if we do not find a vertex cover of size at most 2, it must be the case that the minimum vertex cover contains at least 3 vertices. □

4.3 Subroutine: Branching on H v

We are now ready to present a branching algorithm that guesses the ‘correct’ vertex cover of H v , for a fixed vertex v. That is, we are now working in the setting where we look for a minimum solution to ClusterVD on (G, k) not containing v, thus, by Corollary 5, containing a vertex cover of H v . Our goal is to branch into a number of subcases, in each subcase picking a vertex cover of H v . By Corollary 8, our branching algorithm, to be correct, needs only to generate at least one element from each equivalence class of the ‘equivalent’ relation, among maximal elements in the ‘dominate’ relation.

The algorithm consists of a number of branching steps; in each subcase of each step we take a number of vertices into the constructed vertex cover of H v and, consequently, into the constructed minimum solution to ClusterVD on G. At any point, the first applicable rule is applied.

First, we disregard isolated vertices in H v . Second, we take care of large-degree vertices.

Rule 1

If there is a vertex uV(H v ) with degree at least 3 in H v , include either u or \(N_{H_{v}}(u)\) into the vertex cover. That is, use the branching step \((u,N_{H_{v}}(u))\).

Note that Rule 1 yields a branching vector (1, d), where d ≥ 3 is the degree of u in H v . Henceforth, we can assume that vertices have degree 1 or 2 in H v . Assume there exists uN 1 of degree 1, with u wE(H v ). Moreover, assume there exists a minimum solution S containing u. If wS, then, by Lemma 7, S∖{u} is also a solution, a contradiction. If wN 2S, then (S∖{u}) ∪ {w} dominates S. Finally, if wN 1S, then (S∖{u}) ∪ {w} is equivalent to S. Hence, we infer the following greedy rule.

Rule 2

If there is a vertex uN 1 of degree 1 in H v , include \(N_{H_{v}}(u)\) into the vertex cover without branching. (Formally, use the branching step \((N_{H_{v}}(u))\).)

Now we assume vertices in N 1 are of degree exactly 2 in H v . Suppose we have vertices u, wN 1 with u wE(H v ). We would like to branch on u as in Rule 1, including either u or \(N_{H_{v}}(u)\) into the vertex cover. However, note that in the case where u is deleted, Rule 2 is triggered on w and consequently the other neighbour of w is deleted. Hence, we infer the following rule.

Rule 3

If there are vertices u, wN 1, u wE(H v ) then include either \(N_{H_{v}}(w)\) or \(N_{H_{v}}(u)\) into the vertex cover, that is, use the branching step \((N_{H_{v}}(w), N_{H_{v}}(u))\).

Note that Rule 3 yields the branching vector (2, 2).

We are left with the case where the maximum degree of H v is 2, there are no edges with both endpoints in N 1, and no vertices of degree one in N 1. Hence H v must be a collection of even cycles and even paths (recall that N 2 is an independent set in H v ). On each such cycle C, of 2l vertices, the vertices of N 1 and N 2 alternate. Note that we must use at least l vertices for the vertex cover of C. By Lemma 7 it is optimal to greedily select the l vertices in CN 2.

Rule 4

If there is an even cycle C in H v with every second vertex in N 2, include CN 2 into the vertex cover without branching. (Formally, use the branching step (CN 2).)

For an even path P of length 2l, we have two choices. If we are allowed to use l + 1 vertices in the vertex cover of P, then, by Lemma 7, we may greedily take PN 2. If we may use only l vertices, the minimum possible number, we need to choose PN 1, as it is the unique vertex cover of size l of such a path. Hence, we have an (l, l + 1) branch with our last rule.

Rule 5

Take the longest possible even path P in H v and either include PN 1 or PN 2 into the vertex cover. That is, use the branching step (PN 1, PN 2).

In Rule 5, we pick the longest possible path to avoid the branching vector (1, 2) as long as possible; this is the worst branching vector in the algorithm of this section. Moreover, note that if we are forced to use the (1, 2) branch, the graph H v has a very specific structure.

Lemma 16

The algorithm of Section 4.3 is forced to use Rule 5 with the branching vector (1, 2) if and only if H v is an s-skein for some s ≥ 1. Moroever, Rule 5 applied to an s-skein results in an (s − 1)-skein in both branches.

We note that the statement of Lemma 16 is our sole motivation for introducing the notion of skeins and proving their properties in Lemma 12.

We conclude this section with the observation that the oracle access to H v given by Lemma 14 allows us to execute a single branching step in linear time.

4.4 Main Algorithm

We are now ready to present our algorithm for Theorem 1. We assume the preprocessing (Lemma 13) is done. Pick an arbitrary vertex v. We first run the algorithm of Lemma 15 to determine if H v has a minimum vertex cover of size 1 or 2. Then we run the algorithm of Lemma 14 to check if H v is an s-skein for some s.

We consider the following cases2

  1. Case 1.

    VC(H v ) = 1or H v is an s-skein for some s. Then, by Corollary 10 and Lemma 12, we know there exists a minimum solution not containing v. Hence, we run the algorithm of Section 4.3 on H v .

  2. Case 2.

    VC(H v ) = 2and H v is not a 2-skein.Footnote 1 Assume the application of Lemma 15 returned a vertex cover X = {w 1, w 2} of H v . By Lemma 9, we may branch into the following two subcases: in the first we look for minimum solutions containing v and disjoint with X, and in the second, for minimum solutions not containing v.

    In the first case, we first delete v from the graph and decrease k by one. Then we check whether the connected component containing w 1 or w 2 is not a clique. By Lemma 11, for some w ∈ {w 1, w 2}, the connected component of G∖{v} containing w is not a clique; finding such w clearly takes linear time. We invoke the algorithm of Section 4.3 on H w .

    In the second case, we invoke the algorithm of Section 4.3 on H v .

  3. Case 3.

    VC(H v ) ≥ 3and H v is not an s-skein for any s ≥ 3. We branch into two cases: we look for a minimum solution containing v or not containing v. In the first branch, we simply delete v and decrease k by one. In the second branch, we invoke the algorithm of Section 4.3 on H v .

4.5 Complexity Analysis

In the previous discussion we have argued that invoking each branching step takes linear time. As in each branch we decrease the parameter k by at least one, the depth of the recursion is at most k. In this section we analyse branching vectors occurring in our algorithm. The main idea of the analysis is to treat the initial branching in Case 2. and Case 3., together with a few first branching steps of the subsequent invocation of the algorithm of Section 4.3, as a single huge branching step, with a single (long) branching vector. With such an analysis, the potential inefficiency coming from the initial branchings is outweighted by the very efficient branchings of the algorithm of Section 4.3. To finish the proof of Theorem 1, we show that the largest positive root of the equation \(1={\sum }_{i=1}^{r} x^{-a_{i}}\) among all possible branching vectors (a 1, a 2, … , a r ) obtained in the analysis is strictly less than 1.9102.

As the number of resulting branching vectors in the analysis is rather large, we use a Python script for automated analysis.Footnote 2 The main reason for a large number of branching vectors is that we need to analyse branchings on the graph H v in case where we consider v not to be included in the solution. Let us now proceed with formal arguments.

4.5.1 Analysis of the Algorithm of Section 4.3

In a few places, the algorithm of Section 4.3 is invoked on the graph H v and we know that VC(H v ) ≥ h for some h ∈ {1, 2, 3}. Consider the branching tree 𝕋 of this algorithm. For a node xV(𝕋), the depth of x is the number of vertices of H v deleted on the path from x to the root. We mark some nodes of 𝕋; the marked nodes correspond to the branching steps that we analyse together with the initial branching in Case 2. and Case 3.

Each node of depth less than h is marked. Moreover, if a node x is of depth d < h and the branching step at node x has branching vector (1, 2), Lemma 16 lets us infer that the graph H v at this node is an s-skein for some shd. Consquently, for all the descendants of x in V(𝕋) the graph H v is a (smaller) skein and thus, again by Lemma 16, their branching vectors are all (1, 2). In this case, we mark all descendants of x that are within distance (in 𝕋) less than hd. Note that in this way we may mark some descendants of x of depth equal to or larger than h. We refer to Fig. 1 for some examples of branching trees and marked vertices.

Fig. 1
figure 1

Examples of branching trees for the subroutine on the graph H v . Marked nodes for h = 3 are coloured black. In the last case, a 2-skein occurs in one branch, and an extra node (encircled) of depth 3 is marked

We split the analysis of an application of the algorithm of Section 4.3 into two phases: the first one contains all branching steps performed on marked nodes, and the second on the remaining nodes. In the second phase, we simply observe that each branching step has branching vector not worse than (1, 2). In the first phase, we aim to write a single branching vector summarizing the phase, so that with its help we can balance the loss from other branches when v is deleted.

We remark that, although in the analysis we aggregate some branching steps to prove a better time bound, we always aggregate only a constant number of branches (that is, we analyse the branching on marked vertices only for constant h). Consequently, we maintain a linear dependency on the size of the graph in the running time bound.

The main property of the marked nodes in 𝕋 is that their existence is granted by the assumption VC(H v ) ≥ h. That is, each leaf of 𝕋 has depth at least h, and, if at some node x of depth d < h the graph H v is an s-skein, we infer that shd (as the size of minimum vertex cover of an s-skein is s) and the algorithm performs s independent branching steps with branching vectors (1, 2) in this case. Overall, no leaf of 𝕋 is marked.

To analyse such branchings for h = 2 and h = 3 we employ the Python script. The procedure branch_Hv generates all possible branching vectors for the first phase, assuming the algorithm of Section 4.3 is allowed to pick branching vectors (1), (1, 3), (2, 2) or (1, 2) (option allow_skein enables/disables the use of the (1, 2) vector in the first branch). Note that all other vectors described in Section 4.3 may be simulated by applying a number of vectors (1) after one of the aforementioned branching vectors.

4.5.2 Analysis of the Algorithm of Section Section 4.4

Case 1

Here the algorithm of Section 4.3 performs branchings with vectors not worse than (1, 2).

Case 2

If v is deleted, we apply the algorithm of Section 4.3 to H w , yielding at least one branching step (as the connected component with w is not a clique). Hence, in this case the resulting branching vector is any vector that came out of the algorithm of Section 4.3, with all entries increased by one (for the deletion of v). Recall that in the algorithm of Section 4.3, the worst branching vector is (1, 2), corresponding to the case of H w being a skein. Consequently, the worst branching vector if v is deleted is (2,3).

If v is not deleted, the algorithm of Section 4.3 is applied to H v . The script invokes the procedure branch_Hv on h = 2 and allow_skein = False to obtain a list of possible branching vectors. For each such vector, we append entries (2,3) from the subcase when v is deleted. See Fig. 2a for an illustration of the branching tree in this case.

Fig. 2
figure 2

Branching trees for Case 2. and Case 3

Case 3

The situation is analogous to the previous case. The script invokes the procedure branch_Hv on h = 3 and allow_skein = False to obtain a list of possible branching vectors. For each such vector, we append the entry (1) from the subcase when v is deleted. See Fig. 2b for an illustration of the branching tree in this case.

4.5.3 Summary

We infer that the largest root of the equation \(1 = {\sum }_{i=1}^{r} x^{-a_{i}}\) occurs for the branching vector (1, 3, 3, 4, 4, 5) and is less than 1.9102. This branching vector corresponds to Case 3. and the algorithm of Section 4.3, invoked on H v , first performs a branching step with the vector (1, 3) and in the branch with 1 deleted vertex, finds H v to be a 2-skein and performs two independent branching steps with vectors (1, 2). Note that the last example on Fig. 1 corresponds to the branching on H v in this worst-case scenario.

This analysis concludes the proof of Theorem 1. We remark that the worst branching vector in Case 2. is (2,2,3,3,3) (with solution x < 1.8933), corresponding to the case with a single (1, 2)-branch when v is deleted and a 2-skein in the case when v is kept. Obviously, the worst case in Case 1. is the golden-ratio branch (1, 2) with solution x < 1.6181.

5 Co-cluster Setting

In this section we show that the same result as in Theorem 1 holds for the complement version of the problem, called Co-Cluster Vertex Deletion (CoClusterVD for short). Here, one wants to delete at most k vertices from the input graph to obtain a co-cluster graph (a complement of a cluster graph).

Theorem 17

Co-Cluster Vertex Deletion can be solved in \(\mathcal {O}(1.9102^{k} (n+m))\) time and polynomial space on an input (G, k) with |V(G)| = n and |E(G)| = m.

Observe that, if one wants to solve CoClusterVD, one may complement the input graph and solve ClusterVD instead. However, with such an approach we do not obtain a linear dependency on the size of the input. To obtain it, we need to reengineer our preprocessing routine (Lemma 13) and the oracle access to the graph H v (Section 4.2) to work in the co-cluster setting.

5.1 Preprocessing

We need to show the following variant of Lemma 13.

Lemma 18

Given the complement of the graph G, in time linear in the input size, we can for each connected component C of G:

  1. 1.

    conclude that C is a clique; or

  2. 2.

    conclude that C is not a clique, but identify a vertex w such that C∖{w} is a cluster graph; or

  3. 3.

    conclude that none of the above holds.

Proof

Denote by \(\bar {G}\) the complement of G, given as an input. First, we compute the connected components of G, i.e., the complement-connected components of \(\bar {G}\). For this, we may use a linear-time algorithm of Ito and Yokohama [16]. Next, we construct \(\bar {G}[C]\) for each of the discovered components C. This also takes linear time.

Now, we can process each component separately. On each of them we shall spend time proportional to the size of \(\bar {G}[C]\). The algorithm of [16] actually lets us compute a DFS tree of G[C]. We exploit this feature to simulate our approach from Lemma 13: we arrange vertices in the preorder of the DFS tree. Now, it suffices to find the first vertex w which in \(\bar {G}[C]\) has a neighbour u among the preceding vertices. We conclude that in G[C] vertices u, w are endpoints of a P 3 whose midpoint is v, the parent of w in the tree.

Thus, as in the proof of Lemma 13, u, v, w are the tree vertices which we need to verify. In order to check these candidates we simply compute the complement-connected components, again using the results of [16]. □

5.2 Accessing H v in Linear Time

Recall that we have fixed vertex v, and we are to give an oracle access to H v , given \(\bar {G}\) as an input. We first note the following.

Lemma 19

We can compute sets N 1 and N 2 in time linear in the size of \(\bar {G}\).

Proof

First compute N 1 by marking all non-neighbours of v in \(\bar {G}\). Then, for each uV∖(N 1 ∪ {v}) observe that uN 2 if and only if \(|N_{\bar {G}}(u) \cap N_{1}| < |N_{1}|\). This condition can be verified in time linear in the size the neighbourhood of u in the graph \(\bar {G}\), and hence in time linear in the size of \(\bar {G}\) for all vertices u. □

We now prove an analogoue of Lemma 14.

Lemma 20

Given a designated vertex v ∈ V, one can in time linear in the size of \(\bar {G}\) either compute a vertex w of degree at least 3 in H v , together with its neighbourhood in H v , or explicitly construct the graph H v .

Proof

First, mark vertices of N 1 and N 2 using Lemma 19. Second, for each vertex of V compute its number of neighbours in N 1 and N 2; note that this can be done by inspecting all edges of \(\bar {G}\). This information, together with |N 1|, suffices to compute degrees of vertices in H v . Hence, we may identify a vertex of degree at least 3 in H v , if it exists. For such a vertex w, we may compute \(N_{H_{v}}(w)\) in time linear in the size of \(\bar {G}\) by inspecting all vertices uN 1N 2 one-by-one.

If no such vertex w exists, the number of non-edges of \(\bar {G}\) (i.e., edges of G) between N 1 and N 2 is linear in |N 1| + |N 2| and we can compute them in time linear in the size of \(\bar {G}\). Together with \(\bar {G}[N_{1}]\), they form H v . □

Finally, we observe that the following analogoue of Lemma 15 is straightforward, as the proof of Lemma 15 accesses the graph H v and G only via Lemma 14, and we have already adapted this lemma to the co-cluster setting.

Lemma 21

In time linear in the size of \(\bar {G}\) , we can determine whether H v has a minimum vertex cover of size 1, of size 2, or of size at least 3. Moreover, in the first two cases we can find the vertex cover in the same time bound.

This analysis concludes the proof of Theorem 17.

6 Conclusions and Open Problems

We have presented a new branching algorithm for Cluster Vertex Deletion. We hope our work will trigger a race for faster FPT algorithms for ClusterVD, as it was in the case of the famous Vertex Cover problem.

Repeating after Hüffner et al. [15], we would like to re-pose here the question for a linear vertex-kernel for ClusterVD. As ClusterVD is a special case of the 3-Hitting Set problem, it admits an \(\mathcal {O}(k^{2})\)-vertex kernel in the unweighted case and an \(\mathcal {O}(k^{3})\)-vertex kernel in the weighted one [1, 2]. However, Cluster Editing is known to admit a much smaller 2k-vertex kernel, so there is a hope for a similar result for ClusterVD.