Capacitated k-Centers

During my half-year research visit at the University of Maryland, I had a chance to work with MohammadTaghi Hajiaghayi and Samir Khuller, which resulted in a paper accepted to FOCS 2012: LP Rounding for k-Centers with Non-uniform Hard Capacities (to appear in arXiv soon).

In the k-center problem, we are given an integer k and a metric dist defined on a given set of points V. The goal is to open exactly k centers, that is facilities that are to serve clients located in all the points of V, in such a way that the maximum distance between any point of V and its closest center is minimized. Simple 2-approximation algorithms are known for this problem since 80's, and by a reduction from dominating set no (2-eps)-approximation algorithm exists unless P=NP.

In the capacitated version, each point v of V is assigned its capacity L(v) and in this setting no center can serve more clients than its capacity. The special case when all the capacities are equal was studied in 90's, when Bar-Ilan, Kortsarz and Peleg gave a 10-approximation algorithm, which was later improved to 6-approximation by Khuller and Sussmann. In this paper we present a constant factor approximation algorithm for the general capacities case.

By the standard tresholding technique, one can assume that the metric dist is a shortest path metric in a given undirected unweighted graph G=(V,E). This can be achieved by guessing the value of OPT (there are only n2 possibilities) and putting to the edge set all pairs at distance at most OPT. Unfortunately, even after this reduction the standard LP relaxation has unbounded integrality gap, which can be seen at the following example, where k=3 and the capacity function is uniform L(v)=4.

However, what we found surprising, when the given graph G is connected we were able to prove a constant upper bound on the integrality gap of the LP. Since one can solve the problem for each connected component of the graph independently and combine results using standard dynamic programming, a rounding algorithm for connected graphs is enough to obtain a constant factor approximation algorithm.

Let y-value of a vertex be the fraction of a center opened in it by the fractional LP solution we are working with. Since we have a sharp bound on the number of centers opened, we can not independently round y-values of vertices, because we have to preserve the sum of y-values. For this reason we use what we call chain shifting operation, where in order to move y-value between two distant vertices a and b we use intermediate fully opened (with y-value equal to one) centers as illustrated below.

 

Observe, that in the above figure in order to move clients between the centers we need an assumption, that all the internal vertices of the path have capacities at least L(a). This is the main obstacle in our rounding process, which clearly would not appear in the uniform capacities case. This hurdle makes our algorithm technical, as we round the fractional LP solution in several steps, where in each step we get more structure on the fractionally opened centers, at the cost of increasing distances between centers and clients those are serving.

Finally we would like to mention some open problems. The approximation ratio we obtain is in the order of hundreds (however we do not calculate it explicitly), so the natural open problem is to give an algorithm with a reasonable approximation ratio. Moreover, we have shown that the integrality gap of the standard LP formulation for connected graphs in the uniform capacities case is either 5 or 6, which we think might be an evidence, that it should be possible to narrow the gap between the known lower bound (2-eps) and upper bound 6 in the uniform capacities case.

Don't know what to do? Contract something!

The Unique Label Cover problem (ULC for short) is the problem that the Unique Games Conjecture is about: The task is to label each vertex of a given graph with a letter from an alphabet, given constraints on edges. Each edge of the graph comes with a bijection that tells how a label on one endpoint should be translated into a label on the second endpoint. Clearly, it's trivial to verify whether all constraints may be satisfied at once; the true goal is to violate as few edges as possible. The Unique Games Conjecture treats about hardness of determining the fraction of edges that may be violated; in this post I'm going to focus on instances where we would violate only very few constraints. In other words, I'm going to study fixed-parameter algorithms for ULC parameterized by the number of violated constraints (denoted k).

The main claim of our FOCS'12 paper is that ULC problem can be solved in O(sO(k2) nO(1)) time for n-vertex graph and alphabet of size s. This gives that ULC is polynomial-time solvable for constant alphabet and k=O(sqrt(log(n))), improving the obvious k=O(1) algorithm. And the main technique used is very simple: If you don't know what to do, randomly contract (equivalently, make undeletable) each edge independently with probability 0.5 - maybe something interesting will be highlighted in the graph with sufficiently high probability.

What does it mean sufficiently high probability? As we are aiming at time FPT in s and k, probability of the form 1/f(s,k) is fine for us.

So, what can nice happen with such a step? First, we can observe that this is a very easy way to find good separators to recurse in Kawarabayashi-Thorup way. Without going into technical details, if we find a small (of size f(s,k)) cut that separates two large (of size at least g(s,k)) connected subgraphs, we can recurse on one of the sides. With 2-f(s,k)-2g(s,k) probability we won't contract edges of the cut, while contracting a large (of size g(s,k)) connected subgraph of each side of the cut. After such operation, if two vertices of "weight" at least g(s,k) can be separated with a cut of size f(s,k), we've found a good separation.

This step needs to augment the problem with border terminals and pay attention to a few details, but the overall message is: from now on we can assume that each small (in the ULC case if suffices to say here "of size at most f(s,k)=2k") separator leaves one side of small size (in the ULC case of size roughly g(s,k)=sk).

Let X be a solution we're looking for. We know that in G-X there is one huge connected component, and at most k (as X is of size at most k) small connected components. What do we do now? Apply the trick again! With sufficiently high probability:

  1. each edge of X is not contracted;
  2. each small connected component is contracted onto one vertex;
  3. for each edge e in X and each its endpoint v that is contained in the large connected component of G-X, there are at least g(s,k) vertices contracted onto v.

The core observation now is as follows: assume the above, and let v and w be two heavy vertices: that is, on both v and w we have contracted at least g(s,k) other vertices. Since we cannot recurse, there are at least 2k+1 edge-disjoint paths between v and w; each such path gives exactly one way to translate a label of v into a label of w. With a budget of k violated constraints, we can cut only k such paths; thus, for each label of v, there is at most one reasonable label of w - the one that is yielded by a majority of paths. Thus, there are only s ways to label all heavy vertices. The rest of the algorithm is just a matter of a few small observations which we won't describe in details here.

Summing up, we found the above trick interesting, with one - not so often in the fixed-parameter tractability world - property that it is capable of handling weighted versions of the problem: it is not hard to generalize the above idea to find minimum cost set of violated constraints of size at most k. Naturally, there are a few technical devils hidden in this approach - that were sufficient to blow up the paper to 58 pages on arxiv - but still, it enabled us also to improve the k-Way Cut algorithm of Kawarabayashi and Thorup, as well as give an alternative FPT algorithm for Multiway Cut-Uncut.

We conclude with a summary of what properties are necessary to apply this approach - in case you would like to use it to you favourite problem.

  1. First, this approach needs to be some sort of graph deletion problem, with a possibility to bypass (or make undeletable) vertices or edges
    that are not to be deleted.
  2. To make a recursive step, we need that there are only limited number of ways a solution can behave on a separator. In our algorithm we've used the fact that, in a recursive step, we need only to take care of the labeling of the separator (sp choices for a separator of p vertices). This is closely related to the notion of finite integer index.
  3. Finally, we need to make use of the knowledge that in the final step, the graph has very good connectivity - in the sense that a small separator can remove only a small part of the graph. In the ULC case we've observed that parts with pairwise large (of size 2k+1) flow between them have a very limited number of ways they can be labeled. This part is problem-specific, and, from our experience, is the hardest part of the algorithm.

Let's make an example of the vertex k-way cut problem, which is known to be W[1]-hard, and see what breaks in the above approach. In this problem we're given a graph and integers k and p, and we're to delete p vertices to obtain at least k connected components (we parameterize by both k and p). It is not hard to see that the recursion works perfectly well: we need to remember only how does the border vertices split into components, and how many connected components were already created. However, the last step is hard; if you look at the hardness proof, the constructed instance has already strong connectivity - you can remove only isolated vertices by small separators - but it is still capable of encoding a clique instance.

A reverse situation happens in the Bisection problem, where we're to delete k edges (k is the parameter) to obtain two halves of n/2 vertices. Here the final step seems easy, and boils down to a variant of the knapsack problem. However, in the recursive step we need to remember the balance between the parts of the graph - which gives a linear in n number of choices - and the approach breaks down. To the best of our knowledge, fixed-parameter tractability of Bisection remains open.

Algorithmic Applications of Baur-Strassen’s Theorem: Shortest Cycles, Diameter and Matchings

Finally, we were able to show how to find minimum weight perfect matchings in non-bipartite graph using matrix multiplication. This was some years of trying after my paper from ICALP 2006, where the bipartite case was solved. However, with big help of Marek Cygan and Hal Gabow we have found rather elegant solution. The algorithm is very simple and given blow. Nevertheless, the proof of it correctness is not straight forward and requires some nontrivial structural properties.

Minimum weight perfect matching algorithm:

  • Remove non-allowed edges from the graph, i.e., edges that do not belong to any minimum weight perfect matching,
  • Compute M(v) minimum weight almost perfect matching that misses the vertex v,
  • Define new weights of edges w(uv) = w(uv) + w(M(u)) + w(M(v)),
  • Connected components of E = {uv : w’(uv)<a}, for all a, encode the family of blossoms,
  • Using algorithm for maximum cardinality matchings find a matching that crosses each blossom once.

This gives an algorithm working in time, where W is the maximal edge weight and ? is the matrix multiplication exponent. In the case of dense graphs with small integral weights our algorithm is faster than the O(n(m+n log n)) time algorithm of Gabow from 90 and the time algorithm of Gabow and Tarjan from 91. Previously known algorithms for this problem were using augmenting paths. This algorithm is very different and is based on structural properties that give combinatorial relation between matchings weights and the dual solution. In particular, we show that there exists a canonical dual solution that is unique.

During the work we realized that the tools developed in the paper, i.e., usage of Baur-Strassen’s theorem for the computation of allowed edges, can be used to solve other problems as well. The framework appeared to be very powerful and we were able to solve the shortest cycle problem and the diameter problem in their full generality. Previous solutions for these problems that used matrix multiplication did not handle the case of undirected graphs with negative weights. Observe that in such case even detecting whether a graph has negative length cycle is nontrivial. The Bellman-Ford algorithm does not work here as it works only in the case of directed graphs. However, the problem can be solved by a reduction to the minimum weight perfect matching problem.

After the submission we realized that some other problems can be solved using this framework. One can compute the radius of the graph, as well as solve the single source shortest path problem in undirected graph with negative weights, but without negative weight cycles. Previously known algorithm for shortest paths worked only in the case of directed graphs with negative weights.

Single Source – All Sinks Max Flows in Planar Digraphs

This is one of the papers we got accepted to this year's FOCS. It's a joint work with Yahav Nussbaum, Piotr Sankowski and Christian Wulff-Nilsen.

The paper is about computing maximum flows in planar directed graphs. Namely, given a fixed vertex s, we show how to compute the value of the max st-flow for every vertex t. Note that we just compute the value of the flow, so the algorithm basically computes n-1 numbers. By running it n times, one can compute all-pairs max flow.

The running time is O(n log3 n). This is just a log2 n factor slower than the time needed to compute a single max st-flow in a directed planar graph. In the case of undirected planar graphs, things are much easier. All-pairs max flows can be represented with a Gomory-Hu tree - in particular there can be at most n-1 distinct flow values. The representation can be found in almost linear time.

It gets more difficult in the case of planar directed graphs. Not many properties of planar flows in directed graphs are known. By switching to the planar graph dual, it's quite easy to see that the value of the max st-flow is equal to the length of the shortest directed cycle, that separates faces s* and t* and goes clockwise around t*. The latter property is the most troublesome one. While a shortest directed cycle separating s* and t* can be found easily, nobody has found a way of assuring it to be clockwise. We could just compute the shortest cycle separating s* and t*, but this way we would obtain an algorithm that finds the value of the min st- or min ts-cut, whichever one is smaller.

We use a different approach. It is based on a flow partition scheme by Johnson and Venkatesan. It basically says that if we have a separator C in the graph, such s and t lie on the different sides of this separator, then we can first compute the max sC flow (i.e. we treat all vertices of C as sinks) and the max Ct flow and then combine these two into a max st flow. It was later shown by Borradaile et al. that the procedure of combining the two flows can be implemented in time that is dependent only on |C|.

This approach is then combined with a recursive decomposition of a planar graph. Roughly speaking, a recursive decomposition is obtained by dividing the graph into two pieces using a cycle separator and then dividing the obtained pieces recursively. This gives us a tree-like structure of the separators we have used to divide the graph. Our algorithm processes this tree top-down and for each separator D, computes the max sD flow. The flow partition scheme allows us to compute the max flow to a given separator, knowing the flow to its parent in the separator tree.

A big question is now: is it also the case for the general graphs that all-pairs max flow can be found faster than by running ?(n2) max-flow computations?

4 of our papers accepted for FOCS!

We start the blog with some news that was some nice surprise for us. Lucky, we managed to get 4 papers accepted for FOCS 2012:

  • Algorithmic Applications of Baur-Strassen's Theorem: Shortest Cycles, Diameter and Matchings, Marek Cygan and Harold N. Gabow and Piotr Sankowski
  • Designing FPT algorithms for cut problems using randomized contractions, Rajesh Chitnis and Marek Cygan and Mohammad Taghi Hajiaghayi and Marcin Pilipczuk and Michał Pilipczuk
  • LP Rounding for k-Centers with Non-uniform Hard Capacities, Marek Cygan and Mohammad Taghi Hajiaghayi and Samir Khuller
  • Single Source - All Sinks Max Flows in Planar Digraphs, Jakub ??cki and Yahav Nussbaum and Piotr Sankowski and Christian Wulff-Nilsen

Some description of the papers will follow shortly.