# Eight years among short superstrings

Back in 2006, website like this I was a post-doc at MPI Saarbruecken in Kurt Mehlhorn's group. At that time, I started working on several TSP variants beginning with my collaboration with Kasia Paluch and Olek M?dry on the symmetric MAX-TSP. The most fascinating of these was not really a TSP variant, but a problem that is solved by a reduction to one - the Shortest Superstring Problem (SSP) in which you look for a shortest string containing a given set of strings as substrings. My fascination was even greater because of the fact that I felt absolutely certain that I had an idea to improve upon the already rather old result of Sweedyk (from 1991) giving 2 1/2-approximation, only somehow I could not quite make it work. The thing was that the same 2 1/2-approximation could be achieved by using a 2/3-approximation algorithm for MAX-ATSP from a paper by Heim Kaplan, Moshe Lewenstein, Nira Shafrir and Maxim Sviridenko, but just using that result as a black-box did not seem right. However, the proof of the 2/3-approximation there was rather complicated so I could not really put my finger on it.

Fast forward 7 years foward, after several unsuccessful attempts to improve upon the 2 1/2 factor, I stumbled upon a beatiful STACS paper again by Kasia Paluch, together with Khaled Elbassioni and Anke van Zuylen, which gives a new 2 1/2-approximation algorithm for MAX-ATSP. (If you have not yet read this paper, do it!) Obviously, I decided to try adapt this paper to my needs and use it to approximate SSP. That attempt failed completely after a rather heavy struggle. However, on the next day after realizing it was not going to work, it turned out the effort was not in vain - I had the right idea all along, only did not completely realize it. This was one of those crazy moments in research where you are obsessively trying to find something, only to find something completely different and unexpected, and equally or even more valuable!

I will not describe the approach here - that would necessarily be rather long, since the paper builds quite heavily on previous results. However, I am going to present this result at SODA 2013 in January. So if you want to learn more about how a 13 years old bound was improved, come to my talk in New Orleans, or even better chat with me over coffee. See you in 3 months!

Marcin

One of the talks that deeply fall into my memory was the talk on this paper at ICALP 2011, look where the authors show that some important problem is NP-hard, malady settling a 15-year-old open problem. The talk was exactly as you expected it to be: there were gadgets, and more gadgets, and more and more gadgets, until they stopped to fit into the slide. The tradition of such a talk on ICALP was maintained in 2012 by Dániel Marx with his Planar Multiway Cut lower bound.

Recently I spotted a list every TCSer should do in its career. I think that a point "publish a paper that consists solely of one humongous hardness reduction" is definitely missing there.

I miss a few points from the aforementioned list, but at least this new one would be satisfied with our this years SODA paper on Edge Clique Cover. Yes, there is only one huge reduction in this paper. It proves that, essentially, to solve the Edge Clique Cover problem exactly, you cannot do anything better than apply easy reductions and go brute-force.

In Edge Clique Cover we ask to cover the edges of a given graph with a minimum number of cliques that are its subgraphs. This is equivalent to saying that we want to compute an intersection model of a given graph with minimum size of the universe. There are a few easy reduction rules, such as "do the obvious thing with isolated vertices and isolated edges" or "identify twins, i.e., vertices v, u with N[v] = N[u]" (see here for details). If we ask for k cliques (k-element universe), these reductions bring the size of the graph down to 2k. A brute-force algorithm runs in time exponential in the size of the graph, which gives double-exponential dependency on k.

We show a reduction that reduces (in polynomial time) an n-variable 3CNF-SAT formula to a Edge Clique Cover instance with k=O(log n). This shows that any significant improvement upon the brute-force algorithm on the reduced instance would yield a subexponential algorithm for 3CNF-SAT, which seems unlikely.

The key idea of our reduction is to use the cocktail party graph as the main gadget. While it has got a plenty of edge clique covers of size O(log n), the known reductions are not able to reduce it.

So, if you'll attend SODA'13 and you like to see some slides filled with gadgets, don't miss Micha?'s talk on our paper.