Everything began in the haze of the 70's of the last century. It is nebulous who stumbled upon this enigma first. Some say that Ron Graham discussed the problem in public lectures, others say that Joseph O'Rourke posed it as an open question in the American Mathematical Monthly, while still others suspect that the problem had already been hiding in older, different formulations. However, it is a historical fact that one sunny/cloudy day, three computer scientists finished polishing a manuscript that became a classic paper known as "Some NP-complete geometric problems". In this paper, Michael Garey, Ron Graham and David Johnson showed the NP-hardness of two important problems in geometric metrics: Steiner Tree and Traveling Salesman. For the Euclidean plane, they showed only NP-hardness, as they did not manage to show that these problems are contained in NP. Moreover, they stressed that we cannot even rule out that the decision version of Euclidean Minimum Spanning Tree lies outside of NP. What a seeming paradox, given that we can compute such a minimum tree in polynomial time!

So, whom did they blame? The short answer: the Euclidean metric. Garey and his coauthors explain that all these problems share a common hidden issue: they rely on comparing Euclidean lengths, that is, on comparing irrational numbers based on square roots. Whereas this task is trivial if we just want to compare two line segments (e.g., by comparing the radicands), the trouble starts when we want to compare two polygonal paths. Even assuming rational (or, after scaling, integer) coordinates, this problem translates into a question that is fundamental in computational geometry: Given two lists of integers, a_1, …, a_k and b_1, …, b_k, can we decide whether ∑ √{a_i} ≥ ∑ √{b_i} in P? Put into words: Can we efficiently compare two sums of square roots over integers?
To emphasize the significance of this question, let me cite David Eppstein: "A major bottleneck in proving NP-completeness for geometric problems is a mismatch between the real-number and Turing machine models of computation: one is good for geometric algorithms but bad for reductions, and the other vice versa. Specifically, it is not known on Turing machines how to quickly compare a sum of distances (square roots of integers) with an integer or other similar sums, so even (decision versions of) easy problems such as the minimum spanning tree are not known to be in NP."
"Is this problem computable at all?" - might the curious reader ask after realizing that the approach of iteratively increasing the precision until the first difference in digits will never terminate if both numbers happen to be equal. Luckily, Allan Borodin et al. and Johannes Blömer showed that polynomial time is enough to decide whether two such sums have the same value. Therefore, it is astonishing that the problem of deciding the sign of their difference, a question that seems only slightly harder, has been only recently spotted by Eric Allender et al on an island of the counting hierarchy CH (3rd level) located in the wide ocean of PSPACE. Thus, it is still far far away from the needle head of P, where Gregorio Malajovich conjectured it to live. Until this will be shown, Jeff Erickson proposes to study which problems are (non-boringly) Σ√-hard.
Oh, if we could only bound the difference of the sums sufficiently well from below! Then we could bound from above the precision required to determine the correct sign of the difference: Just imagine that B is such a lower bound on the difference. Now, take a precision that allows us to compute an approximate difference that is at most B/2 away from the real one. Then it is easy to verify that the following procedure determines the correct sign: Output the sign of the approximate difference if it is outside of [-B/2, B/2], otherwise output 0. What is the running time of this procedure? It is polynomial in the required precision, which corresponds to the bit length of B, that is, to -log B. Hence, if -log B is bounded from above by a polynomial in the length of the input, then we are in P!

So, is there any reasonable lower bound for the worst case? That is, what is the smallest positive difference between any two sums of k square roots of integers of size at most n? Welcome to Problem 33 of The Open Problems Project! Let r(n,k) denote this minimum difference as a function of n and k. For instance, consider n=20 and k=2. Then r(n,k) is roughly 0.0002 and it is attained by √{10} + √{11} - √{5} - √{18}. Being an open problem, no tight bound on r(n,k) is known. Jianbo Qian and Cao An Wang as well as Qi Cheng and Yu-Hsin Li proved(*) that there are sums for which -log r(n,k) is at least of the order of k log n. However, these lower bounds leave an exponential gap to the best known upper bounds on -log r(n,k), which are O(2^{2k} log n) [Christoph Burnikel et al.(**)] and 2^{O(n/log n)} [Qi Cheng et al.]. So, maybe you, dear reader, will be the one to close the gap?
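To make this concrete, here is a minimal sketch of the sign procedure in Python; the helper name, the guard digits and the use of the decimal module are ad hoc choices for illustration, not taken from any of the cited works.

```python
# A minimal sketch: decide the sign of sum(sqrt(a_i)) - sum(sqrt(b_i)), assuming that B
# is a valid lower bound on the absolute value of the difference whenever it is nonzero.
from decimal import Decimal, getcontext
import math

def sign_of_difference(a, b, B):
    """Return +1, -1 or 0 as the sign of sum(sqrt(x) for x in a) - sum(sqrt(x) for x in b)."""
    n = max(a + b)                     # largest input integer
    k = max(len(a), len(b))            # number of square roots per sum
    # Enough decimal digits so that the accumulated rounding error stays well below B/2
    # (the "+ 5" adds a few guard digits; a careful error analysis would pin this down).
    digits = math.ceil(-math.log10(B)) + math.ceil(math.log10(2 * k * (math.isqrt(n) + 1))) + 5
    getcontext().prec = digits
    approx = sum(Decimal(x).sqrt() for x in a) - sum(Decimal(x).sqrt() for x in b)
    half_B = Decimal(str(B)) / 2
    if approx > half_B:
        return 1
    if approx < -half_B:
        return -1
    return 0                           # the two sums are equal

# The example from the text: sqrt(10) + sqrt(11) and sqrt(5) + sqrt(18) differ by roughly
# 0.0002, so B = 0.0001 is a safe lower bound here.
print(sign_of_difference([10, 11], [5, 18], 0.0001))   # prints 1
```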
In the dramatic opening of this post, I mentioned our paper, where the problem of Sum of Square Roots appeared. There, we design a PTAS for the Traveling Salesman problem (TSP) with hyperplane neighborhoods, that is, we look for a (1+ε)-approximation of a minimum-length tour that, instead of points, visits a given set of hyperplanes in ℝ^d, where d is fixed. The basis of our algorithm is an integer grid of constant size (depending on ε and d). Our key observation is that there is a translation and scaling of the grid such that snapping an (adequately sparsified) optimum solution to the grid results in a (1+ε)-approximation. Using this insight, we let our algorithm enumerate all (reasonable) TSP tours lying in our integer grid. With a simple LP, we translate and scale each tour such that it visits all the hyperplanes of the input whilst minimizing its tour length (i.e., the scaling factor). At the end, we obtain a set of feasible TSP tours and output the shortest one. And here we were confronted with the Sum of Square Roots problem: Which of the tours is the shortest one? Was all our work ruined, all our blood, sweat and tears in vain, just because the very last line of our algorithm had unknown complexity? Our solution was simple: We just skipped the issue, and our paper got accepted. Why? Well, the issue wasn't really one, for the following three reasons:
- We could be lazy and assume the real RAM model of computation, in which square-root operations take constant time (as is often done in computational geometry).
- We could be industrious and approximate the real tour lengths with a sufficiently small error, since we are looking for an approximation overall anyway.
- We could think for a moment and realize that our candidate tours consist of only constantly many vertices, as our integer grid has constant size. Indeed, the best known lower bound on the difference r(n,k) implies the following nice take-away, which shall conclude this epilogue (see also the small sketch after this list):

Take-away: If the number k of square roots is a constant, then two sums of square roots of integers of size at most n can be compared in polynomial time, because -log r(n,k) = O(2^{2k} log n) is then bounded by a polynomial in the length of the input.
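To see the take-away in numbers, here is a tiny follow-up sketch; it plugs the concrete separation bound (2k√{n})^{-(2^{2k}-1)}, read off from the algebraic-integer argument sketched in (**) below, into the precision estimate. The exact constants vary across the literature, so treat them as illustrative.

```python
# How many decimal digits does the sign procedure above need? With the separation bound
# |difference| >= (2k*sqrt(n))^(-(2^(2k)-1)), we get -log10(B) = (2^(2k)-1) * log10(2k*sqrt(n)):
# doubly exponential in k, but only logarithmic in n.
import math

def digits_needed(n, k):
    return math.ceil((4 ** k - 1) * math.log10(2 * k * math.sqrt(n)))

for k in (2, 3, 4):
    for n in (10 ** 3, 10 ** 6, 10 ** 9):
        print(f"k={k}, n=10^{round(math.log10(n))}: about {digits_needed(n, k)} digits")
# For a constant number k of square roots -- such as the constantly many vertices of our
# grid tours -- this is O(log n) digits, i.e., polynomial in the input length.
```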
-------------------------
Proof sketches for the interested reader of the lower and upper bounds on -log r(n,k):
(*) Ω(k log n) [Cheng et al.]: The interval [k, k √{n}] contains so many distinct sums of square roots (over square-free integers) that, by a pigeonhole argument, there are two sums with very small distance.
(**) O(2^{2k} log n) [Burnikel et al.]: The difference is an algebraic integer and, consequently, the root of a polynomial with integer coefficients and degree at most 2^{2k}. A simple inequality yields the bound.
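And if you want to play with r(n,k) yourself, here is a purely illustrative brute-force sketch (not from any of the cited papers) that enumerates all sums for tiny parameters and reproduces the n=20, k=2 example from the main text.

```python
# Brute-force the minimum positive difference r(n, k) between two sums of k square roots
# of integers in {1, ..., n}; feasible only for tiny n and k.
import math
from itertools import combinations_with_replacement

def min_positive_difference(n, k):
    # Every multiset of k integers from {1, ..., n} yields one sum of square roots.
    sums = [(sum(math.sqrt(x) for x in combo), combo)
            for combo in combinations_with_replacement(range(1, n + 1), k)]
    sums.sort()
    best = None
    # The minimum positive gap is attained by two neighbours in sorted order; gaps below
    # 1e-12 are treated as equal sums (e.g., sqrt(1)+sqrt(16) = sqrt(4)+sqrt(9) = 5).
    for (s1, c1), (s2, c2) in zip(sums, sums[1:]):
        gap = s2 - s1
        if gap > 1e-12 and (best is None or gap < best[0]):
            best = (gap, c1, c2)
    return best

print(min_positive_difference(20, 2))
# roughly (0.000188, (5, 18), (10, 11)) -- the example from the main text.
```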