Some pictures from ICALP

On 10-13 July we hosted ICALP in Warsaw. It was fun! See a few random pictures below.

We in Warsaw, recently stumbled upon which ranks CS departments world-wide according to publication counts in mayor conferences. A cool aspect of this ranking is that one can set the year range for the counted publications. We rather only very recently, i.e., during the last 5-10, started doing well, so this parameter allows to track this change. As we are working mostly in theory let us restrict the ranking to theory only, i.e, Algorithms & complexity; Cryptography; Logic and verification. If you set the range to start in 2000 - to contain the last 15 years, then we are placed on position 10 world-wide. Quite cool already. Setting the starting point to 2005 bumps us to the 6th position. The most interesting part happens when you set the starting point to 2010, i.e., last 5 years, as you get the following table:

Rank Institution Average Count Faculty
1 ? University of Warsaw ? 8.4 24
2 ? Massachusetts Institute of Technology ? 8.2 20
3 ? University of Texas - Austin ? 7.3 8
4 ? Stanford University ? 7.1 12
5 ? Cornell University ? 6.9 14
6 ? Northeastern University ? 6.1 10
7 ? New York University ? 5.9 10
7 ? Carnegie Mellon University ? 5.9 14
9 ? University of Illinois at Urbana-Champaign ? 5.7 14
9 ? University of California - San Diego ? 5.7 10

Quite surprising... as all the ranking are usually. However, there might be some reasons for this. First of all, we cover all three areas that are counted in theory. For example, logic is visibly stronger in Warsaw than in other places included in top 10. Second, we indeed started doing quite well, e.g., we got 6 ERC grants so far. I do not really believe in rankings, but at least it is some indication that Warsaw is not such a bad place to be - see the cat meme.

Bratislava Declaration of Young Researchers

The Bratislava Declaration of Young Researchers is something I was involved in recently. Its preparation was inspired by Slovak Presidency of the EU and it was presented on today's informal Council of Ministers responsible for competitiveness (Research). I hope this will have some follow up, as current trend in funding research in EU is in my opinion (and not only my as this declaration shows) going in the wrong direction.

Why should you not use the automatic assignment by EasyChair

Some days ago I have assigned papers to my PC in ESA Track A.  I tried to use automatic assignment in EasyChair and this was a rather disappointing experience. Just by running it I understood why some assignments I got as a PC member in previous conferences were so bad - they bad without any good reason. Let me explain what happened this time.

I clicked the automatic assignment button and I got a very unfair assignment. EasyChair wanted to punish 3 of my PC members by giving them 11, drugs 12 and 15 papers from their No list. In EasyChair one can bid Yes, Maybe and No on papers. I found this rather absurd as 15 paper was more than half of the total review load. The question is: Does this poor guy need to get so many Nos in optimal assignment?. The short intuitive answer is: most probably no, as there is a lot of freedom in each maximal size matching (by Gallai-Edmonds decomposition).

Hence, in 2 hours I coded my own assignment procedure. I assumed that in the assignment:

  • first, I want to maximize the number of papers assigned to Maybes and Yeses,
  • second, under the first assumption, I want to maximize the number of papers assigned Yeses.

This can be formalized as a maximum-cost maximum-flow problem:

  • connect the source vertex with each PC member with a zero cost edge of capacity equal to review load,
  • connect each PC member with each paper with edges of capacity 1 and cost 0 for No, 10000 for Maybe and 10001 for Yes.
  • connect each paper with the sink with edge of cost 0 and capacity 3.

I executed it and I got assignment with the same number of Nos - 80 and Maybes -100 as EasyChair. And of course, it was not much more fair than the one given by EasyChair. If you always wondered how EasyChair computes the assignment this is the way. Essentially, there is no reason for this to be fair in any sense. One is optimizing global cost, so it should be clear that locally it can be bad, e.g., some PC members will get bad assignments without any good reason. It can be even worse as one can be getting many Nos due to the execution order of procedure, as there is always a lot of freedom in the optimal matching. It can happen that even if you bid on almost everything you will get all the Nos if all people bid No on the same papers.

There is a solution here - I added third assumption to my procedure:

  • zero, no PC member can get more than X Nos, where X is a parameter.

This can be easily incorporated into the max-flow problem. One just needs to split a vertex representing each PC member into three vertices: PC_0, PC_No and PC_NotNo. The maximum-cost maximum-flow problem becomes:

  • connect the source vertex with all PC_0 vertices with a zero cost edge of capacity equal to review load,
  • connect PC_0 to PC_No with zero cost edge of capacity X,
  • connect PC_0 to PC_NotNo with zero cost edge of capacity equal to review load,
  • connect PC_No with edges of cost 0 to all papers with No,
  • connect PC_NotNo with each paper with edges of capacity 1 and cost 10001 for Maybe and 10001 for Yes.
  • connect each paper with sink with edge of cost 0 and capacity 3.

I executed it - the table below shows the number of Maybes in the assignment in dependence on the parameter X.

 X 4 5 6 7 8 9 10 11 12
 #Maybes 120 116 112 109 106 104 102 100 100

One striking thing is that there exists an optimal assignment where the poor guy that was getting 15 Nos is getting only 11 Nos. Essentially, there was no reason for giving 15 Nos to this PC member! This probably already happened to many of you ;). I the final assignment I went for X=4 as this actually meant that almost everyone gets 3 Nos and only 2 PC members get 4 Nos. This much fairer assignment costed only 20 Maybes in the quality, what I found a fair cost of being fairer.

Highlights of Algorithm Conference - Registration is Open

The HALG conference has opened the registration and the program is available as well - see the links below. The program looks very interesting so I hope many of you will be coming.

Highlights of Algorithms - HALG 2016
June 6-8, link 2016, ask Paris, France

The Highlights of Algorithms conference is designed to be a forum for presenting the highlights of recent developments in algorithms and for discussing potential further advances in this area. The conference will provide a broad picture of the latest research in algorithms through a series of survey and invited talks, as well as possibility for all researchers and students to present their recent results through a series of short talks and poster presentations. Attending the Highlights of Algorithms conference will also be an opportunity for networking and meeting leading researchers in algorithms.

The registration for the Highlights of Algorithms conference is open.
To register, please visit the link at (early registration is due April 30)

A preliminary program is available online at
The program is packed with 28 invited talks and with even a larger number of short contributions.

Those interested in attending the conference are advised to book accommodation as early as possible (given the high hotel prices in Paris).


ESA 2016 - call for papers

The ESA call for papers is out: The submission deadline is April 21, 23:59 AoE, see 2016, whereas the notifications will be send out no later than on June 9, 2016. I hope there will be many submissions to keep us in the PC busy during this time. Anyway, if you plan to submit something, you might read (or maybe read again) the post by Boaz Barak: It contains advice on how to increase your chances that your paper gets accepted to a conference - FOCS in this case. The same applies to most conferences and "putting the worst foot forward" is often the hardest part. However, believe me that doing the opposite usually hurts even more.

Highlights of Algorithm Conference

Guest post by Aleksander M?dry

This summer, drug on June 6-8 in Paris, help we will be having a new algorithmic event: Highlights of Algorithms 2016 (HALG 2016) conference.

This conference will be quite unlike the conferences we are all used to. First of all, patient it will consist mainly of invited talks and tutorials, accompanied by a smaller number of contributed talks and posters. Secondly, there will be no conference proceedings, i.e., presenting work already published at a different venue or journal (or to be submitted there) is absolutely welcome and even encouraged.

The call for submissions  of contributed short talks and posters has been just posted online. The program committee is currently in the process of selecting invited speakers (partially based on the nominations from the community).

We already have a number of very exciting speakers committed and we expect to add a few more names shortly.

The intention here is to make HALG a forum for presenting the highlights of some of the most exciting recent developments in algorithms, discussing potential further advances in this area, as well as networking and initiating new collaborations. In a way, we want HALG to be a place one can go to catch up on what are the hottest results/topics
at the moment, as well as what one missed by not being able to attend all the STOC/FOCS/SODA/ESA/ICALP/PODC/ITCS/… conferences that year.

We believe that this kind of venue is very much needed today and will help to make our community stronger and more cohesive. In particular, we hope that the success of the first edition will make HALG become a recurring event. In fact, HALG draws some inspiration from Highlights of Logic, Games and Automata (HLGA) conference series that became very successful in its respective community already. There are some differences in how both conferences are constructed: HALG will concentrate on invited talks with some contributed talks, whereas HLGA was only having short contributed talks so far; but the overall premise is the same and seems to be working very well. (We also heard some rumors that the next edition of HLGA will move in the direction of HALG and have invited talks too.)

In any case, the HALG 2016 in Paris should be a lot of fun. We hope to see many of you there! So please submit your work to HALG.

Two ERC grants in our group are starting!

We are quite happy that this year two ERC grants will be starting in our group. First of all, Marek Cygan very recently got news that his ERC Starting Grant received funding. His grant is on FPT, approximation and metaheuristics. Secondly, our ERC Proof of Concept grant, that is a follow up to my ERC StG grant is starting. This grant aims to commercialise some ideas from algorithmic market modeling and to support formation of a spin-off.

On a (Somewhat Old) Dynamic Steiner Tree Problem

The dynamic Steiner tree problem has been around for a while already (for 24 years), but it did not get a satisfying answer from efficiency point of view, i.e., one would like to have a fast algorithm that maintains a constant approximate solution and allows to update the set of terminal vertices. On one hand, there have been some studies of the online version of this problem, where we focus on minimizing the number of changes to the tree that are necessary to maintain a good approximation. This line of research was started in the pioneering paper by Imase and Waxman [1] and was later continued in [2, 3, 4]. However, in the online problem one ignores the problem of efficiently finding these changes to the tree. On the other hand, the problem of maintaining a Steiner tree is also one of the important open problems in the network community [5], where it is often referred to as dynamic multi-cast tree problem. The motivating problem is to maintain a cheap communication network between a set of dynamically changing users. While it has been studied for many years, the research resulted only in several heuristic approaches [6, 7, 8, 9], none of which has been formally proved to be efficient.

In our paper "The Power of Dynamic Distance Oracles: Efficient Dynamic Algorithms for the Steiner Tree" we have shown the first sublinear time algorithm which maintains a constant approximate Steiner tree under terminal additions and deletions. We show that we can maintain a (6+epsilon)-approximate Steiner tree of a general graph in  ilde{O}(sqrt{n} log D) time per terminal addition or removal. Here, D denotes the stretch of the metric induced by the graph. For planar graphs we achieve the same running time and the approximation ratio of (2+epsilon). Observe that due to the sparsity of real world networks only algorithms with such sublinear complexity could be of some practical importance and could potentially be used to reduce the communication cost of dynamic multicast trees. Our paper aims to be a theoretical proof of concept that from algorithmic complexity perspective such algorithms are indeed possible. However, turning these theoretical ideas into useful algorithms will require much more research. In particular, a natural question here is whether the approximation factor of our algorithms could be considerably improved while keeping the running time sublinear?

Let me note that this might be hard as the approximation factor of (6 + epsilon) for Steiner tree in general graphs comes from using 3-approximate oracles [10], using 2-approximation of the Steiner tree by the MST in the metric closure, and (1 + epsilon)-approximate online MST algorithm. In other words we hit two challenging bounds: in order to improve our approximation factors, one would need either to improve the approximation ratio of the oracles which are believed to be optimal, or devise a framework not based on computing the MST. The second challenge would require to construct simple and fast (e.g., near-linear time) approximation algorithms for Steiner tree that would beat the MST approximation ratio of 2. Constructing such algorithms is yet another challenging open problem.

If you would like to learn more on how it works, you can still attend our talk during the coming STOC in Portland on Monday at 9:00, or have look into the full version of the paper on arXiv.

[1] M. Imase and B. M. Waxman. Dynamic Steiner tree problem. SIAM J. Discrete Math., 4(3):369–384, 1991.
[2] N. Megow, M. Skutella, J. Verschae, and A. Wiese. The power of recourse for online MST and TSP. In ICALP, pages 689–700. 2012.
[3] A. Gu, A. Gupta, and A. Kumar. The power of deferral: maintaining a constant-competitive Steiner tree online. In STOC, pages 525–534, 2013.
[4] A. Gupta and A. Kumar. Online Steiner tree with deletions. In SODA, pages 455–467. SIAM, 2014.
[5] X. Cheng, Y. Li, D.-Z. Du, and H. Ngo. Steiner trees in industry. In D.-Z. Du and P. Pardalos, editors, Handbook of Combinatorial Optimization, pages 193–216. Springer US, 2005.
[6] F. Bauer and A. Varma. ARIES: A rearrangeable inexpensive edge-based on-line Steiner algorithm. IEEE Journal of Selected Areas in Communications, 15:382–397, 1995.
[7] E. Aharoni and R. Cohen. Restricted dynamic steiner trees for scalable multicast in datagram networks. In INFOCOM ’97. Sixteenth Annual Joint Conference of the IEEE Computer and Communications Societies. Driving the Information Revolution., Proceedings IEEE, volume 2, pages 876–883 vol.2, Apr 1997.
[8] S.-P. Hong, H. Lee, and B. H. Park. An efficient multicast routing algorithm for delaysensitive applications with dynamic membership. In INFOCOM ’98. Seventeenth Annual Joint Conference of the IEEE Computer and Communications Societies. Proceedings. IEEE, volume 3, pages 1433–1440 vol.3, Mar 1998.
[9] S. Raghavan, G. Manimaran, C. Siva, and R. Murthy. A rearrangeable algorithm for the construction of delay-constrained dynamic multicast trees. IEEE/ACM Transactions on Networking, 7:514–529, 1999.
[10] M. Thorup and U. Zwick. Approximate distance oracles. J. ACM, 52(1):1–24, 2005.

Online bipartite matching in offline time

In 1973 Hopcroft and Karp gave a very nice O(m \sqrt{n}) time algorithm computing a maximum matching in unweighted bipartite graphs. This algorithm turned out to be the milestone that is hard to beat. The bipartite maximum matching problem has been studied in many different flavors such as online, approximate, dynamic, randomized or the combination of the above. The HK algorithm was improved for sufficiently dense and sufficiently sparse graphs. Nevertheless, for the general case, despite of the wide interest, little improvement over HK was obtained. This is somewhat intriguing, as no reasonable example certifies that the bound for the running time of HK is in fact tight. The HK algorithm is of offline nature and relies heavily on receiving the whole graph before processing it. On the quest for making some progress on the subject matter we aimed for the dynamic setting. Say the vertices need to be matched as they show up in the graph, so some re-matching is allowed and the maximum matching needs to be maintained at all times. Can one do better than simply finding any augmenting path each time a new vertex appears?

Let us narrow down the dynamic scenario we consider: say only the vertices of one side of the graph come in online, and at the time they show up they reveal all their adjacent edges. This scenario is easily motivated as follows.
There is a set of servers and a set of jobs to be executed on these servers. Every job comes with a list of servers capable of performing the job. Of course it is reasonable to serve as many jobs as possible, hence the maximum job-server matching is highly desired. Having that motivation in mind, an online scenario seems to be the most natural: jobs come in online one at the time, and request to be matched to one of the chosen servers. We care to match the job whenever its possible, so we allow some busy servers to be reallocated.
Now the question is, what is a good way of reallocating the servers, so that the dynamic algorithm can benefit from it? We adopt the following measure: each reallocation involves some reallocating cost. We want to minimize the cost of reallocating the servers.

We associate with each server u an attribute called rank_t(u), which states how many times the server has been reallocated in the entire process. The parameter t stands for time. We see the entire process of adding jobs as a sequence of turns, in each turn a new job appears with the list of eligible servers. The attribute rank_t(u) describes the number of times u was reallocated up to turn t. These attributes guide us when we search for augmenting paths. We try to avoid servers with high ranks. This should distribute the necessary reallocations more or less evenly among the servers.

In order to follow this approach, a certain greedy strategy comes to mind. When trying to match a new job, choose the augmenting path which minimizes the maximum rank of a server along this path. This strategy has one substantial disadvantage. If any augmenting path that we can choose to match the new job contains a vertex of high rank r, then we are allowed to rematch all servers of ranks at most r. That is clearly an overhead. We hence define another attribute, tier_t(v), for every job v. This attribute says what is the rank of the lowest ranked path from v to any unmatched server. When we try to match job v, we search for the alternating path along which the tiers of jobs do not increase. We call such a path a tiered path. In other words, a tiered path minimizes the maximal rank on its every suffix. This seems natural: why to re-enter vertices of high rank when we can avoid them.

It turns out that this simple strategy does quite well: the ranks (and hence also the tiers) do not grow above 2 \sqrt{n}. That means, that each server gets reallocated at most O(\sqrt{n}) times. This is already interesting. Moreover, if we knew how to efficiently choose the right augmenting paths, the total work would be O(n\sqrt{n}). We have an algorithm that finds all the augmenting paths in the total time O(m\sqrt{n}), what matches the offline time of HK.

So let us first take a look on how the bound on the maximum rank is obtained. First of all, we direct the edges of the graph according to the current matching: the matched edges point from servers to jobs, while the unmatched edges point the other way around. Now, directed paths are alternating, so in each turn we need to find a directed path from the new job to a free server and reverse the edges along the path. We first show that the servers of high ranks are far from the unoccupied servers: the directed distance from any server u to an unmatched server in turn t is at least rank_t(w).

We now look at any set S_t of vertex disjoint directed paths covering all free servers in turn t before applying the augmenting path. Note, that there are no outgoing edges from free servers, so the paths end there. The rank of the directed path present in the graph in turn t is the maximum rank of a server on it. Let's call \mu_{t-1} the augmenting path applied in turn t-1. We analyze the augmentation process backwards. In turn t-1, before applying \mu_{t-1}, there exists a set of vertex disjoint paths S_{t-1} covering free servers, such that:

  • every path \pi \in S_t has its counterpart \Phi(\pi) \in S_{t-1}, where \Phi is an injection
  • \Phi(\pi) has rank at least as high as \pi, unless \pi's rank is smaller or equal to one plus the rank of \mu_{t-1}: then the rank of \Phi(\pi) may be one less then the rank of \pi
  • there is a path in S_{t-1} that is not in the image of \Phi and has rank at least the rank of \mu_{t-1}

This observation may be translated as follows. Let us, for every path in S_t, draw a vertical bar of height equal to its rank. Let us now sort the bars in descending order and put them next to each other, as shown in Figure 1 to the left. These bars are the ranks of the paths in turn t. When we take a step back to turn t-1, we have another set of bars corresponding to paths from turn t-1, where one additional bar pops out. Moreover, some bars may have decreased by one, but all the bars that decreased are dominated (with respect to height) by the newly added bar. This is shown in Figure 1 to the right. The process ends when we reach turn 0, and there is n bars of height zero.


Now we move to bounding the maximum rank. The maximum rank, say R, will appear on some path in some turn t'. We take a look at set S_{t'} consisting of this single path. There is only one bar of height R. In the previous turn, either there is still a bar of the height R, or there are two bars of height R-1. Every time the bars decrease, there comes another bar that dominates the decreased bars. Somewhere on the way back to turn 0 there is a set of bars with the sum of heights quadratic in R. The bars, however, correspond to vertex disjoint paths, and the heights of the bars are the lower bounds on the lengths of these paths. Hence, there is \Omega(R^2) vertices in the graph and R \in O(\sqrt{n}).


The question that remains is whether we are able to efficiently find these paths. The main problem here is that we need augmenting paths where the tiers of jobs along the path do not increase. This is not a good news: the tiers are difficult to maintain upon the reversal of the edges on the entire augmenting path. The idea is to maintain them in a lazy manner. For each job v, instead of its actual tier, the algorithm maintains an attribute tier_{LB}(v). Subscript LB stands for lower bound, as we maintain the invariant that tier_{LB}(v) \leq tier(v). When a new vertex v turns up in some turn, tier_{LB}(v) is set to 0. The algorithm repeatedly tries to find (in the directed graph) a tiered (according to the maintained lower bounds for tiers) directed path from v to a free server. It executes a search routine from v, traversing only the vertices with ranks and tier_{LB}'s bounded by tier_{LB}(v). Once a job v' with tier_{LB}(v') < tier_{LB}(v) is discovered, the upper bound on the ranks and tier_{LB}'s of vertices visited next is recursively set to tier_{LB}(v'). This goes on until one of the two things happen. It might be that a free server is discovered. Then we found an augmenting path, along which the tier_{LB}'s of the vertices are their actual tiers (the path that we found is a witness for that). We reverse the edges along the augmenting path and increase the ranks of the reallocated servers. It might also happen that the search fails. This means, that the algorithm detects a group of vertices whose tier_{LB}'s are lower than their actual tiers. The algorithm then rightfully increases the tier_{LB}'s associated with these vertices. It continues the search from the point where it failed. The difference is that now it can search further, as it is allowed to visit vertices with higher tier_{LB}'s than before. The general idea is that whenever a vertex is visited by the algorithm, either its tier_{LB} or its rank is increased. Unfortunately upon every such visit the algorithm may touch all the edges adjacent to the visited vertex. These edges, however, will be touched in total O(\sqrt{n}) times each. The total time amounts to O(m\sqrt{n}).