Wednesday, May 2, 2012

Strongly Connected Components

The Tarjan's algorithm does two passes of DFS, and I keep forgetting the algorithm. Now that I have to look it up again, here are some intuitions.

SCC for directed graph.

Let us begin with a DFS tree. You know that if you can reach your ancestor, then both of you are in the same component. To check if some node y is your ancestor, and you are x, then dfsbegin[y] < dfsbegin[x] and dfsfinish[y] < dfsfinish[x], that is, you start later and finish early, then whoever has an interval covers your interval is your ancestor.

Now consider a node y, we know that y can reach x for every x that is its descendant. If we start from x and there is a path to y, then we can assign x to y. However we do not want to start from every x as it is too slow. Now comes the reverse edge trick. A path from x to y is equivalent to a path from y to x if we reverse every edge in the graph. If we do that, every tree edge and forward edge become back edge, and the previous back edge becomes either tree edge or forward edge but that does not bother us since they are not involved in finding components. The annoying edge is cross edge, which must go from right to left in the original graph, now they have to go from left to right after reverse. Then we can finish components on the right before work on the left so we will detect a node already assigned to a different component when hitting a cross edge. One way to achieve this is to process nodes with decreasing order of dfsfinish[x].

No comments:

Post a Comment