algorithms and random thoughts: August 2012

Friday, August 24, 2012

preorder, postorder and reconstruct tree

An interesting problem from mitbbs.com, jobhunting, posted by geniusxyz

The problem is, given two lists, pre and post, representing the preorder and postorder traversal of a general tree, each node in the list is struct {int val; bool is_leaf; }. Reconstruct the tree.

It seems rather difficult and took me a full day.

Some picture always helps, draw a tree and do preorder and postorder to see how nodes are listed.

Observation: we can partition the tree into vertex-disjoint paths, each path starts at some node and ends with a leaf. The pre list gives exactly the paths, each path terminated by a leaf.

Now our job is to attach the paths to the right subtree root, and this follows from post list.

Let's start with attaching the second list to the first list.
Let ptr be the pointer to some node in the first list, init to the leaf. In post list we have ptr matched to the first leaf in post[] in the beginning. Now as long as we have not hit the next leaf in post[], we advance postptr (pointer to post[]) and move ptr to its parent. When we stop, the parent of ptr is where we attach the second list to the first. Why? because we now run into another leaf and the parent of ptr is an ancestor of that leaf, so it must also be a parent of the second list. (a more rigorous argument is needed here). Repeat until we attached all lists to the first list.

Picture-wise, the tree starts with a linear list, which is the first list, then the second list was attached and the frontier consists of the second list and the common sublist that is the ancestor of both the first and second list. We move the frontier as we attach new lists and update ptr in that frontier.

ZhangBZ replied in the same post with very elegant code.

Wednesday, August 22, 2012

SRM 553

p250: 0 is special, so eval with program[pos] set to 0, where pos is the -1.
then you can eval with -1 there. If it comes out equal wanted, then you return 1, since 0 did not work and the next smallest number is 1.
Otherwise you got a number different from wanted. You know that if target can make eval to wanted, then use -1 instead of target you get wanted - target - 1, so target = wanted - ans -1, where ans is the eval result when you leave -1 unchanged. However there is a catch, the eval returned value may not use -1 at all. To trace whether -1 has been hit or not turns out a bit tricky. pieguy has an elegant solution, use 1 and 2 to replace -1 and eval them, if the two ans come out different, then you know -1 has been hit. My offline solution is to do one more eval starting from program[pos]=-1, then check stack size = 1. There are other solutions which remembers whether the current value in stack has -1 built in or not.

p500, the solution must be a decreasing rows of B matched with an increasing rows of W, this will lead to a dp solution. One more observation, since each cell is either B or W, it must be start with topleft an all B or all W row echelon form.

Thursday, August 16, 2012

SRM 552

As usual I only know how to do the 250 problem, and I failed that one.

p250, given three numbers R, G, B, and another number N, you lay out a triangle such that no two neighboring balls share same color. The problem asks how many triangles with N layers you can make.

It is easy to see that you need

N R G B
-------------
1 0 0 0
2 1 1 1
3 2 2 2
4 4 3 3
5 5 5 5
6 7 7 7

Enough pattern, so to make one triangle of N layers you need sum of 1 to N balls, which is N(N+1)/2. For N=3k or 3k+2 this number is a multiple of 3 so you need the same number of three colors, and the answer is simply B/(S/3), where R>=G>=B and S = N(N+1)/2.

The tricky case is N=3k+1, then S = 3k' + 1. Now you need to distribute balls to make max number of triangles. Let us image that the optimal answer is m triangles, then you need at least m*((S-1)/3) for each color, and whatever remain can be any color, as long as the remaining balls are at least m. So the answer is min of (R+G+B)/S and B/(S/3). The special case is N=1, where you return R+G+B. An easier way to visualize this is that you grow three equal columns for each color, then what has been cut off from each color are the remaining balls you have, those balls are equivalent and can be distributed to any triangle you want to make.

The problem setter is more evil than this to introduce a few integer overflow even for long long, if you use * instead of /

p500
notice that the two rectangles r1 and r2, can be separated by a horizontal or vertical divider. Try all possible dividers and within each region, brute-force on all possible r1, and of all possible r2 respectively. We have no more than 30 horizontal divider and same for vertical ones. To try all possible r1, just run (x1,y1) and (x2,y2). This is no more than 30*30*30*30 < 10^6. The implementation demands extreme care. Draw a picture to get the indices correct. Also check when there is no solution, you need to return -1. This is different from the case when there is zero flowers.

p1000

Saturday, August 11, 2012

evernote code sprint

a bit disappointing codesprint.

all 4 problems are typical 15min interview problems and the testcases are probably not strong.

Most coders would finish within an hour and being rank 1 the first time didn't make me feel anything close to accomplishment.

codechef aug12

Other problems:
why my segment tree got WA?
some 10^8 solution got TLE, which is strange.

block game: why I didn't realize it is the stable marriage problem, and my idea is very close on the last day?
Just found a very stupid bug in my code, I am implementing exactly the algorithm but confused in two indexing arrays. Easily ACed after that.

OK, now I can describe my approach. The problem looks a bit confusing, and in fact the statement has to be rewritten to clean up some ambiguity. Quite a few coders got AC before rewritten and I guess they are good at guessing what the problem setter has in mind because great coders think alike. The confusion mostly comes from the intent that the setter is trying to hide the trick in the problem. Or he does not want people to realize what the constraint really meant. shangjingbo is a top coder, that is no question, just the way the problem was phrased make me a bit disappointed, although I did read the problem correctly. Another evil part of the statement was that you output -1 if there is no solution. In fact, this can never happen, as the existence of a solution can be proved.

Let's read the problem first. Each boy visits each house at a different time, also a house can never have two boys at the same time. Now we need to find a mapping from boys to houses, such that after boy B locked at house H, nobody can visit house H later. The implication is that, for any boy, they cannot visit H, and any house on his schedule after H. For example, if a boy B1 visits H1, H2, ..., Hn in order of time, and H = H3. then after another boy B2 was locked to H3, boy B1 can only visit H1 and H2, houses from H3 to Hn are not available to B1 now.

At this point, it is clear that we can simulate the game. Let all boys pick up their first choice of house in their schedule. As long as there is a conflict, that is, a house is shared by two or more boys, we find such a house, then find the boy in that house with minimum time. This boy must be able to visit that house before all conflicting boys, so we move this boy to his next house. In the end, if we end up in a state that all boys get a different house, then we have a solution, otherwise there is no solution.

Now we show that there is always a solution. The reason is, the only case when there is no solution is that a boy was moved beyond his last, the n-th house. But this can happen only after all the n houses were taken. Since a boy can take only one house, the other n-1 boys cannot take all n houses. So there is always a solution.

Clearly if our algorithm stops, then it finds a solution. On the other hand, if there is a solution, then the algorithm always finds one. Intuitively it is clear, but I do not have a good proof than the one based on stable marriage problem. A high level argument is that we started with all boys take their first choice and keep them as happy as possible, any time we were forced to move a boy, we move the boy that has to do so, because it is the one that is able to visit the house before its peers hence was able to advance to its next choice.

Lessons learned:
In the final output code, I should wrote myindex[i] instead of location[i]. Two ways could possibly avoid this mistak:
1. do not use location[i] as global, then it will not be visible in the final output code part.
2. keep some note of key data structure, or be careful and take note of variable renaming.
3. have a proof of what I am trying to do, as I am not convinced about my algorithm which is a bad thing.

machine gun: I know it is bipartite matching, just my hopcroft-karp implementation keeps getting TLE. The author (ktuan) and tester (laycurse) use the same Hopcroft-Karp algorithm but they must be running within 15s limit. Also ACrush's code runs within 8s. I heard severkplus mentioned that other people have graphs twice as big as his yet they had a better max-flow implementation and passed, while his failed.

Actually there is one simple but crucial observation, the graph has 4 disconnected components, based on parity of cell index, (odd, odd), (odd, even), (even, odd), (even, even), they cannot be connected. So we are dealing with 4 graphs each with size (700/2)^2 < 130000. This will run in time.

stunning difference:
laycurse solution runs 300x300 all F graph in 0.02s
my solution runs the same graph in 23s

A stupid bug, when running multiple testcases,forgot to initialize globals, in particular, ans is not reset so it accumulates previous counts!

optimizations:
1. the graph is of 4 parts based on (odd/even, odd/even) also essential
2. each component is bipartite with (x+4, y+2) vs (x+2, y+2) also essential
3. when reset visited[][], only reset those entries that you actually set, this requires remember all nodes visited in the current DFS augment
4. map (i,j) into N1++, N2++, this one is essential
5. use +1, -1 to find neighbors, eliminate +2 then /2
6. check bounds with if expr instead of inbound() call, there are too many calls

Saturday, August 4, 2012

SRM 551

p250.
You know the optimal solution has to fix one pos unchanged. So try to fix each of the 0 to n-1 pos. For a fixed pos, search left and right to get the nearest same color letter until run out of maxSwap.
It took me a while to figure it out and my coding is rather slow.

p500.
A relatively easy 500 problem, observe that many people got 400+ points. However I could not find an algorithm during contest. This solution is due to liympanda.

We are to find how many 'Y' we need to remove to force a path from 0 to n-1. Generalize a bit, we need to find ans[0] to ans[n-1] where ans[i] is the min op to get from 0 to i. This leads to a DP + BFS solution.

First ans[0] = 0
Then we need to repeat at most n times.
In each iteration, we find min ans[i] such that i has not been used before, that is use[i] = 0
Then we use i to update all next such that color[i][next] is 'Y' and ans[next] can be improved. Notice that we also need to add an extra = k-1 if next is the k-th neighbor of i.

This is a bit like Dijkstra, each time we pull out the best node with min dist from src, then we use that to update dist of all other nodes. In the end we have dist of all nodes.

algorithms and random thoughts