algorithms and random thoughts: July 2012

Tuesday, July 31, 2012

CF 131

problem set 213

A. you have to start at some machine, then you keep finishing deg=0 job until exhausted. Now it only makes sense to go to machine curr+1 as going to the other one cost=2, while you can go to cost=1 then go for another cost=1. All remaining jobs have deg = deg-1 if some of its predecessor finished. Now the only question is which machine do you start? Since there are only 3 machines, the solution is easy, try to start from each of them, the best of the 3 start is your answer.

Wednesday, July 25, 2012

gcj round 3

A. Linearity of expectation and greedy.

B. scared by the picture.

C. I only have some idea before reading solution:
First you can binary search from 1 to M.
Second thing is to find how many delivers you need (this is ternary search but I do not know how)
Third thing is that each delivery is identical, you just find cheapest item that can last i days for i=1 to K, say a delivery can last K days. This can be done in O(N log N) using binary search.

D.

Saturday, July 21, 2012

The Hungarian algorithm: Kuhn-Munkres theorem

The Kuhn-Munkres theorem:

Given a bipartite graph with w(e) on edges, find a matching with maximum weight.

Simplified assumption, let the graph be bipartite complete, missing edges can be taken as w(e)=0.

Then we can assume optimal matching is perfect, that is, if |L|=|R|=n, then the matching M is of size n. the reason is that we can always add some edge if not perfect, and a complete bipartite graph must have a perfect matching.

Now we look for a feasible labeling on vertex, called slack, s(v), such that
s(x) + s(y) >= w(e) for e=(x,y). This ensures that sum of all s(v) is an upper bound on opt matching. Now we seek a feasible labeling and a perfect matching on the admissible edge graph, an edge is admissible if s(x) + s(y) = w(e). This prefect matching is necessarily optimal.

The algorithm starts with a feasible labeling, each v in L has s(v)=w(e), each v in R has s(v)=0, and an empty matching M.

Now find a free vertex in L, call it x0, we then look for a way to extend the matching by one more edge.

If x0 leads to an alternating path in E(admissible), then we augment. Otherwise we end up with a subset L' and R' such that L' is contained by R'. We then lower s(v) in L' and increase s(v) in R'. This keeps edges admissible if they are admissible before, and it will give us at least one more admissible edge to augment the matching.

Time complexity is O(|V|^3).

Variantions:
- The above algorithm assumes all w(e)>=0, what if we are given some w(e) <0?
Simply replace those w(e) by 0. The reason is that if those w(e) were not chosen in the solution, then we are good. If they do get included in the solution, they can only beat edges with w(e)=0, then we are fine as well since we can simply throw away the w(e) <0 that we get in the solution.

- How about finding minimum weight perfect matching?
We can use some large integer Z and replace each w(e) with Z - w(e). Now we look for maximum matching instead. Since the solution has to be a perfect matching, flip sign of w(e) does not change the optimal solution, although its value needs to be restored.

- Implementation

nodes: X = 0,1,...,N-1,
Y=N,...,2*N-1

slack[i]=0 for each i in Y
slack[j]= max w(i, j) among all i in Y
mate[i]=-1 for i in Y
mate[j]=-1 for j in X

ans=0
for i = 0 to N-1
    if mate[i]==-1 // i not matched yet
   clear vis[]
         while (update = augment(i) is false)
           // update slack[]
   delta = inf
             for each j in X that vis[j] is true
   for each i in Y that vis[i] is false
                         checkmin(delta, w(i,j) - s[j] - s[i]) // how much s[j] can decrease,
                                                      //notice that s[i] is not changed in this update
              for each j in X that vis[j] is true: slack[j] -= delta
for each i in Y that vis[i] is true: slakc[i] +=delta
clear vis[]
// after enough update, i must reach all j in X, so an augment has to occur
            ans++ // each augment increase matching size by 1

function augment(vertex s): return true if augment
    if (vis[s]) return false;
    vis[s] = true;
    for each neighbor z of s // only edges with w(e) = slack[x] + slack[y] eligible
          if (mate[z]==-1):
                  then mate[z] = s, mate[s]=z, return true
          if (augment(mate[z]) is true):
                  then mate[z]=s, mate[s]=z, return true
    return false

Observation: once a vertex is matched, it stays matched

SRM 550

Solved p250 with 161pt. My idea is to simulate the move of robot and then cut the minimum board.

p500 is still too hard for me, it takes a thorough analysis of the game. Coding itself is trivial. I am not good at that.

p1000 is a DP but with N up to 10^9, you need to be good at compressing states.

The result is not too bad for me as many people failed their solution. I am now only 50pt away from getting yellow.

n queen problem

Today interviewstreet gave a problem that to find one solution to place N queen in NxN board. In addition, a queen is allowed a knight's move.

I wrote a simple backtrack solution which solves N=32, N=35 is doable but takes maybe 30min.

There are solution for the original N queen problem solves N with time O(N^2). Given there are solution solves N up to 10^4, the solution should be applicable to this modified problem as well.

Thursday, July 12, 2012

Copy only text

A simple solution when you want only the text from a webpage or some document, everything else like style or tags, you don't care.

cat >/dev/null
then paste your text here and copy to the destination. This works only on a unix/linux terminal, though.

Maybe not related, but format c++ code, you can use a tool called astyle.

Aho–Corasick_string_matching_algorithm

codechef July Challenge: Favorite Number problem.

The algorithm finds all matching positions that matches one or more in the given set of patterns.

It implements a finite state automaton. States are 0, 1, 2, ....

goto(s, a) = s' move from state s to s', when input is letter a

fail(s, a) = s' move from state s to s', when input letter a does not match any next state

output(s) = { set of strings in pattern }

When do matching text to patterns, just walk through the trie and follow fail link when necessary, similar to Knuth-Morris-Pratt.

First we show how to use the finite state machine to process text

state = 0
for each ch in text
while goto(state, ch) == fail do, state = fail(state) // must stop at state = 0
state = goto(state, ch)
if output(state) not empty, then emit output(state)

Now we show how to construct the functions.

goto() is easy, for each keywords y[i] , i=1..k (we have k keywords)
state = 0
for each ch in y[i]
if goto(state, ch) == fail, then goto(state, ch) = cntstate++
else do nothing
append y[i] to output[state]

fail() function is constructed using a BFS on the graph. We first do depth=1 nodes, which have parent as state=0, then do it with increasing depth.

queue = empty
state = 0
// do depth=1
for each next_state of state, fail(next_state) = state, enqueue(next_state)
// do depth=d using d-1
while queue not empty do
r = queue.pop()
for each symbol ch
if goto(r, ch) = s != fail, then // s is at depth=d with parent=r
enqueue(s)
state = fail(r)
while goto(state, ch) == fail, do state = fail(state)
fail(s) = state
output(s).append(output(state))

Saturday, July 7, 2012

interviewstreet oxygencloud

Two hard problems.

Problem 1: one person game: given 8x8 grid, with each cell having a number, find a path in the grid with maximum sum. The path cannot contain zero cell.

I wrote a brute force dfs solution, which passed case 1 and 3 by sheer luck. This 20 points was also the only points I earned. Now we know the starting position has to be nonzero. A cell can go up, down, left, right, 4 neighbors. Still no idea for a solution.

Problem 2: data center. Given integer n, find the number of possible labeled spanning forest with root, 1 2, 1 2 3, 1 2 3 4, ..., 1 2 ... n. MOD = 10^9+7
Two people solved it. I don't have any idea. Looks a tough counting problem.

UPDATE:
Solution for problem 2 from chhung6.
The idea is DP with knowledge of Cayley's formula. The number of labeled rooted tree with n nodes is n^(n-2)

Let dp[N][M] be number of forests with N servers and M clients,
then N=1 we use Cayley's formula.
N=2, we can use K clients to server 1 and M-K clients to server 2, and there are (M choose K) ways to split the M clients.

In general,
dp[N][M] = sum_K=0^M mult((M choose K), dp[N-1][K], (M-K+1)^(M-K+1-2))
be careful when M-K+1=1, then (M-K+1-2) is -1.

Now that we have dp[N][M], the answer is sum_K=2 to n: dp[K][n-K]
since n could be as large as 10^9, you know you need a closed form. Then the natural step is to google the first a few terms. However, there is a catch.
Search the first a few terms of ans[2, 3, 4, 5, 6] did not return anything useful. In fact, my search in google or oeis did not return anything.

Then the next step is to search sum_K=1 to n: dp[K][n-K], note the start index is from 1 now. This gives a closed form formula in oeis,

A058128 a(1)=1, a(n)=(n^n-n)/(n-1)^2 for n >= 2.
1, 2, 6, 28, 195, 1866, 22876, 342392, 6053445, 123456790,

our answer is simply a(n) - number of spanning trees with n nodes (that is, with only 1 server)

algorithms and random thoughts