Saturday, December 31, 2011

interview street challenge

The problems are not easy and require some thought, but not terribly hard as I can solve it on my own with enough time.

One problem is, given two string P[1500], Q[1500], and integer K<=1500, find max length L such that P has a substr of length L, Q has a substr of length L and the two substr have at most K char differ.

I know Knuth-Morris-Pratt but does not help much here. As I have just read tourist's solution on TC SRM 528, binary search seems work here. If best substr has length L, then there is also a feasible length L-1. So we can binary search for the maximum length L. To check a particular L is valid or not, we can check all substr of P and Q with length L. But this might be too expensive, since there are 1500 substr from P and 1500 substr from Q. So we need to compare 1500*1500 pairs, and L might be as large as 1500. 1500^3 is too much. However we can make some observation.

After we compare P[0..L) to Q[0..L), we can get
P[1..L+1), Q[1..L+1),
P[2..L+2), Q[2..L+2),
each in amortized O(1) time.
So we only need to check 1500+1500 pairs in 1500 steps and this is quite fast. The algorithm then becomes 10^7 roughly.

Another problem is computing the expected sum. This is easy if you know linearity of expectation and use birthday paradox or balls-and-bins to compute each individual expectation and add them up. The interesting part is that I printed ans with .3lf while the problem asks 2 digits after decimal point. My solution failed because of this. And I have to change it to .2lf to make it Accepted.

The testcase is a bit weak as each problem has about 10 cases, and they did only a simple diff.

The nice thing is that you can write your solution in a number of different languages.

update: findstring can be solved with suffix array, although you cannot have vector as suffix as this uses too much memory.

update: kdifference is an easy problem as you only need to check a[i]+K and a[i]-K for each i.

permutation game is another easy one, standard combinatorial game, and since N<=15, you can use a 16-bit mask to keep track of unused numbers. dp with memorization works. Now I have 5 problems solved. The scoring system is a bit strange, people with more problems are ranked below me, maybe I had a few 50pt problems, but they are actually easy.

Rank Hacker Score Submissions Solved
79 lantimilan 364.15 12 5

update Jan 4: String similarity accepted. Although I haven't complete the proof for pref[] computation.
Proof complete.

let q[j] be the border[] of string s.
let p[i] be the pref[] of string s.

Claim 1: If q[j] hits pos=i, then pos[i] has a prefix of length l=q[j].

Claim 2: If pos=i is hit by any q[j], then the max q[j] is equal to pref[i].
Proof: if some other q[j2] overshoot pos[i] and results in a longer pref[i], then q[j] can be extended longer. The implication is that we can compute all pos hit by some q[j]. These are called essential positions. The next step is to use the essential positions as markers to compute other nonessential positions.

Claim 3: If pos[i] not covered by any marker pos[k], then pref[i]=0.
Proof: any prefix > 0 must be contained in some q[j] and all q[j] are shown in p[k], since any pos[i] covered by some q[j] must be covered by some p[k].

Claim 4: If two p[k1], p[k2] both cover pos[i], then the closer one, let it be k2, is better, that is, give a longer prefix.
Proof: every prefix from k1 is contained in the segment defined by k2.

Claim 5: the computation of p[i] for non-marker i are correct.
Proof: From induction we can assume previous entries are correct, so we are done with pref[i-k] term. For pref[k]-(i-k) term, if pref[i] goes beyond the end of pref[k], then it either end up at some q[j], which is impossible because i is non-marker, or it went into another marker k' and pref[k'] gave a longer prefix. But we already argued that if this is the case, k' must be closer to i than k, this contradicts with the choice of k being the closest marker to i from the left.

For marker it is easy: pref[i] = max q[j] such that q[j] hits pos[i], j-i+1 = q[j]

For non-markers: The formula is pref[i] = min(pref[i-k], pref[k]-(i-k))
Explanation: p[k] needs to cover pos[i], that is, k + p[k] -1 >=i, the remain length for pos[i] is pref[k]-(i-k), however, we need to observe pref[i-k] since not all segment may be used, otherwise pref[i-k] could be longer.

Friday, December 30, 2011

TC SRM 528

p250
you get length=10 for free and the only extra you get is by cutting a 20 into 10 and 10, with 1 cut you get 2 items. So sort the sticks by length and first process length with multiples of 10, then process others. I made a mistake forgot to check avail cuts >=0 and got challenged.

p500
Given a string with 'o' and 'x', find number of ways that splits into two identical substrings. string length =40.

The algorithm works by noticing that substr has length =20. So you can guess all 2^20 substr. For each of the guess, compute dp[i][j] as number of ways to match first[0..i) and second[0..j) to s[0..i+j). dp[n/2][n/2] is the answer for this guess. And ans is the sum of all guesses. Need to prune the guess with wrong number of 'o' and 'x', otherwise will TLE.

p1000
Given a[] of n=50 integers, each a[i]<=2000. P1 and P2 are two integers 50<= P1,P2 <=2000.
Find number of ways to get as many (P1,P2) pairs from a[].

Since the feasible solution is monotone. If we can make k pairs, we can also make k-1 pairs.
So we can binary search ans. low=0, high=2000*50/(50+50)=1000.

dp[n][p] is max #P2 if we use a[0..n-1] and we have #P1=p.

When update dp[i][j], we iterate all possible usage of a[i].
a[i] can offer #P1 from k=0 to a[i]/P1. the corresponding P2 offer is from 0 to (a[i]-k*P1)/P2

set all dp[i][j] to -1
dp[0][0] is 0 since we got 0 P2 if we do not use any of a[] and our P1 is 0
let i=0 to n
let j=0 to mid // if dp[i][j]>=0
let k=0 to a[i]/P1 // a[i] offer P1 as k
dp[i][j+k] updated by dp[i][j] + min(avail[i][j], mid-k) // min is due to a[i] cannot offer more than maxP1 - its own offer of P1

a valid answer is dp[n][mid] >= mid
and take the best over all mid we got the answer.

Monday, December 26, 2011

CF #99, prob 138

problem B: given an integer n, find two permutations of the two copies of n such that the sum will end with maximum number of zeros. n has at most 10^5 digits.

I read the problem wrong and was thinking the sum needs to have maximum number of zeros which could be in any position. Then the problem becomes much harder as we now have to deal with sum of tens and sum<=8 to space out the tens, in addition to the trailing block of 0s, 10, and 9s.

Spent two days on it and couldn't find a way to implement the idea. Only after reading the solution did I realize the maximum is on trailing zeros.

problem C
there are n<=10^5 trees and m=10^4 mushrooms, all in one line. each tree has position a[i], height h[i], prob fall left l[i] and probability fall right r[i], both given as an integer between 1 and 100. Each mushroom has a position b[i] and a power p[i]. Each tree may fall left or right with the given probability, a mushroom would survive if no tree hits it. A tree will hit a[i]-h[i] to a[i]-1 with probability l[i], and hit a[i]+1 to a[i]+h[i] with probability r[i]. The problem asks for the total survival mushroom power.

After some thought it seems we need to sort trees and mushrooms. Each tree can be seen as two intervals. And we need to keep track of current probability of killing given what trees are covering this position. This can be implemented as a queue. Each new start will contribute to the current probability by multiplying (1-l[i]) or (1-r[i]). Each exit will remove that contribution. Each mushroom can then be calculate using current probability. However, there is a catch. When there are many trees, the accumulated probability can be very small and then never come back due to underflow. The rescue is from the probability being represented as integral percentage. So we can keep an array of [1..100] for counts of percentages. then we merely incr/decr counts as new start or exit occurs.

Wednesday, December 21, 2011

CF down again?

received 504 gateway error.

Just found that it is back.

Finished prob 136 A-E.

next_permutation() tries to generate lexico next permutation from current. It returns FALSE if cannot find a lexico greater perm. So if you start with [1 3 2]. The perm [1 2 3] never gets generated. To be sure, first sort your vector to be lexico minimum.

problem E is about a game and you have to find an explicit strategy.

Sunday, December 18, 2011

CF #98, prob 137

I had 4 problems AC in this div2 contest and rating got up a little bit. Ranked 89 overall.

A, B are trivial problems and didn't take much thought.

C is asking how many intervals are included inside another interval. All (left,right) are distinct. It is all too natural to sort them by left, then from increasing left, use current interval to include as many follow up intervals as possible. Then move to next not included interval. The observation is that, if [a2,b2] is included in [a1,b1], then any [ai,bi] included in [a2,b2] is also included in [a1,b1] so we are not missing anything by throwing away [a2,b2]. On the other hand, any [ai,bi] not included in [a1,b1] cannot be included in any of [a2,b2].

D is a DP problem. Given a string of length 500, find a way to break it into at most k<=500 palindromes. Let dp(npal,pos) be the cost of the solution to break string [0,pos) into at most npal palindromes. The answer is then dp(k,n). The recurrence is dp(npal,pos) = min of dp(npal-1, p) + cost to make str[p,pos) a palindrome where p range from 0 to pos-1. Base case is true when pos=0 and npal>=0.

E It does not look difficult but couldn't figure out an algorithm fast enough. the problem is that given a string of length <= 10^5, find all maximum length substr with nvowel <= 2*nconsonant. It looks apparent that you need O(n) or O(nlogn) algorithm, the question is how?

After two days I finally got a solution that works. I don't know why I am so determined to solve it and I have much confidence that I will get it, although several attempts failed.

Let vow[n], con[n] be the count of vow and con from 0 to pos.
The first observation is that vow[n] and con[n] change at most 1 as we move along the string. So we can compute vow[n] and con[n] in one scan of the string.

Now consider the substr with maximum length. If it does not reach pos=0 or pos=n-1, then it is bounded inside the string. Therefore, it must be the case that nvow=2*ncon in this substring, otherwise it can get longer by expand left or right. Let the start be i and end+1 be j. we know that vow[j]-vow[i]=2*(con[j]-con[i]), rearrange it gives vow[i]-2*con[i]=vow[j]-2*con[j]. So we can build an array diff[i]=pair(vow[i]-2*con[i], i) and sort diff[]. Those with same diff[i] value will be consecutive after sorting and for elements with same diff[i], we seek the index of first and last, and currlen=last-first is a candidate for maxlen.

After that we check the maxlen if we begin from pos=0, and maxlen if we begin from pos=n-1.

After we have maxlen known, we can check all substr of length maxlen to see if they qualify as a solution, that is nvow<=2*ncon. This check can be done in O(n) time.

Overall the algorithm is O(n) time except sorting the diff[] array.

With even a simple implementation, my solution runs within 60ms in CF machine.

Approaches that does not quite work:
1. binary search maxlen. Unfortunately there is no monotone property as a valid substr of length k does not include a valid substr of length k-1.
2. start from a pos with a consonant and build a substring. The problem is that you could hit a block of vows and then a block of cons. So each pos might require O(n), which amounts to O(n^2).

Saturday, December 17, 2011

TC SRM 527

p275

It is p275 so would be a bit harder than the normal p250. The problem is:
Find a tree with n nodes that achieves maximum score. The score of a tree is the sum of score of all nodes, and the score of a node depends on its degree. You are given an array scores[0..n-1] with each element between 0 and 10000, a node with degree d would have score=scores[d-1].

Thought for quite some time but no good idea. For a tree you can try recursively remove one leaf and work on the rest. Another line of attack is to notice the total deg of a tree is 2(n-1). Enumerate a node with deg=1 to n-1 and recurse on the (n-1,d-deg). However, there is no guarantee that there is a node with deg.

Archon.JK submitted this problem within 5min!

I think I understand the problem now, after reading Archon.JK's solution.

his code looks like this
dp[1][0] = 0, u[] = [0, score[]]
for i=1 to n
for j=1 to i-1
for p=1 to n-1
for q=0 to n-2
dp[i][p] = max(dp[i][p], dp[j][p-1]+dp[i-j][q]-u[p-1]-u[q]+u[p]+u[q+1])

and the answer is max of dp[n][0] to dp[n][n-1]

dp[i][j] means best score for a tree with i nodes and one of them has deg=j
The idea is to split a tree with i nodes and a node with deg=p into two subtrees, one with j nodes and the other with i-j nodes, split at deg=p, so one subtree now has a node with deg=p-1 and the other one has deg=q. After merge they become deg=p and deg=q+1.

One possible improvement is to split a tree at a leaf, at node with deg=d. The code looks like this

dp[1][0]=0;
for i=2 to n
for d=1 to i-1
dp[i][d] = max( , dp[i-1][d-1]+dp[1][0]-u[d-1]-u[0]+u[d]+u[1])
we also need to update dp[i][1] since we now have a deg=1 node
dp[i][1] = max( , dp[i][d])

Now the answer is dp[n][1]

// As I was writing this one, I realized how difficult it is to write a solution. That explains why so few people would bother to write solution or summary and even for those who do, it gets delayed or incomplete

p450
Given the rows of a matrix, in order from 0 to m-1. Each row is a string of "0,1,?"
the columns are given as well, in random order. The problem is to find a matrix that agrees with both given rows and columns, and among all feasible ones, find the lexico smallest matrix. No more than 30 rows and 30 columns.

No good idea. As usual I have no idea on level 2 problem.

OK, lyrically's solution makes sense now.
Since we want lexico minimum solution, for each pos, from row 0 to m-1, col 0 to n-1, if cell is '?', we try '0', if the resulted subproblem has a feasible solution, then we fix this '?' to '0', otherwise we put a '1' here.

The subproblem is a bipartite matching problem, cardinality only, without weights. We can construct col[0] to col[n-1] from the rows. If there is a perfect match to the permuted columns, then we have a feasible solution.

Actually, for TC problems, many times if it looks like you need to try all permutations, there is a good chance that you can solve it with bipartite matching.

TODO: both topcoder.com and arena are down currently, need to verify solution.

p250 and p500 PASSED.
Two bugs in implementation of augment, one is that left is 0..n-1 and right is n..2n-1. and when augment, if from left to right, try all neighbors, if right to left, need to follow the matched vertex on the left.

Currently use BFS. The code looks rather ad hoc.

Wednesday, December 14, 2011

CF #96 div2 problem 133

prob133 D

Not a difficult problem since there are only 2500*4*2=10^4 configurations and just iterate them all, but made several mistakes.

1. Didn't realize when CP flips, the current pixel/cell also changes.
2. wrote something like pair.second.first=xxx; and then pair.second.first=xxx again, and I meant to say pair.second.second.

prob133 E

The problem gives a string with T or F, T means turn around and F means forward one step. A turtle moves in a 1D line and you must change exactly 'change' symbols in the command string, with the goal to maximize the distance of the final position.

After checking greedy and DP, it seems you can do DFS to check all valid configurations, (pos, index, change, dir). And there are only 10^6 confs. One catch is that change-2*i can be used.


Now one question:

How to suppress the gcc warning to gets function should not be used. Seems it comes from the linker.

Tuesday, December 6, 2011

TC SRM 526

p250 is an easy one and the only one that I submitted. However a mistake in
ducks[c]=1 instead of the correct ducks[r] = 1 cost all points. Now back to 1254, close to div2 now.

p500 is a combinatorial game and seems not difficult, however I cannot get the last sample case to pass. The problem is about a game. Two players, A and B, remove pebbles from a pile with n pebbles initially. A moves first. A can remove between 1 to K pebbles from the pile, but the remaining number must be a prime. B can remove between 1 to K pebbles, but the remaining number must be a composite (>=2 and not prime). Now decide the value of the game. If a player wins, it tries to win with minimum steps, if lose, it tries to maximize steps. The natural way is to calc(N,turn) to check whether the current turn wins. However each move needs to check K possible numbers and there are 2N configurations, for a total of 10^6^2 = 10^12, which will certainly TLE. But there has to be something about primes, can you check less configurations or check each configuration faster?

Monday, November 28, 2011

lprm cancel all print jobs

lprm -Pprintername -
to cancel all printing jobs

lpq -Pprintername
to view jobs

linux shell process filename from a text file

In my TA work I need to do the following:

Given a text file called names.txt, which contains all student ids:
aaa
bbb
ccc
ddd

Now I need to grep all files with a prefix matching one of the ids in names.txt, that is, I need to get

aaa.cpp
bbb.cpp
ccc.cpp

but the directory might contain a lot more other *.cpp

The first blessing is from grep

ls *.cpp | grep -f names.cpp > filenames.txt
gives the filenames that matching any entry in names.cpp

Now comes the hard part, how do you use those entries.

ls or find comes into mind but I don't know how to use them with input filename from a text file.

So it is awk.

awk '{ cmd="ls -l " $0; system(cmd) }' < filenames.txt

And that's it.

Tuesday, November 22, 2011

the FKG inequality and facility location problem

 Spent one day working on Sviridenko's paper on one inequality and finally see why it is   
 true.  
 Let y1,y2,...y_m be the probabilities that each facility open, sum of the y_i's is 1. Now   
 the expected distance is  
 dist=  
 c1*y1 + c2*y2*(1-y1) + c3*y3*(1-y1)(1-y2) + ... + cm*ym*(1-y1)...(1-y_m-1) + c_m+1*(1-  
 y1)...(1-ym)   
 and we want to show  
 dist <= (c1*y1+c2*y2+...+cm*ym)(1-1/e) + c_m+1 * 1/e  
 We know that y1+...+ym=1  
 Sviridenko noticed that  
 dist <= c1*(1-e^-y1) + c2*(e^-y1 - e^-(y1+y2)) + c3*(e^(-y1-y2)-e^(-y1-y2-y3)) + ... + cm*  
 (e^(-y1-...-y_m-1) - e^(-y1-...-ym)) + c_m+1*(1/e)=rhs  
 This is because c1<=c2<=...<=cm<=c_m+1  
 and the fraction of dist is always leading the rhs.  
 y1 >= 1-e^-y1  
 y1+y2(1-y1) >= 1-e^(-y1-y2)  
 Now we want to show rhs <= (c1*y1+c2*y2+...+cm*ym)(1-1/e) + c_m+1 * 1/e  
 and this is the part that took me a full day.  
 We actually use a similar idea. Notice that the sum of fractions are the same on both sides,  
 1-e^-y1 + e^-y1 - e^(-y1-y2) + ... + e^(-y1-y2-...-y_m-1) - e^(-y1-y2-...-y_m) = 1-e^-1  
 and  
 (y1+y2+...+ym)(1-1/e)=1-1/e  
 So we only need to show that the fraction of lhs is always leading.  
 1-e^-y1 >= y1(1-1/e)  
 1-e^-y1-y2 >= (y1+y2)(1-1/e)  
 ...  
 1-e^-y1-y2-...-ym >= (y1+y2+...+ym)(1-1/e)  
 This can be proved by noticing the function f(x)=1-(e^-x)-x(1-1/e) evaluates to 0 at x=0 and   
 x=1, and in between it is always nonnegative. And y1, y1+y2, ... , y1+...+ym area all   
 between 0 and 1.  
 A slightly different proof due to Byrka uses FKG inequality  
 (u1*f1+u2*f2+...+um*fm)(u1*g1+u2*g2+...+um*gm) >= (u1+...+um)  
 (u1*f1*g1+u2*f2*g2+...+um*fm*gm)  
 assuming u1,u2,...,um >= 0 and f is increasing and g is decreasing.  
 Now we use   
 u1=y1  f1=c1  g1=1  
 u2=y2  f2=c2  g2=1-y1  
 u3=y3  f3=c3  g3=(1-y1)(1-y2)  
 The FKG inequality gives  
 (y1+...+ym)(y1*c1*1 + y2*c2*(1-y1) + ... + ym*cm*(1-y1)...(1-y_m-1))   
 <= (y1*c1+y2*c2+...+ym*cm)(y1*1+y2(1-y1)+...+ym(1-y1)(1-y2)...(1-y_m-1))  
 Notice this part (y1*1+y2(1-y1)+...+ym(1-y1)(1-y2)...(1-y_m-1)) = 1-(1-y1)...(1-ym) since   
 they are complementary events. lhs is one of y1,...,ym happen, rhs is one minus none happen.  
 Therefore  
 (y1*c1*1 + y2*c2*(1-y1) + ... + ym*cm*(1-y1)...(1-y_m-1))   
 <= (y1*c1+y2*c2+...+ym*cm)(1-(1-y1)(1-y2)...(1-ym)  
 Add (1-y1)...(1-ym)c_m+1 to both sides we have  
 expected dist <= (c1*y1+c2*y2+...+cm*ym)(1-(1-y1)...(1-ym)) + c_m+1(1-y1)...(1-ym)  
 <= (c1*y1+...+cm*ym)(1-1/e) + c_m+1*1/e since c1<=c2<=...<=cm<=c_m+1  

Monday, November 21, 2011

TC SRM 524

p250
notice that if n is prime, then do it 2 times, one with n-1 and one with 1. if n is not prime, then do it only 1 time. There are two special cases, when n is 1, do it only 1 time, and when n is 3, need three times. The reason is that any prime above 3 has n-1 being nonprime while 3-1=2 is prime.

p500: unsolved
if all c[i]>0 or all c[i]<0, then answer is infinity. some upper bound, length <= lcm(c+, c-) since the smallest abs c[i]>0 and c[i]<0, their lcm will be both >0 and <0, which is impossible.
some lower bound is min(abs(c[i]))-1, then no restrictions apply and the sequence is always good.

p1000:

Friday, November 4, 2011

dell cannot display this video mode

Found an old dell dimension 4500 from some corner of the lab and decide to put up a Windows machine with office because the department machines are all linux and occasionally I do need windows and office to communicate with the rest of the world.

After installing xp3 and all the updates, and sophos, the monitor failed to display after reboot. Type the title into google search pops a lot webpages, so other people are seeing this as well.

The following works for me:

1. reboot machine and into safe mode.
2. in safe mode, control panel->computer administrative->device driver->uninstall display adapter driver
3. reboot into normal mode

C traps and pitfalls by Andrew Koenig

int a[] = { 1, 2, 3, };
It is legal to have a trailing comma after the last literal, namely 3 here.

because a[i] is the same as *(a+i), it is also the same as i[a], surprising, but true.

Wednesday, November 2, 2011

how to print man page from linux

Some time it is necessary to print a Linux man or info page. To print a ls command man page type the following command:

$ man ls | col -b | lpr -P hp1_floor2
Where,

col -b : To format man page output, it will not show any special character such as backspaces
lpr -P hp1_floor2 : hp1_floor2 is the printer name of the Linux/UNIX printer queue

To print the man page with effects such as italic and bold fonts on the HP Laser Jet printer, enter:
$ zcat /usr/share/man/man1/ls.1.gz | groff -man -Tps | lpr -P hp1_floor2

You can save man page as a ps file, enter:
zcat /usr/share/man/man1/ls.1.gz | groff -man -Tps >top.ps

Where,

zcat : Open compressed /usr/share/man/man1/ls.1.gz file
groff : Go through groff document formatting system, which is responsible for formatting of man page with effects such as italic and bold fonts on both screen and printer
lpr -P hp1_floor2 : hp1_floor2 is the printer name of the Linux/UNIX printer queue

Here is a shell script that we have installed on all Linux workstations, it saves man page in a text file and later user can print it using lpr command.

#!/bin/bash
# Linux shell script to save man pages to text file
# so that later user can print them
# Created by nixcraft , under GPL v2.0 or above.
# ----------------------------------------------------------------------
CMDS="$@"
OUTDIR="/tmp/$USER.man"
[ ! -d "$OUTDIR" ] && mkdir -p "$OUTDIR" :
for i in $CMDS
do
man "${i}" col -b > "$OUTDIR/${i}.txt"
done
echo "**********************************************************"
echo "All man pages saved to $OUTDIR directory"
echo "Just goto \"$OUTDIR\" directory and type the following command
to print man page(s):"
echo "lpr *.txt"
echo "**********************************************************"
exit 0


Just type the script-name followed by Linux commands, for example to print a man page of ls and bash type
$ saveman "ls bash"

Wednesday, October 26, 2011

4942 - Zombie Blast!

A problem from south cal regional, slightly harder.

At first I thought it might require something like voronoi diagram since the straightforward solution is to find the nearest mine for each zombie. However I have never implemented voronoi diagram and the O(nlogn) time algorithm to construct voronoi diagram seems tricky. Looking at the contest, several teams solved it so there has to be other ways as not many people have implemented voronoi diagram.

Looking at the grid column by column, we know that for each cell in a given column j, for each row i we need only remember the nearest M from left, if our nearest M at row k is from left. We can also find the nearest M from the right. The better of the two is our nearest M for cell(i,j) at row k. So for each column j=0 to w-1, we can compute an array of near[0..h-1] such that near[k] is dist from the nearest M at row k. We can compute this for all columns j, so we have a matrix near[0..h-1][0..w-1],

near[i][j] is the column difference from nearest M at row i to column j.
It takes only O(n^2) time to compute near[][] so we are good so far.

Now we binary search the smallest radius that covers all zombies. The max is w+h, min is 0. And since w<=2000 and h<=2000, for a relative error of 1.0e-6, we need only iterate 30 times since 2^30=10^9. Now we deal with each radius. Given a radius, we need to check whether it is enough to cover all zombies. Again we look at each column. If some M is to cover anything in this column, it has to be a near[i][j]. So for each near[k][j], k=0 to h-1, we calculate the interval that near[k][j] covers at column[j], and we can use a vector to keep track of nonoverlapping intervals. Once we got all intervals for column[j], we can scan each cell in column[j] to see if every zombie in this column is covered. This way we spend only O(n) time on each column.

In the end we have a decision to make, more iteration gives better precision but might TLE. Tried 20 and 25, both got WA, so 30 is the next thing to try and got AC finally.

In the original problem number of zombie and mines are limited to 10^5 so might be slightly easier to deal with.

Tuesday, October 25, 2011

TC SRM 521

p500

n<=40 points on xy-plane, find number of subsets that fit in nxn square, where nlow <= n <= nhigh.

Both nlow and nhigh can be 10^8, as well as x[i] and y[i]. But notice that any subset of points has xmin, xmax, ymin, ymax, and that's only n^4 choices of them. For each choice we have a rectangle and we need to test if it can be fit into a square of size between nlow and nhigh. If it does, we then test each point and collect all points inside this square. We can record all subsets using long long bitmask since n<64. To fit a rectangle into a square, we need to limit rectangle never go off the limit of prev of min and next of max, and putting xval and yval into a set<int> helps.

However, I keep forgetting that the rectangle needs to fit into a square and then the sides of a square is equal. So if x2-x1 and y2-y1 are different, the side is at least max of the two.

Wednesday, October 19, 2011

kd tree

Seems kd-tree is one way to solve this problem
2010/2011 SOUTHERN CALIFORNIA REGIONAL
ACM INTERNATIONAL COLLEGIATE PROGRAMMING CONTEST
Problem 7
Zombie Blast!

http://socalcontest.org/history/2010/socal2010.pdf

http://homes.ieu.edu.tr/~hakcan/projects/kdtree/kdTree.html

More info about kd-trees can be found in book :

M. de Berg, M.van Kreveld, M. Overmars, O.Schwarzkopf, "Computational Geometry (Algorithms and Applications) ", Springer, 1998.

Friday, October 14, 2011

CF #90

Being lucky that this one is unrated due to author's solution wrong in problem B.

This time I used an unusual strategy and started with problem C. A and B turned out to be trivial enough later so my estimate is correct. B is still tricky in a case when all cards are present in the q cards.

C is a DP problem. Work from day 0 to day n-1, try each hw between a[i] and b[i], and update next state. A state is (hw,day,last), means at that day, use a[last] and b[last], select hw as assignment. Then next state is (hw+k,day+1,l) or (hw*k,day+1,l) for l=last+1 to m-1, assume we sort assignments by increasing c[i]. The problem is made more complicated because we need to return the actual path, so for each state, we need to remember not only total hw so far, but also prev subject, and prev hw. My solution contained 3 mistakes:
1. when declare subject, should use long long for a and b, instead I used int.
2. day is off-by-one when reconstruct path, this should be easy to caught from the testcase, but I simply could not see it.
3. when try nexthw, we have only two options, hw+k and hw*k. I was instead looping over all a[next] to b[next], thinking that the constraints are small enough. However, they are just big enough to make 10^9 which gives a TLE.

After fixing these 3, got AC for problem C.

problem A and B can be done in 10min.

Friday, October 7, 2011

SCC in linear time

This is about computing strongly connected components in a directed graph. It is well-known that two DFS can do it in linear time.

The idea is to look at the DAG with each SCC collapsed into one node. Now if we do a DFS and if somehow we can start with a component with no outgoing edge, then we are good because all reachable nodes are in the same component. To this end we can do a DFS and we know that components finished last has no incoming edges. Why, an incoming edge cannot come from its ancestor because ancestors always finish after their descendants. This excludes tree edges and forward edges. The incoming edge cannot be a back edge because then the ancestor and descendant will be in the same component. So it can only be a cross edge from right to left. But then a component in the left always finishes before a right one so this cannot happen.

To conclude, a component finished last cannot have incoming edges. In particular, the component containing the root of the DFS tree finished last. Now if we reverse all the edges, then the component finished last does NOT have outgoing edges and every node in the same component is still reachable from each other. Put it another way, G and G(rev) has the same SCC. Now we just go through each node in reverse order of finish time and do a DFS to collect all nodes in the same component.

Thursday, October 6, 2011

SRM 520 div2

Haven't been coding for a while so I picked SRM 520 div2 for a warm up. It seems topcoder's div2 problem is becoming easier, while I still haven't had a p500 in div1 yet.

p250 and p500 have very small instances so you can brute-force by enumerating all possible solutions.

p1000 is a very straightforward DP, just observe that you can work row by row and only need to remember previous row's pass and challenged counts. However, my solution simply won't pass the last testcase, and it is not for integer overflow. The real issue? I created DP array dp[55][4][4], which ought to be enough, but I made the first recursive call with (0,10,0), which index something dp[0][10][0]. And this is a disaster. After staring at the code for quite some time and printed some intermediate results, still no idea where went wrong. Looking at a few submitted solution and they have implemented the same thing. And finally, I realized that the first call recurse(0,10,0) is the culprit, and it passed.

OK, in the end I am happy to see all three problems passed.

Friday, September 30, 2011

http download without a browser

I was trying to download some journal article but library's connect from home feature seems having a problem. OK, I can ssh into my machine at school, but my Windows laptop does not give an easy way to launch X. How do I use only terminal to emulate a click on download pdf? At this point wget comes into mind.

The first attempt, wget ..., returns HTTP 403 refused.

Maybe the webserver just don't like wget?

a google search returns something helpful. Of course wget is on the not wanted list, but you can pretend you are a browser.

wget -U firefox http://.../full.pdf

This works.

Monday, September 26, 2011

unix grep

http://www.thegeekstuff.com/2011/01/advanced-regular-expressions-in-grep-command-with-10-examples-%E2%80%93-part-ii/

In this article, let us review some advanced regular expression with examples.
Example 1. OR Operation (|)

Pipe character (|) in grep is used to specify that either of two whole subexpressions occur in a position. “subexpression1|subexpression2″ matches either subexpression1 or subexpression2.

The following example will remove three various kind of comment lines in a file using OR in a grep command.

First, create a sample file called “comments”.

$ cat comments
This file shows the comment character in various programming/scripting languages
### Perl / shell scripting
If the Line starts with single hash symbol,
then its a comment in Perl and shell scripting.
' VB Scripting comment
The line should start with a single quote to comment in VB scripting.
// C programming single line comment.
Double slashes in the beginning of the line for single line comment in C.

The file called “comments” has perl,VB script and C programming comment lines. Now the following grep command searches for the line which does not start with # or single quote (‘) or double front slashes (//).

$ grep -v "^#\|^'\|^\/\/" comments
This file shows the comment character in various programming/scripting languages
If the Line starts with single hash symbol,
then its a comment in Perl and shell scripting.
The line should start with a single quote to comment in VB scripting.
Double slashes in the beginning of the line for single line comment in C.

Example 2. Character class expression

As we have seen in our previous regex article example 9, list of characters can be mentioned with in the square brackets to match only one out of several characters. Grep command supports some special character classes that denote certain common ranges. Few of them are listed here. Refer man page of grep to know various character class expressions.

[:digit:] Only the digits 0 to 9
[:alnum:] Any alphanumeric character 0 to 9 OR A to Z or a to z.
[:alpha:] Any alpha character A to Z or a to z.
[:blank:] Space and TAB characters only.

These are always used inside square brackets in the form [[:digit:]]. Now let us grep all the process Ids of ntpd daemon process using appropriate character class expression.

$ grep -e "ntpd\[[[:digit:]]\+\]" /var/log/messages.4
Oct 28 11:42:20 gstuff1 ntpd[2241]: synchronized to LOCAL(0), stratum 10
Oct 28 11:42:20 gstuff1 ntpd[2241]: synchronized to 15.11.13.123, stratum 3
Oct 28 12:33:31 gstuff1 ntpd[2241]: synchronized to LOCAL(0), stratum 10
Oct 28 12:50:46 gstuff1 ntpd[2241]: synchronized to 15.11.13.123, stratum 3
Oct 29 07:55:29 gstuff1 ntpd[2241]: time reset -0.180737 s

Example 3. M to N occurences ({m,n})

A regular expression followed by {m,n} indicates that the preceding item is matched at least m times, but not more than n times. The values of m and n must be non-negative and smaller than 255.

The following example prints the line if its in the range of 0 to 99999.

$ cat number
12
12345
123456
19816282

$ grep "^[0-9]\{1,5\}$" number
12
12345

The file called “number” has the list of numbers, the above grep command matches only the number which 1 (minimum is 0) to 5 digits (maximum 99999).

Note: For basic grep command examples, read 15 Practical Grep Command Examples.
Example 4. Exact M occurence ({m})

A Regular expression followed by {m} matches exactly m occurences of the preceding expression. The following grep command will display only the number which has 5 digits.

$ grep "^[0-9]\{5\}$" number
12345

Example 5. M or more occurences ({m,})

A Regular expression followed by {m,} matches m or more occurences of the preceding expression. The following grep command will display the number which has 5 or more digits.

$ grep "[0-9]\{5,\}" number
12345
123456
19816282

Note: Did you know that you can use bzgrep command to search for a string or a pattern (regular expression) on bzip2 compressed files.
Example 6. Word boundary (\b)

\b is to match for a word boundary. \b matches any character(s) at the beginning (\bxx) and/or end (xx\b) of a word, thus \bthe\b will find the but not thet, but \bthe will find they.

# grep -i "\bthe\b" comments
This file shows the comment character in various programming/scripting languages
If the Line starts with single hash symbol,
The line should start with a single quote to comment in VB scripting.
Double slashes in the beginning of the line for single line comment in C.

Example 7. Back references (\n)

Grouping the expressions for further use is available in grep through back-references. For ex, \([0-9]\)\1 matches two digit number in which both the digits are same number like 11,22,33 etc.,

# grep -e '^\(abc\)\1$'
abc
abcabc
abcabc

In the above grep command, it accepts the input the STDIN. when it reads the input “abc” it didnt match, The line “abcabc” matches with the given expression so it prints. If you want to use Extended regular expression its always preferred to use egrep command. grep with -e option also works like egrep, but you have to escape the special characters like paranthesis.

Note: You can also use zgrep command to to search inside a compressed gz file.
Example 8. Match the pattern “Object Oriented”

So far we have seen different tips in grep command, Now using those tips, let us match “object oriented” in various formats.

$ grep "OO\|\([oO]bject\( \|\-\)[oO]riented\)"

The above grep command matches the “OO”, “object oriented”, “Object-oriented” and etc.,
Example 9. Print the line “vowel singlecharacter samevowel”

The following grep command print all lines containing a vowel (a, e, i, o, or u) followed by a single character followed by the same vowel again. Thus, it will find eve or adam but not vera.

$ cat input
evening
adam
vera

$ grep "\([aeiou]\).\1" input
evening
adam

Example 10. Valid IP address

The following grep command matches only valid IP address.

$ cat input
15.12.141.121
255.255.255
255.255.255.255
256.125.124.124

$ egrep '\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?\.){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)' input
15.12.141.121
255.255.255.255

In the regular expression given above, there are different conditions. These conditioned matches should occur three times and one more class is mentioned separately.

If it starts with 25, next number should be 0 to 5 (250 to 255)
If it starts with 2, next number could be 0-4 followed by 0-9 (200 to 249)
zero occurence of 0 or 1, 0-9, then zero occurence of any number between 0-9 (0 to 199)
Then dot character

Saturday, September 24, 2011

CF problem 8D

http://codeforces.com/contest/8/problem/D

If t2 is big enough for Bob to visit shop before go home, then ans=dist(cinema,shop,house)+t1

Otherwise we look for a point p such that
for Alan, d(cinema,p) + d(p,shop) + d(shop,house) <= d(cinema,shop,house)+t1
for Bob, d(cinema,p) + d(p,house) <= d(cinema,house)+t2

The trajectory for Alan is an ellipse and the same is true for Bob. So we need to find the intersection of two ellipses. This seems a bit difficult to program.

Friday, September 23, 2011

TC SRM 519

p600 RequiredSubstrings

you are given a vector of string as words, and C<=6, L<=50, each word in words has length <=50. You are to find out the number of strings with length=L and contains exactly C words as substring.

To build a dp solution, you need to remember a state. Clearly you work from len=1 to L, and you need to remember what subset of words you already seen, this is only 1<<6 = 128 so no problem. But you need more than that. Ideally you want all possible prefix of length k-1, when you work on len=k, but this is too much. So you need to cut down your state. Since to have a valid word in len=k, you can separate words already in len=k-1 and words end with the kth char. To get a word end at kth char, you need to have a prefix of some words. OK, so you remember the longest suffix of your string which is a prefix of some word. char appears before that you don't care. Now your state is
dp[len][index_in_prefix][bitset_of_words_contained]

So you built an array prefix[] containing all prefixes of all words including empty string, remove duplicates. Then for each prefix and each char=a_to_z, you construct two arrays, go[p][c] is the index in prefix of the longest suffix of string prefix[p]+char(c), cover[p][c] is the bitset of words covered by string prefix[p]+char(c).

Notice that you can build a trie for all the prefix for fast check whether a given string is a prefix.

CF #88

Solved A, WA on B, TLE on C.

A. A simulation problem, since t_i <= 10^8, you can run time from 0 to 10^8, then solve each person when their t_i matches current time. Of course you need to sort them by their t_i so that you can get them in O(1) time. B. Recognizing constraints. a and b can both be 10^9 so enumerating even one of them is a bad idea. But mod is only 10^7. So you can try all possible numbers for the first person, i=0 to min(a, mod-1), since if some other choice, say k makes first win, then k%mod would make it win as well. The first wins if i*10^9 + j !=0 % mod for all j=0 to b. In other words -i*10^9 % mod >= b+1
Last catch is integer overflow when you do the multiplication.

C. Tournament. First thing is to realize that brute force will not work. Checking all path of length 2 would be n^3 and n=5000, so you need something better. For tournament, wiki tells you that if a tournament has a cycle, then it has a cycle of length 3. And draw a picture you will see a ways to find the length-3 cycle. Now the rest is easy. Do a DFS or BFS, find any cycle, then find a length-3 cycle out of that cycle.
For DFS you need to remember parent[node] for each node to be able to reconstruct the cycle.

Last but not least, the input can be huge, as large as 2.5MB. So cin will kill you. Use scanf instead.

D. TODO

E. TODO

Summary: When div1 and div2 are together in the same round, the problem is usually easier.
rank 470, rating 1669 (-6)

Thursday, September 22, 2011

count bits

In GCC
__builtin_popcount(k)

In MSVC
unsigned short __popcnt16(
unsigned short value
);
unsigned int __popcnt(
unsigned int value
);
unsigned __int64 __popcnt64(
unsigned __int64 value
);

#include <iostream> 
#include <intrin.h> 
using namespace std; 

int main() 
{
  unsigned short us[3] = {0, 0xFF, 0xFFFF};
  unsigned short usr;
  unsigned int   ui[4] = {0, 0xFF, 0xFFFF, 0xFFFFFFFF};
  unsigned int   uir;

  for (int i=0; i<3; i++) {
    usr = __popcnt16(us[i]);
    cout << "__popcnt16(0x" << hex << us[i] << ") = " << dec << usr << endl;
  }

  for (int i=0; i<4; i++) {
    uir = __popcnt(ui[i]);
    cout << "__popcnt(0x" << hex << ui[i] << ") = " << dec << uir << endl;
  }
}

Thursday, September 15, 2011

CF #87

A - Party
Find longest chain in a DAG, and each node has at most one predecessor. A simple DP in topological order.
Passed.

B - Lawnmower
This is an implementation problem. When work on row[i], assume you are moving right, then you have to arrive right[i]. Then you need to look at next nonempty row[k], if row[k] is moving right, then you have to move to left[k], otherwise row[k] is moving left and you have to move to right[k]. You also need to count in the steps moving between rows. And you stop when you have visited all 'W' cells. Notice that you do NOT have to end at last row.

The DP idea below got TLE.

It appears that your move is essentially fixed as you can only go down and in each row you can only move to one direction, left or right. But then you can have several empty rows in between and this will affect your move. So another DP problem. dp[row][first][last] is the number of moves you need to complete current row to last row. If you have some W cells outside [first,last], then it is infeasible and dp[][][] is INF. However during contest forgot to check outside last and didn't finish.

D - Unambiguous Arithmetic Expression
Hacked as TLE. My solution need 1000^3=10^9 and seems too slow. On local machine, the hacking case
1+1+1+...+1 runs in 26s.

Didn't get a good idea about C and E.

rating: 1711 to 1675, almost into div2.

C - Plumber I have to read the tutorial to get the problem. It looks you should arrange from top to bottom, from left to right. But the constraint is huge, each cell has 4 possible orientations and you have 10^5 cells. Even try to enumerate one row seems prohibitive. Now is the catch. When the constraints seem too big, any DP strategy destined to doom, there is a simple formula to count. If it is only 1D, then everybody sees it. Each plumber is either left or right, and once the first one is determined, then every cell will be determined in the same row. So you have the solution now. Each row, and each column, has to have plumbers alternating. That is, for a row, if the first one is chosen to be left, then 2nd one is right, and 3rd one is left and 4th one is right and so on. So you just check 2 possible orientations for each row. If both are good, then your ans will be multiplied by 2, if only one is good, then your ans does not change, if neither is good, you know your ans is zero.

Wednesday, September 14, 2011

declare var to the wrong type

I got caught by this issue when working on CF #82 div2, problem D, Treasure Island. It seems a straightforward problem and I implemented the naive idea. This will TLE but this is not my problem at this point. I got WA for testcase #38. However the testcase is too big to be shown fully on the submission page. I looked at my code, and a few AC code. Looks the only difference is that they precomputed dp[4][1000][1000], the max distance you can travel from cell[i][j] with direction[x]. This will speed up considerably, from 10^9 to 10^6. I know, but I got WA, not TLE. So what went wrong here.

Looking at my code, tried a couple of different things. Changing scanf to cin, change char[] to string, change [] to vector. None helps. Tried to generate some random input and run my code as well as some AC code. They agree. Plus to generate a nontrivial testcase for this problem is not that easy. I almost gave up and wrote a message to the problem setter for the complete testcase. Then something just kept pulling me back to the problem. Fix it.

My last resort was to hack the testcase. I tried to output some debugging information for that specific case. For CF it is not easy. I have to get around earlier testcases before reaching the testcase #38. Once I was there, something apparently wrong surfaces. Whenever inst.len > 128, the step went wrong. I still don't get it and switched between the naive impl and the speed-up version. Finally, I realized something. And that was it. I wrote
char dir=inst.dir, len=inst.len;
and then len is of type char and happens to be at most 128 on CF machine.

That was it.

Integer overflow are among the hardest bugs to ferret out. And when it bites, it hurts greatly.

Tuesday, September 13, 2011

TC SRM 517 div2 p1000 CuttingGrass

// srm517div2p1000.cpp
//
// 1. each pos is cut <= 1 times
// if not, then can postpone the first cut to merge with the second cut, this
// results in a better configuration
//
// 2. once a subset of pos are cut, then the best way is to order cuts by
// increasing grow[pos]
//
// The DP solution is to calc(int day, int pos, int T), the total height
// for grass[pos..N-1] after we decide cuts up to day.
// and we try all T=1,...,N
//
// another solution is to use max-weight-matching, due to idy
// L is grass set, R is day set
// w[l][r] = height that got cut if we cut grass[l] at day[r]
// try all T=1,...,N
// find a max-weight-matching of size T, then the total height is
// sum init[i]+grow[i]*T - matching
//
#include <algorithm>
#include <cassert>
#include <cstring>
#include <iostream>
#include <vector>
using namespace std;

struct Grass
{
    int grow, init;
};

bool operator<(const Grass &g1, const Grass &g2)
{
    return g1.grow < g2.grow;
}

int N;
vector<Grass> G;
int dp[51][51]; // day,pos

class CuttingGrass
{
public:
int theMin(vector<int> init, vector<int> grow, int H);
};

int calc(int day, int pos, int T)
{
    if (pos>=N) return 0;
    int &ans=dp[day][pos];
    if (ans>=0) return ans;
    ans = G[pos].init+G[pos].grow*T + calc(day,pos+1,T);// no cut on pos
    if (day<T) {
        int tmp = G[pos].grow*(T-day-1) + calc(day+1,pos+1,T); // cut on pos
        ans=min(ans,tmp);
    }
    return ans;
}

int CuttingGrass::theMin(vector<int> init, vector<int> grow, int H)
{
    N=int(init.size());
    int sum=0;
    for(int i=0; i<N; ++i) sum+=init[i];
    if (sum<=H) return 0;
    G.resize(N);
    for(int i=0; i<N; ++i) { Grass g={grow[i],init[i]}; G[i]=g; }
    sort(G.begin(), G.end());
    for(int t=1; t<=N; ++t)
    {
        memset(dp,-1,sizeof dp);
        if (calc(0,0,t)<=H) return t;
    }
    return -1;
}

Friday, September 9, 2011

CF #85

Problem C.
Notice that nxm <= 40, so min(n,m)<=6, suppose n<=6, at most 6 columns;
for each row, try 2^6 ways to put spiders.
use dp with memorization to solve(m,prev,mask), prev is the bitset of row[r-1], mask is the cells in previous row that not covered yet, they have to be covered by current row.

Thursday, September 8, 2011

CF #86

Failed all submissions,
http://codeforces.ru/contest/113/problem/A
problem A, WA. Passed A offline. Too many places went wrong. Need to be more careful. The problem is straightforward.

1. check each word to be valid
2. if only one word, return true
3. get type=0 (adj), then type=1(noun), then type=2(verb)
if there are words remaining, return false
if both genders appear, return false
if count of type[1] is not 1, return false

return true if none of above apply.!


problem B, hacked
The problem asks for all distinct substring that has a given prefix and a given suffix. SPOJ has a similar problem asks for the number of distinct substr, with n=1000.

One idea is to use suffix array, and count each l=1,2,...,n distinct prefix. However this seems too slow. My implementation accepted when using MSVC++, but TLE in GNU C++ 4.6. Shocking enough, there are solutions run in 10ms, while mine is 1.40s under MSVC++.

A faulty implementation of naive string matching. A correct one is:

n=len(text); m=len(patt)
vector<int> ans;
for i=0 to n-1
{
  for j=0 to m-1
    if (i+j >= n || text[i+j] != patt[j]) break;
  if (j>=m) ans.push_back(i); // found a match
}
return ans;

problem C, know will TLE and it did.
You want to use Sieve of Eratosthenes but memory is an issue because N is 3x10^8. However bitset or vector<bool> fits in memory limit. Also optimizations for the sieve helps, when at elem=p, start marking at p^2 instead of 2p, and take increment of 2p instead of p.

The following number theory result might also help.

Fermat's theorem on sums of two squares
From Wikipedia, the free encyclopedia

In additive number theory, Pierre de Fermat's theorem on sums of two squares states that an odd prime p is expressible as
p = x^2 + y^2
with x and y integers, if and only if
p = 1 mod 4.

For the prime sieve method, a recent result by
Jonathan P. Sorenson, The Pseudosquares Prime Sieve
might also be of interest.

Tuesday, September 6, 2011

install package for latex

There are times you need packages that does not come with your tex distribution, MikTeX for Windows and TexLive for Fedora Linux. And you need somehow install that package. Your got some zip or .tar.gz from CTAN, then:

latex foo.ins (this generates .sty)

[lyan@localhost algorithm2e]$ cp algorithm2e.sty ~/texmf/tex/latex/algorithm2e/
[lyan@localhost algorithm2e]$ cp algorithm2e.pdf ~/texmf/doc/
[lyan@localhost algorithm2e]$ sudo texhash ~/texmf
[sudo] password for lyan:
texhash: Updating /home/lyan/texmf/ls-R...
texhash: Done.

Then you can \usepackage{algorithm2e} in your latex source.
Note your local tex directory should mimic TDS Tex Directory Structure. texmf/tex/latex/package for .sty and texmf/doc/ for .pdf.

And by the way, algorithm2e seems pretty nice for algorithm formatting. If you want to use package algorithms.sty for whatever reason, here is an example

\usepackage{fullpage,amsmath,amsthm,algorithmic,algorithm,algorithm2e}

\DontPrintSemicolon

\begin{algorithm}
\caption{The generalized JV Algorithm for Uniform Demand FTFL}
\label{alg:jv}
\begin{algorithmic}
  \WHILE{$U\neq \emptyset$}
      \FOR{each active $j$}
          \STATE raise $\alpha_j$ uniformly
          \FORALL{$i\in\fac$}
          \IF {$\alpha_j \geq d_{ij}$}
              \IF {$i$ is fully-paid} \STATE raise $z_{ij}$
              \ELSE \STATE raise $\beta_{ij}$ ($i$ is not fully-paid)
              \ENDIF
          \ENDIF
          \ENDFOR
          \IF {$j$ reached $r$ open facilities} \STATE $j$
          is saturated \ENDIF
          \COMMENT{After $j$ is saturated, $\alpha_j$
            freezes along with $\beta_{ij},z_{ij}$ for all
            $i\in\fac$.}

          \IF {$j$ reached a dead facility} \STATE $j$ is
          blocked 
          \ENDIF
          \COMMENT{After $j$ is blocked, $\alpha_j$ freezes
            along with $\beta_{ij},z_{ij}$ for all
            $i\in\fac$.}
      \ENDFOR
      \FOR{each $i$ not dead or open}
          \IF {$i$ is reached by some $j$ that is saturated} \STATE $i$ is sleep \ENDIF
          \IF {$i$ is fully-paid}
              \IF {$i$ is sleep} \STATE $i$ is dead
              \ELSE \STATE $i$ is open \ENDIF
          \ENDIF
      \ENDFOR
  \ENDWHILE
\end{algorithmic}
\end{algorithm}

Saturday, September 3, 2011

about bitmask and subset Dynamic Programming

It is all too common to use bitmask 0 to 1<<n -1 to represent all subsets of {1,...,n} and do some calculation on each subset. Now if your DP algorithm requires subsets be computed before their supersets, what do you do?

Ideally you would like to generate all subsets in the order of empty set, all {i}, then all {i,j}, ... However this might not be easy to implement, in particular, the 0,1,2,3,... subsets do not assume that order. Still you would like to do calculation use that order, that is, you compute dp[0], then dp[1], dp[2], dp[3], ... up to dp[(1<<n)-1]

You can use dp[s] = sum { for each j in s, dp[s^1<<j] }

Why, because s excludes j, that is s^1<<j is always a smaller integer than s. So even though the subsets are not generated in the order of size, they still guarantee subsets generated before super sets.

Thursday, September 1, 2011

TC SRM 516

Missed the match because overslept.

p250 is easy since plain xor cipher gives you key. And each cipher maps to one plain so you can try them all.

p500 reads daunting but is actually simple. each row is a 50-ary number. Since each col is independent, just accumulate counts and assign 0 to the max cnt, 1 to second max cnt, etc. Watch for integer overflow when do multiplication.

p1000 is a tricky problem, even petr did a resubmission. First of all, the problem asks for an lexico first eulerian tour. We all know eulerian tour and there is a linear time algorithm with simple implementation using recursion. So what is the difficult part? In the tour, an edge cannot have the same label as its prev edge. The problem statement goes through all the trouble to disguise this by saying no two consecutive edges are same labelled except the begin edge and last edge of the tour. So how to do that? A greedy algorithm would use the minimum valid edge as next edge, but this fails. See this example

edge(0,1)='a'
edge(0,4)='f'
edge(1,2)='c'
edge(1,3)='c'
edge(1,4)='b'
edge(2,4)='d'
edge(3,4)='e'

A valid tour is 0-1-3-4-2-1-4-0, with edges a-c-e-b-c-d-f
Now greedy will choose edge(1,4)=b as next edge instead of edge(1,3)=c, then it is doomed since we will have to enter 1 and leave 1 using two edges both labeled 'c'. And it is not a valid tour.

So the algorithm? Pretty much brute force. You try all possible next edge, see if use this edge will allow the rest to have a valid tour, and pick the minimum label edge as your choice. It is obvious that the solution is valid and lexico minimum. Now how do you check the rest edges?

For each node, it needs to have an even number of edges, and the max cnt of them cannot be more than 1/2 of total cnt. It is clear that if this holds for all vertices and the subgraph is connected, then the subgraph has a valid eulerian tour. In fact, you are checking an eulerian path from vertex first to vertex last, each vertex other than first/last should have maxcnt=othercnt. For first and last, each get an extra deg for othercnt because for first=last, it has to be first=last=0, then you can have an extra 2 othercnt since it is valid to have maxcnt=othercnt+2 for vertex[0]. For first!= last, it is maxcnt=othercnt before, and if you use one of maxcnt, then you are good, if you use one othercnt, then you will have maxcnt=othercnt+1 and it is allowed.

Since we only have 40 nodes, and at most 40*40/2=800 edges. For each vertex and each of its neighbor, we remove that edge and do a check. the check is linear time. So the total is 40*40*40*40=800*800=640000, which is a small number and will run in time limit.

OK, code here, also at here

#include <algorithm>

#include <iostream>
#include <string>
#include <vector>
using namespace std;

#define sz(a) int(a.size())

bool valid(int first, int last, const vector<string> &graph);

vector<int> euler(char prev, int v, vector<string> graph)
{
//cout << "euler " << prev << ' ' << v << endl;
//for(int i=0; i<sz(graph); ++i)
//{
//for(int j=0; j<sz(graph[0]);++j) cout << graph[i][j];
//cout << endl;
//}

int n = sz(graph);
int deg[50]={0};
for(int i=0; i<n; ++i) for(int j=0; j<n; ++j) deg[i]+=(graph[i][j]!='.');
vector<int> ans(1,v);
if (deg[v]==0) return ans; // basecase, when v has deg=0

// find a neighbor and remove edge, decr deg
int next=sz(graph[v]);
for(int k=0; k<sz(graph[v]); ++k)
{
char ch = graph[v][k];
if (ch=='.'||ch==prev) continue;
//use edge(v,k) and check the rest
graph[v][k]=graph[k][v]='.';
// need to use min vertex so the first valid k is next
if (valid(k,0,graph)) { next=min(next,k); }
graph[v][k]=graph[k][v]=ch;
}
if (next>=sz(graph[v])) return vector<int>();
char ch = graph[v][next];
graph[v][next]=graph[next][v]='.';
vector<int> sub=euler(ch,next,graph);
if (sub.empty()) return vector<int>();
for(int i=0; i<sz(sub); ++i) ans.push_back(sub[i]);
return ans;
}

void dfs(int v, int visit[], const vector<string> &graph)
{
if (visit[v]) return;
visit[v]=1;
int n = sz(graph);
for(int i=0; i<n; ++i) if (!visit[i] && graph[v][i]!='.')
dfs(i,visit,graph);
}

// all non-isolated vertices must be connected
bool connected(const vector<string> &graph)
{
int n = sz(graph);
int visit[50]={0};
dfs(0,visit,graph);
int deg[50]={0};
for(int i=0; i<n; ++i) for(int j=0; j<n; ++j) if (graph[i][j]!='.') deg[i]++;
for(int i=0; i<n; ++i) if (deg[i] && !visit[i]) return false;
return true;
}

// every node has to be balanced, #maxcnt <= #total/2
bool valid(int first, int last, const vector<string> &graph)
{
int n = sz(graph);
if (!connected(graph)) return false;

for(int i=0; i<n; ++i)
{
int total=0;
int cnt[100]={0};
for(int j=0; j<n; ++j)
{
char ch = graph[i][j];
if (ch=='.') continue;
int k;
if ('A'<=ch && ch<='Z') k=ch-'A';
else k=ch-'a'+26;
cnt[k]++; total++;
}
sort(cnt,cnt+sizeof(cnt)/sizeof(int), greater<int>());
int extra=0;
// first=last only possible when both are 0, and maxcnt=other+2 is fine
// when first!=last, both can have maxcnt=other+1 because we have
// maxcnt=other before use one edge
if (i==first) { ++total; ++extra; }
if (i==last) { ++total; ++extra; }
if (total%2!=0) return false;
if (cnt[0]>total-cnt[0]+extra) return false;
}
return true;
}
class LexSmallestTour
{
public:
vector<int> determineTour(vector<string> roads, vector<int> queries)
{
vector<int> empty;
if (!valid(0,0,roads)) return empty;

vector<int> tour=euler(' ', 0, roads);
if (tour.empty()) return empty;
vector<int> ans;

for(int i=0; i<sz(queries); ++i)
{ if (i) cout << ' '; cout << tour[queries[i]];
ans.push_back(tour[queries[i]]);
}
cout << endl;
return ans;
}
};



// SRM 516 div1 p250

//
// brute force
//
#include <cassert>
#include <cmath>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <string>
#include <vector>
#include <iostream>
#include <sstream>
#include <algorithm>
#include <map>
#include <queue>
#include <set>
using namespace std;

typedef long long LL;
typedef vector<int> VI;
typedef pair<int,int> PI;

#define INF (1<<29)
#define fort(i,a) for(typeof a.begin() i=a.begin(); i!=a.end(); ++i)
#define ALL(x) x.begin(), x.end()
#define PB push_back
#define MP make_pair
#define sz(a) int(a.size())

class NetworkXOneTimePad
{
public:
int crack(vector <string> plain, vector <string> cipher)
{
int ans=0;
string c = cipher[0];
for(int i=0; i<sz(plain); ++i)
{
string key;
for(int j=0; j<sz(c); ++j)
{
char ch;
if (plain[i][j]==c[j]) ch='0';
else ch='1';
key+=ch;
}
bool good=true;
for(int k=1; k<sz(cipher); ++k)
{
bool found=false; // found a plain for cipher[k]
for(int kk=0; kk<sz(plain); ++kk)
{
bool match=true; // plain[kk] match cipher[k]
for(int j=0; j<sz(cipher[k]); ++j)
{
char ch;
if (plain[kk][j]==key[j]) ch='0';
else ch='1';
if (ch!=cipher[k][j]) { match=false; break; }
}
if (match) { found=true; break; }
}
if (!found) { good=false; break; }
}
if (good) ++ans;
}
return ans;
}

// BEGIN CUT HERE
public:
void run_test(int Case) { if ((Case == -1) || (Case == 0)) test_case_0(); if ((Case == -1) || (Case == 1)) test_case_1(); if ((Case == -1) || (Case == 2)) test_case_2(); if ((Case == -1) || (Case == 3)) test_case_3(); }
private:
template <typename T> string print_array(const vector<T> &V) { ostringstream os; os << "{ "; for (typename vector<T>::const_iterator iter = V.begin(); iter != V.end(); ++iter) os << '\"' << *iter << "\","; os << " }"; return os.str(); }
void verify_case(int Case, const int &Expected, const int &Received) { cerr << "Test Case #" << Case << "..."; if (Expected == Received) cerr << "PASSED" << endl; else { cerr << "FAILED" << endl; cerr << "\tExpected: \"" << Expected << '\"' << endl; cerr << "\tReceived: \"" << Received << '\"' << endl; } }
void test_case_0() { string Arr0[] = {"110", "001"}; vector <string> Arg0(Arr0, Arr0 + (sizeof(Arr0) / sizeof(Arr0[0]))); string Arr1[] = {"101", "010"}; vector <string> Arg1(Arr1, Arr1 + (sizeof(Arr1) / sizeof(Arr1[0]))); int Arg2 = 2; verify_case(0, Arg2, crack(Arg0, Arg1)); }
void test_case_1() { string Arr0[] = {"00", "01", "10", "11"}; vector <string> Arg0(Arr0, Arr0 + (sizeof(Arr0) / sizeof(Arr0[0]))); string Arr1[] = {"00", "01", "10", "11"}; vector <string> Arg1(Arr1, Arr1 + (sizeof(Arr1) / sizeof(Arr1[0]))); int Arg2 = 4; verify_case(1, Arg2, crack(Arg0, Arg1)); }
void test_case_2() { string Arr0[] = {"01", "10"}; vector <string> Arg0(Arr0, Arr0 + (sizeof(Arr0) / sizeof(Arr0[0]))); string Arr1[] = {"00"}; vector <string> Arg1(Arr1, Arr1 + (sizeof(Arr1) / sizeof(Arr1[0]))); int Arg2 = 2; verify_case(2, Arg2, crack(Arg0, Arg1)); }
void test_case_3() { string Arr0[] = {"000", "111", "010", "101", "110", "001"}; vector <string> Arg0(Arr0, Arr0 + (sizeof(Arr0) / sizeof(Arr0[0]))); string Arr1[] = {"011", "100"}; vector <string> Arg1(Arr1, Arr1 + (sizeof(Arr1) / sizeof(Arr1[0]))); int Arg2 = 6; verify_case(3, Arg2, crack(Arg0, Arg1)); }

// END CUT HERE

};

// BEGIN CUT HERE
int main()
{
NetworkXOneTimePad __test;
__test.run_test(-1);
}
// END CUT HERE




[lyan@localhost topcoder]$ cat RowsOrdering.cpp 

// SRM 516 div1 p500
//
// The problem reads daunting but turned out to be very easy
//
// basically you are treating each row as a number in 50-ary
// since each col is independent, you can work with them without
// worry about other cols
//
// since you are to minimize the sum, you can assign 0 to the
// max cnt char in the col, 1 to 2nd max cnt char, ...
//
// then you sum the weighted cnt in each col
//
// now you work with a vector of sums and you assign larger
// weight to smaller sum
//
#include <cassert>
#include <cmath>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <functional>
#include <string>
#include <vector>
#include <iostream>
#include <sstream>
#include <algorithm>
#include <map>
#include <queue>
#include <set>
using namespace std;

typedef long long LL;
typedef vector<int> VI;
typedef pair<int,int> PI;

#define INF (1<<29)
#define fort(i,a) for(typeof a.begin() i=a.begin(); i!=a.end(); ++i)
#define ALL(x) x.begin(), x.end()
#define PB push_back
#define MP make_pair
#define sz(a) int(a.size())

class RowsOrdering
{
public:
int order(vector <string> rows)
{
const int MOD = 1000000000+7;
int M = sz(rows[0]);
vector<LL> cols(M);

for(int c=0; c<M; ++c)
{
int row[55]={0};
for(int r=0; r<sz(rows); ++r)
{
int val=0;
char ch = rows[r][c];
if ('a'<=ch && ch<='z') val = ch-'a';
else if ('A'<=ch && ch<='X') val = ch-'A'+26;
row[val]++;
}
sort(row,row+sizeof(row)/sizeof(int),greater<int>());
LL sum=0LL;
for(int i=0; i<51; ++i)
{
sum += i*row[i];
}
cols[c]=sum;
}
sort(cols.begin(), cols.end(), greater<LL>());
LL factor=1LL, ans=0LL;
for(int i=0; i<M; ++i)
{
ans = (ans+cols[i]*factor%MOD) % MOD;
factor = factor*50LL % MOD;
}
ans = (ans+sz(rows))%MOD;
return ans;
}

// BEGIN CUT HERE
public:
void run_test(int Case) { if ((Case == -1) || (Case == 0)) test_case_0(); if ((Case == -1) || (Case == 1)) test_case_1(); if ((Case == -1) || (Case == 2)) test_case_2(); if ((Case == -1) || (Case == 3)) test_case_3(); if ((Case == -1) || (Case == 4)) test_case_4(); if ((Case == -1) || (Case == 5)) test_case_5(); }
private:
template <typename T> string print_array(const vector<T> &V) { ostringstream os; os << "{ "; for (typename vector<T>::const_iterator iter = V.begin(); iter != V.end(); ++iter) os << '\"' << *iter << "\","; os << " }"; return os.str(); }
void verify_case(int Case, const int &Expected, const int &Received) { cerr << "Test Case #" << Case << "..."; if (Expected == Received) cerr << "PASSED" << endl; else { cerr << "FAILED" << endl; cerr << "\tExpected: \"" << Expected << '\"' << endl; cerr << "\tReceived: \"" << Received << '\"' << endl; } }
void test_case_0() { string Arr0[] = {"bb", "cb", "ca"}; vector <string> Arg0(Arr0, Arr0 + (sizeof(Arr0) / sizeof(Arr0[0]))); int Arg1 = 54; verify_case(0, Arg1, order(Arg0)); }
void test_case_1() { string Arr0[] = {"abcd", "ABCD"}; vector <string> Arg0(Arr0, Arr0 + (sizeof(Arr0) / sizeof(Arr0[0]))); int Arg1 = 127553; verify_case(1, Arg1, order(Arg0)); }
void test_case_2() { string Arr0[] = {"Example", "Problem"}; vector <string> Arg0(Arr0, Arr0 + (sizeof(Arr0) / sizeof(Arr0[0]))); int Arg1 = 943877448; verify_case(2, Arg1, order(Arg0)); }
void test_case_3() { string Arr0[] = {"a", "b", "c", "d", "e", "f", "g"}; vector <string> Arg0(Arr0, Arr0 + (sizeof(Arr0) / sizeof(Arr0[0]))); int Arg1 = 28; verify_case(3, Arg1, order(Arg0)); }
void test_case_4() { string Arr0[] = {"a", "a"}; vector <string> Arg0(Arr0, Arr0 + (sizeof(Arr0) / sizeof(Arr0[0]))); int Arg1 = 2; verify_case(4, Arg1, order(Arg0)); }
void test_case_5() { string Arr0[] = {"dolphinigle", "ivanmetelsk", "lympandaaaa"}; vector <string> Arg0(Arr0, Arr0 + (sizeof(Arr0) / sizeof(Arr0[0]))); int Arg1 = 356186235; verify_case(5, Arg1, order(Arg0)); }

// END CUT HERE

};

// BEGIN CUT HERE
int main()
{
RowsOrdering __test;
__test.run_test(-1);
}
// END CUT HERE


Tuesday, August 23, 2011

CF #83

B. Basketball team. The trick is to calculate 1.0-ans, and it is (sum-s[h] choose n-1)/(sum-1 choose n-1). Took quite some time to figure it out.

C Arrangement
http://codeforces.com/contest/107/problem/C
I don't know if this is a hard problem but it took me almost a week to understand what's going on. The problem looks innocent enough, put n professors into n seats, return the yth permutation (1-based) in lexicographic order, with additional restriction that prof at pos=a must be less than prof at pos=b, if edge(a,b) is given in the input. A natural idea is to enumerate and count. Suppose profs are 0 to n-1, and seats are 0 to n-1. You fill pos from 0 to n-1 one by one.

For pos=0, you first try prof=0 at pos=0. Then count #perms with prof[0] at seat[0]. If this is >=k, then you know prof[0] is at seat[0]. Now work on pos=1, ...

The hard part is to count #perm with pos 0..p-1 already fixed to some prof. Now how do you do this? It has to be some DP solution. Since prof at pos=b has to be less than all prof at pos=a with edge(a,b), you kind of have to know the profs at all such pos=a when you put a prof at pos=b. So this tells you that you work in the order of prof when do counting. You can put prof[0] at pos=0 to n-1, after that you can put prof[1] at pos=0 to n-1...

// count number of perms with some prof fixed
long long calc(int elem, int n, int taken, int fixed[])
{
if elem>=n, found a perm to place all profs, so return 1
if dp[taken]>=0, subproblem solved before, so return dp[taken]

ans=dp[taken];
ans=0;
if (elem appear in fixed[]) {
if (pos=fixed[elem] not taken)
ans=calc(elem+1,n,taken|1< }
else {
try p=0 to n-1 for elem
if (p not taken and all pred[p] are taken)
ans += calc(elem+1,n,taken|1< }
return ans;
}

D. Crimes
only 6 or less may have m[i]>=2.
for m[i]=1, let there be k of them, number of seq is k^n if only use m[i]=1.
but how to count m[i]>=2 ones, with n=10^18

E. Darts
the only problem is to calculate the area covered by one or more rectangles. Given there are n=500 rectangles, there has to be a way to do it faster than inclusion-exclusion.

Saturday, August 20, 2011

TC SRM 515

My first match in div1, after quite some time.

p250, given clock hands for hh and mm, after rotating some degree, find out the earliest possible time that the measure is taken.

A natural solution is to enumerate all possible rotations. One catch is that the problem mentioned that measure is taken on a 30deg mark. So rotation is from 0 to 360 with increment of 30. One more observation is that the smallest (hourDeg,minuteDeg) pair is the desired answer. Got this problem correct, after struggle with a test case that require a rotation of 6 deg. Luckily there is an announcement just in time to clarify this.

Lesson: work out formula and multipliers on paper before typing. Spent too much time running testcase to get correct multiplier.

p550, a problem with subtle statement. A shop owner has some number of swords for sale. Each customer has several (hour,price,probability) triples. The owner wants to maximize the total price of sale. The restriction is that each hour at most one customer may have the hour in its list, and there are 24 hours.

A DP solution is almost obvious:
use mask 1<recurse(sword, time, mask) = recurse(sword, time+1, mask) if no customer has time in its list
=recurse(sword, time+1, mask) if the customer has time in its list already appear in mask
=(1-p)*recurse(sword,time+1,newmask) + p*recurse(sword-1,time+1,newmask)
that is, customer show up with probability p, and with (1-p) it no show. The trick, however, is to keep newmask=mask if there is no further event with the current customer. Actually, this is what differentiates a bmerry's solution and my offline solution.

Notice that dp[sword][time][mask] may have up to 24*24*2^24 size and this is too much. However, if we keep mask unchanged if no more event for the same customer, then mask has at most 2^12 instead of 2^24.


p1000: opened the problem but didn't think too much about it yet.
Not too hard either, actually this is the first time I think I can do a 1000 problem.
Three animals, L, F and R each have some possible start positions in a maze. F<=20, R<=20. maze cells are marked as 'L','F','R','#' for wall, '.' for empty. maze size is at most 50x50. L,F,R each choose start position uniformly and randomly. Find a maze cell as meet point so as to minimize the total expected move distance.
For each pair of F pos, R pos, 20x20x2500(cells)x2500(bfs)=2x10^9, try every cell as a meeting point, do bfs to find distance to all 'L' marks.

No, actually 2x10^9 is too slow. looked at rng_58's code and the solution runs in 20x20x2500xlog(2500)=2x10^7 roughly.

The idea is to loop each F and R pair as before. Then in each iteration, we can compute shortest path via bfs from current F and current R. Now each cell can be reached at dist=d(cell,currF)+d(cell,currR). We also know that a neighboring cell can be reached at d+1 provided that it is not '#'. Now use a priority_queue to compute min dist for each cell, and update dist(F,R,L) if that cell is an 'L'. Accumulate dist(F,R,L) to num. And deno is size(F) x size(R) x size(L).

A potential catch is that num might be larger than MAX_INT, as the path may wind several times because of '#'. So use long long.

This time p500 and p1000 both look doable, although it is much harder to actually get it.

Friday, August 19, 2011

CF 2C Commentator Problem

A very nasty problem. It does not take long to realize that the point (x,y) satisfies
d1/r1=d2/r2=d3/r3
Now if you look at it as two equations, then you have the trajectory as two circles, possibly degenerate to lines. Then you take the intersection of the two circles(lines) and check if you have the desired point. This is messy to implement.

A better way is watashi's solution. You look for the scaling factor theta. Code with comments below.

One more solution: you can binary search the scaling factor if you know how to check whether three circles have common area. Notice that pairwise intersection does not imply all three have common area. The paper discussed conditions that three circles overlap, let p1 and p2 be two points of circle C1 intersects C2, then p1 is inside C3 and p2 is outside C3. This ensures the commentator problem has a feasible solution.
http://www.dtic.mil/cgi-bin/GetTRDoc?Location=U2&doc=GetTRDoc.pdf&AD=ADA463920

// For (x1,y1,r1) and (x2,y2,r2), the trajectory is a circle, or a line if
// r1=r2
//
// the official solution is to find the intersection of two circle/line, which
// is quite messy to implement. there are 4 cases, line-line, line-circle,
// circle-line and circle-circle.
//
// Algorithm:
// take a second look at the equations
//
// (x-x0)^2 + (y-y0)^2 = (s*r0)^2
// (x-x1)^2 + (y-y1)^2 = (s*r1)^2
// (x-x2)^2 + (y-y2)^2 = (s*r2)^2
//
// we can find s(theta), and then get x,y
// -2(x0-x2)*x + -2(y0-y2)*y = s^2*(r0^2-r2^2) - (x0^2+y0^2)
// -2(x1-x2)*x + -2(y1-y2)*y = s^2*(r1^2-r2^2) - (x1^2+y1^2)
//
// Comments:
// - we only need theta2, that is s^2
// - s^2 need to be >1
// - there might be two s values, the problem asks for the smaller one
//
#include
#include
using namespace std;

const double EPS=1.0e-7;

int main()
{
double x[3],y[3],r[3];
for(int i=0; i<3; ++i)
scanf("%lf %lf %lf", &x[i],&y[i],&r[i]);

double a[2],b[2],c[2],d[2],det, x1,x2,y1,y2, delta;
a[0]=-2*(x[0]-x[2]); a[1]=-2*(x[1]-x[2]);
b[0]=-2*(y[0]-y[2]); b[1]=-2*(y[1]-y[2]);
c[0]=r[0]*r[0]-r[2]*r[2]; c[1]=r[1]*r[1]-r[2]*r[2];
d[0]=x[2]*x[2]-x[0]*x[0]+y[2]*y[2]-y[0]*y[0];
d[1]=x[2]*x[2]-x[1]*x[1]+y[2]*y[2]-y[1]*y[1];
det=a[0]*b[1]-b[0]*a[1]; // det != 0 because three centers non-collinear

fprintf(stderr,"%lf\n", det);

x1=(b[1]*c[0]-b[0]*c[1])/det; // matrix inv
y1=(c[1]*a[0]-c[0]*a[1])/det;
x2=(b[1]*d[0]-b[0]*d[1])/det;
y2=(a[0]*d[1]-a[1]*d[0])/det;

fprintf(stderr,"%lf %lf %lf %lf\n",x1,y1,x2,y2);

a[0]=x1*x1+y1*y1;
b[0]=2*x1*(x2-x[0])+2*y1*(y2-y[0])-r[0]*r[0];
c[0]=(x2-x[0])*(x2-x[0])+(y2-y[0])*(y2-y[0]);
delta=b[0]*b[0]-4*a[0]*c[0];
fprintf(stderr,"%lf %lf %lf (%lf)\n", a[0],b[0],c[0],delta);

double theta2; // theta^2, theta is the scaling factor, theta>1
if (fabs(a[0]) if (fabs(b[0]) if (fabs(c[0]) else theta2=-1.0;
} else { theta2=-c[0]/b[0]; }
} else {
if (delta<-EPS) theta2=-1;
else {
theta2=(-b[0]-sqrt(fabs(delta)))/(2*a[0]); // prefer smaller theta2
fprintf(stderr,"%lf\n",theta2);
if (theta2<1-EPS) {
theta2=(-b[0]+sqrt(fabs(delta)))/(2*a[0]);
}
}
}
fprintf(stderr,"%lf\n", theta2);
if (theta2>1-EPS) {
double xans=x1*theta2+x2, yans=y1*theta2+y2;
printf("%.6lf %.6lf\n", xans, yans);
}
else {
//printf("No Solution.\n");
}
}

Sunday, August 14, 2011

why spam emails are poor at spelling

It has puzzled me for quite some time why there are so many misspelled words in a spam. In fact, solely by counting the number of misspelled words one can reasonably dictate whether a message is spam or not. Later I learned somewhere that the very reason is that most spam filter programs counting on the appearance of specific keywords like coupon, free, etc. Now to evade the detection, they will intentionally misspell those words. But then a spam filter can use that information. So the game goes.

CF problem 7E

The problem gives n<=100 macro definitions and asks to check whether an expr is safe or not.

One idea is to replace all macros to get final expr and check. This runs out of time and memory as each macro might have 30+ terms and with 100 recursive substitution the resulting expr might have 30^100 terms which is simply too long.

Need to use standard parsing algorithms to do the check. Maybe it is not necessary to rewrite the left recursive grammar. watashi's idea is to evaluate all macros to
Expr: something without + - * /
Term: something with + -
Factor: something with * /
Suspicious: something already suspcious

Now keep all previous evaluated terms in a stack, as well as a stack for operators not processed so far. When hit lparen, make a recursive call and let the call return with the maching rparen. When hit a new operator, pop and evaluate all ops in stack with precedence higher or same. The code is very clean.

Enter div1 in Codeforces

Round #81 is a div1 and div2 both round and I got really lucky to get problem B. problem A is so simple and I have an almost correct solution, except reading in double costs precision issues. Anyway a boost of +128 sent me to div1 for the first time

Captain lantimilan
Li Yan, Riverside, CA, United States (USA)
Online

User''s contest rating in Codeforces community Contest rating: 1752
User''s contribution into Codeforces community Contribution: +12

Saturday, August 13, 2011

floating point in C++

Sometimes floating point might catch you in the least expected place:

See the innocent looking code below:

double x; cin >> x;
cout << int(100*x) << endl;

On some machines, if x is 0.70, then you will get 69 instead of 70. I guess this has something to do with the precision of double, as some decimal floating point does not have an exact binary representation. So the double is read and stored in an approximate value. Then the cast to int results in something like 69.999999999999.

Wednesday, August 10, 2011

TC SRM514 div2

Another div2 competition for me, as I am the borderline of div1 and div2.

p250 easy problem, just take diff of x and y, then return sqrt(dx^2+dy^2)

p500 the constraints are small, so just bfs on a large enough grid. maxX, maxY <=30, so you need only a grid of +-60.

p1000 looks daunting, but turns out to be easy, just didn't got scared during SRM. Actually I had 50+ minutes for p1000. Just not that confident. The string concatenation is a disguise, you only need to keep track of length of each element, and number of ones. A trivial DP. Two places need to be careful. One is that the length might get to 50*100=5000 digits, so even long long is not big enough. But you need only the first 15 digits so check that. Another place is my fault, when allocating vector, use max(n+1,K) since n could be smaller than K-1. The first time I got 'uncaught exception' in TC arena yet it runs fine on my own machine.

Another good lesson is to use vec.clear() and vec.resize() to reinitialize to zero. As I was using one[i]+=(s[i]=='1'), the second testcase picks up the old value from testcase 1.

Tuesday, August 9, 2011

brute force with twist

http://codeforces.com/contest/104/problem/E

The problem asks all sum a+b*k k=0...(a+b*k<=n) for p different pairs of (a,b).
The size of array, n<=3*10^5. Also p<=3*10^5.

Naive impl has to TLE, so some cleverness needs to be there.

Since we can compute sum a+b*k for same b, different a in one scan of the array, we can take care of all b<=1000 queries. For others we have b>=1000, then the number of summands in one (a,b) pair is at most 3*10^5/10^3=300, and for p=3*10^5 we have total being 300*3*10^5=10^8 which is fast enough.

It is a bit strange that prob E in CF is usually easier or less messy than prob D for div2, if you see what the problem is really asking for.

A few notes about implementation:
1. CF machine favors scanf/printf, the same code runs 1970ms with cin/cout and 890ms with scanf/printf
2. CF machine requires %I64d for long long, do NOT use %lld as this will cause WA sometimes.
3. use vector instead of plain array is not much slower, runs in 1080ms. Even repeated push_back for 10^5 times is not slow at all.
4. accessing elements in large array is cheap, repeatedly add to a[i] is as fast as make a local tmp += d; and then a[i]=tmp.

Friday, August 5, 2011

a bit string game

An exercise from Matousek's discrete math book.

Two people playing a game, they agree on a number n. Then they write 0 or 1 in turns until one of them is forced to repeat a pattern of length n. Then the game terminates and that person loses.

Question 1: if n is odd, show that the second person has a simple winning strategy.

Question 2: if n=4, show that the first person has a winning strategy.

It was mentioned that for even n >=6 it is an open problem.

Thursday, August 4, 2011

a pigeon hole principle exercise

From Matousek's discrete math book.

Prove that any n+1 numbers from 1 to 2n contains two numbers a and b such that a divides b.

Some observations: first n+1 is necessary to guarantee the existence of a and b. If we only pick n numbers, then pick n+1 ... 2n will do. No two divides each other.

second is that you need pigeon hole principle, and you need to build n holes somehow. Several attempts do not seem to be promising. The idea suggested in the solution is to build buckets with respect to 2^k*(2*m+1), that is, use the odd factor to build buckets. Now you have 500 buckets, with odd factor 1, 3, 5, ..., 999. Numbers fall into the same bucket has the form
2*(2m+1), 2^2*(2m+1), ..., 2^k*(2m+1) and the min will divide the max.

Are there other proofs?

Sunday, July 3, 2011

SRM 511

div 2
p500 Zoo
The problem is easier after reading the examples. There is a feasible solution iff we have
2's followed by 1's followed by 0's. Some parts might be missing. Assuming it is k 2's and some nonzero 1's, then it is 2^(k+1).

p1000 FiveHundredEleven
A naive solution is to use to keep track of the state of the same and recursion from initial state. But there are 2^50 sets of cards so need to be more clever.

The first observation is that cards that are masked by bitset are equivalent, so we only need the count of them. Play any one of them is the same as play another.

The second observation is that there is no need to keep track of subsets already used. The reason is that for cards not masked, they must not have been used. So we can find them using bitset. For cards that are masked, we only care about their numbers. So remember the number of cards already used suffices.

Tuesday, June 7, 2011

GCJ 2011 round 2

Failed abysmally in this round. First got bitten in problem A, when double should be used instead of int. Spent almost one hour and luckily fixed it. Then got panic and spent considerable time on problem B, never get it to go, although it was only one small bug that failed Bsmall. Looked at C and D but didn't have a clue due to panic.

Almost a week later got calm down and worked on those problem offline.

problem A: given a set of nonintersecting walkways and a limit of t seconds to run at speed R, walk speed is S, speed boost at walkway[i] is w[i]. Calculate minimum time to finish passage length X.

Greedy suffices. First consolidate the length not covered by a walkway, call this part w[0], other walkways are w[1] ... w[n]. Now run as much as possible to cover w[0], w[1], ... and do the rest with walking. The only thing that might catch you is cast to double when doing division.

problem B: given a gird with digits, 500x500, find a max square such that after throw away 4 corners, the mass center is at the square center.

This should ring a bell about a standard trick for finding max square with all 1's in a 0/1 matrix, where you can check a square with size s, bottom right corner (r,c) in O(1) time given squares with size s-1 in (r-1,c-1), (r,c-1), (r-1,c). Similar trick here to get you done in O(1) time for each (x,y),s. 500^3 runs in time limit.

Some implementation issues make the problem nasty. You need to keep track of total mass in a square, the sum of mass vectors from size s-1 needs to be adjusted. And you might want to scale the vectors by 2 so that you can deal with integers only. Choice of coordinate is also critical, the best one seems to be below, then (r,c) becomes (x,y) with x=c+1, y=r+1 for bottom right point of the cell.
------------------------>x
|
|
|
\/y

If all these are not enough, you also have to worry about space issue, as store all x,y,s triples would take 500^3 space and this is too much. The good news is that you need only s-2 and s-1 to compute size s. So it is really 500x500x3.

problem B is difficult to get correct. You have to have all the details on paper before coding.

problem C is surprisingly easy, although the problem statement seems a bit daunting. In retrospect there is a good reason that the bottom segment of top 1000 only solved C.

Basically you need to find out, for each prime < N, the maximum k such that p^k <= N. However N=10^12 and there are roughly 10^11 primes then listing them all would be too slow. Fortunately you care about only primes such that k>=2, so you need only primes <= sqrt(N) = 10^6, which is much more manageable. To find k you can use log(N)/log(p), but then you might worry about precision issue since you have introduced double. And there is a good reason to worry because the problem setter is likely to have this in mind and he/she is trying to set it up to get you. I added an assertion to catch this and the assertion did fail. A simple patch is to multiply p^k until you passed N. For this you need fast exp. And that's all for problem C.

problem D. todo.


Now I got some time to deal with problem D. It was the hardest of the four but not that hard once you see it. First it is easy to see you need to do a BFS to find out the number of planets to conquer. Then you need a shortest path with maximum number of neighbors. This looks a bit weird at first as I do not know a standard graph algorithm to do this and it does not look like a flow problem. After a while you see the BFS giving away some structure. The path has to run down level by level. So to solve D-small you can actually enumerate all paths, it is not much worse than (36/6)^6 < 3^12 = 10^6 so it will be fast. One way to do this is to have an L-ary counter to enumerate the paths and compute neighbors of each path. The neighbor sets can be computed using bitset but needs to be a bit careful as 36 bits calls for long long, and my implementation went awry in several places. The last bug was define neighbor[] as int, but it should be LL.

To solve D-large you need some DP strategy, and the crucial observation is that you never have an edge between two nodes with level difference >= 2. So neighbors of level l-2 are fixed as you are dealing with a level l node, and that's how your recurrence was built. To actually implement it probably take quite a bit work.

Done with D-large. For DP you need to keep track of number of neighbors for paths end up with some edge, so dp[i][j] would keep the count of neighbors (or neighbors+path_nodes) for best path end at edge[i][j]. Then node k with edge[j][k] can use all parent and parent of parent to compute dp[j][k].