round 1B, problem C, you are given exactly 500 positive integers, each no more than 10^12, find two subsets with equal sum.
round 1C, problem C, you are given two lists, list A has a sequence of
The round 1B problem looks like the NP-hard problem subset sum and it seems impossible to deal with 500 items with value as large as 10^12. The well-known solutions either enumerates all 2^n subsets or build a table with dp[n][sum]. Neither of them works here. However, read the constraints again. It says EXACTLY 500 items, why didn't it say at most 500? Why having more items helps?
OK, you know that each item is no more than 10^12, so you do not have many different sums, in fact, no more than 500*10^12=500*10^14. On the other hand you have 2^500 = 10^150 different subsets which is way bigger.
The solution builds on all 6-element subsets. The number of sums is no more than 6*10^12, and the number of sets is 500^6 = 10^16, so you have a good chance to hit two subsets with same sum if you pick two random subsets. And the algorithm simply shuffle the input integers, and keep enumerating 6-element subsets until you have two subsets with equal sum. The time limit is 8min which is more than enough. My naive implementation runs in 4min.
The round 1C problem is about longest common subsequence, or LCS, except that the two lists are a bit long, could be 100*10^12 items, and the usual dp approach just won't work. However you can observe the special arrangement of this problem and tweak the algorithm a bit. The recurrence looks like this:
dp[i][j] = max(dp[i][j-1], dp[i-1][j], dp[i-1][j-1] + 1)
And the last case applies only when type at a[i] matches type at b[j]. In this problem, if type of a[i] and b[j] mismatch, then we do the same since the optimal solution end at a[i],b[j] has to discard either a[i] or b[j]. However, if they do agree on type, then we have to run this particular type from i to some prior position x, and run from j to some prior position y, so we update dp[i][j] by dp[x][y] + min(count of type in a[x+1..i], count of type in b[y+1..j]) and the count can be obtained by maintaining a prefix sum.
And that was it. The large case runs in 20s.
hey Li Yan,
ReplyDeletefor the GRID WALKING problem in interviewstreet, could you paste you output for this input...
1
4 4
5 5 5 5
9 9 9 9
Mine is 560. Am not sure if am missing any special case in the challenge. I did the DP and calculations with Pascal triangle. Yet am missing something basic here(i passed 1/11 cases). Could you help me out... ?
[lyan@localhost interviewstreet]$ g++ gridwalk.cpp
ReplyDelete[lyan@localhost interviewstreet]$ cat >in
1
4 4
5 5 5 5
9 9 9 9
[lyan@localhost interviewstreet]$ ./a.out <in
4096
thanks a lot Li Yan, I got it now.
ReplyDeleteTurned out that, I combinated the steps instead of permuting.
good for you
ReplyDelete