Thursday, January 29, 2015
facebook hacker cup round 2
I was lucky enough to get into round 2, but it was a disaster.
After 3 hours, I had 10pt problem submitted, but I know somehow it would fall. End up getting 0 points.
lazy sort: the observation is that at any time, the partial solution must be a consecutive segment of (1..n), or (n..1). For some reason I was confusing it with a harder problem, where you can pop_front and pop_back, in addition to push_back and push_front.
The solution processes the given list, and keep track of current min and max, and the start and end of remain list. Now if there is a solution, then either the head or the tail of the remaining list is min-1 or max+1, and only one of the four possibility holds. So you can deterministically follow that one, or return false.
all critical: the problem asks for expectation. As usual there are two possible approaches, one is using definition, that is, we compute an infinite sum, pr(X=n) * n for n = 0 to inf. It takes a dynamic programming dp(s, l), meaning the probability to get s success in the first l rounds. dp(0, 0) = 1.0, dp(s, l) could come from dp(k, l-1) for k = 0, 1, 2, ..., s, that is, dp(s, l) = sum_{k} dp(k, l-1) * (N-k, s-k) because in round l-1 we had k success, and there can be at most N = 20 success. In round l we must have s - k success, and that must come from N-k candidates. Once we have dp(N, l) for l = 1, 2, ..., n for some large enough n, we can compute expectation as sum_{l} [dp(N,l) - dp(N, l-1)] * l. The problem asks for a precision of 1e-5, so L = 1e6 should be enough.
The other approach uses recursive idea. Let E[n] be the expected number of trials when there are exactly n possible success. Now consider the first trial. We could have 0, 1, ..., n success. If we have 0 success, then we need another E[n] trials, if we have 1 success, then we need another E[n-1] trials, ..., therefore
E[n] = 1 + E[n] * (1-p)^n + E[n-1] * C(n,1) * p * (1-p)^n-1 + ... + E[n-k] * C(n,k) * p^k * (1-p)^(n-k) + ... + E[0] * p^n
You should convince yourself that E[0] = 0. Then it is simple to calculate E[n] for n = 1 to N, and E[N] is the answer.
auto complete strikes back: The first step is to build a trie with all words. Then we solve the problem recursively on the tree. Consider a subtree rooted at T, let cost(T, k) be the cost to get k words if we only use words in this subtree. Then cost(root, K) is our answer. Obviously cost(T, 0) is 0. To get k words, we need to get k1 from child c1, k2 from child c2, ..., such that k1 + k2 + ... = k, and the cost is the sum of the child subtree's cost. A special case is that when T is a word, then the sum of k1, k2, ... maybe k-1. The cost to get the word in T is the number of links to follow to get to node T, which can be stored when we build the trie, call it the depth. Another case that warrants special treatment is k = 1, in this case we can have a cost of T.depth because T must have at least one word in its subtree and to get one word we only need to arrive at node T and no further.
One implementation challenge is to manage memory. Both new/delete and use a global array to allocate node will work, but my mistake to put a local array of size 26*100 caused stack overflow and segmentation fault, and it takes quite some time to find out the culprit.
problem D is supposed to be a complicated segment tree problem. To be continued.
After 3 hours, I had 10pt problem submitted, but I know somehow it would fall. End up getting 0 points.
lazy sort: the observation is that at any time, the partial solution must be a consecutive segment of (1..n), or (n..1). For some reason I was confusing it with a harder problem, where you can pop_front and pop_back, in addition to push_back and push_front.
The solution processes the given list, and keep track of current min and max, and the start and end of remain list. Now if there is a solution, then either the head or the tail of the remaining list is min-1 or max+1, and only one of the four possibility holds. So you can deterministically follow that one, or return false.
all critical: the problem asks for expectation. As usual there are two possible approaches, one is using definition, that is, we compute an infinite sum, pr(X=n) * n for n = 0 to inf. It takes a dynamic programming dp(s, l), meaning the probability to get s success in the first l rounds. dp(0, 0) = 1.0, dp(s, l) could come from dp(k, l-1) for k = 0, 1, 2, ..., s, that is, dp(s, l) = sum_{k} dp(k, l-1) * (N-k, s-k) because in round l-1 we had k success, and there can be at most N = 20 success. In round l we must have s - k success, and that must come from N-k candidates. Once we have dp(N, l) for l = 1, 2, ..., n for some large enough n, we can compute expectation as sum_{l} [dp(N,l) - dp(N, l-1)] * l. The problem asks for a precision of 1e-5, so L = 1e6 should be enough.
The other approach uses recursive idea. Let E[n] be the expected number of trials when there are exactly n possible success. Now consider the first trial. We could have 0, 1, ..., n success. If we have 0 success, then we need another E[n] trials, if we have 1 success, then we need another E[n-1] trials, ..., therefore
E[n] = 1 + E[n] * (1-p)^n + E[n-1] * C(n,1) * p * (1-p)^n-1 + ... + E[n-k] * C(n,k) * p^k * (1-p)^(n-k) + ... + E[0] * p^n
You should convince yourself that E[0] = 0. Then it is simple to calculate E[n] for n = 1 to N, and E[N] is the answer.
auto complete strikes back: The first step is to build a trie with all words. Then we solve the problem recursively on the tree. Consider a subtree rooted at T, let cost(T, k) be the cost to get k words if we only use words in this subtree. Then cost(root, K) is our answer. Obviously cost(T, 0) is 0. To get k words, we need to get k1 from child c1, k2 from child c2, ..., such that k1 + k2 + ... = k, and the cost is the sum of the child subtree's cost. A special case is that when T is a word, then the sum of k1, k2, ... maybe k-1. The cost to get the word in T is the number of links to follow to get to node T, which can be stored when we build the trie, call it the depth. Another case that warrants special treatment is k = 1, in this case we can have a cost of T.depth because T must have at least one word in its subtree and to get one word we only need to arrive at node T and no further.
One implementation challenge is to manage memory. Both new/delete and use a global array to allocate node will work, but my mistake to put a local array of size 26*100 caused stack overflow and segmentation fault, and it takes quite some time to find out the culprit.
problem D is supposed to be a complicated segment tree problem. To be continued.
Subscribe to:
Posts (Atom)