algorithms and random thoughts: August 2011

Tuesday, August 23, 2011

CF #83

B. Basketball team. The trick is to calculate 1.0-ans, and it is (sum-s[h] choose n-1)/(sum-1 choose n-1). Took quite some time to figure it out.

C Arrangement
http://codeforces.com/contest/107/problem/C
I don't know if this is a hard problem but it took me almost a week to understand what's going on. The problem looks innocent enough, put n professors into n seats, return the yth permutation (1-based) in lexicographic order, with additional restriction that prof at pos=a must be less than prof at pos=b, if edge(a,b) is given in the input. A natural idea is to enumerate and count. Suppose profs are 0 to n-1, and seats are 0 to n-1. You fill pos from 0 to n-1 one by one.

For pos=0, you first try prof=0 at pos=0. Then count #perms with prof[0] at seat[0]. If this is >=k, then you know prof[0] is at seat[0]. Now work on pos=1, ...

The hard part is to count #perm with pos 0..p-1 already fixed to some prof. Now how do you do this? It has to be some DP solution. Since prof at pos=b has to be less than all prof at pos=a with edge(a,b), you kind of have to know the profs at all such pos=a when you put a prof at pos=b. So this tells you that you work in the order of prof when do counting. You can put prof[0] at pos=0 to n-1, after that you can put prof[1] at pos=0 to n-1...

// count number of perms with some prof fixed
long long calc(int elem, int n, int taken, int fixed[])
{
if elem>=n, found a perm to place all profs, so return 1
if dp[taken]>=0, subproblem solved before, so return dp[taken]

ans=dp[taken];
ans=0;
if (elem appear in fixed[]) {
if (pos=fixed[elem] not taken)
ans=calc(elem+1,n,taken|1< }
else {
try p=0 to n-1 for elem
if (p not taken and all pred[p] are taken)
ans += calc(elem+1,n,taken|1< }
return ans;
}

D. Crimes
only 6 or less may have m[i]>=2.
for m[i]=1, let there be k of them, number of seq is k^n if only use m[i]=1.
but how to count m[i]>=2 ones, with n=10^18

E. Darts
the only problem is to calculate the area covered by one or more rectangles. Given there are n=500 rectangles, there has to be a way to do it faster than inclusion-exclusion.

Saturday, August 20, 2011

TC SRM 515

My first match in div1, after quite some time.

p250, given clock hands for hh and mm, after rotating some degree, find out the earliest possible time that the measure is taken.

A natural solution is to enumerate all possible rotations. One catch is that the problem mentioned that measure is taken on a 30deg mark. So rotation is from 0 to 360 with increment of 30. One more observation is that the smallest (hourDeg,minuteDeg) pair is the desired answer. Got this problem correct, after struggle with a test case that require a rotation of 6 deg. Luckily there is an announcement just in time to clarify this.

Lesson: work out formula and multipliers on paper before typing. Spent too much time running testcase to get correct multiplier.

p550, a problem with subtle statement. A shop owner has some number of swords for sale. Each customer has several (hour,price,probability) triples. The owner wants to maximize the total price of sale. The restriction is that each hour at most one customer may have the hour in its list, and there are 24 hours.

A DP solution is almost obvious:
use mask 1<recurse(sword, time, mask) = recurse(sword, time+1, mask) if no customer has time in its list
=recurse(sword, time+1, mask) if the customer has time in its list already appear in mask
=(1-p)*recurse(sword,time+1,newmask) + p*recurse(sword-1,time+1,newmask)
that is, customer show up with probability p, and with (1-p) it no show. The trick, however, is to keep newmask=mask if there is no further event with the current customer. Actually, this is what differentiates a bmerry's solution and my offline solution.

Notice that dp[sword][time][mask] may have up to 24*24*2^24 size and this is too much. However, if we keep mask unchanged if no more event for the same customer, then mask has at most 2^12 instead of 2^24.

p1000: opened the problem but didn't think too much about it yet.
Not too hard either, actually this is the first time I think I can do a 1000 problem.
Three animals, L, F and R each have some possible start positions in a maze. F<=20, R<=20. maze cells are marked as 'L','F','R','#' for wall, '.' for empty. maze size is at most 50x50. L,F,R each choose start position uniformly and randomly. Find a maze cell as meet point so as to minimize the total expected move distance.
For each pair of F pos, R pos, 20x20x2500(cells)x2500(bfs)=2x10^9, try every cell as a meeting point, do bfs to find distance to all 'L' marks.

No, actually 2x10^9 is too slow. looked at rng_58's code and the solution runs in 20x20x2500xlog(2500)=2x10^7 roughly.

The idea is to loop each F and R pair as before. Then in each iteration, we can compute shortest path via bfs from current F and current R. Now each cell can be reached at dist=d(cell,currF)+d(cell,currR). We also know that a neighboring cell can be reached at d+1 provided that it is not '#'. Now use a priority_queue to compute min dist for each cell, and update dist(F,R,L) if that cell is an 'L'. Accumulate dist(F,R,L) to num. And deno is size(F) x size(R) x size(L).

A potential catch is that num might be larger than MAX_INT, as the path may wind several times because of '#'. So use long long.

This time p500 and p1000 both look doable, although it is much harder to actually get it.

Friday, August 19, 2011

CF 2C Commentator Problem

A very nasty problem. It does not take long to realize that the point (x,y) satisfies
d1/r1=d2/r2=d3/r3
Now if you look at it as two equations, then you have the trajectory as two circles, possibly degenerate to lines. Then you take the intersection of the two circles(lines) and check if you have the desired point. This is messy to implement.

A better way is watashi's solution. You look for the scaling factor theta. Code with comments below.

One more solution: you can binary search the scaling factor if you know how to check whether three circles have common area. Notice that pairwise intersection does not imply all three have common area. The paper discussed conditions that three circles overlap, let p1 and p2 be two points of circle C1 intersects C2, then p1 is inside C3 and p2 is outside C3. This ensures the commentator problem has a feasible solution.
http://www.dtic.mil/cgi-bin/GetTRDoc?Location=U2&doc=GetTRDoc.pdf&AD=ADA463920

// For (x1,y1,r1) and (x2,y2,r2), the trajectory is a circle, or a line if
// r1=r2
//
// the official solution is to find the intersection of two circle/line, which
// is quite messy to implement. there are 4 cases, line-line, line-circle,
// circle-line and circle-circle.
//
// Algorithm:
// take a second look at the equations
//
// (x-x0)^2 + (y-y0)^2 = (s*r0)^2
// (x-x1)^2 + (y-y1)^2 = (s*r1)^2
// (x-x2)^2 + (y-y2)^2 = (s*r2)^2
//
// we can find s(theta), and then get x,y
// -2(x0-x2)*x + -2(y0-y2)*y = s^2*(r0^2-r2^2) - (x0^2+y0^2)
// -2(x1-x2)*x + -2(y1-y2)*y = s^2*(r1^2-r2^2) - (x1^2+y1^2)
//
// Comments:
// - we only need theta2, that is s^2
// - s^2 need to be >1
// - there might be two s values, the problem asks for the smaller one
//
#include
#include
using namespace std;

const double EPS=1.0e-7;

int main()
{
double x[3],y[3],r[3];
for(int i=0; i<3; ++i)
scanf("%lf %lf %lf", &x[i],&y[i],&r[i]);

double a[2],b[2],c[2],d[2],det, x1,x2,y1,y2, delta;
a[0]=-2*(x[0]-x[2]); a[1]=-2*(x[1]-x[2]);
b[0]=-2*(y[0]-y[2]); b[1]=-2*(y[1]-y[2]);
c[0]=r[0]*r[0]-r[2]*r[2]; c[1]=r[1]*r[1]-r[2]*r[2];
d[0]=x[2]*x[2]-x[0]*x[0]+y[2]*y[2]-y[0]*y[0];
d[1]=x[2]*x[2]-x[1]*x[1]+y[2]*y[2]-y[1]*y[1];
det=a[0]*b[1]-b[0]*a[1]; // det != 0 because three centers non-collinear

fprintf(stderr,"%lf\n", det);

x1=(b[1]*c[0]-b[0]*c[1])/det; // matrix inv
y1=(c[1]*a[0]-c[0]*a[1])/det;
x2=(b[1]*d[0]-b[0]*d[1])/det;
y2=(a[0]*d[1]-a[1]*d[0])/det;

fprintf(stderr,"%lf %lf %lf %lf\n",x1,y1,x2,y2);

a[0]=x1*x1+y1*y1;
b[0]=2*x1*(x2-x[0])+2*y1*(y2-y[0])-r[0]*r[0];
c[0]=(x2-x[0])*(x2-x[0])+(y2-y[0])*(y2-y[0]);
delta=b[0]*b[0]-4*a[0]*c[0];
fprintf(stderr,"%lf %lf %lf (%lf)\n", a[0],b[0],c[0],delta);

double theta2; // theta^2, theta is the scaling factor, theta>1
if (fabs(a[0]) if (fabs(b[0]) if (fabs(c[0]) else theta2=-1.0;
} else { theta2=-c[0]/b[0]; }
} else {
if (delta<-EPS) theta2=-1;
else {
theta2=(-b[0]-sqrt(fabs(delta)))/(2*a[0]); // prefer smaller theta2
fprintf(stderr,"%lf\n",theta2);
if (theta2<1-EPS) {
theta2=(-b[0]+sqrt(fabs(delta)))/(2*a[0]);
}
}
}
fprintf(stderr,"%lf\n", theta2);
if (theta2>1-EPS) {
double xans=x1*theta2+x2, yans=y1*theta2+y2;
printf("%.6lf %.6lf\n", xans, yans);
}
else {
//printf("No Solution.\n");
}
}

Sunday, August 14, 2011

why spam emails are poor at spelling

It has puzzled me for quite some time why there are so many misspelled words in a spam. In fact, solely by counting the number of misspelled words one can reasonably dictate whether a message is spam or not. Later I learned somewhere that the very reason is that most spam filter programs counting on the appearance of specific keywords like coupon, free, etc. Now to evade the detection, they will intentionally misspell those words. But then a spam filter can use that information. So the game goes.

CF problem 7E

The problem gives n<=100 macro definitions and asks to check whether an expr is safe or not.

One idea is to replace all macros to get final expr and check. This runs out of time and memory as each macro might have 30+ terms and with 100 recursive substitution the resulting expr might have 30^100 terms which is simply too long.

Need to use standard parsing algorithms to do the check. Maybe it is not necessary to rewrite the left recursive grammar. watashi's idea is to evaluate all macros to
Expr: something without + - * /
Term: something with + -
Factor: something with * /
Suspicious: something already suspcious

Now keep all previous evaluated terms in a stack, as well as a stack for operators not processed so far. When hit lparen, make a recursive call and let the call return with the maching rparen. When hit a new operator, pop and evaluate all ops in stack with precedence higher or same. The code is very clean.

Enter div1 in Codeforces

Round #81 is a div1 and div2 both round and I got really lucky to get problem B. problem A is so simple and I have an almost correct solution, except reading in double costs precision issues. Anyway a boost of +128 sent me to div1 for the first time

Captain lantimilan
Li Yan, Riverside, CA, United States (USA)
Online

User''s contest rating in Codeforces community Contest rating: 1752
User''s contribution into Codeforces community Contribution: +12

Saturday, August 13, 2011

floating point in C++

Sometimes floating point might catch you in the least expected place:

See the innocent looking code below:

double x; cin >> x;
cout << int(100*x) << endl;

On some machines, if x is 0.70, then you will get 69 instead of 70. I guess this has something to do with the precision of double, as some decimal floating point does not have an exact binary representation. So the double is read and stored in an approximate value. Then the cast to int results in something like 69.999999999999.

Wednesday, August 10, 2011

TC SRM514 div2

Another div2 competition for me, as I am the borderline of div1 and div2.

p250 easy problem, just take diff of x and y, then return sqrt(dx^2+dy^2)

p500 the constraints are small, so just bfs on a large enough grid. maxX, maxY <=30, so you need only a grid of +-60.

p1000 looks daunting, but turns out to be easy, just didn't got scared during SRM. Actually I had 50+ minutes for p1000. Just not that confident. The string concatenation is a disguise, you only need to keep track of length of each element, and number of ones. A trivial DP. Two places need to be careful. One is that the length might get to 50*100=5000 digits, so even long long is not big enough. But you need only the first 15 digits so check that. Another place is my fault, when allocating vector, use max(n+1,K) since n could be smaller than K-1. The first time I got 'uncaught exception' in TC arena yet it runs fine on my own machine.

Another good lesson is to use vec.clear() and vec.resize() to reinitialize to zero. As I was using one[i]+=(s[i]=='1'), the second testcase picks up the old value from testcase 1.

Tuesday, August 9, 2011

brute force with twist

http://codeforces.com/contest/104/problem/E

The problem asks all sum a+b*k k=0...(a+b*k<=n) for p different pairs of (a,b).
The size of array, n<=3*10^5. Also p<=3*10^5.

Naive impl has to TLE, so some cleverness needs to be there.

Since we can compute sum a+b*k for same b, different a in one scan of the array, we can take care of all b<=1000 queries. For others we have b>=1000, then the number of summands in one (a,b) pair is at most 3*10^5/10^3=300, and for p=3*10^5 we have total being 300*3*10^5=10^8 which is fast enough.

It is a bit strange that prob E in CF is usually easier or less messy than prob D for div2, if you see what the problem is really asking for.

A few notes about implementation:
1. CF machine favors scanf/printf, the same code runs 1970ms with cin/cout and 890ms with scanf/printf
2. CF machine requires %I64d for long long, do NOT use %lld as this will cause WA sometimes.
3. use vector instead of plain array is not much slower, runs in 1080ms. Even repeated push_back for 10^5 times is not slow at all.
4. accessing elements in large array is cheap, repeatedly add to a[i] is as fast as make a local tmp += d; and then a[i]=tmp.

Friday, August 5, 2011

a bit string game

An exercise from Matousek's discrete math book.

Two people playing a game, they agree on a number n. Then they write 0 or 1 in turns until one of them is forced to repeat a pattern of length n. Then the game terminates and that person loses.

Question 1: if n is odd, show that the second person has a simple winning strategy.

Question 2: if n=4, show that the first person has a winning strategy.

It was mentioned that for even n >=6 it is an open problem.

Thursday, August 4, 2011

a pigeon hole principle exercise

From Matousek's discrete math book.

Prove that any n+1 numbers from 1 to 2n contains two numbers a and b such that a divides b.

Some observations: first n+1 is necessary to guarantee the existence of a and b. If we only pick n numbers, then pick n+1 ... 2n will do. No two divides each other.

second is that you need pigeon hole principle, and you need to build n holes somehow. Several attempts do not seem to be promising. The idea suggested in the solution is to build buckets with respect to 2^k*(2*m+1), that is, use the odd factor to build buckets. Now you have 500 buckets, with odd factor 1, 3, 5, ..., 999. Numbers fall into the same bucket has the form
2*(2m+1), 2^2*(2m+1), ..., 2^k*(2m+1) and the min will divide the max.

Are there other proofs?

algorithms and random thoughts