algorithms and random thoughts: 2010

Thursday, December 2, 2010

no more firefox crash

After install latest cups package for fedora, firefox no longer crashes when attempting to print a webpage. Along the way installed a bunch of debuginfo packages for cups and pulseaudio.

After reading some debugging book, almost attempted to use gdb to figure out what went wrong when trying to print a webpage. Now it was fixed by updating cups.

Wednesday, November 24, 2010

bash and gentoo

Arguably the best bash tutorial that I have ever seen. Concise but touched very advanced usage. Excellent examples.

http://www.ibm.com/developerworks/library/l-bash3.html

Also Gentoo seems to be worth exploring.

http://www.gentoo.org/doc/en/gentoo-x86-quickinstall.xml

Monday, November 22, 2010

book: how to solve it by computer

How to Solve it by computer.
By R. G. Dromey, 1982

I forgot how I get introduced into this book but this is the book that I wish I could read the first day when I started learn programming. Many fundamental but important techniques and guidelines are introduced via examples, with ingredients taken from almost every area in computer science, algorithm and data structure, formal verification. Very fine details are mentioned. Nowhere else have I seen these things.

One example is on Fibonacci numbers:
f(0) = 1, f(1) = 1, f(n) = f(n-1) + f(n-2)

Everybody knows how to compute the n-th fibo number using a loop running n times, but can you do it in O(log n) time?

I have a terrible memory of this problem because this was used in google codejam 2008 round 1A, and I didn't solve it. Neither can I read other people's solutions because I do not understand the magic numbers they use.

Now there are at least two ways to do it. Both using the idea of fast exponentiation.

1. Matrix representation. (I learned this from Dr. Chrobak in his Advanced Algorithm class).


/        \      /      \   /        \
| f(n)   |      | 1  1 |   | f(n-1) |
| f(n-1) |  =   | 1  0 |   | f(n-2) |
\        /      \     /    \        /

To compute n-th fibo number you just need to fast exponentiate the matrix.

2. Using the identity f(2n) = f(n)^2 + f(n-1)^2.
To derive it, observe this:

f(2n) = f(2n-1) + f(2n-2) = f(1)f(2n-1) + f(0)f(2n-2)
f(2n) = f(2n-2) + f(2n-3) + f(2n-2) = 2f(2n-2) + f(2n-3) = f(2)f(2n-2) + f(1)f(2n-3)
See the pattern?
f(2n) = f(i)f(2n-i) + f(i-1)f(2n-2-(i-1)) (*)
Now take i=n, you have the identity. You can prove (*) by induction.

Thursday, November 18, 2010

AMPL the modeling language

Very natural to set up models which deals with large scale optimization problems. Close to algebraic notation.

Students version comes for free but limit is 300 variables and 300 constraints.

[lyan@localhost LPSolver]$ cat dualfit.mod
param k;
var f >= 0;
var d {1..k} >= 0;
var a {1..k} >= 0;
var x {1..k,1..k} >= 0;

maximize Dual: sum {j in 1..k} a[j];
subject to deno: f + sum {j in 1..k} d[j] <= 1;
subject to fac {j in 1..k}: sum {i in j..k} x[i,j] <= f;
subject to maxzero {j in 1..k, i in j..k}: x[i,j] >= a[j] - d[i];
subject to trig {j in 1..k, i in 1..j}: a[j] <= a[i] + d[i] + d[j];

[lyan@localhost LPSolver]$ cat dualfit.dat
param k := 20;

[lyan@localhost LPSolver]$ cat runk.sh
#!/bin/bash
if [ $# != 3 ]
then
echo "Usage: runk "
exit 1;
fi

START=$1
END=$2
OUTFILE=$3
if [ -f $OUTFILE ]
then
rm $OUTFILE
fi

for i in `seq $START $END`
do
echo "param k := $i;" > dualfit.dat
cat dualfit.dat >> $OUTFILE
ampl dualfit.run >> $OUTFILE
done

how to check webpage update time

In the address bar, type
javascript:alert(document.lastModified)

Sunday, September 19, 2010

ipe: the diagram kit for latex

Ipe is a nice kit for drawing diagrams for latex. It exports both as eps and pdf. You can then includegraphics to get the diagram. The nice thing is that it allows you to use latex command to write math symbols and formulae.

Installation for Debian/ubuntu is straightforward, a couple of apt-get. However if you do not have root previlige and you are using Redhat system, here is what you can do:

fontconfig: http://www.fontconfig.org/release
Install the latest version, I use 2.8.0 and installed to /opt

Qt: http://get.qt.nokia.com/qt/source/qt-everywhere-opensource-src-4.6.3.tar.gz
in projects.pro, add a line
LIBS += /opt
so that qt will find your fontconfig before hits the old one in /usr/
compile Qt take a while.

LUA: put in /opt as well.

Here is my change for config.mak under ipe/src
LUA_CFLAGS ?= -I/opt/include
LUA_LIBS ?= -L/opt/lib -llua
QT_CFLAGS ?= $(shell pkg-config --cflags QtGui QtCore)
QT_LIBS ?= $(shell pkg-config --libs QtGui QtCore)

MOC ?= moc

there might be build errors when you type 'make'. Just go to the source file and comment out offending lines, or enable some lines that says 'Alternative'

Finally, the ipe script to run ipe, the PIEANCIENTPDFTEX is necessary if your pdflatex is older than 1.4

$ cat ~/bin/ipe
============================
#!/bin/bash
export LD_LIBRARY_PATH="/opt/lib:/opt/ipe7/lib:$LD_LIBRARY_PATH"
export IPEANCIENTPDFTEX=1
/opt/ipe7/bin/ipe $* &
============================

Saturday, September 4, 2010

palindrome

Given a string, find the shortest prefix that is an even palindrome, in linear time.

The problem can be seen as asking for a minimum overlap of S and S^R. Then we can slide S^R to match a prefix of S by using Knuth-Morris-Pratt's algorithm to achieve O(n) time.

Monday, June 14, 2010

GCJ 10 Round 3 Problem A De-RNG-ed

A problem looks simple but hard to get correct.

1. Observations: for k=1 and k=2, and whether S[0] == S[1]

2. Now the formula becomes clear,

S[0]*A + B = S[1] mod P
S[1]*A + B = S[2] mod P

A = (S[2]-S[1])*(S[1]-S[0])^-1 mod P
B = S[1] - S[0]*A

3. How to compute multiplicative inverse?
3.1 Method 1, extended Euclidean Algorithm. The easiest impl is to use recursion

pair gcd(int a, int p)
{
if (p%a==0) return make_pair(1,0);
pair pp = gcd(p%a,a);
int aa = pp.first; int bb = pp.second;
return make_pair(bb-aa*(p/a) , aa);
}

3.2 Method 2, use the fact that a^p = a mod p for prime p
inverse is then mypow(a,p-2)
int mypow(int a, int d, int p)
{
int ret = 1;
while(d)
{
if (d&1) ret = (ret*a)%p;
a = (a*a)%p;
}
return ret;
}

4. Watch integer overflow. Once it gets to 10^6, square it gets 10^12, which is more than MAX_INT.

5. When round a negative number to 0 to P-1, use a = a%p, if (a < 0) a+=p;
a while loop like while(a < 0) a += p; kills you, the running time goes up from a few seconds to 13 min.

6. Precompute primes using naive sieve.

7. Optional, may use binary search to find first prime to work on, but no significant saving in time. Makes no difference in this problem in terms of time.

Sunday, June 13, 2010

Chad Fowler, the passionate programmer

A good writing from a developer/manager. All sections are 2-pages long with wisdom in developing a software engineers career. Not technical but offers a lot advice. A worth reading. circa 200 pages.

latex template

Spent much time battling LNCS format, last time I had a manuscript with excessive white space but this time I got very small margin and what is worse, the output pdf seems to be in A4 paper, which means much larger margin at bottom than top. Use pdflatex instead of latex/dvips/ps2pdf solves the second problem, but still, where is the large margin?

After went home, I compared previous draft with the current one line by line. And finally, package fullpage is the culprit. Remove it solves all the problems. This is a small paper submitted to a workshop. Now I have 13 pages instead of 10, and the limit is 12 pages. After happily removed one section of pseudo code, everything fits.

Saturday, June 12, 2010

GCJ 10 Round 3

As in 2008, I failed in round 2. But I made it closer this year. Almost immediately after the contest of round 2, I figured out problem B, the fifa world cup problem. Solving that one in the first hour would send me to the first 500 to qualify. But I just didn't get it and panicked during the contest. Wasting time recompile and test code that I know a problem exists. Lessons learned: Better make every detail correct on paper and write down as much code as possible on paper before start typing. Anyway, it scores zero if the typed code is incorrect.

Round 3 I have to be a spectator, instead of a contestant. All 4 problems are tough to me. But it is amazing to see how other top coders worked out their last problem with only 1min left. Will think over the problems.

TCO 10 Qual 3

I didn't register in that round so I missed all 3 qual rounds of this year's TCO. But it doesn't hurt to work on the problems in practice room.

p250: Simple problem asking to calc the right bottom number given the top row and left column. The catch is, "return 0 if the bottom right cannot be determined". But this never happes !

p500: Some music jargons. If you read the problem, you know how to solve it.

p1000: The only interesting problem. Given 1000x1000 board, a robot will execute a program made of U, D, L, R, meaning up,down,left,right by one unit. It starts at position (xpos,ypos), and given the program, it will draw the line by executing the program. The problem asks how many pieces will the line cut the board. topleft is (0,0), bottom right is (1000,1000).

After a while I figured out it is a graph connectivity problem. But to actually implement it there are several challenges. To work out the implicit graph needs careful management of the diff in x,y of the two cells separated by that step of robot move. Finally you need to do a BFS or DFS to count number of connected components. But DFS quickly goes segmentation fault because stack overflow. Note each 1x1 cell is a vertex and there are 10^6 of them. A simple BFS with queue implementation will do it.

Monday, May 24, 2010

codejam 2010, round 1

I end up participating all 3 subrounds, and the reason is simple, I didn't qualify the first two subrounds.

round 1A is the perfect time, 6pm PST. I gave up problem A because it is too long, although it turns out anyone solve A in one hour gets to top 1000. problem B is a DP for sure, but for some reason I just can't figure out how insertion works. problem C I had every piece correct, even sensed fibonacci numbers. However my submission for small problem is incorrect, while after the match the same code was determined to be correct!

round 1B is another disaster. Solved only problem A, which is far from enough. problem B is not difficult but I am confused on the way swap is done. Failing problem B and looking at problem C. It is even more interesting. I solved small input but forgot to mod 100003, and made 4 failed attempts! God. After the match the moment I looked at one of the top scorer's solution, I figured out the DP recurrence immediately. As later practice attempts show, getting problem C is still tricky because integer overflow issue. Moreover I forgot to remember previous computation to make it a DP.

round 1C, finally I solved problem A and B. This time my analysis made hits and went as far as to make two correct submissions. problem C is still hard for me, I have solved rectangle version but just didn't recognize the same idea for squares. Also the way remove cells is handled is a bit hard to figure out and I basically gave up after 15min, with about 1hour left. It was 3:30am and I have been working on those codejam problems for two days. Having observed I was ranked around 500, I thought I am safe to advance, so went to sleep.

In a way, google codejam is a very nice programming event. The problem sets and testcases are clever and exhaustive. You also get a chance to read top coders's code for free. The analysis is not easy but you will certainly appreciate it once you figure out what is going on. The difference between programming competition and algorithm textbook is that you have to be able to analyze the problem and realize the constraints. Also, you have to be able to code a DP solution.

Monday, February 8, 2010

Cayley's Formula

In Topcoder SRM 460, div2 1000, it asks for the number of labelled trees with a set of connected components. Each component is itself a tree of course. If each component is a single node, then Cayley's formula gives the result, |T_n| = n^(n-2). Now we have each component may be a tree, suppose there are k components with size c_1, c_2, ..., c_k, now what is the number of trees?

The formula is: n^(k-2)*c_1*c_2*...*c_k. But why?

http://en.wikipedia.org/wiki/Cayley%27s_formula
From this wiki page, the proof can be generalized to prove the above formula. Notice that if a component has c_i nodes, and it has degree d_i when collapsed as a single node in the tree, then there are actually c_i^d_i different trees derived from the tree with component C_i collapsed, because every node in C_i can be used to connect to the outside node for one degree. Now we have additional terms c_1^d_1 * c_2^d_2 * ... * c_k^d_k in the lemma, in addition to (k-2)! Now the multinomial formula that we are to use is to take x_1 = c_1, x_2=c_2, ... x_k = c_k, and

(c_1+c_2+...+c_k)^(k-2)
is what we are looking for.

An interesting alternative proof might be from the elegant Prüfer sequence,
but I am not sure how to extend this sequence to show the generalized formula.

algorithms and random thoughts