codechef July Challenge: Favorite Number problem.
The algorithm finds all matching positions that matches one or more in the given set of patterns.
It implements a finite state automaton. States are 0, 1, 2, ....
goto(s, a) = s' move from state s to s', when input is letter a
fail(s, a) = s' move from state s to s', when input letter a does not match any next state
output(s) = { set of strings in pattern }
When do matching text to patterns, just walk through the trie and follow fail link when necessary, similar to Knuth-Morris-Pratt.
First we show how to use the finite state machine to process text
state = 0
for each ch in text
while goto(state, ch) == fail do, state = fail(state) // must stop at state = 0
state = goto(state, ch)
if output(state) not empty, then emit output(state)
Now we show how to construct the functions.
goto() is easy, for each keywords y[i] , i=1..k (we have k keywords)
state = 0
for each ch in y[i]
if goto(state, ch) == fail, then goto(state, ch) = cntstate++
else do nothing
append y[i] to output[state]
fail() function is constructed using a BFS on the graph. We first do depth=1 nodes, which have parent as state=0, then do it with increasing depth.
queue = empty
state = 0
// do depth=1
for each next_state of state, fail(next_state) = state, enqueue(next_state)
// do depth=d using d-1
while queue not empty do
r = queue.pop()
for each symbol ch
if goto(r, ch) = s != fail, then // s is at depth=d with parent=r
enqueue(s)
state = fail(r)
while goto(state, ch) == fail, do state = fail(state)
fail(s) = state
output(s).append(output(state))
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment