(via drawohara)
Monthly Archives: May 2008
acts_as_state_machine for Dummies, part I
I recently applied this great plugin a few times to tackle different tasks and would like to share with you the joys of thinking in state machines!
Disclaimer: This installment is ‘just’ an intro to the topic (a teaser if you like), it doesn’t contain actual instructions or code on how to use acts_as_state_machine – that comes in part II. Though the original intent was to write up a small tutorial with some example code, I started with an intro and the article grew so long that I decided to split it up – so stay tuned for part deux!
State what…?!?
A finite state machine (FSM for short) a.k.a. finite state automaton (FSA) is a 5-tuple (Σ,S,s0,δ,F)… OK, just kidding. I doubt too much people are interested in the rigorous definition of the FSM (for the rest, here it is), so let’s see a more down-to earth description.
According to wikipedia, Finite state machine is “_a model of behavior composed of a finite number of states, transitions between those states, and actions_”. Somewhat better than those gammas and sigmas and stuff but if you are not the abstract thinker type, it might take some time to wrap your brain around it. I believe a good example can help here!
Everybody knows and loves regular expressions – but probably it’s not that wide known fact that regular expression matching can be solved with an FSM (and in fact, a lot of implementations are using some kind of FSM on steroids). So let’s see a simple example. Suppose we would like to match a string against the following, simple regexp:
ab+(a|c)b*
First we have to construct the FSM, which will be fed with the string we would like to match. An FSM for the above regular expression might look like this:
String matching against this FSM is basically answering the question ‘starting from the initial state, can we reach the finish after feeding the FSM the whole string?’. Pretty easy – the only thing we have to define is ‘feeding’.
So let’s take the string ‘abbbcbbb’ as an example and feed the FSM! The process looks like this:
- we are in q0, the initial state (where the ‘start’ label is). Starting to consume the string
- we receive the first character, it’s an ‘a‘. We have an ‘a‘ arrow to state q1, so we make a transition there
- we receive the next character, ‘b‘. We have two b-arrows: to q1 and q2. We choose to go to q1 (in fact, staying in q1) – remember, the question is whether we _can_ reach the finish, not whether all roads lead to the finish – so the choice is ours!
- identical to the above
- after the two above steps, we are still in q1. We still get a ‘b‘ but this time we decide to move to q2.
- we are in q2 and the input is ‘c‘. We have no analysis-paralysis here since the only thing we can do is to move to q4 – so let’s do that!
- Whoa! We reached the finish line! (q4 is one of the terminal states). However, we didn’t consume the whole string yet, so we can’t yet tell whether the regexp matches or not.
- So we eat the rest of the string (the ‘how’ is left as an exercise to the reader) and return ‘match!’
Let’s see a very simple non-matching example on the string ‘abac’
- in q0
- got an ‘a‘, move to q1
- in q1, got a ‘b‘, move to q2
- in q2, got an ‘a‘, move to q3 – we reached the finish, but still have a character to consume
- in q3, got a ‘c‘… oops. We have no ‘c’ arrow from q3 so we are stuck. return ‘no match!’
Of course the real-life scenarios are much more complicated than the above one and sometimes FSMs are not enough (for example to my knowledge it’s not possible to tell about a number whether it is prime or not with a vanilla FSM – but a regexp doing just that has been floating around some time ago) but to illustrate the concept this example served fine.
This is cool and all but why should I care?!?
Well, yeah, you are obviously not going to model an FSM the next time you would like to match a regexp – that would be wheel-reinvention at it’s finest. However there are some practical scenarios where an FSM can come handy:
- sometimes the logic flow is just too complicated to model – an if-forrest is rarely a good solution (on the flip side, don’t model an if-else with an FSM :-))
- encapsulate complex logic flow into a pattern and not clutter your code with it.
- you are in a stateless world – for example HTTP
- asynchronous and/or distributed processing where you explicitly need to maintain your state and act upon it
Some real life examples of FSM usage in the Ruby/Rails world are why the lucky stiff’s Hpricot (using Ragel) or Rick Olson’s restful authentication plugin (using acts_as_state_machine)
The Next Episode
In the next installment I’d like to focus on the practical usage of the acts_as_state_machine plugin – I’ll attempt to create an asynchronous messaging system in a Rails app using it.