Designing high-performance State Machine in Java

Question

I am in the process of starting to write a Java library to implement high-performance Finite State Machines. I know there are a lot of libraries out there, but I want to write my own from scratch, as almost all the libraries out there construct automatons optimized for handling only one at a time.

I would like to know what the people in the SO community who have dabbled in state machine design feel are the most important / best design principles when it comes to implementing high-performance libraries like these.

Considerations

The automatons generated are typically not massive. (~ 100-500 states).
The implementation should be able to scale though.
The implementation should enable fast transformations (minimization, determinization etc.).
Looking to implement DFA, NFA, GNFA, PDA and possibly Tree Automata. Hopefully under a single interface if possible.
Should have a good balance between memory use and performance.

Current questions regarding design for me at the moment are:

Should classes for State, Symbol and Transition be defined? Or should a "hidden" internal structure be used? Personally I feel that using classes as such would waste a lot of memory, since the same information can be stored in a much more condensed form. But does this enable faster transformations? Does it have any other pros / cons?
What would be the best way to store the data internally? Using data structures like HashMap and HashSet enables amortized constant time lookups, but there is an element of overhead involved. Is this the best way? Storing the transition information as a primitive (or not) array seems to waste quite a bit of memory. Especially when the library needs to handle a lot of automatons at a time. What are the pros / cons of the different data structures?

I appreciate any input. Thanks!

Hi, maybe you can use my professor's implementation as a reference. It is DFA and NFA based and open source. Implementation: http://www.brics.dk/automaton/ Performance benchmarks: http://tusker.org/regex/regex_benchmark.html — Lasse Espeholt, Mar 12 '11 at 10:57
@lasseespeholt - Lol, I actually have the code already and I am working through it. It's one of the only Java libraries I've found that are actually in active development. Give my appreciation and regards to your professor! — Nico Huysamen, Mar 12 '11 at 11:03
Have you thought about using stateful rule sessions in Drools? — Chris J, Mar 12 '11 at 12:50

score 8 · Accepted Answer · answered Mar 12 '11 at 12:28

Well, how fast do you want it to be? The code at brics.dk/automaton does declare its own State and Transition classes although, obviously, these could be rewritten using primitives (heck, the entire Transition class's state apparently would easily fit in a long).

Thing is, if you replace, for example, the Transition class with a plain primitive, you are no longer forced to use the slow default Java collections such as HashMap<Transition,...>. You can use libraries like Trove's TLongObjectHashMap (or TLongInt... or TLongLong, whatever), which beat the default HashMap by a wide margin. The Trove libraries basically provide maps and sets that are super efficient, both fast and small, when you work with primitives: you don't generate piles of garbage or constantly wrap primitives needlessly, so there's less GC, etc. If you're into performance, then you do want to check out Trove... And their upcoming 3.0 release is 20% faster than Trove 2.0.
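
To make the "fits in a long" idea concrete, here is a minimal sketch. It assumes a transition consists of a character range (min, max) plus an int id for the target state; the class name PackedTransition and the bit layout are hypothetical and not taken from the brics.dk code.

```java
/** Hypothetical helper: packs a (min, max, target) transition into one long. */
final class PackedTransition {
    private PackedTransition() {}

    // Layout (an assumption of this sketch):
    // bits 63..48 = min char, bits 47..32 = max char, bits 31..0 = target state id.
    static long pack(char min, char max, int target) {
        return ((long) min << 48) | ((long) max << 32) | (target & 0xFFFFFFFFL);
    }

    static char min(long t)   { return (char) (t >>> 48); }
    static char max(long t)   { return (char) (t >>> 32); }
    static int target(long t) { return (int) t; }

    /** True if the character range of the packed transition contains c. */
    static boolean accepts(long t, char c) {
        return min(t) <= c && c <= max(t);
    }
}
```

With this encoding, the outgoing transitions of a state could be kept in a plain long[] or used as keys in a primitive-keyed map, with no Transition object allocated per edge.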

But is it really useful? Apparently that library is already plenty fast. There's no doubt it can be made faster by not wastefully creating objects and by using collections that actually perform well, but it's not clear that it would be desirable.

Besides that, I'm pretty sure that the library above is not thread safe. The State constructor creates a unique ID by doing this:

static int next_id;
.
.
.
id = next_id++;

and that constructor is called from... 90 different places!

Textbook example of how not to create a unique ID in a multi-threaded scenario (heck, even making next_id volatile wouldn't be sufficient; you want, say, an AtomicInteger here). I don't know the library well enough, but this ID thingy looks very fishy to me.
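
The reason volatile does not help is that next_id++ is a read, an add and a write, and two threads can interleave between those steps and end up with the same ID. A minimal sketch of the AtomicInteger variant follows; SafeState is a hypothetical stand-in, not the library's State class.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a State class; not the brics.dk implementation. */
final class SafeState {
    // Shared counter; getAndIncrement() is a single atomic read-modify-write.
    private static final AtomicInteger NEXT_ID = new AtomicInteger();

    final int id;

    SafeState() {
        // Each constructor call gets a distinct value, even under contention.
        this.id = NEXT_ID.getAndIncrement();
    }
}
```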

Thanks for the info, especially into *Trove*, I will definitely have a look at it. As far as speed is concerned, the faster the better, so long as transformation speed is not affected. — Nico Huysamen, Mar 12 '11 at 12:48
I have been playing around with *Trove*. Those collections are crazy fast. Thanks for the link! — Nico Huysamen, Mar 13 '11 at 12:13
@Nico Huysamen: ahah, nice :) The way they're generated is pretty cool too: they're basically using pre-processor/code-generators as to not have to duplicate Java code for all the primitive types. — SyntaxT3rr0r, Mar 15 '11 at 00:02

score 3 · Answer 2 · answered Mar 12 '11 at 14:11

I have some questions:

Which part do you need to be fast, the inputting of the FSA, the building of the FSA, or the execution of the FSA?
Where does the input of the FSA come from? Does a human put in states and arcs, or some automatic process? Does the real input come from a regular expression that's converted to an FSA?
How often can the FSA change? Once a second? Once a year?

You know what you need. Aside from academic Turing machines, I've never seen a significant state machine that didn't start from a textual representation, either as a regular expression or a structured program.

In every case I've dealt with, the preferred implementation was to convert the regular expression directly into a simple structured program and compile it. Nothing will execute any faster than that.
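
As an illustration of what "a simple structured program" can look like, here is a hand translation of the regular expression ab*c. The regex and the class name AbStarC are made up for this sketch; a real generator would emit code of this shape automatically.

```java
/** Hand-translated matcher for the illustrative regex ab*c. */
final class AbStarC {
    private AbStarC() {}

    static boolean matches(CharSequence s) {
        int i = 0;
        final int n = s.length();

        // 'a': exactly one occurrence is required.
        if (i >= n || s.charAt(i) != 'a') return false;
        i++;

        // 'b*': zero or more occurrences become a loop.
        while (i < n && s.charAt(i) == 'b') i++;

        // 'c': exactly one occurrence is required.
        if (i >= n || s.charAt(i) != 'c') return false;
        i++;

        // The whole input must have been consumed.
        return i == n;
    }
}
```

Here the automaton's state is implicit in the program counter: there is no transition table and no lookup at run time.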

Mike Dunlavey

1) What actually needs to be fast is any transformations done on the automata (determinizing, minimizing etc). 2) 99.9% generated by a random generator. 3) If you are talking about transformations, I would say more once a second than once a year. – Nico Huysamen Mar 12 '11 at 21:38
@Nico: Then what I would do is prototype those transformation algorithms on the simplest data structure possible. [Tune them on realistic input.](http://stackoverflow.com/questions/926266/performance-optimization-strategies-of-last-resort/927773#927773) Then do version 2 with whatever data structure you now know is best. – Mike Dunlavey Mar 12 '11 at 21:56
I forgot to mention. I do not wish to convert regular expressions to FAs (well, I'll do that too, but not initially). When generating random FAs, I need full control over the number of states, the transitions, how many transitions per state etc... Simply converting REGEX will invalidate that requirement. – Nico Huysamen Mar 13 '11 at 08:55
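
One candidate for the "simplest data structure possible" mentioned in the comments above is a dense transition table, sketched below together with a seeded random generator where the caller fixes the number of states and the alphabet size. Everything here (the TableDfa name, state 0 as the start state, symbols as small ints) is an assumption for illustration only.

```java
import java.util.Random;

/** Hypothetical prototype: a complete DFA stored as one flat int array. */
final class TableDfa {
    final int numStates;
    final int alphabetSize;
    final int[] next;          // next[state * alphabetSize + symbol] = target state
    final boolean[] accepting; // accepting[state] marks final states

    TableDfa(int numStates, int alphabetSize) {
        this.numStates = numStates;
        this.alphabetSize = alphabetSize;
        this.next = new int[numStates * alphabetSize];
        this.accepting = new boolean[numStates];
    }

    /** Builds a random complete DFA; the caller fixes the size and the seed. */
    static TableDfa random(int numStates, int alphabetSize, long seed) {
        Random rnd = new Random(seed);
        TableDfa dfa = new TableDfa(numStates, alphabetSize);
        for (int i = 0; i < dfa.next.length; i++) {
            dfa.next[i] = rnd.nextInt(numStates);
        }
        for (int s = 0; s < numStates; s++) {
            dfa.accepting[s] = rnd.nextBoolean();
        }
        return dfa;
    }

    /** Runs the DFA from state 0 over symbols in [0, alphabetSize). */
    boolean accepts(int[] input) {
        int state = 0;
        for (int symbol : input) {
            state = next[state * alphabetSize + symbol];
        }
        return accepting[state];
    }
}
```

For 500 states and a 256-symbol alphabet this table holds 128,000 ints, roughly 500 KB per automaton, which is the memory cost raised in the question. Measuring the transformations on a prototype like this would show whether a sparser layout is worth the extra complexity.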
