Sunday, September 25, 2011

Earley Parsing in Haskell

After a bit of a struggle I can run simple parsing examples using my Earley algorithm implementation [1] in Haskell. At this stage the implementation is likely to be flawed, with respect to correctness and likely performance. Regardless, the point I wanted to prove to myself is that it is perfectly plausible to define context-free grammars and their interpretation in plain Haskell, without resorting to code generation, or giving up full grammar analysis. I suspect it is a well-known method but it was a pleasure to discover. See for yourself:



Aycock and Horspool's paper [2] was my source for the algorithm. I admit I did not get very far, pretty much ignoring (for now) the bulk of their paper and the automaton construction, and focusing on their review of the basics. I also had to adjust things a bit to fit Haskell's purity, and devised a slightly modified (and possibly faulty) interaction of completer and predictor.

Earley's algorithm is beautiful. Very simple, fully general (all CFGs!), cubic in worst-case but close to linear on practical grammars, and, perhaps most importantly, incremental (think completion in an IDE). A highly recommended exercise.

[1] https://github.com/toyvo/haskell-earley
[2] http://webhome.cs.uvic.ca/~nigelh/Publications/PracticalEarleyParsing.pdf

No comments:

Post a Comment