CS 451/LLC Lab 6: Parsing Regular Expressions

Due Wednesday 12 April 2006

Using Haskell's (monadic) parsing combinators, write a parser that will read in a regular expression (possibly with minimal parenthesiation) and return a value of a regular expression type (as presented in code in lecture, or of your own design).

More specifically, you should use the characters "0" and "E" for the null and epsilon regular expressions and lower-case alphabetic letters for the literals. The star operator should have highest precedence, followed by concatenation (just juxtaposition) and finally the alternation operator (vertical bar, "|"). Concatenation and alternation should be consistently left- or right-associating (whichever is easier for you to get).

For example, a typical regular expression might look like this:

a*(a|b)*E|0
which would be interpreted as equivalent to the following excessively-parenthesized expression (your program should accept either, and print out something reasonable to verify its correctness, though perhaps not so minimal as the first):
(((((a)*)((a|b)*))E)|0)
(perhaps modulo left-/right association).

You may want to look at:

More references to come, as needed ...