Approaching Text Parsing in Scala

I'm making an application that will parse commands in Scala. An example of a command would be:

todo get milk for friday

So the plan is to have a pretty smart parser break the line apart and recognize the command part and the fact that there is a reference to time in the string.

In general, I need to make a tokenizer in Scala, so I'm wondering what my options are. I'm familiar with regular expressions, but I also plan on adding an SQL-like search feature:

search todo for today with tags shopping

I feel that regular expressions will be too inflexible for implementing commands with a lot of variation. This leads me to think of implementing some sort of grammar.

What are my options in this regard in Scala?


You want to search for "parser combinators". I have a blog post using this approach (http://cleverlytitled.blogspot.com/2009/04/shunting-yard-algorithm.html), but I think the best reference is this series of posts by Stefan Zeiger (http://szeiger.de/blog/2008/07/27/formal-language-processing-in-scala-part-1/).
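
To give a feel for the approach, here is a minimal sketch of a combinator parser for the "todo" command from the question. The names Todo, TodoParser and parseCommand are made up for illustration, and the rule that the first standalone "for" introduces the due date is an assumption about the intended syntax. Note that since Scala 2.11 the combinators live in the separate scala-parser-combinators module ("org.scala-lang.modules" %% "scala-parser-combinators"), so it has to be added as a dependency.

```scala
import scala.util.parsing.combinator.RegexParsers

// Hypothetical result type, for illustration only.
final case class Todo(task: String, due: Option[String])

object TodoParser extends RegexParsers {
  // A word is any run of non-whitespace characters.
  // RegexParsers skips whitespace between tokens by default.
  private val word: Parser[String] = """\S+""".r

  // The keyword "for" as a whole word (so "forest" is not a keyword).
  private val forKw: Parser[String] = """for\b""".r

  // A task word is any word that is not the keyword "for".
  private val taskWord: Parser[String] = not(forKw) ~> word

  // Assumed syntax: todo <task words> [for <day>]
  val command: Parser[Todo] =
    "todo" ~> rep1(taskWord) ~ opt(forKw ~> word) ^^ {
      case words ~ due => Todo(words.mkString(" "), due)
    }

  // Some(todo) if the whole line parses, None otherwise.
  def parseCommand(input: String): Option[Todo] =
    parseAll(command, input) match {
      case Success(result, _) => Some(result)
      case _                  => None
    }
}
```

With this sketch, TodoParser.parseCommand("todo get milk for friday") should give a Todo with task "get milk" and due day "friday". Interpreting "friday" as an actual date is left out; that would be a separate step on top of the parse result.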


Here are slides from a presentation I did in Sept. 2009 on Scala parser combinators. (http://sites.google.com/site/compulsiontocode/files/lambdalounge/ImplementingExternalDSLsUsingScalaParserCombinators.ppt) An implementation of a simple Logo-like language is demonstrated. It might provide some insights.

Scala has a parser library (scala.util.parsing.combinator) which enables one to write a parser directly from its EBNF specification. If you have an EBNF for your language, it should be easy to write the Scala parser. If not, you'd better first try to define your language formally.
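
As a rough illustration of how closely the parser can follow the EBNF, here is a sketch for the search command from the question. The grammar in the comment is only a guess at what the language might look like, and Search, SearchParser and parseSearch are hypothetical names. In Scala 2.11 and later the library is a separate module (scala-parser-combinators) that has to be added as a dependency.

```scala
import scala.util.parsing.combinator.RegexParsers

// Assumed EBNF (a guess, not taken from the question):
//   search = "search" word [ when ] [ tags ] ;
//   when   = "for" word ;
//   tags   = "with" "tags" word { word } ;

// Hypothetical result type, for illustration only.
final case class Search(kind: String, when: Option[String], tags: List[String])

object SearchParser extends RegexParsers {
  private val word: Parser[String] = """\w+""".r

  // Each EBNF rule becomes one parser value.
  private val when: Parser[String]       = "for" ~> word
  private val tags: Parser[List[String]] = "with" ~> "tags" ~> rep1(word)

  // [ x ] in EBNF maps to opt(x); x { x } maps to rep1(x).
  val search: Parser[Search] =
    "search" ~> word ~ opt(when) ~ opt(tags) ^^ {
      case kind ~ w ~ ts => Search(kind, w, ts.getOrElse(Nil))
    }

  // Right(search) on success, Left(error message) otherwise.
  def parseSearch(input: String): Either[String, Search] =
    parseAll(search, input) match {
      case Success(result, _)   => Right(result)
      case noSuccess: NoSuccess => Left(noSuccess.msg)
    }
}
```

For example, SearchParser.parseSearch("search todo for today with tags shopping") should yield a Search with kind "todo", when "today" and the single tag "shopping". Adding a new clause to the language then mostly means adding one more rule and one more field.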

