Wednesday, August 12, 2009

Calculating with EBNF

Factor has a syntax for EBNF parsing grammars, implemented in the peg.ebnf vocabulary.

Several useful vocabularies are partially implemented using EBNF syntax, including (among many) the formatting, globs, urls, regexp, brainfuck, javascript, and smalltalk vocabularies.

I thought I'd walk through the creation of a small macro for parsing "calculator" expressions, as a small demonstration of their utility.

First, the goal. We'd like to be able to parse, and compute, the following:

( scratchpad ) "2+2" calc .
4

So, first we need to define a parsing grammar for numbers. Since these can be represented as an optional sign (for negative numbers), a whole number, and optional digits (for floating point numbers), we start with this structure:

sign   = "-"
whole  = ([0-9])*
digit  = "." ([0-9])*
number = sign? whole digit?

We would like the grammar to parse each component as a string, if specified, concatenate it together, and then convert the resulting string to a number, to be used by the rest of the grammar in calculations.

sign   = "-" 
=> [[ >string ]]

whole  = ([0-9])* 
=> [[ >string ]]

digit  = "." ([0-9])* 
=> [[ [ >string ] map concat ]]

number = sign? whole digit? 
=> [[ [ f eq? not ] filter concat string>number ]]

The calculator needs to support operations, so we start with the basic four: addition, subtraction, multiplication, and division. These map the characters "+", "-", "*", and "/" to the words implementing those functions in Factor.

add  = "+" => [[ [ + ] ]] 
sub  = "-" => [[ [ - ] ]] 
mul  = "*" => [[ [ * ] ]] 
div  = "/" => [[ [ / ] ]]

ops  = add|sub|mul|div

We define an expression to be two numbers combined by an operator, which changes infix to postfix notation, using the fry vocabulary to create a quotation that will perform the intended computation.

expr = number ops number
=> [[ [ first ] [ third ] [ second ] tri '[ _ _ @ ] ]]

Putting this all together, we have our calculator:

USING: fry kernel macros peg.ebnf 
math math.parser sequences strings ;

EBNF: parse-calc

sign   = "-"          => [[ >string ]]
whole  = ([0-9])*     => [[ >string ]]
digit  = "." ([0-9])* => [[ [ >string ] map concat ]]

number = sign? whole digit?  
=> [[ [ f eq? not ] filter concat string>number ]]

add  = "+"  => [[ [ + ] ]]         
sub  = "-"  => [[ [ - ] ]]         
mul  = "*"  => [[ [ * ] ]]         
div  = "/"  => [[ [ / ] ]]

ops  = add|sub|mul|div

expr = number ops number
=> [[ [ first ] [ third ] [ second ] tri '[ _ _ @ ] ]]

;EBNF

MACRO: calc ( string -- ) parse-calc ;

You can use the macros.expander vocabulary to see the code that is created when you compile the original example:

( scratchpad ) [ "2+2" calc ] expand-macros .
[ 2 2 [ + ] call ]

This model can be reasonably extended to support other operators (such as "mod", "shift", "pow", etc.), functions (such as "sin", "sqrt", etc.), spaces in the expressions, chained expressions, grouped expressions, input validation, compile-time calculations, and other useful features. But, then it wouldn't be so simple...