Wednesday, August 10, 2011

Printf

The venerable printf function is available in most languages. Used for "formatted printing", it allows you to convert most basic data types to a string. Several years ago, I contributed an implementation for Factor that currently lives in the formatting vocabulary. Using it looks a bit like this:
( scratchpad ) 12 "There are %d monkeys" printf
There are 12 monkeys
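As a slightly larger illustration, the sketch below passes two arguments. It assumes the official formatting vocabulary is loaded; the second argument and the format string are made up for this example. Arguments are pushed in the same left-to-right order as the directives that consume them:

```factor
USING: formatting ;

! Arguments go on the stack in the order their directives appear.
12 "monkeys" "There are %d %s" printf
! prints: There are 12 monkeys
```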

Implementation

One of the neat things about this version is that the format string is parsed, and the code to format the arguments is generated, at compile time. Below, I've created a simplified version of printf to show how this works.
USING: io io.streams.string kernel macros make math math.parser
peg.ebnf present quotations sequences strings ;
We use the peg.ebnf vocabulary to parse the format string into a sequence of quotations, one for each run of literal text and one for each format directive. Each quotation ends with the "," word from the make vocabulary, which appends the resulting string to the sequence being built (to be written out):
EBNF: parse-printf

fmt-%      = "%"   => [[ [ "%" ] ]]
fmt-c      = "c"   => [[ [ 1string ] ]]
fmt-s      = "s"   => [[ [ present ] ]]
fmt-d      = "d"   => [[ [ >integer number>string ] ]]
fmt-f      = "f"   => [[ [ >float number>string ] ]]
fmt-x      = "x"   => [[ [ >hex ] ]]
unknown    = (.)*  => [[ >string throw ]]

strings    = fmt-c|fmt-s
numbers    = fmt-d|fmt-f|fmt-x

formats    = "%"~ (strings|numbers|fmt-%|unknown)

plain-text = (!("%").)+
                   => [[ >string 1quotation ]]

text       = (formats|plain-text)*
                   => [[ [ \ , suffix ] map ]]

;EBNF
You can see the parser's output by trying it in the listener:
( scratchpad ) "There are %d monkeys" parse-printf .
V{
    [ "There are " , ]
    [ >integer number>string , ]
    [ " monkeys" , ]
}
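To see how those quotations build the output, here is a minimal sketch that runs hand-written equivalents of them inside a single make call. The argument 12 is supplied inline rather than taken from the stack, which is a simplification for illustration only:

```factor
USING: kernel make math math.parser prettyprint ;

! Each "," appends the string on top of the stack to the sequence
! that make is building.
[ "There are " , 12 >integer number>string , " monkeys" , ] { } make .
! => { "There are " "12" " monkeys" }
```

Writing those three strings out in order gives the formatted result.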
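The printf macro takes the parsed output and reverses it, because the last argument sits on top of the stack and so the last directive's quotation has to run first. This lets the arguments be passed on the stack in their natural order. Each quotation adds its string to the sequence being built by make, and that sequence is then reversed again and written out in the original order.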
MACRO: printf ( format-string -- )
    parse-printf reverse [ ] concat-as [
        { } make reverse [ write ] each
    ] curry ;
You can use expand-macros to see the code the macro generates:
( scratchpad ) [ "There are %d monkeys" printf ] expand-macros .
[
    [ " monkeys" , >integer number>string , "There are " , ]
    { } make reverse [ write ] each
]
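The reversal matters most when there is more than one argument. The sketch below is a hand expansion (not captured expand-macros output) of what the macro should generate for the format string "%d %s(s)", applied to the arguments 0 and "message":

```factor
USING: io kernel make math math.parser present sequences ;

! "message" is on top of the stack, so its quotation (present ,) runs
! before the one that formats 0 (>integer number>string ,).
0 "message"
[ "(s)" , present , " " , >integer number>string , ]
{ } make      ! builds { "(s)" "message" " " "0" }
reverse       ! restores format-string order
[ write ] each
! prints: 0 message(s)
```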
Implementing sprintf is easy using string streams to capture the output into a string object:
: sprintf ( format-string -- result )
    [ printf ] with-string-writer ; inline
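For readers who have not used string streams, this small sketch shows with-string-writer on its own, with the writes spelled out by hand instead of generated by printf:

```factor
USING: io io.streams.string math.parser ;

! Everything written inside the quotation is collected into a string
! instead of going to the output stream.
[ "There are " write 12 number>string write " monkeys" write ]
with-string-writer
! leaves "There are 12 monkeys" on the stack
```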

Tests

We can write some unit tests to show that it works:
[ "" ] [ "" sprintf ] unit-test
[ "asdf" ] [ "asdf" sprintf ] unit-test
[ "10" ] [ 10 "%d" sprintf ] unit-test
[ "-10" ] [ -10 "%d" sprintf ] unit-test
[ "ff" ] [ HEX: ff "%x" sprintf ] unit-test
[ "Hello, World!" ] [ "Hello, World!" "%s" sprintf ] unit-test
[ "printf test" ] [ "printf test" sprintf ] unit-test
[ "char a = 'a'" ] [ CHAR: a "char %c = 'a'" sprintf ] unit-test
[ "0 message(s)" ] [ 0 "message" "%d %s(s)" sprintf ] unit-test
[ "10%" ] [ 10 "%d%%" sprintf ] unit-test
[ "[monkey]" ] [ "monkey" "[%s]" sprintf ] unit-test

This implementation doesn't support format parameters such as width, alignment, padding characters, uppercase/lowercase, decimal digits, or scientific notation. Nor does it support formatting sequences and assocs, as the official version does. However, adding those features is straightforward once you understand the basic mechanics.

The code for this is on my GitHub.
