Saturday, December 24, 2016

The Twelve Days of Christmas

Programming Praxis posted a task to write a program to print the words to The Twelve Days of Christmas song. We are going to solve it in Factor.

We start off by defining all the gifts received on each day:

CONSTANT: gifts {
    { "first" "a partridge in a pear tree" }
    { "second" "two turtle doves and " }
    { "third" "three French hens, " }
    { "fourth" "four calling birds, " }
    { "fifth" "five golden rings, " }
    { "sixth" "six geese a-laying, " }
    { "seventh" "seven swans a-swimming, " }
    { "eighth" "eight maids a-milking, " }
    { "ninth" "nine ladies dancing, " }
    { "tenth" "ten lords a-leaping, " }
    { "eleventh" "eleven pipers piping, " }
    { "twelfth" "twelve drummers drumming, " }
}

Then we iterate through the days, gathering all the gifts in reverse for each day, and formatting them, and wrapping to 72 columns of text for display.

gifts [
    [ first ] [ 1 + gifts swap head values reverse concat ] bi*
    "On the %s day of Christmas my true love gave to me %s." sprintf
    72 wrap-string print nl
] each-index

Which gives us these words:

On the first day of Christmas my true love gave to me a partridge in a pear tree.

On the second day of Christmas my true love gave to me two turtle doves and a partridge in a pear tree.

On the third day of Christmas my true love gave to me three French hens, two turtle doves and a partridge in a pear tree.

On the fourth day of Christmas my true love gave to me four calling birds, three French hens, two turtle doves and a partridge in a pear tree.

On the fifth day of Christmas my true love gave to me five golden rings, four calling birds, three French hens, two turtle doves and a partridge in a pear tree.

On the sixth day of Christmas my true love gave to me six geese a-laying, five golden rings, four calling birds, three French hens, two turtle doves and a partridge in a pear tree.

On the seventh day of Christmas my true love gave to me seven swans a-swimming, six geese a-laying, five golden rings, four calling birds, three French hens, two turtle doves and a partridge in a pear tree.

On the eighth day of Christmas my true love gave to me eight maids a-milking, seven swans a-swimming, six geese a-laying, five golden rings, four calling birds, three French hens, two turtle doves and a partridge in a pear tree.

On the ninth day of Christmas my true love gave to me nine ladies dancing, eight maids a-milking, seven swans a-swimming, six geese a-laying, five golden rings, four calling birds, three French hens, two turtle doves and a partridge in a pear tree.

On the tenth day of Christmas my true love gave to me ten lords a-leaping, nine ladies dancing, eight maids a-milking, seven swans a-swimming, six geese a-laying, five golden rings, four calling birds, three French hens, two turtle doves and a partridge in a pear tree.

On the eleventh day of Christmas my true love gave to me eleven pipers piping, ten lords a-leaping, nine ladies dancing, eight maids a-milking, seven swans a-swimming, six geese a-laying, five golden rings, four calling birds, three French hens, two turtle doves and a partridge in a pear tree.

On the twelfth day of Christmas my true love gave to me twelve drummers drumming, eleven pipers piping, ten lords a-leaping, nine ladies dancing, eight maids a-milking, seven swans a-swimming, six geese a-laying, five golden rings, four calling birds, three French hens, two turtle doves and a partridge in a pear tree.

Monday, November 28, 2016

AnyBar

AnyBar is a macOS status indicator that displays a "colored dot" in the menu bar that can be changed programatically. What it means and when it changes is entirely up to the user.

You can easily install it with Homebrew-cask:

$ brew cask install anybar

The README lists a number of alternative clients in different programming languages. I thought it would be fun to show how to use it from Factor. Since AnyBar responds to AppleScript (and I added support for AppleScript a few years ago), we could do this:

USE: cocoa.apple-script

"tell application \"AnyBar\" to set image name to \"blue\""
run-apple-script

The AnyBar application also listens to a UDP port (default: 1738) and can be instructed to change from a Terminal using a simple echo | nc command:

$ echo -n "blue" | nc -4u -w0 localhost 1738

Using our networking words similarly is pretty simple:

"blue" >byte-array "127.0.0.1" 1738 <inet4> send-once

But if we wanted to get more fancy, we could use symbols to configure which AnyBar instance to send to, with default values to make it easy to use, and resolve-host to lookup hostnames:

SYMBOL: anybar-host
"localhost" anybar-host set-global

SYMBOL: anybar-port
1738 anybar-port set-global

: anybar ( str -- )
    ascii encode
    anybar-host get resolve-host first
    anybar-port get with-port send-once ;

AnyBar is a neat little program!

Saturday, November 26, 2016

Reverse Factorial

A few years ago, I wrote about implementing various factorials using Factor. Recently, I came across a programming challenge to implement a "reverse factorial" function to determine what factorial produces a number, or none if it is not a factorial.

To do this, we examine each factorial in order, checking against the number being tested:

: reverse-factorial ( m -- n )
    1 1 [ 2over > ] [ 1 + [ * ] keep ] while [ = ] dip and ;

And some unit tests:

{ 10 } [ 3628800 reverse-factorial ] unit-test
{ 12 } [ 479001600 reverse-factorial ] unit-test
{ 3 } [ 6 reverse-factorial ] unit-test
{ f } [ 18 reverse-factorial ] unit-test

Thursday, October 27, 2016

Gopher Server

A few days ago, I noticed a post about building a Gopher Server in Perl 6. I had already implemented a Gopher Client in Factor, and thought it might be fun to show a simple Gopher Server in Factor in around 50 lines of code.

Using the io.servers vocabulary, we will define a new multi-threaded server that has a directory to serve content from and hostname that it can be accessed at:

TUPLE: gopher-server < threaded-server
    { serving-hostname string }
    { serving-directory string } ;

When a file is requested, it can be streamed back to clients:

: send-file ( path -- )
    binary [ [ write ] each-block ] with-file-reader ;

The Gopher protocol is defined in RFC 1436 and lists a few differentiated file types. We use the mime.types vocabulary to return the correct one.

: gopher-type ( entry -- type )
    dup directory? [
        drop "1"
    ] [
        name>> mime-type {
            { [ dup "text/" head? ] [ drop "0" ] }
            { [ dup "image/gif" = ] [ drop "g" ] }
            { [ dup "image/" head? ] [ drop "I" ] }
            [ drop "9" ]
        } cond
    ] if ;

When a directory is requested, we can send a listing of all the sub-directories and files it contains, sending their relative path to the root directory being served so they can be requested properly by the client:

:: send-directory ( server path -- )
    path [
        [
            [ gopher-type ] [ name>> ] bi
            dup path prepend-path
            server serving-directory>> ?head drop
            server serving-hostname>>
            server insecure>>
            "%s%s\t%s\t%s\t%d\r\n" sprintf utf8 encode write
        ] each
    ] with-directory-entries ;

To know which path was requested, we read the line, split on the first tab, carriage return, or newline character we see:

: read-gopher-path ( -- path )
    readln [ "\t\r\n" member? ] split1-when drop ;

With all of that built, we can now implement a word to handle a client request:

M: gopher-server handle-client*
    dup serving-directory>> read-gopher-path append-path
    dup file-info directory? [
        send-directory
    ] [
        send-file drop
    ] if flush ;

Initializing a gopher-server instance and providing a convenience word to start one:

: <gopher-server> ( directory port -- server )
    utf8 gopher-server new-threaded-server
        "gopher.server" >>name
        swap >>insecure
        binary >>encoding
        "localhost" >>serving-hostname
        swap resolve-symlinks >>serving-directory ;

: start-gopher-server ( directory port -- server )
    <gopher-server> start-server ;

This is available in the gopher.server vocabulary with a few improvements such as:

  • Support for .gophermap files for alternate results when content is requested.
  • Support for .gopherhead files to print headers above directory listings.
  • Navigation to parent directories using .. links.
  • Display file modified timestamp and file sizes.
  • Improved error handling.

Monday, August 8, 2016

Cuckoo Filters

A Cuckoo filter is a Bloom filter replacement that allows for space-efficient probabilistic membership checks. Cuckoo filters provide the ability to add and remove items dynamically without significantly degrading space and performance. False positive rates are typically low.

This data structure is explained by Bin Fan, Dave Andersen, Michael Kaminsky, and Michael Mitzenmacher in two papers: Cuckoo Filter: Better Than Bloom and Cuckoo Filter: Practically Better Than Bloom. There is also an implementation in C++ that can be referred to.

The Cuckoo filter is basically a dense hash table that can support high load factors (up to 95%) without degraded performance. Instead of storing objects, we will store a hashed fingerprint.

Buckets

First, we need to create a number of buckets. Each bucket will hold 4 fingerprints. Load factors over 96% will cause us to grow our capacity to the next-power-of-2.

! The number of fingerprints to store in each bucket
CONSTANT: bucket-size 4

! The maximum load factor we allow before growing the capacity
CONSTANT: max-load-factor 0.96

: #buckets ( capacity -- #buckets )
    [ bucket-size /i next-power-of-2 ] keep
    over / bucket-size / max-load-factor > [ 2 * ] when ;

Making our buckets is then just an array of arrays:

: <cuckoo-buckets> ( capacity -- buckets )
    #buckets [ bucket-size f <array> ] replicate ;

Given a fingerprint, we can check if it is in a bucket by calling member?:

: bucket-lookup ( fingerprint bucket -- ? )
    member? ;

To insert a fingerprint into the bucket, we find the first empty slot and replace it with the fingerprint. We return a boolean value indicating if we were able to insert it or not:

: bucket-insert ( fingerprint bucket -- ? )
    dup [ not ] find drop [ swap set-nth t ] [ 2drop f ] if* ;

To delete a fingerprint, we find its index (if present) and set it to false.

: bucket-delete ( fingerprint bucket -- ? )
    [ f ] 2dip [ index ] keep over [ set-nth t ] [ 3drop f ] if ;

If the bucket is full, we need to be able to swap a fingerprint into the bucket, replacing/removing an existing one:

: bucket-swap ( fingerprint bucket -- fingerprint' )
    [ length random ] keep [ swap ] change-nth ;

Hashing

Our hashing strategy will be to generate the SHA-1 hash value for a given byte-array, splitting it into two 32-bit values (a 32-bit fingerprint, and a 32-bit index value). We will also generate an alternate index value as well using a constant from the MurmurHash to mix with the primary index:

: hash-index ( hash -- fingerprint index )
    4 over <displaced-alien> [ uint deref ] bi@ ;

: alt-index ( fingerprint index -- alt-index )
    [ 0x5bd1e995 w* ] [ bitxor ] bi* ;

: hash-indices ( bytes -- fingerprint index alt-index )
    sha1 checksum-bytes hash-index 2dup alt-index ;

Insert/Lookup/Delete

Our Cuckoo filter holds our buckets:

TUPLE: cuckoo-filter buckets ;

: <cuckoo-filter> ( capacity -- cuckoo-filter )
    <cuckoo-buckets> cuckoo-filter boa ;

To insert an item into the Cuckoo filter, we calculate its hash-indices and then try inserting it into the bucket specified by the first index, then the bucket specified by the second index. If those buckets are full, we go through a "kickdown" process to move fingerprints from other buckets until we find a bucket that has space, or exceed the maximum number of attempts:

! The maximum number of times we kick down items/displace from
! their buckets
CONSTANT: max-cuckoo-count 500

:: cuckoo-insert ( bytes cuckoo-filter -- ? )
    bytes hash-indices :> ( fp! i1 i2 )
    cuckoo-filter buckets>> :> buckets
    buckets length :> n
    {
        [ fp i1 n mod buckets nth bucket-insert ]
        [ fp i2 n mod buckets nth bucket-insert ]
    } 0|| [
        t
    ] [
        2 random zero? i1 i2 ? :> i!
        max-cuckoo-count [
            drop
            fp i n mod buckets nth bucket-swap fp!
            fp i alt-index i!

            fp i n mod buckets nth bucket-insert
        ] find-integer >boolean
    ] if ;

To lookup an item, we calculate the hash-indices and then check the two buckets to see if the fingerprint can be found.

:: cuckoo-lookup ( bytes cuckoo-filter -- ? )
    bytes hash-indices :> ( fp i1 i2 )
    cuckoo-filter buckets>> :> buckets
    buckets length :> n
    {
        [ fp i1 n mod buckets nth bucket-lookup ]
        [ fp i2 n mod buckets nth bucket-lookup ]
    } 0|| ;

To delete an item, we calculate the hash-indices and then try and remove it from the first index, or the second index if not found in the first bucket.

:: cuckoo-delete ( bytes cuckoo-filter -- ? )
    bytes hash-indices :> ( fp i1 i2 )
    cuckoo-filter buckets>> :> buckets
    buckets length :> n
    {
        [ fp i1 n mod buckets nth bucket-delete ]
        [ fp i2 n mod buckets nth bucket-delete ]
    } 0|| ;

This is available in the cuckoo-filters vocabulary along with some tests, documentation, and a few extra features including the ability to change the checksum being used.

Friday, July 15, 2016

Backticks

Most languages support running arbitrary commands using something like the Linux system function. Often, this support has both quick-and-easy and full-featured-but-complex versions.

In Python, you can use os.system:

>>> os.system("ls -l")

In Ruby, you can use system as well as "backticks":

irb(main):001:0> system("ls -l")

irb(main):002:0> `ls -l`

Basically, the difference between "system" and "backticks" is:

  • "system" executes a command, returning the exit code of the process.
  • "backticks" executes a command, returning the standard output of the process.

Factor has extensive cross-platform support for launching processes, but I thought it would be fun to show how custom syntax can be created to implement "backticks", capturing and returning standard output from the process:

SYNTAX: `
    "`" parse-multiline-string '[
        _ utf8 [ contents ] with-process-reader
    ] append! ;

You can use this in a similar fashion to Ruby or Perl:

IN: scratchpad ` ls -l`
Note: This syntax currently requires a space after the leading backtick. In the future, we have plans for an improved lexer that removes this requirement.

This is available in the backticks vocabulary.

Wednesday, July 6, 2016

Clock Angles

Programming Praxis posted about calculating clock angles, specifically to:

Write a program that, given a time as hours and minutes (using a 12-hour clock), calculates the angle between the two hands. For instance, at 2:00 the angle is 60°.

Wikipedia has a page about clock angle problems that we can pull a few test cases from:

{ 0 } [ "12:00" clock-angle ] unit-test
{ 60 } [ "2:00" clock-angle ] unit-test
{ 180 } [ "6:00" clock-angle ] unit-test
{ 18 } [ "5:24" clock-angle ] unit-test
{ 50 } [ "2:20" clock-angle ] unit-test

The hour hand moves 360° in 12 hours and depends on the number of hours and minutes (properly handling midnight and noon to be ):

:: hour° ( hour minutes -- degrees )
    hour [ 12 = 0 ] keep ? minutes 60 / + 360/12 * ;

The minute hand moves 360° in 60 minutes:

: minute° ( minutes -- degrees )
    360/60 * ;

Using these words, we can calculate the clock angle from a time string:

: clock-angle ( string -- degrees )
    ":" split1 [ number>string ] bi@
    [ hour° ] [ minute° ] bi - abs ;

Friday, March 25, 2016

left-pad

In the wake of an epic ragequit where Azer Koçulu removed all of his modules from npm (the node.js package manager), there have been so many entertaining discussions and explanations covering what happened.

Today, Programming Praxis posted the leftpad challenge, pointing out that the original solution ran in quadratic time due to it's use of character-by-character string concatenation (but not pointing out that it only works with strings).

First, the original code in Javascript:

function leftpad (str, len, ch) {
  str = String(str);
  var i = -1;
  if (!ch && ch !== 0) ch = ' ';
  len = len - str.length;
  while (++i < len) {
    str = ch + str;
  }
  return str;
}

Now, a (simpler? faster? more general?) version in Factor:

:: left-pad ( seq n elt -- newseq )
    seq n seq length [-] elt <repetition> prepend ;

Using it, you can see it works:

IN: scratchpad "hello" 3 CHAR: h left-pad .
"hello"

IN: scratchpad "hello" 10 CHAR: h left-pad .
"hhhhhhello"

And it even works with other types of sequences:

IN: scratchpad { 1 2 3 } 3 0 left-pad .
{ 1 2 3 }

IN: scratchpad { 1 2 3 } 10 0 left-pad .
{ 0 0 0 0 0 0 0 1 2 3 }

I should also point out that Factor has pad-head that does this in the standard library and node.js has a pad-left module that solves the quadratic time problem (but still only works with strings).