Today's programming challenge is to implement the "old Unix Sys V R4" sum command:
"The original sum calculated a checksum as the sum of the bytes in the file, modulo 2^16−1, as well as the number of 512-byte blocks the file occupied on disk. Called with no arguments, sum read standard input and wrote the checksum and file blocks to standard output; called with one or more filename arguments, sum read each file and wrote for each a line containing the checksum, file blocks, and filename."
First, some imports:
USING: command-line formatting io io.encodings.binary io.files kernel math math.functions namespaces sequences ;
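Before touching any files, the arithmetic from the description can be checked on a tiny in-memory input. This is only a sketch: the four byte values are an assumed sample (the ASCII codes for "abc" followed by a newline), not data from the challenge.

```factor
USING: formatting kernel math math.functions sequences ;

! Assumed sample input: the byte values of "abc\n".
! Checksum: ( 97 + 98 + 99 + 10 ) mod 65535 = 304
! Blocks:   ceiling( 4 / 512 ) = 1
{ 97 98 99 10 }
[ sum 65535 mod ] [ length 512 / ceiling ] bi
"%d %d\n" printf
! prints: 304 1
```

Note that 512 / yields an exact ratio for sizes that are not a multiple of 512, so ceiling rounds a partial block up to a whole one.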
A quick file-based version might look like this:
: sum-file. ( path -- )
    [
        binary file-contents
        [ sum 65535 mod ] [ length 512 / ceiling ] bi
    ] [ "%d %d %s\n" printf ] bi ;
You can try it out:
( scratchpad ) "/usr/share/dict/words" sum-file.
19278 4858 /usr/share/dict/words
The main drawbacks to this version are: loading the entire file into memory (which might be a problem for big files), not printing an error if the file is not found, and not supporting standard input.
A more complete version might begin by implementing a function that reads from a stream, computing the checksum and the number of 512-byte blocks:
: sum-stream ( -- checksum blocks )
    0 0 [ 65536 read-partial dup ] [
        [ sum nip + ] [ length + nip ] 3bi
    ] while drop [ 65535 mod ] [ 512 / ceiling ] bi* ;
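One way to exercise sum-stream without a file is to run it against an in-memory byte stream. The sketch below assumes the sum-stream word above is already defined in the listener; the input (1000 bytes, each with value 200) is a made-up sample chosen so that the byte total exceeds 65535 and the modulo step actually matters.

```factor
USING: arrays byte-arrays formatting io.encodings.binary
io.streams.byte-array ;

! Assumed sample input: 1000 bytes, each equal to 200.
! Checksum: 1000 * 200 = 200000; 200000 mod 65535 = 3395
! Blocks:   ceiling( 1000 / 512 ) = 2
1000 200 <array> >byte-array
binary [ sum-stream ] with-byte-reader
"%d %d\n" printf
! prints: 3395 2
```

Because sum-stream reads from the current input stream, the same word works unchanged for a file reader or standard input.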
The output should look like CHECKSUM BLOCKS FILENAME:
: sum-stream. ( path -- ) [ sum-stream ] dip "%d %d %s\n" printf ;
We can generate output for a particular file (printing FILENAME: not found if the file does not exist):
: sum-file. ( path -- )
    dup exists? [
        dup binary [ sum-stream. ] with-file-reader
    ] [ "%s: not found\n" printf ] if ;
And, to prepare a version of sum that we can deploy as a binary and run from the command line, we build a simple MAIN: word:
: run-sum ( -- )
    command-line get [ "" sum-stream. ] [
        [ sum-file. ] each
    ] if-empty ;

MAIN: run-sum
The code for this is on my GitHub.