cksum
I had trouble unearthing the CRC used by GNU’s cksum. Its manpage mentions the particular algorithm is a POSIX standard. On seeing cksum print 32-bit integers on test inputs, I assumed the algorithm was CRC-32, the most popular 32-bit CRC.
I eventually realized I should have searched for “posix cksum”. Although it shares the same polynomial, the algorithm differs slightly from CRC-32.
Here’s a simple C implementation (typical implementations use a lookup table for the polynomial reduction):
#include<stdint.h> #include<stdio.h> int main() { unsigned int p = 0x04C11DB7, r = 0; void f(char c) { for(int i = 7; i >= 0; i--) { int msb = r & (1<<31); r <<= 1; if (msb) r = r ^ p; } r ^= c; } int c, len = 0; while(EOF != (c = getchar())) { len++; f(c); } int n = len; do { f(n & 0xff); n >>= 8; } while(n); f(0), f(0), f(0), f(0); printf("%u %u\n", ~r, len); return 0; }
and a Go version, with support for filename arguments:
package main import ("bufio";"flag";"fmt";"io";"os") func main() { cksum := func(file *os.File) { in := bufio.NewReader(file) var p, r uint p = 0x04C11DB7 r = 0 f := func(b byte) { for i := 7; i >= 0; i-- { msb := r & (1<<31) r = r<<1 if msb != 0 { r = r^p } } r ^= uint(b) } n := 0 for done := false; !done; { switch b,er := in.ReadByte(); er { case io.EOF: done = true case nil: f(b) n++ default: println("cksum:", file.Name() + ":", er) return } } for m := n;; { f(byte(m) & 0xff) m = m >> 8 if m == 0 { break } } f(0); f(0); f(0); f(0) fmt.Print(^r, n); if file == os.Stdin { fmt.Println() } else { fmt.Printf(" %v\n", file.Name()) } } flag.Parse() if 0 == flag.NArg() { cksum(os.Stdin) return } for i := 0; i < flag.NArg(); i++ { if flag.Arg(i) == "-" { cksum(os.Stdin) continue } f,er := os.Open(flag.Arg(i)) if f == nil { println("cksum:", er) continue } cksum(f) f.Close() } }
Bits
Go seems at least as good as C when it comes to bit-fiddling. One might argue it is better, because of little differences:
-
the operator precedence is more natural
-
bitwise NOT is "^", rather than "~"
-
the dedicated "&^" operator combines AND and NOT.
-
shifts are defined in the specification, rather than left to the implementation; shifts are arithmetic when the first operand is signed, and logical otherwise.
-
lack of implicit casting reduces bugs
The CRC32 package
A reader noted that Go has a crc32 package, which can compute more commonly encountered CRC-32 checksums.
Unfortunately, I can’t remember how cksum differs (perhaps it goes MSB first instead of LSB first or something like that); it’s been a while since I looked into it, and I’m busy.