CodeFaster

Don't serialize SSNs as floats

Tyler Adams
Dec 21, 2023

I can’t believe people did this, but they did and it wasn’t even their fault.

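Reader Rob Kolstad brings a story where SSNs were processed with AWK. Great: AWK was a fabulous, fast tool for processing text before we had Perl and now Python. Except AWK at the time happily defaulted numbers to a 32-bit float (today it’s a 64-bit float). A 32-bit float has only a 24-bit significand, while a nine-digit SSN can need up to 30 bits, so quite a few bits of precision can get dropped. To show what can go wrong when SSN-like numbers pass through 32-bit floats (the values here have ten digits, one more than a real SSN, but nine-digit values above 16,777,216 can be rounded the same way):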

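#include <stdio.h>

int main(void) {
    /* Both values need more significant bits than the 24 a float holds. */
    float ssn = 4425567143;
    float ssn2 = 4425567232;
    /* Two different inputs end up as the same float. */
    printf("ssn is %f %f\n", ssn, ssn2);
    return 0;
}

$ ./a.out
ssn is 4425567232.000000 4425567232.000000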
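
The same effect shows up with a real nine-digit value. Here is a minimal sketch, assuming IEEE 754 single-precision floats with the default round-to-nearest mode. The names `through_float` and `survives_float` are made up for illustration and are not from Rob's story.

```c
#include <assert.h>
#include <stdint.h>

/* 2^24: every integer from 0 up to this value fits exactly in a 32-bit float. */
#define FLOAT_EXACT_INT_LIMIT 16777216u

/* Hypothetical helper: push a nine-digit value through a 32-bit float
 * and back, the way a float-based tool would. */
uint32_t through_float(uint32_t ssn) {
    assert(ssn <= 999999999u);        /* nine digits at most */
    volatile float f = (float)ssn;    /* volatile forces real float precision */
    return (uint32_t)f;
}

/* Hypothetical helper: 1 if the value comes back unchanged, 0 otherwise. */
int survives_float(uint32_t ssn) {
    return through_float(ssn) == ssn;
}
```

Under those assumptions, `through_float(442556714u)` returns 442556704: between 2^28 and 2^29 a float can only represent multiples of 32, so the last digits change silently. Values at or below `FLOAT_EXACT_INT_LIMIT` come back unchanged.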

And to show how AWK can still go wrong with 64-bit floats (though not with SSNs, since a 64-bit float represents every integer up to 2^53 exactly):

$ awk 'BEGIN {print 999999999999999}'
999999999999999
$ awk 'BEGIN {print 9999999999999999}'
10000000000000000
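
The same cutoff can be checked in C. This is a rough sketch, assuming IEEE 754 double precision with round-to-nearest. `through_double` and `DOUBLE_EXACT_INT_LIMIT` are illustrative names, not part of AWK or any library.

```c
#include <assert.h>
#include <stdint.h>

/* 2^53: every integer from 0 up to this value fits exactly in a 64-bit double. */
#define DOUBLE_EXACT_INT_LIMIT 9007199254740992LL

/* Hypothetical helper: push an integer through a 64-bit double and back. */
int64_t through_double(int64_t n) {
    assert(n >= 0 && n <= 1000000000000000000LL);  /* keep the cast back in range */
    volatile double d = (double)n;                 /* rounding happens here */
    return (int64_t)d;
}
```

With those assumptions, `through_double(999999999999999LL)` (fifteen nines) comes back unchanged, while `through_double(9999999999999999LL)` (sixteen nines) comes back as 10000000000000000, matching the AWK output above. Any nine-digit SSN is far below `DOUBLE_EXACT_INT_LIMIT`, which is why 64-bit floats are safe for SSNs specifically.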

This might cause a few hours of debugging.
