Random Strings on the Command Line
Browsing through some old files today I came across this note:
To Generate A Random String:
tr -dc A-Za-z0-9_ < /dev/urandom | head -c 8 | xargs
Curious if it still worked, I pasted it into my terminal and, unsurprisingly, was met with an error:
tr: Illegal byte sequence
The tr utility is one of the many old-school Unix programs, with a history reaching back at least to System V. It stands for “translate characters”, and with the -dc flags on, it should have deleted every input character except the letters A-Z in both upper and lower case, the digits 0 through 9, and the underscore character (-d deletes, and -c takes the complement of the given set). The “Illegal byte sequence” error means it was really not happy with the input it was getting from /dev/urandom.
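To see what -dc does on its own, here is a small sketch that feeds tr a fixed string instead of random bytes. The sample input is made up for illustration; everything outside the set gets deleted, including the trailing newline.

```shell
# -d deletes characters; -c complements the set, so together they
# delete every character that is NOT in A-Za-z0-9_.
filtered=$(printf 'p@ss w0rd_#1!\n' | tr -dc 'A-Za-z0-9_')
echo "$filtered"   # prints: pssw0rd_1
```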
On macOS, the pseudo-device /dev/urandom is, according to the man page, “a compatibility nod to Linux”. The device generates random bytes, so if we read it we’ll get back raw binary data that looks like:
00010101 01011001 10111101
The command no longer works the way it used to because most modern systems default to a UTF-8 locale. When tr gets the stream of random bytes from /dev/urandom, it tries to decode them as UTF-8, where bytes outside the ASCII range are only valid as part of specific multi-byte sequences. Since we are intentionally generating random bytes, a few may decode properly, but we’ll soon hit a sequence that is not valid UTF-8 and get the “Illegal byte sequence” error above.
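The difference between a valid and an invalid byte sequence can be checked by hand. This is only a sketch: check_utf8 is a hypothetical helper name, and it assumes the standard iconv utility is installed, using it as a stand-in decoder rather than tr itself.

```shell
# check_utf8 is a hypothetical helper: it reports whether stdin decodes as UTF-8.
check_utf8() {
    if iconv -f UTF-8 -t UTF-8 > /dev/null 2>&1; then
        echo valid
    else
        echo invalid
    fi
}

# 0xC3 0xA9 (octal 303 251) is the two-byte UTF-8 encoding of "é".
good=$(printf '\303\251' | check_utf8)
# A lone 0xFF byte (octal 377) never appears in well-formed UTF-8.
bad=$(printf '\377' | check_utf8)
echo "$good $bad"
```

Random data from /dev/urandom is full of bytes like the second one, which is why a UTF-8 decoder gives up quickly.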
To fix the problem, all we need to do is set LC_ALL=C before running tr:
LC_ALL=C tr -dc A-Za-z0-9_ < /dev/urandom | head -c 14 | xargs
Setting LC_ALL=C switches the locale back to the POSIX (or “C”) locale, where text is treated as single-byte characters instead of UTF-8. That means when tr is fed a random string of bytes, it interprets each byte as one character. For byte values 0 through 127, that character comes from the ASCII table, which looks something like this:
Character | ASCII Decimal | ASCII Hexadecimal | Binary Representation |
---|---|---|---|
A | 65 | 41 | 01000001 |
B | 66 | 42 | 01000010 |
C | 67 | 43 | 01000011 |
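The same mapping can be poked at from the shell. A minimal sketch using only printf: a leading single quote in the argument asks printf for the character's numeric code, and an octal escape goes the other way.

```shell
# A leading single quote makes printf print the character's numeric code.
dec=$(printf '%d' "'A")
hex=$(printf '%X' "'A")
# Going the other way: octal 101 is decimal 65, so this escape prints "A".
chr=$(printf '\101')
echo "$chr = $dec (decimal) = $hex (hex)"   # prints: A = 65 (decimal) = 41 (hex)
```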
Now each byte is interpreted as a single character, and tr keeps only the ones that match the set passed as its argument.
➜ LC_ALL=C tr -dc A-Za-z0-9_ < /dev/urandom | head -c 14 | xargs
Rhac_WGis7tHzS
So, to break down each command in the pipeline:
- tr: deletes all characters except those in the sets A-Z, a-z, 0-9, and _
- head -c 14: outputs the first 14 bytes of the input from tr
- xargs: with no command given, runs echo on its input, which adds a nice newline character at the end of the string, so it’s easy to copy.
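Put together, the pipeline fits nicely in a small function. This is a sketch rather than anything from my dotfiles: randstr is a hypothetical name, and the optional length argument (defaulting to 14) is my own addition. It ends with a plain echo to supply the newline.

```shell
# randstr is a hypothetical helper; the optional first argument is the length.
randstr() {
    # Keep only A-Za-z0-9_ from the random stream, then stop after N bytes.
    LC_ALL=C tr -dc 'A-Za-z0-9_' < /dev/urandom | head -c "${1:-14}"
    # Finish the line so the result is easy to copy.
    echo
}

s=$(randstr 20)
echo "$s"
```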
This command could easily be adapted to use base64 instead of tr, without setting LC_ALL=C, if you wanted a couple more possible characters (+ and /) in the string:
base64 < /dev/random | head -c 14 | xargs
Expanding head -c to 34 or so makes for a nice password generator.
In fact, I’ve wrapped this in a function called pgen in my .zshrc:
pgen()
{
    base64 < /dev/random | head -c 32 | xargs
}
There are almost certainly easier ways to generate a random string in the shell, but I like this, and it works for me.
Update: The good Dr. Drang suggested ending the pipeline and running echo instead of xargs for clarity, which makes a lot of sense to me. I updated the function to base64 < /dev/random | head -c 32; echo.
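Written out in full, the echo version might look something like the sketch below. Two things here are my assumptions and not part of the original note: the optional length argument, and reading from /dev/urandom instead of /dev/random, since on older Linux kernels /dev/random can block while waiting for entropy. I also keep the length under 76, because GNU base64 wraps its output at 76 columns by default and a longer request would pick up a newline.

```shell
# Variant of pgen: optional length argument (an assumption), default 32.
pgen() {
    # base64 turns the random bytes into A-Za-z0-9+/ characters.
    base64 < /dev/urandom | head -c "${1:-32}"
    # echo supplies the trailing newline that xargs used to add.
    echo
}

p=$(pgen 40)
echo "$p"
```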