The history of md5crypt

When Bill Jolitz released 386BSD-0.0 in March 1992, the Wassenaar agreement was still just an idea, and export from the USA of cryptographic source code, considered a “munition”, carried stiff penalties, so the crypt(3) function Bill supplied looked like this:

char *
crypt (k, s) char *k,*s; {
     write(2,"Crypt not present in system\n", 29);

This was obviously no good, and already a month later, along with patchkit version 2.2.3, a special patch number 50000 existed which added the right and proper DES based crypt.c to the system.

Patch 50000 is one of the few files from back then I don’t have a copy of, but I am pretty certain that it contained a verbatim copy of crypt.c from the 4.4-Lite release (i.e. this file).

By June a CVS tree had overtaken the patch-kit, and after a few days the 4.4-Lite crypt.c was added to it, only to be removed a month later because of the ITAR exportability issue.

Instead a homebrew password scrambler was added to the tree, and after a few iterations of bugfixes it settled down.

Then a lot of stuff happened, including the BSD-USL lawsuit and my move to the USA, and near the end of 1994 we were ramping up towards the FreeBSD 2.0 release, which I had been suckered into being release engineer for.

One of the interesting parts of my job at the time was that I had access to a network of 120-130 PCs with 33 MHz 486 CPUs and a couple of 60 MHz Pentiums (FDIV bug and all), plus a handful of SUN/1000 servers, including a 100Mbit ATM network and whatnot.

All these PCs ran NT3.1, but it only took me about 15 minutes to boot them all diskless from floppy disk, using the Sun servers as NFS servers.

And thus, outside normal working hours, the first FreeBSD “supercomputer” cluster was born. I had a lot of fun with that.

Later I used it for real work too, for instance to demonstrate how fragile TCP was in Solaris 2.3 and for general stress testing of the network for our system.

But of course, mostly I used it to have fun, for instance by compiling all of FreeBSD in 21 or 22 minutes, a record not broken for many, many years.

I also used the cluster to see how many crypt(3) calls I could do per second, and as should be obvious if you read it, the above function is actually really really fast.

That worried me.

Running the same statistical tests I had used to point out flaws in credit-card PIN code algorithms when I worked at an oil company, I realized that it was also really really lousy.

I also happened to talk to an acquaintance who did stuff with FPGAs about it. He ran some tests after work and reported his numbers, and it was not good news: with the kit he had access to at work, he could brute-force any likely DES-crypt.c password in a matter of days.

Something clearly had to be done, and in the early years of FreeBSD, that meant that the Release Engineer had to do it.

Deciding to base the new crypt.c on MD5 was the easiest bit. I would have preferred to use MD2, the slowest member of the Ron Rivest trio, but MD2 was only released for “non-commercial Internet Privacy-Enhanced Mail”. MD4 and MD5 were released for any use, and MD5 was the slower of the two.

I had already some months previously added MD2, MD4 and MD5 to FreeBSD as a “libmd” library to avoid them proliferating throughout our tree (Fat lot of good that did).

I spent some days figuring out what semantics the passwd::pw_passwd field actually had, and found out that most of the code made no assumptions about it, so I could extend its length and change its charset, as long as I stayed away from colons.

And with that out of the way, md5crypt was born.

Several design decisions are worth pointing out:

I used a much bigger salt, 48 bits vs DES crypt’s 12 bits; I added the ‘$1$’ algorithmic identifier (“so we can get better later on”); I made the process slow, 36 milliseconds on our state-of-the-art Pentium 60 machine; and I removed the length restriction on the typed password.
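To make the salt and identifier decisions concrete, here is a minimal C sketch that splits a stored password string of the form $id$salt$hash into its fields and counts the salt bits. It is an illustration only, not the md5crypt source: the names mcf_split, struct mcf_fields and salt_bits are hypothetical, and the bit count assumes each salt character carries 6 bits (a 64-character alphabet), so eight characters give 48 bits and the two characters of a DES-style salt give 12.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper type, not part of any libc: fields of "$id$salt$hash". */
struct mcf_fields {
	char id[8];		/* algorithm identifier, e.g. "1" */
	char salt[17];		/* salt characters, NUL terminated */
	const char *hash;	/* points into the caller's string */
};

/* Returns 0 on success, -1 if the string is not in "$id$salt$hash" form. */
int
mcf_split(const char *s, struct mcf_fields *out)
{
	const char *id_end, *salt_end;
	size_t id_len, salt_len;

	assert(s != NULL && out != NULL);
	if (s[0] != '$')
		return -1;	/* no identifier, e.g. an old DES-style entry */
	id_end = strchr(s + 1, '$');
	if (id_end == NULL)
		return -1;
	salt_end = strchr(id_end + 1, '$');
	if (salt_end == NULL)
		return -1;
	id_len = (size_t)(id_end - (s + 1));
	salt_len = (size_t)(salt_end - (id_end + 1));
	if (id_len == 0 || id_len >= sizeof out->id ||
	    salt_len >= sizeof out->salt)
		return -1;
	memcpy(out->id, s + 1, id_len);
	out->id[id_len] = '\0';
	memcpy(out->salt, id_end + 1, salt_len);
	out->salt[salt_len] = '\0';
	out->hash = salt_end + 1;
	return 0;
}

/* Salt size in bits, ASSUMING 6 bits per salt character. */
unsigned
salt_bits(const char *salt)
{
	return 6u * (unsigned)strlen(salt);
}
```

Given a placeholder string such as "$1$abcdefgh$xxxxxxxx", mcf_split() reports the identifier "1" and an eight-character salt, which salt_bits() counts as 48 bits. A string without the leading ‘$’ is rejected, which is the point of the identifier: code can tell the formats apart and pick the right algorithm.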

Since differential attacks on cryptographic algorithms were sort of the big buzz at the time, I also tried to design the code/data path to avoid any risk of two-way-leverage against the algorithm.

But the most important decision I made was to slap the beerware license on the code. I’ll tell the story about the beerware license another time, but for now it’s sufficient to say that its clarity of intention made md5crypt a very popular copy&paste target.

“Ponto Facto, Caesar Transit” as the Romans would say, and I moved on to other and more pressing issues.

At the 1999 USENIX conference, Niels Provos and David Mazières presented “bcrypt”. I remember reading their paper and feeling a tiny bit slighted by their dismissal of md5crypt, but hey, they were OpenBSD guys and I knew how they rolled…

Around the same time, I was back in Denmark, with our ill-fated document-imaging system, and received a software upgrade for the Cisco 7010 routers. After setting the “enable” password I was rewarded with a tell-tale “$1$blabla$blablabla” string, and after a quick compatibility check I discreetly asked a FreeBSD committer who worked at Cisco if he could take a peek at that bit of their source code.

He did: They had.

Rumour has it that there is a bottle of Anchor Steam sitting next to a signed copy of the official Cisco P/O waiting for me in Menlo Park somewhere.

In the same timeframe I was also contacted by phone by an IBM lawyer involved in the due diligence on their takeover of Whistle Communications, whose products were based on FreeBSD.

That was the silliest conversation I have ever had with a lawyer, and he knew it would be, so he had a hard time staying serious and at the end we had a good laugh about the legal profession’s lack of basic beer-etiquette.

I also found out that GNU::LIBC had adopted md5crypt, but in their usual zeal they had rewritten it and erased all signs of where it came from. I raised a minor stink about that, and they grudgingly put in attribution.

The only place I have ever actively advocated adoption of md5crypt was the RIPE database: If we were going to store and email hashed passwords around, at the very least they could hash them well, and with a non-ITAR affected algorithm.

In 2004 I was invited to submit a “sidebar” for an issue of IEEE Software magazine focusing on using open source software. I used the chance to make the point that if people adopted md5crypt for their own devices, it might be a good idea that I knew about it, so that I knew who to contact, should I have something important to tell them.

Either nobody read that issue of IEEE Software or it was considered a silly idea, because I usually found out about md5crypt’s migration only when people noticed it.

One notable instance was some computer game including the beerware license in their credits. That caused a number of people to think I wrote the entire thing, sending me emails with everything from complaints and ideas, to angry protests that they had already paid full price and “didn’t owe nobody no beer” etc.

By then my laptop was ten to twenty times faster than the FDIV-challenged Pentium, and various students of cryptography started emailing me their results, which slowly and steadily edged closer and closer to one million calls per second using CPUs and GPUs, and a somewhat higher number using FPGAs.
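As a rough illustration of what that shift means, the following C sketch compares the two rates mentioned in this story: one call per 36 milliseconds on the Pentium 60, and one million calls per second. The functions keyspace and exhaust_seconds are hypothetical helpers, and the password space used in the note below (eight lowercase letters) is purely an assumption for the sake of the arithmetic, not a figure from any of the reported results.

```c
#include <assert.h>

/* Number of candidate passwords: alphabet_size raised to length. */
double
keyspace(unsigned alphabet_size, unsigned length)
{
	double n = 1.0;
	unsigned i;

	for (i = 0; i < length; i++)
		n *= alphabet_size;
	return n;
}

/* Worst-case seconds needed to try every candidate at a given rate. */
double
exhaust_seconds(double candidates, double calls_per_second)
{
	assert(calls_per_second > 0.0);
	return candidates / calls_per_second;
}
```

Under that assumption, keyspace(26, 8) is about 2.1e11 candidates. At 1.0 / 0.036 calls per second, exhausting them takes a couple of centuries; at 1e6 calls per second it takes under three days.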

By 2010 I concluded that md5crypt had definitively come to the end of its serviceable life, but I had no way to effectively communicate that message.

Where, for instance, would I find the bearded hacker who told me at USENIX ATC in 1998 that he had ported md5crypt to COBOL on a mainframe? And who used the JavaScript implementation?

Having plenty of distractions, I did nothing, rationalizing it by grumbling that somebody else ought to write something better and that would be that.

Nothing of the sort happened, and then a couple of days ago, LinkedIn lost 6.5 million unsalted SHA1 scrambled passwords.

And in various places on the web I saw advice of the form “Just use md5crypt and you’re safe” being given by programmers from the “lost generation” who have never heard about Peter G. Neumann or Robert Morris.

As any hunter knows, there comes a day where a real man must take his gun and his old dog for one final long walk – and do what a real man must do.


For md5crypt, that day had finally arrived too.

Thanks for using my code.