20141116 – Building Infrastructure

The last couple of days have been spent writing infrastructure code and some of it annoys me a fair bit.

To take my posterboy example, where the heck is:

int tcp_open(const char *hostname, const char *portname);

Why hasn’t that been a standard function in the UNIX system libraries for the last 25 years?

Hint: Because the UNIX ecosystem is filled with morons who would rather fight with each other, than further software quality.
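For illustration, here is a minimal sketch of what such a function might look like on top of the POSIX getaddrinfo() API. Only the prototype comes from the text above; the body, the "return -1 on any failure" convention and the choice to try every resolved address in turn are assumptions made for this example.

```c
#include <assert.h>
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Sketch only: connect to hostname:portname over TCP.
 * Returns a connected socket, or -1 on any failure (assumed convention).
 */
int
tcp_open(const char *hostname, const char *portname)
{
	struct addrinfo hints, *res, *ai;
	int fd = -1;

	assert(hostname != NULL);
	assert(portname != NULL);

	memset(&hints, 0, sizeof hints);
	hints.ai_family = AF_UNSPEC;		/* IPv4 or IPv6 */
	hints.ai_socktype = SOCK_STREAM;	/* TCP */

	if (getaddrinfo(hostname, portname, &hints, &res) != 0)
		return (-1);

	/* Try each resolved address until one of them connects */
	for (ai = res; ai != NULL; ai = ai->ai_next) {
		fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
		if (fd < 0)
			continue;
		if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0)
			break;
		(void)close(fd);
		fd = -1;
	}
	freeaddrinfo(res);
	return (fd);
}
```

A caller would write something like fd = tcp_open("example.com", "http") and check for -1. A real version would probably want to report why it failed, which this sketch throws away.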

Anyway…

Some infrastructure is application-specific: for instance, anything dealing with the NTP protocol will have to be able to encode and decode NTP packets, which involves both endianness issues (refid) and timestamp handling.

And once you have the timestamps, you need to be able to do basic arithmetic on them etc.

A uniform way to generate diagnostics and debugging messages would also be nice to have.

And a basic “todo-scheduler” which can also be used for simulations.

And code to deal with the inescapable tweakable parameters.

And so on, and so forth.

For the parameters I use table-driven programming, and since few people seem to be aware that you can even do that, I’ll switch to digitus magistrans mode for a moment:

Table driven programming

For instance the NTP protocol has a mode field, which I have tabulated in a series of macro calls in ntp_tbl.h:

#ifdef NTP_MODE
NTP_MODE(0,     mode0,          MODE0)
NTP_MODE(1,     symact,         SYMACT)
NTP_MODE(2,     sympas,         SYMPAS)
NTP_MODE(3,     client,         CLIENT)
NTP_MODE(4,     server,         SERVER)
NTP_MODE(5,     bcast,          BCAST)
NTP_MODE(6,     ctrl,           CTRL)
NTP_MODE(7,     mode7,          MODE7)
#endif

In the ntp.h include file, this table is used to build an enum:

enum ntp_mode {
#define NTP_MODE(n, l, u)       NTP_MODE_##u = n,
#include "ntp_tbl.h"
#undef NTP_MODE
};

At some point I will probably want a function to convert the enum to a string; it would look like this:

const char *
ntpmode2str(enum ntp_mode mode)
{
        switch(mode) {
#define NTP_MODE(n, l, u)   case NTP_MODE_##u: return (#l);
#include "ntp_tbl.h"
#undef NTP_MODE
        }
        WRONG("enum ntp_mode has illegal value");
}

The trick is that instead of defining macros in the .h file and calling them from the .c files as usual, we do the opposite: define the macros in the .c file and use them in the .h file.

This is a great way to avoid typing the same set of data in all over the place.

I originally learned this trick from a book, lcc, A Retargetable Compiler for ANSI C, which I highly recommend.
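If you want to play with the idea in a single file, a common variant keeps the table in one list macro instead of a separate include file. The sketch below uses that variant, which is not the ntp_tbl.h arrangement shown above. The reverse lookup str2ntpmode() and returning NULL or -1 for unknown values are additions made for this example.

```c
#include <assert.h>
#include <string.h>

/* Stand-in for ntp_tbl.h: the whole table as one list macro */
#define NTP_MODE_TABLE(X)		\
	X(0, mode0,  MODE0)		\
	X(1, symact, SYMACT)		\
	X(2, sympas, SYMPAS)		\
	X(3, client, CLIENT)		\
	X(4, server, SERVER)		\
	X(5, bcast,  BCAST)		\
	X(6, ctrl,   CTRL)		\
	X(7, mode7,  MODE7)

/* First use of the table: the enum */
enum ntp_mode {
#define X_ENUM(n, l, u)	NTP_MODE_##u = n,
	NTP_MODE_TABLE(X_ENUM)
#undef X_ENUM
};

/* Second use: enum to string */
static const char *
ntpmode2str(enum ntp_mode mode)
{
	switch (mode) {
#define X_CASE(n, l, u)	case NTP_MODE_##u: return (#l);
	NTP_MODE_TABLE(X_CASE)
#undef X_CASE
	}
	return (NULL);		/* not a value from the table */
}

/* Third use: string to enum value, -1 if the name is unknown */
static int
str2ntpmode(const char *s)
{
	assert(s != NULL);
#define X_CMP(n, l, u)	if (strcmp(s, #l) == 0) return (NTP_MODE_##u);
	NTP_MODE_TABLE(X_CMP)
#undef X_CMP
	return (-1);
}
```

Adding a mode now means adding one line to the table; the enum and both conversion functions follow along without being touched.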

If you want to, you can include all sorts of stuff in the macros, including the documentation.

Here is how a statistics counter in Varnish is defined:

VSC_F(vsm_overflow,             uint64_t, 0, 'g', diag,
    "Overflow VSM space",
        "Number of bytes which does not fit"
        " in the shared memory used to communicate"
        " with tools like varnishstat, varnishlog etc."
)

The tweakable parameters in the code I’m writing now will be defined using a similar table.
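As a rough idea of what a parameter table could look like, here is a sketch where each entry carries a name, a default, limits and a documentation string. Everything in it is hypothetical: the parameter names, the values, the field layout and the params_*() functions are invented for this example and are not the table the actual code will use.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table: name, default, minimum, maximum, documentation */
#define PARAM_TABLE(X)							\
	X(poll_interval, 64.0,  16.0,  1024.0, "Seconds between polls")	\
	X(step_limit,    0.128, 0.001, 10.0,   "Largest offset to slew")

/* Use 1: one struct member per parameter */
struct params {
#define X_FIELD(name, def, min, max, doc)	double name;
	PARAM_TABLE(X_FIELD)
#undef X_FIELD
};

struct param_desc {
	const char	*name;
	size_t		offset;		/* of the member in struct params */
	double		def, min, max;
	const char	*doc;
};

/* Use 2: a descriptor array for lookup by name */
static const struct param_desc param_descs[] = {
#define X_DESC(name, def, min, max, doc)				\
	{ #name, offsetof(struct params, name), def, min, max, doc },
	PARAM_TABLE(X_DESC)
#undef X_DESC
};

/* Use 3: load the defaults */
static void
params_init(struct params *p)
{
	assert(p != NULL);
#define X_INIT(name, def, min, max, doc)	p->name = def;
	PARAM_TABLE(X_INIT)
#undef X_INIT
}

/* Set by name; returns 0 on success, -1 if unknown or out of range */
static int
params_set(struct params *p, const char *name, double val)
{
	size_t i;

	for (i = 0; i < sizeof param_descs / sizeof param_descs[0]; i++) {
		const struct param_desc *d = &param_descs[i];

		if (strcmp(d->name, name) != 0)
			continue;
		if (val < d->min || val > d->max)
			return (-1);
		*(double *)((char *)p + d->offset) = val;
		return (0);
	}
	return (-1);
}
```

The documentation string rides along in the same table entry, so a help command or a generated manual page could be driven from the descriptor array as well.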

Typed structures

Another convention I have adopted is a set of macros to manage and tag structs.

There is a class of mistakes which are very easy to make, particularly if you pass a struct pointer through a void pointer.

Doing that is hard to avoid in practice.

For instance, most event libraries allow you to specify an event as a {function pointer, void *} pair, and almost always that void * gets cast to struct foobar * by the function.

But what if you mess up and the pointer doesn’t point to a struct foobar but rather a struct snafu? You will be amazed how long it can take to spot that.

So pretty much all of my structs have, by convention, a first member called magic, and a corresponding #define for its magic value:

struct todolist {
        unsigned                magic;
#define TODOLIST_MAGIC          0x7db66255
        TAILQ_HEAD(,todo)       todolist;
};

A set of macros uses this to wrap the casts in a basic sanity check:

struct todolist *tdl;

ALLOC_OBJ(tdl, TODOLIST_MAGIC);

CHECK_OBJ_NOT_NULL(tdl, TODOLIST_MAGIC);

CAST_OBJ_NOT_NULL(tdl, ptr, TODOLIST_MAGIC);

and if anything is amiss, asserts stop us right there and then.
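For readers who want to try this, the three macros could be sketched roughly as follows. The macro names and TODOLIST_MAGIC come from the text above; the macro bodies, the simplified struct (no TAILQ, a plain counter instead) and the todo_event() callback are assumptions made for this example and may differ from the real definitions.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocate a zeroed object and stamp it with its magic value */
#define ALLOC_OBJ(to, type_magic)					\
	do {								\
		(to) = calloc(1, sizeof *(to));				\
		if ((to) != NULL)					\
			(to)->magic = (type_magic);			\
	} while (0)

/* Assert that the pointer is non-NULL and carries the expected magic */
#define CHECK_OBJ_NOT_NULL(ptr, type_magic)				\
	do {								\
		assert((ptr) != NULL);					\
		assert((ptr)->magic == (type_magic));			\
	} while (0)

/* Convert a void pointer to a typed pointer, then check it */
#define CAST_OBJ_NOT_NULL(to, from, type_magic)				\
	do {								\
		(to) = (from);						\
		CHECK_OBJ_NOT_NULL((to), (type_magic));			\
	} while (0)

struct todolist {
	unsigned		magic;
#define TODOLIST_MAGIC		0x7db66255
	unsigned		ntodo;	/* hypothetical payload */
};

/* Hypothetical event callback in the usual {function, void *} style */
static void
todo_event(void *priv)
{
	struct todolist *tdl;

	CAST_OBJ_NOT_NULL(tdl, priv, TODOLIST_MAGIC);
	tdl->ntodo++;
}
```

If todo_event() is ever handed a pointer to some other struct, the magic value will almost certainly not match and the assert fires at the cast, instead of the code quietly scribbling on the wrong object.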

Which brings me to…

Asserts

Yes, I love them, and I try to make it so that anybody silly enough to try to “optimize” them out will not do so by mistake.

Three macros I use a lot are:

#define AZ(foo)    do { assert((foo) == 0); } while (0)

#define AN(foo)    do { assert((foo) != 0); } while (0)

#define WRONG(foo) do { assert(0 == (uintptr_t)(foo)); } while (0)

Basically “Assert Zero”, “Assert non-zero” and “Simply shouldn’t happen”.

The WRONG macro takes a string explanation as its argument, as you saw in the example earlier.
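One thing to watch with the standard assert() is that compiling with NDEBUG makes it vanish, along with any side effects inside it. A possible way to keep the checks from being compiled out by accident is to build the macros on a check that ignores NDEBUG. The sketch below does that; xassert() and copy_name() are hypothetical names invented here, and this is not necessarily how the real code arranges it.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical always-on assert: NDEBUG has no effect on it */
#define xassert(e)							\
	do {								\
		if (!(e)) {						\
			fprintf(stderr, "Assert failed: %s (%s:%d)\n",	\
			    #e, __FILE__, __LINE__);			\
			abort();					\
		}							\
	} while (0)

#define AZ(foo)	do { xassert((foo) == 0); } while (0)
#define AN(foo)	do { xassert((foo) != 0); } while (0)

/* Copy src into dst (capacity len); any misuse aborts on the spot */
static size_t
copy_name(char *dst, size_t len, const char *src)
{
	AN(dst);
	AN(src);
	AN(len > strlen(src));		/* room for the terminating NUL */
	memcpy(dst, src, strlen(src) + 1);
	AZ(strcmp(dst, src));
	return (strlen(dst));
}
```

Because the condition is always evaluated, something like AZ(close(fd)) keeps closing the file descriptor no matter which flags the build was given.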

And once you have infrastructure like this in place, the actual programming becomes a lot less tedious, repetitive and error-prone.

We’re getting close now…

phk