20141129 – Details, details, details

I’m getting into the “I’ll sort that out later” details, and one of these is “when do we send packets ?”

In the steady state we have a poll period, say 64 or 256 seconds, and we can spread the servers evenly across it:

while (1)
        poll(next_server)
        delay (poll_period / number_of_servers)

This decision optimizes for regularly spaced time measurements, which improves the responsiveness of both the filters and the PLL.
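As a rough illustration of the steady-state spreading, here is a minimal Python sketch that only computes when each server would be polled instead of actually sleeping. The function name `steady_state_schedule` and the server labels are made up for the example.

```python
def steady_state_schedule(servers, poll_period, n_polls):
    """Return (time, server) pairs for an evenly spread round-robin."""
    step = poll_period / len(servers)   # gap between consecutive packets
    return [(i * step, servers[i % len(servers)]) for i in range(n_polls)]

schedule = steady_state_schedule(["a", "b", "c", "d"], 64, 8)
print(schedule[:5])
# [(0.0, 'a'), (16.0, 'b'), (32.0, 'c'), (48.0, 'd'), (64.0, 'a')]
```

With four servers and a 64 second poll period, a packet goes out every 16 seconds and each individual server is still polled once per 64 seconds.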

But what about startup ?

If we only poll each server every 64 seconds, it is going to take quite a while to get the PLL to settle down, and we’d actually like to be locked and settled before the first 64 seconds have elapsed.

So during startup, we send more packets, and if we have only one server, the best way to ramp up is to start with a short poll interval and double it after each packet until it reaches the poll period:

dt_poll = 1
while (1)
        poll(server)
        if dt_poll < poll_period
                dt_poll *= 2
        sleep(dt_poll)

If we make sure the poll_period is always a power of two, that works nicely.
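To see why the power-of-two restriction matters, the doubling loop can be traced in Python. This is a sketch that follows the pseudocode above literally and records the sleep intervals instead of sleeping; `ramp_intervals` is a name invented for the example.

```python
def ramp_intervals(poll_period, count):
    """Sleep intervals (seconds) produced by the doubling loop."""
    dt_poll = 1
    intervals = []
    for _ in range(count):
        # poll(server) would happen here
        if dt_poll < poll_period:
            dt_poll *= 2
        intervals.append(dt_poll)
    return intervals

print(ramp_intervals(64, 8))  # [2, 4, 8, 16, 32, 64, 64, 64]
print(ramp_intervals(48, 8))  # [2, 4, 8, 16, 32, 64, 64, 64]
```

With a poll period of 64 the interval lands exactly on 64 and stays there. With 48 it doubles from 32 straight past the target and settles on 64, so the configured poll period is never actually used.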

But how exactly does that look if we have more than one server ?

The first server we want to poll as above, at 1, 2, 4, 8,… seconds.

If we have a second server, we could slot it in the middle between those packets: 1.5, 3, 6,… and so forth for the third server etc.
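One way to read the slotting idea is that server k of N is placed a fraction k/N of the way through each doubling step. That linear placement is an assumption made for this sketch, as is the function name `slotted_times`; the text only gives the two-server case.

```python
def slotted_times(number_of_servers, doublings):
    """Return (time, server_index) pairs for the slotted startup polls."""
    times = []
    for j in range(doublings):
        base = 2 ** j                       # 1, 2, 4, 8, ...
        for k in range(number_of_servers):
            # assumed: server k sits k/N of the way to the next doubling
            times.append((base * (1 + k / number_of_servers), k))
    return times

print(slotted_times(2, 3))
# [(1.0, 0), (1.5, 1), (2.0, 0), (3.0, 1), (4.0, 0), (6.0, 1)]
```

The gaps between consecutive packets come out as 0.5, 0.5, 1, 1, 2: they grow in steps rather than smoothly.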

Rather than hack this up, I decided to just do it the right way by breaking out the log(3) and exp(3) functions:

startup_period = 64
startup_packets = 6
time = 1        # start at 1: multiplying 0 by d would never advance
while (1)
        poll(next_server)
        dt_poll = poll_period / number_of_servers
        if time < startup_period
                # d is the ratio between consecutive packet times
                d = exp(log(startup_period) /
                    (startup_packets * number_of_servers))
                if time * d < startup_period
                        dt_poll = time * d - time
        sleep(dt_poll)
        time += dt_poll

This distributes startup_packets * number_of_servers packets exponentially over the startup_period and then switches to a constant interval of poll_period / number_of_servers between packets:
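The schedule can be checked without waiting for real time to pass. The following Python sketch records the poll times instead of sleeping. It assumes the clock starts at 1 second, so that multiplying by d actually advances it; packet k then goes out at d^k, with d = startup_period^(1 / (startup_packets * number_of_servers)). The function name `startup_schedule` and the `n_polls` parameter are invented for the example.

```python
import math

def startup_schedule(number_of_servers, poll_period=64,
                     startup_period=64, startup_packets=6, n_polls=20):
    """Return the times (seconds) at which packets would be sent."""
    time = 1.0      # assumption: start at 1 s, not 0
    times = []
    for _ in range(n_polls):
        times.append(time)                  # poll(next_server) goes here
        dt_poll = poll_period / number_of_servers
        if time < startup_period:
            # ratio between consecutive packet times during startup
            d = math.exp(math.log(startup_period) /
                         (startup_packets * number_of_servers))
            if time * d < startup_period:
                dt_poll = time * d - time
        time += dt_poll
    return times

print([round(t, 3) for t in startup_schedule(1)[:6]])
# [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
```

With one server, d is 2 and the familiar doubling falls out. With two servers, d is the square root of 2, so the second server lands at the geometric midpoint between the first server's packets.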

../../_images/20141129_fig1.png

And with a logarithmic Y-axis:

../../_images/20141129_fig2.png

The nice thing about doing this right is that it usually takes fewer lines of code and gives more flexible code. In this case we can change the controlling variables on the fly, and it will still do the right thing.

That is not a far-fetched scenario: it is quite common for one or more of the configured servers to not reply at all, in which case we can now eliminate it from the list and still have the right thing happen.

Here I deleted two of the four servers after the 15th packet:

../../_images/20141129_fig3.png
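A similar experiment can be sketched in Python by letting the server count depend on how many packets have been sent. As before, this records times instead of sleeping and assumes the clock starts at 1 second; `schedule_with_changes` and its `servers_after` callback are hypothetical names for illustration, not part of the actual code.

```python
import math

def schedule_with_changes(servers_after, poll_period=64,
                          startup_period=64, startup_packets=6, n_polls=30):
    """servers_after(n_sent) gives the server count once n_sent packets are out."""
    time = 1.0      # assumption: start at 1 s, not 0
    times = []
    for _ in range(n_polls):
        times.append(time)
        n = servers_after(len(times))       # re-read the count on every pass
        dt_poll = poll_period / n
        if time < startup_period:
            d = math.exp(math.log(startup_period) / (startup_packets * n))
            if time * d < startup_period:
                dt_poll = time * d - time
        time += dt_poll
    return times

# Four servers until the 15th packet has been sent, two afterwards.
times = schedule_with_changes(lambda sent: 4 if sent < 15 else 2)
print(round(times[14] / times[13], 3), round(times[15] / times[14], 3))
# 1.189 1.414
```

Because d is recomputed on every pass, the step ratio simply changes from 64^(1/24) to 64^(1/12) when the server count drops, and the steady-state interval becomes poll_period / 2.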

Sometimes 20 lines of code can take an entire day.

… and be totally worth it.

phk