20150111 – What happened next?
Happy New Year!
Thanks for all the feedback, it is rather humbling to see the amount of buzz generated by the announcement of the Ntimed project.
As in Varnish, I will try to keep the GitHub queues of issues and pull requests clean, both by fixing the trouble and by admitting when I cannot.
This is a lesson I learned in FreeBSD, where our failure to manage the bug-tracker caused it to become a swamp of unactionable tickets which in the end nobody could even stomach looking at.
So if you open a ticket, and I decide I cannot do anything about it, I will close it with a polite message. Feel free to reopen if you have further information that should change my mind, but understand that in the end my decision is final.
That said: Keep ‘em coming...
The 2015-06-30 leapsecond was an unwelcome New Year's present; I had hoped to have more time before my code had to face that hack.
I have received a fair number of questions asking if I want to implement the “leap-smear” which Google have pioneered.
The answer is “maybe”, but if I do, it will not be in the Ntimed-client program but in the Ntimed-slave and Ntimed-master sequels, and thus it will not be ready for the 2015-06-30 leapsecond.
The reason I am not going to do it in Ntimed-client is that I think it is the wrong place to do it: only in an installation where all machines, including the embedded devices, run Ntimed-client would you get consistent timekeeping.
By contrast, doing it on the site-wide (slave) server(s), and blocking NTP from clients in the firewall, sets up a closed “time-system” where hacks like the leap-smear can be successfully employed, no matter which NTP client software your lawn-mower robots use.
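For readers who have not met the idea, a leap-smear spreads the extra second over a window of time instead of inserting it all at once. The sketch below is only an illustration and not anything Ntimed implements: the function name `smear_offset`, the 24-hour default window and the linear ramp are all assumptions, and a linear ramp is not necessarily the shape Google uses.

```python
def smear_offset(t, leap_time, window=86400.0):
    """Return how much of an inserted leap second has been absorbed at time t.

    t and leap_time are seconds on the same unsmeared timescale, and the
    smear window ends at leap_time. The result runs from 0.0 (before the
    window) to 1.0 (at and after the leap). Hypothetical helper, linear
    ramp assumed for simplicity.
    """
    start = leap_time - window
    if t <= start:
        return 0.0          # smear has not begun
    if t >= leap_time:
        return 1.0          # the whole second has been absorbed
    return (t - start) / window


# Halfway through a 100 second window, half the second is absorbed.
print(smear_offset(50.0, leap_time=100.0, window=100.0))
```

A smearing server would hand out time that lags by this amount, so that at the end of the window it agrees with UTC after the leap. That only works if every client in the closed “time-system” gets its time from such a server.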
Some of the first reports and patches which came in were about portability and I have tried to handle these to the best of my ability.
Ntimed-client puts the entire interface to the OS timekeeping in four trivial functions for portability, but there are other nits and downright idiotic incompatibilities, because, quite frankly, the UNIX ecosystem is filled with narrowsighted bigots.
At the timekeeping level, Windows and OS X are the odd ones out, and both of these will need a dedicated set of the four functions. I hope somebody with skills and access to those platforms will contribute them.
As for the pointless incompatibilities between modern UNIXen, I’ll try to navigate them on a least-resistance path, but I’m going to be very sarcastic about it in source-code comments.
One of the realities of modern systems is that they live in discontinuous time. A laptop may suspend in Denmark and resume two days later in New Zealand. A server may be suspended on one cluster of VM servers and resume instantly or a month later on another.
(This is not a new phenomenon: somebody once powered an old PDP/11 back up after it had been in storage for years, only to find it sending out the emails in /var/spool/mqueue as soon as he connected it to the net.)
Ideally the UNIX architects would assign “SIGWARP” as a kernel-sent signal to all processes (default: SIG_IGN) in such circumstances, but given that there are no UNIX architects and, even if there were, they could not agree what color fire should be, that ain’t gonna happen.
So I made the decision that the Ntimed daemons will interpret SIGHUP as “something may have changed, quite possibly time itself” and do processing consistent with that.
For Ntimed-client, that amounts to testing what servers we can reach, figuring out what time it is, possibly stepping the clock, and then letting the filters, combiners and the PLL do their job – just like on startup.
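The usual shape of such a handler can be sketched as follows. This is a minimal illustration in Python, not Ntimed's actual code: the `Daemon` class, its method names and the `resync` placeholder are all hypothetical. The signal handler only records that a SIGHUP arrived, and the main loop does the real work the next time it polls.

```python
import signal


class Daemon:
    """Hypothetical daemon skeleton; the names are illustrative only."""

    def __init__(self):
        self.resync_requested = False
        self.resyncs = 0

    def on_sighup(self, signum, frame):
        # Do as little as possible in the handler: just note the request.
        self.resync_requested = True

    def poll(self):
        # Called from the main loop between measurements.
        if self.resync_requested:
            self.resync_requested = False
            self.resync()

    def resync(self):
        # Placeholder for the startup-like sequence: test which servers are
        # reachable, work out the time, possibly step the clock, and restart
        # the filters, combiners and PLL.
        self.resyncs += 1


def install(daemon):
    # Must be called from the main thread on a UNIX-like system.
    signal.signal(signal.SIGHUP, daemon.on_sighup)
```

With this in place, a suspend/resume script or a VM migration hook only needs to send SIGHUP to the daemon to tell it that time may have moved.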
The next thing on my list is the server-management code in Ntimed-client. There are multiple tricky issues to deal with here, most notably when to stop using a particular server in the *.pool.ntp.org facility, but also detection of duplicates: servers which appear more than once in the list of resolved DNS names, either because they are dual-homed or because of the combination of arguments provided.
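The easy half of the duplicate problem, the same address turning up under several names or arguments, can be sketched like this. It is an illustration under assumptions, not Ntimed's code: the function name `unique_servers` and the `(name, address)` input format are made up for the example. Dual-homed servers, which answer on different addresses, are not caught by this and would need some comparison of what the servers themselves report.

```python
import ipaddress


def unique_servers(resolved):
    """Keep the first (name, address) pair seen for each distinct address.

    resolved is an iterable of (hostname, address_string) pairs, for
    example collected from DNS lookups of every command-line argument.
    Hypothetical helper for illustration.
    """
    seen = set()
    unique = []
    for name, addr in resolved:
        ip = ipaddress.ip_address(addr)
        # Treat an IPv4-mapped IPv6 address as the IPv4 address it wraps.
        if ip.version == 6 and ip.ipv4_mapped is not None:
            ip = ip.ipv4_mapped
        if ip in seen:
            continue
        seen.add(ip)
        unique.append((name, str(ip)))
    return unique


# Addresses are from the documentation ranges, names are placeholders.
servers = unique_servers([
    ("a.example", "192.0.2.1"),
    ("b.example", "192.0.2.1"),
    ("c.example", "::ffff:192.0.2.1"),
    ("d.example", "2001:db8::1"),
])
print(servers)
```

Parsing the addresses rather than comparing strings means that different spellings of the same address collapse to a single entry.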