CTM – 24½ years and counting

Now that FreeBSD 12.0 has been branched, it has been proposed to move the ctm(1) program from src to ports.

Chances are you have never heard of the ctm(1) program, despite it being the longest-running continuous distribution of the FreeBSD source code: 24½ years and counting.

The story starts in late 1993, when I had had enough of my job at FLS Data and, as a very undesirable side effect, lost access to the blazing fast 14.4 “SLIP” connection to the internet, which I had persuaded FLS to acquire.

FreeBSD used CVS as version control, and committers used “supfile”, a sort of early forerunner of rsync, to mirror it locally. Without TCP/IP connectivity that was out of my reach.

I still had email and a shell account with internet access through the Danish Unix user group, DKUUG, but no direct TCP/IP connectivity.

For a short time, I manually tracked the commit log, mailed the changed files from the CVS tree to myself, and downloaded them from the shell account with KERMIT, but that got old fast.

I automated it by keeping a “reference” copy on freefall.freebsd.org that was identical to my own CVS tree, running diff -r between that copy and the official tree, emailing a compressed and uuencoded copy of the patch to myself, patch(1)’ing my CVS tree (!) and, once that had succeeded, also patching the “reference” tree on freefall.
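As a rough modern illustration of that pipeline (not the original scripts), here is a minimal Python sketch of the "diff, compress, uuencode" step and its reverse. The function names and the sample file contents are made up for the example, and the uuencoded text omits the usual begin/end framing lines.

```python
import binascii
import difflib
import gzip


def make_patch(old_text, new_text, path):
    """Return a unified diff (as bytes) that turns old_text into new_text."""
    diff = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile=path,
        tofile=path,
    )
    return "".join(diff).encode("utf-8")


def uuencode(data):
    """uuencode bytes, 45 input bytes per line, without begin/end framing."""
    return b"".join(
        binascii.b2a_uu(data[i:i + 45]) for i in range(0, len(data), 45)
    )


def uudecode(text):
    """Reverse uuencode(): decode each line and join the results."""
    return b"".join(
        binascii.a2b_uu(line) for line in text.splitlines(keepends=True)
    )


# Sender side: diff the reference tree against the official tree,
# then compress and encode the patch so it survives email transport.
patch = make_patch("int a;\nint b;\n", "int a;\nint c;\n", "src/x.c")
wire = uuencode(gzip.compress(patch, 9))

# Receiver side: decode and decompress before handing the result to patch(1).
assert gzip.decompress(uudecode(wire)) == patch
```

Only after the receiving tree has been patched successfully would the reference copy be updated, so that both sides stay identical.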

Unfortunately these events took place before the mishap which deleted our early mail archives (Ask Jordan about that one!) so I have not been able to establish exactly when I mentioned this stuff to other contributors.

In my archive I do have a file with the names of people who expressed interest (Jim Babb, Piero Serini, Steve Ratliff, Mikael Hybsch and David Dawes) when I sent out this README file:

Having read the comments I have received, it seems that the interest for an
automagic email service is far larger than I thought.  I have revise the
scheme I proposed originally, to avoid doing the same work twice.

The new scheme looks like this:

A delta is contained in one file, which has a patch, and an informative
header, which contains instructions if any files are to be removed.
This file will be machine-readable, such that a script can receive the
file by email and extract it.

The deltas will get a serial number, staring with 001 from a FreeBSD release.

A certain size limit will be put on the uncompressed size of a delta,
to avoid stressing sendmail's, and only a limited number of deltas will be
created in any 24h.

Once per day, an index-file over /usr/src is created, and as many deltas
as allowed will be created from the present state of the "remote" tree
to /usr/src as the quota allows.  If the remote tree then is in sync with
/usr/src, this will be noted.  These deltas will be mailed to a
majordomo-list, and put up for ftp somewhere.

Once per week, the deltas on disk which are older that one week will be
removed along with the last "super-delta", and a new "super-delta"
spanning all changes from the base-distribution until the most recent time
that the "remote" tree was in sync will be created.  In summary the
maximum diskspace allocated will be one super-delta, and two weeks worth
of daily deltas.

At any one time, to join the system, you will need:
        The base-release (eg 1.1)
        The most recent super-delta.
        Deltas from the super-delta till today,

It seems reasonable to expect that gzip -9 will reduce the size of a
patch to one third it's size, after uuencoding this is increased to
around 40% of the uncompressed size.  Considering this, a size of
100K per delta seems to me to be well chosen.  And making the upper
limit of 20 deltas per day (2M uncompressed = 80000 mail volume),
will allow us to send out a complete new gcc in five days.  I am quite
sure the normal daily volume will be significantly less than this.

The advantage to this procedure, over the direct one (make a daily diff,
chop it up, and stick it in a queue for transmission) is that with a little
luck some changes will be collapsed, and thus bandwidth saved.  Most often
a piece of code which is changed is likely to change again rapidly and some
of these changes will be factored out, while the remote tree is out of
sync.
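The size budget in the README can be checked with a few lines of arithmetic. This is only a sketch using the README's own round figures (100K per delta, 20 deltas per day, about 40% of the original size after gzip -9 and uuencode); the variable names are made up for the example.

```python
# Figures taken from the README; integer arithmetic keeps the results exact.
DELTA_LIMIT = 100_000      # bytes of uncompressed patch per delta ("100K")
DELTAS_PER_DAY = 20        # upper limit of deltas per day
MAIL_PERCENT = 40          # gzip -9 then uuencode: "around 40%" of the original

daily_uncompressed = DELTA_LIMIT * DELTAS_PER_DAY        # 2,000,000 bytes
daily_mail = daily_uncompressed * MAIL_PERCENT // 100    # 800,000 bytes
five_days_uncompressed = 5 * daily_uncompressed          # 10,000,000 bytes

print(daily_uncompressed, daily_mail, five_days_uncompressed)
```

By this arithmetic the daily mail volume comes to roughly 800K, so the "80000" in the README may simply be missing a zero, and the five-day budget for "a complete new gcc" corresponds to about 10M of uncompressed patch.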

FreeBSD 1.1 was not released until May 1994, but all of this happened well before March 1994, when I moved to the USA to work at TRW Financial Systems Inc (tfs.com).

One of the first things I did upon arrival was to give RCVS the boot and redo it from scratch under the name CTM, as a service to the many “back home in Europe” who were still behind UUCP email, as I would be again once I returned home from a year of “training”.

Contrary to the later manual page, CTM did not mean “Cvs Through eMail” or “Current Through eMail”, but rather “Cvs Tree Mirror”, as can be seen in this initial announcement:

From freefall.cdrom.com!freebsd-current-owner Mon Apr 11 17:00:25 1994
Message-Id: <m0pqVKi-0003vwC@TFS.COM>
From: phk@tfs.com (Poul Kamp)
Subject: CTM: HowTo track -current via eMail or FTP
To: freebsd-hackers@freefall.cdrom.com, freebsd-current@freefall.cdrom.com
Date: Mon, 11 Apr 1994 16:23:59 -0700 (PDT)

# ----------------------------------------------------------------------------
# "THE BEER-WARE LICENSE" (Revision 42):
# <phk@login.dkuug.dk> wrote this file.  As long as you retain this notice you
# can do whatever you want with this stuff. If we meet some day, and you think
# this stuff is worth it, you can buy me a beer in return.   Poul-Henning Kamp
# ----------------------------------------------------------------------------

Hello, Welcome to the world of CTM.  CTM means "Cvs Tree Mirror", and is
intended to fill the gap left for the IP-needy in the FreeBSD-hackers group.
The intent is to allow people without the ways and means of a TCP/IP internet
connection to track the '-current' in an efficient manner.

Presently this is in a state of "feasibility test".  If this turns out to be
significantly better than sex (or at least handy & efficient), it will be
blessed and made official, but for now: purely "as is, no guarantees given".

Please contact Poul-Henning Kamp, phk@login.dkuug.dk about this stuff,
he's the only one who knows what goes on just now.  Comments, suggestions
and ideas are much welcome.

How does it work ?
------------------
CTM works in a very simple way:  A program scans the CVS-master tree, and
makes a file with the changes it finds.  The resulting file can be distributed
via eMail and FTP.

Presently CTM runs on machine ref.tfs.com.  You can subscribe to the
automatic mailings of patches by sending an email to majordomo@ref.tfs.com.
The patches are also available via anonymous FTP from ref.tfs.com or via
majordomo's 'get' command.  Every now and then a snapshot will be taken,
to make entry easier.  Snapshots are only available via FTP, it would take
several hundred mails to send an entire snapshot.  Send me mail if you need
to be able to get snapshots via email.

A patch file is named: pch:<list>:<nbr>
A snapshot is named:   tar:<list>:<nbr>
        <list> is replaced by the name of the CTM-stream.
        <nbr> is a 3 digit serial number.
If you pickup snapshot tar:foo:034 then the first patch to apply on top of this
is pch:foo:035.

When you receive CTM-patches via eMail, the subject will say
        "Subject: CTM pch:<list>:<nbr>"
and the body of the mail will contain a uuencoded version of the "gzip -9"-ed
CTM-patch-file.

Please read the description below carefully, BEFORE you apply any patches.

For now only one list is set up: 'src-cur' it will generate patches twice every
day, and limit them to 100Kbyte in uuencoded format if at all possible.

summary: to join the CTM club you have to:
        send a subscribe mail to majordomo@ref.tfs.com
                (echo subscribe ctm-src-cur | mail majordomo@ref.tfs.com)
        get a recent snapshot and patches since then, until you have an
        unbroken sequence going.

AND IF ANY OF THIS IS NOT 100% CLEAR TO YOU, DON'T DO IT !  This is not a
fool-proof, point-and-click, user-friendly windows application.

Many thanks to Piero and S\o"ren for suffering the first trials of this.

Poul-Henning
phk@login.dkuug.dk

Format:
-------
A CTM-patch consist of some magic lines, all matching '^MAGIC_' and a
number of 'patch -u' patches.  Here is a an explanation of the information
provided:

    MAGIC_HDR src-cur
        src-cur is the name of the distribution, this is a purely
        administative choice "Source Current".  This is used
        in creating the name of the CTM-patch file: "pch:src-cur:001"
        The 001 is the sequence, see below.
    MAGIC_SEQ 1
        This is the serial number of the CTM-patch, they shall of
        course be applied in sequence to the tree.
    MAGIC_DATE 199302250001Z
        UTC timestamp of when this CTM-patch was made.
    MAGIC_PREFIX 386BSD/src
        This is the common prefix of all files in this CTM-patch.
        this can be used to find the correct '-p#' argument to patch.
        In this case -p2 will be correct if you are in '/usr/src-current'
    MAGIC_COMMENT FOO
        Zero or more of these lines can provice information from the
        master to the slave end.  This info should be presented to the
        owner of the slave.

Then comes the meaty part, any number of the next three, in any kind of
order can be present:

    MAGIC_NEW this/file/is/new
        This file is to be added, a patch is included, which will do that.
    MAGIC_CHG - this/file/was/modified
        This file is to be modified, a patch is included, which will do that.
        It is the intention to replace the '-' with an MD5 or similar check-
        sum, which the file must match.  This will be an additional safeguard.
    MAGIC_DEL - this/file/went/away
        This file should be removed.  The patch will NOT do this.
        For the '-' see above.

    MAGIC_SYNCHED yes
        This tells that there is no outstanding changes, and that after
        application, the source-tree is identical to the master-tree.
        The other possiblity is of course 'no', which means that the
        size-limit was reached, yet more changes are to be made.  The
        next CTM-patch will continue, where this one ends.
    MAGIC_PATCH BEGIN
        Marks the beginning of the patches.  These will look like:
            diff -u 386BSD/src/bin/df/df.c:1.3 386BSD/src/bin/df/df.c:1.4
            --- 386BSD/src/bin/df/df.c:1.3  Thu Feb 24 14:55:52 1994
            +++ 386BSD/src/bin/df/df.c      Thu Feb 24 14:55:52 1994
            @@ -199,9 +199,9 @@
                    used = sfsp->f_blocks - sfsp->f_bfree;
                    availblks = sfsp->f_bavail + used;
                    (void)printf(" %*ld %7ld %7ld", headerlen,
            -           sfsp->f_blocks * sfsp->f_fsize / blocksize,
            ...
    MAGIC_PATCH END
        Marks the end of the file, if this isn't here, you have lost
        part of the CTM-patch-file.

Manual Application:
-------------------
zcat CTM_PATCH_FILE.gz | grep MAGIC_DEL
    # Delete by hand the files listed.  it is important to do this *first*
    # because CTM will use a CTM_DEL+CTM_NEW sequence if that results in
    # a smaller patch size.
zcat CTM_PATCH_FILE.gz | patch -p2 -d /usr/src-current

Automatic Application:
----------------------
It would be nice if someone wrote a program to use all the available
information and do it all per automagic.  Please contact me if you
would like to tackle this relatively easy task.

Poul-Henning Kamp
phk@login.dkuug.dk
-EOF- nothing lost.
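The "relatively easy task" of reading the magic lines can be sketched in a few lines of Python. This is an illustration based only on the format described in the announcement above, not a reconstruction of ctmp.sh or of the later ctm(1) program; the function names and the shortened sample patch are made up for the example.

```python
def parse_ctm_patch(text):
    """Collect the MAGIC_ lines of an early-format CTM-patch (illustrative)."""
    info = {
        "comments": [],    # MAGIC_COMMENT lines, to show to the tree's owner
        "new": [],         # paths from MAGIC_NEW
        "changed": [],     # paths from MAGIC_CHG
        "deleted": [],     # paths from MAGIC_DEL; the patch will NOT remove these
        "patch": [],       # lines between MAGIC_PATCH BEGIN and MAGIC_PATCH END
        "complete": False, # True only if MAGIC_PATCH END was seen
    }
    in_patch = False
    for line in text.splitlines():
        if not line.startswith("MAGIC_"):
            if in_patch:
                info["patch"].append(line)
            continue
        key, _, value = line.partition(" ")
        value = value.strip()
        if key == "MAGIC_HDR":
            info["name"] = value
        elif key == "MAGIC_SEQ":
            info["seq"] = int(value)
        elif key == "MAGIC_DATE":
            info["date"] = value
        elif key == "MAGIC_PREFIX":
            info["prefix"] = value
        elif key == "MAGIC_COMMENT":
            info["comments"].append(value)
        elif key == "MAGIC_NEW":
            info["new"].append(value)
        elif key in ("MAGIC_CHG", "MAGIC_DEL"):
            # The first field is the checksum placeholder ('-'), then the path.
            _checksum, path = value.split(None, 1)
            info["changed" if key == "MAGIC_CHG" else "deleted"].append(path)
        elif key == "MAGIC_SYNCHED":
            info["synched"] = (value == "yes")
        elif key == "MAGIC_PATCH":
            in_patch = (value == "BEGIN")
            if value == "END":
                info["complete"] = True
    return info


def patch_file_name(info):
    """Build the 'pch:<list>:<nbr>' name with a 3 digit serial number."""
    return "pch:%s:%03d" % (info["name"], info["seq"])


SAMPLE = """\
MAGIC_HDR src-cur
MAGIC_SEQ 1
MAGIC_DATE 199302250001Z
MAGIC_PREFIX 386BSD/src
MAGIC_NEW this/file/is/new
MAGIC_CHG - this/file/was/modified
MAGIC_DEL - this/file/went/away
MAGIC_SYNCHED yes
MAGIC_PATCH BEGIN
--- 386BSD/src/bin/df/df.c:1.3
+++ 386BSD/src/bin/df/df.c
MAGIC_PATCH END
"""

info = parse_ctm_patch(SAMPLE)
assert info["complete"], "MAGIC_PATCH END missing: part of the file was lost"
```

A wrapper around this would first delete the files listed in info["deleted"] and only then feed the patch lines to patch(1), in the order the manual instructions give.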

Back then the CVS tree was not open to the public, so two separate CTM streams were maintained: “src”, which gave you the source tree, and “cvs”, which gave what it said on the tin, but only for committers.

Two days later, Piero replied:

From: piero@strider.st.dsi.unimi.it (Piero Serini)
Date: Wed, 13 Apr 1994 17:48:19 +0200 (MET DST)
Message-Id: <199404131548.RAA02469@strider.st.dsi.unimi.it>

Quoting from Poul Kamp (Tue Apr 12 01:23:41 1994):
> Automatic Application:
> ----------------------
> It would be nice if someone wrote a program to use all the available
> information and do it all per automagic.  Please contact me if you
> would like to tackle this relatively easy task.

ctmp.sh is available from me via mail request. I'm mailing it to PHK
so maybe he will put it somewhere in the distrib.
[...]

And with that CTM exploded into the badly connected parts of the world, and I soon lost count of the many people who thanked me profusely for making it possible for them to keep up with FreeBSD development.

After we got FreeBSD 2.0 out, CTM was announced as an official project service:

From: Poul-Henning Kamp <phk@ref.tfs.com>
Date: Mon Feb 27 00:16:22 PST 1995

CTM is now finally in a state where I will publicly announce it, so here goes.

This is the ftp://freefall.cdrom.com/pub/CTM/README file:

# ----------------------------------------------------------------------------
# "THE BEER-WARE LICENSE" (Revision 42):
# <phk at login.dknet.dk> wrote this file.  As long as you retain this notice you
# can do whatever you want with this stuff. If we meet some day, and you think
# this stuff is worth it, you can buy me a beer in return.   Poul-Henning Kamp
# ----------------------------------------------------------------------------
#
# Mon Feb 27 00:06:22 PST 1995
#

  Obtaining FreeBSD-current sources using CTM.
  ============================================

CTM is a method to keep a remote directory-tree in sync with a central one.
It has been developed for FreeBSD usage, but other people might use it as
time goes by, but little if any documentations exists on this time on the
process of creating deltas.


Why should I use CTM ?
----------------------
CTM will give you a local copy of the "FreeBSD-current" sources.
If you are an active developer on FreeBSD, but have lousy or non-existent
TCP/IP connectivity, CTM is made for you.
You will need to pick up up to four deltas per day (or you can have them
arrive in email automatically) and sizes are as small as we can do it:
typically less than 5K, one delta in ten is like 10-50K and every now and
then a biggie of 100K+ comes around.

You need to make yourself aware of the caveats of following the "current"
sources, refer to the relevant FAQ for more info on that topic.

Only if you have commit priviledge, or are similary authorized, can you get
access to the cvs tree by the same means.  Contact phk at FreeBSD.org for that.

What do I need to use CTM ?
---------------------------
You need two things.  The "ctm" program and the stuff to feed it.  "ctm" is
in the FreeBSD-current tree from version 2.0.0 and forward. (src/usr.sbin/ctm)

The "deltas" you feed ctm can be had two ways, ftp or email.


FTP-access:
-----------
The CTM-deltas can be found on the following sites:

 ftp://freefall.cdrom.com/pub/CTM


eMail-access:
-------------
Send email to majordomo at freebsd.org, subscribe to the list "ctm-src-cur".
Use the ctm_rmail program to unpack and apply the emails with.  You can
actually use the ctm_rmail program directly from a entry in /etc/aliases
if you want.  Check the "ctm_rmail" man page.


How to get started.
-------------------
You need to get up to speed.  Every now and then I will produce a special
additional delta: a delta from nothing.  You can recognize these in two
ways, the are large: 25 to 30 Megabytes gzip'ed, and they have an 'A'
appended to the number. (src-cur.0341A.gz for instance).  You will also
need all deltas with higher numbers.

Now working...
--------------
To apply the deltas, simply say

 cd /where/ever/you/want/the/stuff
 ctm -v -v /where/you/store/your/deltas/src-cur.*

Unless it feels very secure about the entire thing, ctm will not touch
your tree.  To check out a delta you can add a "-c", then ctm will never
touch you tree.

There are other options to ctm as well, look in the sources.  It's a
little bit confusing right now, but it will become better I hope.

I would be very happy if somebody will help with the "user-interface"
part, as I have realized that I can't make up my mind on what options
should do what, how and when...

ctm understands deltas which have been put through gzip, so you don't need
to gunzip them first.

That's really all there is to it.  Everytime you get a new delta, you
run it through ctm.

Don't remove the deltas, if they are hard to download again.  You just might
want to keep them around in case something bad happens.  Even if you only have
floppy disks, consider using "fdwrite" to make a copy.


Plans:
------
Tons of them.  Don't forget to tell me what you want though...


Misc. stuff:
------------
If you are a frequent or valuable contributor to FreeBSD, I will be willing
to arrange special services, one option is delivery via ftp or rcp to a
machine closer to you.  You need to have earned this, since it takes time
to do, but I'll be all the more happy to do it for you then.

Thanks!
-------
Bruce Evans, for his pointed pen and invaluable comments.
Soren Schmidt, for patience.
Stephen McKay, wrote ctm_[rs]mail, much appreceiated.
Jordan Hubbard, for being so stubborn that I had to make it better.
All the users,  I hope you like it...

Comments ?
----------
email phk at FreeBSD.org

Poul-Henning

--
Poul-Henning Kamp <phk at login.dknet.dk>
TRW Financial Systems, Inc.
I am Pentium Of Borg. Division is Futile. You WILL be approximated.
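The "getting started" rule in that README (take a delta from nothing, marked with an 'A', plus all deltas with higher numbers) can be sketched as a small check for an unbroken sequence. This is an illustration only: the name src-cur.0341A.gz comes from the README, but the assumption that ordinary deltas are named like src-cur.0342.gz, and the function name itself, are mine.

```python
import re

# Assumed naming: <stream>.<number>[A].gz, e.g. src-cur.0341A.gz
DELTA_RE = re.compile(r"^(?P<name>.+)\.(?P<num>\d+)(?P<base>A?)\.gz$")


def plan_deltas(filenames):
    """Return (base_file, later_files, missing_numbers) for a set of deltas."""
    parsed = []
    for fn in filenames:
        m = DELTA_RE.match(fn)
        if m:
            parsed.append((int(m.group("num")), m.group("base") == "A", fn))

    bases = [p for p in parsed if p[1]]
    if not bases:
        return None, [], []

    # Start from the most recent "delta from nothing".
    base_num, _, base_file = max(bases)
    later = {num: fn for num, is_base, fn in parsed
             if not is_base and num > base_num}
    if not later:
        return base_file, [], []

    # Every number between the base and the newest delta must be present.
    missing = [n for n in range(base_num + 1, max(later) + 1) if n not in later]
    return base_file, [later[n] for n in sorted(later)], missing


files = ["src-cur.0340.gz", "src-cur.0341A.gz",
         "src-cur.0342.gz", "src-cur.0344.gz"]
base, later, missing = plan_deltas(files)
print(base, later, missing)
```

With the sample list, delta 0343 is reported as missing, which is the situation where it pays to have kept the old deltas around.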

By April 1997 I was finally back home in Denmark, and since I had TCP/IP connectivity my interest in CTM had been fading, so I happily handed over the reins to Richard Wackerbarth:

To: ctm-announce@freebsd.org
Subject: New CTM meister...
Date: Thu, 03 Apr 1997 11:39:05 +0200
Message-ID: <338.860060345@critter>
From: Poul-Henning Kamp <phk@critter.dk.tfs.com>

Just this bit to make it official:

Richard Wackerbarth <rkw@dataplex.net>  Has taken over the job of CTM
meister for FreeBSD,  he is now the man behind the "ctm@FreeBSD.org"
alias, which as always is the place to send email on the subject of CTM.

Thanks for taking this over Richard!

Poul-Henning

From that point somebody else will have to tell the story of CTM.

Wherever and whenever it ends…

phk