Getting CVS to do changesets

This page explains how I have come to use CVS in conjunction with some other tools to pretty much make CVS do changesets. Maybe someone else will find it helpful.

In my work, I am often dealing with pairs of codebases which are derived from a common ancestor, and which are very similar to each other, but which have diverged years ago, and which reside in separate CVS repositories (or sometimes only one of the pair is in CVS, the other is managed another way.)

Very often, changes in one codebase need to be ported to the other, and vice versa. Often, the best way to do this is via patches and the patch program. However, "normal," conventional use of CVS does not generally result in a repository from which it is easy to extract nice, clean patches. Typically developers will commit bits and pieces of a fix in a more or less haphazard way, or commit multiple fixes in a single commit, or spread several fixes across several commits, so that the boundaries between commits don't correspond to the boundaries between logical changes. It becomes difficult and time-consuming to extract a single logical change from a CVS repository which is used in the "normal" way. It's asking a lot to have developers constrain their commits to be single, complete logical changes to ease extraction of patches, but with the help of some extra tools, doing this becomes not so onerous, and once one becomes accustomed to doing things a certain way, it's actually a pretty nice way to work.

So, first off, here are the extra tools I've found to be immensely helpful.

Some helpful tools:

cvsps by David Mansfield. This is what makes it possible to get sensible patches out of CVS. This is a program that can scan through the output of "cvs log" and from that, extract patches which correspond to a single commit, essentially regrouping the multi-file commit that CVS does into a single patch. It can also produce a listing of commits which is very handy for getting an overview of the development that's been taking place, who's been doing what, etc. Very handy for just identifying what changes you might be interested in sucking out of a repository.
Andrew Morton's patch scripts. This is what makes it possible for developers to put sensible patches into CVS, yet still maintain their sanity. Alternatively, Quilt can do much the same thing; it is a rewrite of Andrew's patch scripts, and is arguably easier to learn to use. However, (last time I checked) it lacks "import_patch" and "export_patch", two features of Andrew's scripts which the Quilt developers don't seem to like for some strange reason, and which I use all the time. (I'm the author of import_patch and export_patch, actually; I wrote them because I needed them.) So I prefer Andrew's scripts. Either will work if you don't need import_patch or export_patch. There is some more documentation of Andrew's patch scripts here. BTW, when you unpack the patch scripts, go into the patch-scripts directory and do "make stripit".
patchutils contains a number of handy tools for manipulating patches.
wiggle by Neil Brown can often get rejected hunks to almost magically go in anyway. It's like patch, but uses word-based diff, rather than a line-based diff. It places rejects inside the file much as a CVS merge does. I have seen it do completely the wrong thing sometimes, so you've got to be careful with it. When it works, which is fairly often, it's fabulous.
ccache. Not really related to CVS or patches, but ccache, a caching front end for gcc, greatly speeds things up. If you ever do "make clean ; make" you need ccache.

Getting patches out of CVS

The tool for this is "cvsps." There are two main functions of cvsps that I use.

	cvsps -x -u > changes.txt

Here is some sample output of cvsps -x -u for Gneutronica:

---------------------
PatchSet 1 
Date: 2005/07/05 13:13:09
Author: smcameron
Branch: HEAD
Tag: (none) 
Log:
initial commit

Members: 
	COPYING:INITIAL->1.1 
	INSTALL:INITIAL->1.1 
	Makefile:INITIAL->1.1 
	gneutronica.c:INITIAL->1.1 
	sched.c:INITIAL->1.1 
	sched.h:INITIAL->1.1 
	drumkits/Kurzweil_PC88.dk:INITIAL->1.1 
	drumkits/Roland_Dr660_Standard.dk:INITIAL->1.1 
	drumkits/general_midi_standard.dk:INITIAL->1.1 
	drumkits/generic.dk:INITIAL->1.1 
	drumkits/yamaha_motifr_rockst1.dk:INITIAL->1.1 

---------------------
PatchSet 2 
Date: 2005/07/05 16:00:31
Author: smcameron
Branch: HEAD
Tag: (none) 
Log:
Added MIDI setup window to send MIDI bank/patch change messages and change the MIDI channel to transmit on.

Members: 
	gneutronica.c:1.1->1.2 

---------------------
PatchSet 3 
Date: 2005/07/05 16:45:47
Author: smcameron
Branch: HEAD
Tag: gneutronica_point_one_alpha 
Log:
Added some documentation

Members: 
	documentation/arranger_window.png:INITIAL->1.1 
	documentation/gneutronica.html:INITIAL->1.1 
	documentation/pattern_editor.png:INITIAL->1.1 

---------------------

(There's more, but you get the idea.) The patchsets are numbered. To generate a patchset, use "cvsps -g -s patchsetnumber". For example, to generate patchset number 2, do:

[scameron@zuul gneutronica]$ cvsps -g -s 2 > 2.patch
[scameron@zuul gneutronica]$ cat 2.patch 
---------------------
PatchSet 2 
Date: 2005/07/05 16:00:31
Author: smcameron
Branch: HEAD
Tag: (none) 
Log:
Added MIDI setup window to send MIDI bank/patch change messages and change the MIDI channel to transmit on.

Members: 
	gneutronica.c:1.1->1.2 

Index: gneutronica/gneutronica.c
diff -u gneutronica/gneutronica.c:1.1 gneutronica/gneutronica.c:1.2
--- gneutronica/gneutronica.c:1.1	Tue Jul  5 11:13:09 2005
+++ gneutronica/gneutronica.c	Tue Jul  5 14:00:31 2005
@@ -30,6 +30,7 @@
 #include <math.h>
 #include <signal.h>
 #include <setjmp.h>
+#include <netinet/in.h> /* . . . just for for htons() */
 
 #include <gtk/gtk.h>
 
@@ -56,6 +57,7 @@
 #define PLAY_ONCE 0
 #define PLAY_LOOP 1
 #define PLAYER_QUIT 2
+#define PERFORM_PATCH_CHANGE 4
 
 int midi_fd = -1;
 int player_process_fd = -1;
@@ -105,6 +107,7 @@
 GtkWidget *SaveBox;

(I omit most of the patch, as it's pretty big.)
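When changes.txt gets long, it can help to boil the listing down to one line per patchset before deciding which numbers to hand to "cvsps -g -s". The following is only a minimal sketch in plain awk, assuming the listing has exactly the layout shown above (a "PatchSet N" line, an "Author:" line, and the log message starting on the line after "Log:"). The function name summarize_patchsets is made up for this example; it is not part of cvsps.

```shell
# summarize_patchsets: read a "cvsps -x -u" style listing on stdin and
# print "number author first-line-of-log" for each patchset.
# (Hypothetical helper; assumes the listing layout shown above.)
summarize_patchsets() {
    awk '
        # The line right after "Log:" is the first line of the log message.
        want_log            { print num, author, $0; want_log = 0; next }
        /^PatchSet [0-9]+/  { num = $2 }
        /^Author: /         { author = $2 }
        /^Log:/             { want_log = 1 }
    '
}
```

Usage would be something like "summarize_patchsets < changes.txt", which for the sample above should give lines such as "2 smcameron Added MIDI setup window to send MIDI bank/patch change messages ...".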

So, you can see that with cvsps, it's easy to get changes out of CVS which were committed in a single "cvs commit" and get them out in the form of a patch. That's great. But that doesn't help so much if what was committed at one go doesn't amount to a single coherent logical change.

So the trick is to work with CVS in such a way that whenever you commit, you're committing a single logical change, so later you can suck that change back out with cvsps for easy porting to some other CVS that contains a cousin code base.

This is where Andrew Morton's patch scripts really help.

Committing sensible patches into CVS

In the normal course of programming, I often find myself working on several things, several "logical changes," at once. In order to accomplish some goal, I might first have to make some change to a structure, which causes interface changes to functions throughout the code. In the course of making that change, I might spot some small, unrelated bugs which need fixing. Furthermore, development may take a while, and I may need to keep up with ongoing commits to a CVS repository, but still want to keep my changesets separate from each other, and separate from whatever other commits are happening in the repository.

Andrew Morton's patch scripts handle this situation pretty well. They layer patches on top of a source code base in a stack, and provide the means to easily apply, remove, refresh (rediff) and reorder patches.

The main commands I use from the patch scripts are:

pstatus Shows the status of patches, which are applied, which need refreshing.
import_patch Brings an existing patch into the system, and is the easiest way to start a brand new patch (by importing an empty file.)
export_patch Exports the current set of patches to a set of filenames which sort into the same order as they are currently applied. Useful for mailing someone a pile of patches which can easily be applied in the right order.
refpatch Re-diffs the topmost patch, incorporating new changes.
pushpatch Applies the next patch in the sequence to the codebase, and makes it the topmost patch.
poppatch Unapplies the topmost patch from the codebase.
mv-patch Allows an unapplied patch to be renamed.
fpatch Tells the patch system what files are touched by a patch (this information is needed for refpatch to work.) (BTW, forgetting to do fpatch is probably the 2nd easiest way to shoot yourself in the foot with Andrew's patch scripts. The easiest way is to forget which patch is the topmost patch and edit a file and have those changes end up in the wrong patch.)

The typical sequence of work is like this... First, assume you check out some code from CVS:

cvs co somecode
cd somecode

Presumably, you want to make a change. Suppose you want to add frickin' lasers to the sharks' heads. First create an empty patch, and import it:

[scameron@zuul somecode]$ echo > /tmp/add_frickin_lasers.patch
[scameron@zuul somecode]$ import_patch /tmp/add_frickin_lasers.patch
Recreated ./pc/add_frickin_lasers.pc
Warning: add_frickin_lasers has no description.

Then, tell the system what files this patch will be touching:

[scameron@zuul somecode]$ fpatch add_frickin_lasers beasts/sharks/great_white.c 
[scameron@zuul somecode]$ fpatch beasts/sharks/hammerhead.c
[scameron@zuul somecode]$ fpatch weapons/laser.c
file weapons/laser.c appears to be newly added

Then, edit the files, do the edit/compile/run/debug cycle...

[scameron@zuul somecode]$ vi beasts/sharks/hammerhead.c beasts/sharks/great_white.c weapons/laser.c
3 files to edit

When you're happy, or whenever you come to a point where you'd like to save a snapshot:

[scameron@zuul somecode]$ refpatch
rm -f ./patches/add_frickin_lasers.patch
Placing patch in  ./patches/add_frickin_lasers.patch
diff -puN beasts/sharks/great_white.c~add_frickin_lasers beasts/sharks/great_white.c
diff -puN beasts/sharks/hammerhead.c~add_frickin_lasers beasts/sharks/hammerhead.c
diff -puN /dev/null weapons/laser.c
Refreshed add_frickin_lasers
[scameron@zuul somecode]$ pstatus
1:a:add_frickin_lasers
[scameron@zuul somecode]$

The pstatus command shows 1 patch, which is applied (the "a" indicates that it's applied), and the patch is add_frickin_lasers.patch.

Suppose in the course of adding the lasers, you notice the steering algorithm in great_white.c has a bug which needs to be fixed before the lasers can be aimed properly, but which does not, strictly speaking, have anything to do with adding the lasers.

First, pop off the freakin' laser patch:

[scameron@zuul somecode]$ pstatus
1:a:add_frickin_lasers
[scameron@zuul somecode]$ poppatch
patching file beasts/sharks/great_white.c
patching file beasts/sharks/hammerhead.c
patching file weapons/laser.c
Removed add_frickin_lasers, no patches applied

Create a new steering fix patch:

[scameron@zuul somecode]$ echo > /tmp/fix_gw_steering.patch
[scameron@zuul somecode]$ import_patch /tmp/fix_gw_steering.patch
Recreated ./pc/fix_gw_steering.pc
Warning: fix_gw_steering has no description.
[scameron@zuul somecode]$ fpatch fix_gw_steering beasts/sharks/great_white.c

Reorder the series file so the steering patch is first:

[scameron@zuul somecode]$ cat series
add_frickin_lasers.patch
fix_gw_steering.patch
[scameron@zuul somecode]$ vi series
[scameron@zuul somecode]$ cat series
fix_gw_steering.patch
add_frickin_lasers.patch
[scameron@zuul somecode]$ pstatus
1:a:fix_gw_steering Needs refpatch
2:-:add_frickin_lasers

Change whatever needs to be changed to fix the steering problem:

[scameron@zuul somecode]$ vi beasts/sharks/great_white.c
[scameron@zuul somecode]$ refpatch
rm -f ./patches/fix_gw_steering.patch
Placing patch in  ./patches/fix_gw_steering.patch
diff -puN beasts/sharks/great_white.c~fix_gw_steering beasts/sharks/great_white.c
Refreshed fix_gw_steering
[scameron@zuul somecode]$ pstatus
1:a:fix_gw_steering
2:-:add_frickin_lasers

Push the laser patch on top of the steering fix.

[scameron@zuul somecode]$ pushpatch
patching file beasts/sharks/great_white.c
patching file beasts/sharks/hammerhead.c
patching file weapons/laser.c
file weapons/laser.c appears to be newly added
applied add_frickin_lasers, next is

[scameron@zuul somecode]$ refpatch
rm -f ./patches/add_frickin_lasers.patch
Placing patch in  ./patches/add_frickin_lasers.patch
diff -puN beasts/sharks/great_white.c~add_frickin_lasers beasts/sharks/great_white.c
diff -puN beasts/sharks/hammerhead.c~add_frickin_lasers beasts/sharks/hammerhead.c
diff -puN /dev/null weapons/laser.c
Refreshed add_frickin_lasers
[scameron@zuul somecode]$ pstatus
1:a:fix_gw_steering
2:a:add_frickin_lasers
[scameron@zuul somecode]$

Now, you've got two separate patches containing two separate fixes. Reordering patches is as simple as popping the patches off, editing the series file to get the order you want, then pushpatch, refpatch for each patch. (Presuming patches don't conflict or have dependencies, otherwise reordering is a bit more work, or impossible, depending on the situation.)

Picking up new CVS commits is as simple as popping off all the patches, doing a "cvs update" then pushing all the patches back on.
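That pop / update / push cycle can be wrapped up in a small function. This is a rough sketch, not something that ships with the patch scripts: the name update_under_patches is invented here, and it assumes that pstatus prints applied patches as "N:a:name" and unapplied ones as "N:-:name" (as in the output above), and that poppatch and pushpatch return a non-zero status when they fail.

```shell
# update_under_patches: pop every applied patch, run "cvs update",
# then push the patches back on one at a time.
# (Hypothetical helper; relies on the pstatus output format shown above
# and on poppatch/pushpatch signalling failure via their exit status.)
update_under_patches() {
    # Keep popping while pstatus still reports an applied patch.
    while pstatus | grep -q '^[0-9][0-9]*:a:'; do
        poppatch || return 1
    done

    cvs update || return 1

    # Push while pstatus still reports an unapplied patch. Stop at the
    # first failure so the rejects can be sorted out by hand.
    while pstatus | grep -q '^[0-9][0-9]*:-:'; do
        pushpatch || return 1
    done
}
```

If a push fails partway through, the function stops and leaves the stack as it is, so the failing patch can be dealt with manually.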

When "pushpatch" fails, you can do "pushpatch -f" which will throw rejected hunks. You then manually resolve the rejected hunks (possibly using wiggle), and refpatch.

Committing your patches is done one at a time. Pop all the patches off, push the first one on, refpatch, then "cvs commit" it in one go (so that cvsps can reconstruct it later). Repeat this for each patch you wish to commit to CVS. Also, for each patch you commit to CVS, remove it from the series file and the applied-patches file. (Since they're in CVS, they are not really "patches" anymore; they're part of the baseline code.) "pstatus" may still show these patches at the bottom of the list with a '?' next to them. That just means it sees these patch files in the patches directory, or the ~ files in the source directory, but doesn't see them in the series file. Those entries can be ignored.
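Removing a committed patch from the list files can be done in an editor, or with something like the sketch below. The function name drop_committed is made up for illustration. It assumes each list file holds one patch name per line; the series file shown earlier uses names ending in ".patch", and whether applied-patches uses the suffix or not is a guess, so the sketch removes either spelling.

```shell
# drop_committed PATCHNAME LISTFILE...
# Remove PATCHNAME (with or without a ".patch" suffix) from each list file.
# (Hypothetical helper; assumes one patch name per line in each file.)
drop_committed() {
    name=${1%.patch}
    shift
    for list in "$@"; do
        [ -f "$list" ] || continue
        # -x matches whole lines only, -F treats the names as plain strings.
        grep -v -x -F -e "$name" -e "$name.patch" "$list" > "$list.tmp"
        mv "$list.tmp" "$list"
    done
}
```

For example, after committing the steering fix: "drop_committed fix_gw_steering series applied-patches".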

Transporting patches from one CVS to another

Generally, you'll extract the patches from one CVS with cvsps, or you may be handed a set of patches by a developer. The CVS you're trying to put them into may present a number of problems. The code may differ to some extent. Directories may have been moved around, renamed, etc. For this reason, I often find that I cannot simply use "import_patch" to bring patches that I got out of one CVS into another CVS. Sometimes you can, and when you can, that's great.
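When a whole run of patchsets needs to come out of the source CVS, the "cvsps -g -s N" invocation from earlier can be put in a loop. A small sketch, with export_patchsets being a name invented for this example; it must be run inside the CVS working directory, just like cvsps itself.

```shell
# export_patchsets FIRST LAST
# Write patchsets FIRST..LAST to FIRST.patch ... LAST.patch in the
# current directory, one "cvsps -g -s N" call per patchset.
# (Hypothetical helper built on the cvsps invocation shown earlier.)
export_patchsets() {
    n=$1
    while [ "$n" -le "$2" ]; do
        cvsps -g -s "$n" > "$n.patch" || return 1
        n=$((n + 1))
    done
}
```

"export_patchsets 2 5" would then leave 2.patch through 5.patch behind, ready to be looked over and fed to the other codebase.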

Sometimes, just the directories are off a little bit, and some hand-editing of the patch can make the patch able to find the directories, and make import_patch usable.
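If the only problem is a directory prefix, the hand-editing can be automated to a degree. The sketch below is one possible approach in plain awk, not a feature of any of the tools above; rewrite_patch_prefix is a made-up name. It only touches lines starting with "--- ", "+++ " or "Index: ", and it cannot tell a file header from a hunk line that happens to start the same way, so look the result over before using it.

```shell
# rewrite_patch_prefix OLD NEW
# Read a patch on stdin. On "--- ", "+++ " and "Index: " lines whose path
# begins with OLD, replace that leading OLD with NEW. Write to stdout.
# (Hypothetical helper; does not distinguish headers from look-alike
# hunk lines, so check the output.)
rewrite_patch_prefix() {
    awk -v old="$1" -v new="$2" '
        function swap(marker,    rest) {
            rest = substr($0, length(marker) + 1)
            if (index(rest, old) == 1)
                $0 = marker new substr(rest, length(old) + 1)
        }
        /^--- /    { swap("--- ") }
        /^\+\+\+ / { swap("+++ ") }
        /^Index: / { swap("Index: ") }
        { print }
    '
}
```

For instance, "rewrite_patch_prefix gneutronica/ src/gneutronica/ < 2.patch > 2-moved.patch" (the src/gneutronica/ directory is just an example) would repoint the file headers of the patch shown earlier.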

If that's not feasible, one way is to just create a new patch against the new CVS with "import_patch", use "fpatch" to tell it what files you're going to be changing, then use "patch" with the foreign patches. It will complain about the files it can't find, but give you a chance to tell it what files to patch. Then after patch is done, you refpatch, and you're good to go.
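Before running fpatch, it helps to know which files the foreign patch wants to touch. Here is a minimal sketch that pulls the file names out of the "+++ " header lines of a unified diff; patch_files is an invented name, and the paths it prints are the ones in the foreign patch, which may well need translating to wherever those files live in the new tree.

```shell
# patch_files: read a unified diff on stdin and print each file it touches.
# (Hypothetical helper; treats a "+++ " line as a file header only when
# it directly follows a "--- " line.)
patch_files() {
    awk '
        /^\+\+\+ / && prev ~ /^--- / {
            path = substr($0, 5)
            sub(/\t.*/, "", path)       # drop the timestamp after the tab
            if (path != "/dev/null")    # deleted files have no new name
                print path
        }
        { prev = $0 }
    ' | sort -u
}
```

The output of "patch_files < foreign.patch" then serves as a checklist of files to hand to fpatch, one at a time.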

Another similar way is to use filterdiff to suck out the parts of the patch on a file by file basis, and use that to patch files one at a time, then refpatch at the end.

Cutting a giant messy diff into logical patches

Sometimes you just have to deal with a big mess, and no amount of messing with cvsps will get you sensible patches. Here are some strategies for dealing with that mess.

Suppose you have two directories full of source, a and b, and you want to cram the difference between a and b into a third set of source, c, and maybe you're not all that familiar with the codebase, or with what changes were being made, as they were done by several people.

You may try to use cvsps to get some hints about the sequence of changes that were done, though if the developers were sloppy about grouping their commits, this may not get you useful patches. Sometimes you don't even have this luxury of hints, because you have no cvs.

Use filterdiff to annotate the hunks of the giant diff. This will number each hunk. Then go through the annotated giant diff, and look at each hunk, and write down the hunk number, and what sort of changes are being done in the hunk.

After this, you should see that there are some subsets of hunks that are doing one or two things. Often, you will be able to construct a sensible patch or two just by using filterdiff to select the hunks that make up that patch.
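For the note-taking step, a bare list of hunks makes a handy worksheet to scribble on. The sketch below is a plain awk stand-in, not filterdiff itself, and hunk_worksheet is a made-up name. It numbers hunks per file starting at 1; that numbering is an assumption of this sketch, so check it against filterdiff's own annotations before using the numbers to select hunks.

```shell
# hunk_worksheet: read a unified diff on stdin and print one line per
# hunk in the form "file #N @@ header", with N counted per file.
# (Hypothetical helper; the per-file numbering is this sketch's choice.)
hunk_worksheet() {
    awk '
        /^\+\+\+ / && prev ~ /^--- / {
            file = substr($0, 5)
            sub(/\t.*/, "", file)   # drop the timestamp after the tab
            n = 0
        }
        /^@@ / { n++; print file, "#" n, $0 }
        { prev = $0 }
    '
}
```

Redirect the output of "hunk_worksheet < giant.diff" to a file and write a word or two next to each line about what that hunk is doing.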

Take those easy patches and, using Andrew's patch scripts, put them into codebase c, and use plain patch to put them into codebase a. Then rediff a vs. b. This will result in a slightly smaller version of the giant diff which doesn't contain the patches you've just made. Just keep repeating that process, cutting the giant diff down in size patch by patch.

With hunks that contain bits from several logical changes, you just have to kind of manually separate them, by applying the bits of the hunk manually, or applying the whole hunk and undoing bits, then refpatch...

Keep applying the extracted patches, then rediffing the giant diff and gradually the giant diff gets smaller and smaller. Pain in the ass, but that's how I've done it in the past.
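To keep morale up, it's nice to see the giant diff actually shrinking. One crude measure is the number of hunks left in the a vs. b diff, as in this sketch (remaining_hunks is an invented name, and "diff -urN" is assumed to be available, as it is with GNU diff):

```shell
# remaining_hunks A B
# Count the hunks in a recursive unified diff of directory A vs. B.
# (Hypothetical helper; a rough progress meter, nothing more.)
remaining_hunks() {
    diff -urN "$1" "$2" | grep -c '^@@ '
}
```

Run "remaining_hunks a b" after each round of extracting and applying a patch; when it prints 0, there's nothing left to carve up.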