Tuesday, August 19, 2014

Mercurial Queues to manage 'Spikes'



We all do some kind of R & D on code. Sometimes it’s purely for learning purposes whilst sometimes it’s to try something new on an existing code base. I call the second exercise a ‘Spike’.
A spike is quite a fluid activity. Depending on the complexity of the piece, it can take weeks, if not months, to bring a spike close to a fully fledged feature. The goal of this post is to identify a neat way to use version control to keep a proper history of the work done while you are spiking.
Unlike fixing a bug or developing a well-understood feature, a spike can sometimes be a walk in the dark - until you see the light. You will of course set yourself incremental targets, but some of these targets (or ideas) might turn out to be inefficient, not robust or downright wrong. So once you decide that what you have done for the past 2-3 days is wrong, how would you start over? Would you revert all the changes? Or would you painstakingly try to identify the changes that are still useful and get rid of the rest? This may sound trivial, but a healthy spike can touch various parts of your solution, including many source files. Believe me, it’s not a very fulfilling activity.
But with version control you wouldn’t have this problem, right? You’d of course commit whatever atomic changes you make as part of each small target, and then it’s a matter of rolling back the selected changesets. Although this sounds better than the previous approach, it still has problems. Changesets in version control are immutable: once committed, they become part of the history. Whilst this is desirable in most instances, it might not be when you are spiking, because a spike can have a lot of intermediate steps which are incomplete, misdirected or even wrong. Even if these changesets are of value to you, once you push them upstream they might be confusing or irrelevant to other users of the repository. They can end up polluting your version control history unnecessarily.
[Note: You don’t necessarily have to be familiar with Mercurial to follow the rest of the article; any experience with a modern version control system would suffice.]
With Mercurial Queues (MQ), you can get the best of both worlds. MQ is an extension to the popular distributed version control system Mercurial (Hg). It extends Hg functionality to manage a collection of patches together. MQ is said to have been inspired by a tool called ‘Quilt’, used by the open source community in the pre-git days to manage patches. Although a similar tool is quite useful in an open source project, the focus of the discussion here is on managing spike branches. MQ can be enabled by putting the following lines in your user-level .hgrc file, or in the .hg/hgrc file of an individual repository.

[extensions]
hgext.mq =

MQ helps you build a stack of patches. Each patch on its own is like a bucket continuously accepting changes. Compared to a changeset - which crystallizes as soon as you create it - a patch in the context of MQ can be refreshed again and again without permanently saving or committing it to the repository. The fact that you can record temporary checkpoints in a patch is the most useful feature for me. Once you gain some confidence, you close the patch and create a new one.

MQ also provides commands to alter the state of your patch stack. You can pop or push patches, or even change the order of patches in the stack. This can greatly help to clean up the repository. In addition, MQ enables you to roll up several patches into one. This is a great precursor to converting patches into regular Hg changesets.

Let’s go through a typical workflow.

1. I need to refactor a large part of my code base involving presentation, business logic and data access layers. This will touch at least 4-5 projects.
2. I clone the repository into my working directory.

Basic Usage of MQ

1. Start a queue
>hg qnew -m "Try a sample change" -u "user" patch1a.patch
This creates an empty patch.

2. Make the necessary changes in the project(s). (Assume new files are added in the process.)
>hg add
Schedules the new files to be tracked, so that they are included in the patch.
>hg qrefresh
qrefresh updates the current patch with the latest changes in the working directory.

3. Now keep making changes and keep running hg qrefresh as often as possible.

4. Remember to start a new patch as soon as you think your immediate goal is achieved. This is quite important, as the granularity of your patches decides how flexible your patch stack is for tinkering later in the process.

5. Once you are ready, start a new patch (essentially closing the existing patch)
>hg qnew -m "Change few more entities" -u "user" patch1b.patch
This starts a new patch, patch1b.patch. The previous patch is now beneath it in the stack.

6. Check the patch stack
>hg qseries
patch1a.patch
patch1b.patch
>hg qapplied
patch1a.patch
patch1b.patch
>hg qunapplied
(Shows nothing, since qnew applies the new patch on top of the existing one)

7. Go back to step #2


Patch Stack Management - Change Order
MQ also has some simple stack manipulation commands that can be used to push patches onto the stack or pop them off it.
1. Assume the following state in your patch queue
>hg qapplied
patch1a.patch
patch1b.patch
patch1c.patch

2. Now imagine that you realise there’s a bug in the core of your code base. Ideally, the fix should have gone in before any of these patches. This is how you would go about it:
>hg qpop -a
This pops all patches off the stack. (They are still intact, don’t worry.)

3. >hg qnew core.patch (Other command params are omitted for brevity)

4. Now do the code changes you want to make in core code.

5. >hg qrefresh
Updates the patch

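6. >hg qpush core.patch
qnew has already applied core.patch at the bottom of the stack, so this only confirms that it is at the top of the applied patches.

7. >hg qpush -a
Pushes all the rest of the patches on top of core.patch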

8. >hg qapplied
core.patch
patch1a.patch
patch1b.patch
patch1c.patch


Patch Stack Management - Combine Patches
Assume that you have spent enough time on the spike to be quite confident that the existing patches will end up in a feature branch. Maybe some patches on their own don’t deserve to be an Hg changeset.

1. >hg qapplied
core.patch
patch1a.patch
patch1b.patch
patch1c.patch
2. >hg qpop -a
3. >hg qpush
This pushes core.patch back onto the stack
4. >hg qapplied
core.patch
5. Now we need to combine patch1a, patch1b and patch1c into a single patch - patch1
6. >hg qnew patch1.patch
7. >hg qapplied
core.patch
patch1.patch
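8. >hg qfold patch1a.patch patch1b.patch patch1c.patch
Folds the three unapplied patches into the current patch (patch1.patch) and removes them from the series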
9. >hg qapplied
core.patch
patch1.patch


Converting Patches to Changesets
Once you are comfortable with your changes - i.e. the patch stack - you can easily convert the patches to changesets to be pushed to other team members or to a separate branch.

1. >hg qfinish -a
Moves all the applied patches into the repository history by converting them to regular changesets. This releases the patches from MQ’s control. (qfinish needs either -a or a revision to act on.)
2. >hg qseries
Shows nothing
3. As an alternative to finishing all of your patches, you can finish just the patch(es) you want (starting from the bottom of the stack up to the provided revision number)
>hg qfinish revision_number

I find Mercurial Queues to be extremely useful in the context of ongoing substantial changes to an existing code base. It gives me all the goodness of a version control system without sacrificing the consistency of my repository to unnecessary fingerprints.