Friday, February 13, 2015

Grass is green in Brownfield as well

After a lengthy stint of ‘Brownfield’ development I’m moving on to a ‘Greenfield’ project. I thought it’s time for a bit of retrospection.

Contrary to many other fields, the software field tends to look down on artifacts created by others, especially those that predate the current times. Terms like ‘legacy’ and ‘brownfield’ reflect this very well. In literature, people still hold classics such as Shakespeare, and even more contemporary works, in high regard - they discuss these works and have newcomers to the field study them. If you look for the same in Software Engineering, which is as creative a process as writing, you don’t see it at all. Which pieces of classic software do we study to learn about our craft or to inspire us?

In architecture and building construction, terms like ‘regeneration’, ‘revitalization’, ‘rebuild’ are considered positive terms conveying a positive message to the listener. However a common software term like ‘refactor’ doesn’t bring that same positive energy to a listener, especially if she’s not a Software professional. What has caused this? Is the insatiable thirst for shiny new things depriving the industry of the chance to look back and appreciate what was achieved before us?

Another factor behind this immature attitude towards existing, running software could be the ongoing high percentage of newcomers to the industry. It is said that 50% of the industry has less than 5 years of experience. This is probably one reason why there’s a lack of love for existing systems and code. It’s hard to appreciate golden oldies when all you have seen is less than 5 years old.

Approaching Brownfield
I’ve been lucky to be on both the green and brownfield sides of development for the same application over the past few years. I joined the team while the application was being written, worked on it for some time, and then left. I later rejoined the same team when the application was in maintenance mode. So today, while cringing at some of the code and patterns used in the code base, I soon realise that it’s probably me who wrote it a couple of years back and then start empathising with that developer - who happens to be me in this case.

The point is, to be successful in brownfield you’ve got to have empathy. You’ve got to put yourself under the numerous constraints the previous developer might have had to work under. The constraints could be his own lack of experience, time pressure, lack of functional clarity, technical limitations at the time etc… The common ‘this code sucks’ syndrome doesn’t quite work here - because I’ve come to realise that this attitude tends to extend beyond the code. Maybe you should focus on the code itself and get the individuals who are behind that code out of your mind.

Another good technique for approaching brownfield is to treat it as a sort of detective or investigative work. You have the remains of a case - the code - and you’ve got to reconstruct the whole crime scene: the motives and decisions that went into the code at hand. The tools at your disposal could be unit tests (even if they turn out to be disposable), knowledge about technical limitations and best practices at the time, or even the personalities and work patterns of the developers involved in the code base.

The rewrite
A common pitfall of brownfield development is the desire to rewrite - rewrite from scratch. This is as dangerous for a module or component as it is for the whole system. There’s a lot more implicit knowledge hidden beneath that code than you care to appreciate. A rewrite should be approached extremely cautiously, both for technical and business reasons. Most of the time, the men with the cheques don't care a hoot about whether the application is using jQuery 1.5 or 1.9, they just want defects kept to a minimum and the system to be performant. So any considerable rewrite has to justify both the technical and business risks. A middle path here is to develop a separate application/module on top of the existing code base and just route the users to this new piece instead of the old one. The old one is still kept for some time as a mitigation strategy. I’ve found that this strategy gives the business decision makers a lot more peace of mind, which actually increases the chances of a rewrite being approved in the first place.
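
The middle path can be pictured as a thin routing layer sitting in front of both implementations. The sketch below is only an illustration under stated assumptions: the `Router` class and the two handler functions are hypothetical names, and in a real system the switch would more likely live in the web server or load balancer configuration.

```python
from typing import Callable, Dict

Handler = Callable[[str], str]


def legacy_invoice_page(user: str) -> str:
    """Hypothetical stand-in for the existing, time tested implementation."""
    return f"legacy invoice page for {user}"


def new_invoice_page(user: str) -> str:
    """Hypothetical stand-in for the rewritten module built alongside it."""
    return f"new invoice page for {user}"


class Router:
    """Sends each request to the new handler unless it has been switched off."""

    def __init__(self) -> None:
        self._routes: Dict[str, Dict[str, Handler]] = {}
        self._use_new: Dict[str, bool] = {}

    def register(self, path: str, old: Handler, new: Handler) -> None:
        self._routes[path] = {"old": old, "new": new}
        self._use_new[path] = True  # route users to the new piece by default

    def roll_back(self, path: str) -> None:
        # The old code was never deleted, so rolling back is just a flag flip.
        self._use_new[path] = False

    def handle(self, path: str, user: str) -> str:
        handlers = self._routes[path]
        chosen = handlers["new"] if self._use_new[path] else handlers["old"]
        return chosen(user)


router = Router()
router.register("/invoices", legacy_invoice_page, new_invoice_page)
print(router.handle("/invoices", "alice"))  # served by the new module
router.roll_back("/invoices")
print(router.handle("/invoices", "alice"))  # served by the old module again
```

Because the old handler stays registered, the mitigation strategy costs nothing more than keeping the old code deployable for a while.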

Another opportunity to rewrite parts of the system is when a new piece of work (an enhancement) comes your way. This is the chance to get some free testing and strategically do some framework enhancements or component rewrites under the covers.

Adding value
Almost 95% of the industry is working on maintaining a system. So it’s surprising that the resources available to newcomers (remember, 50% of us have less than 5 years of experience) are primarily focused on developing systems from scratch. I hope there will be more resources on how to look after a live system and improve it while it’s kept running.

Probably the first improvement is to make sure that you use modern tools and techniques to look after the system. Again, if you look at the building industry you won’t see builders turning up to repair a classic building from the 1800s with just stone knives and hammers. They will have the latest tools. If the code base is not yet ready for that, the first job is to upgrade it so that it is. An example from .Net would be making sure that the code base is compatible with the latest Visual Studio.

A low hanging improvement for these systems is to upgrade to the latest frameworks or servers. For a web application, say, this could be upgrading from IIS6 to IIS8. Without any code changes the users could experience vast improvements.

Sometimes adding value to an existing system is misunderstood as making it look shiny. Although this could get the users’ attention, it’s probably not a high priority. If you are at the early stages of a brownfield project, you might be too scared to propose big changes to the system - fair enough. Focus on what improvements you can make to make your team’s life easier. How difficult is it to do a simple defect fix in a particular component? Try and make that component more welcoming to changes. Maybe you can write a test suite around that component, inviting others to contribute, and soon you’ll realise that the team is not so afraid to change it. Another thing is to review what version control system the project uses and what kind of workflow is being followed. I’ve come to realise that some teams don’t get the best out of version control and that makes them move at a lower velocity. It's not hard to find enterprise developers who have been bitten by Source Safe and still don't quite get the benefits of having a modern version control system.

An ongoing issue in brownfield is how the implicit knowledge held by your team members and buried in the code is captured. Do you have 5 year old functional specs that deviate from the current system so much that they are useless or even dangerous? What works for our team is a Wiki. It’s a live document capturing bits and pieces of information that are relevant on a daily basis. When documenting, always try and stick to the ‘Whys’ - not the ‘Hows’. Documenting why a certain decision has been made is far more important than documenting how a certain functionality works. The ‘How’ is already in the code - make sure your code is readable.

When adding a new feature or doing a significant enhancement, always try and stay away from dependencies that are hard to change or obsolete. Instead, try and replicate the old piece with new libraries or dependencies and run the two in parallel. See how the new piece performs over time and then you can even think about getting rid of the old one. I've had to do this for some data access components which were dependent on an obsolete version of an Oracle driver. In the interim you might have a bit of duplication in the system, but you always have the long term goal in mind. This is where software engineering purity doesn't quite get you there; pragmatism trumps purity in these cases.
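
Running the old and new pieces side by side could look roughly like the sketch below. It is a minimal illustration, not the approach actually used on the project: `fetch_customer_old`, `fetch_customer_new` and `ParallelRun` are hypothetical names, and the two functions simply stand in for data access code built on the obsolete and the current driver.

```python
from typing import Callable, List, Tuple


def fetch_customer_old(customer_id: int) -> dict:
    """Hypothetical stand-in for the code tied to the obsolete driver."""
    return {"id": customer_id, "name": "Customer %d" % customer_id}


def fetch_customer_new(customer_id: int) -> dict:
    """Hypothetical stand-in for the replacement built on a current library."""
    return {"id": customer_id, "name": f"Customer {customer_id}"}


class ParallelRun:
    """Calls both implementations, trusts the old one, and records disagreements."""

    def __init__(self, old: Callable[[int], dict], new: Callable[[int], dict]) -> None:
        self.old = old
        self.new = new
        self.mismatches: List[Tuple[int, dict, object]] = []

    def __call__(self, customer_id: int) -> dict:
        expected = self.old(customer_id)
        try:
            actual: object = self.new(customer_id)
        except Exception as exc:  # the new path must never break callers
            actual = exc
        if actual != expected:
            self.mismatches.append((customer_id, expected, actual))
        return expected  # callers keep getting the proven result


fetch_customer = ParallelRun(fetch_customer_old, fetch_customer_new)
for cid in (1, 2, 3):
    fetch_customer(cid)
print(f"{len(fetch_customer.mismatches)} mismatches recorded")
```

Once the mismatch list has stayed empty for long enough, the wrapper can return the new result instead, and eventually the old piece can go.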

Leaving Brownfield
Brownfield projects should not be treated as doing time for one’s sins. It’s a great environment to learn from seasoned individuals and, most importantly, from a time tested and seasoned code base. Sometimes learning what not to do is as important as learning what to do.

Make sure you pick up a few patterns and practices from the existing code base. Try and document them, since chances are you will forget them in a hurry. One way to do this is to update the existing documentation (wiki) of the project. This will add value to the team as well. Updating the architecture document, starting a wish list and updating any existing defect records are activities that will leave a positive impression of you among your team members.

Remember, it will be easy (and quite natural as well) for the team to blame you for some of the issues that could arise after you are gone. Make sure that they don’t get that chance. Tidy up whatever little mess you have left unattended, be it code, documentation or even a rogue folder on your dev server. Be nice to teammates and continue to value their time even after you leave.

Monday, November 24, 2014

Not your fathers' Microsoft

I’m sure anyone who heard the recent Microsoft announcements around .Net (their flagship development platform) is either hyper excited or super suspicious. Microsoft has open sourced .Net and is also committing to make it run on Linux. This is a far cry from the Microsoft so many people have known so far.

I will try to explain these and many other recent Microsoft decisions from a strategy point of view. I will simply demonstrate it using 3 rough time periods:

  1. Good Ol’ Microsoft (Prior to 2010)
  2. Current State (2010-2014)
  3. Future is here (2014 - Onwards)

Good Ol’ Microsoft

 Operating System at the centre of all attention (See Green)

Microsoft owns the OS used by the majority of enterprises and individuals. Microsoft bets on its overwhelming lock-in advantage in the operating system market to enhance its chances of winning the overall battle. It’s well known for its bullish tactics and quite used to being burnt by developer flames.

Current State (Past 2 years and probably next 2 years)

Programming platform is taking over the OS (Greener platform)

Cloud is changing the ball game and Microsoft is feeling it.

Enterprises no longer have to take a huge risk and invest heavily in order to change their development platform & strategy. They can even start slowly with the cloud and see how it goes and then once confident migrate fast. This is loosening the grip of Microsoft with regards to its OS advantage.

In addition, Windows as an Operating System has not been innovating all that well. While it has wasted a lot of energy on Start Buttons and Live Tiles, the Linux community is coming up with game changing technologies like ‘Docker’.

Microsoft has invested heavily in their Cloud platform - Azure. They started very late but they’ve already become the largest hyperscale cloud provider, with their data centre capabilities almost 6 times those of Google and 3 times those of Amazon.

However their development story is still primarily woven around Windows.

Future is here

As mentioned earlier in the article, Microsoft is keen to change their development story and change it fast. They have realised that they need to approach this from top to bottom, where developer lock-in is more important. Developers are no longer restricted by the IT capabilities (or lack thereof) of the Enterprise - Cloud has given them a lot more freedom and options when designing solutions. Options that go beyond the OS or infrastructure. Unless Microsoft provides a compelling set of tools and technologies to developers, they will see Microsoft as a toy suited for prototyping - not an end to end tool used for delivering complex software solutions.

In addition to realising this reality, Microsoft has embraced Open Source as the quickest form of widening the capabilities of their platform - Not to mention the extremely valuable PR brownie points they earn among communities.

It’s partnering with players that were once competitors or even too tiny to be bothered with. Some of the big ones are Xamarin (Mobile Application Development), Mono (Open source .Net), Docker (Next generation application container technology) etc…

All in all I feel Microsoft is heading in the right direction. I’m sure early cloud adopters like Amazon won’t take the challenge lightly and will start publicizing their story as well. There will be a lot of innovation and a lot more noise from all major players. The challenge for the development community as a whole is not to be drowned in all this noise but to find the gems among the chaos.

Monday, October 20, 2014

Before you code

A new project starts and everyone is excited. The development team is keen to dig right in and start coding. But wait, there are a few things that need to be done at the start to save you from some head banging later in the project.
Here are a few that I’ve come across. I’d love to hear your suggestions.

  1. Servers and Developer machines
    1. Establish templates for your servers - You should be able to churn out a server with the relevant software/hardware configuration in minutes during the life of your project.
    2. Make sure your developer machines have enough firepower - This may be an ideal time to get some funding for those upgrades that developers never got.
    3. Development Server - In some instances, teams decide to have a common development server as an integration testing area. They may even end up working on a common database in the initial phases of the project. Although this could seem productive at the start, the sooner you get away from this approach the better. This setup delays automation activities, gives a false sense of velocity and could easily result in artificial delays in areas like migration.

  2. Environments
    1. A typical project can have separate environments for development, testing and production. Each environment may consist of a combination of applications, configurations, databases etc...
    2. The team needs to decide how many environments it wants for each of development and testing, and how it wishes to manage the differences between them. Commissioning a new environment should again take only minutes.
    3. Another concern is how you connect your upstream and downstream applications to each of these environments. In most enterprise projects, data flows have to be established both upstream and downstream, from external systems into your dev/test environments.

  3. Version Control
    1. Distributed or central. Distributed version control is more commonplace now and you have great and varied options. You can even rely on third party providers like GitHub instead of setting up your own. Or maybe you want to go for a tool like TFS in light of its overall application life cycle support.
    2. Establish your version control workflow
      1. Do you go for named branches or separate repositories?
      2. Is rebasing allowed?
      3. Tagging
      4. Logging the commits
    3. It's not just code. Version control should not be limited to your code. Any artifact that makes your project what it is needs to be version controlled: your configurations, database, documents etc... I've found that some enterprise db developers are not very excited about using version control software. As a team you need to make sure that this is not the case and that database and configuration are treated as first class citizens in the version control world.

  4. Knowledge Management
    1. The team should decide on a mechanism to share and build knowledge about the product/project.
    2. A preferred method is to have a project wiki. You can start dumping your initial thoughts and then easily keep updating it.
    3. If the project involves a lot of documents, a document management system might also be considered.
    4. The Wiki can be used to drive consistency across the team, be it standardization of terminology or standardization of technical entities.
      1. Common project specific terms, abbreviations and their meanings
      2. Technical standards and compliance details (data types, naming standards)
      3. Reference implementations and code samples
    5. It can also host project management information such as contact details, communication escalation paths, holiday plans and high level project plans and schedule information.

  5. Project Management methodology and delivery expectations
    1. The team should decide what its delivery model is. For example, is it going to be 2 week iterations or a continuous deployment of features into an integration area as and when things roll out of developer hands?
    2. What sort of tool or method will be used to track the progress of the team?
    3. Seemingly simple things like consensus on when a task will be marked 100% can go a long way in understanding the status of the project at any given moment.
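
One way to manage the differences between environments, as raised under Environments above, is to keep a shared base configuration under version control and let each environment record only its deviations. The following is a minimal sketch of that idea; every key, host name and URL in it is made up for illustration.

```python
import copy
from typing import Any, Dict

# Settings shared by every environment (all names and values are hypothetical).
BASE: Dict[str, Any] = {
    "database": {"host": "localhost", "port": 1521, "name": "app"},
    "upstream_feed_url": "http://localhost:9000/feed",
    "debug": False,
}

# Each environment records only where it differs from the base.
OVERRIDES: Dict[str, Dict[str, Any]] = {
    "dev": {"debug": True},
    "test": {"database": {"host": "test-db"}},
    "prod": {
        "database": {"host": "prod-db"},
        "upstream_feed_url": "http://feeds.internal/feed",
    },
}


def merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of base with override applied, merging nested dicts."""
    result = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


def settings_for(environment: str) -> Dict[str, Any]:
    """Build the full settings for one named environment."""
    return merge(BASE, OVERRIDES[environment])


print(settings_for("test")["database"])
```

With this shape, commissioning another environment is a matter of adding one small entry to `OVERRIDES`, and the differences between environments are visible at a glance.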

I guess you understand that none of the above should stop you from actually getting things done at the start. It’s not imperative that you have all of it in place before you open up your IDE. However, in my experience the team should make it a point to get most of these concerns out of the way within the first couple of iterations.

Tuesday, August 19, 2014

Mercurial Queues to manage 'Spikes'

We all do some kind of R & D on code. Sometimes it’s purely for learning purposes whilst sometimes it’s to try something new on an existing code base. I call the second exercise a ‘Spike’.
A spike is quite a fluid activity. Depending on the complexity of the piece, it can take weeks if not months to bring a spike close to the state of a fully fledged feature. The goal of this post is to identify a neat way to incorporate version control to keep a proper history of the work done while you are spiking.
Unlike when you are fixing a bug or developing a well understood feature, a spike can sometimes be a walk in the dark - until you see the light. You will of course set yourself incremental targets, but some of these targets (or ideas) might turn out to be inefficient, not robust or downright wrong. So once you decide that what you have done for the past 2-3 days is wrong, how would you start over? Would you revert all the changes? Or do you painstakingly try to identify changes that are still useful and get rid of the rest? This may sound trivial, but a healthy spike can touch various parts of your solution including many source files. Believe me, it’s not a very fulfilling activity.
But with version control you wouldn’t get this problem, right? You’d of course commit whatever atomic changes you make as part of each small target and then it’s a matter of rolling back the selected changesets. Although this sounds better than the previous approach, it still has problems. Changesets in version control are immutable. They will be part of the history. Whilst this is desirable in most instances, when you are spiking this might not be so, because spikes can have a lot of intermediate steps which are incomplete, misdirected or even wrong. Even if these changesets are of value to you, once you push them upstream they might be confusing or irrelevant to other users of the repository. These changesets can end up polluting your version control history unnecessarily.
[Note: You don’t necessarily have to be familiar with mercurial to follow the rest of the article, any experience with a modern version control system would suffice]
With Mercurial Queues (MQ), you can get the best of both worlds. MQ is an extension to the popular distributed version control system Mercurial (Hg). It extends Hg functionality to manage a collection of patches together. MQ is said to be initially inspired by a tool called ‘Quilt’, used by the open source community in the pre-git days to manage patches. Although a similar tool is quite useful in an open source project, the focus of discussion here is on managing spike branches. MQ can be enabled by putting the following lines in your user-level .hgrc file, or in the .hg/hgrc file of an individual repository.

[extensions]
mq =

MQ helps you build a stack of patches. Each patch on its own is like a bucket continuously accepting changes. Compared to a changeset - which crystallizes as soon as you create it - a patch in the context of MQ can be refreshed again and again without permanently committing it to the repository history. The fact that you can record your temporary checkpoints into a patch is the most useful feature for me. Once you gain some confidence, you leave that patch as it is and start a new one.

MQ also has commands to alter the state of your patch stack. You can pop or push patches or even change the order of patches in the stack. This can greatly help to clean up the repository. In addition, MQ enables you to roll up several patches into one. This is a great precursor to converting patches into regular Hg changesets.
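
To make the difference between a patch and a changeset concrete, here is a toy model of a patch stack. It is only a sketch under stated assumptions: the class `ToyPatchQueue` is invented for illustration, its methods merely borrow the names of MQ commands, it never touches a real repository, and it models only the behaviour described in this post.

```python
from typing import Dict, List, Tuple


class ToyPatchQueue:
    """A toy model of a patch stack; not how MQ is actually implemented."""

    def __init__(self) -> None:
        self.series: List[str] = []   # every patch, bottom of the stack first
        self.applied_count = 0        # how many patches from the bottom are applied
        self.contents: Dict[str, List[str]] = {}
        self.history: List[Tuple[str, Tuple[str, ...]]] = []  # finished changesets

    def applied(self) -> List[str]:
        return self.series[: self.applied_count]

    def qnew(self, name: str) -> None:
        # A new patch is created on top of whatever is currently applied.
        self.series.insert(self.applied_count, name)
        self.contents[name] = []
        self.applied_count += 1

    def qrefresh(self, change: str) -> None:
        # Unlike a changeset, the top patch keeps absorbing changes.
        self.contents[self.applied()[-1]].append(change)

    def qpop_all(self) -> None:
        self.applied_count = 0        # patches stay in the series, just unapplied

    def qpush_all(self) -> None:
        self.applied_count = len(self.series)

    def qfold(self, *names: str) -> None:
        # Merge the named unapplied patches into the top applied patch.
        top = self.applied()[-1]
        for name in names:
            if name in self.applied():
                raise ValueError(f"{name} is applied; pop it first")
            self.contents[top].extend(self.contents.pop(name))
            self.series.remove(name)

    def qfinish_applied(self) -> None:
        # Applied patches become immutable history and leave the queue.
        for name in self.applied():
            self.history.append((name, tuple(self.contents.pop(name))))
        self.series = self.series[self.applied_count:]
        self.applied_count = 0


q = ToyPatchQueue()
q.qnew("patch1a.patch")
q.qrefresh("edit view")
q.qrefresh("edit view again")   # same patch, refreshed a second time
q.qnew("patch1b.patch")
q.qrefresh("edit entities")
q.qpop_all()
q.qnew("core.patch")            # lands at the bottom of the stack
q.qrefresh("fix core bug")
q.qnew("patch1.patch")
q.qfold("patch1a.patch", "patch1b.patch")
q.qfinish_applied()
print(q.history)
```

The intermediate checkpoints never reach `history`; only the two tidy patches do, which is the whole appeal when spiking.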

Let’s go through a typical workflow.

1. I need to refactor a large part of my code base involving presentation, business logic and data access layers. This will touch at least 4-5 projects.
2. I get a clone of the repository into my work directory

Basic Usage of MQ

1. Start a queue
>hg qnew -m "Try a sample change" -u "user" patch1a.patch
This will create an empty patch

2. Do necessary changes in the project(s). (Assume new files are added in the process)
>hg add
Schedules the new files to be tracked
>hg qrefresh
qrefresh will update the current patch with the latest changes in the working directory

3. Now keep making changes and keep running hg qrefresh as often as possible.

4. Remember to start a new patch as soon as you think your immediate goal is achieved. This is quite important as the granularity of your patches will decide how flexible your patch stack is for tinkering later in the process.

5. Once you are ready, start a new patch (essentially closing the existing patch)
>hg qnew -m "Change few more entities" -u "user" patch1b.patch
This will start a new patch, patch1b.patch. The previous patch stays in the stack beneath it.

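6. Check the patch stack
>hg qseries
>hg qapplied
>hg qunapplied
qseries lists every patch in the queue, qapplied lists the patches that are currently applied, and qunapplied lists those that are not.

7. Go back to step #2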

Patch Stack Management - Change Order
MQ also has some simple stack manipulation commands that can be used to pop or push patches in and out of the stack.
1. Assume the following state in your patch queue
>hg qapplied

2. Now imagine that you realise there’s a bug in the core of your code base. Ideally, this should have been fixed before any of these patches. This is how you would go about it:
>hg qpop -a
This pops all patches off the stack. (They are still intact, don’t worry)

3. >hg qnew core.patch (Other command params are omitted for brevity)

4. Now do the code changes you want to make in core code.

5. >hg qrefresh
Updates the patch

6. >hg qpush core.patch

7. >hg qpush -a
Push all the rest of the patches on top of core.patch

8. >hg qapplied

Patch Stack Management - Combine Patches
Assume that you have taken the spike to a level where you are quite confident that the existing patches will end up in a feature branch. Maybe some patches on their own don’t deserve to be an Hg changeset.

1. >hg qapplied
2. >hg qpop -a
3. >hg qpush
This pushes core.patch back on to the stack
4. >hg qapplied
5. Now we need to combine patch1a, patch1b and patch1c in to a single patch - patch1
6. >hg qnew patch1.patch
7. >hg qapplied
8. >hg qfold patch1a.patch patch1b.patch patch1c.patch
9. >hg qapplied

Converting Patches to Changesets
Once you are comfortable with your changes - i.e. your patch stack - you can easily convert the patches to changesets to be pushed to other team members or to a separate branch.

1. >hg qfinish -a
Moves all the applied patches into the repository history by converting them to changesets and releases them from MQ's control. (qfinish needs either the -a/--applied flag or an explicit revision.)
2. >hg qseries
Shows nothing, provided every patch in the queue was applied
3. As an alternative to finishing all of your applied patches, you can finish just the patch(es) you want (starting from the bottom of the stack up to the provided revision number)
>hg qfinish revision_number

I find Mercurial Queues to be extremely useful in the context of ongoing substantial changes to an existing code base. It gives me all the goodness of a version control system without having to sacrifice the consistency of my repository with unnecessary fingerprints.