Delta Branches

Dec 5, 2011 at 8:59 PM
Edited Dec 7, 2011 at 2:55 PM

I am trying to move a number of old projects from VSS to TFS.  These are Java/Maven/Sonar efforts.  I have moved several successfully, and the developers now get continuous integration at check-in time as well as weekly software analytics from Sonar (complexity, redundancy, tangle, etc.).  The issue is that one or two VSS projects rely heavily on "delta branches".  These projects each have a primary development branch but, largely to accommodate customer-specific changes (not suitable for common approaches to skinning or Spring), they also have a set of sparse delta branches, each of which contains just the files touched to produce the customizations needed for a single customer delivery.  At build time, all files are taken from the appropriate (VSS) label in the primary development branch, those files are then overwritten by the files in the appropriate delta branch, and a build is initiated.  I am not wild about this practice, but one of the projects has over a hundred and forty such delta branches, and I have to find a way either to accommodate a similar scheme in TFS or to plan a transition to an alternative strategy.
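
For readers unfamiliar with the practice, the build-time overlay described above can be sketched roughly as follows. This is only an illustration of the idea, not the actual VSS build script: the class name DeltaOverlay, its methods, and the three folder arguments are hypothetical, and the sketch assumes the labeled primary branch and the delta branch have already been fetched into local folders.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper: builds a work area from a primary tree plus a sparse delta tree. */
final class DeltaOverlay {

    /** Copies every regular file under source into target, replacing files that already exist. */
    static void copyTree(Path source, Path target) throws IOException {
        List<Path> files;
        try (Stream<Path> walk = Files.walk(source)) {
            files = walk.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        for (Path file : files) {
            // Keep the same relative folder structure in the target.
            Path dest = target.resolve(source.relativize(file).toString());
            Files.createDirectories(dest.getParent());
            Files.copy(file, dest, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    /** Step 1: take everything from the primary branch. Step 2: overwrite with the delta files. */
    static void assemble(Path primary, Path delta, Path workArea) throws IOException {
        copyTree(primary, workArea);
        copyTree(delta, workArea);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.err.println("usage: DeltaOverlay <primaryDir> <deltaDir> <workAreaDir>");
            return;
        }
        assemble(Paths.get(args[0]), Paths.get(args[1]), Paths.get(args[2]));
    }
}
```

The order of the two copies is the whole technique: any file present in the delta tree wins over the primary copy at the same relative path, and everything else comes from the primary branch untouched.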

Any suggestions?

Dec 5, 2011 at 11:42 PM
Edited Dec 5, 2011 at 11:44 PM

Let me clarify the underlying requirements for these one or two troublesome projects.

  • Can you describe the type of application architecture? Is it an internal website? Is it a standalone Windows app that runs on a desktop, or a client-server Windows app? I know that sounds irrelevant, but it will help us understand in case it reveals some nuance.
  • You said these projects each have a main dev branch, so in terms of requirements, your team works on one release at a time, together, right? They develop and check in directly to that dev branch.
  • Is there a trunk that the dev branch is split off of or is their "dev" branch really the main trunk line? If the dev branch is separate, by what rule and how often do you merge dev to main?
  • How many customers have unique variations? 2? 20? 200? 
  • Let me be sure I understand those hundred and forty delta branches. They are short-lived and represent different builds, right? These are not long-lived development branches for one hundred forty customers? (Hence my previous question.)
  • I assume that you want the main bulk of work done in the dev branch, and readily available to ALL customers. This means even the special customers get the benefits of the main line development effort.
  • How often do you deliver the standard version of the product?
  • How often do you deliver the specialized versions of the product?

Dec 6, 2011 at 8:28 PM
Edited Dec 6, 2011 at 8:38 PM

The most interesting case is a project with about 70 customer "sites".  This is an advertising offering that gives a variety of retailers the ability to put up ads and promotions (with Flash, etc.).  Currently there is a primary development source "branch" that has been around for many years.  At various times, new advertisers (our customers) are acquired.  Each time there is a new customer acquisition, a "delta branch" is created.  This branch holds skinning materials as well as some customized Java, but only the artifacts from the primary development branch that must be touched to satisfy a given customer's special needs.

Developers are always working against the primary development branch (overlaid by the appropriate delta branch).  There is only one development branch and no "mainline".  Picture a single long development branch with a delta branch spur for each deployment.  Upgrades involve copying spurs to a more recent part of the primary development branch and reconciling customer-specific artifacts with the newer portion of the primary branch.

Each (and every) customer (roughly 70 at the present time) has unique artifacts maintained in a dedicated delta branch.

The delta branches are long lived.  The current state of each customer is represented by a label on the primary development branch plus a delta branch.

Most features are supported by the primary development branch.  Customer-specific features (and "skinning") are supported by the delta branch.

Every customer has a delta branch because, at the very least, there is always skinning to be done.  The "standard" version of the product is never delivered.  A specialized version is always delivered.

I realize that skinning can be done in other ways and that features can be made configurable.  The reality here is that there are so many unique requests that making everything configurable would mean that a great many features would only be configured "on" for a single customer.  Burdening the primary development branch with all these minor, "one-shot" variations would vastly complicate and obfuscate the primary development branch...

Dec 6, 2011 at 9:25 PM

Here are some design ideas to try.

Option 1  - Keep doing what you are doing. TFS version control can support that just as well as VSS did. Why change if you are happy with your model?

If you are unhappy with your current design, keep reading.

Option 2 - Keep your main branch, against which you develop enhancements, and simply make 70 branches directly off of it, each a full branch of the entire main branch. No need for sparse branches.  TFS does not needlessly duplicate files on the server, so you won't have 70 x the project size on the server.  Then update each branch with its unique artifacts, check in, and you are done setting up your new structure. Do builds for each customer from their branch.  Whenever a customer is ready to take a newer version, forward integrate (merge from Main to the Customer branch). In this model, you will never reverse integrate (merge from Customer to Main) because customer changes don't belong in the common product.  Regarding disk space, the server may be smart, but the clients are less so: if a developer were to download all 70 branches, they WOULD have 70 x the project files. But why do that? A developer can get latest only on the branch that needs changing, update it with skinning changes or merge enhancements in, test, and check in.  The branch can then be deleted from the local drive when it is no longer needed.

One variation to consider for either option is to set up a single Development branch off of the main branch. Using my terminology, you are currently checking into the Main branch as part of your development process. With your current approach, do you ever find that developers are in the middle of making changes, the main branch is not fully working, and a customer needs the latest stable build, but you cannot give it to them because the code is in transition?  If not, then you are fine. If you have experienced this friction, then you would enjoy having one development branch. Developers can do their jobs in there without regard for stability.  And if a customer needs the latest stable code, a developer can merge from Main to the Customer branch, since Main does not get polluted with work in progress. Then you can update Main every time your Development branch has something safe, stable, and deployable.  In an ideal development process, every check-in is safe, stable, and deployable. But I know that in real life people sometimes make changes that break the product while those changes are in transition.  That is why I recommend a separate development branch.

Let me know what you decide to do and why. I'm interested to learn from your situation.

Dec 6, 2011 at 10:51 PM
Edited Dec 6, 2011 at 10:59 PM

First, many, many thanks for your input.  It is good to find an active TFS community.  Going it alone gets boring.

Option 1: The main problem is that TFS provides no built-in "delta branch" functionality that I know of.   (Feel free to educate me. I would love to find that delta branches are supported by TFS.)  And you cannot have two TFS repositories point at the same workarea folder.  (Which is what was done with VSS.)  This makes it hard to "overlay" a delta branch on a primary branch.   And we would want to do that to support developer workstation builds in Eclipse and provide the sort of immediate feedback developers get now when they edit a file (or generate a quick war file to test the app with the Tomcat that is embedded in Eclipse).  Also, if I am to make the delta branch work with TFS continuous builds, I would have to push past the XAML to the MSBuild XML to insert an overlay action between getting the source and compiling it.  This may be possible, but I hate complicated solutions.

Option 2:  I have been arguing for something like this.  The developers come back with the concern that when they upgrade they will not know exactly what was customized.  Today they merge a top-of-tree primary development branch with an old delta branch, and they know exactly what was customized: the customizations are exactly what is in the delta branch.  If they merge a primary development branch with a complete customer branch, they will not know which conflicts are simple upgrades of primary functionality and which are customizations.  They will risk losing customizations because they won't have an inventory of what they were.  This, they say, will introduce significantly greater labor costs and more errors.
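
One way to picture the "inventory" the developers are asking for: with full customer branches, the list of customized files can in principle be recovered by comparing a customer tree against the core version it was based on. The sketch below is an assumption-laden illustration, not a TFS feature: the class CustomizationInventory and its method are hypothetical, and it assumes both trees have been fetched to local folders. TFS has its own compare tooling that may serve the same purpose.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper: lists files in a customer tree that differ from a baseline tree. */
final class CustomizationInventory {

    /**
     * Returns the relative paths of files that exist only in the customer tree or whose
     * bytes differ from the baseline. Files deleted in the customer tree are not reported.
     */
    static SortedSet<String> customizedFiles(Path baseline, Path customer) throws IOException {
        List<Path> files;
        try (Stream<Path> walk = Files.walk(customer)) {
            files = walk.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        SortedSet<String> customized = new TreeSet<>();
        for (Path file : files) {
            Path relative = customer.relativize(file);
            Path original = baseline.resolve(relative.toString());
            boolean same = Files.isRegularFile(original)
                    && Arrays.equals(Files.readAllBytes(original), Files.readAllBytes(file));
            if (!same) {
                // Normalize separators so the report reads the same on every platform.
                customized.add(relative.toString().replace(File.separatorChar, '/'));
            }
        }
        return customized;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: CustomizationInventory <baselineDir> <customerDir>");
            return;
        }
        customizedFiles(Paths.get(args[0]), Paths.get(args[1])).forEach(System.out::println);
    }
}
```

The resulting list plays the role the delta branch plays today: it names exactly the files that carry customer work, so a merge conflict on any other file can be treated as a plain core upgrade.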

Regarding the variation, I agree that it would be worth considering.  There are certainly times when more than one customer is being worked on, but I think I should get past the issue in the previous paragraph before I suggest both a "ready to ship mainline" and a primary development branch.

Dec 7, 2011 at 2:16 AM

You're welcome. I feel the same way.

Which version of TFS are you using?

I don't understand what a delta branch is.  I do know that you can branch at as low a level as you like. Although I normally branch at a very high level (for good reasons), you can branch particular folders at any level. Let's say you had CSS customizations in a folder named CONTENT or something like that. You could make branches for just that folder and name them after the customer.  Then you can play the same game you play today (I think).  This way, you get 70 sparse branches, with perhaps a few folders out of the total.   But with your current approach, how do your developers easily test a particular customer's work? Don't they have to get the main line then overlay to test?

About option 2 and the objection: there is a great answer to that. When you merge changes from the main branch to the customer branch, TFS does not just DO it and step on anything that happened in the customer branch. It makes the changes in your local workspace and checks out all the files that were changed.  The person doing the merge must then examine the files that were checked out. Presumably your folks know which files should be left alone. If the main development team altered skinning files, then the developer doing the merge can choose to keep the customer version of those files and disregard the enhancement, though one would want to explore the implications of ignoring a mainline change. But if the main team is not changing skinning files, then you won't even see anything to worry about on the merge.  And with this model, any developer can easily open up a customer's branch and run it to immediately see how it looks, without piecing stuff together.  It just feels much easier to me.

I recommend you create a new TFS project as a sandbox unless you already have one. Then set up some experiments and try out the options I listed or make up your own.  Practice branching and merging and see how it works for yourself.  A sandbox is a wonderful learning tool.  It will be totally isolated and won't hurt anyone.  If you are new to TFS version control and don't feel you know how to experiment with this, I recommend you hire a partner who knows branching with TFS to come in for a few hours and play through some scenarios, perhaps starting with the ones we discussed. If you need a reference to a partner in your area, you can contact Microsoft or just tell us what city you work in and someone will be glad to offer a suggestion.

Developer
Dec 7, 2011 at 2:19 AM

David,

Thanks for your answers on this forum. They are insightful and helpful.

Regards,

Bill Heys
VS ALM Ranger

Dec 7, 2011 at 2:51 AM

Thanks for the kind words, Bill.

Oh, and ajenny, I forgot to ask whether your team uses Team Explorer Everywhere.  I heard about it at the recent ALM Summit. It's apparently a tool that includes plug-ins for people who use other IDEs, like Eclipse, on other platforms such as Mac and Linux.  I mention it because having good tooling can make the experience much more pleasant for the developers.

David Kreth Allen
Consultant

Dec 7, 2011 at 3:31 PM
Edited Dec 7, 2011 at 10:58 PM

To answer your questions:   We are using TFS 2010, Eclipse 8 and 9 (for Spring) w/ TEE.  I do have a sandbox and I have been experimenting with TFS branching. 

A delta branch is a collection of some (but not nearly all) of the files in the primary development branch source tree.  Files in the delta branch are organized into the same folder structure as the primary development branch, but most of the files in the primary development branch are not included in the delta branch.  Just the files that had to be changed for a specific customer delivery are included in the delta branch.  The changed files can be from anywhere in the primary development branch tree.  They are not located in a single folder nor are they always the same set of files.  Anything can be touched.

Regarding the folder and/or file level branching:  This sounds interesting, but to use it for one of the delta branches described above, I would have to be able to pick multiple files and/or folders from anywhere in the primary development branch tree and include them all in a single branch.  Can I do this?  Or, when we branch at the folder and/or file level, is each folder and/or file branched a separate branch?  Also, if I am able to create a delta branch, is there a way to use it in combination with a primary branch at build time (without digging into the MSBuild XML)?

Regarding the manual merging:  I think efficiency would more-or-less depend on having the person that did the original work do the merging (and do it while details were still easily remembered).  That is not always possible.  People move on to other jobs and (believe it or not) some of our customers wait years before they press for updated primary functionality and/or new customizations.  The delta branch is our unambiguous record of exactly what changes were made for each customer delivery.  A labeled version of the primary development branch plus a delta branch is our unambiguous record of the entirety of the source that was used for each customer delivery.

Dec 7, 2011 at 4:40 PM

Folder level branching seems inappropriate and messy for what you want. Each folder branched is a separate branch.  So I would not go down that path.

Your observation on manual merging implied that it might be too difficult given programmer turnover and the length of time between upgrades (merges).  But how is it any easier with your current system? If you take a mainline code base that has evolved for two years, take customizations made to a version that existed two years ago, and drop the customizations on top of a copy of the new mainline, I would expect a great deal of care would be required to ensure the result still works. There is no guarantee that the customizations made two years ago will work with the new mainline. Using merging tools, you get to see the conflicts and resolve them one at a time.  Either way, the challenge, as you pointed out, comes from the essential fact that upgrades are taken at long intervals.  No matter what tools you use, you will need to exercise care in merging the result. Merely overwriting does not make it easier. Oh, it is quick and easy to overwrite the new stuff with the old changes.  But that just defers the difficulty to testing and debugging instead of examining the conflicts before you drop them into place. Or am I missing something? This upgrade scenario seems like one of the most complex, and I want to be sure I understand it.
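
To make the point about overwriting concrete, here is a rough sketch of the check a merge effectively performs and a plain overlay skips: for each customized file, did the mainline also change it since the customer's baseline? Everything in it is hypothetical (the class UpgradeReview, the Status names, and the representation of a snapshot as a map from relative path to a content hash); it only illustrates the reasoning above and does not describe how TFS merges internally.

```java
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical helper: flags customized files that the mainline has also changed. */
final class UpgradeReview {

    enum Status {
        CUSTOMER_ONLY,     // file exists in neither the old nor the new mainline
        UNCHANGED_IN_MAIN, // mainline copy is identical to the one the customization started from
        CHANGED_IN_MAIN    // mainline moved on (or added the file); blind overwrite would lose that work
    }

    /**
     * @param base       path -> content hash for the mainline version the customer was built from
     * @param main       path -> content hash for the current mainline
     * @param deltaPaths relative paths of the customer's customized files
     */
    static Map<String, Status> classify(Map<String, String> base,
                                        Map<String, String> main,
                                        Set<String> deltaPaths) {
        Map<String, Status> report = new TreeMap<>();
        for (String path : deltaPaths) {
            String before = base.get(path); // null when the file was absent
            String now = main.get(path);
            Status status;
            if (!Objects.equals(before, now)) {
                status = Status.CHANGED_IN_MAIN;
            } else if (before == null) {
                status = Status.CUSTOMER_ONLY;
            } else {
                status = Status.UNCHANGED_IN_MAIN;
            }
            report.put(path, status);
        }
        return report;
    }
}
```

Files reported as CHANGED_IN_MAIN are the ones where overwriting silently discards two years of mainline work; those are the files a merge tool would present as conflicts to be resolved one at a time.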

In any case, here is another design to consider.  This one does a better job of preserving each customer's changes in a form that is easy to view and compare against.  We still have one main line of code for the core.  Each customer gets a full branch that we'll call their development branch.  Off of that, each customer gets one Release branch.  So with 70 customers, you would have 70 x 2 + 1 main = 141 full branches.

Process Rules:

  1. When you want to upgrade a customer, you
    1. merge from main to their development branch, and overwrite anything in development to essentially make it resemble the customer's release version. Checkin everything changed in Customer development. Now development looks like their release (what they have today).
    2. merge main to customer's development branch.  Test the software, and reconcile any changes required.  Check in as needed. Then when you consider it "done", you can merge this back up to the customer's release branch and package and deploy from there.  Or for a variation, you could make another branch off their development branch for each release. That would be a way to preserve previous releases for easy view and comparison.
  2. You never merge from a customer development branch to main.

Let me see if I can summarize your requirements, now that we've discussed for a bit and they have emerged:

  • You have lots of customers.
  • Each has their own variation on the software. 
  • You need to be able to upgrade any one of them to take changes in a main branch.
  • Upgrades may happen infrequently.  They may wait one, two , or more years before they ask for an upgrade.
  • You make changes to the core with a frequency of ____ (we did not talk about that yet).
  • You must be able to grab what they have at any point and view or change it if needed. And you need to easily test the result.

Are there any other requirements that have emerged so far?

Dec 7, 2011 at 5:19 PM
Edited Dec 7, 2011 at 6:14 PM

Your requirements look good.  Upgrades to core happen once or twice a year.

Your suggestion looks interesting, but I'm not sure I understand your process rules.  Both 1.1 and 1.2 are merges from main to a customer development branch?  Also, if customer development happens in the development branch, won't the customer development branch and the customer release branch be the same (less any late-breaking bug fixes)?  Or is the customer development branch the version of core that was used to start the customer-specific work, and the release branch the place where the customer-specific work was done?  Please clarify.  We may be on the right track here.

You say

Process Rules:

  1. When you want to upgrade a customer, you
    1. merge from main(?) to their development branch, and overwrite anything in development to essentially make it resemble the customer's release version. Checkin everything changed in Customer development. Now development looks like their release (what they have today).
    2. merge main to customer's development branch.  Test the software, and reconcile any changes required.  Check in as needed. Then when you consider it "done", you can merge this back up to the customer's release branch and package and deploy from there.  Or for a variation, you could make another branch off their development branch for each release. That would be a way to preserve previous releases for easy view and comparison.
  2. You never merge from a customer development branch to main.

Developer
Dec 7, 2011 at 6:18 PM

First, in some ways, the way in which TFS stores files when branching seems similar to your *delta branch* concept. I wrote a blog post on this a while ago (http://blogs.msdn.com/b/billheys/archive/2011/05/05/how-tfs-stores-files-and-calculated-deltas-on-versioned-files.aspx).

In essence, when you make a full child branch (branching from a parent branch), TFS does not redundantly store all of the content in both branches. Essentially the new branch contains meta data but not content for all of the files contained in the branch. Only when changes are made subsequent to the branch are copies made (or deltas) of the changed files. 

Second, I agree that David's Process Rules steps 1.1 and 1.2 appear redundant. I think it would be helpful to step back and draw a picture of the proposed branching structure. I might propose something like this:

For the core product, create a branching structure that consists of Main, Dev, and Release branches. In this way, the Core product could be released on a different cadence from the individual, specialized customer branch structures.

Since each of the customer releases *starts* with a specific version of the core project, you might consider creating a dependency from each Customer branch plan back to the Core project branch plan. You might consider a Main/Dev/Release structure for each customer. The Main branch might be a child branch of a branch in the core project. From this Customer-Main branch, you could create a Customer-Dev branch and a Customer-Release branch. Granted, this will create perhaps 70 sets of branches (perhaps three branches per customer).

The structure might look like this:

Core-Main
    | Core-Dev
    | Core-Release 1.0
    | Core-Release 2.0
    | Core-Release 2.1

Each Customer would have:

Customer 001-Main
    | Customer 001 - Dev
    | Customer 001 - Release 1.0
    | Customer 001 - Release 1.1
    | Customer 001 - Release 2.0

and so on.

Key here is understanding the relationship between the Customer branch structures (e.g. Customer 001 - Main) and the Core Branch Structures (e.g. Core - Main)

If you want to ensure that each customer stays in sync with upgrades to the core project, you might consider creating a branching relationship where Core - Main is branched to create each of the 70 Customer - nnn - Main branches.

Any time the code for a specific customer needs to accept updates from the Core code, you would merge Core-Main to Customer-nnn-Main.

From there, you would merge updates from Customer-nnn-Main to Customer-nnn-Dev on a frequent (daily) basis. But you would merge Customer-nnn-Dev to Customer-nnn-Main only when you reach a milestone and want to begin stabilizing a new release for the Customer.

Once you stabilize Customer-nnn-Main you would branch it for release to Customer-nnn-Release.

With this branch design, you have a set of main, development and release branches for core and for each customer. The main branch for core is separate from the main branch for each customer. 

For each version (of either core or customer), follow the basic guidance: merge daily from Main to Dev; merge from Dev to Main when it is ready to ship; stabilize in Main; branch Main to Release; never check changes into a Main branch; never merge from Main to Release after it is created; etc.
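
As a way of summarizing the merge directions discussed in this thread, the rules could be written down as a small lookup like the one below. It is only a reading aid under stated assumptions: the class MergeRules, the Branch and Verdict names, and the NOT_DISCUSSED fallback are all hypothetical, TFS does not enforce anything like this by itself, and directions nobody mentioned in the thread are deliberately left undecided rather than guessed.

```java
/** Hypothetical summary of the merge directions discussed in this thread for one customer. */
final class MergeRules {

    enum Branch { CORE_MAIN, CUSTOMER_MAIN, CUSTOMER_DEV, CUSTOMER_RELEASE }

    enum Verdict { ALLOWED, FORBIDDEN, NOT_DISCUSSED }

    static Verdict check(Branch from, Branch to) {
        if (from == to) {
            return Verdict.NOT_DISCUSSED; // merging a branch into itself is not a meaningful case
        }
        if (to == Branch.CORE_MAIN) {
            // Stated earlier in the thread: customer changes never flow back into the common product.
            return Verdict.FORBIDDEN;
        }
        if (from == Branch.CORE_MAIN && to == Branch.CUSTOMER_MAIN) {
            return Verdict.ALLOWED; // the customer accepts updates from core
        }
        if (from == Branch.CUSTOMER_MAIN && to == Branch.CUSTOMER_DEV) {
            return Verdict.ALLOWED; // frequent (daily) merge from main to dev
        }
        if (from == Branch.CUSTOMER_DEV && to == Branch.CUSTOMER_MAIN) {
            return Verdict.ALLOWED; // only at a milestone, when dev is ready to stabilize
        }
        if (from == Branch.CUSTOMER_MAIN && to == Branch.CUSTOMER_RELEASE) {
            // The release branch is created by branching main; merging into it afterwards is ruled out.
            return Verdict.FORBIDDEN;
        }
        return Verdict.NOT_DISCUSSED;
    }

    public static void main(String[] args) {
        for (Branch from : Branch.values()) {
            for (Branch to : Branch.values()) {
                System.out.println(from + " -> " + to + ": " + check(from, to));
            }
        }
    }
}
```

Running the main method prints the full matrix, which may be easier to discuss with a team than prose rules; the NOT_DISCUSSED cells mark decisions the team would still have to make.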

Alternatively, you could consider a different relationship between customer and core. You could establish assembly references from Customer-nnn-Main to Core-Release-V1 (or v1.1 or v2.0, etc.).

When you want to upgrade a customer from v1 to v2 of Core, you simply change the Customer's main branch to point to a newer version of the Core assemblies.

Hope this helps,

Regards,
Bill Heys
VS ALM Ranger 

Dec 7, 2011 at 10:53 PM
Edited Dec 7, 2011 at 10:57 PM

Your suggestion seems the right way to move from an "overlay" approach to a "complete branch" approach.  I will present it to our developers.  I suspect they will come back with concerns about complexity, but we shall see.

Thanks to both of you for your generous contribution of time and interest.   TFS is a great tool and I look forward to working with it.

Dec 8, 2011 at 8:49 PM

Oops, I see I mistyped it.  In 1.1 I meant to say "merge from customer's release branch to customer's development branch".  I like Bill's suggestion of drawing a picture. I design branches with pictures, and I find it hard to convey them without a whiteboard.  I'm glad Bill jumped in here. This one is complex, and his knowledge of the product is much deeper than mine.

Anyway, it sounds like you have enough raw material to think this through with your team.  Best wishes on whatever you decide.