New vs Baselined tests

When a feature team at Microsoft completes a chunk of work (new feature or set of bug fixes), they "reverse integrate" (RI) into the main branch of the source control system.
From: A fellow tester
Subject: Test methodology question

I wanted some advice on a situation I am facing at the moment. I am having an involved debate with my Dev lead and I was wondering what you would say in a situation such as this. So without further ado, here is the brief description. I hope I am able to describe the nuances accurately.


My team delivers into Longhorn and so we do frequent RIs in WinMain. My team has defined a set of OS configurations that are tested as part of the RI pass. We start running the RI pass approximately 5 days before the deadline. We run whatever tests we have available.


The dev team, however, wants me to “lock down” the tests before the RI pass. They want the test team to run only tests that we ran in the previous week and for which we have results. In other words, the dev team wants the test team to execute only tests that have an established baseline on builds prior to the RI build. So, if a new test comes online during the RI test pass week, they don’t want test to run it on the RI build unless test has already run it on a previous build to generate a baseline.


With this approach, if the new test finds a bug, it is easier to ascertain whether the RI bits introduced the bug or whether it was already present in the build. This information could impact our decision to RI. Their argument is that if a bug already exists in WinMain, then we should not stop the RI, since the new bits will not worsen the quality of MSMQ in Main. This is a very RI-centric approach, where the focus is on getting the RI through and quality takes a backseat. Since my test team is in the middle of writing new tests for old features, new tests have come online during RI passes and will continue to do so in the future.


As you would expect, my instinct is to run any test I can get my hands on as soon as possible. I want to find the bug first and figure out later whether it exists on previous builds. My methodology is in direct contrast to the dev team’s approach. My approach would be to find the bug, log it, and triage it as a team, just like ship room does in any established team. I feel my testing is being hamstrung and unreasonably influenced by Dev’s approach.

What do you think? Any words of wisdom?

From: Keith Stobie
Subject: RE: Test methodology question

A wonderful question and also one I have had to ponder many times.

Do we keep a steady baseline or do we allow new tests?


You’ve outlined the pros and cons pretty well.  I think the answer, in your case, boils down to the triage question.

Yes you can run the test and file the bugs, BUT do they impact (prevent) the RIs?

The devs appear to have a valid point, about old bugs not blocking the RIs – discovery of the old bug doesn’t lower the quality, the quality was always that low.

So ultimately, unfortunately, it becomes a risk assessment for the team as a whole to agree on:

            If we have (new) bugs that we don’t know whether they are regressions or pre-existing, should we RI the code?

It comes down to confidence in your current set of tests.

            Your devs appear confident that the current set of tests is sufficient to catch most regressions and thus the risk of regression is low.

            You appear to lack confidence that the current set of tests is sufficient, and thus believe the risk of regression is high.

Do you or the devs have any data to support a risk assessment?

            If the code hasn’t changed, I support the Devs – chance of regression is very low, new tests probably just find pre-existing bugs.

            If the code has changed, what is your devs’ regression history?


Ultimately the team needs to agree on the risk assessment – in this case the likelihood that new tests are uncovering regressions in changed code.


If your history is that 80% or more of the time the new tests have been detecting existing bugs instead of new regressions, then I wouldn’t block the RIs for bugs found by new tests.

If 20% of the time or more the new tests are finding regressions, then blocking the RI due to possible regression seems prudent.
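As a rough illustration of the rule of thumb above, here is a minimal Python sketch that computes the share of regressions among past bugs found by new tests and compares it to the 20% threshold. The record type, the function names, and the choice to block when there is no history at all are assumptions made for this example; they are not part of any real RI tooling.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NewTestBug:
    """Hypothetical record of one bug found by a newly added test."""
    bug_id: str
    is_regression: bool  # True if triage showed the RI bits introduced it


def regression_rate(history: List[NewTestBug]) -> Optional[float]:
    """Fraction of past new-test bugs that turned out to be regressions."""
    if not history:
        return None  # no data to base a risk assessment on
    return sum(bug.is_regression for bug in history) / len(history)


def should_block_ri(history: List[NewTestBug], threshold: float = 0.20) -> bool:
    """Block the RI when new tests find regressions at or above the threshold.

    The 20% default comes from the rule of thumb in the reply. Blocking when
    there is no history is an assumption of this sketch (the conservative
    choice); a team might decide otherwise.
    """
    rate = regression_rate(history)
    if rate is None:
        return True
    return rate >= threshold
```

For example, a history of ten bugs found by new tests, one of which was a regression, gives a rate of 10%, so this sketch would not block the RI; two regressions out of ten would.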




About testmuse

Software Test Architect in distributed computing.