A tester wrote me the following query (abbreviated by me below):
Subject: BVTs, Check-ins, and Automation Methodologies
BVTs accumulate over time, come from different authors, and cover different code paths (hopefully). They are known to those outside their direct operation or maintenance by their titles: the ViewerBVT or the EventLogBVT, for example. Someone working in the EventLog branch will typically look to the EventLogBVT for validation. What we lose here, when we take the tests in individually, are those tests that are covered in another branch’s BVT. If a core function of both Viewer and EventLog is shared, it only needs to be covered once in the BVT suite, as BVTs typically eliminate redundant tests across the entire suite. When we, in an attempt to improve our development process, require a BVT prior to check-in, we can find ourselves running an inappropriate suite of tests for the particular code change and checking in with false confidence. What we need, then, is some way to use an appropriate subset of our existing BVT tests effectively (or even to generate an appropriate BVT from the set of all existing tests).
My thought here is to establish a standardization of BVTs (any test suite really, but BVTs are a subset easily grabbed hold of). A system where each test is flagged with the code it covers. A system where any individual test can be executed by itself, without running the entire package, or even the entire package in which it resides. A system where, through use of keyword query, any user can generate an appropriate and necessary set of tests to run against a code change prior to check-in.
Additionally, such a system should alert the user when their code is not currently covered under any existing BVT.
Tests would need to be designed with the capability to contain a keyword classification and execute as a standalone test. As a user in search of appropriate tests, if your keyword doesn’t get a hit in the query you know your code isn’t covered under any existing BVT scenario. Right up front you know you either need to take more time reviewing this code, or work to develop appropriate tests prior to check-in. My lead thinks this idea is worth constructing a proof of concept. I think it is worth finding out what others are thinking along these lines, to consolidate interest and avoid needless redundancy.
Are people already implementing this? Does it already exist? Why aren’t we already doing this?
My reply is that I know of at least two efforts inside Microsoft that address this.
You outline two different things:
1) A keyword-based system of test selection. I have always supported this, and several different Test Case Managers (TCMs) at Microsoft allow it. Most TCMs consider this a minor feature (but I agree with you that it is very useful). I prefer keyword management and selection to fixed hierarchies (and symbolic links for multiple hierarchies).
2) Selecting tests based on (code) coverage.
I’ve been amazed at how poorly human keyword selection of tests performs, in general, compared with automated test selection based on coverage. I don’t follow this claim:
“As a user in search of appropriate tests, if your keyword doesn’t get a hit in the query you know your code isn’t covered under any existing BVT scenario.”
Different people think of different keywords for the same thing, so not finding the keyword you think of doesn’t guarantee that what you need doesn’t exist.
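To make the objection concrete, here is a minimal sketch of keyword-based selection. The `TaggedTest` type, the `select_by_keyword` helper, the test names, and the keyword tags are all hypothetical and do not come from any real TCM; the point is only that an empty query result does not prove the code is uncovered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedTest:
    """Hypothetical test record: a name plus the keywords its author chose."""
    name: str
    keywords: frozenset


def select_by_keyword(tests, keyword):
    """Return the names of tests tagged with the keyword (case-insensitive)."""
    wanted = keyword.strip().lower()
    return [t.name for t in tests if wanted in t.keywords]


# Invented catalog, for illustration only.
CATALOG = [
    TaggedTest("EventLogBVT.WriteEntry", frozenset({"eventlog", "write"})),
    TaggedTest("ViewerBVT.OpenLog", frozenset({"viewer", "eventlog", "open"})),
    TaggedTest("ViewerBVT.RenderRows", frozenset({"viewer", "display"})),
]

# A keyword the authors happened to use finds tests in both BVTs.
print(select_by_keyword(CATALOG, "EventLog"))

# A developer who thinks "render" gets no hits, although a rendering
# test exists; its author tagged it "display" instead.
print(select_by_keyword(CATALOG, "render"))
```

The second query comes back empty even though `ViewerBVT.RenderRows` exercises the relevant code, which is the failure mode described above.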
There are a few different projects addressing test selection based on code coverage.
Product Team A has the following requirements and process.
Checkin Test Requirements
• Unit tests are fully automated and must pass 100% (owned by Devs)
• Code coverage on new/changed code is minimum 60% with a goal of 85% (the milestone exit criteria)
• Test automation and product code are checked in at the same time
• Smoke build and BVT pass
Pre-checkin Testing Process
Phase 1 — Unit testing
1. Engineer designs and writes automated tests to address the code changes.
2. Engineer requests the Product-A Code Coverage system to process the set of private binaries that have been changed.
3. (Automated) Product-A Code Coverage system instruments the private binaries and detects the code changes.
4. (Automated) Test prioritization system selects the minimum set of existing test cases to achieve maximum coverage on the changed code and packs the results in a format that can be directly consumed by the Test Automation system.
5. Engineer performs a test pass against the private bits using the set of tests from steps 1 and 4. Code coverage data is collected automatically by the Test Automation system.
6. Product-A Code Coverage system reports the code coverage status of the code churn; if more tests are needed, go back to step 1.
(Checkin requirement is minimum 60% block coverage on the changed code.)
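Steps 4 and 6 can be sketched roughly as follows. This is not Product Team A's implementation, which is not described here; it assumes coverage data is available as a mapping from test name to the set of block IDs the test touches, and it uses a greedy set-cover heuristic, which approximates rather than guarantees a true minimum set. The function names and the sample data are invented.

```python
def select_min_tests(coverage, changed_blocks):
    """Greedy set cover: repeatedly pick the test that covers the most
    still-uncovered changed blocks. Returns (selected tests, uncovered blocks)."""
    remaining = set(changed_blocks)
    selected = []
    while remaining:
        # sorted() makes tie-breaking deterministic (alphabetical).
        best = max(sorted(coverage), key=lambda t: len(coverage[t] & remaining))
        gained = coverage[best] & remaining
        if not gained:
            break  # no existing test reaches the remaining blocks
        selected.append(best)
        remaining -= gained
    return selected, remaining


def meets_checkin_bar(changed_blocks, uncovered, minimum=0.60):
    """Check block coverage on the changed code against the check-in minimum."""
    total = len(changed_blocks)
    if total == 0:
        return True
    return (total - len(uncovered)) / total >= minimum


# Invented coverage data: test name -> block IDs it touches.
coverage = {"t1": {1, 2, 3}, "t2": {3, 4}, "t3": {4}, "t4": {9}}
changed = {1, 2, 3, 4, 5}

selected, uncovered = select_min_tests(coverage, changed)
print(selected)                               # ['t1', 't2']
print(uncovered)                              # {5}
print(meets_checkin_bar(changed, uncovered))  # True (4 of 5 blocks = 80%)
```

Block 5 is reached by no existing test, which is the signal in step 6 to go back to step 1 and write a new one if the coverage bar is not yet met.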
The other comes from the Product B team.
SelectTests is a test prioritization tool produced by the Product B team. Its intended use is to allow a tester or developer to determine which tests need to be run on a private build of a given binary. SelectTests uses the Echelon tool from the Magellan team to perform the test prioritization and adds ease of use and other features on top of it.
Echelon uses Vulcan technology to compare two binaries. It determines which blocks are new or changed; together these are called impacted blocks. It then accesses the code coverage information at our code coverage web site to determine which traces (aka tests) touch at least one of those impacted blocks. This is done using a code flow prediction algorithm.
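As a rough illustration of the idea, and not of Echelon or Vulcan themselves, the sketch below models each binary as a mapping from block ID to a content fingerprint and each trace as the set of block IDs it touched. That representation, the function names, and the data are all assumptions. The sketch matches only on recorded coverage and makes no attempt at code flow prediction, so a brand-new block that no trace has ever touched selects nothing.

```python
def impacted_blocks(old_binary, new_binary):
    """Blocks that are new in new_binary or whose fingerprint changed."""
    return {
        block
        for block, fingerprint in new_binary.items()
        if old_binary.get(block) != fingerprint
    }


def impacted_traces(trace_coverage, impacted):
    """Traces (tests) that touch at least one impacted block."""
    return sorted(
        name for name, blocks in trace_coverage.items() if blocks & impacted
    )


# Invented data: block ID -> fingerprint of the block's contents.
old = {"b1": "aa", "b2": "bb", "b3": "cc"}
new = {"b1": "aa", "b2": "bb-changed", "b3": "cc", "b4": "dd"}

# Invented coverage recorded against the old build.
traces = {
    "EventLogBVT": {"b1", "b2"},
    "ViewerBVT": {"b1", "b3"},
    "SetupBVT": {"b3"},
}

impacted = impacted_blocks(old, new)
print(sorted(impacted))                   # ['b2', 'b4']
print(impacted_traces(traces, impacted))  # ['EventLogBVT']
```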
If sorting into sets is enabled, Echelon runs an algorithm on the traces to group them into sets. A trace appears in at most one set. Each set is the minimum number of tests needed to get the maximum amount of coverage of the impacted blocks. The sets are listed in order of the coverage they provide.
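The grouping algorithm itself is not described here, so the following is only one plausible reading: build the first set greedily, remove its traces from the pool, and repeat on what is left, so that no trace lands in more than one set. The greedy step is an approximation of a minimum set, and the function names and data are invented.

```python
def greedy_cover(pool, impacted):
    """Pick traces greedily until no trace adds coverage of impacted blocks."""
    remaining = set(impacted)
    chosen = []
    while remaining:
        best = max(sorted(pool), key=lambda t: len(pool[t] & remaining))
        if not pool[best] & remaining:
            break
        chosen.append(best)
        remaining -= pool[best]
    return chosen


def group_into_sets(traces, impacted):
    """Partition impacted traces into sets, ordered by impacted blocks covered.
    Returns a list of (trace names, number of impacted blocks covered)."""
    # Keep only traces that touch an impacted block, restricted to those blocks.
    pool = {t: b & impacted for t, b in traces.items() if b & impacted}
    sets = []
    while pool:
        chosen = greedy_cover(pool, impacted)
        covered = set().union(*(pool[t] for t in chosen))
        sets.append((chosen, len(covered)))
        for t in chosen:
            del pool[t]  # a trace appears in at most one set
    sets.sort(key=lambda s: s[1], reverse=True)
    return sets


# Invented data: trace name -> block IDs it touches.
traces = {"a": {1, 2}, "b": {3}, "c": {1, 2, 3}, "d": {2}, "e": {7}}
impacted = {1, 2, 3}

for names, count in group_into_sets(traces, impacted):
    print(names, count)
# ['c'] 3
# ['a', 'b'] 3
# ['d'] 1
```

Under this reading, running only the first set gives the most coverage for the fewest tests, and later sets add redundancy for anyone with time to run more.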
[Note: SelectTests is not the algorithm from "A Safe, Efficient Regression Test Selection Technique".]