Review of Introduction to Combinatorial Testing

I received a copy of this book from ACM SIGSOFT Software Engineering Notes (SEN) in exchange for reviewing it.

Introduction to Combinatorial Testing
 Written by D. Richard Kuhn, Raghu N. Kacker, and Yu Lei
and published by CRC Press, ©2013, paperback, ISBN 978-1466552296, 319 pp., $71.95.

The book is the most comprehensive introduction to the technique of combinatorial testing I’ve seen.  It’s an interesting amalgamation of academic material and very practical guidance for using combinatorial testing.  The book is actually a compendium from several authors, and the examples can require a bit of study to glean what the authors are expressing.  Some chapters could easily be skipped by practitioners focused just on using the technique.

Most chapters have figures; tables of inputs, outputs, or tests; and sometimes pseudocode.  Each chapter ends with review questions and answers, which help readers judge their understanding of the material.

Even if the literature is inconsistent, it would have been nice if the authors had chosen a single nomenclature for the whole book rather than one per chapter.  For example, Chapter 1 defines “covering array CA(N, n, s, t)” while Chapter 14 defines “a fixed-value covering array denoted by CA(N,vk,t)”.  Similarly, Chapter 14 states “Methods for constructing CAs can be put into three categories: (1) algebraic methods … (2) metaheuristic methods … (3) greedy search methods”, while Chapter 15 tells us “covering array construction can be classified as either computational or algebraic”, and later breaks algebraic into “computing a mathematical function” and “recursive construction”.

The first four chapters introduce the concept and illustrate combinatorial testing. Chapter 1 introduces a lot of foundation quickly, which may overwhelm readers not used to formalism and mathematics.  Appendix A also provides some of the mathematical background.  Chapter 2 gives examples and rationale, along with Appendix B’s empirical data on software failures.  Chapters 3 and 4 illustrate configuration testing and input testing with the free ACTS tool from NIST, which is further described in Appendix D.
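To make the technique concrete before the chapter-by-chapter notes, here is a minimal pairwise (2-way) sketch of my own; it is not an example from the book, and the parameter names are invented. Exhaustively testing three parameters with two values each needs 8 tests, but a well-chosen 4 cover every pair of values:

```python
from itertools import combinations, product

# Three hypothetical test parameters, two values each.
params = {"os": ["win", "mac"], "browser": ["ff", "chrome"], "ipv6": ["on", "off"]}

# A hand-picked suite of 4 tests (the full cross product would need 8).
suite = [
    {"os": "win", "browser": "ff",     "ipv6": "on"},
    {"os": "win", "browser": "chrome", "ipv6": "off"},
    {"os": "mac", "browser": "ff",     "ipv6": "off"},
    {"os": "mac", "browser": "chrome", "ipv6": "on"},
]

def covers_all_pairs(suite, params):
    # Every value pair, for every pair of parameters, must appear in some test.
    for (p1, vals1), (p2, vals2) in combinations(params.items(), 2):
        needed = set(product(vals1, vals2))
        seen = {(test[p1], test[p2]) for test in suite}
        if not needed <= seen:
            return False
    return True

print(covers_all_pairs(suite, params))  # True: 4 tests give full 2-way coverage
```

Tools like ACTS work in the other direction, generating a small suite with this property for many parameters; this sketch only verifies the property.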

Chapter 5, Test Parameter Analysis, uses the classification tree method (CTM) for this critical step that precedes any combinatorial testing.  It also addresses the number of tests versus known error patterns, points to the practical need to identify “missing, infeasible, and ineffectual combinations”, and notes that “There are few bad tests – but a lot of unproductive ones”.

Chapter 6, Managing System State, introduces yet another notation, the direct product block (DPB), which is more understandable to computers than to humans.  This input mechanism for the commercial TestCover tool helps illustrate multiple models for testing UML state machines.

Chapter 7, on measuring combinatorial coverage, seems more theoretical (proposed measures) than practical (which measure helps most in empirical testing).  It also references the Best Choice Strategy papers, which discounted the effects of fault masking described elsewhere in this book.

Chapter 8, Test Suite Prioritization, presents the method and empirical results from ordering tests based on combinatorial covering.  The Combinatorial-based Prioritization for User-session-based Testing (CPUT) tool is described for creating test suites from user logs of real behavior.

Chapter 9 gives practical guidance about when to choose random values versus covering arrays.  Chapter 10 describes covering subsets for the factorial combination of sequences, but without evidence of the error finding effectiveness of this approach.

Chapter 11, Assertion-based Testing, and Chapter 12, Model-based Testing, attempt to address the oracle problem and are not specific to combinatorial testing.  If you go to the trouble of using a symbolic model verifier (SMV), you can generally do advanced test generation rather than just predicting output from combinatorial input as described here.

Chapter 13 “introduces the fault localization problem, describes the basic framework for fault localization using adaptive testing, and discusses how some of these approaches might be implemented.”  Unfortunately, no empirical data comparing these various approaches, or for delta debugging, are given to guide the practitioner in which to choose.

Chapter 14 gives a nice bit of history and helps distinguish the sometimes-confused orthogonal arrays from covering arrays.

Chapter 15 is background for those who want to understand the algorithms used in the tools.

While Appendix C gives pointers to several tools, I would like to have seen more pointers to practical materials, such as the list of tools at http://www.pairwise.org/tools.asp, and the Domain Testing Workbook by Cem Kaner, which complements the multiple short introductions given to equivalence classes, boundary value analysis, etc.  The examples are sometimes small and can thus be misleading; e.g., Table 6.2 “valid calendar dates” does not account for the century rule under which years such as 1900 and 2100 are not leap years.
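For reference, this is the Gregorian rule a complete date model must encode, as a quick sketch of my own (not an example from the book):

```python
# Leap years: divisible by 4, except century years, which must be divisible by 400.
def is_leap(year):
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

print([is_leap(y) for y in (1900, 2000, 2004, 2100)])  # [False, True, True, False]
```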

The authors have done little to reduce the burden on the reader of different nomenclatures in different chapters, but overall I liked the book, and it gave me greater depth and breadth of understanding of combinatorial testing.

Posted in software testing | 1 Comment

Reliable Backup is hard to do

After working with Cosmos at Bing and other similar services, and reading recommendations and descriptions of backup schemes, I believe in keeping triple copies.

So at home, we have two disk copies and a cloud copy of every file.
So I thought I was covered.   Not quite.
We have a lot of old static files (photos, videos, and old docs) that we keep on a read-only disc, with a backup on a disk kept in our basement.
For active files, we used a mirrored drive (two 3TB disks) from Buffalo.

Well, the Buffalo HD-WL6TU3R1 device failed.  It wouldn’t boot: when turned on, both disk-access lights blink red for a few seconds and then shut off.   The manual doesn’t describe this diagnostic code, and contacting Buffalo was useless; they just told us we were out of warranty.

My recommendations:

  • Don’t buy Buffalo
  • If you do buy Buffalo, toss it immediately after the warranty period, because it is useless.

My suspicion is that the controller in the device, a Single Point of Failure, failed.

No problem, I thought: I removed the individual drives and, using my Coolmax multifunction converter, temporarily hooked them up to the PC via USB.  They were readable, but about a third of the data, when copied, produced
“ERROR 87 (0x00000057) Copying File . . . The parameter is incorrect”
I tried running chkdsk to repair the disk, but it failed with several errors.

So now 2 of our 3 copies are incomplete.   No problem, we use Backblaze.

In general, I love Backblaze and have recommended it to all of my family and friends because it is truly unlimited backup, reasonably quick, doesn’t seem to slow down our systems, and works smoothly, quietly, automatically behind the scenes.   The annual price is extremely reasonable for backing up our current 6TB of data.

I have previously restored a few small files from Backblaze while travelling, after realizing the file I needed was on a computer back home.
I also previously restored my 200GB music library with a series of restores over just a couple of days using downloaded zip files.

We replaced our dead Buffalo 3TB mirror with an 8TB mirror (two 8TB disks), a WD MyBook Duo.   Now we have room to grow as GoPro 4K video consumes 27GB/hour (instead of HD’s 9GB/hour).

But now we had to restore the 2TB of data we had mirrored.
Backblaze provides a very reasonable alternative: for a refundable $189, they FedEx you a disk.  Ship the disk back within 30 days and they refund the $189; you pay only return shipping!

Only 2 problems:

  • It took over a week for Backblaze to “gather” the data and then to “build” the disk, before shipping it. Backblaze recommends doing online .zip restores for files you need during the week.   We had to do a few.
  • Backblaze keeps your “data”, but not the metadata, specifically the timestamps on the files. So all of your files lose their dates and appear created, modified, and accessed at exactly the same time: when the disk was built to send to you.

I find the lack of timestamps screws up a lot of things for us.  We can’t tell when a picture or video was taken unless the date happens to be embedded in the file’s own metadata (and most of our oldest pictures lack it).    Many of our business documents have multiple versions, and knowing when a tax document, corporate motion, or other such file was created or modified is useful.

So Backblaze gives us our data, but not our timestamps.   Beware.

Ultimately we chose a multi-step solution:

  • Restore the files I had made a fourth local offline copy of 8 months ago.
  • Restore the files we could read off the broken Buffalo raw disks.
  • Restore the files we really cared about using Backblaze ZIP files
    (750GB downloaded over 3 days in multiple 100GB downloads).
  • Restore any remaining missing files (about 15% of the total) using the Backblaze disk, with wrong timestamps.

Now, I think I need to create a periodic job that dumps all of the files’ timestamp data into a file, so Backblaze will back that up.   Then, after a restore, I can use a program to reset the timestamps to the ones I logged.   Backblaze can be a hassle.
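A minimal sketch of such a job, assuming a Python script run on a schedule; the function and manifest names are my own inventions, and only access and modification times are captured (Windows creation times would need extra work):

```python
import json
import os

def dump_timestamps(root, manifest):
    # Walk the tree and record each file's access and modification times.
    stamps = {}
    for dirpath, _, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            stamps[path] = (st.st_atime, st.st_mtime)
    with open(manifest, "w") as f:
        json.dump(stamps, f)

def restore_timestamps(manifest):
    # After a restore, re-apply the recorded times to files that still exist.
    with open(manifest) as f:
        stamps = json.load(f)
    for path, (atime, mtime) in stamps.items():
        if os.path.exists(path):
            os.utime(path, (atime, mtime))
```

Back up the manifest alongside the data, and the timestamps survive any restore that loses them.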



Software Testing: A Research Travelogue by Orso and Rothermel

An interesting survey of advances in software testing was given as an ACM SIGSOFT webinar:
“Software Testing: A Research Travelogue,” Alessandro Orso, Gregg Rothermel, Willem Visser.
It is based on the paper “Software Testing: A Research Travelogue (2000–2014)” by A. Orso and G. Rothermel, in Proceedings of the 36th ACM/IEEE International Conference on Software Engineering (ICSE 2014), FOSE Track (invited).

This 10-page paper is followed by 210 references! Most encouraging to me was the slide on Empirical Studies & Infrastructure.   I fully agree that testing is heuristic and thus must be empirically evaluated:

“• State of the art in 2005: study on 224 papers on testing (1994–2003)
None 52%, Case studies 27%, Experiments 17%, Examples 4%

Things have changed dramatically since then 

  • Empirical evaluations are almost required
  • Artifact evaluations at various conferences”

In their conclusion they also stated something I strongly believe in:

“Stop chasing full automation

  • Use the different players for what they are best at doing
    • Human: creativity
    • Computer: computation-intensive, repetitive, error-prone, etc. tasks “

I hope all professional testers are aware of the many topics they touched on:

  • Automated Test Input Generation using Dynamic Symbolic Execution,
    Search-based Testing, Random Testing, Combined Techniques.
  • Combinatorial Testing, Model-Based Testing, and Mining/Learning from Field Data
  • Regression Testing – selection, minimization, prioritization
  • Frameworks for Test Execution and Continuous Integration

Thoughts on Rimby’s How Agile Teams Can Use Test Points

At the February 3 SeaSpin meeting, Eric Rimby provided a discussion-provoking thought experiment around what he termed “Test Points” (analogous to Story Points). I didn’t quite have time to snapshot his slide, but I think he finally defined them something like:

“ The number of functional tests cases adequate to achieve complete coverage of boundary values and logical branches strongly correlates with effort to develop.”

He counts functional test cases specified as part of backlog grooming for a user story as “test points”.

While I’ve always had issues with counting test cases (e.g., “small” versus “large” tests, and Response to How many test cases by James Christie), he at least restricted the context in which the counted test cases were created.   He presumed a team trained in a particular test methodology for boundary values and logical branches (I suggested Kaner’s Domain Testing Workbook), and that team members compared notes over time. Another audience member afterwards indicated that Pex (or other tools) could probably auto-generate many of these cases. Similar to how a scrum team should become more uniform in ascribing story points over time, Eric expects the number of functional test cases estimated by various team members for a story to become sufficiently uniform over time.

While I disagree with many of the suppositions he made during his talk, I agree that the number of functional test cases estimated for a story might be a useful thing to track. Whether it correlates to anything remains to be measured. However, I think just getting teams to be better about their upfront Acceptance Test Driven Development (ATDD) as part of story definition can only help.

Abstract from Meetup.com , How Agile Teams Can Use Test Points
Test points are similar to story points. They can be used to estimate story size, track sprint progress, and normalize velocity across teams, among other things. Test points have some advantages that story points do not. They could be used instead of, or alongside, story points.


What are Synthetics?

I attended the PNSQC Birds of a Feather session “Why do we need synthetics” by Matt Griscom. It was advertised as “The big trend in software quality is towards analytics, and its corollary, synthetics. The question is: why and how much do we need synthetics, and how does it replace the need for more traditional automation?”
I spoke with Matt briefly to understand what he meant by synthetics, because I thought it was a rare, relatively unused term.

I learned a lot from other participants at the session.   First, New Relic is trying to stake out the term! http://newrelic.com/synthetics   (they describe test monitors as a Selenium-driven platform that sends data using turnkey test scripts).

Second, I attended a great talk, which I highly recommend, by former colleague from Bing:
Automated Synthetic Exploratory Monitoring of Dynamic Web Sites Using Selenium by Marcelo De Barros, Microsoft.

So synthetics are mainstream.   So what are synthetics?   What are not synthetics?   I had a hard time parsing synthetics as a corollary of analytics. Still do.
For me synthetics are tests that run in production (or production-like environments) and use production monitoring for verification.
Analytics are just a method, in this context, for doing monitoring.   To me synthetics are artificial (synthetic) data introduced to the system.

I thought synthetics were almost always automated, and that A/B testing would be a type of testing where synthetics wouldn’t apply.   I was proven wrong on both counts by a single example: using Amazon’s Mechanical Turk to pay people to choose whether they like A or B!     This is manual testing and synthetic, as it is not being done by the real user base.

Maybe the problem with “synthetics” is the same problem I have with “automation”.   Even “test automation” isn’t very specific, and means many things.   I’m not sure if “synthetics” is supposed to mean synthetic data (Wikipedia since 2009), synthetic monitoring (Wikipedia since 2006 – the description also uses “synthetic testing”), or something else.


How Data Science is Changing Software Testing – Robert Musson talk

I enjoyed Robert Musson’s recent Qasig.org presentation How Data Science is Changing Software Testing and recommend you watch it, or at least read Robert’s presentation slides, which don’t do it full justice but should tease you.
As the abstract stated: It will describe the new skills required of test organizations and the ways individuals can begin to make the transition.

I worked with Bob a few times while at Microsoft, and he truly has been one of the original data-science testers, doing data analytics for the past decade.   He says (37:50 into the video) the tide has turned recently: he has “seen more progress in last 6 months than seen in past 10 years.”

So now I need to learn

  • Statistics, e.g., r-value, p-value, Poisson and Gamma distributions;
    homogeneous (non-changing) or non-homogeneous (changing) Poisson processes for reliability measurements, to get me used to time analysis.
  • The R language (an open-source version of S):
    object-oriented, with many packages for exploratory data analysis and quick linear models.

So I can prepare for the mindset change.
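As a first step on that list, here is a homogeneous Poisson sketch of my own (the failure rate is an invented number, purely illustrative):

```python
import math

# P(k failures in an interval) for a homogeneous (constant-rate) Poisson process.
def poisson_pmf(k, mean):
    return mean**k * math.exp(-mean) / math.factorial(k)

lam, weeks = 2.0, 1.0                  # assume 2 failures/week on average
p_zero = poisson_pmf(0, lam * weeks)   # chance of a failure-free week
print(round(p_zero, 4))                # 0.1353, i.e. e**-2
```

A non-homogeneous process just lets the rate vary over time, so the mean becomes an integral of the rate rather than a simple product.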

The mindset change is from bug discovery to information discovery.

An audience member asked how to learn, and Bob recommended Coursera.org for many courses, including statistics courses.  He called out specifically
Model Thinking – Scott Page – U. of Michigan.
I love models, but I might also start with
https://www.coursera.org/course/datascitoolbox


Testing Magazines

Last week I listened to the webinar State of Software Testing.  Tea Time With Testers approached testing experts Jerry Weinberg and Fiona Charles to review the results of its 2013 “State of Software Testing” survey.   Mostly the experts indicated the questions were poor and thus the results irrelevant.   But one side comment caught my attention, since they both agreed:

every tester should read at least one test magazine a month.

While I do that, I recently asked a colleague, and he said no and wasn’t even aware of the many free online test magazines available.  Thus this post, listing several test magazines to choose from.     Note, I consider this an addition to the poll I’ve often heard, which I also agree with: have you read at least one book on software testing, and better, have you read at least one book on software testing every year?

So now my belief is that professional testers, as part of their continuing education, should read at least one book a year and one magazine a month about software testing.    Maybe follow some blogs as well; many of the magazine sites have associated blogs.

Since coming to Salesforce.com, I discovered two of these magazines thanks to posts by Reena Mathews.     Not in priority order, just a count:

  1. Tea Time With Testers
  2. Automated Software Testing
  3. Better Software
  4. Professional Tester
  5. Testing Trapeze
  6. Software Test & Quality Assurance
  7. Testing Experience
  8. Testing Circus
  9. Testing Magazine

Non-free magazines:

  10. ASQ Software Quality Professional       http://asq.org/pub/sqp/index.html

Blog examples:

There are many sites that list lots of software testing blogs.
