A big part of our thesis is that TDD is not really a testing activity, but
rather a specifying activity that generates tests as a very useful side
effect. For TDD to be a sustainable process, it is important to
understand the various implications of this distinction. [1]
Here, we will discuss the way our tests are structured when we seek to use
them as the functional specification of the system.
A question we hear frequently is "how does TDD relate to BDD?" BDD is
"Behavior-Driven Development", a term coined
by Dan North and Chris Matts in their 2006 article "Introducing BDD"
[2]. Many have drawn distinctions between TDD, ATDD, and BDD, but
we feel these distinctions are largely unimportant. To us, TDD is
BDD, except that we conduct the activity at a level very close to the code, and
automation is much more critical. Also, we contend that “development” includes
analysis and design, and thus what TDD enables is more accurately described as
“behavior-based analysis and design”, or BBAD.
In BBAD, the general idea is that the "unit" of software that is
being specified is a behavior. Software is behavior, after all.
Software is not a noun, it is a verb.
Software’s value lies entirely in what it does, in the value the user
accrues as a result of its behavior. In essence, software only exists in
any meaningful sense of the word when it is up and running. The job of a
software development team is to take a general-purpose computer and cause it to
act in specific, valuable ways. We call these behaviors.
The nomenclature that North and Matts proposed for specifying each behavior
of a system is this: Given-When-Then. Here's a simple example:
Given:
    User U has a valid account on our system with username UN and password PW
    The login username is set to UN and the login password is set to PW
When:
    Login is requested
Then:
    U is logged in
Everything that software does, every behavior, can be expressed in this fashion. Each
Given-When-Then expression is a specific scenario that is deemed to have
business value, and that the team has taken upon itself to implement.
In TDD, when the scenario is interpreted as a test, we strive to make this
scenario actionable. So we think of these three parts of the scenario a
little differently: we "verbify" them to convert these conditions
into activities.
Imagine that you were a manual tester seeking to make sure the
system was behaving correctly in terms of the scenario above. You would
not wait around until a user with a valid account happened to browse to the
login page, enter his info, and click the "Login" button... you would
create or identify an existing valid user and, as that person, browse to the
page, enter the correct username and password, and then click the button
yourself. Then you'd check to see if your login was successful. You would
do all of these things.
So the Given wasn't given, it was done by the tester (you, in this
case); the When was not when, it was now do; and the Then was not a
condition but rather an action: go and see if things are correct.
"Given" becomes "Setup".
"When" becomes "Trigger".
"Then" becomes "Verify".
We want to structure our tests in such a way that these three elements of
the specification are clear and, as much as possible, separate from each
other. Typical programming languages can make this a bit challenging at
times, but we can overcome these problems fairly easily.
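As one illustration, the login scenario from earlier might be written as the following test. This is only a sketch: LoginService and its CreateAccount, RequestLogin, and IsLoggedIn methods are hypothetical names invented here to give the test something to run against, not part of any real system discussed in this article.

```csharp
using System.Collections.Generic;
using Microsoft.VisualStudio.TestTools.UnitTesting;

// Hypothetical production class, invented only for this illustration.
public class LoginService
{
    private readonly Dictionary<string, string> passwordsByUsername =
        new Dictionary<string, string>();
    private readonly HashSet<string> loggedInUsernames = new HashSet<string>();

    public void CreateAccount(string username, string password)
    {
        passwordsByUsername[username] = password;
    }

    public void RequestLogin(string username, string password)
    {
        string storedPassword;
        if (passwordsByUsername.TryGetValue(username, out storedPassword)
            && storedPassword == password)
        {
            loggedInUsernames.Add(username);
        }
    }

    public bool IsLoggedIn(string username)
    {
        return loggedInUsernames.Contains(username);
    }
}

[TestClass]
public class LoginTests
{
    [TestMethod]
    public void TestLoginWithValidCredentials()
    {
        // Setup (Given): a valid account exists, and we know its credentials
        var loginService = new LoginService();
        var username = "UN";
        var password = "PW";
        loginService.CreateAccount(username, password);

        // Trigger (When): login is requested
        loginService.RequestLogin(username, password);

        // Verify (Then): the user is logged in
        Assert.IsTrue(loginService.IsLoggedIn(username));
    }
}
```

Each of the three parts of the scenario appears as its own step, performed by the test rather than waited for.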
For example: let's say we have a behavior that calculates the arithmetic
mean of two real numbers, accurate to within 0.1. Most likely this
will be a method call on some object that takes two values as parameters
and returns their arithmetic mean, accurate to within 0.1.
Let’s start with the Given-When-Then:
Given:
    Two real values R1 and R2
    Required accuracy A is 0.1
When:
    The arithmetic mean of R1 and R2 is requested
Then:
    The return is (R1+R2)/2, accurate to A
Let's look at a typical unit test for such a behavior:
(Code samples are in C# with MSTest as the testing framework)
using Microsoft.VisualStudio.TestTools.UnitTesting;

[TestClass]
public class MathTests
{
    [TestMethod]
    public void TestArithmeticMeanOfTwoValues()
    {
        Assert.AreEqual(5.5d,
            MathUtils.GetInstance().
                ArithmeticMean(7.0d, 4.0d), .1);
    }
}
This test is simple because the behavior is simple. But this is really
not great as a specification.
The Setup (creation of the MathUtils object, the
creation of the example doubles 7.0d and 4.0d), the
Trigger (the calling of the ArithmeticMean
method with our two example doubles), and the
Verify (comparing the
method's return to the expectation, 5.5d, and establishing the precision as
.1) are all expressed together in the assertion. If we can separate
them, we can make the specification easier to read and also make it clear that some of
these particular values are not special, that they were just picked as
convenient examples.
This is fairly straightforward, but easy to miss:
using Microsoft.VisualStudio.TestTools.UnitTesting;

[TestClass]
public class MathTests
{
    [TestMethod]
    public void TestArithmeticMeanOfTwoValues()
    {
        // Setup
        var mathUtils = MathUtils.GetInstance();
        var anyFirstValue = 7.0;
        var anySecondValue = 4.0;
        var tolerance = .1;
        var expectedMean = (anyFirstValue + anySecondValue) / 2;

        // Trigger
        var actualMean = mathUtils.ArithmeticMean(anyFirstValue,
            anySecondValue);

        // Verify
        Assert.AreEqual(expectedMean, actualMean, tolerance);
    }
}
Here we have included comments to make it clear that
the three different aspects of this behavioral specification are now separate
and distinct from each other. The "need" for comments always seems like a smell, doesn't it? It means we can still make this better.
But we've also used variable names like "anyFirstValue" to indicate that the
number we chose was not a significant value, creating more clarity about what
is important here. Note that tolerance and expectedMean were not named in this way, because their
values are specific to the required behavior.
This, now, is using TDD to form a readable specification, which also happens
to be executable as a test [1]. Obviously the value of this as a test is very
high; we do not intend to trivialize this. But we write these tests with a
different mindset when we think of them as specifications and, as we'll see,
this leads to many good things.
Looking at both code examples above, however, some of you may be thinking
"what is this GetInstance() stuff? I would do this:"
// Setup
var mathUtils = new MathUtils();
Perhaps. We have reasons for preferring our version, which we'll set
aside for its own discussion.
But the interesting question is: what if you
started creating the object one way (using “new”), and then later changed your
mind and used a static GetInstance()
method, or maybe even some factory pattern? If, when that change was
made, you had many test methods on this class doing it the "old" way, this would require
the same change in all of them.
We can do it this way instead:
using Microsoft.VisualStudio.TestTools.UnitTesting;

[TestClass]
public class MathTests
{
    [TestMethod]
    public void TestArithmeticMeanOfTwoValues()
    {
        // Setup
        var arithmeticMeanCalculator =
            GetArithmeticMeanCalculator();
        var anyFirstValue = 7.0;
        var anySecondValue = 4.0;
        var tolerance = .1;
        var expectedMean = (anyFirstValue + anySecondValue) / 2;

        // Trigger
        var actualMean = arithmeticMeanCalculator.
            ArithmeticMean(anyFirstValue,
                anySecondValue);

        // Verify
        Assert.AreEqual(expectedMean, actualMean, tolerance);
    }

    private MathUtils GetArithmeticMeanCalculator()
    {
        return MathUtils.GetInstance();
    }
}
Now, no matter how many test methods on this test class needed to access
this arithmetic mean behavior (for different scenarios), a change in terms of
how you access the behavior would only involve the modification of the single
"helper" method that is providing the object for all of them.
Many testing frameworks have their own mechanisms for eliminating redundant
object creation, usually in the form of a Setup() or Initialize() method, and these can be used.
But we prefer the helper method because we then gain the ability to decouple the
specification from the fact that the behavior we’re specifying happens to be
implemented in a class called MathUtils.
We could also change this design detail and
the impact would only be on the helper method (the fact that C# has the
var keyword is a real plus here… you might be limited a bit in other languages).
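For comparison, here is roughly what the framework-provided route looks like in MSTest, where a method marked [TestInitialize] runs before each test method. This is a sketch only: the MathUtils body is a stand-in we are assuming for illustration (the production class is not shown in this article), and the test class name MathTestsUsingInitialize is likewise made up.

```csharp
using Microsoft.VisualStudio.TestTools.UnitTesting;

// Stand-in for the production class; its real implementation is not shown
// in this article, so this body is an assumption.
public class MathUtils
{
    private static readonly MathUtils instance = new MathUtils();

    private MathUtils() { }

    public static MathUtils GetInstance()
    {
        return instance;
    }

    public double ArithmeticMean(double first, double second)
    {
        return (first + second) / 2;
    }
}

[TestClass]
public class MathTestsUsingInitialize
{
    // Shared by every test method on this class.
    private MathUtils mathUtils;

    // MSTest calls this before each [TestMethod].
    [TestInitialize]
    public void Initialize()
    {
        mathUtils = MathUtils.GetInstance();
    }

    [TestMethod]
    public void TestArithmeticMeanOfTwoValues()
    {
        // Setup
        var anyFirstValue = 7.0;
        var anySecondValue = 4.0;
        var tolerance = .1;
        var expectedMean = (anyFirstValue + anySecondValue) / 2;

        // Trigger
        var actualMean = mathUtils.ArithmeticMean(anyFirstValue, anySecondValue);

        // Verify
        Assert.AreEqual(expectedMean, actualMean, tolerance);
    }
}
```

Object creation still lives in one place here, but the test method now refers to a field named after the implementing class instead of asking a helper for "the thing that calculates the mean".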
But the spec is also not about the particular method you call to get the mean,
just how the calculation works, behaviorally. Certainly an ArithmeticMean()
method is logical, but what if we decided to make it more flexible, allowing
any number of parameters rather than just two? The meaning of "arithmetic
mean" would not change, but our spec would have to. Which seems
wrong. So, we could take the idea a little bit further:
using Microsoft.VisualStudio.TestTools.UnitTesting;

[TestClass]
public class MathTests
{
    [TestMethod]
    public void TestArithmeticMeanOfTwoValues()
    {
        // Setup
        var arithmeticMeanCalculator = GetArithmeticMeanCalculator();
        var anyFirstValue = 7.0;
        var anySecondValue = 4.0;
        var tolerance = .1;
        var expectedMean = (anyFirstValue + anySecondValue) / 2;

        // Trigger
        var actualMean = TriggerArithmeticMeanCalculator(
            arithmeticMeanCalculator,
            anyFirstValue, anySecondValue);

        // Verify
        Assert.AreEqual(expectedMean, actualMean, tolerance);
    }

    private double TriggerArithmeticMeanCalculator(MathUtils mathUtils,
        double anyFirstValue,
        double anySecondValue)
    {
        return mathUtils.ArithmeticMean(anyFirstValue,
            anySecondValue);
    }

    private MathUtils GetArithmeticMeanCalculator()
    {
        return MathUtils.GetInstance();
    }
}
Now if we change the ArithmeticMean()
method to take a container rather than discrete parameters, or whatever, then
we only change this private helper method and not all the various
specification-tests that show the behavior with more parameters, etc.
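To make that concrete, suppose ArithmeticMean() were changed to accept a collection. The sketch below assumes a hypothetical new signature taking IEnumerable<double>, along with a stand-in MathUtils body (neither comes from real production code). The test method is untouched from the version above; only the trigger helper has been edited.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.VisualStudio.TestTools.UnitTesting;

// Stand-in production class with a hypothetical, collection-based signature.
public class MathUtils
{
    private static readonly MathUtils instance = new MathUtils();

    private MathUtils() { }

    public static MathUtils GetInstance()
    {
        return instance;
    }

    public double ArithmeticMean(IEnumerable<double> values)
    {
        return values.Average();
    }
}

[TestClass]
public class MathTests
{
    [TestMethod]
    public void TestArithmeticMeanOfTwoValues()
    {
        // Setup
        var arithmeticMeanCalculator = GetArithmeticMeanCalculator();
        var anyFirstValue = 7.0;
        var anySecondValue = 4.0;
        var tolerance = .1;
        var expectedMean = (anyFirstValue + anySecondValue) / 2;

        // Trigger
        var actualMean = TriggerArithmeticMeanCalculator(
            arithmeticMeanCalculator, anyFirstValue, anySecondValue);

        // Verify
        Assert.AreEqual(expectedMean, actualMean, tolerance);
    }

    // Only this binding changed: it now packs the values into a collection.
    private double TriggerArithmeticMeanCalculator(MathUtils mathUtils,
        double anyFirstValue, double anySecondValue)
    {
        return mathUtils.ArithmeticMean(
            new List<double> { anyFirstValue, anySecondValue });
    }

    private MathUtils GetArithmeticMeanCalculator()
    {
        return MathUtils.GetInstance();
    }
}
```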
The idea here is to separate the meaning of the specification from the way
the production code is designed. We talk about the specification being
one thing, and the "binding" being another. The specification
should change only if the behavior changes. The binding (these private
helpers) should only change if the design of the system changes.
Another benefit here is clarity, and readability. Let's improve it a
bit more:
using Microsoft.VisualStudio.TestTools.UnitTesting;

[TestClass]
public class MathTests
{
    [TestMethod]
    public void TestArithmeticMeanOfTwoValues()
    {
        // Setup
        var anyFirstValue = 7.0;
        var anySecondValue = 4.0;
        var tolerance = .1;

        // Trigger
        var actualMean = TriggerArithmeticMeanCalculation(
            anyFirstValue,
            anySecondValue);

        // Verify
        var expectedMean = (anyFirstValue + anySecondValue) / 2;
        Assert.AreEqual(expectedMean, actualMean, tolerance);
    }

    private double TriggerArithmeticMeanCalculation(
        double anyFirstValue,
        double anySecondValue)
    {
        var arithmeticMeanCalculator = GetArithmeticMeanCalculator();
        return arithmeticMeanCalculator.
            ArithmeticMean(anyFirstValue,
                anySecondValue);
    }

    private MathUtils GetArithmeticMeanCalculator()
    {
        return MathUtils.GetInstance();
    }
}
We have moved the call to GetArithmeticMeanCalculator()
to the Trigger, and expectedMean to the Verification [3].
Also we changed the notion of "trigger the calculator" to "trigger the calculation". Now, remember the original
specification?
Given:
    Two real values R1 and R2
    Required accuracy A is 0.1
When:
    The arithmetic mean of R1 and R2 is requested
Then:
    The return is (R1+R2)/2, accurate to A
The unit test, which is our specification, very closely mirrors this
Given-When-Then expression of the behavior. Do we really need the comments to make that clear? Probably not. We’ve created a unit test that is a
true specification of the behavior without coupling it to the specifics of how
the behavior is expressed by the system.
Can we take this even further? Of course... but that's for another entry.
:)
[1] It should be acknowledged that Max prefers to say "it is a test which also serves as a specification." We'll probably beat him into submission :), but for the time being that's how he likes to think of it. We welcome discussion, as always.
[2] Better Software Magazine, March 2006.
[3] It should also be acknowledged that we're currently discussing the relative merits of using Setup/Trigger/Verify in TDD rather than just sticking with Given/When/Then throughout. See Grzegorz Gałęzowski's very interesting comment below on this (and other things).