We recently got a question from Tomas Vykruta, a colleague of ours, and we felt it turned out to be such a good, fruitful question that we wanted to pass it, and our answers, along in this blog.
Here is Tomas' question:
Do you prefer to have unit tests written against the public API, or to test individual functions inside the API? I've seen both approaches at my company, and in many cases, a single class is unit tested with a mix of the two. I haven't seen this topic addressed in any style or testing guides, so it seems to be left as a choice to the author.
While there is likely no right or wrong answer here and each class will require some combination, I thought it would be interesting to enumerate your real world experiences (good and bad) resulting from these 2 strategies. Off the top of my head, here are some pros (+) and cons (-).
API-level:
+ If internal implementation details of the API change, the unit tests don't have to. Less maintenance.
+ Serves as documentation for public usage of API.
+ Does not require fabricating the internal API in such a way as to make every function easily testable.
+- Possibly less code to write.
- Does not serve as documentation for individual internal functions.
- Unit tests are less likely to test every single internal function thoroughly.
- Test failures can take time to track down, and diagnosing them requires understanding the internal API.
Internal API unit testing (individual functions):
+ Unit tests are very simple, short, quick to write and read.
+ Functions are very thoroughly tested, easy to verify against full range of inputs.
+ Serves as documentation for every internal function.
+ Test failures are easily identifiable even for engineers not familiar with the code base, since each test is focused on a very limited bit of code.
- When any implementation details change, the tests must change with them.
- Not useful to pure external API users who don't care about internal implementation details.
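To make the two strategies concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the TaxCalculator class, its brackets and rates, and the test names do not come from Tomas' code base. The first test goes through the public API only; the second reaches into an internal helper.

```python
class TaxCalculator:
    """Hypothetical class; brackets and rates are invented for illustration."""

    # (upper bound of bracket, marginal rate in percent); None = no upper bound
    _BRACKETS = [(10_000, 0), (40_000, 10), (None, 20)]

    def tax_due(self, income):
        """Public API: total tax owed on an income given in whole currency units."""
        total = 0
        lower = 0
        for upper, rate_percent in self._BRACKETS:
            total += self._bracket_tax(income, lower, upper, rate_percent)
            if upper is None or income <= upper:
                break
            lower = upper
        return total

    @staticmethod
    def _bracket_tax(income, lower, upper, rate_percent):
        """Internal helper: tax owed on the slice of income inside one bracket."""
        top = income if upper is None else min(income, upper)
        return max(top - lower, 0) * rate_percent // 100


def test_tax_due_spans_brackets():
    # API-level test: survives any rewrite of the bracket logic.
    assert TaxCalculator().tax_due(50_000) == 5_000


def test_bracket_tax_ignores_income_below_bracket():
    # Internal test: pinpoints failures, but breaks if the helper is
    # renamed, inlined, or given a different signature.
    assert TaxCalculator._bracket_tax(5_000, 10_000, 40_000, 10) == 0
```

If the helper were later folded into tax_due, the first test would keep passing untouched while the second would have to be rewritten or deleted, which is the maintenance trade-off listed above.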
Scott's Response:
My view is this: if you consider the test suite as you would a specification of the system, then the question as to whether to test at one level or another becomes: “is it specified?”
Systems produce behavioral effects, and these effects are what determine the value of the system. Value, however, is always from the point of view of a “client” or “customer” and every system has several customers. All these customers have a behavioral view of the system which can be specified.
For example, the end users have a specification: “this can accurately calculate my income tax”. But so does the legal department: “this has a EULA that indemnifies us against tax penalties”. And the marketing department: “the system has a tax-evaluation feature that our competitor does not”. And the developers themselves: “this has an extensible system for taxation algorithms.” Etc…
Anything in anyone’s spec needs a test. Some of these will be at the API level, some will be further in.
Not all implementation details are part of the specification. If you are able to refactor a particular implementation and still satisfy all customer specifications, then the implementation does not require a separate test.
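One way to picture this, as a sketch under invented assumptions (the allowance, the flat rate, and both function names are hypothetical): the same specification check is run against an implementation before and after a refactoring. The nested helper in the first version gets no test of its own, because no customer specified it.

```python
ALLOWANCE = 10_000    # hypothetical tax-free allowance
RATE_PERCENT = 20     # hypothetical flat rate


def tax_due_v1(income):
    """Original implementation, built around a small internal helper."""
    def taxable(amount):
        return max(amount - ALLOWANCE, 0)

    return taxable(income) * RATE_PERCENT // 100


def tax_due_v2(income):
    """Refactored implementation: the helper is gone, the behavior is not."""
    if income <= ALLOWANCE:
        return 0
    return (income - ALLOWANCE) * RATE_PERCENT // 100


def check_specification(tax_due):
    """What the end user specified; says nothing about how it is computed."""
    assert tax_due(0) == 0
    assert tax_due(10_000) == 0
    assert tax_due(15_000) == 1_000


# Both versions satisfy the same specification, so the refactoring
# required no change to the tests.
check_specification(tax_due_v1)
check_specification(tax_due_v2)
```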
Amir's Response:
Scott has already expanded on the difference between testing and specification. I would like to add a little to this ‘specification’ perspective. We would love to hear from all of you on this question!
Let me start by saying that all TDD tests must only use public interfaces. This can be interpreted to mean – you must only test through APIs, as they are the public interface of the system. This is true when you consider the external consumers of the system. They see only the public API and hence ‘feel’ the system’s behavior through it. The TDD test will specify what this behavior is (for better or worse).
And just to clarify – when we say ‘public interface,’ we do not refer only to the exposed functional interface. A public interface can also be the GUI, database schema, specific file formats, file names, URL format, a log (or trace facility), Van Eck phreaking, or a Ouija board. As long as the usage of the public interface allows an external entity to affect your system or vice versa, it is considered public.
Some of the interfaces mentioned above may be used by entities within the company, such as support or QA. For all intents and purposes they are still customers of the system and as such their needs (e.g., the types of error reports generated under specific circumstances, or the ability to throttle the level of tracing done, or the ability to remotely control a client system) must be specified in the TDD tests. After all, you still want the ‘intra-organizational’ behavior to be known and invariant to other changes.
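As an illustration of a log acting as a public interface, here is a small sketch using Python's standard logging module. The logger name, the charge function, and the message format are all hypothetical; the point is only that the test pins down the exact line a support engineer would search for.

```python
import logging

logger = logging.getLogger("payments")  # hypothetical logger name


def charge(amount_cents):
    """Hypothetical operation whose error report is read by support staff."""
    if amount_cents <= 0:
        logger.error("CHARGE_REJECTED amount_cents=%d", amount_cents)
        return False
    return True


class ListHandler(logging.Handler):
    """Test helper: collects formatted log messages in memory."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


def test_rejected_charge_is_reported_for_support():
    handler = ListHandler()
    logger.addHandler(handler)
    try:
        assert charge(-5) is False
    finally:
        logger.removeHandler(handler)
    # The log line is part of the specification, so its format is asserted.
    assert handler.messages == ["CHARGE_REJECTED amount_cents=-5"]
```

With a test like this in place, changing the wording of the log line becomes a deliberate change to the specification rather than an accident.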
When we do TDD however, we are not concerned only about the system’s external behavior (as defined above), but also about its internal behavior. This internal behavior has two manifestations (and this is our arbitrary nomenclature, but I hope it makes sense). First is the architecture of the product, second is its design. These two may seem to be the same but there is a subtle difference between them.
The system’s architecture is the set of design decisions that were made to accomplish functional and performance goals. Once set, these become a requirement. An individual developer or team cannot decide to do things differently, but has to operate within these architectural guidelines. This is specified through a set of tests that specify how every architectural element contributes to the implementation of the desired overall behavior.
The system’s design is the set of design decisions that are made by the team and individual developers, and are considered to be ‘implementation choices’. The team can assign whichever responsibilities it deems reasonable to the different design entities in order to achieve the desired behavior. This is all well, except that there is one ‘tacit’ requirement that is solely the responsibility of the team (and probably the technical organization management). This requirement is maintainability, and it is what guides the team in their design choices. The TDD tests help us specify both what the system design is and also what the specific responsibilities assigned to the system entities are.
The point about both design and architecture is that they are internal to the system. As such, how can you test-drive them through the system’s APIs? By testing through the APIs I can see that the behavior is specified correctly. I cannot see that the architecture is adhered to or that the design promotes maintainability.
The answer to this paradox lies in the definition of the word ‘public’. Public is a relative term. If you live in a high rise condo, then the ‘public’ interface may be the building’s front door. But consider the individual apartments. The neighbors can’t come into your condo at will, can they? The condo has a public interface – its door, which is hidden from those outside the building (private) but visible and usable by the internal neighbors. Inside your condo this division continues. You have rooms, with doors (their public interfaces), and storage cabinets, with their doors, and boxes, with their lids, and bottles with their caps. What we get is a complex set of enclosures which are public to their immediate surrounding and private to anything further out.
Computer systems are the same. The APIs are the public doorways to the surrounding clients – these clients do not see the way the system is composed. But the elements of the system themselves do see this design – they can see the other elements (which they interact with) although they cannot see inside these elements. Are the interfaces that these inner elements expose private or public? Well, that depends on whom you’re asking. From the perspective of the outside clients, they are private. From the perspective of the peer elements, they are public. Since they are public, they should be specified through TDD, and this is exactly how we specify the system’s architecture and design.
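The nested-doors idea might look like this in code. This is a sketch only: Invoice, Formatter, and RecordingFormatter are hypothetical names invented here. Invoice plays the building's front door, while Formatter is an inner element whose interface is private to outside clients but public to its peer.

```python
class Formatter:
    """Inner element: hidden from outside clients, public to its peers."""

    def format_total(self, cents):
        return f"${cents // 100}.{cents % 100:02d}"


class Invoice:
    """The system's external API; outside clients see only this class."""

    def __init__(self, formatter=None):
        self._formatter = formatter if formatter is not None else Formatter()
        self._items = []

    def add(self, cents):
        self._items.append(cents)

    def summary(self):
        return "Total: " + self._formatter.format_total(sum(self._items))


class RecordingFormatter:
    """Test double standing in for the peer element."""

    def __init__(self):
        self.seen = []

    def format_total(self, cents):
        self.seen.append(cents)
        return "<formatted>"


def test_invoice_summary_shows_total():
    # External behavior, specified through the outer door.
    invoice = Invoice()
    invoice.add(1250)
    invoice.add(75)
    assert invoice.summary() == "Total: $13.25"


def test_formatter_pads_cents():
    # The inner element's own public interface, as its peers see it.
    assert Formatter().format_total(5) == "$0.05"


def test_invoice_delegates_formatting_to_its_peer():
    # Design specification: Invoice sums, Formatter formats.
    stub = RecordingFormatter()
    invoice = Invoice(formatter=stub)
    invoice.add(200)
    invoice.add(300)
    assert invoice.summary() == "Total: <formatted>"
    assert stub.seen == [500]
```

None of these tests looks inside an element; each one uses only the door that the element exposes to whoever stands immediately outside it.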
So, in a nutshell, the answer to the question – “do we test external or internal APIs” is yes.