Test-Driven Development in the Larger Context

Introduction

A question that often arises in our consulting and training practices concerns the relationship between test-driven development (TDD) and acceptance-test-driven development (ATDD). Which one is “really” TDD? Which one should an organization focus on first? Which one should receive the most attention and resources?

It is common to say that TDD is “developers writing unit tests before the production code”. Yes, that is one form of TDD; but it is a subset of a larger set of ideas and processes.

It is also common to say that ATDD is “acceptance tests being written before the development cycle begins”. Again, while this is correct, it reinforces the idea that ATDD is separate from what the developers do when they are “doing TDD”.

The reality is that they are all “test-driven development.” They are not conducted in the same way nor are they done by the same people (entirely). And the value they provide is different. But at the end of the day, TDD is the umbrella under which they fall.

Comparing ATDD and UTDD

There are two forms of the test-driven process. One is called acceptance-test-driven development (ATDD). The other we will call unit-test-driven development (UTDD). UTDD is what people most commonly have in mind when thinking of TDD.

ATDD and UTDD differ in some significant ways.

Audience. Who creates these tests, updates them, and can ultimately read and understand them in the future. We call this the “audience” of the tests.
Cadence. The pace of tests and the granularity of their activities. We call this the “cadence” of the tests. They are quite different and for good reason.
Automation. The value of automation and the need to automate. Both ATDD and UTDD can be automated, but how critical is such automation, what value is derived with and without it, and where should the effort to automate fall in the adoption process?

While there are differences, ATDD and UTDD are also very similar.

Tests ensure a detailed and specific, shared understanding of behavior. They help create a shared understanding of the valuable behaviors that make up a requirement, and the way those behaviors satisfy the expectations of the stakeholders.
Tests capture knowledge that would otherwise be lost. They capture and preserve high-worth enterprise knowledge in a form that can retain its value over time.
Tests ensure quality design. They help to create high-quality architecture and design, and all the value that this brings in terms of the ability of the organization to respond to market challenges, opportunities, and changes.

Both ATDD and UTDD are types of TDD. They work together in a very powerful way to create alignment around business value in software development.

Regardless of your process, every activity conducted by every individual in a software development organization should be ultimately traceable back to business value: every bit of code that’s written, every test that is defined and executed, every form that’s filled out, every meeting that’s called. Everything. TDD is in complete alignment with this concept. When it is properly conducted, TDD increases the value of every investment of time and effort.

The rest of this paper considers these differences and similarities.

Differences in Audience

ATDD and UTDD differ in terms of who creates tests, updates them, and reads and understands them in the future. We call these the “audience” for ATDD and UTDD.

ATDD Audience

Acceptance tests should be created by a gathering of representatives from every aspect of the organization such as business analysts, product owners, project managers, legal experts, developers, marketing people, testers, and end-users (or their representatives).

ATDD is a framework for collaboration that ensures complete alignment of the development effort that is to come with the business value that should drive it. These business values are complex and manifold, and so a wide range of viewpoints must be included. Acceptance tests should be expressed in a way that can be written, read, understood, and updated by anyone in the organization. They should require no technical knowledge, and only minimal training.

The specific expression of an acceptance test should be selected based on the clarity of that form given the nature of the organization and its work. Many people find the “Given, When, Then” textual form (often referred to as “Gherkin”) to be the easiest to create and understand. Others prefer tables, or images, or other domain-specific artifacts.
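As a rough illustration of the “Given, When, Then” form, the sketch below binds one acceptance criterion to a plain Python check. Everything in it is assumed for the sake of the example: the shopping-cart domain, the Cart class, and the 50.00 free-shipping threshold do not come from any real project. Stakeholders would read and write only the Given/When/Then wording, shown here as comments; the code underneath is the kind of binding a developer or an automation tool would supply.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass
class Cart:
    """Hypothetical domain object, invented for this illustration."""
    items: list = field(default_factory=list)

    def add(self, price: Decimal) -> None:
        self.items.append(price)

    def total(self) -> Decimal:
        return sum(self.items, Decimal("0"))

    def shipping(self) -> Decimal:
        # Assumed business rule: orders of 50.00 or more ship free.
        if self.total() >= Decimal("50.00"):
            return Decimal("0")
        return Decimal("4.99")


def test_free_shipping_at_threshold():
    # Given a cart containing items worth 50.00 in total
    cart = Cart()
    cart.add(Decimal("30.00"))
    cart.add(Decimal("20.00"))
    # When the shipping charge is calculated
    charge = cart.shipping()
    # Then shipping is free
    assert charge == Decimal("0")
```

The three comments are the acceptance test as the audience sees it; if the organization preferred a table or a diagram, only that outer expression would change.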

Acceptance tests should use the language of the audience. Always. For example, we once worked with an organization that did chemical processing. In all their conversations and meetings, they used images like this in their presentation slides and on the whiteboards:

[Figure: example of the process symbols this organization used; image not available]

Perhaps most people would not understand these symbols but for this organization, it was obvious and clear. For them, expressing their acceptance tests this way lowered the bar of comprehension. Why make them convert this into some textual or other representation?

Acceptance tests should be represented in forms that are comfortable for the audience. For example, business analysts often spend much of their time working with spreadsheets or Gantt charts or Candlestick charts. Let them write tests, read tests, and update tests in these forms. They should not have to read or understand computer code (unless the only stakeholders are other developers).

Automating acceptance tests should never drive their form. Any representation of acceptance can be made executable given the right tools, even if those tools must be created by the organization. Choosing, say, Robot or Fit or Specflow to automate your acceptance tests before you identify your stakeholders is putting the cart before the horse.

UTDD Audience

Unit tests should be written by technical people: developers and sometimes testers. Often, unit tests are written in the computer language that will be used to develop the actual system although there are exceptions to this. Only people with deep technical knowledge can write unit tests, read them, and update them as requirements change.

To ensure the suite of tests itself does not become a maintenance burden as it grows, developers must be trained in the techniques that make the UTDD effort sustainable over the long haul. These include test structure, test suite architecture, which tests to write and how to write them, and which tests to avoid. Training of the development team must include these critical concepts, or the process rapidly becomes too expensive to maintain.

Differences in Cadence

The pace and granularity – what we call the “cadence” of the work – differ between ATDD and UTDD. This difference is driven by the purpose of the activity, and how the effort can drive the maximum value with the minimum delay.

ATDD Cadence

Acceptance tests should be written at the beginning of the development cycle, for example during sprint planning in scrum. Enough tests are written to cover the entire upcoming development effort, plus a few more in case the team moves more quickly than estimates expect.

If using a pull system such as kanban, the acceptance tests should be generated into a backlog that the team can pull from. The creation of these tests is part of the collaborative planning process and should follow its pace exactly.

At the beginning, the acceptance tests all fail, as a group. Then, they are run at regular intervals (perhaps as part of a nightly build) to allow the business to track the progress of development as teams gradually convert them to passing tests. This provides data that allows management to forecast completion (the “burn-down curve”) which aids in planning.
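To make the forecasting idea concrete, here is a minimal sketch of one way the nightly results could be turned into an estimate. The straight-line projection is our own simplifying assumption, not a prescribed method, and the function name and the sample numbers are invented for illustration.

```python
def nights_remaining(passing_by_night, total_tests):
    """Naive linear forecast (an assumption, not a prescribed method).

    passing_by_night: count of passing acceptance tests after each nightly run.
    Returns the estimated number of further nights until all tests pass,
    or None when there is too little history or no progress to project from.
    """
    if len(passing_by_night) < 2:
        return None  # not enough history to estimate a rate
    remaining = total_tests - passing_by_night[-1]
    if remaining <= 0:
        return 0  # everything already passes
    gained = passing_by_night[-1] - passing_by_night[0]
    if gained <= 0:
        return None  # no net progress yet, so no forecast
    nights = len(passing_by_night) - 1
    # Integer ceiling of remaining / (gained / nights).
    return (remaining * nights + gained - 1) // gained


# Hypothetical history: 40 acceptance tests, four nightly runs so far.
estimate = nights_remaining([0, 4, 8, 12], total_tests=40)
```

With this invented history the team converts four tests per night on average, so the 28 tests still failing project to seven more nights. A real burn-down would of course account for uneven test sizes and changing scope.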

The most important purpose of creating acceptance tests is collaboration. The tests are a side-effect – a very beneficial side-effect. The collaboration ensures that all stakeholder concerns are addressed, all critical information is included, and that there is complete alignment between business prioritization and the upcoming development effort.

When training stakeholders to write acceptance tests, the experience should be realistic. Work should be done by teams that include everyone mentioned in the “Audience” section above. Ideally, they should work on real requirements from the business.

UTDD Cadence

A single unit test is written and proved to fail by running it immediately. Failure validates the test because a test that cannot fail, or fails for the wrong reason, has no value. The developer does not proceed without such a test, and without a clear understanding of why it is failing.

Then, the production work is done to make this one test pass, immediately. The developer does not move on to write another test until it and all previously-written tests are “green.” The guiding principle is that we never have more than one failing test at a time, and therefore the test is a process gate determining when the next user story (or similar artifact) can begin to be worked on.
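The cycle can be sketched in a few lines of Python using the standard unittest module. The add_tax function, its ten-percent rule, and the helper names are hypothetical, chosen only to keep the example small. The same single test is run twice: first against a stub, to prove that it fails for the expected reason, and then against just enough production code to make it pass.

```python
import unittest


def make_suite(add_tax):
    """Build the single unit test against a given implementation."""
    class PriceWithTaxTest(unittest.TestCase):
        def test_adds_ten_percent_tax(self):
            self.assertEqual(add_tax(100), 110)

    return unittest.defaultTestLoader.loadTestsFromTestCase(PriceWithTaxTest)


def passes(suite):
    """Run a suite quietly and report whether every test passed."""
    result = unittest.TestResult()
    suite.run(result)
    return result.wasSuccessful()


# Step 1 (red): only a stub exists, so the test fails on its assertion.
def add_tax_stub(amount_cents):
    return 0


# Step 2 (green): just enough production code to satisfy this one test.
# The ten-percent rate is an assumption made up for this example.
def add_tax(amount_cents):
    return amount_cents + amount_cents // 10


red = passes(make_suite(add_tax_stub))   # the test is proven able to fail
green = passes(make_suite(add_tax))      # the same test now passes
```

Only once the test is green would the developer write the next test, for example one covering a different tax rate.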

When training developers to write these tests properly, we use previously derived examples to ensure that they understand how to avoid the common pitfalls that can plague this process:

Tests can become accidentally coupled to each other
Tests can become redundant and fail in groups
One test added to the system can cause other older tests to fail

All of this is avoidable but requires that developers who write unit tests are given the proper set of experiences, in the right order, so that they are armed with the necessary understanding to ensure that the test suite, as it grows large, does not become unsustainable.

Differences in Automation

Both ATDD and UTDD can be automated. The difference has to do with the role of such automation, how critical and valuable it is, and when the organization should put resources into creating it.

ATDD Automation

The primary value of ATDD is the deep collaboration it engenders and the shared understanding that comes from this effort. The health of the organization in general, and specifically the degree to which development effort is aligned with business value, will improve dramatically once the process is well understood and committed to by all. Training your teams in ATDD pays back in the short term. Excellent ATDD training pays back before the course is even over.

While automating acceptance test execution is helpful, it is not an immediate requirement. An organization can begin to use ATDD without any automation and still get significant value from the process.

In the beginning, automation might pose too great a challenge; don’t let that stop you from adopting ATDD and making sure everyone knows how to do it properly. The automation can be added later if desired; but even then, acceptance tests will not run particularly quickly, and that is OK because they are not run very frequently, perhaps only as part of a nightly build.

In the beginning, it will likely not be clear what form of automation should be used. There are many different ATDD automation tools and frameworks available. While any tool could be used to automate any form of expression, some tools are better than others given the nature of that expression. If a textual form, like Gherkin, is determined to be the clearest and least ambiguous given the nature of the stakeholders involved, then an automation tool like Cucumber (Java) or Specflow (.Net) is natural and low-cost. If a different representation makes better sense, then another tool will be easier and cheaper to use.

The automation tool should never dictate the way in which acceptance tests are expressed. Automation should follow expression. This may require the organization to invest in the effort to create its own tools or enhancements to existing tools. This is a one-time cost that will have return on the investment indefinitely. In ATDD, the clarity and accuracy of the expression of value is what is important; automation is a “nice to have.”

UTDD Automation

UTDD requires automation right from the outset. It is not optional. Without automation, UTDD could scarcely be recommended.

Unit tests are run very frequently, often every few minutes; therefore, if they are not efficient, it will be far too expensive (in terms of time and effort) for the team to run them. Running a suite of unit tests should appear to cost nothing to the team. (Of course, this is not literally true but that is the attitude you want the developers to have.)

Unit tests must be extremely fast, and many aspects of UTDD training should ensure that the developers know how to make them extremely painless to execute. Developers must know how to manage dependencies in the system and how to craft tests in such a way that they execute in the least time possible, without sacrificing clarity.
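One common way to manage a slow dependency is to pass it in from outside, so that a test can substitute an in-memory stand-in. The sketch below assumes a hypothetical currency converter whose exchange rates would come from a network service in production; all class names and the rate shown are invented for illustration.

```python
import unittest


class RateProvider:
    """Hypothetical slow dependency, such as a network service in production."""

    def rate_percent(self, currency):
        raise NotImplementedError


class FixedRates(RateProvider):
    """In-memory test double: no network, no disk, answers immediately."""

    def __init__(self, rates):
        self._rates = rates

    def rate_percent(self, currency):
        return self._rates[currency]


class Converter:
    def __init__(self, provider):
        # The dependency is injected rather than created here,
        # which is what lets a test swap in the fast double.
        self._provider = provider

    def to_usd_cents(self, amount_cents, currency):
        return amount_cents * self._provider.rate_percent(currency) // 100


class ConverterTest(unittest.TestCase):
    def test_converts_using_supplied_rate(self):
        # Assumed rate: 110 percent, i.e. 1.10 USD per EUR.
        converter = Converter(FixedRates({"EUR": 110}))
        self.assertEqual(converter.to_usd_cents(200, "EUR"), 220)
```

Because the test never leaves memory, it can be run every few minutes without the team noticing its cost, and it stays readable because the rate it depends on is stated in the test itself.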

Since unit tests are intimately connected to the system, most teams find that it makes sense to write them in the same programming language in which the system is being written. This means that the developers do not have to do any context-switching when moving back and forth between tests and production code, which they will do in a very tight loop.

Unlike ATDD, the unit-testing automation framework must be an early decision in UTDD, as it will match/drive the technology used to create the system itself. One benefit of this is that the skills developers have acquired for one purpose are highly valuable for the other. This value flows in both directions: writing unit tests makes you a better developer, and writing production code makes you a better tester.

Also, if writing automated unit tests is difficult or painful to the developers, this is nearly always a strong indicator of weakness in the product design. The details are beyond the scope of this article, but suffice it to say that bad design is notoriously hard to test, and bad design should be rooted out early and corrected before any effort is wasted on implementing it.

Similarity: Tests ensure a detailed and specific, shared understanding of behavior

If properly trained, those who create these tests will subject their knowledge to a rigorous and unforgiving standard; it is hard if not impossible to create a high-quality test about something you do not know enough about. It is also unwise to try to implement behavior before you have that understanding. TDD ensures that you have sufficient and correct understanding by placing tests in the primary position. This is as true for acceptance tests as it is for unit tests; it is just that the audience for the conversation is different.

Similarity: Tests capture knowledge that would otherwise be lost

Many organizations have large, complex legacy systems that no one understands very well. The people who designed and built them have often retired or moved on to other positions, and the highly valuable knowledge they possessed left with them. If tests are written to capture this knowledge (assuming that those who write them are properly trained), then not only is this knowledge retained, but also its accuracy can be verified at any time in the future by simply executing the tests. This is true whether or not the tests are automated although automation would offer a big advantage. This leads us to view tests, when written up-front, as specifications. They hold the value that specifications hold, but add the ability to verify accuracy in the future.
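A small sketch shows what a test that reads as a specification might look like. The late-fee rules, the function, and the amounts are all invented for illustration; the point is that each test name states one rule in plain words, and running the suite verifies that the rule still holds.

```python
import unittest


def late_fee_cents(days_overdue):
    """Stand-in for business logic whose rules might otherwise be forgotten."""
    if days_overdue <= 3:
        return 0  # assumed rule: three-day grace period
    # Assumed rule: 0.50 per day after the grace period, capped at 10.00.
    return min((days_overdue - 3) * 50, 1000)


class LateFeeSpecification(unittest.TestCase):
    def test_no_fee_within_three_day_grace_period(self):
        self.assertEqual(late_fee_cents(3), 0)

    def test_fee_is_fifty_cents_per_day_after_grace_period(self):
        self.assertEqual(late_fee_cents(5), 100)

    def test_fee_is_capped_at_ten_dollars(self):
        self.assertEqual(late_fee_cents(60), 1000)
```

Someone who has never seen the production code can read the three test names and learn the rules, and can rerun the tests years later to confirm that those rules are still what the system does.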

Furthermore, if any change to the system is required, TDD mandates that the tests are updated before the production code as part of the cadence of the work. This ensures that the changes are correct, and that the specification never becomes outdated. Only TDD can do this.

Similarity: Tests ensure quality design

As anyone who has tried to add tests to a legacy system after the fact can tell you, bad design is notoriously difficult to test. If tests precede development, then design flaws become clear early, through the painful process of trying to test them. In other words, TDD will tell you early on if your design is weak because you will feel the pain when trying to test it.

TDD Does Not Replace Traditional Testing

TDD, whether ATDD or UTDD, does not replace traditional testing. The quality control/quality assurance process that has traditionally followed development is still needed. This is because TDD does not test all aspects of the system, only those needed to create it. Usability, scalability, security, and so on still need to be ensured through traditional testing. TDD does contribute some of the tests needed by QA, but certainly not all of them.

TDD Contributes to a Healthier Culture

Another benefit to the adoption of TDD is that it leads to a healthier culture. In many organizations, developers view testing as a source of either no news (the tests confirm the system is correct) or bad news (the tests point out flaws). Likewise, testers view the developers as a source of myriad problems they must detect and report.

When TDD is adopted, developers have a clearer understanding of the benefits of testing from their perspective. Often, TDD becomes the developers’ preferred way of working because it leads to a kind of certainty and confidence that developers are unaccustomed to. Likewise, testers begin to see the development effort as a source of many of the tests that they, in the past, had to retrofit onto the system. This frees them up to add the more interesting, more sophisticated tests (that require their experience and knowledge) which otherwise often end up being cut from the schedule due to lack of time. This, of course, leads to better and more robust products overall.

Conclusion

So, the answer is that TDD and ATDD are not distinct from one another, but part of an overall approach to software development that places “a shared understanding of the behavior needed to achieve business value” first and foremost in the process, and that ensures alignment with that understanding in every activity that consumes resources. It also ensures that the knowledge possessed by the organization today will not be lost in the future.

TDD helps everyone.