Friday, February 14th, 2020

Why You Shouldn't Use Cucumber for API Testing


Cucumber is a behavior-driven development (BDD) tool that lets you write tests in plain, easy-to-understand language. Each scenario follows a “given, when, then” format: the initial state, the action, and the consequent state. Each of those parts can be separately determined to have passed or failed (or be pending).

Cucumber works well in BDD situations because:

It usually takes several days to get a test passing because the test is driving the design of the code, and we want to see our progress in getting the scenarios passing (for example, we might see the “given” statement passing already but the “when” statement still pending)
The prose originates from the planning meeting, where the product owner, the tester, and the developer agree to a variety of scenarios that act as valid, thorough, and automatable examples of the feature
Prose-based tests enable future readers to understand the business reasons driving the functionality

Together, these provide coverage of the capabilities, give traceability between requirements and the code, and create "living documentation" for the system. When planning focuses on creating these kinds of scenarios, it leads to closer collaboration and a better shared understanding. In fact, many devotees of BDD will say that the discussion and mindset leading to greater quality are the most important part.

It's fine for BDD scenarios to use APIs to do their work. It's often a good approach! But API testing is not the primary focus in these situations—or for Cucumber.

API Testing versus BDD

When we specifically talk about API testing, we mean positive and negative coverage of the API endpoints. In that case, we typically care about communication protocol and data-driven testing at the integration-test level—that is, we have a known URL, we send in some data, get some data back, and check that we get what we expect. We care quite a bit about the particular way we communicate with the endpoint.

It is not uncommon for people to use Cucumber (or another BDD framework) to test API endpoints. However, the features of Cucumber, such as plain-language tests and independence of steps within a scenario, don't really serve the needs of integration testing.

The use of plain language isn't necessary because it's not relevant for understanding an API endpoint. The details of how an API endpoint works usually sit at the technical-solution level rather than at the level of overarching capabilities or business needs.

It's a matter of perspective. What are you focused on when testing APIs? If the goal is to provide coverage of how the APIs work, then you are usually considering the "in the weeds" technical details: If I send an empty value for [name, age, height, etc.], I should expect to see the error field populated with a message indicating as much.
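As a rough illustration of that kind of "in the weeds" check, here is a minimal sketch in Python. The `create_person` function, its field names, and its status codes are hypothetical stand-ins for a real endpoint. An actual integration test would send an HTTP request to a known URL rather than call a function.

```python
# Hypothetical stand-in for a real endpoint; a real test would send an HTTP
# request to a known URL instead of calling a function.
REQUIRED_FIELDS = ("name", "age", "height")


def create_person(payload):
    """Return (status_code, body) the way a JSON endpoint might."""
    for field in REQUIRED_FIELDS:
        if payload.get(field) in ("", None):
            return 400, {"error": f"{field} must not be empty"}
    return 201, {"error": None, "person": payload}


VALID_PAYLOAD = {"name": "Ada", "age": 36, "height": 170}


def test_empty_required_field_populates_error():
    # One data-driven test covers every required field in turn.
    for field in REQUIRED_FIELDS:
        payload = dict(VALID_PAYLOAD, **{field: ""})
        status, body = create_person(payload)
        assert status == 400, field
        assert field in body["error"], field


def test_valid_payload_has_no_error():
    status, body = create_person(VALID_PAYLOAD)
    assert status == 201
    assert body["error"] is None
```

Under Pytest, the loop could be replaced with `pytest.mark.parametrize` so that each field is reported as its own test case.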

Furthermore, the independence between “given, when, and then” statements isn't needed because an endpoint test can be created and passed quickly.

(On the other hand, if you are driving the creation of new functionality by originating through the APIs, then it would make sense to use a BDD framework to do so. But I wouldn't call that API testing.)

When doing API testing, we shouldn't be creating all the underlying functionality. Rather, we should be focused on the interface that sits on top of already-created business code and deals with the fussy details of API protocols. Tests of that interface tend to be written and passed quickly, often following test-driven development. Pragmatically, since we can write and pass such a test in a matter of minutes or hours, there isn't a need to make the steps independent.

A Different Perspective

An example application, Demo, provides coverage of different test levels in different ways.

For the acceptance test level, which is capability-oriented, it uses Cucumber to support BDD. In this situation, we shouldn't care about a specific technology, but rather a capability provided to the customer. For example, an online seller wants the capability to sell stuff. How that gets done is a separate concern, but the underlying capability is to sell stuff.

Separately, for the integration level, there is a set of tests written in Python using the Pytest framework (basically, a Python unit test driver) that simply sends data to an endpoint and checks what it gets back (see test_api.py). This is where we want to write a multitude of tests that cover all the myriad ways an API endpoint may work, from both positive and negative sides.

The current tests in that file are primarily happy-path and business-oriented, but there are examples that push the technical bounds, like "test_math_api_with_non_numeric". Since this is more a toy than a true real-world enterprise app, the endpoints are simplistic, and thus there's not much sophistication to test. Real API endpoints tend to have more sophistication that requires significant testing to attain high coverage.
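The Demo project's own tests are not reproduced here. As a sketch of what a table of positive and negative cases might look like for a simplistic endpoint, assume a hypothetical `math_api` function that stands in for the HTTP call and returns a status code and a body:

```python
def math_api(operation, a, b):
    """Hypothetical stand-in for a toy math endpoint; returns (status, body)."""
    operations = {"add": lambda x, y: x + y, "multiply": lambda x, y: x * y}
    if operation not in operations:
        return 404, {"error": f"unknown operation: {operation}"}
    try:
        x, y = float(a), float(b)
    except (TypeError, ValueError):
        return 400, {"error": "operands must be numeric"}
    return 200, {"result": operations[operation](x, y)}


# Each row: operation, a, b, expected status, expected body.
CASES = [
    ("add", "2", "3", 200, {"result": 5.0}),
    ("multiply", "4", "2.5", 200, {"result": 10.0}),
    ("add", "two", "3", 400, {"error": "operands must be numeric"}),
    ("add", None, "3", 400, {"error": "operands must be numeric"}),
    ("divide", "1", "2", 404, {"error": "unknown operation: divide"}),
]


def test_math_api_cases():
    for operation, a, b, expected_status, expected_body in CASES:
        assert math_api(operation, a, b) == (expected_status, expected_body)
```

Adding another boundary case is a one-line change to the table, which is what makes this style practical for covering many small technical details.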

One other aspect that is unique to API testing (and distinct from BDD) is that coverage relates to how many of the endpoints have been hit. BDD coverage should always be about business capabilities. It's a different perspective.
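One way to make "how many of the endpoints have been hit" concrete is to compare an inventory of endpoints against the calls the tests actually made. The sketch below assumes both lists are available as (method, path) pairs. The inventory and the call log shown are invented for illustration.

```python
def endpoint_coverage(known_endpoints, exercised_calls):
    """Return (fraction covered, sorted list of endpoints no test touched)."""
    known = set(known_endpoints)
    hit = known & set(exercised_calls)
    missed = sorted(known - hit)
    fraction = len(hit) / len(known) if known else 1.0
    return fraction, missed


# Hypothetical endpoint inventory and a log of the calls the tests made.
KNOWN = [
    ("GET", "/items"),
    ("POST", "/items"),
    ("GET", "/items/{id}"),
    ("DELETE", "/items/{id}"),
]
CALLS = [
    ("GET", "/items"),
    ("POST", "/items"),
    ("POST", "/items"),
    ("GET", "/items/{id}"),
]

fraction, missed = endpoint_coverage(KNOWN, CALLS)
print(f"{fraction:.0%} of endpoints hit; untested: {missed}")
```

A measure like this says nothing about business capabilities, which is why it belongs with the API tests and not with the BDD scenarios.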

The True Purpose of Cucumber

In BDD you want to have as few tests as possible oriented toward a technology, like a particular API, a particular field on a page, or a particular button.

When you focus on tech (rather than capability), it hurts your tests in a few ways:

Technology changes often. Capabilities don't change nearly so often. The better you tie your tests to unchanging aspects (like capabilities), the less maintenance will be needed
The prose acts as documentation of your system's capabilities and should be accessible to all participants (especially the business, but it's also good for technical folks to understand business reasons behind things)
Practically speaking, Cucumber tests are a bit more difficult to work with than unit tests; they carry state between methods with class variables, many IDEs offer weaker debugging support for them, and it can be complicated to maintain the set of prose-to-code mappings

Many people misunderstand the purpose of Cucumber. One common paradigm I hope to see less often is using Cucumber as a general test vehicle because it seems to yield clearer, plain-language test scripts.

Testers are understandably yearning for test scripts that better describe what is being tested and why. However, there is a hidden trap.

Writing Cucumber tests entails creating glue code that ties the English prose to actual code using regular expressions. Creating that glue code is a small price when you are testing capabilities, but it’s very expensive when your goal is to test all the finicky, small-scale details. Glue code is extra code to be written and maintained and can easily be a source of disconnect (like when the glue-code writer doesn't really understand what was desired in the prose).
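To show what that glue looks like in principle, here is a hand-rolled sketch in Python. It is not Cucumber's API: the `step` decorator, the `run_scenario` runner, and the cart scenario are all invented for illustration, and real Cucumber bindings supply their own step-definition mechanism. The point is only that every prose line needs a matching regular expression and a function, and that state has to be shared between steps.

```python
import re

STEPS = []  # (compiled pattern, function) pairs: the prose-to-code mapping


def step(pattern):
    """Register a function as the glue for prose lines matching `pattern`."""
    def register(func):
        STEPS.append((re.compile(pattern), func))
        return func
    return register


@step(r"^Given a cart containing (\d+) items?$")
def given_cart(context, count):
    context["cart"] = int(count)


@step(r"^When the customer adds (\d+) more items?$")
def when_add(context, count):
    context["cart"] += int(count)


@step(r"^Then the cart contains (\d+) items?$")
def then_cart(context, count):
    assert context["cart"] == int(count)


def run_scenario(text):
    """Run each prose line through the first glue function whose regex matches."""
    context = {}  # state shared between steps
    for line in filter(None, (raw.strip() for raw in text.splitlines())):
        for pattern, func in STEPS:
            match = pattern.match(line)
            if match:
                func(context, *match.groups())
                break
        else:
            raise LookupError(f"no glue code for step: {line!r}")
    return context


SCENARIO = """
Given a cart containing 2 items
When the customer adds 1 more item
Then the cart contains 3 items
"""

run_scenario(SCENARIO)
```

Three lines of prose already need three patterns and three functions. Rewording any line in the scenario without updating its pattern raises the `LookupError`, which is the kind of disconnect that grows costly when the prose describes hundreds of small technical cases.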

Ultimately, Cucumber is a BDD framework, and it should be used solely to support BDD. API testing focuses on coverage of the API endpoints and is more oriented to the technical solution, unlike BDD testing, which is oriented to business capabilities. By focusing on the underlying capabilities of the system, BDD tests won’t represent an ongoing maintenance cost to the team.

Even though it’s tempting, you should choose a tool other than Cucumber for your API tests.

This article was originally published July 22, 2019, on StickyMinds.com.