Testing Behaviours — Writing A Good Gherkin Script
Gherkin has long been a popular tool in the testing community for implementing automated tests.
When I was first introduced to Gherkin several years ago, I was fascinated by the idea of my test suites being driven by a Gherkin layer. I first used Gherkin through SpecFlow for C#. Two main things attracted me to it:
- Being able to explain a piece of functionality in a simple format helped establish a common language between those who code and those who don’t.
- I also had a better understanding of my test coverage: I knew which features had been automated, and I could co-ordinate better with the exploratory testers in my team. Jenkins has good plugins for generating reports of a test run, and this made it easier to communicate results to the team.
However, I eventually realised that maintaining the Gherkin test suite was proving to be a challenge. There were too many scenarios, all of which looked like they were doing the same thing. I couldn’t differentiate actions from behaviours. I discussed this with the wider community, and their experiences mostly reflected my own. I looked at various test frameworks in my organisation that used Gherkin, found a few patterns, and realised that with a few changes we could make the Gherkin test suite more useful. So here are a few tips.
- A TEST SUITE NAMED BDD
Often a Gherkin-driven test suite is called BDD, and sometimes the test project itself is called testing. Describing a scenario with “Given When Then” is not BDD. Gherkin is just a layer on top of our test framework. That framework could be used to run functional tests, API tests, acceptance tests or regression tests. Calling it BDD is misleading.
- REUSABLE STEP DEFINITIONS
Let’s curb the urge to create reusable step definitions. However tempting it may seem, in the end it’s pointless. The behaviour of a feature under test is verified and validated by a group of actions, and that behaviour differs from one scenario to another. We often create reusable step definitions for actions and forget about the behaviour. For example, if you are creating a step definition in Python:
@step('I click on the button : {button_name}')
The above definition can be used for every click on a button. It seems very flexible, but it could potentially destroy my Gherkin tests: I could end up clicking all of these buttons in a scenario to validate a behaviour, even though I don’t need to describe those actions in Gherkin.
And I click on the button : submit
And I click on the button : save
And I click on the button : start
And I click on the button : continue
If the test fails on any one of those numerous button clicks, there is no way to tell why it happened or which scenario caused the failure. It is very rarely because a button went missing.
So, separate actions from behaviours, and only create reusable step definitions when absolutely necessary. They should not be the focus when designing a test suite driven by Gherkin.
- BEHAVIOURS AND ACTIONS
A good Gherkin scenario always describes a behaviour. As stated above, the focus should be on the behaviour, not on the actions that achieve it. Actions can be carried out programmatically; they don’t need to be described in Gherkin.
The example below may help make this clearer.
A Gherkin script that focuses on the behaviour, with all the actions carried out programmatically:
Scenario: A prime user can opt for same day delivery option
Given that a prime user is accessing their amazon account
When the user purchases a product eligible for prime delivery
Then the user has an option to choose same day delivery at the checkout
A Gherkin script that achieves the same thing as above, but with too many actions described and implemented as step definitions:
Scenario: A prime user can opt for same day delivery option
Given that I login as an amazon prime user
Then I'm on the amazon home page
When I enter cat food in the search box
And I click the search button
Then I'm presented with several choices
When I click the cat food blah with prime delivery
Then I'm taken to the cat food blah page
When I add the food into my basket
And I click view my basket
And I click proceed to checkout
Then I have an option for same day delivery
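As a rough sketch of how the first, behaviour-focused scenario could keep its actions in code, the three steps might look something like the functions below. Everything here is hypothetical: `FakeShop` stands in for the real application, and the function names are made up for illustration. With a tool such as behave, these functions would normally be bound to the step text with decorators and receive a `context` argument; plain functions are used here only to keep the example self-contained.

```python
from dataclasses import dataclass, field


@dataclass
class FakeShop:
    """Hypothetical stand-in for the application under test."""
    prime_user: bool = False
    basket: list = field(default_factory=list)

    def log_in(self, prime: bool) -> None:
        self.prime_user = prime

    def search(self, term: str) -> list:
        # Pretend catalogue: every hit is eligible for prime delivery.
        return [f"{term} (prime eligible)"]

    def add_to_basket(self, product: str) -> None:
        self.basket.append(product)

    def checkout_options(self) -> list:
        options = ["standard delivery"]
        if self.prime_user and self.basket:
            options.append("same day delivery")
        return options


# One function per Gherkin step. The searching, clicking and basket
# handling live in here, not in the feature file.
def given_prime_user_is_accessing_account(shop: FakeShop) -> None:
    shop.log_in(prime=True)


def when_user_purchases_prime_eligible_product(shop: FakeShop) -> None:
    product = shop.search("cat food")[0]
    shop.add_to_basket(product)


def then_same_day_delivery_is_offered(shop: FakeShop) -> None:
    assert "same day delivery" in shop.checkout_options()


def run_scenario() -> list:
    """Run the three steps in order against a fresh shop."""
    shop = FakeShop()
    given_prime_user_is_accessing_account(shop)
    when_user_purchases_prime_eligible_product(shop)
    then_same_day_delivery_is_offered(shop)
    return shop.checkout_options()
```

If the search box or the basket page changes, only the body of the "When" function needs to change; the scenario text stays the same.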
- INTERDEPENDENT SCENARIOS WITHIN A FEATURE
A feature file in a test suite could have several scenarios. Each of these scenarios should be an individual, independent test. In other words, at any given point, any single scenario should be able to be tagged and run on its own. I have seen test suites implemented such that the result of the first scenario in the feature becomes the setup for the following scenario.
This is a very dangerous setup, as a failure in one of your scenarios would break every scenario that follows it, and it would be very difficult to debug where the error happened.
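One way to keep scenarios independent is to have each one build its own state from a known baseline rather than inherit whatever the previous scenario left behind. The sketch below illustrates the idea with plain Python; `BASELINE`, `fresh_context` and the two scenario functions are hypothetical names. In behave, this kind of reset would typically live in a `before_scenario` hook.

```python
import copy

# Hypothetical baseline data that every scenario starts from.
BASELINE = {"basket": [], "logged_in": False}


def fresh_context() -> dict:
    """Return a new, isolated copy of the baseline for one scenario."""
    return copy.deepcopy(BASELINE)


def scenario_add_item() -> dict:
    ctx = fresh_context()
    ctx["logged_in"] = True
    ctx["basket"].append("cat food")
    assert ctx["basket"] == ["cat food"]
    return ctx


def scenario_basket_is_empty_after_login() -> dict:
    # Builds its own state; it does not reuse scenario_add_item's result.
    ctx = fresh_context()
    ctx["logged_in"] = True
    assert ctx["basket"] == []
    return ctx
```

Because neither function depends on the other, they can be run in any order, or one at a time, and still give the same result.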
- TOO MANY ASSERTS
I have seen “Then” statements placed straight after “Given” steps to assert that the setup done in “Given” was successful.
For example, if the test is to verify that an application can be successfully submitted, the “assert” statement would be expected to read:
Then the user is notified of successful submission
On the other hand, it is common to be tempted into writing statements like the ones below:
Then the user is notified of successful submission
And the submit button is disabled
And the home link is available
And the user has an option to start a new application
Too many asserts dilute the focus of what you are testing. If the tests are focussing on the functional aspects of the system under test, then the asserts should aim at validating those. Mixing all these checks together makes the Gherkin look ugly, doesn’t help distinguish the type of test you plan to run for your application, and again takes the focus away from behaviour.
- GHERKIN FOR EVERYTHING
Sometimes we don’t need to use Gherkin, and that’s okay. The aim is to have a robust test suite that provides fast feedback, and it doesn’t necessarily have to be driven by a Gherkin layer.
Overall, my take is that all the layers within a system work towards achieving an agreed behaviour. So, if Gherkin scripts are meant to help describe that behaviour in simple language, then the same Gherkin script could be used to run any test across the test pyramid, without having to change the script.
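To illustrate how one script might serve several layers of the pyramid, the sketch below has the same two steps delegate to interchangeable drivers. It is only an outline under stated assumptions: `SubmissionDriver`, `ApiDriver` and `UiDriver` are hypothetical, and the drivers return canned values instead of talking to a real service or browser.

```python
from typing import Protocol


class SubmissionDriver(Protocol):
    """What a step needs from any layer of the system."""

    def submit_application(self, applicant: str) -> str: ...


class ApiDriver:
    """Hypothetical driver that would call the service layer directly."""

    def submit_application(self, applicant: str) -> str:
        return f"submitted:{applicant}"


class UiDriver:
    """Hypothetical driver that would drive a browser."""

    def __init__(self) -> None:
        self.actions = []

    def submit_application(self, applicant: str) -> str:
        # The UI-level actions are recorded here, never in the Gherkin.
        self.actions += ["open form", "fill name", "click submit"]
        return f"submitted:{applicant}"


# Step: "When the user submits an application"
def when_user_submits_application(driver: SubmissionDriver, applicant: str) -> str:
    return driver.submit_application(applicant)


# Step: "Then the user is notified of successful submission"
def then_user_is_notified(result: str, applicant: str) -> None:
    assert result == f"submitted:{applicant}"
```

Which driver is used could be decided by configuration at run time, so the feature file itself would not need to know whether it is exercising the API or the UI.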