Measuring Quality

Jo Mahadevan
4 min read · Dec 1, 2019

In software development we always talk about measures. There are several key performance indicators (KPIs) and metrics in the agile software development process which provide guidance for strategic planning, evaluation and improving operational processes. Some of the most common KPIs in agile are cumulative flow, sprint burndown, velocity, cycle time, code coverage, automated vs manual tests, et cetera. I was in a workshop recently which was dedicated to measures; several types of measures were discussed, which got me thinking more deeply about them.

Measures are important and don’t need to be associated with a number. Happiness, love, feeling appreciated or valued: everything has a measure. So at the workshop, I kept reflecting on how measuring something is such an important indicator of how far away you are from an objective or goal, provided it is measured properly and regularly, and provided these measures feed into the maturity and pace of your journey towards your goal.

The journey towards a goal is not about intensity but rather the accumulation of lots of little things that, by themselves, are innocuous. But those boring things, when done consistently and measured regularly, help you pace your journey towards the goal. For instance, one of the scrum masters my team and I worked with would insist on adding appropriate comments to the story before progressing it to the next stage. We, as engineers, found this completely pointless and chose to just not do it, but the scrum master was persistent and ensured that we did the task until it became a habit. We were not doing continuous deployment at that time, but when we did plan a release, after every 2 sprints, the comments served as micro documentation and provided good traceability. The measure here was fuzzier: we as a team agreed that writing comments on the story was a good practice, and thus it became a part of our definition-of-done review.

Going back to the workshop, I was wrapped up in the conundrum of what exactly these measures mean to me. I thought about how I had measured the quality of software thus far, and how I would like to measure quality going forward.

I never measured quality based on the number of post-live issues, the number of automated vs manual tests, the time taken to deliver something, or how well the automation test pyramid was established in the teams. Although these are objectively decent indicators of where we were in the journey, in my opinion they were never a good measure of quality. I measured quality by the purpose of the planned tests, the frequency at which those tests ran, and the time taken to get feedback. The quality of the tests themselves was measured by whether they failed for the right reasons, and by their maintainability and reusability.

The types of tests a team chooses for ensuring quality are not just about fitting into the pyramid, such as 70% unit tests, 20% service layer tests and 10% UI driven tests. These are guidelines, not a required measure of quality!

Consider a simple architecture: a web page that interacts with an API. On a successful POST request, the users of the web page would get an order number in a modal window on the same page, and the database would be updated. The order was further processed by a desktop application, and the API also consumed a third party API to get a specific type of data.

So here there was no need for a UI driven automated test for the web page. It would be exploratory tested for typos, compatibility and accessibility. The API did most of the work, so while all the parameters were validated by unit tests, the behaviour of the API was tested on a dev environment. We decided to run these tests every time a new commit triggered a deployment to dev. The journey around completing the order was tested once, manually, for regression prior to releasing on a staging environment, as the desktop application was not owned by us and our web page didn’t change any process in the desktop application.

So now we had a measure.

  • Unit tests: validated each unit of the code (purpose); ran after every commit (frequency)
  • API tests: integration / smoke test (purpose); ran after every deployment to dev (frequency)
  • UI tests: sanity (purpose); run only once, as there were no changes made (frequency)
  • Regression tests: validation of the end-to-end journey (purpose); run before a planned release, as the risk of failure was low (frequency)
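
A measure like this can also be written down as data, which makes it easy to review with the team. What follows is only a minimal Python sketch, not the tooling we actually used; the names `Layer`, `Trigger`, `TEST_PLAN` and `layers_for` are hypothetical, and all it shows is how each layer pairs a purpose with the event that triggers it.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Trigger(Enum):
    """When a layer of tests runs (hypothetical names for illustration)."""
    COMMIT = "after every commit"
    DEV_DEPLOYMENT = "after every deployment to dev"
    ONCE = "once, as nothing changed"
    PLANNED_RELEASE = "before a planned release"


@dataclass(frozen=True)
class Layer:
    name: str
    purpose: str
    trigger: Trigger


# The four layers from the list above, expressed as data.
TEST_PLAN = [
    Layer("Unit tests", "validate each unit of the code", Trigger.COMMIT),
    Layer("API tests", "integration / smoke test", Trigger.DEV_DEPLOYMENT),
    Layer("UI tests", "sanity", Trigger.ONCE),
    Layer("Regression tests", "validate the end-to-end journey", Trigger.PLANNED_RELEASE),
]


def layers_for(trigger: Trigger) -> list[str]:
    """Return the names of the layers that run on the given trigger."""
    return [layer.name for layer in TEST_PLAN if layer.trigger is trigger]


if __name__ == "__main__":
    for layer in TEST_PLAN:
        print(f"{layer.name}: {layer.purpose} ({layer.trigger.value})")
```

Asking `layers_for(Trigger.COMMIT)` then answers the question "what runs on every commit?" without anyone having to remember it.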

As the application grew, we added more layers. Every time we added a new layer of tests, it served a purpose: to ensure quality. The frequency was decided based on the risk of something breaking and the impact it would have. The more frequent (fast) tests were run earlier in the pipeline to production. Production was then monitored, and the metrics fed into the maturity of our tests.
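
To make the idea of risk-based frequency concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: the `Suite` class, the 1 to 3 scales for risk and impact, the score thresholds and the run times are all made up, and a real team would agree its own values. It only shows the two decisions described above: fast suites go first in the pipeline, and the riskier a failure is, the more often the suite runs.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Suite:
    name: str
    minutes: float  # assumed typical run time
    risk: int       # assumed scale: 1 (unlikely to break) to 3 (likely)
    impact: int     # assumed scale: 1 (minor) to 3 (severe)


def pipeline_order(suites: list[Suite]) -> list[Suite]:
    """Put the fastest suites first so feedback arrives early."""
    return sorted(suites, key=lambda s: s.minutes)


def run_frequency(suite: Suite) -> str:
    """Map risk x impact to a frequency; the thresholds are illustrative only."""
    score = suite.risk * suite.impact
    if score >= 6:
        return "every commit"
    if score >= 3:
        return "every deployment to dev"
    return "before a planned release"


if __name__ == "__main__":
    suites = [
        Suite("regression", minutes=60, risk=1, impact=2),
        Suite("unit", minutes=1, risk=3, impact=2),
        Suite("api", minutes=5, risk=2, impact=2),
    ]
    for suite in pipeline_order(suites):
        print(f"{suite.name}: {run_frequency(suite)}")
```

The numbers matter less than the conversation they force: for each new layer, the team has to say how likely a break is and how much it would hurt.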

In conclusion, our approach to quality changes as our application grows. Proper measures help us improve, and set us on the right path. The aim is not to build a test pyramid or to automate everything; the aim is to build a robust quality approach such that changes to the software can be accommodated without impacting delivery.