Part 1: Instrumentation
I have always been fascinated by the idea of being able to measure code quality. Recently, I had a chance to dig in and understand how a code coverage tool works under the hood. It was a fun exercise, so I thought I'd share.
Code coverage is the percentage of your codebase that is exercised by your test suite. It is a simple metric but arguably a strong indicator of code quality.
In this post (and the next), we are going to try to understand how code coverage is measured and reported. We will learn this by building a simple code coverage tool along the way. Excited? Let’s begin.
It is intuitive to think of code coverage as a process of finding out if all the lines in our source code are covered by our tests. Lines don’t mean much though. A more fundamental structural unit of a program that is relevant to us is a statement.
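The post's sample program isn't reproduced here, so here is a small stand-in of our own to make the exercise concrete:

```javascript
// A stand-in sample program (not the post's original snippet).
// How many statements does it contain?
function isEven(n) {
  if (n % 2 === 0) {
    return true;
  }
  return false;
}

var result = isEven(4);
console.log(result);
```

Counting the simple statements gives six: the function declaration, the if statement, the two return statements, the variable declaration, and the console.log call. Whether the block statements count too is a design decision we will come back to.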
Before we measure coverage, we first need to find out the total number of statements in the source. As an exercise, look at the code above and try to estimate the number of statements in it.
Done? Now comes the hard part: figuring out how many of those statements are actually evaluated when our tests run. At this point, if you are wondering — “Can we simply add counters? One for each statement?” — that is exactly what we are going to do.
This is the idea that is central to code coverage: We are going to take the source program, modify it (without altering its behaviour) to add counters and use the modified source for testing in place of the original one.
There are three steps involved in the instrumentation process:
- Parsing our source into an intermediate code representation that is suitable for modification.
- Modifying the intermediate code representation as we need.
- Regenerating our source from the intermediate code representation.
If you come from a web development or web content extraction background (or have read Praveen’s previous post), you may be familiar with the DOM — a tree representation of the HTML document that, with its APIs, lends itself much more nicely to analysis and mutation of a web page when compared to the raw HTML document itself.
The idea behind parsing the source code is similar. As our intent here is to modify the code safely, we need a data structure that is suitable for querying and mutation. The source, by itself, falls short of meeting this goal.
A parser takes the source program, uses the language’s grammar to validate its syntax, and builds what is called an Abstract Syntax Tree (similar to the DOM tree). This is the intermediate code representation that I alluded to in the previous section.
This is what the code to parse the source looks like:
You can visualise and interact with the AST generated from our source here.
This is the fun part!
Let us traverse the syntax tree handed over to us by the parser and try to add in our counters and the counting logic. We are going to use two functions to understand how to do this: onEachPath() & onExitProgram():
Let us assume that our onEachPath() function will be called once for each path (between two nodes) in the AST. Once we confirm that we are indeed dealing with a statement, we do four things:
- Label our statement with a serial number — a statement ID.
- Initialise a counter for our statement.
- Stash the location information for our statement. This comes in handy during the Visualization phase when we will want to highlight uncovered statements in the code.
- Insert an update expression before our statement (can you think of reasons why we don’t add it after?) that increments our counter: __coverage__.c[x]++, where x is our statement ID.
Let us assume that our onExitProgram() function will be called after the traversal of the entire tree is complete. At this point, we stuff the collected statement locations and counter initialisations into our program, at the top.
We pick a variable starting with a double underscore — __coverage__ as our coverage variable to keep the odds of it colliding with the name of another variable in the source low.
Now that we have modified our syntax tree the way that we wanted to, we can generate the instrumented source code from it. All together now:
And this is what our instrumented source code looks like:
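The post's generated output isn't shown; instrumenting a two-statement program such as `var a = 1; console.log(a);` would produce something in this spirit (hand-written illustration, with locations trimmed for brevity):

```javascript
// Hand-written illustration of instrumented output.
var __coverage__ = {
  c: { 0: 0, 1: 0 },
  locs: {
    0: { start: { line: 1, column: 0 } },
    1: { start: { line: 2, column: 0 } }
  }
};
__coverage__.c[0]++;
var a = 1;
__coverage__.c[1]++;
console.log(a);
```

Running the instrumented program leaves each counter at 1, since both statements were evaluated exactly once.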
And that’s a wrap!
In this post, we learnt how to take our single-file source program and instrument it to measure useful information at run-time, like the number of evaluated statements.
In the next post, we’ll learn how to:
- Integrate our instrumented source code with our test runner.
- Report the collected coverage metrics.
- Think about extending our tool to cover block statements.
If you are interested in the source code for this exercise, you can find it on GitHub.