Test Driven Development in the Clang Compiler

A while back, I participated in a software engineering capstone with a group of other computer science students to complete my degree. Our project was to create a from-scratch implementation of grsecurity’s “randstruct” GCC plugin for the Clang compiler. Long story short, we ended up sending out a request for comments (RFC) on the initial draft that we produced during the capstone. A number of Clang/LLVM contributors took the time to review what we made and kindly suggested some changes for a future revision.

Since then, not much has happened with the Clang Randstruct RFC. We (the members of the capstone team) have all gone our separate ways and have been busy settling into post-university life.

Over the past week or so I’ve been re-familiarizing myself with the project and working on a new revision that addresses all of the comments from the RFC. The primary concern with the RFC was test coverage — something that we kind of scratched our heads over during the implementation of our RFC draft.

The entire time, we had been compiling source code with and without our feature and manually inspecting structure layouts with pahole, gdb, or even just a test program that calculated and printed the offsets of a structure’s fields. The expectation for that last method was, of course, that the offsets would differ from the ones the unmodified compiler produces.
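A minimal sketch of such an offset-printing program follows. The three-field struct and the helper names FieldOffsets and PrintFieldOffsets are hypothetical, made up for illustration; this is not the code we actually used.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical toy struct, used only for illustration.
struct test_struct {
    int a;
    int b;
    int c;
};

// Collect the byte offset of each field, in declaration order.
std::array<std::size_t, 3> FieldOffsets()
{
    return {offsetof(test_struct, a), offsetof(test_struct, b),
            offsetof(test_struct, c)};
}

// Print the offsets so the output of two compiler builds can be compared.
void PrintFieldOffsets()
{
    const char* Names[] = {"a", "b", "c"};
    const auto Offsets = FieldOffsets();
    for (std::size_t I = 0; I < Offsets.size(); ++I) {
        // Whatever the layout, every field must start inside the struct.
        assert(Offsets[I] < sizeof(test_struct));
        std::printf("%s: %zu\n", Names[I], Offsets[I]);
    }
}
```

Calling PrintFieldOffsets() from main and building with an unmodified compiler should print the fields at increasing offsets in declaration order. With layout randomization enabled, the same source would be expected to print a different set of offsets, and that difference is exactly what we were checking by eye.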

Can we even automate that? What do we do? Do we compile something and compare a diff of the assembly? A diff of the pahole output? Is that even accurate? What kinds of inconsistencies or “noise” would throw that off? I mean, we could certainly compare the outputs of the offset calculation test programs, but what kind of coverage would you call that? Toy programs don’t even scratch the surface of the complexity of many “real” software projects.

It turns out we were overcomplicating it. I don’t mean that in a rude way. When I’m introduced to a problem area for the first time, I’ve found that my first solutions often overcomplicate things. This is a normal learning experience.

We felt slightly lost in the massive, complex project that is the Clang compiler. We weren’t completely in the dark, but we were holding a candle, not a lighthouse (granted, one could never hold a lighthouse; the analogy here is about illumination). Clang is huge and complex, but it also exposes powerful interfaces into its internal machinery.

Last week I started poking around the unittests/ folder to get a feeling for how the tests are written in the first place. After a little bit of searching, I found this:

AST0 = tooling::buildASTFromCodeWithArgs(Code0, Args, InputFileName);

You can’t see me, but I’m wiping a tear from my eye. It’s beautiful. This function parses the code that lives in the string called Code0 with the compiler arguments stored in Args (the InputFileName is unused for my purposes, since the code is in the string). It returns the abstract syntax tree that Clang builds for that code. It is the star of this blog post.

Our entire project is predicated on manipulating the representation of a RecordDecl (a structure) in Clang’s Abstract Syntax Tree. This is a huge win.

I like test driven development. I also know that test coverage is primarily what needs to be addressed for the next revision of this if there’s any hope of getting this code upstream. So I’m thinking: why not outline all of the tests that were suggested by the reviewers and try to “rewrite” Clang Randstruct following the principles of test driven development? This way I can more easily address and iterate on all of the concerns outlined for this next revision while increasing test coverage the whole time! (And maybe fix a bug or two along the way.)

So, without further ado, I made a file in the unittests/AST/ directory and wrote these two functions:

static std::unique_ptr<ASTUnit> MakeAST(const std::string& SourceCode, Language Lang)
{
    auto Args = getBasicRunOptionsForLanguage(Lang);
    auto AST = tooling::buildASTFromCodeWithArgs(SourceCode, Args, "input.cc");
    return AST;
}

static RecordDecl* GetRecordDeclFromAST(const ASTContext& C, const std::string& Name)
{
    return FirstDeclMatcher<RecordDecl>().match(C.getTranslationUnitDecl(), recordDecl(hasName(Name)));
}

The first function taps into Clang’s tooling library to produce an AST for some code. The second function simply fetches the first RecordDecl from the AST with the matching name.

Here’s an example. This is one of the first tests I wrote when I was ready to begin porting over the randomization code from our RFC (since this is a TDD rewrite, after all):

TEST(StructureLayoutRandomization,
     StructuresLayoutFieldLocationsCanBeRandomized)
{
    std::string Code =
        R"(
        struct test_struct {
            int a;
            int b;
            int c;
            int d;
            int e;
            int f;
        };
        )";

    auto AST = MakeAST(Code, Lang_C);
    auto RD = GetRecordDeclFromAST(AST->getASTContext(), "test_struct");
    RandomizeStructureLayout(AST->getASTContext(), RD);
    std::vector<std::string> before = {"a", "b", "c", "d", "e", "f"};
    std::vector<std::string> after = GetFieldNamesFromRecord(RD);

    ASSERT_NE(before, after);
}

Obviously, this won’t compile yet, since the test calls RandomizeStructureLayout and GetFieldNamesFromRecord before either one exists. I don’t want to belabor the principles of test driven development here, but the test is written first, then we write just enough implementation code to get the test to compile and pass.
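One thing worth noting about the test above: ASSERT_NE only checks that the order changed. A stricter test might also check that no field was lost or duplicated along the way. Here is a rough, Clang-free sketch of that idea. ShuffleFieldNames, HasSameFields, CheckShuffle, and the seed parameter are all hypothetical stand-ins invented for this example; this models the property being tested and is not the Randstruct algorithm.

```cpp
#include <algorithm>
#include <cassert>
#include <random>
#include <string>
#include <vector>

// Hypothetical stand-in for a layout randomizer: shuffles field names using
// a caller-supplied seed. It models the idea only, not the real algorithm.
std::vector<std::string> ShuffleFieldNames(std::vector<std::string> Names,
                                           unsigned Seed)
{
    std::mt19937 Rng(Seed);
    std::shuffle(Names.begin(), Names.end(), Rng);
    return Names;
}

// True when After holds exactly the fields of Before, in any order.
bool HasSameFields(const std::vector<std::string>& Before,
                   const std::vector<std::string>& After)
{
    return Before.size() == After.size() &&
           std::is_permutation(Before.begin(), Before.end(), After.begin());
}

// Mirrors the shape of the unit test: shuffle, then check the property.
void CheckShuffle(const std::vector<std::string>& Before, unsigned Seed)
{
    const auto After = ShuffleFieldNames(Before, Seed);
    assert(HasSameFields(Before, After)); // nothing lost or duplicated
}
```

A fixed seed makes the shuffle repeatable from run to run on the same toolchain. That matters for an "order changed" assertion, because an unseeded uniform shuffle of six fields would land on the original order once in 720 tries.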

To build enough of Clang just for our unit tests to run, make -j4 ASTTests will suffice.

The Clang test suite is massive, so running make clang-test is a great way to lose an afternoon to compilation times.

Similarly, running all of ASTTests will age you. We can run just our tests by passing a gtest filter:

$ ./tools/clang/unittests/AST/ASTTests --gtest_filter='StructureLayoutRandomization*'

Now, rinse and repeat.

This has resulted in what I consider to be a much tighter and more responsive development cycle than before. It is much faster to compile ASTTests than it is to compile Clang and point it at a test program manually. It has eliminated manual structure layout inspection. Failing an assertion results in a more human-friendly “this does not equal that” type of message. Since each change was motivated by the appearance of a unit test, it’s much easier to track down where a bug was introduced.

Practicing test driven development inside Clang has been a really motivating experience. I’ve rewritten most of the functionality with automated test coverage executing every code path (which is a sharp increase from 0%). Not only that, but adding some of the unit tests suggested by the upstream contributors has helped guide the implementation and verify correctness. I was able to refactor some of the randomization algorithm without breaking a sweat since I already had test coverage to help verify my assumptions about how everything should work!

Not to mention, the test suite provides a much shorter feedback time. I also no longer need to maintain small test programs and compile each one twice, once with Randstruct enabled and once with it disabled. Since it’s automated, there is no manual structure layout inspection required, which has helped me iterate on new features much more quickly. This is an important factor that helps me stay motivated.

Every large, unfamiliar code base seems nebulous at first, and many of its helpful internal features can be hard to discover. This post is an ode to Clang’s internals. I’m just that happy to have stumbled upon this foothold for working on the next revision.