How Canaries Help Us Merge Good Pull Requests

At WordPress.com we strive to provide a consistent and reliable user experience as we merge and release hundreds of code changes each week.

We run automated unit and component tests for our Calypso user interface on every commit pushed to every pull request (PR).

We also have 32 automated end-to-end (e2e) test scenarios that, until recently, we would only automatically run across our platform after merging and deploying to production. While these e2e scenarios have found regressions fairly quickly after deploying (the 32 scenarios execute in parallel in just 10 minutes), they don’t prevent us from merging and releasing regressions to our customer experience.

Introducing our Canaries

Earlier this year we decided to identify three of our 32 automated end-to-end test scenarios that would act as our “canaries”: a minimal subset of automated tests to quickly tell us if our most important flows are broken. These tests execute after a pull request is merged and deployed to our staging environment, but before we deploy the changes to all our customers in production.

These canaries have been very successful in preventing us from deploying regressions to production. However, running them only after merging to master (and automatically deploying code to staging) meant we had to revert code changes if something was wrong. This wasn’t good enough.

Last month we took our canaries to the next level. Instead of just running canaries on merging to master, we now execute canaries against live pull requests and provide feedback to the pull request itself about the canary test status.

How does it work?

If you’re a developer working on a pull request for Calypso and it’s ready for review, you add the “[Status] Needs Review” label to ask someone to review your code. Adding this label automatically triggers the e2e canary tests against your pull request.

The canary results are reported separately from the unit and component tests, which already run against every pull request (on every push).
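As a rough illustration of the label trigger described above, here is a minimal sketch of how a webhook handler might decide whether a GitHub delivery should start the canaries. It is not the code our bridge actually runs: the function name `canaryTrigger` and the shape of its return value are assumptions made for this example. The payload fields it reads (`action`, `label.name`, `number`, `pull_request.head`) are the ones GitHub sends with a `pull_request` event.

```javascript
// Label name taken from the article; everything else here is illustrative.
const TRIGGER_LABEL = "[Status] Needs Review";

/**
 * Decide whether a GitHub webhook delivery should start the canary tests.
 * `eventName` is the value of the X-GitHub-Event header and `payload` is the
 * parsed JSON body. Returns the details needed to start a run, or null.
 * (Hypothetical helper, not the bridge's real function.)
 */
function canaryTrigger(eventName, payload) {
  if (eventName !== "pull_request") return null; // ignore pushes, comments, etc.
  if (payload.action !== "labeled") return null; // only react when a label is added
  if (!payload.label || payload.label.name !== TRIGGER_LABEL) return null;

  const head = payload.pull_request.head;
  return { prNumber: payload.number, branch: head.ref, sha: head.sha };
}

// Usage with a trimmed-down example payload (values are made up).
const examplePayload = {
  action: "labeled",
  number: 123,
  label: { name: "[Status] Needs Review" },
  pull_request: { head: { ref: "update/example-branch", sha: "abc123" } },
};
console.log(canaryTrigger("pull_request", examplePayload));
```

Keeping the decision in a small pure function like this makes it easy to unit test without sending real webhooks.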

How does this technically work?

Our automated e2e tests are open-source, but they reside separately from our Calypso GitHub code repository. This is because the e2e scenarios represent the entire WordPress.com customer experience: they’re not just automated Calypso user interface tests. For example, our tests include verifying that our customers receive appropriate emails that are not part of the Calypso code base.

We “connect” our two projects using CircleCI builds and a custom “bridge” written in Node.js (which is also open-source). This bridge provides webhooks for GitHub pull requests to execute CircleCI builds using the CircleCI API. It reports the status of these builds using the GitHub status API. We also apply a little bit of cleverness: the bridge matches branch names, so we can make changes to our e2e tests that correspond to changes in Calypso. Our bridge runs on Automattic’s VIP Go platform.
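To make the two bridge ideas above more concrete, here is a hedged sketch of branch matching and status reporting. It assumes that "matching branch names" means using an e2e branch with the same name as the Calypso branch when one exists, and falling back to a default branch otherwise. The helper names, the `ci/e2e-canaries` context string, and the build outcome strings are assumptions for illustration. The returned object uses the fields the GitHub commit status API accepts (`state`, `target_url`, `description`, `context`).

```javascript
// Assumption: run the e2e tests from a same-named branch if the e2e repository
// has one, otherwise from the fallback branch.
function pickE2eBranch(calypsoBranch, e2eBranches, fallback = "master") {
  return e2eBranches.includes(calypsoBranch) ? calypsoBranch : fallback;
}

// Hypothetical mapping from a CI build outcome to a GitHub commit status state.
// GitHub accepts exactly four states: error, failure, pending, success.
const STATE_BY_OUTCOME = new Map([
  ["success", "success"],
  ["failed", "failure"],
]);

// Build the JSON body for a GitHub commit status from a finished or running build.
// `build` is an illustrative object: { outcome: string | null, url: string }.
function toCommitStatus(build) {
  let state;
  if (build.outcome === null || build.outcome === undefined) {
    state = "pending"; // build has not finished yet
  } else {
    state = STATE_BY_OUTCOME.get(build.outcome) || "error"; // anything unexpected
  }
  return {
    state,
    target_url: build.url, // link back to the CI build
    description: `e2e canaries: ${state}`,
    context: "ci/e2e-canaries", // keeps this check separate from other checks
  };
}

// Usage with made-up values.
console.log(pickE2eBranch("update/example-branch", ["master", "update/example-branch"]));
console.log(toCommitStatus({ outcome: "failed", url: "https://example.com/build/1" }));
```

Using a distinct `context` value is what lets the canary result show up on the pull request as its own check, alongside the unit and component test results.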

A summary and what’s next?

Running our canaries on pull requests has been a great success. Developers love the confidence the canaries give them in knowing that our key end-to-end scenarios won’t regress when introducing changes rapidly.

We’d now like to expand the bridge’s scope to optionally run the full set of 32 end-to-end automated tests on pull requests that have a broader impact, such as upgrading a dependency or refactoring a framework design pattern. This will give our developers even greater confidence in their ability to merge code and provide a consistent and reliable experience to our customers.

Get involved!

Feel free to check out our e2e tests repository or our bridge repository, make a fork, and send us any feedback or suggestions. Pull requests are always welcome.

***

Alister Scott is an Excellence Wrangler for Automattic and blogs regularly about software testing at WatirMelon.