Augur Development Docs for Contributors

Thank you for helping us to improve Augur! This document describes:

  • Getting Started

  • Contributing code

    • Running local code changes

    • Testing

    • Releasing

    • Maintaining Bioconda package

    • Continuous integration

  • Contributing documentation

    • Formats (Markdown and reStructuredText)

    • Documentation structure

    • Building documentation

Getting started

To be an effective, productive contributor, please start by reading the Nextstrain contributing guide for useful information about how to pick an issue, submit your contributions, and so on.

This project strictly adheres to the Contributor Covenant Code of Conduct.

Please see the project board for currently available issues.

Contributing code

We currently target compatibility with Python 3.6 and higher. As Python releases new versions, the minimum target compatibility may be increased in the future.

From 3.0.0 onwards, versions of this project, Augur, aim to follow the Semantic Versioning rules.

Running local changes

While you are making code changes, you will want to run augur to see its behavior with those changes. To test your local changes (without installing them to your system), run the following convenience script from the root of your cloned git repository:

./bin/augur

Note that the ./bin/augur convenience script does not install augur system-wide with pip.

As an alternative to using the convenience script, you can install augur from source as an editable package so that your global augur command always uses your local source code copy:

pip install -e '.[dev]'

Using an “editable package” is not recommended if you want to be able to compare output from a stable, released version of augur with your development version (e.g. comparing output of augur installed with pip and ./bin/augur from your local source code).
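
For example, with a released version of augur installed via pip alongside your local checkout, you can sanity-check which copy you are about to run by comparing their reported versions (augur supports a --version flag):

augur --version        # the released copy installed with pip
./bin/augur --version  # your local development copy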

Testing

Writing good tests and running them helps maintain code quality and eases future refactoring. We use pytest and Cram to test augur. This section briefly describes:

  • Writing tests

    • Unit tests

    • Doctests

    • Functional tests

  • Running tests

    • Locally

    • Continuous Integration

Writing Tests

It’s good practice to write unit tests for any code contribution. The pytest documentation and Python documentation are good references for unit tests. Augur’s unit tests are located in the tests directory and there is generally one test file for each code file.
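
While iterating on a single module, it can be handy to run just its test file rather than the whole suite; for example (the file name below is only an illustration of the one-test-file-per-module convention):

python3 -m pytest -v tests/test_mask.py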

Doctests, by contrast, are tests written within a module’s docstrings. They can be helpful for testing a real-world example and determining whether a regression has been introduced in a particular module.
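
To exercise the doctests in one module while you are writing them, you can point pytest at it with its --doctest-modules option (the module path below is only an example):

python3 -m pytest --doctest-modules augur/utils.py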

A pull request that contributes new code should always contain unit tests. Optionally, a pull request may also contain doctests if the contributor believes a doctest would improve the documentation and execution of a real world example.

We test augur’s command line interface with functional tests implemented with the Cram framework. These tests complement existing unit tests of individual augur Python functions by running augur commands on the shell and confirming that these commands:

  1. execute without any errors

  2. produce exactly the expected outputs for the given inputs

These tests can reveal bugs resulting from untested internal functions or untested combinations of internal functions.

Functional tests should either:

  • suitably test a single augur command with an eponymously named Cram file in tests/functional/ (e.g., mask.t for augur mask)

OR

  • test a complete build with augur commands with an appropriately named Cram file in tests/builds/ (e.g., zika.t for the example Zika build)

Functional tests of specific commands

Functional tests of specific commands consist of a single Cram file per test and a corresponding directory of expected inputs and outputs to use for comparison of test results.

The Cram file should test most reasonable combinations of command arguments and flags.
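
For orientation, a Cram file mixes plain-text narration with shell commands that are indented by two spaces and prefixed with “$ ”; whatever a command prints must either appear as indented expected output on the lines below it or be redirected away, otherwise the test fails. The sketch below is hypothetical: the flags and file layout are illustrative only, so check augur mask --help and the existing files in tests/functional/ for the real interface.

Mask sites from an example alignment and compare against the expected output.

  $ augur mask \
  >   --sequences "$TESTDIR/mask/sequences.fasta" \
  >   --mask "$TESTDIR/mask/mask.bed" \
  >   --output masked.fasta > /dev/null

  $ diff -u "$TESTDIR/mask/expected_masked.fasta" masked.fasta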

Functional tests of example builds

Functional tests of example builds use output from a real Snakemake workflow as expected inputs and outputs. These tests should confirm that all steps of a workflow can execute and produce the expected output. These tests reflect actual augur usage in workflows and are not intended to comprehensively test interfaces for specific augur commands.

The Cram file should replicate the example workflow from start to end. These tests should use the output of the Snakemake workflow (e.g., files in zika/results/ for the Zika build test) as the expected inputs and outputs.
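
For example, a step in the Zika build test might re-run one command from the workflow and compare its output against the file that Snakemake produced (the command and paths below are illustrative; see tests/builds/zika.t for the real sequence of steps):

augur align \
  --sequences data/sequences.fasta \
  --reference-sequence config/zika_outgroup.gb \
  --output results/aligned.fasta \
  --fill-gaps

diff -u zika/results/aligned.fasta results/aligned.fasta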

Comparing outputs of augur commands

Compare deterministic outputs of augur commands with a diff between the expected and observed output files. For extremely simple deterministic outputs, use the expected text written to standard output instead of creating a separate expected output file.

To compare trees with stochastic branch lengths:

  1. provide a fixed random seed to the tree builder executable (e.g., --tree-builder-args "-seed 314159" for the “iqtree” method of augur tree)

  2. use scripts/diff_trees.py instead of diff and optionally provide a specific number to --significant-digits to limit the precision that should be considered in the diff

To compare JSON outputs with stochastic numerical values, use scripts/diff_jsons.py with the appropriate --significant-digits argument.

Both tree and JSON comparison scripts rely on deepdiff for underlying comparisons.
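
Putting those pieces together, a comparison step for stochastic outputs might look roughly like the following (the paths and the choice of two significant digits are illustrative; see each script’s --help for details):

augur tree \
  --alignment results/aligned.fasta \
  --method iqtree \
  --tree-builder-args "-seed 314159" \
  --output results/tree_raw.nwk

python3 scripts/diff_trees.py expected/tree_raw.nwk results/tree_raw.nwk --significant-digits 2
python3 scripts/diff_jsons.py expected/branch_lengths.json results/branch_lengths.json --significant-digits 2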

Running Tests

You’ve written tests and now you want to run them to see if they are passing. First, you will need to install the complete Nextstrain environment and augur dev dependencies as described above. Next, run all augur tests with the following command from the top level of the augur repository:

./run_tests.sh

For rapid execution of a subset of unit tests (as during test-driven development), the -k argument disables code coverage and functional tests and is passed directly to pytest to limit which tests run. For example, the following command runs only the unit tests related to augur mask.

./run_tests.sh -k test_mask

Troubleshooting tip: As tests run on the development code in the augur repository, your environment should not have an existing augur installation that could cause a conflict in pytest.
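
One way to confirm which copy of augur your Python environment resolves is to print where the package is imported from; it should point into your local repository checkout rather than a separately installed copy:

python3 -c "import augur; print(augur.__file__)"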

We use continuous integration with Travis CI to run tests on every pull request submitted to the project. We use codecov to automatically produce test coverage for new contributions and the project as a whole.

Releasing

Before you create a new release, run all tests from a fresh conda environment to verify that nothing has broken since the last CI build on GitHub. The following commands will set up a conda environment equivalent to the Travis CI environment, run the unit and integration tests, and then deactivate and remove the environment.

conda env create -f environment.yml
conda activate augur
python3 -m pip install -e .[dev]

./run_tests.sh
bash tests/builds/runner.sh

conda deactivate
conda env remove -n augur

New releases are tagged in git using an “annotated” tag. If the git option user.signingKey is set, the tag will also be signed. Signed tags are preferred, but it can be hard to set up GPG correctly. The release branch should always point to the latest release tag. Source and wheel (binary) distributions are uploaded to the nextstrain-augur project on PyPI.

There is a ./devel/release script which will prepare a new release from your local repository. It ends with instructions on how to push the release commit/tag/branch and how to upload the built distributions to PyPI. You’ll need a PyPI account and twine installed to do the latter.
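
The script’s printed instructions are authoritative, but for orientation the final steps look roughly like this (the branch names and version tag below are placeholders):

git push origin master release       # push the release commit and the release branch
git push origin 10.0.0               # push the new version tag (placeholder number)
twine upload dist/*                  # upload the built source and wheel distributions to PyPI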

After you create a new release and before you push it to GitHub, run all tests again as described above to confirm that nothing broke with the new release. If any tests fail, run the ./devel/rewind-release script to undo the release, then fix the tests before trying again.

Maintaining Bioconda package

Bioconda hosts augur’s conda package and defines augur’s dependencies in a conda recipe YAML file. New releases on GitHub automatically trigger a new Bioconda release.

To modify augur’s dependencies or other aspects of its conda environment, follow Bioconda’s contributing guide. You will need to update the existing recipe YAML locally and create a pull request on GitHub for testing and review. Add your GitHub username to the recipe_maintainers list, if this is your first time modifying the augur recipe. After a successful pull request review, Bioconda will automatically update the augur package that users download.

Travis CI

Branches and PRs are tested by Travis CI jobs configured in .travis.yml.

Our Travis config uses two build stages: test and deploy. Jobs in the test stage always run, but deploy jobs only run sometimes (see below).

The set of test jobs is explicitly defined instead of auto-expanded from the implicit job property matrix. Since top-level properties are inherited by all jobs regardless of build stage, making the matrix explicit is less confusing and easier to reason about. YAML’s anchor (&foo) and alias merge key (<<: *foo) syntax let us do this without repeating ourselves unnecessarily.

New releases, via pushes to the release branch, trigger a new docker-base build to keep the Docker image up-to-date. This trigger is implemented in the deploy stage, which is implicitly conditioned on the previous test stage’s successful completion and explicitly conditioned on a non-PR trigger on the release branch. Note that currently we cannot test this deploy stage without making a release.

It can sometimes be useful to verify the config is parsed as you expect using https://config.travis-ci.com/explore.

Contributing documentation

Documentation is built using Sphinx and hosted on Read The Docs. Versions of the documentation for each augur release and git branch are available and preserved. Read The Docs is updated automatically from commits and releases on GitHub.

Doc Formats

Documentation is mostly written in reStructuredText (.rst) files, but it can also be written in Markdown (.md) files. There are advantages to both formats:

  • reStructuredText enables Python-generated text to fill your documentation, such as the auto-importing of modules or the use of plugins like sphinx-argparse (see below).

  • Markdown is more intuitive to write and is widely used outside of python development.

  • If you don’t need autogeneration of help documentation, then you may want to stick with writing Markdown.

Sphinx, coupled with reStructuredText, can be tricky to learn. Here’s a subset of reStructuredText worth committing to memory to help you get started writing these files.

Many Sphinx reStructuredText files contain a directive to add relations between single files in the documentation known as a Table of Contents Tree (TOC Tree).

Human-readable augur and augur subcommand documentation is written using a Sphinx extension called sphinx-argparse.

Folder structure

The documentation source files are located in ./docs, with ./docs/index.rst being the main entry point. Each subsection of the documentation is a subdirectory inside ./docs. For instance, the tutorials are all found in ./docs/tutorials and are included in the documentation website via the TOC tree directive in ./docs/index.rst.

Building documentation

Building the documentation locally is useful to test changes. First, make sure you have the development dependencies of augur installed:

pip install -e '.[dev]'

This installs packages listed in the dev section of extras_require in setup.py, as well as augur’s dependencies as necessary.

Sphinx and make are used when building documentation. Here are some examples that you may find useful:

Build the HTML output format by running:

make -C docs html

Sphinx can build other formats, such as epub. To see other available formats, run:

make -C docs help

To update the API documentation after adding or removing an augur submodule, autogenerate a new API file as follows.

sphinx-apidoc -T -f -MeT -o docs/api augur

Sphinx caches built documentation by default to make rebuilds faster. This is generally great, but it can cause the sidebar of pages to become stale. You can clean out the cache with:

make -C docs clean

To view the generated documentation in your browser, Mac users should run:

open docs/_build/html/index.html

Linux users can view the docs by running:

xdg-open docs/_build/html/index.html

This will open your browser for you to see and read your work.