Augur v6 Release Notes

Augur v6 was released on 2019-12-10. This release contains a number of changes from Augur v5, including feature additions and bugfixes. The biggest change is related to how Augur exports files for visualisation by Auspice. We've written an extensive guide explaining our motivations here, what has changed, how to upgrade, and how this interfaces with Auspice.


Export JSONs for specific versions of Auspice

Probably the biggest (breaking) change you'll encounter is that augur export no longer works! See the migration guide for a detailed explanation of this.

Breaking change: augur export no longer works on its own; it now requires a further positional argument to define which version of Auspice you wish to target. augur export v1 should behave the same as augur export in previous versions.

Reference sequence output

The export command now accepts a flag to export the reference/root sequence relative to which mutations are called, see here for more detail.

Change in augur ancestral's arguments

augur ancestral, which reconstructs mutations across a tree, now supports two forms of output and the arguments have become more descriptive.

  1. JSON output, including mutations for each branch and (inferred) ancestral sequences. This is specified by the --output-node-data argument.

  2. FASTA output of reconstructed ancestral sequences. This had previously been available for VCF-inputs, but now works for any input. Users can ask for this output and specify a file name using --output-sequences.

Deprecation warning: The argument --output is now deprecated. Please use --output-node-data instead.

Import BEAST MCC trees

We now have instructions and functionality to import BEAST trees, see here.

Prettifying of strings

Previous Auspice versions "prettified" metadata strings (like changing 'north_america' to 'North America'). Auspice v2 no longer does this, see here for more detail. The parse command now accepts an argument to apply string prettifying operations to metadata parsed from FASTA headers.
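As a rough illustration of what "prettifying" means, the sketch below reproduces the 'north_america' to 'North America' example. The function name prettify and its rules are assumptions made for illustration only; this is not Augur's implementation, which may treat abbreviations and other special cases differently.

```python
def prettify(value: str) -> str:
    """Turn an underscore-separated metadata value into a display string.

    Hypothetical helper for illustration; not part of Augur's API.
    """
    # Split on underscores, drop empty pieces, and capitalise each word.
    words = [word for word in value.split("_") if word]
    return " ".join(word.capitalize() for word in words)


print(prettify("north_america"))
```

Note that str.capitalize lowercases everything after the first letter, so an abbreviation such as "usa" would come out as "Usa" with this naive rule.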

Whitespace in colors and lat-longs TSVs

To allow whitespace in metadata, files specifying colors and geographic locations now need to be TAB delimited.
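The following minimal sketch shows why TAB delimiting matters: splitting on TABs keeps a value such as "North America" intact, whereas splitting on arbitrary whitespace would break it in two. The three-column layout (trait, value, hex color), the sample rows, and the read_colors helper are assumptions for illustration, not a description of Augur's parser.

```python
import csv
import io

# Hypothetical colors file content: trait, value, hex color, separated by TABs.
COLORS_TSV = (
    "region\tNorth America\t#4C90C0\n"
    "region\tSouth East Asia\t#E67030\n"
)


def read_colors(handle):
    """Map (trait, value) pairs to colors from a TAB-delimited file handle."""
    colors = {}
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines, comment lines, and rows with too few columns.
        if len(row) < 3 or row[0].startswith("#"):
            continue
        trait, value, color = (field.strip() for field in row[:3])
        colors[(trait, value)] = color
    return colors


colors = read_colors(io.StringIO(COLORS_TSV))
print(colors[("region", "North America")])
```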

Move to GFF-style annotations

Starting with Augur v6, we use GFF coordinates ([one-origin, inclusive]) as opposed to BED coordinates. Strands are represented by + or - rather than 1 or 0. Additionally, we export the seqid, but don't use it in Auspice.
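To make the coordinate change concrete, here is a small sketch converting a BED-style interval (zero-origin, end-exclusive) to a GFF-style one (one-origin, inclusive). The function names are hypothetical, and the strand mapping assumes that 1 meant the forward strand and 0 the reverse strand in the old representation.

```python
from typing import Tuple


def bed_to_gff(start: int, end: int) -> Tuple[int, int]:
    """Convert a BED interval [start, end) to a GFF interval [start, end]."""
    # Only the start shifts: an exclusive zero-origin end has the same
    # numeric value as an inclusive one-origin end.
    return start + 1, end


def strand_symbol(flag: int) -> str:
    """Map a numeric strand flag to '+' or '-'.

    Assumption for illustration: 1 is the forward strand, 0 the reverse.
    """
    return "+" if flag == 1 else "-"


# A gene covering the first 300 bases of a genome.
gff_start, gff_end = bed_to_gff(0, 300)
print(gff_start, gff_end, strand_symbol(1))
```

The feature length is unchanged by the conversion: end - start in BED terms equals end - start + 1 in GFF terms.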

Improvements and usage of JSON validation

The export command will now validate the produced JSON against the schema.

Removal of non-modular Augur and old builds

Augur has been a dynamic, shapeshifting beast. It started as scripts for Nextflu, took on more and more pathogens, was refactored into "prepare" and "process" steps, and refactored again into the "modular" Augur we now have. Earlier incarnations of Augur have now been removed from the GitHub repo (./base/*).

Paralleling the different incarnations of Augur was a move to "builds" being their own self-contained repos. We think this has been remarkably successful, and de-couples the bioinformatics tooling from a pathogen build. With this release of Augur we've now removed these builds from the Augur GitHub repo, and the only builds that remain are the test ones.

Test builds

There have been a number of test builds in the Augur repo, and we have leaned heavily on them while developing this version of Augur as well as Auspice v2. They are all self-contained within ./tests/builds and can all be run and examined in Auspice via:

cd tests/builds
bash runner.sh # creates output in ./auspice
auspice view --datasetDir auspice

(See the Auspice docs for Auspice-specific questions.)

Documentation improvements

Documentation has always been a bit hit-or-miss with Nextstrain projects. We've tried to make Augur's read-the-docs documentation more comprehensive, with better flow. This entails new sections, with each Augur command having its own page. We've tried to use redirects to ensure that all the old links continue to work.

Miscellaneous

  • augur filter: More interpretable output of how many sequences each stage has filtered out.

  • augur filter: Additional flag --subsample-seed to seed the random number generator and thereby make subsampling reproducible.

  • augur sequence-traits: Output is now numerical, as originally intended; this required an Auspice bugfix.

  • augur traits: Explanation of what is considered missing data & how it is interpreted.

  • augur traits: GTR models are exported in the output JSON for better accountability & reproducibility.

  • Errors in formatting of input files (e.g. metadata files, Auspice config files) weren't handled nicely, often resulting in hard-to-interpret stack traces. We now try to catch these and print an error indicating the offending file.

  • Tests using Python version 2 have now been removed.
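To illustrate the idea behind the --subsample-seed flag of augur filter mentioned above, the sketch below shows how seeding a random number generator makes a random subsample repeatable. The subsample function and the strain names are hypothetical; this uses Python's standard random module and is not how Augur itself performs subsampling.

```python
import random


def subsample(names, n, seed=None):
    """Pick up to n names at random; the same seed gives the same picks."""
    # A dedicated generator avoids touching the global random state.
    rng = random.Random(seed)
    # Sort first so the result does not depend on the input order.
    pool = sorted(names)
    return rng.sample(pool, min(n, len(pool)))


strains = ["strain_a", "strain_b", "strain_c", "strain_d", "strain_e"]
first = subsample(strains, 3, seed=42)
second = subsample(strains, 3, seed=42)
print(first == second)
```

With seed=None the generator is seeded from a system source, so repeated runs may return different subsamples.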