Augur v6 Release Notes
Augur v6 was released on 2019-12-10. This release contains a number of changes from Augur v5, including feature additions and bugfixes. The biggest change is related to how Augur exports files for visualisation by Auspice. We've written an extensive guide explaining our motivations here, what has changed, how to upgrade, and how this interfaces with Auspice.
Export JSONs for specific versions of Auspice
Probably the biggest (breaking) change you'll encounter is that augur export no longer works! See the migration guide for a detailed explanation of this.
Warning
Breaking change: augur export no longer works on its own; it now requires a further positional argument to define which version of Auspice you wish to target. augur export v1 should behave the same as augur export did in previous versions.
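To illustrate the behaviour described in the warning, here is a minimal sketch of a command whose export sub-command requires a target version. It uses Python's standard argparse module and a hypothetical program name (augur-sketch). It is not Augur's actual argument parser, only an illustration of why a bare export call is now rejected.

```python
import argparse


def build_parser():
    """Build a toy parser in which `export` requires a target version."""
    parser = argparse.ArgumentParser(prog="augur-sketch")  # hypothetical name
    commands = parser.add_subparsers(dest="command", required=True)
    export = commands.add_parser("export")
    # The extra positional argument: which Auspice version to target.
    targets = export.add_subparsers(dest="target", required=True)
    targets.add_parser("v1")
    targets.add_parser("v2")
    return parser


def parse(argv):
    """Return (command, target), or None if the arguments are rejected."""
    try:
        args = build_parser().parse_args(argv)
    except SystemExit:
        # argparse prints a usage message and exits on invalid input.
        return None
    return args.command, args.target
```

In this sketch, parse(["export", "v1"]) is accepted, while parse(["export"]) is rejected because the target is missing.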
Reference sequence output
The export command now accepts a flag to export the reference/root sequence relative to which mutations are called; see here for more detail.
Change in augur ancestral's arguments
augur ancestral, which reconstructs mutations across a tree, now supports two forms of output and the arguments have become more descriptive.
JSON output, including mutations for each branch and (inferred) ancestral sequences. This is specified by the --output-node-data argument.
FASTA output of reconstructed ancestral sequences. This had previously been available for VCF-inputs, but now works for any input. Users can ask for this output and specify a file name using --output-sequences.
Warning
Deprecation warning: The argument --output is now deprecated. Please use --output-node-data instead.
Import BEAST MCC trees
We now have instructions and functionality to import BEAST trees; see here.
Prettifying of strings
Previous Auspice versions "prettified" metadata strings (like changing "north_america" to "North America"). Auspice v2 no longer does this; see here for more detail. The parse command now accepts an argument to apply string prettifying operations to metadata parsed from FASTA headers.
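As a rough illustration of what such prettifying involves, the sketch below replaces underscores with spaces and capitalises each word. The function name prettify is hypothetical and this is not necessarily the exact transformation Augur or Auspice applies; it only reproduces the example given above.

```python
def prettify(value):
    """Replace underscores with spaces and capitalise each word."""
    return value.replace("_", " ").title()


print(prettify("north_america"))  # prints "North America"
```

Note that str.title() also lowercases the remaining letters of each word, so an acronym such as "USA" would become "Usa"; a real implementation would likely need to special-case such values.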
Whitespace in colors and lat-longs TSVs
To allow whitespace in metadata, files specifying colors and geographic locations now need to be TAB delimited.
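The sketch below shows why a TAB delimiter matters when values contain spaces. It assumes a three-column colors file (trait, value, hex color) with no header row; that layout, the sample rows, and the function name read_colors are assumptions made for illustration rather than a statement of Augur's parser.

```python
import csv
import io

# Hypothetical colors file: trait, value and hex color separated by TABs.
EXAMPLE = "country\tNew Zealand\t#4068CF\nregion\tNorth America\t#E67030\n"


def read_colors(handle):
    """Map (trait, value) pairs to colors from a TAB-delimited file."""
    colors = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        trait, value, color = row
        colors[(trait, value)] = color
    return colors


colors = read_colors(io.StringIO(EXAMPLE))
print(colors[("country", "New Zealand")])  # prints "#4068CF"
```

Splitting the same line on arbitrary whitespace would instead break "New Zealand" into two fields, which is the ambiguity the TAB requirement avoids.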
Move to GFF-style annotations
Starting with Augur v6 we now use GFF coordinates: [one-origin, inclusive] as opposed to BED coordinates. Strands are represented by + or - rather than 1 or 0. Additionally, we export the seqid, but don't use it in Auspice.
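The coordinate change can be sketched as a small conversion function. It assumes the usual BED convention of zero-origin, half-open intervals, and assumes that the old strand value 1 meant forward and 0 meant reverse; the function name bed_to_gff is hypothetical and this is not taken from Augur's code.

```python
def bed_to_gff(start, end, strand):
    """Convert a zero-origin, half-open interval to one-origin, inclusive.

    Assumes strand 1 means forward ("+") and 0 means reverse ("-").
    """
    gff_start = start + 1  # shift the origin from 0 to 1
    gff_end = end          # the half-open end equals the inclusive end
    gff_strand = "+" if strand == 1 else "-"
    return gff_start, gff_end, gff_strand


print(bed_to_gff(0, 9, 1))  # prints (1, 9, '+')
```

The feature length is unchanged by the conversion: end - start in BED terms equals gff_end - gff_start + 1 in GFF terms.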
Improvements and usage of JSON validation
The export command will now validate the produced JSON against the schema.
Removal of non-modular Augur and old builds
Augur has been a dynamic, shapeshifting beast. It started as scripts for Nextflu, took on more and more pathogens, was refactored into "prepare" and "process" steps, and refactored again into the "modular" Augur we now have. Earlier incarnations of Augur have now been removed from the GitHub repo (./base/*).
Paralleling the different incarnations of Augur was a move to "builds" being their own self-contained repos. We think this has been remarkably successful, and de-couples the bioinformatics tooling from a pathogen build. With this release of Augur we've now removed these builds from the Augur GitHub repo, and the only builds that remain are the test ones.
Test builds
There have been a number of test builds in the Augur repo and we have leaned heavily on them while we developed this version of Augur as well as Auspice v2. They are all self-contained within ./tests/builds and can all be run and examined in Auspice via:
cd tests/builds
bash runner.sh # creates output in ./auspice
auspice view --datasetDir auspice
(See the Auspice docs for Auspice-specific questions.)
Documentation improvements
Documentation has always been a bit hit-or-miss with Nextstrain projects. We've tried to make Augur's read-the-docs documentation more comprehensive, with better flow. This entails new sections, with each Augur command having its own page. We've tried to use redirects to ensure that all the old links continue to work.
Miscellaneous
augur filter: More interpretable output of how many sequences each stage has filtered out.
augur filter: Additional flag --subsample-seed to seed the random number generator and thereby make subsampling reproducible.
augur sequence-traits: Numerical output now works as originally intended (this required an Auspice bugfix).
augur traits: Explanation of what is considered missing data & how it is interpreted.
augur traits: GTR models are exported in the output JSON for better accountability & reproducibility.
Errors in formatting of input files (e.g. metadata files, Auspice config files) weren't handled nicely, often resulting in hard-to-interpret stack traces. We now try to catch these and print an error indicating the offending file.
Tests using Python version 2 have now been removed.
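The reproducibility that the --subsample-seed flag provides can be illustrated with Python's standard random module. This is a minimal sketch of seeded sampling in general, with a hypothetical subsample function and made-up sequence names; it does not reproduce how augur filter actually selects sequences.

```python
import random


def subsample(names, k, seed=None):
    """Pick k names at random; the same seed always gives the same pick."""
    rng = random.Random(seed)  # private generator, seeded if seed is given
    return rng.sample(names, k)


names = [f"seq{i}" for i in range(100)]  # made-up sequence names
first = subsample(names, 5, seed=42)
second = subsample(names, 5, seed=42)
print(first == second)  # prints True
```

Without a seed the generator is initialised from system entropy, so repeated runs will generally select different sequences.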