Migrating from augur v5 to v6ď
Augur is a versatile bioinformatics toolkit in its own right, but it
is developed in conjunction with Auspice, the
interactive visualisation tool behind what you see on nextstrain.org. Because of this, build pipelines often
finish with augur export
which produces JSONs that Auspice can
visualise.
As Auspice grows and improves, the format of the JSON files is
changing too, to allow more functionality and flexibility. This means
Augurâs export function needs to change as well. This page details how
the new augur export v2
works and what you need to do to start
using it.
Some important points:
âOldâ JSON files (v1, made with Augur v5) still work with the new (v2) Auspice for the time being - but may not be supported in future versions.
When you upgrade to Augur v6,
augur export
will no longer work. Youâll need to specify which version of Auspice (v1 or v2) your JSON files should be.
Compatibility between Augur & Auspice versionsď
Augur v5 simply used augur export
to produce two JSONs (tree +
meta) for Auspice v1 - so weâll call these JSONs âv1â JSONs.
The new Augur (v6) can still create âv1â JSONS, but can also create JSONs that work with the latest Auspice release - Auspice v2. The new format combines the tree and metadata into one JSON, which weâll call a âv2â JSON.
This page has the most up-to-date compatibility information between different Augur and Auspice versions.
We understand how important backwards compatibility is - so for the time being âv1â JSONs will continue to work with Auspice v2. However, we recommend switching to v2 JSONs - they have more features, are easier to work with, and future versions of Auspice may not support v1 JSONs!
Motivation behind changing JSON formatsď
With the release of Auspice v2, we made some breaking changes to the JSON schema. Why change formats? We were motivated by:
Compactness: Tree and Meta JSON files are now combined, so you only have to worry about one output file
Flexibility: The new v2 JSONs allow us flexibility to include more features and data, and will let us move towards getting in line with existing conventions like GFF and BibTex
Ease of use: Users commonly got confused by the âconfigâ file. For basic runs you can now specify everything you need to see your data right in the command-line - no âconfigâ file needed! For more advanced exports, you can still specify a config file with more detail. (See âUsing a Config Fileâ)
I just need my old run to work right now!ď
We completely understand you may not have time to make the change right this second, and that thereâs nothing more frustrating than having a run break right before a presentation or deadline!
If you want to keep using the old version for now, replace augur
export
with augur export v1
- everything else remains the same.
You can use Auspice v1 or Auspice v2 (see compatibility).
To use the new version, use augur export v2
. Youâll need to make a
few small changes, but youâll be future-proofing your runs. (Future
you will thank past you!)
What needs changing to use augur export v2
?ď
Note
Helpful hint: You can always get a full overview of the
arguments for export v2 with augur export v2 --help
Whatâs the sameď
You still pass in your tree, metadata, and node-data files with
--tree
, --metadata
, and --node-data
- just like in export v1
.
Similarly, you can pass in files containing colors, latitude, and
longitude data using --colors
and --lat-longs
, respectively.
Different outputsď
Instead of specifying two output files (--output-tree
and
--output-meta
) you now only need to specify one with --output
.
For example, if your old files were auspice/virus_AB_tree.json
and
auspice/virus_AB_meta.json
, you might want to call the single
output auspice/virus_AB.json
- or if you want to tell it apart
from your v1 export, you might call it auspice/virus_ABv2.json
.
To export the reference sequence relative to which mutations have been
identified, specify the --include-root-sequence
flag. This flag
writes a JSON whose name is relative to the stem of the main output
JSON. For VCF input, this file will contain the reference sequence to
which the VCF is mapped. For example, if the main output is called
auspice/virus_AB.json
, the root sequence will be saved to
auspice/virus_AB_root-sequence.json
.
Other changed argumentsď
The --tree-name
argument has been removed, as auspice v2 no longer
uses this. See the auspice docs
for more information about how second trees are specified and
displayed.
Command Line Options instead of (or in addition to) a config fileď
One of the biggest changes in augur export v2
is that you can pass
much more in using the command-line, meaning âconfigâ files are no
longer required. The âconfigâ or âAuspice configâ file defines a
number of visualisation settings such as title, default displays, and
which colorings to use. However, itâs been a source of pain for many
users!
Many of these things can now be passed in by the command-line, but
some options are only possible using the config file. You can always
continue to put most things in the config file if you prefer. If you
want to use a config file, you can pass
this in with --auspice-config
, but the format of this has changed
(see the link).
Itâs important to note that generally any command line options you use will override the same option in your config file.
Coloring traits is smarterď
Previously, anything you wanted to color by had to be in the config
file. You always had to include a gt
and num_date
entry, and
remember to add anything new to the file.
Weâve made this smarter - augur export v2
now automatically detects
some traits and you can specify others on the command line. You can
also control the color options in more detail using a config file.
Weâll cover how coloring works on the command line and how it works in config files in more detail below.
Traits display exactly how you wantď
Previously, auspice tried to make traits and locations look âprettyâ by auto-capitalizing them and removing underscores (which were required in multi-word traits). Auspice no longer does this for v2 JSONS, so youâll need to ensure your traits look exactly how you want them to display in auspice. You can read more about that here.
Terminologyď
Traitsď
Traits is the general term for certain data associated with nodes in
the tree, for example âcountryâ, âserotypeâ, or âageâ. These may have
been inferred for internal nodes by Augur functions like augur
traits
(confusingly named!) and augur clades
, or they may only be
available for tips and provided by the metadata TSV file.
Geographic Traitsď
Certain traits have a geographic interpretation, e.g. âcountryâ. Auspice will attempt to display these traits on a map (and provide a drop-down to switch between them if there are more than one).
Note
Make sure that these have corresponding entry in the
lat-longs TSV file supplied to export
. See how to do this
here.
Using command-line options to customise the visualisationď
As mentioned above, you can now replace most of the functionality of the âAuspice configâ file with command line options. We hope that for most users this means the config file isnât necessary (but itâs always there is you need its advanced functionality).
Note
Remember that generally any command line options you use will override the same option in your config file.
Titleď
Set the title displayed by Auspice via --title
(previously this
was the âtitleâ field in your v1 config file). If running directly
from the command line, put your title in quotes (ex: --title
"Phylodynamics of my Pathogen"
). If you are using Snakemake and
passing the value using params
, youâll need to double-quote the
title using single and double quotes. For example:
params:
title = "'Phylodynamics of my Pathogen'"
shell:
"augur export v2 --title {params.title} ..."
Maintainersď
The maintainer(s) are displayed in the footer of Auspice and may have
associated links. These can be specified with --maintainers
and
you can now have more than one maintainer associated with your run.
Previously this was set by the âmaintainerâ field in your v1 config
file and was limited to a single entry.
If running directly from the command line, put each maintainer in
quotes (ex: --maintainers "Jane Doe" "Ravi Kupra"
). If you have a
URL associated with a maintainer (completely optional), then you can
add them like so:
--maintainers "Jane Doe <mailto:jane.doe@example.com>" "Ravi Kupra <https://github.com/ravikupra"
If you are using Snakemake and passing the value using params
,
youâll need to put the whole list in double quotes, and each person in
single quotes. For example:
params:
maints = "'Jane Doe' 'Ravi Kupra <github.com/ravikupra>'"
shell:
"augur export v2 --maintainers {params.maints} ..."
You will need to use quotes in the same way even if you only have one maintainer!
Build URLď
Set the build URL displayed by Auspice via --build-url
. If running
directly from the command line, input your build URL directly (ex:
--build-url https://github.com/nextstrain/zika
).
Descriptionď
Set the description and/or acknowledgments in the footer of Auspice
via --description
. This option expects a Markdown file containing
description to be displayed.
Panelsď
Auspice will, by default, try to show the tree, map, and entropy
panels. You can customise this with the --panels
option, which was
previously the âpanelsâ field in the your v1 config file. Options are
âtreeâ, âmapâ, âentropyâ, and âfrequenciesâ (e.g: --panels tree map
entropy
).
Note
If you want to display the frequencies panel, you must
specify âfrequenciesâ and ensure a tip frequency file is
available for auspice
to access.
Traitsď
(Whatâs a trait?) Traits will become coloring options in Auspice. Some are automatically included, and some can be defined on the command line. The following rules are followed for which traits will be exported:
Genotype and date (if present) are always automatically included as coloring options - you donât need to include them. (Previously these were âgtâ and ânumdateâ in the âcolor_optionsâ section of your v1 config file.)
Traits contained in the node-data JSONs handed to
augur export
(using--node-data
) will automatically be included. These are often generated from the Augur commandstraits
,clades
orseqtraits
.Traits present in the metadata file can be included by specifying them with
metadata-color-by
(e.g:--metadata-color-by country age host
). (These must match column names of your metadata file.)
The changes hopefully make things a little easier to use â
previously, if you had run augur clades
, you had to remember to add
clade_membership
to the config file, and if youâd run augur
seqtraits
you had to add every resulting option. Now, theyâll be
automatically included. If you donât want them as a coloring option,
donât pass in the files.
Note
You canât specify the title or type of a colouring option
using just command-line - but export v2
will make its
best guess using the following rules: Excluding missing
data, if a trait contains only âTrueâ, âFalseâ, âYesâ, âNoâ,
â0â or â1â, it will be set to âboolean.â If it contains only
numbers (integers and/or decimals), it will be set to
âcontinuous.â Otherwise, it will be set as âdiscrete.â If
you want to have more control over how your trait is
interpreted, you should use a config file (see below).
Geographic Traitsď
Specify these traits using --geo-resolutions
, e.g.
--geo-resolutions country region
. Previously these were defined by
the âgeoâ field in your v1 config file.
Whatâs not possible to set without a config fileď
The command line arguments cover everything you need to get a basic
run working in augur
and auspice
. However, there are still
some features that offer more options or are only available when you
use a config file.
Currently, using command line arguments:
It is not possible to set the default view options using only command-line arguments in
export v2
. You can read more about the defaults (and how to change them using a config file) here.When using
export v2
with only command-line arguments, every trait thatâs a coloring option and is either categorical or boolean will automatically be available to filter by. Find out how to specify what is a filter using a config file here.
Using a config file to customise the visualisationď
Traditionally you had to use an âAuspice config fileâ to customise the visualisation. This is still available as an option, but you can now choose between exporting using just the command-line, or using a combination of the command-line and config file.
Note
Anything you can specify using the command-line arguments above can be done using a config file instead.
This section will detail the config file provided to augur export v2
by the --auspice-config
argument. The format of the new config file
differs slightly from previous versions of Augur. If you try to
use a previous version of the config file it should mostly still
work, but will print out warnings where keys have changed.
Config file priorityď
It is important to remember that if you set an option both in the
config file and in the command line, the command line option will
override the config file option. For example, if you set "title"
in your config file as âA Title About Applesâ, and then import this
config file using --auspice-config
and use --title "Better Title
Befitting Bears"
, the title displayed by Auspice will be âBetter
Title Befitting Bearsâ. To use the one in the config file, donât use
--title
in the command line.
There are a couple of exceptions to this:
There is no way to set default display views using command line only, so using âdisplay_defaultsâ in your config file will set this.
There is no way to modify the default filters displayed when using command line only, so using âfiltersâ in your config file will set this.
If you set color-by options in command-line using
--metadata-color-by
and pass in a config file, only the things listed in--metadata-color-by
will be coloring options, but if they have a âtitleâ and âtypeâ set in the config file, these will be used.
Config file formatď
The config file is a JSON file, and as such itâs important that everything in your config file is enclosed in one pair of curly brackets. These can be on a separate line at the very top and very bottom of your file. Syntax is important - if you are getting errors, ensure all your brackets and quotation marks match up, and that commas separate items in the same pair of brackets.
Export v2 config files are generally very similar to export v1, but there are a few changes. They are explained in detail below, or you can see an example of converting a v1 config to v2. For more details, see the complete JSON schema for v2 config files.
Here are the top-level keys of the config JSON in plain English:
titleď
The title to be displayed by Auspice, unchanged from previous versions
of the config file. E.g. "title": "Phylodynamics of my Pathogen"
.
maintainersď
You can now have more than one maintainer associated with your run!
Specify one or as many maintainers as you wish via the following
structure (The url
keys are optional):
"maintainers": [
{"name": "Jane Doe", "url": "www.janedoe.com"},
{"name": "Ravi Kupra", "url": "www.ravikupra.co.uk"}
]
Previously this was the âmaintainerâ field in your v1 config file and used a different structure.
build-urlď
The build / repository URL to be displayed by Auspice, a new
functionality in augur export v2
, e.g. "build_url":
"https://github.com/nextstrain/zika"
. This is an optional field.
panelsď
Optional and unchanged from previous versions of the config file, this
defines the panels that Auspice will display. If not set, Auspice will
by default try to show the tree, map, and entropy panels, if data is
available. Options are âtreeâ, âmapâ, âentropyâ, and âfrequenciesâ
(e.g: "panels": ["tree", "map"]
).
Note
If you want to display the frequencies panel, you must
specify âfrequenciesâ and ensure a tip frequency file is
available for auspice
to access.
coloringsď
These are a list of the traits which Auspice should display as options to color the tree & map. In previous versions of the config file this was âcolor_optionsâ and the current structure is similar, but hopefully easier to understand!
For each trait you include, you can define:
A required âkeyâ, which is used to lookup the values via node-data JSONs or other provided metadata.
An optional âtitleâ which will shown by Auspice when referring to this trait â for instance you may have a trait called âab1â which you want to show as âAge bracket 1â in the drop-down menus, legend, and filter.
An optional, but highly recommended âtypeâ which can be either âordinalâ, âbooleanâ, âcontinuousâ, or âcategoricalâ. If you donât provide a type, augur will try to guess it (see how it guesses here).
Unless you want to change the name displayed, you no longer need to
include gt
, num_date
, clade_membership
, or augur
seqtraits
output (like clade or drug resistance information) in your
config file - if that information is present, it will automatically be
included. To exclude it, donât pass in the corresponding file to
--node-data
.
Note
Remember that if you are using --metadata-color-by
on the
command-line, only the traits given there will be color-by
options! To include everything in your config file, donât
use --metadata-color-by
, but include all traits you want
as coloring options in âcoloringsâ in the config file.
Put another way, if a trait is listed in
--metadata-color-by
and not in the config, it will be
included. If a trait is in the config but not in
--metadata-color-by
it will be excluded. If a trait is
in both, but has "title"
and "type"
information in
the config file, this information will be used by export v2.
In short, if using a config file and the command line, ensure
everything you want as a coloring option is in --metadata-color-by
.
You only need to also include it "colorings"
in the config file if
you want to set the "title"
and/or "type"
.
geo_resolutionsď
This specifies the geographical traits you want Auspice to use. You can pass this in the same way as in the v1 config file, or you can now specify a title to be displayed by option, using a slightly different structure.
For example, for many users, these might be âcountryâ and âregionâ,
i.e. "geo_resolutions: ["country", "region"]
. If you want to give
them new titles, use the format "geo_resolutions": [{"key":
"country", "title": "Areas"}, {"key": "region", "title": "Global"}]
.
You can also mix the two, if you just want a title for one location:
"geo_resolutions": [ {"key": "country", "title": "Areas"}, "region"]
filtersď
This specifies which traits you can filter by in Auspice. E.g.
"filters": ["country", "region", "symptom", "age"]
. If you donât
include this option in your config file, all non-continuous traits
that are coloring options will be included as filters. If you donât
want any filter options, include this option with an empty list, i.e.
"filters": []
. This is the same as the âfiltersâ field in previous
config files, but the behavior has changed slightly.
display_defaultsď
This allows you to specify the default view that users will see when they visualise the data in Auspice. There are five options you can set here â note they are similar to those in the previous config files but we have now standardised them to snake_case:
geo_resolution
- Sets which of the âgeo_resolutionsâ should be shown. Default: countrycolor_by
- Sets what trait should be used for coloring. Default: countrydistance_measure
- Sets whether tree branch lengths are intime
ordivergence
. Options arenum_date
(time, default if available) ordiv
(divergence).layout
- Sets how the tree is visualised. Options arerect
(rectangular, default),radial
,unrooted
, andclock
, corresponding to the four options normally shown on the left in Auspice.map_triplicate
- Sets whether the map is extended / wrapped around, which can be useful if transmissions are worldwide. Set totrue
orfalse
. Default: false
Config file examplesď
Here is an example of how all of the above options would fit into a config file:
1{
2 "title": "Phylodynamics of my Pathogen",
3 "maintainers": [
4 {"name": "Jane Doe", "url": "www.janedoe.com"},
5 {"name": "Ravi Kupra", "url": "www.ravikupra.co.uk"}
6 ],
7 "build_url": "https://github.com/nextstrain/zika",
8 "colorings": [
9 {
10 "key": "age",
11 "title": "Host age",
12 "type": "continuous"
13 },
14 {
15 "key": "hospitalized",
16 "type": "boolean"
17 },
18 {
19 "key": "country",
20 "type": "categorical"
21 },
22 {
23 "key": "region",
24 "type": "categorical"
25 }
26 ],
27 "geo_resolutions": [
28 {"key":"country", "title": "Areas"},
29 "region"
30 ],
31 "panels": ["tree", "map"],
32 "filters": ["country","region","symptom","age"],
33 "display_defaults": {
34 "color_by": "symptom",
35 "geo_resolution": "region",
36 "distance_measure": "div",
37 "map_triplicate": "true"
38 }
39}
If you want some examples of the new config files used in practice, you can see some in these builds:
- TO DO
Updating your config fileď
Itâs fairly easy to convert old export v1 config files to work with export v2.
Hereâs an export v1 config file on the left, and an export v2 config file on the right. Weâve tried to line them up to highlight the differences:
Export v1 config: 1{
2 "title": "Phylodynamics of Virus A",
3 "color_options": {
4 "gt": {
5 "menuItem": "genotype",
6 "legendTitle": "Genotype",
7 "type": "discrete",
8 "key": "genotype"
9 },
10 "num_date": {
11 "menuItem": "date",
12 "legendTitle": "Sampling date",
13 "type": "continuous",
14 "key": "num_date"
15 },
16 "age": {
17 "menuItem": "Host age",
18 "legendTitle": "Host Age (years)",
19 "type": "continuous",
20 "key": "age"
21 },
22 "host": {
23 "menuItem": "Animal",
24 "legendTitle": "Animal",
25 "type": "discrete",
26 "key": "host"
27 },
28 "country": {
29 "type": "discrete"
30 },
31 "region": {
32 "type": "discrete"
33 },
34 "clade_membership":{
35 "menuItem": "Clade",
36 "legendTitle": "Clade",
37 "type": "discrete",
38 "key": "clade_membership"
39 }
40 },
41 "geo": [
42 "country",
43 "region"
44 ],
45 "maintainer": [
46 "Hanna Kukk, Mohammad Fahir",
47 "http://vamuzlab.org"
48 ],
49 "filters": [
50 "country", "region"
51 ],
52 "defaults": {
53 "layout": "unrooted",
54 "colorBy": "age"
55 }
56}
|
Export v2 config: 1{
2 "title": "Phylodynamics of Virus A",
3 "colorings": [
4
5
6
7
8
9
10
11
12
13
14
15
16 {
17 "key": "age",
18 "title": "Host age",
19 "type": "continuous"
20 },
21 {
22 "key": "host",
23 "title": "Animal",
24 "type": "categorical"
25 },
26 {
27 "key": "country",
28 "type": "categorical"
29 },
30 {
31 "key": "region",
32 "type": "categorical"
33 }
34 ],
35
36
37
38
39
40
41 "geo_resolutions": [
42 "country",
43 "region"
44 ],
45 "maintainers": [
46 {"name": "Hanna Kukk", "url": "http://vamuzlab.org"},
47 {"name": "Mohammad Fahir", "url": "http://mfahir.co.uk"}
48 ],
49 "filters": [
50 "country", "region"
51 ],
52 "display_defaults": {
53 "layout": "unrooted",
54 "color_by": "age"
55 }
56}
|
Vaccine choicesď
In previous versions of augur, certain strains could be defined in the
config file as vaccine_choices
(auspice would display this as a
cross over the tip in the tree). This functionality is now specified
via a node-data JSON (see the v6 release notes).
Prettifying metadata fieldsď
In Auspice v1, we automatically âprettifiedâ many metadata values. For example, a country value of ânew_zealandâ would display as âNew Zealandâ, and a metadata column called âage_rangeâ would display as âAge Rangeâ.
This worked well most of the time, but meant that users couldnât intentionally keep underscores or lower-case values. It also meant we had to detect exception cases like turning âusaâ into âUSAâ rather than âUsaâ.
In Auspice v2, all values are now displayed exactly as they arrive, allowing users to ensure every gene and abbreviation displays just as it should. However, this means that you should ensure your data looks exactly how youâd like it to display - change any ânew_zealandâs in your metadata to âNew Zealandâ!
Donât forget to also change them in any custom lat-long and/or coloring files you are using. Weâve also become stricter about the format of the files that pass in color and lat-long information. Previously, it didnât matter if columns were separated by spaces or tabs - now, they must be separated by tabs.
You can find out more about how to add custom coloring and lat-long values.
If you use the command parse
to generate a metadata table from
fields in a fasta header, you can use the flag --prettify-fields
to
apply some prettifying operations to specific metadata entries,
see the documentation for parse
.