Importing BEAST MCC trees into augur
This documentation details how to import BEAST MCC trees using
augur import beast.
Currently this is most useful for producing auspice-compatible output using
augur export; however, in the future we will provide instructions on how to perform additional analysis using other augur tools.
BEAST 1 & 2 are extremely versatile tools. We have tested augur on a number of BEAST runs, using both BEAST and BEAST 2; however, there may be issues with your particular run. Please get in touch if you encounter any.
Parse the BEAST tree using
augur import beast. Most of the options are explained below, but run with
--help to see them all. This produces a Newick tree file and a node-data JSON file containing BEAST traits as well as (temporal) branch lengths.
It may be necessary to modify the format of the traits written to the node-data JSON. These are extracted directly from the BEAST-created annotations in the NEXUS file. For example, if you have encoded location or host as integers, then you should map these back to their true values now.
The auspice-config.json file is needed for various display options in auspice. A template is provided as terminal output from step (1); however, there is not enough information in the MCC tree to generate this file automatically. Pay particular attention to the color-variable types, which can be either “continuous” or “discrete”.
Extra metadata can be included here – either as an additional node-data JSON file, or in TSV format. Any additional metadata must be both specified in the
auspice-config.json file and provided to
Export auspice-compatible JSONs using
augur export. A basic example of what options to supply to this command is provided as terminal output from step (1).
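For step (2) above, remapping integer-encoded traits back to their true values might look like the following minimal sketch. It assumes the node-data JSON keeps per-node attributes under a top-level "nodes" key; the trait name "host", the node names, and the code-to-name mapping are all hypothetical and should be replaced with those from your own run.

```python
import json


def remap_trait(node_data, trait, mapping):
    """Replace encoded values of `trait` with their true values, in place.

    Values without an entry in `mapping` are left unchanged.
    """
    for attrs in node_data.get("nodes", {}).values():
        if trait in attrs:
            key = str(attrs[trait])  # BEAST may write codes as strings or numbers
            attrs[trait] = mapping.get(key, attrs[trait])
    return node_data


# Hypothetical node-data content; in practice, load it from the JSON file
# written by the import step using json.load().
node_data = {
    "nodes": {
        "NODE_0000001": {"host": "1", "height": 2.5},
        "tipA": {"host": 2, "height": 0.0},
        "tipB": {"height": 1.0},
    }
}
host_codes = {"1": "camel", "2": "human"}  # hypothetical encoding

remap_trait(node_data, "host", host_codes)
print(json.dumps(node_data, indent=2))
```

After remapping, write the result back out with json.dump and pass that file on to the export step.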
The BEAST MCC tree, in NEXUS format, may have
Taxlabels encoded as integers or strings – the latter is normally accompanied by a
Translate block mapping tip names to integers which are used in the actual tree block.
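To illustrate what a Translate block contains, here is a rough sketch that pulls the integer-to-tip-name mapping out of NEXUS text. It is not how augur parses the file; the function name and the example tree are hypothetical, and the simple splitting assumes tip names contain no commas or semicolons.

```python
import re


def parse_translate_block(nexus_text):
    """Return {integer_label: tip_name} from a NEXUS Translate block.

    Returns an empty dict if the text has no Translate block.
    """
    match = re.search(r"translate\s+(.*?);", nexus_text,
                      flags=re.IGNORECASE | re.DOTALL)
    if match is None:
        return {}
    mapping = {}
    for entry in match.group(1).split(","):
        entry = entry.strip()
        if not entry:
            continue
        label, name = entry.split(None, 1)  # "1 'tip name'" -> label, name
        mapping[label] = name.strip().strip("'\"")
    return mapping


# Hypothetical NEXUS fragment for illustration only.
example = """
Begin trees;
    Translate
        1 'camel_A|KF000001|camel|2014-03-01',
        2 human_B|KF000002|human|2015-06-15
    ;
tree TREE1 = [&R] (1:0.5,2:1.7);
End;
"""

print(parse_translate_block(example))
```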
Calculating the root date
BEAST trees are dated in “time units” from the root date.
Helpfully for us, the sample date is often encoded in the tip name – for instance, tip names may follow the format
sample_name|accession|host_name|YYYY-MM-DD – and we typically utilise this to calculate the root date.
If the dates are not encoded in the tip names, or no tip-names are used, then you will need to provide the date of the most recent tip (in decimal format) via the
If the dates are provided in the tip names, then we use a regex to extract them (
--tip-date-regex, the default finds “YYYY-MM-DD” at the end of the tip name). The date format may also be specified if needed via
--tip-date-format, which is interpreted by the Python datetime module; see the datetime documentation for help with these formats.
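The date handling described above can be sketched roughly as follows. This is an illustration, not augur's implementation: the regex shown is an assumed stand-in for the default, the function names are hypothetical, and the root-date arithmetic assumes that the tree's time units are years and that the root height is measured back from the most recent tip.

```python
import re
from datetime import datetime


def decimal_year(date):
    """Convert a datetime to a decimal year (1 Jan 2015 -> 2015.0)."""
    start = datetime(date.year, 1, 1)
    end = datetime(date.year + 1, 1, 1)
    return date.year + (date - start) / (end - start)


def tip_date(name, regex=r"[0-9]{4}-[0-9]{2}-[0-9]{2}$", fmt="%Y-%m-%d"):
    """Extract the sample date from a tip name as a decimal year."""
    match = re.search(regex, name)
    if match is None:
        raise ValueError(f"no date found in tip name: {name!r}")
    return decimal_year(datetime.strptime(match.group(0), fmt))


def root_date(tip_names, root_height, **kwargs):
    """Root date = most recent tip date minus the root height (in years)."""
    most_recent = max(tip_date(name, **kwargs) for name in tip_names)
    return most_recent - root_height


# Hypothetical tip names in the sample_name|accession|host_name|YYYY-MM-DD format.
tips = ["camel_A|KF000001|camel|2014-01-01", "human_B|KF000002|human|2016-01-01"]
print(root_date(tips, root_height=3.5))
```

A different date layout would need both a matching regex and a matching format string, for example regex=r"[0-9]{2}/[0-9]{2}/[0-9]{4}$" together with fmt="%d/%m/%Y".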
BEAST inferred traits
These are all extracted and stored in the node-data JSON. The terminal output will list the traits found, e.g.:
Parsed BEAST traits:
name                 n(internal)  n(terminal)
type                 273          274
type_confidence      273          274
height               273          274
height_median        273          272
height_confidence    273          272
posterior            273          0
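A summary like the one above can be reproduced from the node-data JSON to check which traits are missing on some nodes. The sketch below is only an illustration: it assumes a mapping of node names to trait dictionaries and a separately known set of tip names, and the function name and example data are hypothetical.

```python
from collections import defaultdict


def tally_traits(nodes, tip_names):
    """Count, per trait, how many internal and terminal nodes carry it.

    Returns {trait: [n_internal, n_terminal]}.
    """
    counts = defaultdict(lambda: [0, 0])
    for name, attrs in nodes.items():
        column = 1 if name in tip_names else 0  # 0 = internal, 1 = terminal
        for trait in attrs:
            counts[trait][column] += 1
    return dict(counts)


# Hypothetical nodes for illustration only.
nodes = {
    "NODE_0000001": {"type": "a", "posterior": 1.0},
    "tipA": {"type": "a", "height": 0.0},
    "tipB": {"type": "b"},
}
for trait, (n_int, n_term) in tally_traits(nodes, {"tipA", "tipB"}).items():
    print(f"{trait:<20} {n_int:<12} {n_term}")
```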
augur import beast --mcc data/MERS_CoV_mcc.tree --output-tree results/mers.new
augur export v1 --tree results/mers.new --node-data results/beast_data.json \
    --output-tree auspice/mers_tree.json --output-meta auspice/mers_meta.json
augur import beast --mcc data/beast.mcc.nex --output-tree results/mers.new
A full example build can be found here.