augur traits


Infer ancestral traits based on a tree.

usage: augur traits [-h] --tree TREE --metadata FILE
                    [--metadata-delimiters METADATA_DELIMITERS [METADATA_DELIMITERS ...]]
                    [--metadata-id-columns METADATA_ID_COLUMNS [METADATA_ID_COLUMNS ...]]
                    [--weights WEIGHTS] --columns COLUMNS [COLUMNS ...]
                    [--confidence]
                    [--sampling-bias-correction SAMPLING_BIAS_CORRECTION]
                    [--output-node-data OUTPUT_NODE_DATA]

Named Arguments

--tree, -t

tree to perform trait reconstruction on

--metadata

table with metadata

--metadata-delimiters

delimiters to accept when reading a metadata file. Only one delimiter will be inferred.

Default: (',', '\t')

--metadata-id-columns

names of possible metadata columns containing identifier information, ordered by priority. Only one ID column will be inferred.

Default: ('strain', 'name')

--weights

tsv/csv table with equilibrium probabilities of discrete states

--columns

metadata fields to perform discrete reconstruction on

--confidence

record the distribution of subleading mugration states

Default: False

--sampling-bias-correction

a rough estimate of how many more events would have been observed if sequences represented an even sample. This should be roughly (1 - sum_i p_i^2) / (1 - sum_i t_i^2), where p_i are the equilibrium frequencies and t_i are the apparent ones (or rather, the time spent in a particular state on the tree).
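
As a rough illustration of the ratio above, the following sketch computes (1 - sum_i p_i^2) / (1 - sum_i t_i^2) from two tables of frequencies. The function name `sampling_bias_correction` and the example numbers are hypothetical and are not part of augur; the inputs are normalized first, so raw counts work as well as frequencies.

```python
def sampling_bias_correction(equilibrium, apparent):
    """Return (1 - sum_i p_i^2) / (1 - sum_i t_i^2).

    equilibrium: dict mapping state -> equilibrium weight (p_i)
    apparent:    dict mapping state -> apparent weight (t_i)
    Both inputs are normalized to sum to 1 before use.
    """
    def normalize(weights):
        total = sum(weights.values())
        if total <= 0:
            raise ValueError("weights must sum to a positive number")
        return [value / total for value in weights.values()]

    p = normalize(equilibrium)
    t = normalize(apparent)

    denominator = 1.0 - sum(x * x for x in t)
    if denominator == 0:
        # All apparent weight sits in a single state, so the ratio is undefined.
        raise ValueError("apparent frequencies are concentrated in one state")
    return (1.0 - sum(x * x for x in p)) / denominator


# Hypothetical example: three states assumed equally likely at equilibrium,
# but one state dominates the sampled tree.
equilibrium = {"A": 1, "B": 1, "C": 1}
apparent = {"A": 80, "B": 10, "C": 10}
correction = sampling_bias_correction(equilibrium, apparent)
print(round(correction, 2))
```

The resulting number could then be passed as the value of --sampling-bias-correction; treat it as a starting estimate, since the help text itself describes the quantity as rough.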

--output-node-data

name of JSON file to save trait inferences to
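
A minimal sketch of reading the inferred traits back out of the node-data JSON. It assumes the file has a top-level `nodes` object keyed by node name, with one entry per reconstructed column; the sample document, the node names, the `country_confidence` key, and the helper `traits_by_node` are all hypothetical and should be checked against a file produced by your own run.

```python
import json

# Hypothetical stand-in for the contents of the --output-node-data file.
sample_text = """
{
  "nodes": {
    "NODE_0000001": {"country": "Peru",
                     "country_confidence": {"Peru": 0.9, "Chile": 0.1}},
    "strain_A": {"country": "Chile"}
  }
}
"""


def traits_by_node(node_data, column):
    """Map each node name to its inferred value for `column`.

    Nodes that do not carry the column are skipped.
    """
    nodes = node_data.get("nodes", {})
    return {
        name: attrs[column]
        for name, attrs in nodes.items()
        if column in attrs
    }


node_data = json.loads(sample_text)
countries = traits_by_node(node_data, "country")
for name, country in countries.items():
    print(name, country)
```

For a real file, replace `json.loads(sample_text)` with `json.load` on the opened file.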

Note that missing data must be represented by a ? character. Missing data will currently be inferred.

What about missing data?

If you have strains with missing data and you want them to be reconstructed, then you must give them the value ?. For example, if you are running a reconstruction of country and you don’t know the country for a particular strain, you should set country to ? in the metadata file for that strain. Then, traits will estimate the most likely country value for any strains where you have provided ?.

If you do not want these traits to be reconstructed (you would like it to remain clear that the country is unknown for this sample), then simply leave this field blank in the metadata file.

Note that every other value (an empty string, NA, unknown) will be interpreted as a valid value! So, it’s best to be consistent with whatever you use for missing values, or strains with NA will be shown as different from those with unknown!
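
One way to keep missing values consistent is to normalize the metadata before running the reconstruction. The sketch below is an illustration only: the token list `MISSING_TOKENS`, the helper `normalize_missing`, and the sample table are hypothetical, and the column name `country` is taken from the example above.

```python
import csv
import io

# Hypothetical set of spellings to treat as "missing" (compared case-insensitively).
MISSING_TOKENS = {"", "na", "n/a", "unknown"}


def normalize_missing(rows, column, replacement="?"):
    """Return copies of `rows` with missing-like values in `column` replaced.

    Use replacement="?" to have the value reconstructed, or replacement=""
    to leave it blank so it is not reconstructed.
    """
    cleaned = []
    for row in rows:
        row = dict(row)  # copy so the input rows are left untouched
        if row[column].strip().lower() in MISSING_TOKENS:
            row[column] = replacement
        cleaned.append(row)
    return cleaned


# Hypothetical tab-delimited metadata with three different "missing" spellings.
sample = "strain\tcountry\nA\tPeru\nB\tNA\nC\tunknown\nD\t\n"
rows = list(csv.DictReader(io.StringIO(sample), delimiter="\t"))
cleaned = normalize_missing(rows, "country")

for row in cleaned:
    print(row["strain"], row["country"])
```

After this step every unknown country carries the same marker, so strains are no longer split into separate NA and unknown groups.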