augur refine


Refine an initial tree using sequence metadata.

usage: augur refine [-h] [--alignment ALIGNMENT] --tree TREE [--metadata FILE]
                    [--metadata-delimiters METADATA_DELIMITERS [METADATA_DELIMITERS ...]]
                    [--metadata-id-columns METADATA_ID_COLUMNS [METADATA_ID_COLUMNS ...]]
                    [--output-tree OUTPUT_TREE]
                    [--output-node-data OUTPUT_NODE_DATA] [--use-fft]
                    [--max-iter MAX_ITER] [--timetree]
                    [--coalescent COALESCENT] [--gen-per-year GEN_PER_YEAR]
                    [--clock-rate CLOCK_RATE] [--clock-std-dev CLOCK_STD_DEV]
                    [--root ROOT [ROOT ...]] [--keep-root] [--covariance]
                    [--no-covariance]
                    [--keep-polytomies | --stochastic-resolve | --greedy-resolve]
                    [--precision {0,1,2,3}] [--date-format DATE_FORMAT]
                    [--date-confidence] [--date-inference {joint,marginal}]
                    [--branch-length-inference {auto,joint,marginal,input}]
                    [--clock-filter-iqd CLOCK_FILTER_IQD]
                    [--vcf-reference VCF_REFERENCE]
                    [--year-bounds YEAR_BOUNDS [YEAR_BOUNDS ...]]
                    [--divergence-units {mutations,mutations-per-site}]
                    [--seed SEED] [--verbosity VERBOSITY]
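
As an illustration of how the options in the synopsis combine, the following Python sketch assembles a typical time-tree invocation as an argument list and prints it. The file names are hypothetical placeholders and the helper build_refine_command is not part of Augur; every flag is taken from the usage text above.

```python
import shlex


def build_refine_command(tree, alignment, metadata, output_tree, output_node_data):
    """Assemble an `augur refine` argument list for a time-resolved tree."""
    return [
        "augur", "refine",
        "--tree", tree,
        "--alignment", alignment,
        "--metadata", metadata,
        "--output-tree", output_tree,
        "--output-node-data", output_node_data,
        "--timetree",          # infer a time tree with TreeTime
        "--date-confidence",   # also report confidence intervals for node dates
    ]


# Hypothetical file names; replace them with your own paths.
cmd = build_refine_command(
    "tree_raw.nwk", "aligned.fasta", "metadata.tsv",
    "tree.nwk", "branch_lengths.json",
)

# Print a shell-quoted command line. To execute it from Python instead,
# pass the list to subprocess.run(cmd, check=True).
print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids quoting problems when paths contain spaces.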

Named Arguments

--alignment, -a

alignment in fasta or VCF format

--tree, -t

prebuilt tree in Newick format

--metadata

sequence metadata

--metadata-delimiters

delimiters to accept when reading a metadata file. Only one delimiter will be inferred.

Default: (',', '\t')

--metadata-id-columns

names of possible metadata columns containing identifier information, ordered by priority. Only one ID column will be inferred.

Default: ('strain', 'name')
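
The two options above each describe picking one value from a list of candidates. The Python sketch below shows one way such a choice could work; it is an illustration only, and Augur's actual inference may differ. The functions read_header and pick_id_column and the rule "choose the delimiter that occurs most often in the header line" are assumptions made for this example.

```python
import csv
import io


def read_header(text, delimiters=(",", "\t")):
    """Guess a single delimiter from the first line, then split the header."""
    first_line = text.splitlines()[0]
    # Assumption: pick the candidate delimiter that occurs most often.
    delimiter = max(delimiters, key=first_line.count)
    return next(csv.reader(io.StringIO(first_line), delimiter=delimiter))


def pick_id_column(header, candidates=("strain", "name")):
    """Return the first candidate column present in the header, by priority."""
    for column in candidates:
        if column in header:
            return column
    raise ValueError(f"none of {candidates} found in header {header}")


# Both candidate ID columns are present; 'strain' wins because it is listed first.
sample = "name\tstrain\tdate\nA1\tA/1/2020\t2020-01-05\n"
header = read_header(sample)
print(pick_id_column(header))  # strain
```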

--output-tree

file name to write tree to

--output-node-data

file name to write branch lengths as node data

--use-fft

produce timetree using FFT for convolutions

Default: False

--max-iter

maximal number of iterations TreeTime uses for timetree inference

Default: 2

--timetree

produce timetree using TreeTime; requires a tree whose branch lengths are in units of average number of nucleotide or protein substitutions per site (and do not exceed 4)

Default: False

--coalescent

coalescent time scale in units of inverse clock rate (float), optimize as scalar (‘opt’), or skyline (‘skyline’)

--gen-per-year

number of generations per year, relevant for skyline output (‘skyline’)

Default: 50

--clock-rate

fixed clock rate

--clock-std-dev

standard deviation of the fixed clock_rate estimate

--root

rooting mechanism (‘best’, ‘least-squares’, ‘min_dev’, ‘oldest’, ‘mid_point’) OR node to root by OR two nodes indicating a monophyletic group to root by. Run treetime -h for definitions of rooting methods.

Default: 'best'
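
The three shapes that --root accepts can be told apart by the number and content of its values. The Python sketch below classifies them; it is an illustration, not Augur's own parsing. The function describe_root, the label strings, the node names, and the choice to reject more than two values are assumptions of this example.

```python
# Rooting methods listed in the --root description above.
ROOT_METHODS = {"best", "least-squares", "min_dev", "oldest", "mid_point"}


def describe_root(values):
    """Classify the values given to --root as a method, a node, or a clade."""
    if len(values) == 1:
        kind = "method" if values[0] in ROOT_METHODS else "node"
        return kind, list(values)
    if len(values) == 2:
        # Two nodes indicate a monophyletic group to root by.
        return "monophyletic group", list(values)
    # Assumption: the description only covers one or two values.
    raise ValueError("expected one rooting method, one node, or two nodes")


print(describe_root(["best"]))                  # ('method', ['best'])
print(describe_root(["sample_A"]))              # ('node', ['sample_A'])
print(describe_root(["sample_A", "sample_B"]))  # ('monophyletic group', ['sample_A', 'sample_B'])
```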

--keep-root

do not reroot the tree; use it as-is. Overrides anything specified by --root.

Default: False

--covariance

Account for covariation when estimating rates and/or rerooting. Use --no-covariance to turn off.

Default: True

--no-covariance

Default: True

--keep-polytomies

Do not attempt to resolve polytomies

Default: False

--stochastic-resolve

Resolve polytomies via stochastic subtree building rather than greedy optimization

Default: False

--greedy-resolve

Default: True
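
The synopsis groups --keep-polytomies, --stochastic-resolve and --greedy-resolve as alternatives, so at most one of them may be given. The Python sketch below reproduces that behaviour with argparse's mutually exclusive groups. It is a stand-alone illustration, not Augur's parser; the program name, the function polytomy_mode and the returned labels are assumptions, and greedy resolution is treated as the fallback based on the defaults listed above.

```python
import argparse

parser = argparse.ArgumentParser(prog="refine-sketch")
group = parser.add_mutually_exclusive_group()
group.add_argument("--keep-polytomies", action="store_true")
group.add_argument("--stochastic-resolve", action="store_true")
group.add_argument("--greedy-resolve", action="store_true")


def polytomy_mode(argv):
    """Map the parsed flags to a single polytomy-handling mode."""
    args = parser.parse_args(argv)
    if args.keep_polytomies:
        return "keep"
    if args.stochastic_resolve:
        return "stochastic"
    # Neither alternative was requested, so fall back to greedy resolution.
    return "greedy"


print(polytomy_mode([]))                        # greedy
print(polytomy_mode(["--stochastic-resolve"]))  # stochastic
```

Passing two of the flags together makes argparse exit with a "not allowed with argument" error.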

--precision

Possible choices: 0, 1, 2, 3

precision used by TreeTime to determine the number of grid points that are used for the evaluation of the branch length interpolation objects. Values range from 0 (rough) to 3 (ultra fine) and default to ‘auto’.

--date-format

date format

Default: '%Y-%m-%d'
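
The default value uses the strftime/strptime directive syntax. The Python sketch below shows how such a format string parses a date and converts it to a decimal year. The function to_numeric_date and the day-fraction convention are assumptions of this example; Augur and TreeTime may convert dates differently.

```python
from datetime import datetime


def to_numeric_date(text, date_format="%Y-%m-%d"):
    """Parse a date string and return it as a decimal year."""
    parsed = datetime.strptime(text, date_format)
    year_start = datetime(parsed.year, 1, 1)
    year_end = datetime(parsed.year + 1, 1, 1)
    # Fraction of the year elapsed at the start of the parsed day.
    return parsed.year + (parsed - year_start) / (year_end - year_start)


print(to_numeric_date("2020-01-01"))              # 2020.0
print(to_numeric_date("05/01/2020", "%d/%m/%Y"))  # day-first input needs a matching format
```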

--date-confidence

calculate confidence intervals for node dates

Default: False

--date-inference

Possible choices: joint, marginal

how dates are assigned to internal nodes: ‘marginal’ assigns each node its marginally most likely date, ‘joint’ assigns the jointly most likely dates

Default: 'joint'

--branch-length-inference

Possible choices: auto, joint, marginal, input

branch length mode of treetime to use

Default: 'auto'

--clock-filter-iqd

clock-filter: remove tips that deviate more than n_iqd interquartile ranges from the root-to-tip vs time regression
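
The filter described above can be sketched in a few lines of Python: fit a root-to-tip versus time regression, measure each tip's residual, and flag tips whose residual exceeds n_iqd interquartile ranges. This is a simplified illustration under stated assumptions (ordinary least squares, quartiles from statistics.quantiles, made-up data); TreeTime's own implementation may differ in its details.

```python
import statistics


def clock_filter(dates, distances, n_iqd):
    """Return indices of tips whose regression residual exceeds n_iqd * IQR."""
    mean_t = statistics.fmean(dates)
    mean_d = statistics.fmean(distances)
    # Ordinary least-squares fit of root-to-tip distance against sampling date.
    slope = (
        sum((t - mean_t) * (d - mean_d) for t, d in zip(dates, distances))
        / sum((t - mean_t) ** 2 for t in dates)
    )
    intercept = mean_d - slope * mean_t
    residuals = [d - (intercept + slope * t) for t, d in zip(dates, distances)]
    # Interquartile range of the residuals.
    q1, _, q3 = statistics.quantiles(residuals, n=4)
    iqd = q3 - q1
    return [i for i, r in enumerate(residuals) if abs(r) > n_iqd * iqd]


# Made-up data: roughly 0.002 substitutions per site per year, with one
# tip (index 4) far above the trend.
dates = [2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018]
distances = [0.0005, 0.0015, 0.005, 0.005, 0.030, 0.009, 0.013, 0.0135, 0.0165]
print(clock_filter(dates, distances, n_iqd=3))  # [4]
```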

--vcf-reference

fasta file of the sequence the VCF was mapped to

--year-bounds

prediction bounds for samples with XX in the year: specify either a minimum, or both a maximum and a minimum

--divergence-units

Possible choices: mutations, mutations-per-site

Units in which sequence divergence is exported.

Default: 'mutations-per-site'

--seed

seed for random number generation

--verbosity

treetime verbosity, between 0 and 6 (higher values more output)

Default: 1

Guides

See “How do I specify refine rates?”.