Infer ancestral sequences based on a tree.

The ancestral sequences are inferred using TreeTime. Each internal node gets assigned a nucleotide sequence that maximizes a likelihood on the tree given its descendants and its parent node. Each node then gets assigned a list of nucleotide mutations for any position that has a mismatch between its own sequence and its parent’s sequence. The node sequences and mutations are output to a node-data JSON file.


The mutation positions in the node-data JSON are one-based.
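
The mismatch rule and the one-based numbering can be illustrated with a minimal sketch. The helper `list_mutations` below is hypothetical and is not part of augur. It assumes the parent and child sequences are aligned strings of equal length, and it writes each mutation as parent state, one-based position, child state, a notation chosen here for illustration.

```python
def list_mutations(parent_seq, child_seq):
    """Return mutations between two aligned sequences of equal length.

    Hypothetical helper for illustration only; not an augur function.
    """
    if len(parent_seq) != len(child_seq):
        raise ValueError("sequences must be aligned to the same length")
    mutations = []
    for index, (parent_state, child_state) in enumerate(zip(parent_seq, child_seq)):
        if parent_state != child_state:
            # index is zero-based; reported positions are one-based
            mutations.append(f"{parent_state}{index + 1}{child_state}")
    return mutations


# The third and sixth characters differ, so positions 3 and 6 are reported.
print(list_mutations("ACGTAC", "ACCTAT"))  # ['G3C', 'C6T']
```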

augur.ancestral.ancestral_sequence_inference(tree=None, aln=None, ref=None, infer_gtr=True, marginal=False, fill_overhangs=True, infer_tips=False)

Infer ancestral sequences using TreeTime.

  • tree (Bio.Phylo tree or str) – tree or filename of tree

  • aln (Bio.Align.MultipleSeqAlignment or str) – alignment or filename of alignment

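  • infer_gtr (bool, optional) – if True, infer a GTR substitution model from the data

  • marginal (bool, optional) – if True, use marginal ancestral reconstruction; otherwise use joint reconstruction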

  • fill_overhangs (bool) – In some cases, the missing data at both ends of the alignment is filled with the gap character (‘-’). If set to True, these end gaps are converted to “ambiguous” characters (‘N’ for nucleotides, ‘X’ for amino acids). Otherwise, the alignment is treated as is.

  • infer_tips (bool) – Since v0.7, TreeTime does not reconstruct tip states by default. This is only relevant when tip states are not exactly specified, e.g. via characters that signify ambiguous states. To replace those with the most likely state, set infer_tips=True.


Return type: treetime.TreeAnc instance
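
The effect of fill_overhangs can be sketched without TreeTime. The function `fill_end_gaps` below is a hypothetical stand-in, not the TreeTime implementation: it assumes a single aligned nucleotide sequence given as a string and replaces only the leading and trailing runs of gap characters, leaving internal gaps untouched.

```python
def fill_end_gaps(sequence, ambiguous="N", gap="-"):
    """Replace leading and trailing gap runs with an ambiguity character.

    Hypothetical illustration of the fill_overhangs behaviour described
    above; internal gaps are kept as they are.
    """
    without_left = sequence.lstrip(gap)
    n_left = len(sequence) - len(without_left)
    core = without_left.rstrip(gap)
    n_right = len(without_left) - len(core)
    return ambiguous * n_left + core + ambiguous * n_right


# End gaps become 'N'; the gap between 'AC' and 'GT' is preserved.
print(fill_end_gaps("--AC-GT---"))  # NNAC-GTNNN
```

For an amino acid alignment, the same sketch would be called with `ambiguous="X"`.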


augur.ancestral.collect_mutations_and_sequences(tt, infer_tips=False, full_sequences=False, character_map=None, mask_ambiguous=True)

Iterates over the tree and produces dictionaries with mutations and sequences for each node.

  • tt (treetime) – TreeTime instance with a valid ancestral reconstruction

  • infer_tips (bool, optional) – if True, request the reconstructed tip sequences from TreeTime; otherwise retain input ambiguities

  • full_sequences (bool, optional) – if True, add the full sequence of each node to the output

  • character_map (None, optional) – optional dictionary to map characters to a custom set.


Return type: dictionary of mutations and sequences
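
The per-node collection step can be sketched on a toy tree. Everything in this example is an assumption made for illustration: the tree is a plain dictionary mapping each node name to its parent's name, sequences are aligned strings, and the output keys `muts` and `sequence` are hypothetical names. It does not use TreeTime and is not the augur implementation.

```python
def collect(parents, sequences, full_sequences=False):
    """Build a per-node dictionary of mutations (and optionally sequences).

    parents   -- maps node name to parent name (None for the root)
    sequences -- maps node name to its aligned sequence
    Hypothetical sketch; key names are illustrative assumptions.
    """
    data = {}
    for name, parent in parents.items():
        entry = {"muts": []}
        if parent is not None:
            pairs = zip(sequences[parent], sequences[name])
            # Positions are one-based, hence start=1.
            entry["muts"] = [
                f"{anc}{pos}{der}"
                for pos, (anc, der) in enumerate(pairs, start=1)
                if anc != der
            ]
        if full_sequences:
            entry["sequence"] = sequences[name]
        data[name] = entry
    return data


parents = {"root": None, "NODE_1": "root", "tipA": "NODE_1", "tipB": "root"}
sequences = {"root": "ACGT", "NODE_1": "ACGA", "tipA": "TCGA", "tipB": "ACGT"}

result = collect(parents, sequences, full_sequences=True)
print(result["NODE_1"])  # {'muts': ['T4A'], 'sequence': 'ACGA'}
```

The root has no parent, so its mutation list stays empty, and a tip identical to its parent also gets an empty list.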