Infer ancestral sequences based on a tree.
The ancestral sequences are inferred using TreeTime. Each internal node gets assigned a nucleotide sequence that maximizes a likelihood on the tree given its descendants and its parent node. Each node then gets assigned a list of nucleotide mutations for any position that has a mismatch between its own sequence and its parent’s sequence. The node sequences and mutations are output to a node-data JSON file.
The mutation positions in the node-data JSON are one-based.
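As an illustration of the mismatch rule and the one-based numbering described above, here is a minimal sketch in plain Python. The function name `find_mutations` and the `A2G`-style string format (parent state, position, node state) are assumptions made for this example; they are not taken from the augur source.

```python
def find_mutations(parent_seq, node_seq):
    """Return mutations as strings such as 'C2G', using one-based positions.

    Hypothetical helper for illustration only; not part of augur.
    """
    if len(parent_seq) != len(node_seq):
        raise ValueError("sequences must have the same length")
    return [
        f"{parent_char}{pos}{node_char}"
        # start=1 makes the reported positions one-based
        for pos, (parent_char, node_char) in enumerate(
            zip(parent_seq, node_seq), start=1
        )
        if parent_char != node_char
    ]


mutations = find_mutations("ACGTA", "AGGTC")
print(mutations)  # ['C2G', 'A5C']
```

The second character of the sequence is reported as position 2, not 1, which is what "one-based" means here.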
- augur.ancestral.ancestral_sequence_inference(tree=None, aln=None, ref=None, infer_gtr=True, marginal=False, fill_overhangs=True, infer_tips=False)
Infer ancestral sequences using TreeTime.
tree (Bio.Phylo.BaseTree.Tree or str) – tree or filename of tree
aln (Bio.Align.MultipleSeqAlignment or str) – alignment or filename of alignment
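infer_gtr (bool, optional) – if True, infer a GTR (general time-reversible) substitution model from the data
marginal (bool, optional) – if True, use marginal rather than joint ancestral reconstruction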
fill_overhangs (bool) – In some cases, missing data at both ends of the alignment is filled with the gap character (‘-’). If set to True, these end gaps are converted to “ambiguous” characters (‘N’ for nucleotides, ‘X’ for amino acids). Otherwise, the alignment is treated as-is.
infer_tips (bool) – Since v0.7, TreeTime does not reconstruct tip states by default. This is only relevant when tip states are not exactly specified, e.g. via characters that signify ambiguous states. To replace those with the most likely state, set infer_tips=True.
- Return type:
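The effect of `fill_overhangs=True` described above can be sketched in a few lines of plain Python. This is an illustration of the idea only, not TreeTime's implementation; the helper name `fill_end_gaps` and its parameters are hypothetical.

```python
def fill_end_gaps(seq, fill_char="N", gap_char="-"):
    """Replace leading and trailing gap runs with fill_char.

    Hypothetical helper: internal gaps are left untouched, because only
    the overhangs at the alignment ends are treated as missing data.
    """
    without_left = seq.lstrip(gap_char)
    n_left = len(seq) - len(without_left)
    core = without_left.rstrip(gap_char)
    n_right = len(without_left) - len(core)
    return fill_char * n_left + core + fill_char * n_right


print(fill_end_gaps("--ACG-T---"))  # NNACG-TNNN
```

For an amino-acid alignment the same sketch would be called with `fill_char="X"`.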
- augur.ancestral.collect_mutations_and_sequences(tt, infer_tips=False, full_sequences=False, character_map=None, mask_ambiguous=True)
Iterates over the tree and produces dictionaries with mutations and sequences for each node.
tt (treetime.TreeTime) – instance of treetime with valid ancestral reconstruction
infer_tips (bool, optional) – if true, request the reconstructed tip sequences from treetime, otherwise retain input ambiguities
full_sequences (bool, optional) – if true, add the full sequences
character_map (None, optional) – optional dictionary to map characters to a custom set.
dictionary of mutations and sequences
- Return type:
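To make the shape of a per-node result concrete, the following sketch walks a toy tree and builds one dictionary entry per node. Everything here is an assumption for illustration: the nested-dict tree, the function name `collect_node_data`, and the `muts` and `sequence` keys stand in for the real TreeTime objects and are not confirmed by this page.

```python
def collect_node_data(node, parent_seq=None, full_sequences=False, data=None):
    """Recursively collect mutations (and optionally sequences) per node.

    Hypothetical stand-in that operates on plain dicts, not on a
    treetime.TreeTime instance.
    """
    if data is None:
        data = {}
    seq = node["sequence"]
    entry = {"muts": []}
    if parent_seq is not None:
        # One-based positions where the node differs from its parent
        entry["muts"] = [
            f"{p}{pos}{c}"
            for pos, (p, c) in enumerate(zip(parent_seq, seq), start=1)
            if p != c
        ]
    if full_sequences:
        entry["sequence"] = seq
    data[node["name"]] = entry
    for child in node.get("children", []):
        collect_node_data(child, seq, full_sequences, data)
    return data


toy_tree = {
    "name": "root",
    "sequence": "ACGT",
    "children": [
        {"name": "tipA", "sequence": "ACGA"},
        {"name": "tipB", "sequence": "TCGT"},
    ],
}
node_data = collect_node_data(toy_tree, full_sequences=True)
print(node_data["tipA"])  # {'muts': ['T4A'], 'sequence': 'ACGA'}
```

The root has no parent, so its mutation list is empty; with `full_sequences=False` the `sequence` key is omitted from every entry.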