augur.align module
Align multiple sequences from FASTA.
-
exception
augur.align.
AlignmentError
Bases: Exception
-
augur.align.
check_arguments
(args)
-
augur.align.
check_duplicates
(*values)
-
augur.align.
ensure_reference_strain_present
(ref_name, existing_alignment, seqs)
-
augur.align.
generate_alignment_cmd
(method, nthreads, existing_aln_fname, seqs_to_align_fname, aln_fname, log_fname)
-
augur.align.
make_gaps_ambiguous
(aln)
Replace all gaps with 'N' in all sequences in the alignment. TreeTime will treat them as fully ambiguous and replace them with the most likely state. This modifies the alignment in place.
- Parameters
aln (MultipleSeqAlign) – Biopython Alignment
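The in-place gap replacement described above can be illustrated with a minimal sketch. This is not augur's implementation: `Record` is a hypothetical stand-in for a Biopython record, and `gaps_to_n` is an illustrative name, so the example runs with the standard library alone.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical stand-in for a sequence record (name plus sequence string)."""
    name: str
    seq: str


def gaps_to_n(records):
    """Replace every gap character '-' with 'N', modifying the records in place."""
    for record in records:
        record.seq = record.seq.replace("-", "N")


records = [Record("a", "AC-GT"), Record("b", "--GGT")]
gaps_to_n(records)
print([r.seq for r in records])  # ['ACNGT', 'NNGGT']
```

Because the function mutates its argument and returns nothing, callers that still need the gapped sequences would have to copy them first.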
-
augur.align.
prettify_alignment
(aln)
Converts all bases to uppercase and removes the auto reverse-complement prefix (_R_). This modifies the alignment in place.
- Parameters
aln (MultipleSeqAlign) – Biopython Alignment
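As a rough illustration of the two clean-up steps named above (uppercasing and dropping the `_R_` prefix), the sketch below works on a hypothetical `Record` class rather than a Biopython alignment. The function name `prettify` and the choice to strip the prefix from the record name are assumptions made for the example, not a description of augur's code.

```python
from dataclasses import dataclass

REVERSE_COMPLEMENT_PREFIX = "_R_"


@dataclass
class Record:
    """Hypothetical stand-in for a sequence record (name plus sequence string)."""
    name: str
    seq: str


def prettify(records):
    """Uppercase all bases and strip a leading '_R_' from names, in place."""
    for record in records:
        record.seq = record.seq.upper()
        if record.name.startswith(REVERSE_COMPLEMENT_PREFIX):
            record.name = record.name[len(REVERSE_COMPLEMENT_PREFIX):]


records = [Record("_R_crick_strand", "acgt-n"), Record("no_gaps", "ACGT")]
prettify(records)
print([(r.name, r.seq) for r in records])
# [('crick_strand', 'ACGT-N'), ('no_gaps', 'ACGT')]
```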
-
augur.align.
prune_seqs_matching_alignment
(seqs, aln)
Return a set of seqs excluding those set via exclude, and print a warning message for each sequence that is excluded.
-
augur.align.
read_alignment
(fname)
-
augur.align.
read_reference
(ref_fname)
-
augur.align.
read_sequences
(*fnames)
-
augur.align.
register_arguments
(parser)
-
augur.align.
run
(args)
- Parameters
args (namespace) – arguments passed in via the command-line from augur
- Returns
returns 0 for success, 1 for general error
- Return type
int
-
augur.align.
strip_non_reference
(aln, reference, keep_reference=False)
Return sequences that have all insertions relative to the reference removed. The alignment is read from file and returned as a list of sequences.
- Parameters
aln (MultipleSeqAlign) – Biopython Alignment
reference (str) – name of reference sequence, assumed to be part of the alignment
keep_reference (bool, optional) – by default, the reference sequence is removed after stripping non-reference sequence. To keep the reference, use keep_reference=True
- Returns
list – list of trimmed sequences, effectively a multiple alignment
Tests
-----
>>> [s.name for s in strip_non_reference(read_alignment("tests/data/align/test_aligned_sequences.fasta"), "with_gaps", keep_reference=False)]
Trimmed gaps in with_gaps from the alignment
['no_gaps', 'some_other_seq', '_R_crick_strand']
>>> [s.name for s in strip_non_reference(read_alignment("tests/data/align/test_aligned_sequences.fasta"), "with_gaps", keep_reference=True)]
Trimmed gaps in with_gaps from the alignment
['with_gaps', 'no_gaps', 'some_other_seq', '_R_crick_strand']
>>> [s.name for s in strip_non_reference(read_alignment("tests/data/align/test_aligned_sequences.fasta"), "no_gaps", keep_reference=True)]
No gaps in alignment to trim (with respect to the reference, no_gaps)
['with_gaps', 'no_gaps', 'some_other_seq', '_R_crick_strand']
>>> [s.name for s in strip_non_reference(read_alignment("tests/data/align/test_aligned_sequences.fasta"), "no_gaps", keep_reference=False)]
No gaps in alignment to trim (with respect to the reference, no_gaps)
['with_gaps', 'some_other_seq', '_R_crick_strand']
>>> [s.name for s in strip_non_reference(read_alignment("tests/data/align/test_aligned_sequences.fasta"), "missing", keep_reference=False)]
Traceback (most recent call last):
  ...
augur.align.AlignmentError (ERROR: reference missing not found in alignment)
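The idea behind stripping insertions relative to a reference can be sketched on plain strings: every alignment column where the reference has a gap is dropped from all sequences. The sketch below is an assumption-laden illustration, not augur's code. It uses a dict of name to aligned string instead of a Biopython alignment, and `strip_insertions` and `ReferenceNotFound` are hypothetical names.

```python
class ReferenceNotFound(Exception):
    """Hypothetical stand-in for the error raised when the reference is absent."""


def strip_insertions(alignment, reference, keep_reference=False):
    """Drop every column in which the reference sequence has a gap.

    alignment: dict mapping sequence name to aligned string (equal lengths).
    Returns a new dict; the reference is omitted unless keep_reference is True.
    """
    if reference not in alignment:
        raise ReferenceNotFound(f"reference {reference} not found in alignment")

    ref_seq = alignment[reference]
    # Columns to keep are those where the reference has a real base.
    keep = [i for i, base in enumerate(ref_seq) if base != "-"]

    stripped = {}
    for name, seq in alignment.items():
        if name == reference and not keep_reference:
            continue
        stripped[name] = "".join(seq[i] for i in keep)
    return stripped


alignment = {"ref": "AC--GT", "s1": "ACTTGT", "s2": "A-T-GT"}
print(strip_insertions(alignment, "ref"))
# {'s1': 'ACGT', 's2': 'A-GT'}
print(strip_insertions(alignment, "ref", keep_reference=True))
# {'ref': 'ACGT', 's1': 'ACGT', 's2': 'A-GT'}
```

Note that gaps in the non-reference sequences survive when the reference has a base at that column, as `s2` shows; only columns that are insertions relative to the reference are removed.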
-
augur.align.
write_seqs
(seqs, fname)
A wrapper around SeqIO.write with error handling.