augur.io.sequences module
- augur.io.sequences.get_biopython_format(augur_format)
Validate sequence file format and return the inferred Biopython format.
- Return type:
- augur.io.sequences.read_sequences(*paths, format='fasta')
Read sequences from one or more paths.
Automatically infer compression mode (e.g., gzip) and return a stream of sequence records parsed according to the given file format.
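The behavior described above (several input paths, transparent decompression, one lazy stream of records) can be illustrated with a standard-library-only sketch. This is not augur's implementation: the helper names `open_maybe_gzip`, `iter_fasta`, and `read_fasta_stream` are hypothetical, detecting gzip by its magic number is an assumption about one possible approach, and records are plain `(name, sequence)` tuples rather than Biopython objects.

```python
import gzip
import itertools


def open_maybe_gzip(path):
    """Open a text file, transparently decompressing it if it is gzipped."""
    with open(path, "rb") as handle:
        is_gzip = handle.read(2) == b"\x1f\x8b"  # gzip magic number
    return gzip.open(path, "rt") if is_gzip else open(path, "rt")


def iter_fasta(path):
    """Yield (name, sequence) tuples from one FASTA file."""
    name, chunks = None, []
    with open_maybe_gzip(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                # A new header closes the previous record, if any.
                if name is not None:
                    yield name, "".join(chunks)
                name, chunks = line[1:], []
            elif line:
                chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def read_fasta_stream(*paths):
    """Chain the records of several FASTA files into a single lazy stream."""
    return itertools.chain.from_iterable(iter_fasta(p) for p in paths)
```

Because the result is an iterator, records are parsed only as the caller consumes them, so large inputs never need to fit in memory at once.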
- augur.io.sequences.read_single_sequence(path, format='fasta')
Read a single sequence from a path.
Automatically infers compression mode.
- augur.io.sequences.write_records_to_fasta(records, fasta, seq_id_field='strain', seq_field='sequence')
Write sequences from dict records to a FASTA file. Yields each record with the seq_field dropped so that the records can be consumed downstream.
- Parameters:
- Yields:
dict – A copy of the record with seq_field dropped
- Raises:
AugurError – When the sequence id field or sequence field does not exist in a record
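The write-then-yield pattern described for write_records_to_fasta can be sketched with the standard library alone. This is a stand-in, not augur's code: `write_records_to_fasta_sketch` is a hypothetical name, it takes an already-open text handle (an assumption, since the parameter list is not shown here), and it raises `KeyError` where the documented function raises `AugurError`.

```python
import io


def write_records_to_fasta_sketch(records, fasta, seq_id_field="strain", seq_field="sequence"):
    """Write dict records to an open FASTA handle, yielding each record minus its sequence."""
    for record in records:
        # The documented function raises AugurError here; KeyError is a stand-in.
        if seq_id_field not in record or seq_field not in record:
            raise KeyError(f"record is missing {seq_id_field!r} or {seq_field!r}")
        fasta.write(f">{record[seq_id_field]}\n{record[seq_field]}\n")
        # Yield a copy so the caller's record is left untouched.
        yield {key: value for key, value in record.items() if key != seq_field}


# Illustrative records; the field values are made up for this example.
records = [
    {"strain": "A/1", "sequence": "ACGT", "country": "NZ"},
    {"strain": "B/2", "sequence": "TTGA", "country": "FJ"},
]
buffer = io.StringIO()
metadata = list(write_records_to_fasta_sketch(records, buffer))
```

Since the function is a generator, nothing is written until the caller iterates over it; wrapping the call in `list(...)` as above, or looping over the result, is what drives the writes.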
- augur.io.sequences.write_sequences(sequences, path_or_buffer, format='fasta')
Write sequences to a given path in the given format.
Automatically infer compression mode (e.g., gzip) based on the path’s filename extension.
- Parameters:
sequences (iterable of Bio.SeqRecord.SeqRecord) – A list-like collection of sequences to write
path_or_buffer (str or os.PathLike or io.StringIO) – A path to a file, or a buffer, to which the given sequences are written in the given format.
format (str) – Format of the sequences, matching any of those supported by Biopython (e.g., “fasta”, “genbank”)
- Returns:
Number of sequences written out to the given path.
- Return type:
int
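Choosing a compression mode from the filename extension and returning the number of records written can be sketched as follows. This is a standard-library illustration only: `write_fasta_sketch` and the `OPENERS` table are hypothetical, the set of extensions augur actually recognizes may differ, and sequences are `(name, sequence)` tuples instead of `Bio.SeqRecord.SeqRecord` objects.

```python
import gzip
import lzma
import os

# Hypothetical extension-to-opener table; the extensions augur recognizes may differ.
OPENERS = {".gz": gzip.open, ".xz": lzma.open}


def write_fasta_sketch(sequences, path):
    """Write (name, sequence) pairs to path and return how many were written."""
    extension = os.path.splitext(os.fspath(path))[1].lower()
    # Fall back to the built-in open() for uncompressed output.
    opener = OPENERS.get(extension, open)
    count = 0
    with opener(path, "wt") as handle:
        for name, sequence in sequences:
            handle.write(f">{name}\n{sequence}\n")
            count += 1
    return count
```

Counting inside the loop, rather than calling `len()` on the input, keeps the sketch usable with generators and other one-pass iterables.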