augur.utils module
- class augur.utils.AugurJSONEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)
Bases: JSONEncoder
A custom JSONEncoder subclass to serialize data types used for various data stored in dictionary format.
- default(obj)
Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).
For example, to support arbitrary iterators, you could implement default like this:
    def default(self, o):
        try:
            iterable = iter(o)
        except TypeError:
            pass
        else:
            return list(iterable)
        # Let the base class default method raise the TypeError
        return super().default(o)
- class augur.utils.BytesWrittenCounterIO
Bases: RawIOBase
Binary stream to count the number of bytes sent via write().
- write(b)
- written
Number of bytes written.
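A byte-counting stream of this kind can be sketched as follows. This is a minimal illustration and not augur's implementation: the class name CountingSink is hypothetical, and it assumes the stream discards its input and only tracks the running total.

```python
import io


class CountingSink(io.RawIOBase):
    """Hypothetical stand-in: discards data and counts bytes passed to write()."""

    def __init__(self):
        super().__init__()
        self.written = 0  # running total of bytes seen so far

    def writable(self):
        return True

    def write(self, b):
        n = len(b)
        self.written += n
        return n  # report that every byte was accepted


sink = CountingSink()
sink.write(b"hello")
sink.write(b", world")
print(sink.written)  # 12
```

Because nothing is stored, a stream like this measures the size of serialized output without holding the output in memory.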
- exception augur.utils.InvalidTreeError
Bases: Exception
Represents an error loading a phylogenetic tree from a filename.
- augur.utils.annotate_parents_for_tree(tree)
Annotate each node in the given tree with its parent.
Examples
>>> import io
>>> tree = Bio.Phylo.read(io.StringIO("(A, (B, C))"), "newick")
>>> not any([hasattr(node, "parent") for node in tree.find_clades()])
True
>>> tree = annotate_parents_for_tree(tree)
>>> tree.root.parent is None
True
>>> all([hasattr(node, "parent") for node in tree.find_clades()])
True
- augur.utils.available_cpu_cores(fallback=1)
Returns the number (an int) of CPU cores available to this process, if determinable, otherwise the number of CPU cores available to the computer, if determinable, otherwise the fallback number (which defaults to 1).
- Return type: int
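The three-step lookup described above can be sketched with the standard library. This is an illustration under assumptions, not augur's code: the name cpu_cores_sketch is hypothetical, and it assumes os.sched_getaffinity is the source for "available to this process" (it exists on Linux but not on every platform).

```python
import os


def cpu_cores_sketch(fallback=1):
    """Hypothetical sketch of a process -> machine -> fallback lookup."""
    # 1. Cores this process is allowed to run on (not available everywhere).
    try:
        return len(os.sched_getaffinity(0))
    except AttributeError:
        pass  # os.sched_getaffinity does not exist on this platform

    # 2. Cores in the machine; os.cpu_count() returns None if undeterminable.
    # 3. Otherwise use the caller-supplied fallback.
    return os.cpu_count() or fallback


print(cpu_cores_sketch())
```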
- augur.utils.first_line(text)
Returns the first line of the given text, ignoring leading and trailing whitespace.
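One way to get this behavior is to strip the text before splitting it into lines. The sketch below is illustrative only: first_line_sketch is a hypothetical name, and returning an empty string for blank input is an assumption, not documented behavior.

```python
def first_line_sketch(text):
    """Hypothetical sketch: first line of text after stripping outer whitespace."""
    lines = text.strip().splitlines()
    # Assumption: blank input yields an empty string rather than an error.
    return lines[0] if lines else ""


print(first_line_sketch("\n\n  hello\nworld\n"))  # hello
```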
- augur.utils.genome_features_to_auspice_annotation(features, ref_seq_name=None, assert_nuc=False)
- Parameters:
- Returns:
annotations – See schema-annotations.json for the schema this conforms to
- Return type:
- augur.utils.get_augur_version()
Returns a string of the current augur version.
- augur.utils.get_json_name(args, default=None)
- augur.utils.get_parent_name_by_child_name_for_tree(tree)
Return dictionary mapping child node names to parent node names.
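The shape of such a mapping can be shown on a plain nested dictionary. This sketch does not use Bio.Phylo: the function name and the "name" and "children" keys are hypothetical stand-ins for a tree's clades.

```python
def parent_name_by_child_name(node, mapping=None):
    """Hypothetical sketch: map each child's name to its parent's name."""
    mapping = {} if mapping is None else mapping
    for child in node.get("children", []):
        mapping[child["name"]] = node["name"]
        parent_name_by_child_name(child, mapping)  # descend into the subtree
    return mapping


# The topology (A, (B, C)) with illustrative names for the internal nodes.
tree = {
    "name": "root",
    "children": [
        {"name": "A"},
        {"name": "inner", "children": [{"name": "B"}, {"name": "C"}]},
    ],
}
print(parent_name_by_child_name(tree))
# {'A': 'root', 'inner': 'root', 'B': 'inner', 'C': 'inner'}
```

The root has no parent, so it never appears as a key.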
- augur.utils.json_size(data)
Return size in bytes of a Python object in JSON string form.
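A simple way to measure this is to serialize the object and count the encoded bytes. The sketch below assumes default json.dumps settings and UTF-8 encoding; json_size_sketch is a hypothetical name, and augur's function may compute the size differently.

```python
import json


def json_size_sketch(data):
    """Hypothetical sketch: byte length of the JSON string form of data."""
    return len(json.dumps(data).encode("utf-8"))


print(json_size_sketch({"a": 1}))  # 8, the length of '{"a": 1}'
```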
- augur.utils.json_to_tree(json_dict, root=True, parent_cumulative_branch_length=None)
Returns a Bio.Phylo tree corresponding to the given JSON dictionary exported by tree_to_json.
Assigns links back to parent nodes for the root of the tree.
Examples
Test opening a JSON from augur export v1.
>>> import json
>>> json_fh = open("tests/data/json_tree_to_nexus/flu_h3n2_ha_3y_tree.json", "r")
>>> json_dict = json.load(json_fh)
>>> tree = json_to_tree(json_dict)
>>> tree.name
'NODE_0002020'
>>> len(tree.clades)
2
>>> tree.clades[0].name
'NODE_0001489'
>>> hasattr(tree, "attr")
True
>>> "dTiter" in tree.attr
True
>>> tree.clades[0].parent.name
'NODE_0002020'
>>> tree.clades[0].branch_length > 0
True
Test opening a JSON from augur export v2.
>>> json_fh = open("tests/data/zika.json", "r")
>>> json_dict = json.load(json_fh)
>>> tree = json_to_tree(json_dict)
>>> hasattr(tree, "name")
True
>>> len(tree.clades) > 0
True
>>> tree.clades[0].branch_length > 0
True
Branch lengths should be the length of the branch to each node and not the length from the root. The cumulative branch length from the root gets its own attribute.
>>> tip = [tip for tip in tree.find_clades(terminal=True) if tip.name == "USA/2016/FLWB042"][0]
>>> round(tip.cumulative_branch_length, 6)
0.004747
>>> round(tip.branch_length, 6)
0.000186
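The relationship between the two attributes can be illustrated on plain dictionaries: a node's branch length is its cumulative length minus its parent's cumulative length. This is a sketch under assumptions, not the json_to_tree implementation; the function name and the "cumulative" and "children" keys are hypothetical.

```python
def add_branch_lengths(node, parent_cumulative=0.0):
    """Hypothetical sketch: derive per-branch lengths from root distances.

    Assumes each node stores its distance from the root under "cumulative"
    and its children under "children".
    """
    node["branch_length"] = node["cumulative"] - parent_cumulative
    for child in node.get("children", []):
        add_branch_lengths(child, node["cumulative"])
    return node


tree = {
    "name": "root",
    "cumulative": 0.0,
    "children": [
        {
            "name": "internal",
            "cumulative": 0.5,
            "children": [{"name": "tip", "cumulative": 0.75}],
        },
    ],
}
add_branch_lengths(tree)
tip = tree["children"][0]["children"][0]
print(tip["cumulative"], tip["branch_length"])  # 0.75 0.25
```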
- augur.utils.load_features(reference, feature_names=None)
Parse a GFF/GenBank reference file. See the docstrings for _read_gff and _read_genbank for details.
- Parameters:
- Returns:
features – keys: feature names, values: Bio.SeqFeature.SeqFeature
Note that feature names may not be equivalent to GenBank feature keys
- Return type:
- Raises:
AugurError – If the reference file doesn't exist, or is malformed / empty
- augur.utils.load_mask_sites(mask_file)
Load masking sites from either a BED file or a masking file.
- augur.utils.nthreads_value(value)
Argument value validation and casting function for --nthreads.
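Validation-and-casting functions of this kind are usually passed to argparse as the type= callable. The sketch below shows the general pattern only: positive_int is a hypothetical function, and the rule it enforces (an integer of at least 1) is an assumption rather than the documented behavior of nthreads_value.

```python
import argparse


def positive_int(value):
    """Hypothetical argparse type= callable: cast to int and require >= 1."""
    try:
        number = int(value)
    except ValueError:
        raise argparse.ArgumentTypeError(f"{value!r} is not an integer") from None
    if number < 1:
        raise argparse.ArgumentTypeError(f"{value!r} is not a positive integer")
    return number


parser = argparse.ArgumentParser()
parser.add_argument("--nthreads", type=positive_int, default=1)
args = parser.parse_args(["--nthreads", "4"])
print(args.nthreads)  # 4
```

Raising argparse.ArgumentTypeError lets argparse report the problem as a normal usage error instead of a traceback.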
- augur.utils.parse_genes_argument(input)
- augur.utils.read_bed_file(bed_file)
Read a BED file and return a list of excluded sites.
Note: This function assumes the given file is a BED file. On parsing failures, it will attempt to skip the first line and retry, but no other error checking is attempted. Incorrectly formatted files will raise errors.
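The skip-the-first-line retry can be sketched in pure Python. This is not augur's implementation: parse_bed_sites is a hypothetical name, and the sketch assumes standard BED semantics (tab-separated chrom, start, end with a zero-based start and an exclusive end), expanding each interval into individual sites.

```python
import io


def parse_bed_sites(handle):
    """Hypothetical sketch: expand BED intervals into sorted zero-based sites."""
    lines = [line for line in handle.read().splitlines() if line.strip()]

    def parse(rows):
        sites = set()
        for row in rows:
            # Too few columns or non-integer coordinates raise ValueError.
            _chrom, start, end = row.split("\t")[:3]
            sites.update(range(int(start), int(end)))
        return sorted(sites)

    try:
        return parse(lines)
    except ValueError:
        # Assume the first line was a header: skip it and retry once.
        # A second failure propagates to the caller.
        return parse(lines[1:])


bed = io.StringIO("chrom\tstart\tend\nref\t2\t5\nref\t9\t10\n")
print(parse_bed_sites(bed))  # [2, 3, 4, 9]
```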
- augur.utils.read_colors(overrides=None, use_defaults=True)
- augur.utils.read_entries(*files, comment_char='#')
Reads entries (one per line) from one or more plain text files.
Entries can be commented with full-line or inline comments. For example, the following is a valid file:
    # this is a comment at the top of the file
    strain1  # exclude strain1 because it isn't sequenced properly
    strain2
      # this is an empty line that will be ignored.
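Comment handling of this kind can be sketched by cutting each line at the first comment character and dropping whatever is left empty. The function name iter_entries is hypothetical and the sketch reads from an in-memory handle rather than from files; it illustrates the parsing rule, not augur's code.

```python
import io


def iter_entries(lines, comment_char="#"):
    """Hypothetical sketch: yield one entry per line, ignoring comments."""
    for line in lines:
        # Keep only the text before the first comment character.
        entry = line.split(comment_char, 1)[0].strip()
        if entry:  # skip blank and comment-only lines
            yield entry


text = io.StringIO(
    "# this is a comment at the top of the file\n"
    "strain1  # exclude strain1 because it isn't sequenced properly\n"
    "strain2\n"
    "  # this is an empty line that will be ignored.\n"
)
print(list(iter_entries(text)))  # ['strain1', 'strain2']
```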
- augur.utils.read_lat_longs(overrides=None, use_defaults=True)
- augur.utils.read_mask_file(mask_file)
Read a masking file and return a list of excluded sites.
Masking files have a single masking site per line, either alone or as the second column of a tab-separated file. These sites are assumed to be one-indexed, NOT zero-indexed. Incorrectly formatted lines will be skipped.
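The parsing rules above can be sketched as follows. The function name parse_mask_lines is hypothetical, and converting the one-indexed input to zero-indexed output is an assumption made for illustration; check the source for the indexing augur actually returns.

```python
def parse_mask_lines(lines):
    """Hypothetical sketch: collect masking sites from one-indexed input."""
    sites = set()
    for line in lines:
        fields = line.strip().split("\t")
        # The site is the only column, or the second of several.
        raw = fields[1] if len(fields) > 1 else fields[0]
        try:
            sites.add(int(raw) - 1)  # assumption: convert to zero-indexed
        except ValueError:
            continue  # skip lines without an integer site
    return sorted(sites)


print(parse_mask_lines(["3\n", "ref\t10\n", "not-a-site\n", "\n"]))  # [2, 9]
```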
- augur.utils.read_node_data(fnames, tree=None, validation_mode=ValidationMode.ERROR)
- augur.utils.read_tree(fname, min_terminals=3)
Safely load a tree from a given filename or raise an error if the file does not contain a valid tree.
- Parameters:
- Raises:
InvalidTreeError – If the given file exists but does not seem to contain a valid tree format.
- Returns:
BioPython tree instance
- Return type:
- augur.utils.write_json(data, file, indent=2, include_version=True)
Write data as JSON to the given file, creating parent directories if necessary. The augur version is included as a top-level key "augur_version".
- Parameters:
data (dict) – data to write out to JSON
file – file path or handle to write to
indent (int or None, optional) – JSON indentation level. Default is None if the environment variable AUGUR_MINIFY_JSON is truthy, else 1
include_version (bool, optional) – Include the augur version. Default: True.
- Raises:
OSError –
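The two behaviors described for write_json, creating parent directories and adding a top-level version key, can be sketched with the standard library. This is an illustration under assumptions, not augur's implementation: write_json_sketch and its version parameter are hypothetical, the version string is a placeholder, and the sketch accepts only a file path, not a handle.

```python
import json
import os
import tempfile


def write_json_sketch(data, path, indent=2, version=None):
    """Hypothetical sketch: write data as JSON, creating parent directories."""
    parent = os.path.dirname(path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    payload = dict(data)  # shallow copy so the caller's dict is untouched
    if version is not None:
        payload["augur_version"] = version
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(payload, handle, indent=indent)


with tempfile.TemporaryDirectory() as tmp:
    # Neither "results" nor "nested" exists before the call.
    target = os.path.join(tmp, "results", "nested", "out.json")
    write_json_sketch({"nodes": {}}, target, version="0.0.0-example")
    with open(target, encoding="utf-8") as handle:
        written = json.load(handle)

print(sorted(written))  # ['augur_version', 'nodes']
```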