augur.utils module

exception augur.utils.AugurException

Bases: Exception

exception augur.utils.InvalidTreeError

Bases: Exception

Represents an error loading a phylogenetic tree from a filename.

augur.utils.ambiguous_date_to_date_range(mydate, fmt, min_max_year=None)
augur.utils.annotate_parents_for_tree(tree)

Annotate each node in the given tree with its parent.

>>> import io
>>> import Bio.Phylo
>>> tree = Bio.Phylo.read(io.StringIO("(A, (B, C))"), "newick")
>>> not any([hasattr(node, "parent") for node in tree.find_clades()])
True
>>> tree = annotate_parents_for_tree(tree)
>>> tree.root.parent is None
True
>>> all([hasattr(node, "parent") for node in tree.find_clades()])
True
augur.utils.available_cpu_cores(fallback: int = 1) → int

Returns the number of CPU cores available to this process, if determinable. Otherwise returns the number of CPU cores on the computer, if determinable, or else the fallback value (default: 1).
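
The lookup order described above can be sketched with the standard library. This is a minimal illustration, not necessarily how augur implements it; the function name available_cpu_cores_sketch is hypothetical, and the use of os.sched_getaffinity() and os.cpu_count() is an assumption about how each count might be obtained.

```python
import os


def available_cpu_cores_sketch(fallback: int = 1) -> int:
    """Hypothetical sketch of the lookup order: process, then machine, then fallback."""
    try:
        # Cores this process may run on; not available on every platform.
        return len(os.sched_getaffinity(0))
    except (AttributeError, OSError):
        # Machine-wide count; os.cpu_count() returns None when undeterminable.
        return os.cpu_count() or fallback


cores = available_cpu_cores_sketch()
```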

augur.utils.first_line(text)

Returns the first line of the given text, ignoring leading and trailing whitespace.
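
A minimal sketch of this behaviour, assuming that whitespace is stripped from the whole text before it is split into lines and that empty input yields an empty string. The name first_line_sketch is hypothetical.

```python
def first_line_sketch(text: str) -> str:
    """Hypothetical sketch: first line of the text after stripping outer whitespace."""
    lines = text.strip().splitlines()
    # Assumption: empty or whitespace-only input gives an empty string.
    return lines[0] if lines else ""


summary = first_line_sketch("\n\n  Annotate each node.\n\n  More detail follows.\n")
```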

augur.utils.get_augur_version()

Returns a string of the current augur version.

augur.utils.get_json_name(args, default=None)
augur.utils.get_numerical_dates(meta_dict, name_col=None, date_col='date', fmt=None, min_max_year=None)
augur.utils.get_parent_name_by_child_name_for_tree(tree)

Returns a dictionary mapping child node names to parent node names.

augur.utils.is_augur_version_compatable(version)

Checks if the provided version is the same major version as the currently running version of augur.

Parameters

version (str) – version to check against the current version

Returns

Return type

bool
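
The major-version comparison can be illustrated with plain string handling. This sketch assumes dotted version strings such as "6.4.3" and takes the current version as an explicit argument; the function name same_major_version and both parameter names are hypothetical, and augur itself may parse versions differently.

```python
def same_major_version(version: str, current_version: str) -> bool:
    """Hypothetical sketch: True when both versions share a major component."""

    def major(v: str) -> int:
        # Assumption: versions look like "MAJOR.MINOR.PATCH", optionally prefixed with "v".
        return int(v.strip().lstrip("v").split(".")[0])

    return major(version) == major(current_version)


compatible = same_major_version("6.4.3", "6.0.0")
incompatible = same_major_version("5.4.6", "6.0.0")
```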

augur.utils.is_vcf(fname)

Convenience method to check if a file is a vcf file.

>>> is_vcf("./foo")
False
>>> is_vcf("./foo.vcf")
True
>>> is_vcf("./foo.vcf.GZ")
True
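
The doctests above are consistent with a case-insensitive check of the file name extension. The following sketch reproduces them under that assumption; the name is_vcf_sketch is hypothetical and the set of accepted suffixes is inferred from the examples only.

```python
def is_vcf_sketch(fname: str) -> bool:
    """Hypothetical sketch: case-insensitive match on .vcf and .vcf.gz suffixes."""
    # Assumption: only the file name is inspected, never the file contents.
    return bool(fname) and fname.lower().endswith((".vcf", ".vcf.gz"))


results = [is_vcf_sketch(name) for name in ("./foo", "./foo.vcf", "./foo.vcf.GZ")]
```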
augur.utils.json_to_tree(json_dict, root=True)

Returns a Bio.Phylo tree corresponding to the given JSON dictionary exported by tree_to_json.

Assigns links back to parent nodes for the root of the tree.

Test opening a JSON from augur export v1.

>>> import json
>>> json_fh = open("tests/data/json_tree_to_nexus/flu_h3n2_ha_3y_tree.json", "r")
>>> json_dict = json.load(json_fh)
>>> tree = json_to_tree(json_dict)
>>> tree.name
'NODE_0002020'
>>> len(tree.clades)
2
>>> tree.clades[0].name
'NODE_0001489'
>>> hasattr(tree, "attr")
True
>>> "dTiter" in tree.attr
True
>>> tree.clades[0].parent.name
'NODE_0002020'
>>> tree.clades[0].branch_length > 0
True

Test opening a JSON from augur export v2.

>>> json_fh = open("tests/data/zika.json", "r")
>>> json_dict = json.load(json_fh)
>>> tree = json_to_tree(json_dict)
>>> hasattr(tree, "name")
True
>>> len(tree.clades) > 0
True
>>> tree.clades[0].branch_length > 0
True
augur.utils.load_features(reference, feature_names=None)
augur.utils.myopen(fname, mode)
augur.utils.nthreads_value(value)

Argument value validation and casting function for --nthreads.
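
A validation and casting function of this kind is typically passed to argparse as the type of an option. The sketch below shows the general pattern only: the name nthreads_value_sketch is hypothetical, and the rule that the value must be a positive integer is an assumption, not a description of what augur accepts.

```python
import argparse


def nthreads_value_sketch(value: str) -> int:
    """Hypothetical sketch: cast a command-line string to a positive int."""
    try:
        n = int(value)
    except ValueError:
        raise argparse.ArgumentTypeError(f"{value!r} is not an integer") from None
    if n < 1:
        # Assumption: zero or negative thread counts are rejected.
        raise argparse.ArgumentTypeError(f"{value!r} is not a positive integer")
    return n


parser = argparse.ArgumentParser()
parser.add_argument("--nthreads", type=nthreads_value_sketch, default=1)
args = parser.parse_args(["--nthreads", "4"])
```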

augur.utils.open_file(fname, mode)

Opens a file using either gzip.open() or open(), depending on the file name. Semantics are identical to open().
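
Choosing the opener from the file name can be sketched as follows. The assumption that a ".gz" suffix selects gzip.open() is not stated above, and the name open_by_extension is hypothetical. Explicit text modes ("rt", "wt") are used because gzip.open() treats a bare "r" as binary.

```python
import gzip
import os
import tempfile


def open_by_extension(fname: str, mode: str = "rt"):
    """Hypothetical sketch: pick gzip.open() or open() from the file name."""
    # Assumption: a ".gz" suffix means the file is gzip-compressed.
    opener = gzip.open if fname.endswith(".gz") else open
    return opener(fname, mode)


# Round trip through a compressed file in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt.gz")
    with open_by_extension(path, "wt") as fh:
        fh.write("hello\n")
    with open_by_extension(path, "rt") as fh:
        content = fh.read()
```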

augur.utils.print_error(message, **kwargs)

Formats message with kwargs using str.format() and textwrap.dedent() and uses it to print an error message to sys.stderr.
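
The combination of str.format() and textwrap.dedent() lets callers write indented multi-line messages inline. A minimal sketch, assuming the message is formatted first and dedented second; the name print_error_sketch is hypothetical and any prefix or spacing that augur adds to the output is not reproduced here.

```python
import sys
from textwrap import dedent


def print_error_sketch(message: str, **kwargs) -> None:
    """Hypothetical sketch: format, dedent, and print a message to stderr."""
    print(dedent(message.format(**kwargs)), file=sys.stderr)


print_error_sketch(
    """
    Could not read {fname}.
    Check that the file exists.
    """,
    fname="tree.nwk",
)
```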

augur.utils.read_colors(overrides=None, use_defaults=True)
augur.utils.read_config(fname)
augur.utils.read_lat_longs(overrides=None, use_defaults=True)
augur.utils.read_metadata(fname)
augur.utils.read_node_data(fnames, tree=None)

Parses one or more “node-data” JSON files and combines them using custom logic. Exits with a (hopefully) helpful message if errors are detected.

For each JSON, we expect the top-level key “nodes” to be a dict. Generated-by fields will not be included in the returned dict of this function.
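
The expected file shape can be illustrated with a simplified merge. This is a sketch under stated assumptions, not augur's custom logic: it assumes attributes for the same node are merged with later files taking precedence, that the generated-by field uses the key "generated_by", and it raises ValueError rather than exiting. The name combine_node_data_sketch is hypothetical.

```python
import json
import os
import tempfile


def combine_node_data_sketch(fnames):
    """Hypothetical sketch: merge the "nodes" dicts of several JSON files."""
    combined = {"nodes": {}}
    for fname in fnames:
        with open(fname, encoding="utf-8") as fh:
            data = json.load(fh)
        if not isinstance(data.get("nodes"), dict):
            raise ValueError(f"{fname}: top-level key 'nodes' must be a dict")
        for name, attrs in data["nodes"].items():
            # Assumption: later files override earlier ones per attribute.
            combined["nodes"].setdefault(name, {}).update(attrs)
        for key, value in data.items():
            # Assumption: the generated-by field is stored under "generated_by".
            if key not in ("nodes", "generated_by"):
                combined[key] = value
    return combined


with tempfile.TemporaryDirectory() as tmp:
    first = os.path.join(tmp, "branch_lengths.json")
    second = os.path.join(tmp, "traits.json")
    with open(first, "w", encoding="utf-8") as fh:
        json.dump({"generated_by": {"program": "augur"},
                   "nodes": {"A": {"branch_length": 0.1}}}, fh)
    with open(second, "w", encoding="utf-8") as fh:
        json.dump({"nodes": {"A": {"region": "europe"}, "B": {"region": "asia"}}}, fh)
    node_data = combine_node_data_sketch([first, second])
```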

augur.utils.read_tree(fname, min_terminals=3)

Safely load a tree from a given filename or raise an error if the file does not contain a valid tree.

Parameters
  • fname (str) – name of a file containing a phylogenetic tree

  • min_terminals (int) – minimum number of terminals required for the parsed tree as a sanity check on the tree

Raises

InvalidTreeError – If the given file exists but does not seem to contain a valid tree format.

Returns

BioPython tree instance

Return type

Bio.Phylo

augur.utils.run_shell_command(cmd, raise_errors=False, extra_env=None)

Run the given command string via Bash with error checking.

Returns True if the command exits normally. Returns False if the command exits with failure and “raise_errors” is False (the default). When “raise_errors” is True, exceptions are rethrown.

If an extra_env mapping is passed, the provided keys and values are overlaid onto the default subprocess environment.
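
The return-value and environment behaviour described above can be sketched with subprocess. This is an illustration only: the name run_shell_command_sketch is hypothetical, it assumes a bash executable is on the PATH, and it omits any error reporting that augur performs.

```python
import os
import subprocess


def run_shell_command_sketch(cmd, raise_errors=False, extra_env=None):
    """Hypothetical sketch: run cmd via Bash and report success as a bool."""
    env = os.environ.copy()
    if extra_env:
        # Overlay the extra variables onto the inherited environment.
        env.update(extra_env)
    try:
        subprocess.run(
            ["bash", "-c", cmd],
            env=env,
            check=True,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
        )
    except subprocess.CalledProcessError:
        if raise_errors:
            raise
        return False
    return True


succeeded = run_shell_command_sketch("true")
failed = run_shell_command_sketch("exit 3")
saw_env = run_shell_command_sketch('test "$GREETING" = hello', extra_env={"GREETING": "hello"})
```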

augur.utils.write_VCF_translation(prot_dict, vcf_file_name, ref_file_name)

Writes out a VCF-style file (which seems to be minimally handleable by vcftools and pyvcf) of the amino acid (AA) differences between sequences and the reference. The format is similar to the one created/used by read_in_vcf, except that there is one of these dicts (with sequences, reference, positions) for EACH gene.

Also writes out a fasta of the reference alignment.

EBH 12 Dec 2017

augur.utils.write_json(data, file_name, indent=2, include_version=True)

Writes data as JSON to the given file_name, creating parent directories if necessary. When include_version is True, the augur version is included as a top-level key “augur_version”.

Parameters
  • data (dict) – data to write out to JSON

  • file_name (str) – file name to write to

  • indent (int or None, optional) – JSON indentation level. Default is None if the environment variable AUGUR_MINIFY_JSON is truthy, else 2

  • include_version (bool, optional) – Include the augur version. Default: True.

Raises

OSError
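
The behaviour described for write_json can be sketched with the standard library. This is a minimal illustration, not augur's implementation: the name write_json_sketch is hypothetical, and the version string is passed in as a hypothetical version parameter rather than looked up from the installed package.

```python
import json
import os
import tempfile


def write_json_sketch(data, file_name, indent=2, include_version=True, version="0.0.0"):
    """Hypothetical sketch: write data as JSON, creating parent directories."""
    parent = os.path.dirname(file_name)
    if parent:
        os.makedirs(parent, exist_ok=True)
    payload = dict(data)
    if include_version:
        # "version" is a placeholder; augur would report its own version here.
        payload["augur_version"] = version
    with open(file_name, "w", encoding="utf-8") as fh:
        json.dump(payload, fh, indent=indent)


with tempfile.TemporaryDirectory() as tmp:
    out_path = os.path.join(tmp, "results", "nested", "node_data.json")
    write_json_sketch({"nodes": {}}, out_path)
    with open(out_path, encoding="utf-8") as fh:
        written = json.load(fh)
```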