Reference tree
Nextclade Web (advanced mode): accepted in the "Reference tree" drag & drop box. A remote URL is also accepted in the input-tree URL parameter.
Nextclade CLI argument: --input-tree/-a
Accepted formats: Auspice JSON v2 (description, schema). This is the same format that is used in Nextstrain: it is produced by augur export and consumed by Nextstrain Auspice. Refer to the Nextstrain documentation at https://docs.nextstrain.org, and in particular to the augur documentation, on how to build your own trees. Using augur to make the reference tree is not a strict requirement; however, the resulting tree must follow the Auspice JSON v2 schema.
The phylogenetic reference tree serves as a target for phylogenetic placement (see Algorithm: Phylogenetic placement). Nearest neighbor information is used to assign clades (see Algorithm: Clade Assignment) and to identify private mutations, including reversions.
The tree must be rooted at the sample that matches the reference sequence. If you do not want to root the tree on the reference, a workaround is to attach the mutational differences between the tree root and the reference to the branch leading to the root node. This can be accomplished by passing the reference sequence to the --root-sequence argument of augur ancestral (see the augur ancestral docs).
The tree must contain a clade definition for every node (including internal nodes): every node must have a value at node_attrs.clade_membership (although it can be an empty string).
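The clade requirement above can be checked before handing a tree to Nextclade. The following is a minimal sketch, not part of Nextclade itself: it assumes the usual Auspice JSON v2 layout (a top-level "tree" object whose nodes carry "name", "node_attrs" and an optional "children" list), and the helper names iter_nodes and nodes_missing_clade, as well as the toy tree, are hypothetical.

```python
from typing import Iterator


def iter_nodes(node: dict) -> Iterator[dict]:
    """Yield a node and all of its descendants (depth-first, pre-order)."""
    stack = [node]
    while stack:
        current = stack.pop()
        yield current
        # Reverse so that children are visited in their original order.
        stack.extend(reversed(current.get("children", [])))


def nodes_missing_clade(auspice: dict) -> list[str]:
    """Return names of nodes that lack node_attrs.clade_membership.value."""
    missing = []
    for node in iter_nodes(auspice["tree"]):
        clade = node.get("node_attrs", {}).get("clade_membership", {})
        if "value" not in clade:  # an empty string is allowed, absence is not
            missing.append(node.get("name", "<unnamed>"))
    return missing


# Toy tree for illustration only (hypothetical node names and clades).
example = {
    "version": "v2",
    "meta": {},
    "tree": {
        "name": "root",
        "node_attrs": {"clade_membership": {"value": ""}},
        "children": [
            {"name": "A", "node_attrs": {"clade_membership": {"value": "A1"}}},
            {"name": "B", "node_attrs": {}},
        ],
    },
}

print(nodes_missing_clade(example))  # ['B']
```

To check a real file, load it with json.load from the standard library and pass the resulting dictionary to nodes_missing_clade. An empty result means every node, internal or terminal, carries a clade value.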
The tree should be sufficiently large and diverse to meet clade assignment expectations of a particular use-case, study or experiment. Only clades present on the reference tree can be assigned to query sequences.
💡 Nextclade CLI supports file compression and reading from standard input. See section Compression, stdin for more details.
Extensions
Auspice JSON trees prepared for usage in Nextclade can contain a set of extensions to the canonical Auspice JSON format. These extensions contain additional information that is used only in Nextclade and allows for more features during the analysis.
Clade-like attributes
For organisms with multiple concurrent nomenclatures (clades, lineages, variants etc.), in addition to clades (see Algorithm: Clade Assignment), dataset authors can choose to add extra clade-like attributes.
The clade-like attributes behave like built-in clades (.node_attrs.clade_membership in every node) and are copied from the nearest node along with the clade.
Each declared attribute will result in a new column in the results table in Nextclade Web and in TSV/CSV output files, as well as a set of corresponding fields in the output JSON/NDJSON and output tree (the newly placed nodes).
Additionally, each of the attributes, unless excluded, participates in the founder node search. For each attribute, Nextclade Web will display in the "Relative to" dropdown an additional entry named "'<attribute.displayName>' founder", and a set of columns/fields founderMuts will be added to the outputs.
As a dataset author, in order to add clade-like attributes to your reference tree, modify the reference tree file as follows:
Add a field .meta.extensions.nextclade.clade_node_attrs of array type and declare the clade-like attributes you want to add. Example (for the latest examples see nextstrain/nextclade_data):
{
  "meta": {
    "extensions": {
      "nextclade": {
        "clade_node_attrs": [
          {
            "name": "other-clade",
            "displayName": "Other clade",
            "description": "This long text goes into the tooltip. Explain what the clades are, who and where defined them.",
            "hideInWeb": false,
            "skipAsReference": true
          },
          {
            "name": "my-lineage",
            "displayName": "My lineage",
            "description": "This long text goes into the tooltip. Explain what the lineages are, who and where defined them.",
            "hideInWeb": false,
            "skipAsReference": true
          }
        ]
      }
    }
  }
}
Fields:
name - (required) machine-readable identifier of the attribute. Should match the attribute on the tree nodes. Will be used to name fields/columns in JSON and TSV output files.
displayName - (optional) human-friendly name of the attribute. Will be shown in Nextclade Web.
description - (optional) human-friendly description of the attribute. Will be shown in Nextclade Web.
hideInWeb - (optional) set this to true to hide the attribute's column from Nextclade Web.
skipAsReference - (optional) set this to true to not use the attribute for calculating clade founder nodes and relative mutations.
For each node in the tree, add a node attribute with the same name as the name field in the attribute's description and with the value corresponding to the value of the clade, lineage etc. of this node:
{
  "node_attrs": {
    "clade_membership": {"value": "A1"},
    "other-clade": {"value": "Lambda"},
    "my-lineage": {"value": "A.1.2.3.4"}
  }
}
Note that the clade_membership attribute is treated separately (if present) and does not need to be declared in clade_node_attrs.
Now when running Nextclade with this tree, you will notice additional columns in the outputs. Each entry in a column for a clade-like attribute corresponds to a clade value assigned to the query sequence.
For concrete examples of using clade-like attributes, check out official SARS-CoV-2 datasets: they assign Nextstrain clades, Pango lineages and WHO VOC/VOIs simultaneously.
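The two authoring steps described in this section (declaring the attribute under meta.extensions.nextclade.clade_node_attrs, then tagging every node) can be sketched in Python. This is an illustration only, not an official tool: the helper names declare_clade_attr and nodes_missing_attr and the toy tree are hypothetical, and the code assumes the standard Auspice JSON v2 node layout with an optional "children" list.

```python
def iter_nodes(root: dict):
    """Yield every node of the tree, parents before children."""
    stack = [root]
    while stack:
        node = stack.pop()
        yield node
        stack.extend(reversed(node.get("children", [])))


def declare_clade_attr(auspice: dict, name: str, display_name: str,
                       description: str = "") -> None:
    """Add an entry to meta.extensions.nextclade.clade_node_attrs (once)."""
    nextclade_ext = (
        auspice.setdefault("meta", {})
        .setdefault("extensions", {})
        .setdefault("nextclade", {})
    )
    declared = nextclade_ext.setdefault("clade_node_attrs", [])
    if not any(attr["name"] == name for attr in declared):
        declared.append(
            {"name": name, "displayName": display_name, "description": description}
        )


def nodes_missing_attr(auspice: dict) -> dict[str, list[str]]:
    """Map each declared attribute to the nodes that have no value for it."""
    declared = (
        auspice.get("meta", {})
        .get("extensions", {})
        .get("nextclade", {})
        .get("clade_node_attrs", [])
    )
    report = {}
    for attr in declared:
        name = attr["name"]
        report[name] = [
            node.get("name", "<unnamed>")
            for node in iter_nodes(auspice["tree"])
            if "value" not in node.get("node_attrs", {}).get(name, {})
        ]
    return report


# Toy tree for illustration only (hypothetical names and values).
toy = {
    "meta": {},
    "tree": {
        "name": "root",
        "node_attrs": {
            "clade_membership": {"value": "A1"},
            "my-lineage": {"value": "A.1"},
        },
        "children": [
            {"name": "tip1", "node_attrs": {"clade_membership": {"value": "A1"}}},
        ],
    },
}

declare_clade_attr(toy, "my-lineage", "My lineage")
print(nodes_missing_attr(toy))  # {'my-lineage': ['tip1']}
```

A non-empty list for an attribute points at nodes that still need a value before the tree is used as a dataset.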
Relative mutations
Add an object under .meta.extensions.nextclade.ref_nodes:
{
  "ref_nodes": {
    "default": "__root__",
    "search": [
      {
        "name": "JN.1",
        "displayName": "JN.1 (24A)",
        "description": "Variant recommended for the 2024/2025 COVID-19 vaccine",
        "criteria": [
          {
            "qry": [
              {
                "clade": ["23I", "24A", "24B", "24C", "recombinant"]
              }
            ],
            "node": [
              {
                "name": ["JN.1"]
              }
            ]
          }
        ]
      }
    ]
  }
}
Properties:
default: string, optional. Sets the default search to display in the Nextclade Web dropdown. Should correspond to one of the search[].name fields or to one of the special values: __root__ for the reference sequence (default), __parent__ for the nearest node (private mutations), __clade_founder__ for the founder of the clade.
search: array of objects, optional. Each object describes one search. Each search corresponds to an entry in the "Relative to" dropdown in the web app and to a set of CSV/TSV columns relativeMutations['searchName']. Note that these names no longer need to correspond to node names.
search[].name: required. Unique identifier of the search entry.
search[].displayName, search[].description: optional. Friendly name and description to be displayed in the UI (dropdown).
search[].criteria: array of objects, optional. One or multiple search criteria. Criteria should be written so that, during a search, only one criterion matches a given pair of query and node. If there are multiple matches, then one (unspecified) match is taken and a warning is emitted.
search[].criteria[].qry: object, optional. Describes properties of the query samples to select for this search.
search[].criteria[].qry.clade: array of strings, optional. Query clades to consider for this search. At least one match is necessary for a sample to match.
search[].criteria[].qry.cladeNodeAttrs: optional. Mapping from the name of a clade-like attribute to a list of searched values for this attribute. At least one match is necessary for a sample to match.
search[].criteria[].node: object, optional. Describes properties of the reference node to search for, as well as the search algorithm. All of the properties should match.
search[].criteria[].node.name: array of strings, optional. Searched node names. At least one match is necessary for a node to match.
search[].criteria[].node.clade: array of strings, optional. Searched node clades. At least one match is necessary for a node to match.
search[].criteria[].node.cladeNodeAttrs: optional. Mapping from the name of a clade-like attribute to a list of searched values for this attribute. At least one match is necessary for a node to match.
search[].criteria[].node.searchAlgo: string, optional. Search algorithm to use:
  full (default): simple loop over all nodes until the first match is found.
  ancestor-earliest: start with the current sample and traverse the graph against edge directions, looking for matching nodes, until the root node is reached. The result is the last encountered matching node.
  ancestor-latest: start with the current sample and traverse the graph against edge directions, looking for matching nodes. The first match is the result.
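The difference between the three searchAlgo values can be illustrated on a toy tree. The sketch below is an interpretation of the descriptions above, not Nextclade's actual implementation: the node layout (dictionaries with "name", "clade" and "parent"), the helper names, and the choice to start the walk at the node the sample is attached to are all assumptions made for illustration.

```python
from typing import Iterator, Optional


def make_node(name: str, clade: str, parent: Optional[dict] = None) -> dict:
    """Build a toy node with a pointer to its parent (None for the root)."""
    return {"name": name, "clade": clade, "parent": parent}


def node_matches(node: dict, criteria: dict) -> bool:
    """All given properties must match; within a list, one hit is enough."""
    for key in ("name", "clade"):
        wanted = criteria.get(key)
        if wanted is not None and node[key] not in wanted:
            return False
    return True


def walk_to_root(start: dict) -> Iterator[dict]:
    """Traverse against edge directions: start, its parent, ..., root."""
    current = start
    while current is not None:
        yield current
        current = current["parent"]


def search_full(nodes: list, criteria: dict) -> Optional[dict]:
    """'full': loop over all nodes until the first match is found."""
    return next((n for n in nodes if node_matches(n, criteria)), None)


def search_ancestor_latest(start: dict, criteria: dict) -> Optional[dict]:
    """'ancestor-latest': the first match on the way to the root."""
    return next((n for n in walk_to_root(start) if node_matches(n, criteria)), None)


def search_ancestor_earliest(start: dict, criteria: dict) -> Optional[dict]:
    """'ancestor-earliest': the last match seen before reaching the root."""
    result = None
    for node in walk_to_root(start):
        if node_matches(node, criteria):
            result = node
    return result


# Toy lineage (hypothetical): root -> n1 -> JN.1 -> n3
root = make_node("root", "23I")
n1 = make_node("n1", "24A", root)
jn1 = make_node("JN.1", "24A", n1)
n3 = make_node("n3", "24B", jn1)

criteria = {"clade": ["24A"]}
print(search_ancestor_latest(n3, criteria)["name"])    # JN.1
print(search_ancestor_earliest(n3, criteria)["name"])  # n1
print(search_full([root, n1, jn1, n3], {"name": ["JN.1"]})["name"])  # JN.1
```

With two consecutive ancestors in clade 24A, ancestor-latest stops at the match closest to the sample, while ancestor-earliest keeps walking and reports the match closest to the root.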