Reference treeļƒ

Nextclade Web (advanced mode): accepted in ā€œReference treeā€ drag & drop box. A remote URL is also accepted in input-tree URL parameter.

Nextclade CLI argument: --input-tree/-a

Accepted formats: Auspice JSON v2 (description, schema) - this is the same format that is used in Nextstrain. It is produced by augur export and consumed by Nextstrain Auspice. Refer to Nextstrain documentation at https://docs.nextstrain.org and in particular the augur documentation on how to build your own trees. Using augur to make the reference tree is not a strict requirement, however the output tree must follow the Auspice JSON v2 schema.

The phylogenetic reference tree which serves as a target for phylogenetic placement (see Algorithm: Phylogenetic placement). Nearest neighbor information is used to assign clades (see Algorithm: Clade Assignment) and to identify private mutations, including reversions.

The tree must be rooted at the sample that matches the reference sequence. A workaround in case one does not want to root the tree to be rooted on the reference is to attach the mutational differences between the tree root and the reference on the branch leading to the root node. This can be accomplished by passing the reference sequence to augur ancestralā€™s --root-sequence argument (see the augur ancestral docs).

The tree must contain a clade definition for every node (including internal): every node must have a value at node_attrs.clade_membership (although it can be an empty string).

The tree should be sufficiently large and diverse to meet clade assignment expectations of a particular use-case, study or experiment. Only clades present on the reference tree can be assigned to query sequences.

šŸ’” Nextclade CLI supports file compression and reading from standard input. See section Compression, stdin for more details.

Extensionsļƒ

Auspice JSON trees prepared for usage in Nextclade can contain a set of extensions to the canonical Auspice JSON format. These extensions contain additional information that is used only in Nextclade and allows for more features during the analysis.

Clade-like attributesļƒ

For organisms with multiple concurrent nomenclatures (clades, lineages, variants etc.), in addition to clades (see Algorithm: Clade Assignment), dataset authors can choose to add extra clade-like attributes.

The clade-like attributes behave like built-in clades (.node_attrs.clade_membership in every node) and are copied from the nearest node along with it.

Each declared attribute will result in a new column in the results table in Nextclade Web and in TSV/CSV output files, as well as a set of corresponding fields in the output JSON/NDJSON and output tree (the newly placed nodes).

Additionally, each of the attributes, unless excluded, participates in founder node search. For each attribute, Nextclade Web will display in the ā€œRelative toā€ dropdown an additional entry named ā€œā€™<attribute.displayName>ā€™ founderā€, and a set of columns/fields founderMuts will be added to the outputs.

As a dataset author, in order to add clade-like attributes to your reference tree, modify the reference tree file as follows:

  1. Add field .meta.extensions.nextclade.clade_node_attrs of array type, and declare the clade-like attributes you want to add.

    Example (for latest examples see nextstrain/nextclade_data):

    {
      "meta": {
        "extensions": {
          "nextclade": {
            "clade_node_attrs": [
              {
                "name": "other-clade",
                "displayName": "Other clade",
                "description": "This long text goes into the tooltip. Explain what the clades are, who and where defined them.",
                "hideInWeb": false,
                "skipAsReference": true
              },
              {
                "name": "my-lineage",
                "displayName": "My lineage",
                "description": "This long text goes into the tooltip. Explain what the lineages are, who and where defined them.",
                "hideInWeb": false,
                "skipAsReference": true
              }
            ]
          }
        }
      }
    }
    

    Fields:

    • name - (required) machine-readable identifier of the attribute. Should match the attribute on the tree nodes. Will be used to name fields/columns in JSON and TSV output files.

    • displayName - (optional) human-friendly name of the attribute. Will be shown in Nextclade Web.

    • description - (optional) human-friendly description of the attribute. Will be shown in Nextclade Web.

    • hideInWeb - (optional) set this to true to hide attributeā€™s column from Nextclade Web

    • skipAsReference - (optional) - set this to true to no use the attribute for calculating clade founder nodes and relative mutations.

  2. For each node in the tree, add node attribute with the same name as the name field in the attributeā€™s description and with the value corresponding to the value of the clade, lineage etc. of this node:

    {
      "node_attrs": {
        "clade_membership": {"value": "A1"},
        "other-clade": {"value": "Lambda"},
        "my-lineage": {"value": "A.1.2.3.4"}
      }
    }
    

    Note that clade_membership attribute is treated separately (if present) and it does not need to be declared in clade_node_attrs.

  3. Now when running Nextclade with this tree, you will notice additional columns in the outputs. Each entry in a column for a clade-like attribute corresponds to a clade value assigned to the query sequence.

    For concrete examples of using clade-like attributes, check out official SARS-CoV-2 datasets: they assign Nextstrain clades, Pango lineages and WHO VOC/VOIs simultaneously.

Relative mutationsļƒ

Add object under .meta.extensions.nextclade.ref_nodes:

{
  "ref_nodes": {
    "default": "__root__",
    "search": [
      {
        "name": "JN.1",
        "displayName": "JN.1 (24A)",
        "description": "Variant recommended for the 2024/2025 COVID-19 vaccine",
        "criteria": [
          {
            "qry": [
              {
                "clade": ["23I", "24A", "24B", "24C", "recombinant"]
              }
            ],
            "node": [
              {
                "name": ["JN.1"]
              }
            ]
          }
        ]
      }
    ]
  }
}

Properties:

  • default: string, optional. Set default search to display in the Nextclade Web dropdown. Should correspond to one of the search[].name fields or one of the special values __root__ for reference sequence (default), __parent__ for nearest node (private mutations), __clade_founder__ for founder of the clade.

  • search: array of objects, optional. Each object describes one search. Each search corresponds to an entry in the ā€œRelative toā€ dropdown in the web app and a set of CSV/TSV columns relativeMutations['searchName']. Note that these names no longer need to correspond to node names.

    • search[].name: required unique identifier of the search entry

    • search[].displayName, search.description: optional friendly name and description to be displayed in the UI (dropdown)

    • search[].criteria: array of objects, optional. One or multiple search criteria. Criteria should be described such that during search run only one criterion matches a pair of query and node. If there are multiple matches, then one (unspecified) match is taken and a warning is emitted.

      • search[].criteria[].qry: object, optional, describing properties of query samples to select for this search

        • search[].criteria[].qry.clade: array of strings, optional. Query names to consider for this search. At least one match is necessary for sample to match.

        • search[].criteria[].qry.cladeNodeAttrs: optional mapping from name of the clade-like attr to a list of searched values for this attr. At least one match is necessary for sample to match.

      • search[].criteria[].node: object, optional, describing properties of ref node to search, as well as search algorithm. All of the properties should match.

        • search[].criteria[].node.name: array of strings, optional. Searched node names. At least one match.

        • search[].criteria[].node.clade: array of strings, optional. Searched node clades. At least one match is necessary for node to match.

        • search[].criteria[].node.cladeNodeAttrs: optional mapping from name of the clade-like attr to a list of searched values for this attr. At least one match is necessary for node to match.

        • search[].criteria[].node.searchAlgo: string, optional. Search algorithm to use

          • full (default): simple loop over all nodes until first match is found

          • ancestor-earliest: start with the current sample and traverse the graph against edge directions, looking for matching nodes, until it reaches root node. The result is the last encountered matching node.

          • ancestor-latest: start with the current sample and traverse the graph against edge directions, looking for matching nodes. The first match is the result.