6. Clade assignment

To simplify discussion of co-circulating virus variants, Nextstrain groups them into clades, which are defined by specific combinations of signature mutations. Clades are groups of related sequences that share a common ancestor. For SARS-CoV-2, we try to align these clades as much as possible with WHO variant designations.

The analysis pipeline of Nextstrain.org requires setting up and running a heavy computational job to assign clades. Nextclade takes a lightweight approach instead: it assigns your sequences to clades by placing them on a phylogenetic tree annotated with clade definitions. More specifically, Nextclade assigns each sequence the clade of the nearest reference node found during the phylogenetic placement step. This is a trade-off between accuracy and runtime performance: Nextclade provides almost instantaneous results, but is expected to be slightly less accurate than the full pipeline. For more details, see the "Phylogenetic placement: Known limitations" section.
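The nearest-node idea can be illustrated with a minimal Python sketch. This is not Nextclade's implementation: it assumes each reference node is reduced to a set of mutations relative to the reference sequence, and it measures distance as the number of mutations present in one set but not the other. The names `RefNode`, `assign_clade` and `TOY_NODES`, and the mutation sets themselves, are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RefNode:
    """A reference tree node (hypothetical, simplified representation)."""
    name: str
    clade: str
    mutations: frozenset  # mutations relative to the reference sequence


def assign_clade(query_mutations, nodes):
    """Return the clade of the node whose mutation set is closest to the query.

    Distance is the size of the symmetric difference between the two
    mutation sets. Ties go to the node that appears first in `nodes`.
    """
    query = frozenset(query_mutations)
    nearest = min(nodes, key=lambda node: len(query ^ node.mutations))
    return nearest.clade, nearest.name


# Toy data for illustration only, not actual clade definitions.
TOY_NODES = [
    RefNode("root", "19A", frozenset()),
    RefNode("n1", "19B", frozenset({"C8782T", "T28144C"})),
    RefNode("n2", "20A", frozenset({"C241T", "C3037T", "C14408T", "A23403G"})),
]

query = {"C241T", "C3037T", "C14408T", "A23403G", "G25563T"}
clade, node_name = assign_clade(query, TOY_NODES)
print(clade, node_name)
```

In this toy setup the query differs from `n2` by a single mutation, so it inherits the clade of `n2`. Note that a query can only ever receive a clade carried by one of the supplied nodes.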

⚠️ Nextclade only considers clades that are present in the input reference tree. Each analysed sequence is assigned one of these clades, and no others. It is important to make sure that every clade you expect to find in the results is well represented in the tree.

If unsure, use one of the trees from the default Nextclade datasets or any other well-known, up-to-date, sufficiently large and diverse tree.

💡 For focused regional studies, we recommend using a tree that includes the clades specific to your region.
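One way to check how well a tree represents the clades you expect is to count tree nodes per clade. The sketch below assumes the reference tree is in Auspice JSON v2 format, with the clade stored under `node_attrs.clade_membership.value` on each node; verify this against your own tree file. The function names and the toy tree are hypothetical.

```python
from collections import Counter


def count_clades(root):
    """Count nodes per clade, walking the tree iteratively."""
    counts = Counter()
    stack = [root]
    while stack:
        node = stack.pop()
        # Assumed location of the clade label in an Auspice v2 tree node.
        clade = node.get("node_attrs", {}).get("clade_membership", {}).get("value")
        if clade is not None:
            counts[clade] += 1
        stack.extend(node.get("children", []))
    return counts


def underrepresented_clades(tree_json, expected, min_nodes=1):
    """Return expected clades that have fewer than `min_nodes` nodes in the tree."""
    counts = count_clades(tree_json["tree"])
    return sorted(clade for clade in expected if counts[clade] < min_nodes)


def _node(name, clade, children=()):
    """Build a toy tree node (illustration only)."""
    return {
        "name": name,
        "node_attrs": {"clade_membership": {"value": clade}},
        "children": list(children),
    }


TOY_TREE = {
    "tree": _node("root", "19A", [
        _node("n1", "20A", [_node("n2", "20A"), _node("n3", "20B")]),
    ])
}

print(underrepresented_clades(TOY_TREE, ["19A", "20A", "20I"]))
```

For a real tree, you would load the file with `json.load` and pass the result in place of `TOY_TREE`. What counts as "well represented" depends on your study, so `min_nodes` is left as a parameter.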

Nextstrain clades for SARS-CoV-2

By the end of 2020, Nextstrain had defined 11 major clades (see this blog post for details):

  • 19A and 19B emerged in Wuhan and dominated the early outbreak

  • 20A emerged from 19A, dominated the European outbreak in March 2020, and has since spread globally

  • 20B and 20C are large, genetically distinct subclades of 20A that emerged in early 2020

  • 20D to 20I emerged over the summer of 2020 and include two “variants of concern” (VOC) with the signature mutation S:N501Y.

You can find the exact, up-to-date clade definitions at github.com/nextstrain/ncov.


Clades are reported in the “Clade” column of the results table in Nextclade Web, as well as in the JSON, CSV and TSV analysis result files generated by Nextclade CLI or downloaded through the “Download” dialog of Nextclade Web.
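As a minimal sketch of working with these results, the Python snippet below reads a TSV results file and tallies the assigned clades. It assumes the file has `seqName` and `clade` columns; check the header of your own output, since column names may differ between Nextclade versions. The helper name `clades_by_sequence` and the in-memory example data are hypothetical.

```python
import csv
import io
from collections import Counter


def clades_by_sequence(tsv_file):
    """Map each sequence name to its assigned clade.

    `tsv_file` is any open text file object with a tab-separated header row
    containing the (assumed) columns `seqName` and `clade`.
    """
    reader = csv.DictReader(tsv_file, delimiter="\t")
    return {row["seqName"]: row["clade"] for row in reader}


# In-memory stand-in for a results file, for illustration only.
example = io.StringIO(
    "seqName\tclade\n"
    "sample1\t20A\n"
    "sample2\t20B\n"
    "sample3\t20A\n"
)

assignments = clades_by_sequence(example)
clade_counts = Counter(assignments.values())
print(clade_counts.most_common())  # [('20A', 2), ('20B', 1)]
```

For a file on disk, open it with `open(path, newline="")` and pass the file object to `clades_by_sequence`.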