Analysis results (tabular)

Nextclade Web: download nextclade.tsv or nextclade.csv

Nextclade CLI flags: --output-tsv/-t, --output-csv/-c

The results of mutation calling, clade assignment, quality control and PCR primer changes can be obtained in either tabular (TSV, CSV) or JSON (classic JSON or NDJSON) formats.

This section describes tabular output.

TSV and CSV files are equivalent and differ only in the column delimiter (tabs vs. semicolons). The tabular format is reasonably human-friendly and convenient for quick inspection (e.g. in Excel or other spreadsheet software) and for simple automated processing.

⚠️ Note that in CSV and TSV outputs all positions are 1-based and all ranges are closed (they include both the left and the right boundary).

⚠️ Note that all positions are in reference coordinates, that is, coordinates after all insertions relative to the reference have been stripped from the alignment.
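The two notes above matter when range cells are fed into code that uses 0-based, half-open intervals (as Python slices do). The following is a minimal sketch, assuming range cells look like 201,28881-28882 (a single position or begin-end, separated by commas). The helper names parse_ranges and to_half_open are hypothetical and not part of Nextclade.

```python
def parse_ranges(cell):
    """Parse a cell such as "201,28881-28882" into 1-based closed (begin, end) pairs.

    Assumption: every item is either a single position or "begin-end".
    """
    ranges = []
    for item in cell.split(","):
        item = item.strip()
        if not item:
            continue  # an empty cell yields no ranges
        begin, _, end = item.partition("-")
        # A single position "201" is treated as the closed range 201..201.
        ranges.append((int(begin), int(end or begin)))
    return ranges


def to_half_open(ranges):
    """Convert 1-based closed ranges to 0-based half-open ranges (slice convention)."""
    return [(begin - 1, end) for begin, end in ranges]


closed = parse_ranges("201,28881-28882")
half_open = to_half_open(closed)
```

With this conversion, a 1-based closed range of length n stays length n: 28881-28882 becomes (28880, 28882), which covers two positions when used as a slice of the reference sequence.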

⚠️ Note that, for historical reasons, CSV files use the semicolon ; as the column separator. The comma , is already used as the list separator within table cells, and the CSV writer in early versions of Nextclade was imperfect, so a semicolon was the easy solution. We recommend using the TSV format instead of CSV. If you do use CSV, make sure your spreadsheet software or parser is configured to use the semicolon ; as the column delimiter.
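As an illustration of the delimiter difference, here is a small sketch that reads either format with Python's standard csv module. The function name read_nextclade_table and the inline sample data are made up for the example. Only the delimiters (tab for TSV, semicolon for CSV) come from the text above.

```python
import csv
import io


def read_nextclade_table(handle, delimiter="\t"):
    """Read a tabular output into a list of dicts keyed by column name.

    Pass delimiter=";" for the CSV output and leave the default for TSV.
    When opening a real file, open(path, newline="") is the mode the csv
    module documentation recommends.
    """
    return list(csv.DictReader(handle, delimiter=delimiter))


# Tiny made-up samples with three of the documented columns.
tsv_text = "index\tseqName\tsubstitutions\n0\tseqA\tC241T,C2061T\n"
csv_text = "index;seqName;substitutions\n0;seqA;C241T,C2061T\n"

tsv_rows = read_nextclade_table(io.StringIO(tsv_text))
csv_rows = read_nextclade_table(io.StringIO(csv_text), delimiter=";")
```

Because commas only separate list items inside a cell, neither delimiter splits the substitutions cell, and both calls return the same row contents.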

Every row of the tabular output corresponds to one input sequence. The columns are described below:

| Column name | Meaning | Type | Example |
| --- | --- | --- | --- |
| index | Index (integer signifying location) of a corresponding record in the input fasta file(s) | non-negative integer | 0 |
| seqName | Name of the sequence (as provided in the input file) | string | hCoV-19/USA/SEARCH-4652-SAN/2020 |
| clade | Assigned clade | string | 20A |
| qc.overallScore | Overall quality control score | float | 23.5 |
| qc.overallStatus | Overall quality control status | string: good\|mediocre\|bad | mediocre |
| totalSubstitutions | Total number of detected nucleotide substitutions | non-negative integer | 2 |
| totalDeletions | Total number of deleted nucleotide bases | non-negative integer | 15 |
| totalInsertions | Total number of inserted nucleotide bases | non-negative integer | 3 |
| totalFrameShifts | Total number of detected frame shifts | non-negative integer | 0 |
| totalAminoacidSubstitutions | Total number of detected aminoacid substitutions | non-negative integer | 1 |
| totalAminoacidDeletions | Total number of deleted amino acid residues | non-negative integer | 7 |
| totalAminoacidInsertions | Total number of inserted amino acid residues | non-negative integer | 8 |
| totalMissing | Total number of detected missing nucleotides (nucleotide character N) | non-negative integer | 238 |
| totalNonACGTNs | Total number of detected ambiguous nucleotides (nucleotide characters that are not A, C, G, T, N) | non-negative integer | 2 |
| totalUnknownAa | Total number of unknown aminoacids (aminoacid character X) | non-negative integer | 0 |
| totalPcrPrimerChanges | Total number of nucleotide mutations detected in PCR primer regions | non-negative integer | 0 |
| substitutions | List of detected nucleotide substitutions | comma separated list of strings | C241T,C2061T,C11514T,G23012A |
| deletions | List of detected nucleotide deletion ranges | comma separated list of strings | 201,28881-28882 |
| insertions | List of detected inserted nucleotide fragments | comma separated list of strings | 248:G,21881:GAG |
| privateNucMutations.reversionSubstitutions | List of detected private mutations that are reversions to reference | comma separated list of strings | C241T |
| privateNucMutations.labeledSubstitutions | List of detected private mutations that are to a genotype that has been labeled in virus_properties.json | comma separated list of strings | C11514T\|21I&20C,C2061T\|21E |
| privateNucMutations.unlabeledSubstitutions | List of detected private mutations that are neither reversions nor labeled | comma separated list of strings | G23012A |
| privateNucMutations.totalReversionSubstitutions | Total number of private mutations that are reversions to reference | non-negative integer | 1 |
| privateNucMutations.totalLabeledSubstitutions | Total number of private mutations that are to a genotype that has been labeled in virus_properties.json | non-negative integer | 2 |
| privateNucMutations.totalUnlabeledSubstitutions | Total number of private mutations that are neither reversions nor labeled | non-negative integer | 1 |
| privateNucMutations.totalPrivateSubstitutions | Total number of private mutations overall | non-negative integer | 4 |
| frameShifts | List of detected frame shifts | comma separated list of strings | N:33-420 |
| aaSubstitutions | List of detected aminoacid substitutions | comma separated list of strings | E:T9I,N:R203K |
| aaDeletions | List of detected aminoacid deletions | comma separated list of strings | N:E31-,N:E32- |
| aaInsertions | List of detected aminoacid insertions | comma separated list of strings | S:214:EPE |
| missing | List of detected missing nucleotides (nucleotide character N) | comma separated list of strings | 704-726,4248 |
| nonACGTNs | List of detected ambiguous nucleotides (nucleotide characters that are not A, C, G, T, N) | comma separated list of strings | Y:27948,K:3877 |
| unknownAaRanges | List of detected contiguous ranges of unknown aminoacid (aminoacid character X) | comma separated list of strings | E:1-12,E:29 |
| pcrPrimerChanges | List of detected PCR primer changes | comma separated list of strings | |
| alignmentScore | Alignment score | non-negative integer | 88237 |
| alignmentStart | Beginning of the sequenced region | non-negative integer | 1 |
| alignmentEnd | End of the sequenced region | non-negative integer | 29903 |
| qc.missingData.missingDataThreshold | Threshold that was used for "Missing data" QC rule | int | 3000 |
| qc.missingData.score | Score for "Missing data" QC rule | float | 0.5 |
| qc.missingData.status | Status for "Missing data" QC rule | string: good\|mediocre\|bad | mediocre |
| qc.missingData.totalMissing | Total number of missing nucleotides used in "Missing data" QC rule | non-negative integer | 238 |
| qc.mixedSites.mixedSitesThreshold | Threshold used for "Mixed sites" QC rule | int | 10 |
| qc.mixedSites.score | Score for "Mixed sites" QC rule | float | 0.5 |
| qc.mixedSites.status | Status for "Mixed sites" QC rule | string: good\|mediocre\|bad | good |
| qc.mixedSites.totalMixedSites | Total number of ambiguous nucleotides used for "Mixed sites" QC rule | non-negative integer | 2 |
| qc.privateMutations.cutoff | Cutoff parameter used for "Private mutations" QC rule | int | 3 |
| qc.privateMutations.excess | Excess parameter used for "Private mutations" QC rule | int | 1 |
| qc.privateMutations.score | Score for "Private mutations" QC rule | float | 0.5 |
| qc.privateMutations.status | Status for "Private mutations" QC rule | string: good\|mediocre\|bad | good |
| qc.privateMutations.total | Weighted sum of private mutations used for "Private mutations" QC rule | non-negative integer | 4 |
| qc.snpClusters.clusteredSNPs | Clustered SNPs detected for "SNP clusters" QC rule | comma separated list of strings | C241T,C2061T |
| qc.snpClusters.score | Score for "SNP clusters" QC rule | float | 0.5 |
| qc.snpClusters.status | Status for "SNP clusters" QC rule | string: good\|mediocre\|bad | bad |
| qc.snpClusters.totalSNPs | Total number of SNPs for "SNP clusters" QC rule | non-negative integer | 2 |
| qc.frameShifts.frameShifts | List of detected frame shifts in "Frame shifts" QC rule (excluding ignored) | comma separated list of strings | N:33-420 |
| qc.frameShifts.totalFrameShifts | Total number of detected frame shifts in "Frame shifts" QC rule (excluding ignored) | non-negative integer | 1 |
| qc.frameShifts.frameShiftsIgnored | List of frame shifts detected, but ignored due to ignore list | comma separated list of strings | ORF8:109-111 |
| qc.frameShifts.totalFrameShiftsIgnored | Total number of frame shifts detected, but ignored due to ignore list | non-negative integer | 1 |
| qc.frameShifts.score | Score for "Frame shifts" QC rule | float | 0.5 |
| qc.frameShifts.status | Status for "Frame shifts" QC rule | string: good\|mediocre\|bad | bad |
| qc.stopCodons.stopCodons | List of detected stop codons in "Stop codons" QC rule | comma separated list of strings | ORF1a:4715,ORF1a:4716 |
| qc.stopCodons.totalStopCodons | Total number of detected stop codons in "Stop codons" QC rule | non-negative integer | 2 |
| qc.stopCodons.score | Score for "Stop codons" QC rule | float | 0.5 |
| qc.stopCodons.status | Status for "Stop codons" QC rule | string: good\|mediocre\|bad | bad |
| isReverseComplement | Whether query sequences were transformed using reverse complement operation before alignment | boolean | false |
| errors | List of errors during processing | comma separated list of strings | |
| warnings | List of warnings during processing | comma separated list of strings | |
| failedCdses | List of CDS that failed translation | comma separated list of strings | |
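List-valued cells can be split on commas and then parsed item by item. The sketch below does this for the aaSubstitutions column, assuming every item has the shape shown in the example (CDS name, colon, reference residue, position, new residue, as in E:T9I). The pattern and the names AaSubstitution, split_list and parse_aa_substitutions are illustrative assumptions, not a format specification.

```python
import re
from typing import NamedTuple


class AaSubstitution(NamedTuple):
    cds: str   # CDS or gene name, e.g. "N"
    ref: str   # residue in the reference
    pos: int   # 1-based position within the CDS
    alt: str   # residue in the query sequence


# Assumed item shape, based on the examples in the table: "N:R203K".
_AA_SUB = re.compile(r"^(?P<cds>[^:]+):(?P<ref>[A-Z*])(?P<pos>\d+)(?P<alt>[A-Z*])$")


def split_list(cell):
    """Split a comma separated cell, returning [] for an empty cell."""
    return [item for item in cell.split(",") if item]


def parse_aa_substitutions(cell):
    """Parse an aaSubstitutions cell into AaSubstitution tuples."""
    parsed = []
    for item in split_list(cell):
        match = _AA_SUB.match(item)
        if match is None:
            # Fail loudly rather than silently dropping an unexpected item.
            raise ValueError(f"unexpected aaSubstitutions item: {item!r}")
        parsed.append(
            AaSubstitution(match["cds"], match["ref"], int(match["pos"]), match["alt"])
        )
    return parsed


subs = parse_aa_substitutions("E:T9I,N:R203K")
```

Raising on an unrecognized item is a deliberate choice here: if a dataset produces items in a shape this sketch does not anticipate, the mismatch surfaces immediately instead of skewing downstream counts.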

⚠️ Note that sequence names (seqName column) are not guaranteed to be unique, and in practice they are very often not unique. Indices (index column) are therefore the only reliable way to link inputs and outputs.
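A minimal sketch of linking by index rather than by name follows. It assumes a single input fasta file and that index is the zero-based position of the record in that file, which matches the example value 0 in the table but is an assumption here. The helpers read_fasta and link_by_index are hypothetical.

```python
def read_fasta(lines):
    """Yield (name, sequence) pairs in file order from an iterable of lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        elif line and name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def link_by_index(fasta_lines, rows):
    """Map each result row to its input record using the index column.

    Assumption: index is the zero-based position of the record in the input.
    """
    records = list(read_fasta(fasta_lines))
    linked = {}
    for row in rows:
        i = int(row["index"])
        name, sequence = records[i]
        linked[i] = (name, sequence, row)
    return linked


# Two input records that share a name: only the index tells them apart.
fasta = [">dup", "ACGT", ">dup", "AC", "GG"]
rows = [{"index": "1", "seqName": "dup"}, {"index": "0", "seqName": "dup"}]
linked = link_by_index(fasta, rows)
```

A lookup keyed on seqName would collapse the two records above into one entry, whereas the index keeps them distinct even when the rows arrive out of order.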

The table can contain additional columns for every clade-like attribute defined in the reference tree in meta.extensions.clade_node_attrs and in the node attributes. For example, the default SARS-CoV-2 datasets define the Nextclade_pango attribute, which holds the Pango lineage assigned by Nextclade (see Nextclade as pango lineage classifier: Methods and Validation).

⚠️ Note that if nucleotide alignment or analysis of an individual sequence fails, the alignment and translations are omitted from the output fasta files (see above), but the corresponding entry is still present in most of the other output files. In this case the errors column/field contains details about why the processing failed.


If translation, alignment or analysis of an individual CDS fails, the corresponding peptide cannot be analyzed, and therefore no details about aminoacid mutations, deletions, insertions, frame shifts etc. will be available for it. In this case the warnings and failedCdses columns/fields contain details about which CDS failed and why.


Care should be taken to check the errors, warnings and failedCdses columns or fields, to avoid misinterpreting missing or empty entries. For example, if the errors column is non-empty in the TSV output file, processing of the sequence failed completely, and treating the empty substitutions column as "no mutations detected" is incorrect.
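One way to apply this check in a script is sketched below. Rows are assumed to be dicts keyed by column name (for example as produced by csv.DictReader), and the function names and the three status labels are made up for the illustration.

```python
def classify_row(row):
    """Return "failed", "partial" or "ok" for one result row.

    "failed":  errors is non-empty, so the whole sequence failed.
    "partial": warnings or failedCdses is non-empty, so some CDS may lack results.
    "ok":      none of the three columns reports a problem.
    """
    if row.get("errors"):
        return "failed"
    if row.get("warnings") or row.get("failedCdses"):
        return "partial"
    return "ok"


def substitutions_or_none(row):
    """Return the list of substitutions, or None when the row failed.

    None means "unknown", which is different from [] ("no mutations detected").
    """
    if row.get("errors"):
        return None
    return [item for item in row.get("substitutions", "").split(",") if item]


good = {"errors": "", "warnings": "", "failedCdses": "", "substitutions": "C241T,C2061T"}
clean = {"errors": "", "warnings": "", "failedCdses": "", "substitutions": ""}
broken = {"errors": "some error message", "warnings": "", "failedCdses": "", "substitutions": ""}
```

Here clean and broken both have an empty substitutions cell, but only clean can be read as "no mutations detected". For broken the helper returns None so that callers cannot confuse the two cases.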


See the descriptions of individual outputs and the Errors and warnings section for more details.