A Getting Started Guide to the Genomic Epidemiology of SARS-CoV-2

This template and tutorial will walk you through the process of running a basic phylogenetic analysis on SARS-CoV-2 data. We’ve created these resources with the goal of enabling Departments of Public Health to start using Nextstrain to understand their SARS-CoV-2 genomic data within 1-2 hours. In addition to the phylogenetic analysis described here, you can use our “drag-and-drop” tool for clade assignment, mutation calling, and basic sequence quality checks at clades.nextstrain.org.

We also recommend this 1-hour video overview by Heather Blankenship on how to deploy Nextstrain for a Public Health lab.

Overview: complete walkthrough

Getting started with analysis

The starting point for this section is a FASTA file with sequence data and a TSV file with metadata. Alternatively, you can start with our example data.

  1. Setup and installation

  2. Preparing your data

  3. Orientation: analysis workflow

  4. Orientation: which files should I touch?

  5. Running & troubleshooting

  6. Customizing your analysis

  7. Customizing your visualization
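
Before working through the steps above, it can help to confirm that the two starting files (the FASTA file and the TSV metadata file) describe the same samples. The following is a minimal sketch, not part of the Nextstrain tooling. It assumes that each FASTA header begins with the sample name (the text after `>` up to the first whitespace) and that the metadata has a column named `strain` holding the same names. The function names and the in-memory example data are hypothetical and used only for illustration.

```python
import csv
import io


def fasta_names(handle):
    """Collect sequence names from FASTA header lines.

    Assumption: the name is the text after '>' up to the first whitespace.
    """
    names = []
    for line in handle:
        if line.startswith(">"):
            header = line[1:].strip()
            names.append(header.split()[0] if header else "")
    return names


def metadata_names(handle, column="strain"):
    """Collect sample names from one column of a tab-separated metadata file.

    Assumption: the column is called 'strain'; change it if yours differs.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return [row[column] for row in reader]


def compare(fasta_handle, tsv_handle, column="strain"):
    """Return (names only in the FASTA, names only in the metadata)."""
    seqs = set(fasta_names(fasta_handle))
    meta = set(metadata_names(tsv_handle, column))
    return sorted(seqs - meta), sorted(meta - seqs)


# Tiny made-up example standing in for real files.
example_fasta = io.StringIO(">sampleA\nACGT\n>sampleB\nACGT\n")
example_tsv = io.StringIO(
    "strain\tdate\nsampleA\t2020-03-01\nsampleC\t2020-03-02\n"
)

without_metadata, without_sequence = compare(example_fasta, example_tsv)
print("Sequences without metadata:", without_metadata)
print("Metadata rows without a sequence:", without_sequence)
```

To run the same check on real data, pass file handles opened with `open(path, newline="")` in place of the `io.StringIO` objects.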

Getting started with visualization & interpretation

The starting point for this section is a JSON file. Alternatively, you can start with our examples.

  1. Options for visualizing and sharing results

  2. Interpreting your results

  3. Writing a narrative to highlight key findings


If something in this tutorial is broken or unclear, please open an issue so we can improve it for everyone.

If you have a specific question, post a note over at the discussion board – we’re happy to help!