This page gives an overview of the files in your local
User files are not tracked by version control, meaning they are either provided by the user or generated by the workflow.
An analysis directory is a non-tracked directory which contains user-defined customization files.
In the tutorials, the analysis directory is ncov-tutorial/. Follow these steps to create your own analysis directory.
Previously, we recommended using Snakemake profiles under a my_profiles/ analysis directory. We now recommend using Snakemake config files directly via the --configfile parameter. You can still use existing profiles via
Learn how to prepare input files with the Data preparation guide.
A few example input files are provided when you clone ncov/ locally, under
Metadata file (e.g. data/example_metadata.tsv): tab-delimited description of strain (i.e., sample) attributes.
Sequences file (e.g. data/example_sequences.fasta.gz): genomic sequences whose IDs must match the strain column in the metadata file.
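The requirement that sequence IDs match the strain column can be checked before running the workflow. The following is a minimal sketch, not part of the ncov workflow itself: the helper names (read_fasta_ids, read_metadata_strains, find_mismatches) and the sample strain names are hypothetical, and it assumes each FASTA header carries the ID as its first whitespace-delimited token. It uses small in-memory stand-ins rather than the real example files.

```python
import csv
import gzip
import io


def read_fasta_ids(handle):
    """Yield the ID of each FASTA record (text after '>' up to the first whitespace)."""
    for line in handle:
        if line.startswith(">"):
            parts = line[1:].split()
            yield parts[0] if parts else ""


def read_metadata_strains(handle, column="strain"):
    """Return the set of values in the given column of a tab-delimited file."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row[column] for row in reader}


def find_mismatches(sequence_ids, strains):
    """Return (IDs without metadata, strains without a sequence), both sorted."""
    sequence_ids = set(sequence_ids)
    return sorted(sequence_ids - strains), sorted(strains - sequence_ids)


# Tiny in-memory stand-ins for the real files (hypothetical strain names).
metadata_text = "strain\tdate\nsampleA\t2021-01-01\nsampleB\t2021-01-02\n"
fasta_bytes = gzip.compress(b">sampleA\nACGT\n>sampleC\nACGT\n")

strains = read_metadata_strains(io.StringIO(metadata_text))
with gzip.open(io.BytesIO(fasta_bytes), "rt") as handle:
    missing_metadata, missing_sequences = find_mismatches(read_fasta_ids(handle), strains)

print(missing_metadata)   # IDs present only in the sequences file
print(missing_sequences)  # strains present only in the metadata file
```

To run the same check on real inputs, open the metadata file with open(path, newline="") and the compressed sequences file with gzip.open(path, "rt") in place of the in-memory objects.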
Output files and directories
These are generated by the workflow.
auspice/<dataset_name>.json: output file for visualization in Auspice, where <dataset_name> is the name of your output dataset in the workflow configuration file used by
results/aligned.fasta, etc.: raw results files (dependencies) that are shared across all datasets.
results/<dataset_name>/: raw results files (dependencies) that are specific to a single dataset.
logs/: Log files with error messages and other information about the run.
benchmarks/: Run-times (and memory usage on Linux systems) for each rule in the workflow.
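As a small illustration of the naming pattern above, the dataset-specific output locations can be derived from a dataset name. This is a sketch only: expected_outputs is a hypothetical helper, "my-build" is a made-up dataset name, and the paths are relative to the ncov/ directory.

```python
from pathlib import PurePosixPath


def expected_outputs(dataset_name):
    """Map a dataset name to the dataset-specific output locations listed above."""
    return {
        "auspice_json": PurePosixPath("auspice") / f"{dataset_name}.json",
        "results_dir": PurePosixPath("results") / dataset_name,
    }


# "my-build" is a hypothetical dataset name used only for illustration.
paths = expected_outputs("my-build")
print(paths["auspice_json"])
print(paths["results_dir"])
```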
These files are not intended for modification. See the Workflow config file guide for how to configure workflow behavior.
Default workflow customization files
defaults/parameters.yaml: default config file. Override these settings using
defaults/auspice_config.json: default Auspice config file. Override these settings using
defaults/include.txt: default strain names to include during subsampling and filtering.
defaults/exclude.txt: default strain names to exclude during subsampling and filtering.
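To inspect which strains an include or exclude list names, a short parser is often enough. The sketch below assumes one strain name per line, with blank lines ignored and "#" starting a comment; that format is an assumption here, not something this page specifies, and parse_strain_list and the sample names are hypothetical.

```python
def parse_strain_list(text):
    """Return strain names, assuming one name per line and '#' starting a comment."""
    strains = []
    for line in text.splitlines():
        # Drop anything after '#', then surrounding whitespace.
        name = line.split("#", 1)[0].strip()
        if name:
            strains.append(name)
    return strains


# Hypothetical file contents used only for illustration.
example = "# reference strains\nsampleA\n\nsampleB  # keep for rooting\n"
names = parse_strain_list(example)
print(names)
```

Under this assumption a strain name that itself contains "#" would be truncated, so treat the result as a quick inspection aid rather than a faithful reimplementation of the workflow's own parsing.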
Workflow definition files
Snakefile: entry point for Snakemake commands that also validates inputs.
workflow/snakemake_rules/main_workflow.smk: defines rules for running each step in the analysis. Modify your workflow config file rather than hardcoding changes into the Snakemake file itself.
workflow/envs/nextstrain.yaml: specifies computing environment needed to run workflow with the
workflow/schemas/config.schema.yaml: defines format (e.g., required fields and types) for workflow config files.
scripts/: helper scripts for common tasks.
These files are used to generate the workflow documentation.
Nextstrain user files
The Nextstrain team maintains user files in the ncov/ repo, under