Overview of this repository (i.e., what do these files do?)
Each file in this repository falls into one of these categories:
Input files
Output files and directories
Workflow configuration files we might want to customize
Workflow configuration files we don’t need to touch
Documentation
We’ll walk through all of the files one by one, but here are the most important ones for your reference:
Category | Directory | File | Description | Configuration |
---|---|---|---|---|
Input file | ./data/ | sequences.fasta | Genomic sequences; IDs must match strain column in metadata.tsv | See 'Preparing your data' |
Input file | ./data/ | metadata.tsv | Tab-delimited description of strain (i.e., sample) attributes | See 'Preparing your data' |
Output file | ./auspice/ | buildName.json | Output file for visualization in auspice | |
Customizable workflow file | ./my_profiles/<mybuildname>/ | builds.yaml | Define and parameterize all the builds you'd like to run | See our customization guide |
Input files
Directory | File | Description | Configuration |
---|---|---|---|
./data/ | sequences.fasta | Genomic sequences; IDs must match strain column in metadata.tsv | See 'Preparing your data' |
./data/ | metadata.tsv | Tab-delimited description of strain (i.e., sample) attributes | See 'Preparing your data' |
./defaults/ | include.txt | List of strain names to include during subsampling and filtering | One strain name per line |
./defaults/ | exclude.txt | List of strain names to exclude during subsampling and filtering | One strain name per line |
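
The requirement that sequence IDs match the strain column in metadata.tsv can be checked before running the workflow. The following is a minimal sketch and not part of this repository: the function names (`fasta_ids`, `metadata_strains`, `compare_ids`) are hypothetical, and it assumes that a FASTA ID is the first whitespace-delimited token after `>` and that metadata.tsv has a header row containing a `strain` column.

```python
import csv
from pathlib import Path


def fasta_ids(fasta_path):
    """Collect sequence IDs from a FASTA file.

    Assumption: the ID is the first whitespace-delimited token after '>'.
    """
    ids = set()
    with open(fasta_path, encoding="utf-8") as handle:
        for line in handle:
            if line.startswith(">"):
                header = line[1:].strip()
                if header:
                    ids.add(header.split()[0])
    return ids


def metadata_strains(metadata_path, column="strain"):
    """Collect the values of the strain column from a tab-delimited file."""
    with open(metadata_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        if reader.fieldnames is None or column not in reader.fieldnames:
            raise ValueError(f"no '{column}' column in {metadata_path}")
        return {row[column] for row in reader if row[column]}


def compare_ids(fasta_path, metadata_path):
    """Return (IDs only in the FASTA, strains only in the metadata), sorted."""
    seq = fasta_ids(fasta_path)
    meta = metadata_strains(metadata_path)
    return sorted(seq - meta), sorted(meta - seq)


if __name__ == "__main__":
    fasta = Path("data/sequences.fasta")
    metadata = Path("data/metadata.tsv")
    if fasta.exists() and metadata.exists():
        only_fasta, only_meta = compare_ids(fasta, metadata)
        print(f"{len(only_fasta)} IDs missing from metadata.tsv")
        print(f"{len(only_meta)} strains missing from sequences.fasta")
```

Run from the repository root, this prints how many names appear in only one of the two files. The same set comparison could be extended to flag names in include.txt or exclude.txt that are absent from the metadata.
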
Output files and directories
Directory | File | Description |
---|---|---|
./auspice/ | buildName.json | Output file for visualization in auspice |
./results/ | aligned.fasta, sequence-diagnostics.tsv, etc. | Raw results files (dependencies) that are shared across all builds |
./results/<buildName>/ | tree.nwk, aa_mutations.json, etc. | Raw results files (dependencies) that are specific to a single build |
./logs/ | .log files | Error messages and other information about the run |
Workflow configuration files we might want to customize
Directory | File | Description | Configuration |
---|---|---|---|
./my_profiles/<mybuildname>/ | builds.yaml | Define and configure all the builds you'd like to run | See our customization guide |
./my_profiles/<mybuildname>/ | config.yaml | Workflow configuration file; set the number of cores, etc. | See our customization guide |
./defaults/ | parameters.yaml | Default analysis configuration file | Override these settings in ./my_profiles/.../builds.yaml |
./defaults/ | auspice_config.json | Default visualization configuration file | Override these settings in ./my_profiles/.../auspice_config.yaml |
Workflow configuration files we don’t need to touch
Directory | File | Description | Configuration |
---|---|---|---|
./ | Snakefile | Entry point for snakemake commands; validates input | No modification needed |
./workflow/snakemake_rules/ | main_workflow.smk | Defines rules for running each step in the analysis | Modify your builds.yaml file rather than hardcoding changes into the snakemake file itself |
./workflow/envs/ | nextstrain.yaml | Specifies the computing environment needed to run the workflow with the --use-conda flag | No modification needed |
./workflow/schemas/ | config.schema.yaml | Defines the format (e.g., required fields and types) for config.yaml files | Useful reference, but no modification needed |
./scripts/ | add_priorities_to_meta.py, etc. | Helper scripts for common tasks | No modification needed |