Share via Community on GitHub

One way researchers can share their analyses through Nextstrain is via GitHub. Dataset JSONs and/or narrative Markdown files are stored in your own GitHub repos and accessed through nextstrain.org/community URLs, which gives you complete control, ownership, and discretion over your data. All that is required for this functionality is that files conform to a specific naming scheme (see below). There is no need to get in touch with the Nextstrain team to allow access to the dataset, but if you would like your dataset featured on the front page or listed along with all available SARS-CoV-2 builds, then please let us know!

P.S. For help with running your analysis, see the bioinformatics introduction.

Technical details

Given a GitHub organisation <ORG> and repository <REPO>, dataset files should be stored in a folder named auspice. The filename must have the format <REPO>[_<NAME1>[_<NAME2>[...]]].json, where the underscore-separated dataset-specific names (e.g. <NAME1>) are optional. Such datasets will be available at nextstrain.org/community/<ORG>/<REPO>[/<NAME1>/<NAME2>/...]. Note that the dataset names are underscore-separated in the filename but /-separated in the URL. See the table below for examples.
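The filename-to-URL mapping described above can be sketched in a few lines of Python. This is only an illustration of the naming scheme, not code used by nextstrain.org; the function name `community_dataset_url` is hypothetical.

```python
def community_dataset_url(org: str, repo: str, filename: str) -> str:
    """Map a filename inside auspice/ to its nextstrain.org/community URL.

    Hypothetical helper illustrating the naming scheme only.
    """
    suffix = ".json"
    if not filename.endswith(suffix):
        raise ValueError("dataset files must end in .json")
    stem = filename[: -len(suffix)]

    if stem == repo:
        # auspice/<REPO>.json: no dataset-specific names
        names = []
    elif stem.startswith(repo + "_"):
        # auspice/<REPO>_<NAME1>_<NAME2>.json: names are underscore-separated
        names = stem[len(repo) + 1 :].split("_")
        if any(name == "" for name in names):
            raise ValueError("empty dataset name in filename")
    else:
        raise ValueError("filename must start with the repository name")

    # Names become /-separated path components in the URL
    return "/".join(["nextstrain.org/community", org, repo, *names])
```

For example, `community_dataset_url("emmahodcroft", "cov", "cov_229E_spike.json")` yields `nextstrain.org/community/emmahodcroft/cov/229E/spike`.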

Git Branches In the above description, files are assumed to reside on the master branch. It is possible to access files on a different branch, <BRANCH>, by specifying the branch in the URL via nextstrain.org/community/<ORG>/<REPO>@<BRANCH>[/<NAME1>/...]. Note that if the default branch of your repo is main, then you must specify this in the URL, e.g. nextstrain.org/community/<ORG>/<REPO>@main. See the table below for examples.
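As a rough sketch of how the branch fits into the URL, the hypothetical helper below (not part of any Nextstrain tool) appends `@<BRANCH>` to the repository component only when a branch is given. Leaving it out corresponds to the master branch, as described above.

```python
from typing import Optional, Sequence


def community_url_on_branch(
    org: str,
    repo: str,
    branch: Optional[str] = None,
    names: Sequence[str] = (),
) -> str:
    """Build a community URL, optionally pinned to a branch (illustrative only)."""
    base = f"nextstrain.org/community/{org}/{repo}"
    if branch is not None:
        # The branch is attached to the repo name with "@", before any dataset names
        base += f"@{branch}"
    return "/".join([base, *names])
```

Calling `community_url_on_branch("jameshadfield", "scratch", "test-branch", ["placentalia"])` gives `nextstrain.org/community/jameshadfield/scratch@test-branch/placentalia`.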

Listing of all datasets and narratives If a dataset file exists at auspice/<REPO>.json (i.e. there are no dataset-specific names in the filename), then visiting nextstrain.org/community/<ORG>/<REPO> will automatically load that dataset. If no such file exists (i.e. all the datasets have at least one <NAME> in their filenames), then visiting that URL will list the available datasets and narratives.
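The rule above amounts to a single membership check. The sketch below models it with a hypothetical function, `repo_root_behaviour`, that takes the set of filenames in the auspice folder. It is a simplified model of the described behaviour, not the server's implementation, and it ignores the deprecated v1 file pairs for brevity.

```python
def repo_root_behaviour(repo: str, auspice_files: set) -> str:
    """Say what nextstrain.org/community/<ORG>/<REPO> would show (simplified model)."""
    if f"{repo}.json" in auspice_files:
        # A file without dataset-specific names is loaded directly
        return "dataset"
    # Otherwise the page lists the available datasets and narratives
    return "listing"
```

With only `cov_229E_spike.json` and `cov_OC43_spike.json` present in a repository named `cov`, this returns `"listing"`.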

Narratives The above naming scheme is the same for narratives, with a few small changes. Files should be located in the narratives folder (not auspice), they should have a .md suffix (not .json), and they are accessed through URLs of the form nextstrain.org/community/narratives/<ORG>/<REPO>[/<NAME1>/...]. See the table below for examples.
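The narrative variant of the naming scheme can be sketched the same way. The function below is a hypothetical illustration, not Nextstrain code: it expects a `.md` filename from the narratives folder and places `narratives` in the URL path before the organisation.

```python
def community_narrative_url(org: str, repo: str, filename: str) -> str:
    """Map a filename inside narratives/ to its community narrative URL (illustrative)."""
    suffix = ".md"
    if not filename.endswith(suffix):
        raise ValueError("narrative files must end in .md")
    stem = filename[: -len(suffix)]

    if stem == repo:
        names = []
    elif stem.startswith(repo + "_"):
        # Underscore-separated names become /-separated URL components
        names = stem[len(repo) + 1 :].split("_")
    else:
        raise ValueError("filename must start with the repository name")

    return "/".join(["nextstrain.org/community/narratives", org, repo, *names])
```

For instance, `community_narrative_url("ESR-NZ", "GenomicsNarrativeSARSCoV2", "GenomicsNarrativeSARSCoV2_2020-10-01.md")` gives `nextstrain.org/community/narratives/ESR-NZ/GenomicsNarrativeSARSCoV2/2020-10-01`.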

v1 (deprecated) datasets work the same way, except that there are two JSONs required, auspice/<REPO>[_<NAME1>...]_tree.json and auspice/<REPO>[_<NAME1>...]_meta.json. Note that if there is a unified dataset also available (auspice/<REPO>[_<NAME1>...].json) then this will be preferentially used. See “zika-colombia” in the table below as an example.
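The preference for a unified dataset over a v1 pair can be expressed as a short lookup. This is a sketch under the assumption that the files present in the auspice folder are known as a set of filenames; `pick_dataset_files` is a hypothetical name, and the real lookup on nextstrain.org may differ in detail.

```python
def pick_dataset_files(stem: str, auspice_files: set) -> list:
    """Choose which file(s) to load for a dataset stem such as "<REPO>_<NAME1>".

    Illustrative only: prefers the unified JSON, falls back to the v1 pair.
    """
    unified = f"{stem}.json"
    if unified in auspice_files:
        # A unified dataset is used preferentially when it exists
        return [unified]

    v1_pair = [f"{stem}_meta.json", f"{stem}_tree.json"]
    if all(name in auspice_files for name in v1_pair):
        # Deprecated v1 format: both JSONs are required
        return v1_pair

    # Nothing usable for this stem
    return []
```

For a repository holding only `zika-colombia_meta.json` and `zika-colombia_tree.json`, `pick_dataset_files("zika-colombia", ...)` returns the v1 pair.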

Examples

Datasets

(GitHub) Org	Repository	branch	File(s) in repository	Nextstrain URL
`<ORG>`	`<REPO>`	master	`auspice/<REPO>.json`	`nextstrain.org/community/<ORG>/<REPO>`
`<ORG>`	`<REPO>`	`<BRANCH>`	`auspice/<REPO>.json`	`nextstrain.org/community/<ORG>/<REPO>@<BRANCH>`
`<ORG>`	`<REPO>`	`<BRANCH>`	`auspice/<REPO>_<NAME1>_<NAME2>.json`	`nextstrain.org/community/<ORG>/<REPO>@<BRANCH>/<NAME1>/<NAME2>`
blab	sars-like-cov	master	auspice/sars-like-cov.json	https://nextstrain.org/community/blab/sars-like-cov
emmahodcroft	cov	master	N/A	https://nextstrain.org/community/emmahodcroft/cov (lists available datasets)
emmahodcroft	cov	master	auspice/cov_229E_spike.json	https://nextstrain.org/community/emmahodcroft/cov/229E/spike
emmahodcroft	cov	master	auspice/cov_OC43_spike.json	https://nextstrain.org/community/emmahodcroft/cov/OC43/spike
jameshadfield	scratch	test-branch	auspice/scratch_placentalia.json	https://nextstrain.org/community/jameshadfield/scratch@test-branch/placentalia
blab	zika-colombia	master	auspice/zika-colombia_meta.json, auspice/zika-colombia_tree.json	https://nextstrain.org/community/blab/zika-colombia

Narratives

(GitHub) Org	Repository	branch	File(s) in repository	Nextstrain URL
`<ORG>`	`<REPO>`	master	`narratives/<REPO>.md`	`nextstrain.org/community/narratives/<ORG>/<REPO>`
ESR-NZ	GenomicsNarrativeSARSCoV2	master	narratives/GenomicsNarrativeSARSCoV2_2020-10-01.md	https://nextstrain.org/community/narratives/ESR-NZ/GenomicsNarrativeSARSCoV2/2020-10-01
blab	ebola-narrative-ms	master	narratives/ebola-narrative-ms_2019-09-13-sit-rep-ENGLISH.md	https://nextstrain.org/community/narratives/blab/ebola-narrative-ms/2019-09-13-sit-rep-ENGLISH

For more examples please see the Nextstrain front page and the listing of all SARS-CoV-2 builds.