augur.mask module

Mask specified sites from a VCF or FASTA file.

augur.mask.get_chrom_name(vcf_file)

Read the CHROM field from the first non-header line of a VCF file.

Returns: str or None: Either the CHROM field or None if no non-comment line could be found.
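The behaviour described above can be illustrated with a minimal sketch. This is not the module's own code: the name `first_chrom` is hypothetical, and the sketch assumes an uncompressed, tab-delimited VCF in which header lines start with "#".

```python
def first_chrom(vcf_path):
    """Return the CHROM field of the first record line, or None if there is none.

    Hypothetical helper for illustration; assumes an uncompressed VCF.
    """
    with open(vcf_path) as handle:
        for line in handle:
            # Skip meta-information lines ("##...") and the column header ("#CHROM...").
            if line.startswith("#") or not line.strip():
                continue
            # CHROM is the first tab-separated column of a VCF record.
            return line.split("\t")[0]
    return None
```

A file that contains only header lines yields None, matching the documented return value.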

augur.mask.mask_fasta(mask_sites, in_file, out_file)

Mask the provided site list from a FASTA file and write to a new file.

Masked sites are overwritten as “N”s.

mask_sites: list[int]

A list of site indexes to exclude from the FASTA.

in_file: str

The path to the FASTA file you wish to mask.

out_file: str

The path to write the resulting FASTA to.
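As a rough illustration of overwriting sites with "N", the sketch below masks a single sequence held as a string. The name `mask_sequence` is hypothetical, and the sketch assumes the site indexes are 0-based offsets into the sequence, which the text above does not state. Out-of-range indexes are ignored here as a design choice of the sketch.

```python
def mask_sequence(sequence, mask_sites):
    """Return a copy of `sequence` with the given sites replaced by "N".

    Hypothetical helper; assumes 0-based site indexes.
    """
    chars = list(sequence)
    for site in mask_sites:
        # Ignore indexes that fall outside the sequence.
        if 0 <= site < len(chars):
            chars[site] = "N"
    return "".join(chars)
```

For example, `mask_sequence("ACGTACGT", [0, 3])` returns `"NCGNACGT"`.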

augur.mask.mask_vcf(mask_sites, in_file, out_file, cleanup=True)

Mask the provided site list from a VCF file and write to a new file.

This function relies on ‘vcftools --exclude-positions’ to mask the requested sites.

mask_sites: list[int]

A list of site indexes to exclude from the VCF.

in_file: str

The path to the VCF file you wish to mask.

out_file: str

The path to write the resulting VCF to.

cleanup: bool

Clean up the intermediate files, including the VCFtools log and the mask sites file.
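To make the VCFtools step more concrete, the sketch below builds the contents of a positions file and a matching command line without running anything. VCFtools expects each line of the `--exclude-positions` file to hold a tab-separated chromosome and position. The function names are hypothetical, the sketch assumes the positions passed in are already 1-based VCF positions, and the exact flags this module passes to VCFtools are not stated in the text above.

```python
def exclude_positions_lines(chrom, positions):
    """Build the lines of a VCFtools positions file: CHROM<TAB>POS per line.

    Hypothetical helper; assumes `positions` are 1-based VCF positions.
    """
    return ["{}\t{}".format(chrom, pos) for pos in sorted(set(positions))]


def vcftools_command(in_file, sites_file, out_prefix):
    """Build (but do not run) a VCFtools command that drops the listed sites."""
    # VCFtools reads gzipped input with --gzvcf and plain text with --vcf.
    input_flag = "--gzvcf" if in_file.endswith(".gz") else "--vcf"
    return [
        "vcftools", input_flag, in_file,
        "--exclude-positions", sites_file,
        "--recode", "--recode-INFO-all",  # write a new VCF, keeping INFO fields
        "--out", out_prefix,
    ]
```

The resulting list could be passed to `subprocess.run`. With `--recode`, VCFtools writes its output next to a log file under the given prefix, which is the kind of intermediate file the cleanup option refers to.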

augur.mask.read_bed_file(mask_file)

Read the full list of excluded sites from the BED file.

The second column is chromStart and the third is chromEnd. A range of sites is generated from these two columns.
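The idea of expanding each BED row into a range of sites can be sketched as follows. The name `bed_to_sites` is hypothetical. The sketch follows the standard BED convention, in which chromStart is 0-based and chromEnd is exclusive; whether this module treats chromEnd as inclusive is not stated above, so that is an assumption.

```python
def bed_to_sites(bed_lines):
    """Expand BED rows into a sorted list of unique site indexes.

    Hypothetical helper; assumes tab-delimited rows and the standard
    half-open BED interval [chromStart, chromEnd).
    """
    sites = set()
    for line in bed_lines:
        line = line.strip()
        # Skip blank lines, comments, and optional BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        start, end = int(fields[1]), int(fields[2])
        sites.update(range(start, end))
    return sorted(sites)
```

Overlapping intervals are merged by the set, so `bed_to_sites(["chr1\t0\t3", "chr1\t2\t5"])` returns `[0, 1, 2, 3, 4]`.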

augur.mask.register_arguments(parser)

augur.mask.run(args)

Mask specified sites from the VCF or FASTA.

For VCF files, this occurs by removing the sites entirely from the VCF, essentially making them identical to the reference at those locations.

For FASTA files, masked sites are replaced with “N”.

If no output is specified, the input file is overwritten.
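The fallback to the input path can be sketched as below. This is an illustration only, not the module's code: `resolve_output` and `write_masked` are hypothetical names, and writing through a temporary file is a design choice of the sketch so that an interrupted run does not leave a half-written input file.

```python
import os
import tempfile


def resolve_output(in_file, out_file=None):
    """Return the path to write to, falling back to the input path."""
    return out_file if out_file else in_file


def write_masked(in_file, masked_text, out_file=None):
    """Write `masked_text` to the resolved output path and return that path."""
    target = resolve_output(in_file, out_file)
    directory = os.path.dirname(os.path.abspath(target))
    # Write to a temporary file in the same directory, then swap it into place.
    with tempfile.NamedTemporaryFile("w", dir=directory, delete=False) as tmp:
        tmp.write(masked_text)
    os.replace(tmp.name, target)
    return target
```

Calling `write_masked(path, text)` without an output path replaces the file at `path`, which mirrors the overwrite behaviour described above.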