augur.mask module
Mask specified sites from a VCF or FASTA file.
- augur.mask.get_chrom_name(vcf_file)
Read the CHROM field from the first non-header line of a VCF file.
Returns: str or None: Either the CHROM field or None if no non-comment line could be found.
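The lookup described above can be illustrated with a minimal sketch. This is not the augur implementation: the helper name `first_chrom` is hypothetical, it works on any iterable of lines rather than a file path, and skipping blank lines is an assumption.

```python
import io


def first_chrom(lines):
    """Return the CHROM field of the first non-header line, or None.

    Hypothetical helper for illustration; not part of augur.
    """
    for line in lines:
        # Header and meta lines in a VCF start with "#".
        # Skipping blank lines is an assumption of this sketch.
        if line.startswith("#") or not line.strip():
            continue
        # CHROM is the first tab-separated column of a data line.
        return line.split("\t", 1)[0]
    return None


vcf_text = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\n"
    "MTB_anc\t5\t.\tA\tG\n"
)
chrom = first_chrom(io.StringIO(vcf_text))
header_only = first_chrom(io.StringIO("##fileformat=VCFv4.2\n"))
```

Here `chrom` holds the name from the first data line, and `header_only` is None because the input contains no data line.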
- augur.mask.mask_fasta(mask_sites, in_file, out_file, mask_from_beginning=0, mask_from_end=0)
Mask the provided site list from a FASTA file and write to a new file.
Masked sites are overwritten as “N”s.
Parameters:
- mask_sites: list[int]
A list of site indexes to exclude from the FASTA.
- in_file: str
The path to the FASTA file you wish to mask.
- out_file: str
The path to write the resulting FASTA to.
- mask_from_beginning: int
Number of sites to mask from the beginning of each sequence (default 0)
- mask_from_end: int
Number of sites to mask from the end of each sequence (default 0)
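How the three masking parameters combine can be sketched on a single sequence string. This is an illustration, not augur's code: the function name `mask_sequence` is hypothetical, site indexes are assumed to be 0-based, and out-of-range sites are assumed to be ignored.

```python
def mask_sequence(seq, mask_sites, mask_from_beginning=0, mask_from_end=0):
    """Return seq with the requested positions overwritten as "N".

    Hypothetical helper for illustration; assumes 0-based site indexes.
    """
    chars = list(seq)
    length = len(chars)

    # Mask the leading sites.
    for i in range(min(mask_from_beginning, length)):
        chars[i] = "N"

    # Mask the trailing sites.
    for i in range(max(length - mask_from_end, 0), length):
        chars[i] = "N"

    # Mask individual sites, ignoring any that fall outside the sequence.
    for site in mask_sites:
        if 0 <= site < length:
            chars[site] = "N"

    return "".join(chars)


masked = mask_sequence("ACGTACGTAC", [4], mask_from_beginning=2, mask_from_end=1)
# masked == "NNGTNCGTAN"
```

Applying the same function to every record in a FASTA file and writing the results to `out_file` would give the file-level behavior described above.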
- augur.mask.mask_vcf(mask_sites, in_file, out_file, cleanup=True)
Mask the provided site list from a VCF file and write to a new file.
This function relies on ‘vcftools --exclude-positions’ to mask the requested sites.
- mask_sites: list[int]
A list of site indexes to exclude from the VCF.
- in_file: str
The path to the VCF file you wish to mask.
- out_file: str
The path to write the resulting VCF to.
- cleanup: bool
Clean up the intermediate files, including the VCFTools log and mask sites file.
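The sketch below shows roughly what the mask sites file and the VCFTools call could look like. It is an assumption-laden illustration, not augur's implementation: both helper names are hypothetical, site indexes are assumed to be 0-based and converted to 1-based VCF positions, and the command is only built, never executed.

```python
def exclude_positions_lines(chrom, mask_sites):
    """Build the lines of a positions file for `vcftools --exclude-positions`.

    Each line is "<CHROM><TAB><POS>". Assumes mask_sites are 0-based
    and converts them to 1-based VCF positions (an assumption of this sketch).
    """
    return [f"{chrom}\t{site + 1}" for site in sorted(set(mask_sites))]


def vcftools_command(in_file, positions_file, out_prefix):
    """Return an argument list for vcftools; the caller decides whether to run it."""
    return [
        "vcftools",
        "--vcf", in_file,
        "--exclude-positions", positions_file,
        "--recode",           # write a new VCF rather than summary statistics
        "--out", out_prefix,  # vcftools appends ".recode.vcf" to this prefix
    ]


lines = exclude_positions_lines("chr1", [10, 0, 10])
cmd = vcftools_command("in.vcf", "mask_sites.txt", "masked")
```

Passing `cmd` to `subprocess.run` would invoke VCFTools. The log and positions file left behind are the kind of intermediate files that `cleanup` refers to.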
- augur.mask.register_arguments(parser)
- augur.mask.run(args)
Mask specified sites from the VCF or FASTA.
For VCF files, masking removes the sites entirely from the VCF, which makes them effectively identical to the reference at those positions.
For FASTA files, masked sites are replaced with “N”.
If no output is specified, the input file is overwritten.