mecfs_bio.build_system.task.assign_rsids_via_snp151_task
Assign RSIDs to variants by joining against a database file. Only works for single-nucleotide variants.
Classes:
- AssignRSIDSToSNPsViaSNP151Task – Assigns RSIDs to the SNP genetic variants in a file of GWAS summary statistics
Functions:
AssignRSIDSToSNPsViaSNP151Task
Bases: Task
Assigns RSIDs to the SNP genetic variants in a file of GWAS summary statistics. Uses the SNP151 database file. Assumes that the summary statistics file follows the GWASLAB naming conventions and that both input files are in Parquet format.
Note that non-SNP variants (e.g. insertions or deletions) are excluded: the task operates exclusively on SNPs.
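The join described above can be illustrated with a minimal, standard-library sketch. It is not the task's actual implementation. The summary-statistics keys (`CHR`, `POS`, `EA`, `NEA`, `rsID`) follow GWASLAB-style naming, while the database keys (`chrom`, `pos`, `ref`, `alt`, `rsid`), the helper names, the allele-matching rule, and the choice to drop unmatched rows are all assumptions made for illustration. The RSID values are made up.

```python
def is_snp(ref: str, alt: str) -> bool:
    """Return True when both alleles are a single nucleotide."""
    bases = {"A", "C", "G", "T"}
    return ref in bases and alt in bases


def assign_rsids(sumstats: list[dict], database: list[dict]) -> list[dict]:
    """Attach an rsID to each SNP row that has a match in the database.

    Hypothetical helper: matches on chromosome, position, and the unordered
    pair of alleles. Non-SNP rows and unmatched rows are dropped.
    """
    # Index database SNPs by (chromosome, position, allele pair).
    index = {}
    for row in database:
        if is_snp(row["ref"], row["alt"]):
            key = (row["chrom"], row["pos"], frozenset((row["ref"], row["alt"])))
            index[key] = row["rsid"]

    annotated = []
    for row in sumstats:
        if not is_snp(row["NEA"], row["EA"]):
            continue  # insertions, deletions, and other non-SNPs are excluded
        key = (row["CHR"], row["POS"], frozenset((row["NEA"], row["EA"])))
        rsid = index.get(key)
        if rsid is not None:
            annotated.append({**row, "rsID": rsid})
    return annotated


sumstats = [
    {"CHR": 1, "POS": 1000, "EA": "A", "NEA": "G"},   # SNP with a database match
    {"CHR": 1, "POS": 2000, "EA": "AT", "NEA": "A"},  # insertion, excluded
    {"CHR": 2, "POS": 500, "EA": "C", "NEA": "T"},    # SNP without a match
]
database = [
    {"chrom": 1, "pos": 1000, "ref": "G", "alt": "A", "rsid": "rs0000001"},  # made-up ID
    {"chrom": 1, "pos": 2000, "ref": "A", "alt": "AT", "rsid": "rs0000002"},  # made-up ID
]

annotated = assign_rsids(sumstats, database)
```

In this sketch only the first row survives. Whether the real task keeps unmatched SNPs (a left join) or drops them (an inner join) is not stated on this page.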
Methods:
Attributes:
- chrom_replace_rules (Mapping[str, int])
- database_id (AssetId)
- database_meta (Meta)
- deps (list[Task])
- meta (Meta)
- raw_snp_data_task (Task)
- snp151_database_file_task (Task)
- snp_data_id (AssetId)
- snp_data_meta (Meta)
- valid_chroms (list[str])
create
classmethod
create(
snp151_database_file_task: Task,
raw_snp_data_task: Task,
asset_id: str,
valid_chroms: list[str],
chrom_replace_rules: Mapping[str, int],
)
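The `valid_chroms` and `chrom_replace_rules` parameters are not documented on this page. One plausible reading, sketched below purely as an assumption, is that database rows are first restricted to the chromosome labels in `valid_chroms`, and the surviving labels are then mapped to integers through `chrom_replace_rules` so they can be compared with the integer chromosome codes in the summary statistics. The helper `normalize_chroms`, the row layout, and the example labels are hypothetical.

```python
from collections.abc import Mapping


def normalize_chroms(
    rows: list[dict],
    valid_chroms: list[str],
    chrom_replace_rules: Mapping[str, int],
) -> list[dict]:
    """Keep rows on valid chromosomes and map their labels to integers.

    Hypothetical helper illustrating one possible use of the two parameters.
    """
    valid = set(valid_chroms)
    normalized = []
    for row in rows:
        chrom = row["chrom"]
        if chrom not in valid:
            continue  # e.g. unplaced contigs or alternate haplotypes
        # Labels without a rule are left unchanged (an assumption).
        normalized.append({**row, "chrom": chrom_replace_rules.get(chrom, chrom)})
    return normalized


# Assumed example values: autosomes plus X, with X encoded as 23.
valid_chroms = [f"chr{i}" for i in range(1, 23)] + ["chrX"]
chrom_replace_rules = {f"chr{i}": i for i in range(1, 23)}
chrom_replace_rules["chrX"] = 23

rows = [
    {"chrom": "chr1", "pos": 1000},
    {"chrom": "chrX", "pos": 5},
    {"chrom": "chrUn_example", "pos": 7},  # not in valid_chroms, dropped
]
normalized = normalize_chroms(rows, valid_chroms, chrom_replace_rules)
```

If this reading is right, the same two values would be passed as the `valid_chroms` and `chrom_replace_rules` arguments of `create`.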
Source code in mecfs_bio/build_system/task/assign_rsids_via_snp151_task.py
execute
create_new_meta
create_new_meta(
source_meta: Meta,
asset_id: str,
format: DataFrameFormat = DataFrameParquetFormat(),
extension=".parquet",
) -> Meta