mecfs_bio.asset_generator.h_magma_asset_generator

Asset generator for running H-MAGMA against GWAS summary statistics.

H-MAGMA (Sey et al. 2020) maps SNPs to genes using tissue-specific Hi-C chromatin interaction data. The upstream project at https://github.com/thewonlab/H-MAGMA/tree/master/Input_Files publishes six pre-built .genes.annot files, one per tissue/cell type (adult brain, fetal brain, cortical neurons, midbrain dopaminergic neurons, iPSC-derived astrocytes, iPSC-derived neurons). This generator runs the MAGMA gene analysis step against each of those six annotation files and produces a gene-level Manhattan plot for each.

H-MAGMA's pre-built annotations replace the usual magma --annotate step, so this generator skips MagmaAnnotateTask and passes the static annotation directly to MagmaGeneAnalysisTask.create_with_prebuilt_annotation.

The H-MAGMA annotation files are aligned to GRCh37/hg19 and key SNPs by RSID, so the gene analysis uses the standard EUR build-37 1000 Genomes LD reference (MAGMA_EUR_BUILD_37_1K_GENOMES_EXTRACTED, file stem g1000_eur).
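
To make the "skip the annotate step" point concrete, the sketch below assembles a MAGMA gene analysis command line that takes a pre-built .genes.annot file. It is an illustration only: the helper build_gene_analysis_argv and all paths are hypothetical, and how MagmaGeneAnalysisTask actually invokes MAGMA is not shown on this page. The flags used (--bfile, --pval with the N= modifier, --gene-annot, --out) are standard MAGMA options.

```python
from pathlib import Path


def build_gene_analysis_argv(
    magma_binary: Path,
    ld_ref_prefix: Path,
    snp_p_value_file: Path,
    prebuilt_annot_file: Path,
    sample_size: int,
    out_prefix: Path,
) -> list[str]:
    """Hypothetical helper: build a MAGMA gene analysis command line.

    No --annotate call appears anywhere: the pre-built H-MAGMA annotation
    is handed straight to --gene-annot.
    """
    return [
        str(magma_binary),
        "--bfile", str(ld_ref_prefix),          # LD reference, e.g. .../g1000_eur
        "--pval", str(snp_p_value_file), f"N={sample_size}",
        "--gene-annot", str(prebuilt_annot_file),
        "--out", str(out_prefix),
    ]


# All paths below are placeholders, not files shipped by this repository.
argv = build_gene_analysis_argv(
    magma_binary=Path("magma"),
    ld_ref_prefix=Path("ld_ref") / "g1000_eur",
    snp_p_value_file=Path("snp_p_values.txt"),
    prebuilt_annot_file=Path("example.genes.annot"),
    sample_size=1000,
    out_prefix=Path("out") / "example",
)
print(" ".join(argv))
```

In the generator itself the same substitution is made six times, once per H-MAGMA annotation file.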

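Classes:

HMagmaTasks: the aggregate result of running H-MAGMA against all six annotations.
HMagmaTasksForAnnotation: all tasks produced for a single H-MAGMA tissue annotation.

Functions:

generate_h_magma_tasks: generate one MAGMA gene analysis task and one gene-level Manhattan plot task per H-MAGMA tissue annotation.

Attributes:

H_MAGMA_ANNOTATION_TASKS (list[tuple[str, Task]]): the six (label, annotation task) pairs.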

H_MAGMA_ANNOTATION_TASKS module-attribute

H_MAGMA_ANNOTATION_TASKS: list[tuple[str, Task]] = [
    ("adult_brain", ADULT_BRAIN_H_MAGMA_ANNOT_RAW),
    ("cortical_neuron", CORTICAL_NEURON_H_MAGMA_ANNOT_RAW),
    ("fetal_brain", FETAL_BRAIN_H_MAGMA_ANNOT_RAW),
    ("midbrain_da", MIDBRAIN_DA_H_MAGMA_ANNOT_RAW),
    (
        "ipsc_derived_astro",
        IPSC_DERIVED_ASTRO_H_MAGMA_ANNOT_RAW,
    ),
    (
        "ipsc_derived_neuro",
        IPSC_DERIVED_NEURO_H_MAGMA_ANNOT_RAW,
    ),
]

HMagmaTasks

The aggregate result of running H-MAGMA against all six annotations.

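Methods:

labeled_by_annotation: returns the per-annotation task bundles keyed by annotation_name.
terminal_tasks: returns the gene-level Manhattan plot task of each annotation.

Attributes: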

p_value_task instance-attribute

p_value_task: MagmaSNPFileTask

per_annotation instance-attribute

per_annotation: list[HMagmaTasksForAnnotation]

labeled_by_annotation

labeled_by_annotation() -> dict[
    str, HMagmaTasksForAnnotation
]
Source code in mecfs_bio/asset_generator/h_magma_asset_generator.py
def labeled_by_annotation(self) -> dict[str, HMagmaTasksForAnnotation]:
    return {t.annotation_name: t for t in self.per_annotation}

terminal_tasks

terminal_tasks() -> list[Task]
Source code in mecfs_bio/asset_generator/h_magma_asset_generator.py
def terminal_tasks(self) -> list[Task]:
    return [t.gene_manhattan_plot_task for t in self.per_annotation]
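
The two accessors above can be tried out with a minimal stand-in. The sketch below is not the real class: HMagmaTasksSketch and AnnotationTasksSketch are hypothetical names, plain strings take the place of the actual Task objects, and the use of dataclasses is an assumption. Only the two method bodies are copied from the source listing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnnotationTasksSketch:
    """Stand-in for HMagmaTasksForAnnotation; strings replace real tasks."""

    annotation_name: str
    gene_analysis_task: str
    gene_manhattan_plot_task: str


@dataclass(frozen=True)
class HMagmaTasksSketch:
    """Stand-in for HMagmaTasks."""

    p_value_task: str
    per_annotation: list[AnnotationTasksSketch]

    def labeled_by_annotation(self) -> dict[str, AnnotationTasksSketch]:
        # Same body as the real method: key each bundle by its label.
        return {t.annotation_name: t for t in self.per_annotation}

    def terminal_tasks(self) -> list[str]:
        # Same body as the real method: the plots are the pipeline leaves.
        return [t.gene_manhattan_plot_task for t in self.per_annotation]


tasks = HMagmaTasksSketch(
    p_value_task="p_values",
    per_annotation=[
        AnnotationTasksSketch("adult_brain", "adult_brain_genes", "adult_brain_plot"),
        AnnotationTasksSketch("fetal_brain", "fetal_brain_genes", "fetal_brain_plot"),
    ],
)

by_label = tasks.labeled_by_annotation()
print(by_label["fetal_brain"].gene_analysis_task)
print(tasks.terminal_tasks())
```

terminal_tasks returns only the Manhattan plot tasks. The gene analysis tasks and the shared SNP p-value task are upstream of the plots, so presumably requesting the plots is enough to have them built as dependencies.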

HMagmaTasksForAnnotation

All tasks produced for a single H-MAGMA tissue annotation.

Attributes:

annotation_name instance-attribute

annotation_name: str

gene_analysis_task instance-attribute

gene_analysis_task: MagmaGeneAnalysisTask

gene_manhattan_plot_task instance-attribute

gene_manhattan_plot_task: GeneManhattanPlotTask

generate_h_magma_tasks

generate_h_magma_tasks(
    base_name: str,
    gwas_parquet_with_rsids_task: Task,
    sample_size: int,
    pipes: list[DataProcessingPipe] | None = None,
) -> HMagmaTasks

Generate one MAGMA gene analysis task and one gene-level Manhattan plot task per H-MAGMA tissue annotation (six in total).

gwas_parquet_with_rsids_task must produce a parquet with the GWASLAB column names plus an RSID column (the standard input to MAGMA in this repository).

Source code in mecfs_bio/asset_generator/h_magma_asset_generator.py
def generate_h_magma_tasks(
    base_name: str,
    gwas_parquet_with_rsids_task: Task,
    sample_size: int,
    pipes: list[DataProcessingPipe] | None = None,
) -> HMagmaTasks:
    """Generate one MAGMA gene analysis task and one gene-level Manhattan plot
    task per H-MAGMA tissue annotation (six in total).

    ``gwas_parquet_with_rsids_task`` must produce a parquet with the GWASLAB
    column names plus an RSID column (the standard input to MAGMA in this
    repository).
    """
    p_value_task = MagmaSNPFileTask.create_for_magma_snp_p_value_file_compute_if_needed(
        gwas_parquet_with_rsids_task=gwas_parquet_with_rsids_task,
        asset_id=base_name + "_h_magma_snp_p_values",
        pipes=pipes,
    )

    per_annotation: list[HMagmaTasksForAnnotation] = []
    for annotation_name, annotation_task in H_MAGMA_ANNOTATION_TASKS:
        gene_analysis_task = MagmaGeneAnalysisTask.create_with_prebuilt_annotation(
            asset_id=f"{base_name}_h_magma_{annotation_name}_gene_analysis",
            magma_annotation_task=annotation_task,
            magma_p_value_task=p_value_task,
            magma_binary_task=MAGMA_1_1_BINARY_EXTRACTED,
            magma_ld_ref_task=MAGMA_EUR_BUILD_37_1K_GENOMES_EXTRACTED,
            ld_ref_file_stem="g1000_eur",
            sample_size=sample_size,
            sub_dir_suffix=PurePath("h_magma") / annotation_name,
        )
        gene_manhattan_plot_task = GeneManhattanPlotTask.create(
            asset_id=f"{base_name}_h_magma_{annotation_name}_gene_manhattan_plot",
            source=MagmaGeneSource(
                magma_task=gene_analysis_task,
                gene_thesaurus_task=GENE_THESAURUS,
                genome_build=_H_MAGMA_GENOME_BUILD,
            ),
        )
        per_annotation.append(
            HMagmaTasksForAnnotation(
                annotation_name=annotation_name,
                gene_analysis_task=gene_analysis_task,
                gene_manhattan_plot_task=gene_manhattan_plot_task,
            )
        )

    return HMagmaTasks(
        p_value_task=p_value_task,
        per_annotation=per_annotation,
    )
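
The asset naming scheme used by generate_h_magma_tasks can be previewed without the pipeline. The sketch below only reproduces the string formatting from the listing above: planned_asset_ids and ANNOTATION_NAMES are hypothetical names introduced for illustration, the labels are copied from H_MAGMA_ANNOTATION_TASKS, and "my_gwas" is a placeholder base name.

```python
# Labels copied from H_MAGMA_ANNOTATION_TASKS, in the same order.
ANNOTATION_NAMES = [
    "adult_brain",
    "cortical_neuron",
    "fetal_brain",
    "midbrain_da",
    "ipsc_derived_astro",
    "ipsc_derived_neuro",
]


def planned_asset_ids(base_name: str) -> list[str]:
    """Hypothetical helper: list the asset IDs the generator would assign.

    One shared SNP p-value asset comes first, followed by a gene analysis
    asset and a Manhattan plot asset for each annotation.
    """
    ids = [base_name + "_h_magma_snp_p_values"]
    for annotation_name in ANNOTATION_NAMES:
        ids.append(f"{base_name}_h_magma_{annotation_name}_gene_analysis")
        ids.append(f"{base_name}_h_magma_{annotation_name}_gene_manhattan_plot")
    return ids


asset_ids = planned_asset_ids("my_gwas")
for asset_id in asset_ids:
    print(asset_id)
```

Because every ID is prefixed with base_name, calling the generator for two different GWAS inputs requires two distinct base names to keep the asset IDs distinct.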