
mecfs_bio.build_system.task.mixer.mixer_univariate_combine

Task to combine MiXeR run outputs to produce a single result.

Classes: MixerRunSource, MixerUnivariateCombine

Attributes:

COMBINED_FIT_FILENAME_PREFIX module-attribute

COMBINED_FIT_FILENAME_PREFIX = 'trait1.fit'

COMBINED_TEST_FILENAME_PREFIX module-attribute

COMBINED_TEST_FILENAME_PREFIX = 'trait1.test'

MixerRunSource

Attributes:

rep instance-attribute

rep: int

task instance-attribute

task: Task

MixerUnivariateCombine

Bases: Task

Task to combine MiXeR run outputs to produce a single result.

The MiXeR authors have split up the genetic variants in their reference panel into 20 random subsets. The recommended MiXeR workflow is to run MiXeR on your GWAS data using each of these 20 random subsets, then combine the results.

Use this task to combine the results of those runs into a single fit result and a single test result.
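As a rough illustration of how the per-subset outputs relate to one another, the sketch below expands a filename pattern containing an "@" placeholder into one filename per replicate, mirroring the .replace("@", str(source_run.rep)) call in execute. The pattern value "trait1.fit.rep@.json", the replicate numbering from 1 to 20, and the helper name per_rep_filenames are assumptions made for illustration; the actual value of MIXER_FIT_JSON_PATTERN is not shown on this page.

```python
# ASSUMPTION: a stand-in for MIXER_FIT_JSON_PATTERN, whose real value
# is not shown on this page.
ASSUMED_FIT_JSON_PATTERN = "trait1.fit.rep@.json"

# The class docstring describes 20 random subsets of the reference panel.
N_REPS = 20


def per_rep_filenames(pattern: str, n_reps: int) -> list[str]:
    """Replace the '@' placeholder with each replicate index (1..n_reps)."""
    return [pattern.replace("@", str(rep)) for rep in range(1, n_reps + 1)]


fit_files = per_rep_filenames(ASSUMED_FIT_JSON_PATTERN, N_REPS)
print(fit_files[0])
print(fit_files[-1])
print(len(fit_files))
```

The combine step receives the pattern with the "@" still in place, while the per-replicate JSON edits in execute operate on the expanded filenames.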

Methods: create, execute

Attributes:

deps property

deps: list[Task]

meta property

meta: Meta

mixer_source_runs instance-attribute

mixer_source_runs: Sequence[MixerRunSource]

trait_name instance-attribute

trait_name: str

create classmethod

create(
    asset_id: str,
    mixer_source_runs: Sequence[MixerRunSource],
    trait_name: str,
)
Source code in mecfs_bio/build_system/task/mixer/mixer_univariate_combine.py
@classmethod
def create(
    cls, asset_id: str, mixer_source_runs: Sequence[MixerRunSource], trait_name: str
):
    # The output meta is modeled on the first source run, so at least one run is required.
    assert len(mixer_source_runs) >= 1
    source_meta = mixer_source_runs[0].task.meta
    meta: Meta
    if isinstance(source_meta, SimpleDirectoryMeta):
        meta = SimpleDirectoryMeta(
            id=AssetId(asset_id),
        )
    elif isinstance(source_meta, ResultDirectoryMeta):
        meta = ResultDirectoryMeta(
            id=AssetId(asset_id),
            trait=source_meta.trait,
            project=source_meta.project,
            sub_dir=source_meta.sub_dir,
        )
    else:
        raise NotImplementedError(f"Unknown source meta: {source_meta}")
    return cls(
        mixer_source_runs=mixer_source_runs,
        meta=meta,
        trait_name=trait_name,
    )
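The dispatch in create can be sketched in isolation: the combined task's metadata copies its descriptive fields from the first source run only, and later runs are not consulted. The classes SimpleMeta and ResultMeta and the function derive_output_meta below are hypothetical stand-ins, not the project's SimpleDirectoryMeta or ResultDirectoryMeta, and the sketch raises ValueError where the original uses an assert.

```python
from dataclasses import dataclass
from typing import Sequence, Union


# HYPOTHETICAL stand-ins for the project's meta classes.
@dataclass(frozen=True)
class SimpleMeta:
    id: str


@dataclass(frozen=True)
class ResultMeta:
    id: str
    trait: str
    project: str


AnyMeta = Union[SimpleMeta, ResultMeta]


def derive_output_meta(asset_id: str, source_metas: Sequence[AnyMeta]) -> AnyMeta:
    """Build the output meta from the first source meta only."""
    if not source_metas:
        # Unlike an assert, this check also runs under `python -O`.
        raise ValueError("at least one source run is required")
    first = source_metas[0]
    if isinstance(first, SimpleMeta):
        return SimpleMeta(id=asset_id)
    if isinstance(first, ResultMeta):
        # Descriptive fields are inherited; only the id is new.
        return ResultMeta(id=asset_id, trait=first.trait, project=first.project)
    raise NotImplementedError(f"Unknown source meta: {first}")


sources = [
    ResultMeta(id="run1", trait="example_trait", project="example_project"),
    SimpleMeta(id="run2"),  # ignored: only the first entry is inspected
]
combined_meta = derive_output_meta("combined", sources)
print(combined_meta)
```

One consequence of this design is that mixing source runs with different meta types goes unnoticed, so callers may want to pass runs that share the same trait and project.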

execute

execute(scratch_dir: Path, fetch: Fetch, wf: WF) -> Asset
Source code in mecfs_bio/build_system/task/mixer/mixer_univariate_combine.py
def execute(self, scratch_dir: Path, fetch: Fetch, wf: WF) -> Asset:
    with tempfile.TemporaryDirectory() as tmpdir_name:
        tmp_path = Path(tmpdir_name)
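        # Map the temporary directory to _CONTAINER_AGGREGATION_DIR for the mixer_figures invocations.
        agg_mounts = {tmp_path.resolve(): _CONTAINER_AGGREGATION_DIR}
        # Copy every run's output into the shared directory and fix the trait path in its JSON files.
        for source_run in self.mixer_source_runs: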
            source_asset = fetch(source_run.task.asset_id)
            assert isinstance(source_asset, DirectoryAsset)
            shutil.copytree(source_asset.path, tmp_path, dirs_exist_ok=True)
            _edit_json_to_fix_trait_path(
                tmp_path / MIXER_FIT_JSON_PATTERN.replace("@", str(source_run.rep)),
                trait_name=self.trait_name,
            )
            _edit_json_to_fix_trait_path(
                tmp_path
                / MIXER_TEST_JSON_PATTERN.replace("@", str(source_run.rep)),
                trait_name=self.trait_name,
            )

        # Combine the per-replicate fit JSON files into a single fit result.
        invoke_mixer_figures(
            args=[
                "combine",
                "--json",
                str(_CONTAINER_AGGREGATION_DIR / MIXER_FIT_JSON_PATTERN),
                "--out",
                str(_CONTAINER_AGGREGATION_DIR / COMBINED_FIT_FILENAME_PREFIX),
            ],
            extra_mounts=agg_mounts,
        )

        # Combine the per-replicate test JSON files into a single test result.
        invoke_mixer_figures(
            args=[
                "combine",
                "--json",
                str(_CONTAINER_AGGREGATION_DIR / MIXER_TEST_JSON_PATTERN),
                "--out",
                str(_CONTAINER_AGGREGATION_DIR / COMBINED_TEST_FILENAME_PREFIX),
            ],
            extra_mounts=agg_mounts,
        )
        # Move the combined JSON files out of the temporary directory into scratch_dir.
        Path(tmp_path / (COMBINED_FIT_FILENAME_PREFIX + ".json")).rename(
            scratch_dir / (COMBINED_FIT_FILENAME_PREFIX + ".json")
        )
        Path(tmp_path / (COMBINED_TEST_FILENAME_PREFIX + ".json")).rename(
            scratch_dir / (COMBINED_TEST_FILENAME_PREFIX + ".json")
        )
        return DirectoryAsset(scratch_dir)
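
The staging pattern used by execute can be tried without MiXeR: several source directories are merged into one temporary directory with shutil.copytree(..., dirs_exist_ok=True), and a result file is then moved to the output directory. The sketch below is a minimal stand-alone version under stated assumptions: the helper names merge_into_staging and move_output, the directory names, and the file names are hypothetical, and the combined file is written by hand as a placeholder for what the combine step would produce. It also swaps Path.rename for shutil.move, since Path.rename can raise OSError when the temporary directory and the destination are on different filesystems, while shutil.move falls back to copying.

```python
import shutil
import tempfile
from pathlib import Path


def merge_into_staging(source_dirs: list[Path], staging: Path) -> None:
    """Copy the contents of every source directory into one staging directory."""
    for src in source_dirs:
        # dirs_exist_ok=True (Python 3.8+) allows copying into an existing directory.
        shutil.copytree(src, staging, dirs_exist_ok=True)


def move_output(staging: Path, dest_dir: Path, filename: str) -> Path:
    """Move one file from staging to dest_dir, even across filesystems."""
    return Path(shutil.move(str(staging / filename), str(dest_dir / filename)))


with tempfile.TemporaryDirectory() as root_name:
    root = Path(root_name)

    # Two fake per-replicate run directories (hypothetical file names).
    sources = []
    for rep in (1, 2):
        src = root / f"run{rep}"
        src.mkdir()
        (src / f"trait1.fit.rep{rep}.json").write_text("{}")
        sources.append(src)

    staging = root / "staging"
    staging.mkdir()
    scratch = root / "scratch"
    scratch.mkdir()

    merge_into_staging(sources, staging)
    staged_names = sorted(p.name for p in staging.iterdir())

    # Placeholder for the file the combine step would write.
    (staging / "trait1.fit.json").write_text("{}")
    moved = move_output(staging, scratch, "trait1.fit.json")
    moved_exists = moved.exists()

print(staged_names)
print(moved_exists)
```

Because every run is copied into the same directory, files with identical names in different runs overwrite one another; the per-replicate index in each filename is what keeps the runs distinct.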