mecfs_bio.build_system.task.two_sample_mr_task
Task to apply two-sample Mendelian randomization to GWAS data, together with associated auxiliary functions.
Classes:
- ClumpOptions
- MRInputColSpec
- MRReportOptions
- SteigerFilteringOptions
- TwoSampleMRConfig
- TwoSampleMRResult
- TwoSampleMRTask – Task to run Mendelian randomization using the R package TwoSampleMR.
Functions:
- convert_outcome_and_exposure_to_r
- format_data
- format_data_no_conversion
- gen_mr_report
- harmonize_data
- harmonize_data_no_conversion
- optionally_clump_exposure_data
- optionally_clump_exposure_data_no_conversion
- pre_filter_outcome_variants
- run_tsmr_on_formatted_data
- run_tsmr_on_harmonized_data
- run_tsmr_on_harmonized_data_no_conversion
- run_two_sample_mr
- steiger_filtering_write_output
Attributes:
- GWASLAB_MR_INPUT_COL_SPEC
- IgnoreOrRaise
- MAIN_RESULT_DF_PATH
- NEEDED_COLS
- REPORT_SUBDIR_PATH
- RPackageType
- STEIGER_RESULT_PATH
- SUN_ET_AL_MR_INPUT_COL_SPEC_hg37
- TSM_BETA_COL
- TSM_CHR_COL
- TSM_EAF_COL
- TSM_EFFECT_ALLELE_COL
- TSM_N_CASE
- TSM_N_CONTROL
- TSM_OTHER_ALLELE_COL
- TSM_OUTPUT_B_COL
- TSM_OUTPUT_EXPOSURE_COL
- TSM_OUTPUT_METHOD_COL
- TSM_OUTPUT_NSNP_COL
- TSM_OUTPUT_P_COL
- TSM_OUTPUT_SE_COL
- TSM_OUTPUT_STEIGER_DIR_COL
- TSM_OUTPUT_STEIGER_P_COL
- TSM_PHENOTYPE
- TSM_POS_COL
- TSM_P_VALUE
- TSM_RSID_COL
- TSM_SAMPLE_SIZE_COL
- TSM_SE_COL
- TSM_UNITS_COL
- logger
GWASLAB_MR_INPUT_COL_SPEC
module-attribute
GWASLAB_MR_INPUT_COL_SPEC = MRInputColSpec(
    rsid_col=GWASLAB_RSID_COL,
    beta_col=GWASLAB_BETA_COL,
    se_col=GWASLAB_SE_COL,
    ea_col=GWASLAB_EFFECT_ALLELE_COL,
    nea_col=GWASLAB_NON_EFFECT_ALLELE_COL,
    eaf_col=GWASLAB_EFFECT_ALLELE_FREQ_COL,
    phenotype_col=None,
    pos_col=GWASLAB_POS_COL,
    chrom_col=GWASLAB_CHROM_COL,
    n_case_col=GWASLAB_N_CASE_COL,
    n_control_col=GWASLAB_N_CONTROL_COL,
    pval_col=GWASLAB_P_COL,
    errors="ignore",
)
NEEDED_COLS
module-attribute
SUN_ET_AL_MR_INPUT_COL_SPEC_hg37
module-attribute
SUN_ET_AL_MR_INPUT_COL_SPEC_hg37 = MRInputColSpec(
    rsid_col="rsID",
    beta_col="BETA",
    se_col="SE",
    ea_col="A1",
    nea_col="A0",
    eaf_col="A1FREQ",
    phenotype_col="protein_exposure_id",
    chrom_col="CHROM_hg37",
    pos_col="GENPOS_hg37",
    pval_col=GWASLAB_P_COL,
)
ClumpOptions
MRInputColSpec
Methods:
- make_renamer
Attributes:
- beta_col (str)
- chrom_col (str | None)
- ea_col (str)
- eaf_col (str | None)
- errors (IgnoreOrRaise)
- n_case_col (str | None)
- n_control_col (str | None)
- nea_col (str | None)
- phenotype_col (str | None)
- pos_col (str | None)
- pval_col (str | None)
- rsid_col (str)
- sample_size_col (str | None)
- se_col (str)
make_renamer
Source code in mecfs_bio/build_system/task/two_sample_mr_task.py
MRReportOptions
SteigerFilteringOptions
Attributes:
- drop_failures (bool)
- p_value_thresh (float | None)
TwoSampleMRConfig
Attributes:
- clump_exposure_data (ClumpOptions | None)
- pre_filter_outcome_variants (bool)
- report_options (MRReportOptions | None)
- steiger_filter (SteigerFilteringOptions | None)
pre_filter_outcome_variants
class-attribute
instance-attribute
TwoSampleMRResult
TwoSampleMRTask
Bases: Task
Task to run Mendelian randomization using the R package TwoSampleMR, which is accessed from Python via rpy2.
Note that some of the calls to TwoSampleMR below (such as clumping) require access to the OpenGWAS database, which in turn requires an access token.
You can get a token here: https://api.opengwas.io/
Add the following line to your .Renviron file:
OPENGWAS_JWT=
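Before launching a task that needs OpenGWAS, it can help to confirm that the token line is actually present. The following is a small sketch of such a check; parse_renviron and has_opengwas_token are hypothetical helpers (not part of this module), and the parser only handles simple KEY=VALUE lines, which is an assumption about how the file is written.

```python
from pathlib import Path


def parse_renviron(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    env: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        env[key.strip()] = value.strip().strip("\"'")
    return env


def has_opengwas_token(renviron_path: Path) -> bool:
    """Return True if the file defines a non-empty OPENGWAS_JWT entry."""
    if not renviron_path.is_file():
        return False
    env = parse_renviron(renviron_path.read_text())
    return bool(env.get("OPENGWAS_JWT"))
```

A typical call would be has_opengwas_token(Path.home() / ".Renviron"), assuming the file lives in the home directory.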
Methods:
- create
- execute
Attributes:
- config (TwoSampleMRConfig)
- deps (list[Task])
- exposure_col_spec (MRInputColSpec | None)
- exposure_data_task (Task)
- exposure_id (AssetId)
- exposure_meta (Meta)
- exposure_pipe (DataProcessingPipe)
- meta (Meta)
- mr_method_list (list[str] | None)
- outcome_col_spec (MRInputColSpec | None)
- outcome_data_task (Task)
- outcome_id (AssetId)
- outcome_meta (Meta)
- outcome_pipe (DataProcessingPipe)
exposure_col_spec
class-attribute
instance-attribute
create
classmethod
create(
    asset_id: str,
    outcome_data_task: Task,
    exposure_data_task: Task,
    config: TwoSampleMRConfig,
    exposure_pipe: DataProcessingPipe,
    outcome_pipe: DataProcessingPipe,
    exposure_col_spec: MRInputColSpec,
    outcome_col_spec: MRInputColSpec,
    method_list: list[str] | None = None,
)
Source code in mecfs_bio/build_system/task/two_sample_mr_task.py
execute
Source code in mecfs_bio/build_system/task/two_sample_mr_task.py
convert_outcome_and_exposure_to_r
convert_outcome_and_exposure_to_r(
    exposure_df: DataFrame, outcome_df: DataFrame
) -> tuple[RDataFrame, RDataFrame]
Source code in mecfs_bio/build_system/task/two_sample_mr_task.py
format_data
format_data(
    exposure_df: DataFrame,
    outcome_df: DataFrame,
    tsmr: RPackageType,
) -> tuple[pd.DataFrame, pd.DataFrame]
Source code in mecfs_bio/build_system/task/two_sample_mr_task.py
format_data_no_conversion
format_data_no_conversion(
    exposure_rdf: DataFrame,
    outcome_rdf: DataFrame,
    tsmr: RPackageType,
) -> tuple[RDataFrame, RDataFrame]
Source code in mecfs_bio/build_system/task/two_sample_mr_task.py
gen_mr_report
harmonize_data
harmonize_data(
    formatted_exposure: DataFrame,
    formatted_outcome: DataFrame,
    tsmr: RPackageType,
) -> pd.DataFrame
Source code in mecfs_bio/build_system/task/two_sample_mr_task.py
harmonize_data_no_conversion
harmonize_data_no_conversion(
    formatted_exposure: DataFrame,
    formatted_outcome: DataFrame,
    tsmr: RPackageType,
) -> RDataFrame
Source code in mecfs_bio/build_system/task/two_sample_mr_task.py
optionally_clump_exposure_data
optionally_clump_exposure_data(
    formatted_exposure: DataFrame,
    clump_options: ClumpOptions | None,
    tsmr: RPackageType,
)
Source code in mecfs_bio/build_system/task/two_sample_mr_task.py
optionally_clump_exposure_data_no_conversion
optionally_clump_exposure_data_no_conversion(
    formatted_exposure: DataFrame,
    clump_options: ClumpOptions | None,
    tsmr: RPackageType,
) -> RDataFrame
Source code in mecfs_bio/build_system/task/two_sample_mr_task.py
pre_filter_outcome_variants
pre_filter_outcome_variants(
    exposure_df: DataFrame,
    outcome_df: DataFrame,
    config: TwoSampleMRConfig,
) -> pd.DataFrame
Source code in mecfs_bio/build_system/task/two_sample_mr_task.py
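The function body is not reproduced here. Judging only from the name and signature, one plausible behavior is that the outcome data is restricted to variants that also appear in the exposure data before any further processing. The sketch below illustrates that idea with plain lists of dicts instead of data frames; keep_shared_variants and the "rsid" key are hypothetical, and the real function may filter differently or consult config.

```python
def keep_shared_variants(
    exposure_rows: list[dict],
    outcome_rows: list[dict],
    rsid_key: str = "rsid",
) -> list[dict]:
    """Keep only outcome rows whose variant ID also occurs in the exposure rows.

    Assumption: variants are matched by a single identifier column.
    """
    exposure_ids = {row[rsid_key] for row in exposure_rows}
    return [row for row in outcome_rows if row[rsid_key] in exposure_ids]


exposure_example = [{"rsid": "rs1"}, {"rsid": "rs2"}]
outcome_example = [
    {"rsid": "rs2", "beta": 0.1},
    {"rsid": "rs3", "beta": -0.2},
]
filtered_outcome = keep_shared_variants(exposure_example, outcome_example)
```

Filtering early in this way would shrink the outcome table before it is converted for R, which may matter when the outcome GWAS is much larger than the set of exposure instruments.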
run_tsmr_on_formatted_data
run_tsmr_on_formatted_data(
    formatted_exposure: DataFrame,
    formatted_outcome: DataFrame,
    config: TwoSampleMRConfig,
    tsmr: RPackageType,
) -> TwoSampleMRResult
Source code in mecfs_bio/build_system/task/two_sample_mr_task.py
run_tsmr_on_harmonized_data
Source code in mecfs_bio/build_system/task/two_sample_mr_task.py
run_tsmr_on_harmonized_data_no_conversion
run_tsmr_on_harmonized_data_no_conversion(
    harmonized: DataFrame,
    tsmr: RPackageType,
    method_list: list[str] | None = None,
) -> RDataFrame
Source code in mecfs_bio/build_system/task/two_sample_mr_task.py
run_two_sample_mr
run_two_sample_mr(
    exposure_df: DataFrame,
    outcome_df: DataFrame,
    config: TwoSampleMRConfig,
) -> TwoSampleMRResult
Source code in mecfs_bio/build_system/task/two_sample_mr_task.py
steiger_filtering_write_output
steiger_filtering_write_output(
    harmonized: DataFrame,
    options: SteigerFilteringOptions | None,
    scratch_dir: Path,
    tsmr: RPackageType,
) -> RDataFrame