mecfs_bio.build_system.task.pipes.winsorize_all

Classes:

WinsorizeAllPipe

Bases: DataProcessingPipe

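Methods:

process: clips every column not listed in cols_to_exclude at an upper bound of max_value.

Attributes:

cols_to_exclude: names of the columns that process leaves unchanged.

max_value: upper bound applied to every other column.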

cols_to_exclude instance-attribute

cols_to_exclude: Sequence[str]

max_value instance-attribute

max_value: float

process

process(x: LazyFrame) -> narwhals.LazyFrame
Source code in mecfs_bio/build_system/task/pipes/winsorize_all.py
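def process(self, x: narwhals.LazyFrame) -> narwhals.LazyFrame:
    # Materialize the lazy frame as a polars DataFrame (this runs the query eagerly).
    x = x.collect().to_polars()
    schema = x.collect_schema()
    for col_name in schema.keys():
        if col_name not in self.cols_to_exclude:
            # Cap values above max_value; there is no lower bound, so small values pass through.
            x = x.with_columns(
                pl.col(col_name).clip(upper_bound=self.max_value).alias(col_name)
            )

    # Hand the result back as a narwhals LazyFrame.
    return narwhals.from_native(x.lazy())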