mecfs_bio.build_system.task.pipes.str_split_exact_col

Classes:

SplitExactColPipe

Bases: DataProcessingPipe

Methods:

process

Attributes:

col_to_split, new_col_names, split_by

col_to_split instance-attribute

col_to_split: str

new_col_names instance-attribute

new_col_names: tuple[str, ...]

split_by instance-attribute

split_by: str

process

process(x: narwhals.LazyFrame) -> narwhals.LazyFrame
Source code in mecfs_bio/build_system/task/pipes/str_split_exact_col.py
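def process(self, x: narwhals.LazyFrame) -> narwhals.LazyFrame:
    # Note: collect() materializes the lazy frame eagerly before the split.
    xp = (
        x.collect()
        .to_polars()
        .with_columns(
            pl.col(self.col_to_split)
            # n splits yield a struct with n + 1 fields, one per new column.
            .str.split_exact(self.split_by, n=len(self.new_col_names) - 1)
            .struct.rename_fields(self.new_col_names)
            # Temporary struct column; it is replaced by its fields below.
            .alias("fields")
        )
        # Expand the struct into one top-level column per field.
        .unnest("fields")
    )
    # Wrap the Polars result back into a narwhals LazyFrame.
    return narwhals.from_native(xp).lazy()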