python-polars string concatenation of two existing columns
I want to concatenate the last letter of two existing columns and create a new column from the result, using a polars.LazyFrame. For example, in pandas I can achieve this with the following code:

    import pandas as pd

    df = pd.DataFrame({"col1": ["abc", "def"], "col2": ["ghi", "jkl"]})
    df["last_letters_concat"] = df["col1"].str.strip().str[-1] + df["col2"].str.strip().str[-1]
    print(df)

      col1 col2 last_letters_concat
    0  abc  ghi                  ci
    1  def  jkl                  fl

My attempt in polars:

    import polars as pl
    from polars import col

    # using same df
    df.lazy().with_columns(
        (pl.col("col1")[-1] + pl.col("col2"))[-1].alias("last_letters_concat")
    ).collect()

How can I do this?
You can use the str.slice expression for that. Below I show two examples that produce the same result.

    import polars as pl

    df = pl.DataFrame({
        "col1": ["abc", "def"],
        "col2": ["ghi", "jkl"],
    })

    # concat the last letters with concat_str
    out1 = df.select(
        pl.concat_str([pl.col("col1").str.slice(-1), pl.col("col2").str.slice(-1)])
    )

    # concat the last letters of two specific columns with the + operator
    out2 = df.select(
        pl.col("col1").str.slice(-1) + pl.col("col2").str.slice(-1)
    )

    assert out1.equals(out2)
    print(out1)

    shape: (2, 1)
    ┌──────┐
    │ col1 │
    │ ---  │
    │ str  │
    ╞══════╡
    │ ci   │
    │ fl   │
    └──────┘

I recommend using the concat_str expression, as it has O(n) complexity, where n is the number of columns you add, whereas the addition operator has O(n^2) complexity.

EDIT: as of polars >= 0.14.12, the optimizer ensures that the complexity is always linear.


