Extracting substrings
Extracting substrings is a core part of string manipulation. It means taking a portion of a string and using it as a new column or as part of your transformation logic. Knowing how to extract substrings helps you clean, transform, and organize your data into a more useful format.
In this recipe, we’ll cover how to extract substrings by slicing and by using regular expressions (regex).
How to do it...
Here’s how you extract substrings from strings in Polars:
- Use .str.slice() to extract a substring. This method takes two parameters: offset and length. The following example specifies only the offset:
    df.select(
        'userName',
        pl.col('userName').str.slice(3).alias('4thCharAndAfter')
    ).head()
The preceding code will return the following output:
Figure 6.11 – A new column with userName starting from the 4th character
- You can specify...