Fundamentals of Regular Expressions (RegEx)
Regular expressions, or regex, are used to identify whether a pattern exists in a given sequence of characters (a string). They help in manipulating textual data, which is often a prerequisite for data science projects that involve text mining.
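As a minimal sketch of the "does this pattern exist in the string?" idea, the snippet below uses `re.search` from Python's standard `re` module. The helper name `contains_pattern`, the sample string, and the date pattern are illustrative assumptions, not part of the original text.

```python
import re

def contains_pattern(pattern, text):
    """Return True if the regex pattern occurs anywhere in text (hypothetical helper)."""
    return re.search(pattern, text) is not None

# Illustrative sample string, assumed for this sketch
sample = "Order 4521 shipped on 2019-03-14"

# Look for a date shaped like YYYY-MM-DD: four digits, dash, two digits, dash, two digits
has_date = contains_pattern(r"\d{4}-\d{2}-\d{2}", sample)

# Look for an '@' sign, which the sample string does not contain
has_at_sign = contains_pattern(r"@", sample)

print(has_date, has_at_sign)
```

`re.search` scans the whole string and returns a match object or `None`, so comparing against `None` gives a plain yes/no answer.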
Regex in the Context of Web Scraping
Web pages are often full of text, and while BeautifulSoup and XML parsers provide methods to extract raw text, they offer no method for the intelligent analysis of that text. If, as a data wrangler, you are looking for a particular piece of data (for example, email IDs or phone numbers in a special format), you would otherwise have to do a lot of string manipulation on a large corpus to extract it. Regular expressions are very powerful and save data wrangling professionals a lot of time and effort with string manipulation because they can search for complex textual patterns with wildcards of arbitrary length.
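To make the email and phone number case concrete, here is a rough sketch that pulls both out of a block of already-extracted page text with `re.findall`. The patterns are deliberately simplified assumptions (the phone pattern only covers the NNN-NNN-NNNN layout, and the email pattern does not cover every valid address), and `extract_contacts` and `page_text` are hypothetical names used only for illustration.

```python
import re

# Simplified patterns for illustration only; real-world formats vary more widely.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\b\d{3}-\d{3}-\d{4}\b")  # assumes the NNN-NNN-NNNN layout

def extract_contacts(text):
    """Return (emails, phones) found in text, in order of appearance (hypothetical helper)."""
    return EMAIL_RE.findall(text), PHONE_RE.findall(text)

# Stand-in for raw text already extracted from a web page
page_text = (
    "Contact jane.doe@example.com or call 555-123-4567. "
    "Sales: sales@example.org, 555-987-6543."
)

emails, phones = extract_contacts(page_text)
print(emails)
print(phones)
```

Compiling the patterns once with `re.compile` is a small convenience when the same search runs over many pages; the patterns themselves would need tightening for production data.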
RegEx is like a mini-programming language in itself and...