Working with Strings
In programming, strings are textual information: a single letter, a word, or a phrase. More generally, anything enclosed in single or double quotes is understood by the computer as a string, and assigning it to a variable stores it for later use. See the following code and comments:
# A quoted text that is not assigned to a variable is evaluated but not stored.
"This is a text."
# These are strings
my_string1 <- "a"
my_string2 <- "Hello, World! I am learning!"
my_string3 <- "42"
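As a small illustration of the point that quotes are what make a value a string, the sketch below checks the type of a quoted number in base R. The variable name quoted_number is hypothetical and chosen only for this example.

```r
# Hypothetical variable for illustration: a number written inside quotes.
quoted_number <- "42"

# Quoted values have class "character", even when they look numeric.
class(quoted_number)         # "character"
is.character(quoted_number)  # TRUE

# Single and double quotes create the same string.
identical('Hello', "Hello")  # TRUE

# Convert the string to a number before doing arithmetic with it.
as.numeric(quoted_number) + 1  # 43
```

Trying quoted_number + 1 without the conversion raises an error, since R does not do arithmetic on character values.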
Manipulating strings is a valuable skill because so much useful data on the internet comes in textual format. Natural Language Processing (NLP) is one of the largest areas in data science, and much of it relies on wrangling strings.
Most of what can be done with strings involves tasks such as the following:
- Parsing: Separating parts of the text that are divided by a pattern, extracting parts of it and combining words...
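A minimal sketch of the parsing tasks just described, using only base R functions (strsplit, substr, and paste). The variable names are hypothetical, and this is one possible approach rather than the only way to do it.

```r
# Hypothetical example text.
sentence <- "Hello, World! I am learning!"

# Separate the text at every space, the pattern that divides the words.
# strsplit() returns a list, so [[1]] takes the character vector inside it.
words <- strsplit(sentence, split = " ")[[1]]
# words: "Hello," "World!" "I" "am" "learning!"

# Extract part of the text by character position.
greeting <- substr(sentence, start = 1, stop = 5)
# greeting: "Hello"

# Combine words back into a single string, separated by spaces.
combined <- paste(words[3], words[4], words[5])
# combined: "I am learning!"
```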