Date and time using regular expressions (regexps)
The datetime functions in lubridate can parse dates in a good number of cases, even when the date is embedded in a phrase. Observe how the mdy() function correctly picks out only the date, even though it is written in an unusual format that mixes a slash and a hyphen:
# Lubridate parsing
mdy("The championship starts on 10/11-2000")
[1] "2000-10-11"
That feature becomes even more powerful when combined with regular expressions. If we try the same mdy() function on a text that holds more than one date, it does not return a date. Instead, we get a warning message:
Warning: All formats failed to parse. No formats found.
Regular expressions can pick every date from a text. Let’s create an example text to help illustrate this exercise:
# Text
t <- "The movie was launched on 10/10/1980. It was a great hype at that time, being the most watched movie on the weeks of 10/10/1980, 10/17/1980, 10/24/1980. Around ten years later, it was chosen as the best picture of the decade. The cast received the prize on 09/20/1990."...
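As a minimal sketch of the idea, the dates can be pulled out with a regular expression first and handed to mdy() afterwards. The code below is an illustration under a few assumptions, not necessarily the approach the text goes on to use: it relies on base R's gregexpr() and regmatches() for the extraction, it uses only the visible part of the example text, the variable is named txt (a hypothetical name, chosen to avoid masking base R's t() function), and the pattern assumes every date is written as two digits, two digits, and four digits separated by slashes.

```r
library(lubridate)

# Visible portion of the example text (txt is a hypothetical name, see above)
txt <- "The movie was launched on 10/10/1980. It was a great hype at that time, being the most watched movie on the weeks of 10/10/1980, 10/17/1980, 10/24/1980. Around ten years later, it was chosen as the best picture of the decade. The cast received the prize on 09/20/1990."

# Assumed pattern: two digits, slash, two digits, slash, four digits
date_pattern <- "[0-9]{2}/[0-9]{2}/[0-9]{4}"

# gregexpr() locates every match in the text;
# regmatches() returns the matched substrings as a list (one element per input string)
date_strings <- regmatches(txt, gregexpr(date_pattern, txt))[[1]]

# Each extracted string now holds exactly one date, so mdy() can parse it
dates <- mdy(date_strings)
print(dates)
```

With this sample text the pattern finds five matches, including the repeated 10/10/1980. Wrapping date_strings in unique() before parsing would drop the repeat if only distinct dates are needed.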