Working with date formats
Dates and times appear in many datasets and bring their own set of problems, which can make them a real thorn in a data scientist's side. Formats differ across countries and systems. For example, the United States commonly uses the month/day/year format (mm/dd/yyyy), but in Europe, you are more likely to see day/month/year (dd/mm/yyyy). A string such as 03/04/2021 could therefore mean March 4 or April 3, depending on where it came from.
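To make the format problem concrete, here is a minimal sketch using Python's standard datetime module. The value of raw is a made-up example, not taken from any dataset; the same string is parsed once under each convention.

```python
from datetime import datetime

# Hypothetical input value: ambiguous unless we know which convention produced it.
raw = "03/04/2021"

# Interpret the string as month/day/year (common in the United States).
us = datetime.strptime(raw, "%m/%d/%Y")

# Interpret the same string as day/month/year (common in Europe).
eu = datetime.strptime(raw, "%d/%m/%Y")

print(us.date())  # 2021-03-04
print(eu.date())  # 2021-04-03
```

Both calls succeed without complaint, which is why it helps to state the expected format explicitly rather than rely on guessing.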
Python has a built-in datetime object, but we'll make use of pandas' own datetime type as well. This makes it easy to perform common operations on dates, such as grabbing just the month value or specifying a particular format.
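As a rough illustration of those operations, the sketch below converts a column of strings to pandas' datetime type, pulls out the month, and writes the dates back out in a chosen format. The DataFrame and its column names (order_date, month, us_style) are hypothetical and exist only for this example.

```python
import pandas as pd

# Hypothetical data: dates stored as plain strings.
df = pd.DataFrame({"order_date": ["2021-01-15", "2021-02-28", "2021-12-01"]})

# Convert the strings to pandas' datetime type, stating the input format explicitly.
df["order_date"] = pd.to_datetime(df["order_date"], format="%Y-%m-%d")

# Grab just the month value through the .dt accessor.
df["month"] = df["order_date"].dt.month

# Render the dates as strings in month/day/year format.
df["us_style"] = df["order_date"].dt.strftime("%m/%d/%Y")

print(df)
```

Once a column has the datetime type, the .dt accessor exposes other parts of the date, such as year and day, in the same way.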
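Time zones also come into play. Rules for local time differ around the world: regions use different offsets, and not all of them observe daylight saving time or switch on the same dates. This is one reason UTC (Coordinated Universal Time) has become more common. UTC is a fixed standard that can be used as a common reference no matter what your specific time zone is.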
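One way to see why a fixed reference helps is to convert local times to UTC. The sketch below assumes two made-up timestamps recorded at noon in New York, one in winter and one in summer, and assumes a time zone database is available to pandas.

```python
import pandas as pd

# Hypothetical local timestamps with no time zone attached (naive datetimes).
local = pd.Series(pd.to_datetime(["2021-01-15 12:00", "2021-07-15 12:00"]))

# Declare that these wall-clock times were recorded in New York.
eastern = local.dt.tz_localize("America/New_York")

# Convert to UTC so both values share one unambiguous reference.
utc = eastern.dt.tz_convert("UTC")

# Noon in January maps to 17:00 UTC, while noon in July maps to 16:00 UTC,
# because daylight saving time changes the local offset.
print(utc.dt.hour.tolist())  # [17, 16]
```

The same wall-clock time lands on different UTC hours depending on the season, which is the kind of rule that storing timestamps in UTC lets you set aside.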
Specifying a date field in pandas
The easiest way to call out...