7.2.2 Approach
Dates and times often have bewildering formats. This is particularly true in the US, where dates are often written as numbers in month/day/year format. Using year/month/day puts the values in order of significance. Using day/month/year is the reverse order of significance. The US ordering is simply strange.
This makes it difficult to inspect completely unknown data when no metadata explains the serialization format. A date like 01/02/03 could be read as January 2, February 1, or February 3, depending on which field order is assumed.
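The ambiguity is easy to demonstrate. The following is a minimal sketch that parses the same text with three plausible layouts. The `candidate_formats` mapping and its labels are illustrative assumptions, not a complete list of the formats seen in practice.

```python
from datetime import datetime

text = "01/02/03"

# Hypothetical set of layouts to try; real data may use others.
candidate_formats = {
    "month/day/year": "%m/%d/%y",
    "day/month/year": "%d/%m/%y",
    "year/month/day": "%y/%m/%d",
}

# Parse the same text under each assumed layout.
interpretations = {
    label: datetime.strptime(text, fmt).date()
    for label, fmt in candidate_formats.items()
}

for label, value in interpretations.items():
    print(f"{label:>14}: {value.isoformat()}")
```

All three layouts parse without error and yield three different dates, so a successful parse alone says nothing about whether the chosen format is the right one.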
In some cases, a survey of many date-like values will reveal a field with a range of 1-12 and another field with a range of 1-31, permitting analysts to distinguish between the month and day. The remaining field can be taken as a truncated year.
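A survey like this can be sketched in a few lines. The function names `field_ranges()` and `guess_roles()`, the `/` separator, and the sample values are all assumptions made for illustration; the thresholds simply restate the 1-12 and 1-31 ranges described above.

```python
from collections.abc import Iterable


def field_ranges(samples: Iterable[str], sep: str = "/") -> list[tuple[int, int]]:
    """Return the (min, max) observed in each position of the date-like strings."""
    # Transpose the rows of integers into one column per field position.
    columns = zip(*(map(int, s.split(sep)) for s in samples))
    return [(min(col), max(col)) for col in columns]


def guess_roles(ranges: list[tuple[int, int]]) -> list[str]:
    """Label each position using the observed range. A heuristic only."""
    roles = []
    for low, high in ranges:
        if 1 <= low and high <= 12:
            roles.append("month")
        elif 1 <= low and 12 < high <= 31:
            roles.append("day")
        else:
            roles.append("year")
    return roles


# Made-up sample values, for illustration only.
samples = ["03/14/98", "11/30/05", "07/04/19", "12/25/87"]
ranges = field_ranges(samples)
roles = guess_roles(ranges)
print(ranges)
print(roles)
```

The heuristic is only as good as the sample. If every day value happens to be 12 or less, or if the truncated years all fall below 32, two fields will look alike and the guess cannot be trusted.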
In cases where there is not enough data to make a positive identification of month or day, other clues will be needed. Ideally, there’s metadata to define the date format.
The datetime.strptime() function can be used to parse dates when the format...