We just saw that CSV data is read into Go as [][]string. However, Go is statically typed, which allows us to enforce strict checks for each of the CSV fields. We can do this as we parse each field for further processing. Consider some messy data that has random fields that don't match the type of the other values in a column:
4.6,3.1,1.5,0.2,Iris-setosa
5.0,string,1.4,0.2,Iris-setosa
5.4,3.9,1.7,0.4,Iris-setosa
5.3,3.7,1.5,0.2,Iris-setosa
5.0,3.3,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.4,3.2,4.5,1.5,
6.9,3.1,4.9,1.5,Iris-versicolor
5.5,2.3,4.0,1.3,Iris-versicolor
4.9,3.1,1.5,0.1,Iris-setosa
5.0,3.2,1.2,string,Iris-setosa
5.5,3.5,1.3,0.2,Iris-setosa
4.9,3.1,1.5,0.1,Iris-setosa
4.4,3.0,1.3,0.2,Iris-setosa
To check the types of the fields in our CSV records, let's create a struct variable to hold successfully parsed values:
// CSVRecord contains a successfully parsed row of the CSV file.
type CSVRecord struct {
SepalLength float64
SepalWidth float64
PetalLength float64
PetalWidth float64
Species string
ParseError error
}
Then, before we loop over the records, let's initialize a slice of these values:
// Create a slice value that will hold all of the successfully parsed
// records from the CSV.
var csvData []CSVRecord
Now as we loop over the records, we can parse into the relevant type for that record, catch any errors, and log as needed:
// Read in the records looking for unexpected types.
for {
// Read in a row. Check if we are at the end of the file.
record, err := reader.Read()
if err == io.EOF {
break
}
// Create a CSVRecord value for the row.
var csvRecord CSVRecord
// Parse each of the values in the record based on an expected type.
for idx, value := range record {
// Parse the value in the record as a string for the string column.
if idx == 4 {
// Validate that the value is not an empty string. If the
// value is an empty string break the parsing loop.
if value == "" {
log.Printf("Unexpected type in column %d\n", idx)
csvRecord.ParseError = fmt.Errorf("Empty string value")
break
}
// Add the string value to the CSVRecord.
csvRecord.Species = value
continue
}
// Otherwise, parse the value in the record as a float64.
var floatValue float64
// If the value can not be parsed as a float, log and break the
// parsing loop.
if floatValue, err = strconv.ParseFloat(value, 64); err != nil {
log.Printf("Unexpected type in column %d\n", idx)
csvRecord.ParseError = fmt.Errorf("Could not parse float")
break
}
// Add the float value to the respective field in the CSVRecord.
switch idx {
case 0:
csvRecord.SepalLength = floatValue
case 1:
csvRecord.SepalWidth = floatValue
case 2:
csvRecord.PetalLength = floatValue
case 3:
csvRecord.PetalWidth = floatValue
}
}
// Append successfully parsed records to the slice defined above.
if csvRecord.ParseError == nil {
csvData = append(csvData, csvRecord)
}
}