
You're reading from Machine Learning With Go Implement Regression, Classification, Clustering, Time-series Models, Neural Networks, and More using the Go Programming Language

Product type Paperback

Published in Sep 2017

Publisher Packt

ISBN-13 9781785882104

Length 304 pages

Edition 1st Edition

Languages

Concepts

Machine Learning

Author (1):

Joseph Langstaff Whitenack


As you can see in the preceding section, Go itself provides us with an opportunity to maintain high levels of integrity in our data gathering, parsing, and organization. We want to ensure that we leverage Go's unique properties whenever we are preparing our data for machine learning workflows.

Generally, Go data scientists/analysts should follow these best practices when gathering and organizing data. They are meant to help you maintain integrity in your applications and enable you to reproduce any analysis:

Check for and enforce expected types: This might seem obvious, but it is too often overlooked when using dynamically typed languages. Although it is slightly verbose, explicitly parsing data into expected types and handling related errors can save you big headaches down the road.
Standardize and simplify your data ingress/egress: There are many third-party packages for handling certain types of data or interactions with certain sources of data (some of which we will cover in this book). However, if you standardize the ways you interact with data sources, particularly around the standard library, you can develop predictable patterns and maintain consistency within your team. A good example of this is choosing database/sql for database interactions rather than using various third-party APIs and DSLs.
Version your data: Machine learning models produce extremely different results depending on the training data you use, your choice of parameters, and input data. Thus, it is impossible to reproduce results without versioning both your code and data. We will discuss the appropriate techniques for data versioning later in this chapter.

If you start to stray from these general principles, you should stop immediately. You are likely to sacrifice integrity for the sake of convenience, which is a dangerous road. We will let these principles guide us through the book and as we consider various data formats/sources in the following section.


Best practices for gathering and organizing data with Go
