Downloading the sample dataset
In the sections that follow, we will work with a very simple synthetic dataset that contains only two columns – x and y. Here, x may represent an object’s relative position along the X-axis, while y may represent the same object’s position along the Y-axis. The following screenshot shows an example of what the data looks like:
Figure 2.20 – Sample dataset
ML is about finding patterns. Later in this chapter, we will use this dataset to build a model that tries to predict the value of y given the value of x. Once we are able to build models with a simple example like this, it will be much easier to deal with more realistic datasets that contain more than two columns, similar to what we worked with in Chapter 1, Introduction to ML Engineering on AWS.
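To make the idea of "predicting y given x" concrete before we touch the real data, here is a minimal sketch that uses only the Python standard library. It does not use the book's sample dataset: the helper names (make_synthetic_rows, fit_line, predict), the value range, and the linear relationship with noise are all assumptions made for illustration, and the actual dataset may follow a different pattern.

```python
import csv
import io
import random


def make_synthetic_rows(n=50, seed=0):
    """Generate hypothetical (x, y) pairs. This is NOT the book's dataset."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.uniform(-100, 100)
        # Assumed relationship for illustration only: a line plus small noise.
        y = 2.0 * x + 5.0 + rng.gauss(0, 1.0)
        rows.append((x, y))
    return rows


def to_csv_text(rows):
    """Serialize the rows as two-column CSV text with an x,y header."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["x", "y"])
    writer.writerows(rows)
    return buffer.getvalue()


def read_csv_text(text):
    """Parse two-column CSV text back into a list of (x, y) floats."""
    reader = csv.DictReader(io.StringIO(text))
    return [(float(row["x"]), float(row["y"])) for row in reader]


def fit_line(rows):
    """Fit y = slope * x + intercept with ordinary least squares."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Predict y for a given x using the fitted line."""
    return slope * x + intercept


# Round-trip through CSV text to mimic loading a two-column file.
rows = read_csv_text(to_csv_text(make_synthetic_rows()))
slope, intercept = fit_line(rows)
print(f"fitted line: y = {slope:.2f} * x + {intercept:.2f}")
print(f"prediction for x = 10: {predict(10, slope, intercept):.2f}")
```

A straight-line fit is the simplest possible "model" here. It is a stand-in for the pattern-finding step, not the approach used later in this chapter.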
Note
In this book, we won’t limit ourselves to just tabular data and simple datasets. In Chapter 6, SageMaker Training and Debugging...