Creating the datasource
When working with Amazon ML, the data always resides in S3 and it is not duplicated in Amazon ML. A datasource is the metadata that indicates the location of the input data allowing Amazon ML to access it. Creating a datasource also generates descriptive statistics related to the data and a schema with information on the nature of the variables. Basically, the datasource gives Amazon ML all the information it requires to be able to train a model. The following are the steps you need to follow to create a datasource:
- Go to Amazon Machine Learning: https://console.aws.amazon.com/machinelearning/home.
- Click on getting started, you will be given a choice between accessing the
Dashboard
andStandard setup
. This time choose the standard setup:
Perform the following steps, as shown in the following screenshot:
- Choose an
S3
location. - Start typing the name of the bucket in the
s3 location
field, and the list folders and files should show up. - Select the
titanic_train.csv
file.
Â
- Give...