Machine learning is the process a machine follows to learn from data; some things are easier to learn than others.
Artificial intelligence is the application of such learned models in the real world to make decisions or predictions.
A typical machine learning flow looks like this:
At one end, we have a data-gathering stage, which collects data from various reliable sources, depending on the solution. This data has both features and labels: features are the columns of data taken as input to learning, and labels are the expected outcome for that set of features. Let's look at an example of weather station data:
| Temperature | Humidity | Wind | Rainfall |
| --- | --- | --- | --- |
| 17 degrees Celsius | 87% | 5 km per hour | 10 mm |
| 23 degrees Celsius | 23% | 1 km per hour | 0 mm |
In our table, the Temperature, Humidity, and Wind columns are features, and Rainfall is the label. Using this type of supervised learning, we would build a data model from this data and ask a question such as: given the following features, what is the chance of rain?
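To make this concrete, here is a rough sketch (not part of the original example) of how this small weather table could be represented as features and a label in Python, using pandas; the column names simply mirror the table above:

```python
import pandas as pd

# The weather-station table from above, expressed as a DataFrame.
data = pd.DataFrame({
    "temperature_c": [17, 23],   # degrees Celsius
    "humidity_pct": [87, 23],    # relative humidity in percent
    "wind_kmh": [5, 1],          # wind speed in km per hour
    "rainfall_mm": [10, 0],      # the label: rainfall in mm
})

# Features are the input columns; the label is the expected outcome.
features = data[["temperature_c", "humidity_pct", "wind_kmh"]]
label = data["rainfall_mm"]
```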
The data we gather is the most important part of machine learning, as the quality and quantity of data define the accuracy of prediction.
Once gathered, the data is cleaned and normalized. The cleaned data is then split into two parts: training data and testing data. Training data is used to train the data model, and testing data is used to cross-validate the accuracy of that model.
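As a minimal sketch of those steps, assuming scikit-learn (not mentioned in the text) and a larger dataset shaped like the table above, the split and normalization might look like this; the 80/20 ratio is only an example:

```python
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Split the cleaned data into training and testing sets (80/20 here).
X_train, X_test, y_train, y_test = train_test_split(
    features, label, test_size=0.2, random_state=42
)

# Normalize the features: fit the scaler on the training data only,
# then apply the same transformation to the test data.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
```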
Now, depending on the type of cognitive service we want to provide, we would use a machine learning algorithm and feed the training data to it, building something called a data model.
A data model is a snapshot of what has been learned, and this snapshot is then tested against the testing data. This step is critical for analyzing the accuracy of the data model.
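Continuing the sketch, the training and testing steps could look like the following; LinearRegression is chosen here purely as an illustrative algorithm, not as a recommendation:

```python
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

# Train ("fit") the data model on the training split.
model = LinearRegression()
model.fit(X_train_scaled, y_train)

# Cross-check the model against the held-out testing split.
predictions = model.predict(X_test_scaled)
print("Mean absolute error on test data:", mean_absolute_error(y_test, predictions))
```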
The model can be retrained with various sets of data to achieve better accuracy. Once the model is complete, we host it as an API that other systems can query by passing their features. From that point on, the prediction results are used to further refine the data model.
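Hosting the model as an API might look roughly like this, assuming Flask (any web framework would do); the /predict route and JSON field names are hypothetical and only serve to illustrate the idea:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/predict", methods=["POST"])
def predict():
    # Expect a JSON body such as:
    # {"temperature_c": 20, "humidity_pct": 70, "wind_kmh": 3}
    payload = request.get_json()
    row = [[payload["temperature_c"], payload["humidity_pct"], payload["wind_kmh"]]]
    # Apply the same scaling used during training, then predict.
    prediction = model.predict(scaler.transform(row))
    return jsonify({"predicted_rainfall_mm": float(prediction[0])})

if __name__ == "__main__":
    app.run()
```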
The preceding process is how most cognitive services are built. As noted, the accuracy of the data model depends heavily on the quality and quantity of data.
The more accurate the data that is fed to the machine learning algorithm, the higher the quality of the data model.
Imagine a cognitive service, such as explicit image detection, built by you or your organization. To start with, we need data to train it. How many images can we feed it: 1 million, 2 million? Imagine the size of the infrastructure needed to train on about 10 million images.
Once the service is built, how many requests will your users make: 1 million per day? And will that be enough to gauge the accuracy of your model and improve it?
Now, on the other hand, consider data models built by the likes of Google, which has access to almost all the content on the internet. And imagine the number of people using such a service, thus helping the cognitive service learn from experience.
In no time, a cognitive service like this becomes far more accurate, not only for mainstream scenarios but also for corner cases.
In cognitive services, accuracy increases with the quality and quantity of data, and this is one of the main advantages of cloud-based cognition over local intelligence.
Take a look at the video titled Inside Google Translate (https://www.youtube.com/watch?v=_GdSC1Z1Kzs), which explains how the Google Translate service works and reinforces the ideas I expressed previously about how machines learn.
This concludes our section on why cognition on the cloud is valuable. In the next section, we are going to explore the various Google Cloud AI services.