Google Cloud Platform (GCP) is considered one of the Big 3 cloud platforms, alongside Microsoft Azure and AWS. GCP is a widely used cloud solution with AI capabilities for designing and developing smart models that turn your data into insights at an affordable cost.
The following excerpt is taken from the book 'Cloud Analytics with Google Cloud Platform' authored by Sanket Thodge.
GCP offers many machine learning APIs. Here we take a look at the three most popular:
Cloud Speech API is a powerful API from GCP that lets users convert speech to text using a neural network model. It recognizes over 100 languages from around the world and can filter out unwanted noise and inappropriate content in a variety of environments. It supports context-aware recognition and works on any device and any platform, including IoT. Its features include Automatic Speech Recognition (ASR), Global Vocabulary, Streaming Recognition, Word Hints, Real-Time Audio support, Noise Robustness, and Inappropriate Content Filtering, and it supports integration with other GCP APIs.
The architecture of the Cloud Speech API is as follows:
In other words, this model enables speech to text conversion by ML.
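As a rough illustration of the features described above (language selection, word hints, and inappropriate content filtering), the sketch below builds the JSON body for the public v1 REST method `speech:recognize` and reads transcripts out of a response. It uses only the Python standard library and makes no network call. The helper names, the audio bytes, the 16 kHz LINEAR16 settings, and the sample response are assumptions made for illustration and do not come from the book.

```python
import base64
import json

# Public v1 REST endpoint for synchronous recognition (not called here).
SPEECH_ENDPOINT = "https://speech.googleapis.com/v1/speech:recognize"


def build_recognize_request(audio_bytes, language_code="en-US", hints=None,
                            filter_profanity=True):
    """Build the JSON body for a synchronous speech:recognize call."""
    config = {
        "encoding": "LINEAR16",        # assumption: 16-bit PCM audio
        "sampleRateHertz": 16000,      # assumption: 16 kHz recording
        "languageCode": language_code,
        "profanityFilter": filter_profanity,  # inappropriate content filtering
    }
    if hints:
        # Word hints: phrases the recognizer should favor.
        config["speechContexts"] = [{"phrases": list(hints)}]
    return {
        "config": config,
        # Inline audio must be sent as base64 text.
        "audio": {"content": base64.b64encode(audio_bytes).decode("ascii")},
    }


def extract_transcripts(response):
    """Return the top-ranked transcript of each result in a response."""
    return [
        result["alternatives"][0]["transcript"]
        for result in response.get("results", [])
        if result.get("alternatives")
    ]


# Hypothetical audio bytes and a hand-written response, for illustration only.
body = build_recognize_request(b"\x00\x01" * 4, hints=["GCP", "BigQuery"])
sample_response = {
    "results": [
        {"alternatives": [{"transcript": "hello from GCP", "confidence": 0.93}]}
    ]
}
print(json.dumps(body["config"], indent=2))
print(extract_transcripts(sample_response))
```

In a real application the body would be sent to the endpoint with valid credentials, or the official client library would be used in place of hand-built JSON.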
The components used by the Speech API are:
The applications of the model include:
Now that we have learned about the concepts and the applications of the model, let's look at some use cases where we can implement it:
Natural language processing (NLP) is a part of artificial intelligence, and Machine Translation (MT) has been a main focus of NLP research groups for many years. MT deals with translating text in a source language into text in a target language. Cloud Translation API provides a simple programmatic interface for translating an input string from one language into a target language; it’s highly responsive, scalable, and dynamic in nature.
This API enables translation among 100+ languages and detects the source language automatically and accurately. It can also read the contents of a web page and translate them into another language, so the text does not need to be extracted from a document first. The Translation API supports features such as programmatic access, text translation, language detection, continuous updates, adjustable quota, and affordable pricing.
The following image shows the architecture of the translation model:
In other words, the Cloud Translation API is an adaptive machine translation algorithm.
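To make text translation and automatic language detection more concrete, here is a minimal sketch of a request body for the v2 REST endpoint of the Translation API, plus a parser for its response. It uses only the Python standard library and makes no network call. The helper names and the sample response are assumptions made for illustration; the `format` value `html` is what allows web page markup to be passed through without extracting the text first.

```python
import json

# Public v2 REST endpoints (not called here).
TRANSLATE_ENDPOINT = "https://translation.googleapis.com/language/translate/v2"
DETECT_ENDPOINT = TRANSLATE_ENDPOINT + "/detect"


def build_translate_request(texts, target, source=None, is_html=False):
    """Build the JSON body for a v2 translate call."""
    body = {
        "q": list(texts),                         # one or more input strings
        "target": target,                         # target language code
        "format": "html" if is_html else "text",  # "html" keeps page markup
    }
    if source:
        # When the source is omitted, the API detects the language itself.
        body["source"] = source
    return body


def parse_translations(response):
    """Return (translated_text, detected_source_language) pairs."""
    return [
        (item["translatedText"], item.get("detectedSourceLanguage"))
        for item in response["data"]["translations"]
    ]


# Hand-written response for illustration only, not real API output.
body = build_translate_request(["Hello world"], target="es")
sample_response = {
    "data": {
        "translations": [
            {"translatedText": "Hola mundo", "detectedSourceLanguage": "en"}
        ]
    }
}
print(json.dumps(body))
print(parse_translations(sample_response))
```

Leaving `source` unset is the design choice that exercises language detection: the response then reports the language the service inferred for each input string.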
The components used by this model are:
The most important application of the model is the conversion of a regional language to a foreign language.
Now that we have learned about the concepts and applications of the API, let's look at two use cases where it has been successfully implemented:
We will discuss each of these use cases in the following sections.
The steps to implement rule-based Machine Translation successfully are as follows:
We can learn about the Machine Translation process from the way local tissue responds to injuries and trauma. The human body follows a similar process when dealing with injuries. We can roughly describe the process as follows:
Cloud Vision API is a powerful image analysis tool. It enables users to understand the content of an image. It helps find various attributes or categories of an image, such as labels, web, text, document, properties, and safe search, and returns the results for that image as JSON. The labels field has many sub-categories, such as text, line, font, area, graphics, screenshots, and points. It also covers how much of the image is taken up by graphics, what percentage is text, and what percentage is empty, while the web field indicates whether the image is partially or fully matched on the web.
The document field consists of blocks of the image with detailed descriptions, and the properties field shows the colors used in the image. Safe search flags any unwanted or inappropriate content in the image. The main features of this API are label detection, explicit content detection, logo and landmark detection, face detection, and web detection; to extract text, the API uses Optical Character Recognition (OCR), which supports many languages. It does not support face recognition.
The architecture for the Cloud Vision API is as follows:
We can summarize the functionality of the API as extracting quantitative information from images: it takes an image as input and returns numbers and text as output.
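The image-in, numbers-and-text-out behavior can be sketched with the public v1 REST method `images:annotate`. The example below builds a request that asks for several of the features mentioned above and condenses a response into labels, extracted text, and a safe search verdict. It uses only the Python standard library and makes no network call. The helper names, the image bytes, the score threshold, and the sample response are assumptions made for illustration.

```python
import base64

# Public v1 REST endpoint (not called here).
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_annotate_request(image_bytes, feature_types, max_results=5):
    """Build the JSON body for an images:annotate call on a single image."""
    return {
        "requests": [
            {
                # Inline images must be sent as base64 text.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [
                    {"type": feature, "maxResults": max_results}
                    for feature in feature_types
                ],
            }
        ]
    }


def summarize(response, min_score=0.5):
    """Condense the first image's annotations into labels, text and safety."""
    first = response["responses"][0]
    labels = [
        (a["description"], a["score"])
        for a in first.get("labelAnnotations", [])
        if a.get("score", 0) >= min_score   # hypothetical cut-off
    ]
    text_items = first.get("textAnnotations", [])
    # The first text annotation holds the full text found by OCR.
    text = text_items[0]["description"] if text_items else ""
    safe = first.get("safeSearchAnnotation", {})
    return {"labels": labels, "text": text, "adult": safe.get("adult", "UNKNOWN")}


# Hypothetical image bytes and a hand-written response, for illustration only.
body = build_annotate_request(
    b"fake-image-bytes",
    ["LABEL_DETECTION", "TEXT_DETECTION", "SAFE_SEARCH_DETECTION"],
)
sample_response = {
    "responses": [
        {
            "labelAnnotations": [
                {"description": "Dog", "score": 0.97},
                {"description": "Grass", "score": 0.41},
            ],
            "textAnnotations": [{"description": "PARK\nRULES"}, {"description": "PARK"}],
            "safeSearchAnnotation": {"adult": "VERY_UNLIKELY"},
        }
    ]
}
print(summarize(sample_response))
```

Each feature type in the request maps to one group of annotations in the response, which is why the summary reads a separate key for labels, text, and safe search.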
The components used in the API are:
Applications of the API include:
This technique can be successfully implemented in:
We will discuss each of these use cases in the following topics.
Cloud Vision API can be successfully implemented to detect images using your smartphone. The steps to do this are simple:
Similarly, the API can also be used to analyze retinal images. The steps to implement this are as follows:
You can learn a lot more about the machine learning capabilities of GCP on their official documentation page.
If you found the above excerpt useful, make sure you check out our book 'Cloud Analytics with Google Cloud Platform' for more information on why GCP is a top cloud solution for machine learning and AI.