Data science announcements at Amazon re:Invent 2017

Continuing from our previous post, here are the announcements from Amazon’s re:Invent 2017 in three specific data science domains: Databases, IoT, and Machine Learning.

Databases

Databases were one of the hot topics for the cloud giant. AWS released previews of a new database service, Amazon Neptune, and of two new features for Amazon Aurora.

Amazon Neptune Preview

So what’s Amazon Neptune? A brand new database service from Amazon! It is a fast, reliable, fully managed graph database service that makes it easy to develop and deploy applications. It is purpose-built for high performance: it can store billions of relationships and query them with millisecond latency.

Neptune is highly secure, with built-in support for encryption. Since it is fully managed, customers can rest assured that the database management tasks are taken care of.

Neptune supports the popular graph models Property Graph and W3C's RDF, along with their respective query languages, Apache TinkerPop Gremlin and SPARQL. This lets customers build queries with ease, and those queries can efficiently navigate highly connected datasets.
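To give a feel for what navigating a highly connected dataset means, here is a minimal sketch in plain Python. It does not talk to Neptune at all: the tiny in-memory graph, the "knows" edge label, and the friends_of_friends helper are hypothetical, chosen only to mirror the kind of two-hop walk that a Gremlin traversal such as g.V().has('name','alice').out('knows').out('knows') describes.

```python
from collections import defaultdict

# Hypothetical property graph: vertices carry properties, edges carry a label.
vertices = {
    1: {"name": "alice"},
    2: {"name": "bob"},
    3: {"name": "carol"},
    4: {"name": "dave"},
}
edges = [(1, "knows", 2), (2, "knows", 3), (2, "knows", 4), (3, "knows", 1)]

# Index outgoing edges by (source vertex, label) for quick traversal.
out_index = defaultdict(list)
for src, label, dst in edges:
    out_index[(src, label)].append(dst)


def out(vertex_ids, label):
    """Follow outgoing edges with the given label from each vertex."""
    return [dst for v in vertex_ids for dst in out_index[(v, label)]]


def friends_of_friends(name):
    """Names reachable in exactly two 'knows' hops from the named vertex."""
    start = [vid for vid, props in vertices.items() if props["name"] == name]
    two_hops = out(out(start, "knows"), "knows")
    return sorted({vertices[v]["name"] for v in two_hops})


print(friends_of_friends("alice"))
```

A graph database does this kind of hop-by-hop walk natively over billions of edges; the sketch only illustrates the shape of the question being asked.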

Some of its key benefits include:

high availability
point-in-time recovery
continuous backup to Amazon S3
replication across availability zones

Amazon Aurora

Amazon announced previews of two new Aurora features at re:Invent: Aurora Multi-Master and Aurora Serverless. Let’s take a brief look at what these two features have in store.

Aurora Serverless

It allows customers to create database instances that run only when required. Databases are scaled up or down automatically based on demand, which saves a lot of time.
It is designed to handle workloads that are highly variable and subject to rapid change.
Customers pay for the resources they use on a second-by-second basis, which saves a lot of money.

The preview of this serverless feature is available for the MySQL-compatible edition of Amazon Aurora.
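The savings from second-by-second billing are easiest to see with a little arithmetic. The sketch below is only an illustration: the price, the notion of a "capacity unit", and the usage pattern are made-up placeholders, not actual AWS rates or billing rules.

```python
# All prices and usage figures below are made-up placeholders, not AWS rates.
HYPOTHETICAL_PRICE_PER_CAPACITY_UNIT_HOUR = 0.06  # USD, illustrative only


def per_second_cost(usage, price_per_unit_hour):
    """Sum the cost of (seconds_active, capacity_units) intervals."""
    price_per_unit_second = price_per_unit_hour / 3600
    return sum(seconds * units * price_per_unit_second for seconds, units in usage)


def always_on_cost(hours, units, price_per_unit_hour):
    """Cost of keeping a fixed amount of capacity running the whole time."""
    return hours * units * price_per_unit_hour


# A bursty day: 90 minutes at 2 units, 30 minutes at 8 units, idle otherwise.
usage = [(90 * 60, 2), (30 * 60, 8)]

bursty = per_second_cost(usage, HYPOTHETICAL_PRICE_PER_CAPACITY_UNIT_HOUR)
fixed = always_on_cost(24, 8, HYPOTHETICAL_PRICE_PER_CAPACITY_UNIT_HOUR)
print(f"pay-per-second: ${bursty:.2f}, always-on at peak size: ${fixed:.2f}")
```

The more idle time a workload has, the wider the gap between the two figures becomes, which is why the feature targets highly variable workloads.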

Aurora Multi-Master

It allows customers to distribute database writes across several data centers.
It promises zero application downtime when database nodes or availability zones fail.
Customers also get faster write performance.

At present, the Aurora Multi-Master preview works within a single region. However, Amazon expects to extend it across regions, spanning the global physical infrastructure of AWS, by next year.

Internet of Things

The next technology Amazon rooted for this year was IoT. Here’s a list of announcements made for IoT applications.

AWS IoT Device Management

AWS IoT Device Management allows customers to onboard, set up, monitor, and remotely manage IoT devices securely throughout each device’s entire lifecycle. Customers can log into the AWS IoT console to register devices, either individually or in bulk. They can also upload attributes, certificates, and access policies. The service also helps customers maintain an inventory holding all the information related to their IoT devices, such as serial numbers and firmware versions. Using this information, one can easily track where troubleshooting is required. The devices can be managed individually, in groups, or as an entire fleet.
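As a rough illustration of how such an inventory helps with troubleshooting, the sketch below filters a device list by firmware version. It is plain Python and does not use the AWS IoT API; the Device fields, the group names, and the needs_update helper are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    serial_number: str
    firmware_version: str
    group: str


# Hypothetical inventory; in practice this would come from the device registry.
inventory = [
    Device("SN-001", "1.2.0", "warehouse"),
    Device("SN-002", "1.0.3", "warehouse"),
    Device("SN-003", "1.2.0", "storefront"),
    Device("SN-004", "0.9.8", "storefront"),
]


def parse_version(version):
    """Turn '1.2.0' into (1, 2, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def needs_update(devices, minimum_version, group=None):
    """Serial numbers of devices below minimum_version, optionally per group."""
    floor = parse_version(minimum_version)
    return [
        d.serial_number
        for d in devices
        if (group is None or d.group == group)
        and parse_version(d.firmware_version) < floor
    ]


print(needs_update(inventory, "1.2.0"))
print(needs_update(inventory, "1.2.0", group="storefront"))
```

The optional group argument mirrors the idea of managing devices individually, in groups, or as a whole fleet.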

AWS Greengrass ML inference

The AWS Greengrass ML inference preview lets customers deploy and run ML inference locally on connected devices, bringing smarter computing capabilities to the IoT devices themselves. Running inference on the device reduces latency and the cost of sending device data to the cloud for prediction. App developers can build machine learning into their devices with no explicit ML skills required. Devices run ML models locally, get the output, and make smart decisions quickly, even when they are not connected. Data is sent to the cloud only in cases that require more processing.
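The pattern of deciding on the device and sending data to the cloud only when more processing is needed can be sketched as follows. This is not Greengrass code: local_model is a stand-in for a deployed model, and the confidence threshold and send_to_cloud callback are assumptions made for the example.

```python
CONFIDENCE_THRESHOLD = 0.8  # hypothetical cut-off for trusting the local model


def local_model(reading):
    """Stand-in for a model deployed on the device: returns (label, confidence)."""
    temperature = reading["temperature_c"]
    if temperature > 90:
        return "overheating", 0.97
    if temperature < 60:
        return "normal", 0.95
    return "normal", 0.55  # borderline readings get a low confidence


def handle_reading(reading, send_to_cloud):
    """Decide locally when confident; otherwise forward the reading."""
    label, confidence = local_model(reading)
    if confidence >= CONFIDENCE_THRESHOLD:
        return {"label": label, "decided": "on-device"}
    send_to_cloud(reading)
    return {"label": None, "decided": "deferred-to-cloud"}


uploaded = []  # records what would have left the device
for reading in [{"temperature_c": 45}, {"temperature_c": 95}, {"temperature_c": 75}]:
    print(handle_reading(reading, uploaded.append))
print(f"readings sent to the cloud: {len(uploaded)}")
```

Only the borderline reading leaves the device, which is where the latency and bandwidth savings come from.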

AWS IoT Analytics Preview

re:Invent gave us a preview of AWS IoT Analytics, a fully managed IoT analytics service that provides advanced analysis of data collected from millions of IoT devices, with no hardware or infrastructure for customers to manage.

Let’s look at some of its benefits:

Gives customers access to pre-built analytical functions that help with predictive analysis of data.
Lets customers visualize analytical output from the service.
Provides the tools required to clean up data.
Helps identify patterns within the gathered data.

In addition to this, the new AWS IoT Analytics service offers visualization of your data through Amazon QuickSight. It also integrates with Jupyter Notebooks to bring in the power of machine learning.

To know more about AWS IoT in detail, you can visit the link here.

Machine Learning

re:Invent introduced a variety of new platforms, tools, and frameworks to leverage Machine Learning.

AWS DeepLens

Amazon brings data scientists and developers an innovative way to get hands-on deep learning experience. The new AWS DeepLens is an AI-enabled video camera that runs deep learning models locally on the camera to analyze and take action on what it sees. The technology enables developers to build apps while working through practical, hands-on examples of AI, IoT, and serverless computing.

The hardware boasts a 4-megapixel camera that can capture 1080p video, along with a 2D microphone array. DeepLens has an Intel Atom® Processor with over 100 GFLOPS of compute power for processing deep learning predictions in real time. It also has 8 GB of built-in memory for storing pre-trained models and code.

On the software side, AWS DeepLens runs Ubuntu 16.04 and is preloaded with AWS Greengrass Core. Other frameworks, such as TensorFlow and Caffe2, can also be used. DeepLens includes the Intel® clDNN library and lets developers use AWS Greengrass, AWS Lambda, and other AWS AI and infrastructure services in their apps.

Amazon Comprehend

Tagged as a continuously trained Natural Language Processing (NLP) service, Amazon Comprehend allows customers to analyze text and find out what it contains: the language used (from Afrikaans to Yoruba and 98 more), the entities (people, places, products, etc.), the sentiment (positive, negative, and so on), key phrases, and much more. Comprehend also has a topic modeling service that extracts topics from a large set of documents for analysis and topic-based grouping.
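Here is a minimal sketch of how those four kinds of analysis might be combined using the AWS SDK for Python. The analyze_text helper is hypothetical; it assumes it is handed a client that behaves like boto3.client("comprehend"), and it only reads the response fields shown. Treat it as a starting point rather than a reference implementation.

```python
def analyze_text(comprehend, text):
    """Run language, sentiment, entity, and key-phrase detection on one text.

    `comprehend` is expected to behave like boto3.client("comprehend").
    """
    languages = comprehend.detect_dominant_language(Text=text)["Languages"]
    # Pick the language the service scored highest.
    language_code = max(languages, key=lambda lang: lang["Score"])["LanguageCode"]

    sentiment = comprehend.detect_sentiment(Text=text, LanguageCode=language_code)
    entities = comprehend.detect_entities(Text=text, LanguageCode=language_code)
    phrases = comprehend.detect_key_phrases(Text=text, LanguageCode=language_code)

    return {
        "language": language_code,
        "sentiment": sentiment["Sentiment"],
        "entities": [(e["Text"], e["Type"]) for e in entities["Entities"]],
        "key_phrases": [p["Text"] for p in phrases["KeyPhrases"]],
    }


# Usage against the real service (needs AWS credentials and network access):
#   import boto3
#   print(analyze_text(boto3.client("comprehend"), "Seattle was wonderful."))
```

Passing the client in as an argument keeps the helper easy to exercise with canned responses before pointing it at the live service.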

Amazon Rekognition Video

With Rekognition Video, Amazon strengthens its position against similar offerings in the market. Rekognition Video uses deep learning to derive detailed and complete insights from videos. It gives developers detailed information about the objects within a video, the scenes it is set in, the activities happening within it, and so on. It can also detect people; for instance, it is pre-trained to recognize celebrities. It can track people through a video and filter out inappropriate content. In short, it can easily generate metadata from video files.

Amazon SageMaker

Amazon SageMaker is an end-to-end machine learning service that helps developers and data scientists build, train, and deploy machine learning models easily and quickly, with improved scalability. It consists of three modules:

Build - An environment to work with your data, experiment with algorithms, and visualize the output in detail.
Train - Allows one-click model training and tuning, at high scale and low cost.
Deploy - Provides a managed environment in which customers can easily host their models and test them securely for inference, with low latency.

Amazon SageMaker eliminates machine learning complexities for developers. With Amazon SageMaker, customers can easily build and train their ML models in the cloud. With a few additional clicks, they can also use the AWS Greengrass console to transfer the models to the devices they have selected.

To have a detailed view of how SageMaker works, visit the link here.

Amazon Translate Preview

Amazon also unveiled a preview of Amazon Translate, a high-quality neural machine translation service. Amazon Translate uses advanced machine learning to enable faster translation of text-based content. It uses neural network models trained to translate between language pairs, and it allows developers to build applications that offer multilingual user experiences.

Organizations and businesses can benefit greatly from Translate, as they can now market their products in different regions. Consumers can access websites, information, and resources in their language of choice through automated translation. Additionally, customers can take part in multiplayer chats, gather information from consumer forums, dive into educational documents, and even read hotel reviews when those resources are written in a language they can’t readily understand.

Amazon Translate can be used with other Amazon services such as Amazon Polly, Amazon S3, Amazon Elasticsearch Service, Amazon Lex, AWS Lambda, and many others.

The Amazon Translate service is currently in preview and can be used to translate text between English and the supported languages.
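Taking the hotel review case as an example, a translation step might be sketched as below using the AWS SDK for Python. The translate_reviews helper is hypothetical; it assumes a client that behaves like boto3.client("translate") and that the source language of the reviews is already known.

```python
def translate_reviews(translate, reviews, source_language, target_language="en"):
    """Translate each review and keep the original alongside the result.

    `translate` is expected to behave like boto3.client("translate").
    """
    translated = []
    for review in reviews:
        response = translate.translate_text(
            Text=review,
            SourceLanguageCode=source_language,
            TargetLanguageCode=target_language,
        )
        translated.append(
            {"original": review, "translated": response["TranslatedText"]}
        )
    return translated


# Usage against the real service (needs AWS credentials and network access):
#   import boto3
#   client = boto3.client("translate")
#   print(translate_reviews(client, ["Das Hotel war sehr sauber."], "de"))
```

Keeping the original text next to the translation makes it easy to show both to a reader or to re-translate later.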

Amazon Transcribe Preview

Amazon launched a preview of Amazon Transcribe, an Automatic Speech Recognition (ASR) service. It makes it easy for developers to add speech-to-text capability to their applications. Transcribe offers an efficient and scalable API, saving developers from the expensive process of manual transcription.

One can also analyze audio files stored on Amazon Simple Storage Service (S3) in different formats such as WAV, MP3, FLAC, and so on. The resulting transcriptions are detailed, with a timestamp for each word and inferred punctuation.
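To show what working with per-word timestamps can look like, here is a small sketch that reads a transcript document and lists when each word was spoken. The sample JSON is hand-written for illustration and its layout (a "results" object holding "items" with "start_time", "end_time", and "alternatives") is an assumption about the output shape, so check it against a real transcript before relying on it.

```python
import json

# Hand-written sample in the shape assumed for a finished transcript; not real output.
sample = json.loads("""
{
  "results": {
    "transcripts": [{"transcript": "Hello world."}],
    "items": [
      {"type": "pronunciation", "start_time": "0.04", "end_time": "0.51",
       "alternatives": [{"content": "Hello", "confidence": "0.99"}]},
      {"type": "pronunciation", "start_time": "0.51", "end_time": "0.98",
       "alternatives": [{"content": "world", "confidence": "0.97"}]},
      {"type": "punctuation",
       "alternatives": [{"content": ".", "confidence": "0.0"}]}
    ]
  }
}
""")


def word_timings(transcript):
    """Return (word, start_seconds, end_seconds) for each spoken word."""
    timings = []
    for item in transcript["results"]["items"]:
        if item["type"] != "pronunciation":
            continue  # punctuation marks carry no timestamps
        word = item["alternatives"][0]["content"]
        timings.append((word, float(item["start_time"]), float(item["end_time"])))
    return timings


for word, start, end in word_timings(sample):
    print(f"{start:5.2f}s - {end:5.2f}s  {word}")
```

Timings like these are what make it possible to build subtitles or to jump to the point in a recording where a given word was said.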