Voice User Interface Projects

You're reading from Voice User Interface Projects: Build voice-enabled applications using Dialogflow for Google Home and Alexa Skills Kit for Amazon Echo

Product type Paperback
Published in Jul 2018
Publisher Packt
ISBN-13 9781788473354
Length 404 pages
Edition 1st Edition
Author (1):
Henry Lee

Table of Contents (12 chapters)

Preface
1. Introduction
2. Building an FAQs Chatbot
3. Building a Fortune Cookie Application
4. Hosting, Securing, and Testing Fortune Cookie in the Cloud
5. Deploying the Fortune Cookie App for Google Home
6. Building a Cooking Application Using Alexa
7. Using Advanced Alexa Features for the Cooking App
8. Migrating the Alexa Cooking Application to Google Home
9. Building a Voice-Enabled Podcast for the Car
10. Hosting and Enhancing the Android Auto Podcast
11. Other Books You May Enjoy

Technological advancement of VUIs

In 1952, at Bell Labs, the engineers Davis, Biddulph, and Balashek built the Automatic Digit Recognizer (Audrey), a rudimentary voice recognition system. Limited by the technology of the time, Audrey could recognize only the spoken digits 0 to 9. The system was 6 feet tall and covered the walls of Bell Labs, and it contained large numbers of analog circuits built from capacitors, amplifiers, and filters. Audrey did the following three things:

  • It took the user's voice as input and held it in the machine's memory. The input was then classified by pattern matching against the predefined classes of voices for the digits 0 to 9, and the identified digit was stored in memory.
  • It flashed a light that represented the matching number.
  • It was also able to communicate selected digits over the telephone.

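Audrey performed an early form of what is known today as speech recognition. It did so with analog pattern matching, whereas modern NLP systems rely on ML and AI.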
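
The first two steps in the list above, matching an input against predefined classes and then lighting the lamp for the matching digit, can be illustrated with a small sketch. This is only a toy under stated assumptions: the two-number feature vectors, the `TEMPLATES` table, and the function names are all made up for illustration, and nearest-template matching in software merely stands in for what Audrey did with analog circuits.

```python
import math

# Hypothetical reference patterns: one made-up two-number feature vector per
# digit. The real Audrey used analog circuits, not stored values like these.
TEMPLATES = {digit: (digit / 10, 1 - digit / 10) for digit in range(10)}


def recognize_digit(features):
    """Step 1: return the digit whose reference pattern is closest to the input."""
    return min(TEMPLATES, key=lambda digit: math.dist(TEMPLATES[digit], features))


def light_panel(digit):
    """Step 2: one lamp per digit; only the lamp for the matched digit is lit."""
    return "".join("*" if d == digit else "." for d in range(10))


if __name__ == "__main__":
    spoken = (0.31, 0.69)  # pretend features extracted from a spoken digit
    match = recognize_digit(spoken)
    print(match, light_panel(match))
```

The design point is that the recognizer never "understands" anything: it only picks the closest of ten fixed patterns, which is why a system like this could handle digits but not open-ended speech.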

Although Audrey recognized voice input with an accuracy of 97% to 99%, it was very expensive and large, and its complex electronics were extremely difficult to maintain. As a result, Audrey could not be commercialized. However, since the inception of Audrey, voice technology and research have continued to leap forward.

First-generation VUIs

The big break came in 1984, when SpeechWorks and Nuance introduced interactive voice response (IVR) systems. IVR systems were able to recognize human voices over the telephone and carry out the tasks given to them (Roberto Pieraccini and Lawrence Rabiner 2012, The Voice in the Machine: Building Computers That Understand Speech). You will recognize IVR systems today when you call major companies for support. For example, if you have called to make a hotel reservation, you will be familiar with "Press 1 or say reservation, Press 2 or say check reservation, Press 3 or say cancel reservation, Press # or say main menu."

In the '90s, I worked on my first VUIs in an IVR system. To develop the IVR system, I had to work with the Microsoft Speech API (SAPI), http://bit.ly/2osJpdM. With SAPI, I was able to perform speech recognition, where the voice received from the user was converted into text in order to evaluate the user's intent. After the intent was evaluated, a text message was created and converted back to voice through text to speech (TTS) to relay the message to the user on the telephone.
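
The IVR flow described above (recognized text in, intent evaluated, reply text out) can be sketched in a few lines. This is a minimal illustration, not SAPI code: the menu phrases come from the hotel example in the text, while the action names, the reply wording, and the fallback to the main menu for unrecognized input are assumptions made for this sketch. Speech recognition and TTS are left out; the functions work on text that is assumed to be already transcribed.

```python
# Menu options taken from the hotel example in the text. The action names on
# the right are hypothetical labels invented for this sketch.
MENU = {
    "1": "make_reservation",
    "reservation": "make_reservation",
    "2": "check_reservation",
    "check reservation": "check_reservation",
    "3": "cancel_reservation",
    "cancel reservation": "cancel_reservation",
    "#": "main_menu",
    "main menu": "main_menu",
}

# Hypothetical reply texts that a TTS engine would read back to the caller.
REPLIES = {
    "make_reservation": "Let's make a reservation.",
    "check_reservation": "Let's check your reservation.",
    "cancel_reservation": "Let's cancel your reservation.",
    "main_menu": "Returning to the main menu.",
}


def route(user_input: str) -> str:
    """Map a keypress or an already-transcribed phrase to an intent."""
    key = user_input.strip().lower()
    # Assumption: unrecognized input sends the caller back to the main menu.
    return MENU.get(key, "main_menu")


def reply_text(user_input: str) -> str:
    """Evaluate the intent and build the text that would be spoken back."""
    return REPLIES[route(user_input)]


if __name__ == "__main__":
    for heard in ["1", "Cancel reservation", "something else"]:
        print(heard, "->", route(heard), "->", reply_text(heard))
```

Treating keypresses and spoken phrases as keys in the same table mirrors the "Press 1 or say reservation" pattern: both inputs resolve to one intent, so the rest of the system does not need to know which one the caller used.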

Boom of VUIs

To appreciate how today's voice technology emerged, let's first look at the year 2005. In 2005, Web 2.0 contributed to a sharp increase in the volume of data. This increase brought about the creation of Hadoop and big data technologies to meet the demand for storing, processing, and understanding that data. Big data helped to advance data analytics, ML, and AI, which are used to identify patterns in data in business contexts. The same ML and AI techniques have helped NLP advance in recognizing speech patterns, and VUIs have advanced along with it. The Web 2.0 big data boom kick-started the boom in the use of VUIs on smartphones, in the home, and in automobiles.

History of VUIs on mobile devices

In 2006, Apple introduced the concept of Siri, which allows users to interact with machines using their voice. In 2007, Google followed Apple and introduced voice search. In 2011, Apple finally turned the Siri concept into reality by integrating Siri into iOS and the iPhone. Unfortunately, with Steve Jobs' death that same year, voice innovation at Apple slowed down, allowing others, such as Google and Amazon, to catch up. In 2015, Microsoft introduced Cortana for the Windows 10 operating system and smartphones. In 2016, Google introduced Google Assistant to mobile devices. Later, from Chapter 3, Building a Fortune Cookie Application, to Chapter 5, Deploying the Fortune Cookie App for Google Home, you will learn how to create voice assistant applications for mobile devices. One of the major advantages of writing applications for Google Assistant is that the same applications can also be deployed to Google Home.

The following illustration depicts screenshots of the mobile voice assistants Cortana, Siri, and Google Assistant:

Mobile voice assistants—Cortana, Siri, and Google Assistant

History of VUIs for Google Home

In 2014, Amazon introduced Amazon Echo (refer to the following photo), the first VUI device designed for consumers' homes. In 2016, Google released Google Home (refer to the following photo). In 2017, Amazon and Google continued to compete against each other in the consumer marketplace with the Amazon Echo and Google Home devices. The competition between Amazon and Google over these home devices resembles the competition between Apple's iPhone and Google's Android. Currently, these home devices lack third-party applications that consumers can use and, as such, huge start-up and entrepreneurial opportunities exist. Remember Angry Birds for iPhone and Android? What could be the next big hit in this untapped marketplace? Later, from Chapter 3, Building a Fortune Cookie Application, through Chapter 8, Migrating the Alexa Cooking Application to Google Home, you will learn how to develop applications for Amazon Echo and Google Home devices.

The following photo shows Amazon Echo:

Amazon Echo

The following is a photo of Google Home:

Google Home

History of VUIs in cars

In 2007, Microsoft partnered with Ford to introduce Ford SYNC, giving drivers hands-free interaction with the features of their car. In 2013, Apple introduced CarPlay for cars, but only a limited number of car manufacturers were willing to adopt CarPlay (https://www.apple.com/ios/carplay/). On the other hand, in 2018, major car manufacturers adopted Google's Android Auto (https://www.android.com/auto/) because it is based on the Android operating system and already has a huge developer ecosystem in the Android marketplace. Later, in Chapter 9, Building a Voice-Enabled Podcast for the Car, and Chapter 10, Hosting and Enhancing the Android Auto Podcast, you will learn how to create your own podcast and stream your own content to cars through car dashboards that support Android Auto.

The following photo shows the voice assistant from Apple's CarPlay:

Apple CarPlay

The following screenshot shows Google's Android Auto:

Android Auto