You're reading from TensorFlow Machine Learning Projects: Build 13 real-world projects with advanced numerical computations using the Python ecosystem

Product type: Paperback

Published in: Nov 2018

Publisher: Packt

ISBN-13: 9781789132212

Length: 322 pages

Edition: 1st Edition

Languages: Python

Tools: TensorFlow

Concepts: Machine Learning

Authors (2): Ankit Jain, Dr. Amita Kapoor

Preface

1. Overview of TensorFlow and Machine Learning

2. Using Machine Learning to Detect Exoplanets in Outer Space

3. Sentiment Analysis in Your Browser Using TensorFlow.js

4. Digit Classification Using TensorFlow Lite

5. Speech to Text and Topic Extraction Using NLP

6. Predicting Stock Prices using Gaussian Process Regression

7. Credit Card Fraud Detection using Autoencoders

8. Generating Uncertainty in Traffic Signs Classifier Using Bayesian Neural Networks

9. Generating Matching Shoe Bags from Shoe Images Using DiscoGANs

10. Classifying Clothing Images using Capsule Networks

11. Making Quality Product Recommendations Using TensorFlow

12. Object Detection at a Large Scale with TensorFlow

13. Generating Book Scripts Using LSTMs

14. Playing Pacman Using Deep Reinforcement Learning

15. What is Next?

16. Other Books You May Enjoy

Many cloud-based AI providers offer speech to text as a service:

Amazon's speech recognition offering is Amazon Transcribe. It transcribes audio files stored in Amazon S3 in four formats: .flac, .wav, .mp4, and .mp3. Each audio file can be at most two hours long and 1 GB in size. The transcription results are written as a JSON file to an Amazon S3 bucket.
Google offers speech to text as part of its Google Cloud ML Services. Google Cloud Speech to Text supports the FLAC, LINEAR16, MULAW, AMR, AMR_WB, and OGG_OPUS audio encodings.
Microsoft offers a speech to text API as part of its Azure Cognitive Services platform, known as the Speech Service SDK. The Speech Service SDK integrates with the rest of the Microsoft APIs to transcribe recorded audio. It accepts only single-channel WAV or PCM audio with a sample rate of 16 kHz.
IBM offers a speech to text API as part of its Watson platform...
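
Because each provider restricts the audio it accepts, it can help to check a file locally before uploading it. The following is a minimal sketch, using only the Python standard library, of a pre-flight check against the Amazon Transcribe limits described above (four file extensions, two hours, 1 GB). The helper names `wav_duration_seconds` and `preflight_problems` are hypothetical, the 1 GB limit is assumed to mean 1024³ bytes, and the limits themselves are taken from the text rather than from the current service documentation, so verify them before relying on this.

```python
import os
import wave
from typing import List, Optional

# Limits as stated in the text above; treat them as assumptions and check
# the current Amazon Transcribe documentation before relying on them.
ALLOWED_EXTENSIONS = {".flac", ".wav", ".mp4", ".mp3"}
MAX_SIZE_BYTES = 1 * 1024 ** 3      # 1 GB, assumed to mean 1024**3 bytes
MAX_DURATION_SECONDS = 2 * 60 * 60  # two hours


def wav_duration_seconds(path: str) -> float:
    """Return the duration of an uncompressed WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def preflight_problems(path: str,
                       duration_seconds: Optional[float] = None) -> List[str]:
    """Hypothetical helper: list the ways a file breaks the stated limits."""
    problems = []
    extension = os.path.splitext(path)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or '(none)'}")
    if os.path.getsize(path) > MAX_SIZE_BYTES:
        problems.append("file is larger than 1 GB")
    # The duration is only measured here for WAV files; for other formats
    # the caller has to supply it, otherwise the duration check is skipped.
    if duration_seconds is None and extension == ".wav":
        duration_seconds = wav_duration_seconds(path)
    if duration_seconds is not None and duration_seconds > MAX_DURATION_SECONDS:
        problems.append("audio is longer than two hours")
    return problems
```

An empty list from `preflight_problems("interview.wav")` would mean the file passes these particular checks; it does not guarantee that the service will accept it. The same pattern could be extended to the constraints of the other providers by swapping in their allowed formats and limits.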