Milestone 9 – Building applications that demonstrate customized speech recognition
Now that our model has been fine-tuned, let’s demonstrate how good it is at automatic speech recognition (ASR)! We’ll use the Hugging Face Transformers pipeline to handle everything, from pre-processing the audio to decoding the model’s predictions into text. For our demo, we’ll use Gradio, a tool that makes it very easy to build machine learning demos. You can create a demo with Gradio in just a few minutes!
Here is an example of a Gradio demo. In this demo, you can record speech using your computer’s microphone, after which the fine-tuned Whisper model will transcribe it into text:
from transformers import pipeline
import gradio as gr

pipe = pipeline(model="jbatista79/whisper-small-hi")  # change to "your-username/the-name-you-picked"

def transcribe(audio):
    text = pipe(audio)["text"]
    ...
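The snippet above stops before the interface itself is built. As a rough sketch of how the remaining steps might look, the code below wraps the pipeline in a function Gradio can call and wires it to a microphone input. The helper names `make_transcriber` and `build_demo` and the demo title are illustrative and not part of the original. The `sources=["microphone"]` argument assumes Gradio 4 or later (older releases used `source="microphone"`), and reading audio from a file path assumes ffmpeg is installed.

```python
def make_transcriber(asr):
    """Wrap an ASR callable (for example a Transformers pipeline) for Gradio.

    `asr` is assumed to take a path to an audio file and return a dict
    with a "text" key, as the speech recognition pipeline does.
    """
    def transcribe(audio_path):
        # Gradio passes None when the user submits without recording anything.
        if audio_path is None:
            return ""
        return asr(audio_path)["text"]

    return transcribe


def build_demo(model_id="jbatista79/whisper-small-hi"):
    """Build (but do not launch) a microphone-to-text demo.

    Hypothetical helper: change `model_id` to
    "your-username/the-name-you-picked".
    """
    # Imported here so make_transcriber can be tried without these packages.
    import gradio as gr
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)

    return gr.Interface(
        fn=make_transcriber(asr),
        # type="filepath" hands the recording to fn as a path to a temp file.
        inputs=gr.Audio(sources=["microphone"], type="filepath"),
        outputs="text",
        title="Fine-tuned Whisper demo",  # placeholder title
    )
```

Calling `build_demo().launch()` would then download the model on first use and start a local web server with the demo. Keeping the transcription logic in `make_transcriber` means it can be checked with a stand-in callable before the real model is loaded.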