Evaluating the model’s effectiveness

Accuracy and loss are not enough to judge the model’s effectiveness. In general, accuracy is a good performance indicator if the dataset is balanced, but it does not tell us the strengths and weaknesses of our model. For instance, which classes does the model recognize with high confidence? What mistakes does it make most frequently?

This recipe will judge the model’s effectiveness by visualizing the confusion matrix and evaluating the recall, precision, and F-score performance metrics.

Getting ready

To complete this recipe, we must familiarize ourselves with the confusion matrix and the alternative performance metrics that are crucial for evaluating the model’s effectiveness. Let’s start by learning about the confusion matrix in the following subsection.

Evaluating the performance with the confusion matrix

A confusion matrix is an NxN matrix reporting the number of correct and incorrect predictions on the test dataset, where N is the number of output classes.

For our binary classification model, where there are two output categories, we have a 2x2 matrix like the one in Figure 3.8:

Figure 3.8: A confusion matrix

The four values reported in the previous confusion matrix are as follows:

  • True positive (TP): The number of predicted positive results that are actually positive
  • True negative (TN): The number of predicted negative results that are actually negative
  • False positive (FP): The number of predicted positive results that are actually negative
  • False negative (FN): The number of predicted negative results that are actually positive

Ideally, we would like to have 100% accuracy, defined as the ratio of correctly predicted instances (both positive and negative) to the total number of instances in the dataset:

Accuracy = (TP + TN) / (TP + TN + FP + FN)

The preceding formula implies that the FP and FN cells of the confusion matrix should be 0 to obtain 100% accuracy.

However, although accuracy is a valuable metric, it does not provide a complete picture of model performance. Therefore, the following subsections will present alternative performance metrics for assessing the model’s effectiveness.
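To make this concrete, here is a minimal sketch with a hypothetical, imbalanced toy dataset (ten samples, only two of them positive): a naive model that always predicts the negative class still reaches 80% accuracy even though it misses every positive sample, which is exactly why we need the additional metrics introduced next:

import numpy as np

# Hypothetical toy labels: 8 negative ("No") and 2 positive ("Yes") samples
y_true = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1])
# A naive model that always predicts the negative class
y_pred = np.zeros_like(y_true)

tp = np.sum((y_pred == 1) & (y_true == 1))  # 0
tn = np.sum((y_pred == 0) & (y_true == 0))  # 8
fp = np.sum((y_pred == 1) & (y_true == 0))  # 0
fn = np.sum((y_pred == 0) & (y_true == 1))  # 2

# Accuracy still looks good (0.8) despite the model missing every positive sample
accuracy = (tp + tn) / (tp + tn + fp + fn)
print("TP:", tp, "TN:", tn, "FP:", fp, "FN:", fn, "Accuracy:", accuracy)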

Evaluating recall, precision, and F-score

The first performance metric we want to present is recall, which quantifies how many of all the positive (“Yes”) samples we predicted correctly:

Recall = TP / (TP + FN)

As a result, recall should be as high as possible.

However, this metric does not consider the misclassification of negative samples. Hence, the model could be excellent at classifying positive samples but incapable of classifying negative ones.

For this reason, there is an alternative performance indicator that considers FPs. It is precision, which quantifies how many of the predicted positive (“Yes”) samples were actually positive:

Precision = TP / (TP + FP)

Therefore, as with recall, precision should be as high as possible.

If we are interested in evaluating both recall and precision simultaneously, the F-score metric is what we need. This metric combines recall and precision into a single value as follows:

F-score = (2 * Precision * Recall) / (Precision + Recall)

The higher the F-score, the better the model’s effectiveness.
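As a quick sanity check of these formulas, here is a small sketch with purely illustrative counts (TP = 8, FP = 2, FN = 4; these are not the values obtained later in this recipe):

# Purely illustrative counts
tp, fp, fn = 8, 2, 4

recall = tp / (tp + fn)                                    # 8 / 12 ≈ 0.667
precision = tp / (tp + fp)                                 # 8 / 10 = 0.8
f_score = (2 * precision * recall) / (precision + recall)  # ≈ 0.727

print(round(recall, 3), round(precision, 3), round(f_score, 3))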

How to do it…

Continue working in Colab, and follow the steps to visualize the confusion matrix and calculate the recall, precision, and F-score metrics:

Step 1:

Use the trained model to predict the output classes of the test dataset:

y_test_pred = model.predict(x_test) 
y_test_pred = (y_test_pred > 0.5).astype("int32") 

The line y_test_pred = (y_test_pred > 0.5).astype("int32") binarizes the predicted values using a threshold of 0.5: if a predicted value is greater than 0.5, it is converted to 1; otherwise, it is converted to 0.
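For instance, assuming the network outputs a probability between 0 and 1 (for example, from a sigmoid), a hypothetical batch of four predictions would be binarized as follows:

import numpy as np

# Hypothetical sigmoid outputs for four test samples
probs = np.array([[0.10], [0.62], [0.49], [0.91]])

# Same thresholding as above: values > 0.5 become 1, everything else 0
labels = (probs > 0.5).astype("int32")
print(labels.ravel())  # [0 1 0 1]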

Step 2:

Compute the confusion matrix with scikit-learn:

from sklearn.metrics import confusion_matrix

cm = confusion_matrix(y_test, y_test_pred)

The confusion matrix is obtained with the confusion_matrix() function from the scikit-learn library, which takes two arguments: the true labels of the test dataset (y_test) and the predicted labels (y_test_pred). The cm variable stores the confusion matrix.
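A useful detail for Step 4 below: in scikit-learn, the rows of the confusion matrix correspond to the actual classes and the columns to the predicted ones, so with binary labels (0 = No Snow, 1 = Snow) the four cells are cm[0][0] = TN, cm[0][1] = FP, cm[1][0] = FN, and cm[1][1] = TP. As a small sketch, they can also be unpacked in one line:

# ravel() flattens the 2x2 matrix row by row: TN, FP, FN, TP
TN, FP, FN, TP = cm.ravel()
print("TN:", TN, "FP:", FP, "FN:", FN, "TP:", TP)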

Step 3:

Display the confusion matrix in a heatmap:

index_names  = ["Actual No Snow", "Actual Snow"]
column_names = ["Predicted No Snow", "Predicted Snow"]
df_cm = pd.DataFrame(cm, index = index_names,
                     columns = column_names)
plt.figure(dpi=150)
sns.heatmap(df_cm, annot=True, fmt='d', cmap="Blues")

The previous code should produce a heatmap similar to the following one:

Figure 3.9: Confusion matrix obtained with the test dataset

The confusion matrix shows that the samples are mainly distributed along the leading diagonal and that there are more FPs than FNs. Therefore, although the network is suitable for detecting snow, we should expect some false alarms.

Step 4:

Calculate the recall, precision, and F-score performance metrics:

TN = cm[0][0]  # actual No Snow, predicted No Snow
TP = cm[1][1]  # actual Snow, predicted Snow
FN = cm[1][0]  # actual Snow, predicted No Snow
FP = cm[0][1]  # actual No Snow, predicted Snow
accur  = (TP + TN) / (TP + TN + FN + FP)
precis = TP / (TP + FP)
recall = TP / (TP + FN)
f_score = (2 * recall * precis) / (recall + precis)
print("Accuracy:  ", round(accur, 3))
print("Recall:    ", round(recall, 3))
print("Precision: ", round(precis, 3))
print("F-score:   ", round(f_score, 3))

The preceding code prints the performance metrics on the output console, resulting in an output similar to what is shown in the following screenshot:

Figure 3.10: Precision, recall, and F-score results

Based on the results reported in the preceding screenshot, which might differ slightly from yours, we can observe that the model has a high recall of 0.923, indicating that it correctly identifies most of the actual snow events. However, the precision of 0.818 is comparatively lower, meaning the model may produce some false alarms.

The F-score of 0.867 confirms a good balance between recall and precision, meaning the model can reliably predict snow instances from the given input features.
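If you prefer not to compute these metrics by hand, scikit-learn also provides ready-made functions; a minimal sketch that should reproduce the same numbers (assuming the y_test and y_test_pred arrays from the earlier steps) is:

from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

print("Accuracy:  ", round(accuracy_score(y_test, y_test_pred), 3))
print("Recall:    ", round(recall_score(y_test, y_test_pred), 3))
print("Precision: ", round(precision_score(y_test, y_test_pred), 3))
print("F-score:   ", round(f1_score(y_test, y_test_pred), 3))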

There’s more…

In this recipe, we learned how to assess the model’s effectiveness by visualizing the confusion matrix and evaluating the recall, precision, and F-score metrics.

However, scikit-learn is not the only way to compute the confusion matrix: TensorFlow provides a tool to calculate it as well. To delve deeper into this topic, we recommend referring to the TensorFlow documentation at the following link: https://www.tensorflow.org/versions/r2.13/api_docs/python/tf/math/confusion_matrix.
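For reference, here is a minimal sketch of the TensorFlow equivalent, assuming the y_test and y_test_pred NumPy arrays from this recipe (tf.math.confusion_matrix expects 1-D vectors of labels, hence the ravel() calls):

import tensorflow as tf

# Rows are actual classes and columns are predicted classes, as with scikit-learn
cm_tf = tf.math.confusion_matrix(labels=y_test.ravel(),
                                 predictions=y_test_pred.ravel())
print(cm_tf.numpy())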

After evaluating the model’s effectiveness, quantization is the only step separating us from deploying the model on the microcontroller.

In the upcoming recipe, we will compress the trained model by quantizing it to 8-bit using the TensorFlow Lite converter.
