Design procedure of data science algorithms

Different learning systems usually follow the same design procedure: they start by acquiring the knowledge base, select the relevant explanatory features from the data, try a set of candidate learning algorithms while keeping an eye on the performance of each one, and finally run an evaluation process that measures how successful the training was.

In this section, we are going to address all these different design steps in more detail:

Figure 1.11: Model learning process outline

Data pre-processing

This component of the learning cycle represents the knowledge base of our algorithm. In order to help the learning algorithm make accurate decisions about unseen data, we need to provide this knowledge base in the best possible form. Thus, our data may need a lot of cleaning and pre-processing (conversions).

Data cleaning

Most datasets require this step, in which you get rid of errors, noise, and redundancies (a short cleaning sketch follows the list below). We need our data to be accurate, complete, reliable, and unbiased, as many problems can arise from using a bad knowledge base, such as:

  • Inaccurate and biased conclusions
  • Increased error
  • Reduced generalizability, which is the model's ability to perform well on unseen data that it wasn't trained on
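
The following is a minimal cleaning sketch using pandas on a small made-up table (the column names and values are illustrative assumptions, not data from this book): it removes a duplicate row, imputes a missing numeric value, and drops a row whose categorical field is missing.

    import numpy as np
    import pandas as pd

    # Tiny made-up table with a duplicate row, a missing number, and a missing category
    df = pd.DataFrame({
        'age':  [22, 38, np.nan, 35, 35],
        'fare': [7.25, 71.28, 8.05, 53.10, 53.10],
        'sex':  ['male', 'female', 'female', None, None],
    })

    df = df.drop_duplicates()                          # redundancy: remove duplicate rows
    df['age'] = df['age'].fillna(df['age'].median())   # incompleteness: impute missing ages
    df = df.dropna(subset=['sex'])                     # drop rows with a missing categorical field
    print(df)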

Data pre-processing

In this step, we apply some conversions to our data to make it consistent and concrete. There are many different conversions that you can consider while pre-processing your data (each is illustrated in the short sketch after this list):

  • Renaming (relabeling): This means converting categorical values to numbers, since categorical values are risky if used directly with some learning methods; keep in mind that the numbers impose an order between the values
  • Rescaling (normalization): Transforming/bounding continuous values to some range, typically [-1, 1] or [0, 1]
  • New features: Making up new features from the existing ones. For example, obesity-factor = weight/height
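
Here is a short sketch of these three conversions, assuming pandas and scikit-learn and a made-up weight/height table (nothing here is taken from the book's own code):

    import pandas as pd
    from sklearn.preprocessing import LabelEncoder, MinMaxScaler

    # Made-up sample with one categorical and two continuous columns
    df = pd.DataFrame({
        'sex':    ['male', 'female', 'female', 'male'],
        'weight': [85.0, 62.0, 70.0, 95.0],   # kg
        'height': [1.80, 1.65, 1.70, 1.75],   # m
    })

    # Renaming (relabeling): map categorical values to integer codes
    df['sex_code'] = LabelEncoder().fit_transform(df['sex'])

    # Rescaling (normalization): bound a continuous value to the range [0, 1]
    df['weight_scaled'] = MinMaxScaler().fit_transform(df[['weight']]).ravel()

    # New feature: derive obesity-factor = weight / height from existing columns
    df['obesity_factor'] = df['weight'] / df['height']
    print(df)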

Feature selection

The number of explanatory features (input variables) of a sample can be enormous: you get xi = (xi1, xi2, xi3, ..., xid) as a training sample (observation/example), where d is very large. An example of this is a document classification task, where you get 10,000 different words and the input variables are the numbers of occurrences of those words.

This enormous number of input variables can be problematic, and sometimes a curse, because we have many input variables but few training samples to help us in the learning procedure. To avoid this curse of having an enormous number of input variables (the curse of dimensionality), data scientists use dimensionality reduction techniques to select a subset of the input variables. For example, in the text classification task they can do the following (two of these options are sketched after the list):

  • Extracting relevant inputs (for instance, mutual information measure)
  • Principal component analysis (PCA)
  • Grouping (cluster) similar words (this uses a similarity measure)
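
As a rough illustration (not the book's code), the sketch below uses scikit-learn on synthetic data to compare two of these options: keeping the inputs with the highest mutual information with the target, and projecting onto the top principal components with PCA.

    from sklearn.datasets import make_classification
    from sklearn.decomposition import PCA
    from sklearn.feature_selection import SelectKBest, mutual_info_classif

    # Synthetic data: many input variables, comparatively few samples
    X, y = make_classification(n_samples=200, n_features=500, n_informative=20,
                               random_state=0)

    # Keep the 20 inputs that share the most mutual information with the target
    X_mi = SelectKBest(mutual_info_classif, k=20).fit_transform(X, y)

    # Or project all 500 inputs onto the top 20 principal components (PCA)
    X_pca = PCA(n_components=20).fit_transform(X)

    print(X.shape, X_mi.shape, X_pca.shape)   # (200, 500) (200, 20) (200, 20)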

Model selection

This step comes after selecting a proper subset of your input variables using a dimensionality reduction technique. Choosing the proper subset of input variables will make the rest of the learning process much simpler.

In this step, you are trying to figure out the right model to learn.

If you have prior experience with data science and with applying learning methods to different domains and kinds of data, then you will find this step easy: it requires prior knowledge of what your data looks like and which assumptions could fit its nature, and based on this you choose the proper learning method. If you don't have any prior knowledge, that's also fine, because you can do this step by guessing: try different learning methods with different parameter settings and choose the one that gives you better performance over the test set.

Also, initial data analysis and visualization will help you to make a good guess about the form of the distribution and nature of your data.
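
A minimal sketch of this guess-and-try loop, assuming scikit-learn and one of its bundled datasets (the candidate models and parameter settings are arbitrary choices for illustration, not recommendations from this book):

    from sklearn.datasets import load_breast_cancer
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25,
                                                        random_state=0)

    # A few candidate learning methods with guessed parameter settings
    candidates = {
        'logistic regression': LogisticRegression(max_iter=5000),
        'decision tree':       DecisionTreeClassifier(max_depth=5),
        'k-nearest neighbors': KNeighborsClassifier(n_neighbors=7),
    }

    for name, model in candidates.items():
        model.fit(X_train, y_train)
        print(name, 'test accuracy:', model.score(X_test, y_test))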

Learning process

By learning, we mean the optimization criterion that you are going to use to select the best model parameters. There are various optimization criteria for this:

  • Mean square error (MSE)
  • Maximum likelihood (ML) criterion
  • Maximum a posteriori probability (MAP)

The optimization problem may be hard to solve, but the right choice of model and error function makes a difference.
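
As a tiny illustration of the MSE criterion (a sketch on made-up data, not code from this book), the snippet below fits a line y = w*x + b with plain NumPy by running gradient descent on the mean square error:

    import numpy as np

    # Fit y = w*x + b on made-up data by minimizing the mean square error (MSE)
    rng = np.random.default_rng(0)
    x = rng.uniform(0, 1, size=100)
    y = 3.0 * x + 2.0 + rng.normal(scale=0.1, size=100)

    w, b, lr = 0.0, 0.0, 0.5
    for _ in range(2000):
        error = (w * x + b) - y
        mse = np.mean(error ** 2)          # the optimization criterion
        w -= lr * np.mean(2 * error * x)   # gradient of the MSE with respect to w
        b -= lr * np.mean(2 * error)       # gradient of the MSE with respect to b

    print('w ~= %.2f, b ~= %.2f, final MSE = %.4f' % (w, b, mse))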

Evaluating your model

In this step, we try to measure the generalization error of our model on unseen data. Since we only have the data at hand and don't know any unseen data beforehand, we can randomly select a test set from the data and never use it in the training process, so that it acts as valid unseen data. There are different ways you can evaluate the performance of the selected model:

  • Simple holdout method, which is dividing the data into training and testing sets
  • Other, more complex methods based on cross-validation and random subsampling

Our objective in this step is to compare the predictive performance of different models trained on the same data and choose the one with the better (smaller) testing error, which indicates a better generalization error over unseen data. You can also be more certain about the generalization error by using a statistical method to test the significance of your results.
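
The sketch below contrasts the two approaches using scikit-learn (the dataset and model are placeholders chosen for illustration): a single holdout split, and 5-fold cross-validation, which averages several splits for a steadier estimate of the generalization error.

    from sklearn.datasets import load_breast_cancer
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score, train_test_split

    X, y = load_breast_cancer(return_X_y=True)
    model = LogisticRegression(max_iter=5000)

    # Simple holdout: one split into training and testing sets
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3,
                                                        random_state=0)
    model.fit(X_train, y_train)
    print('holdout test accuracy:', model.score(X_test, y_test))

    # Cross-validation: average over several splits for a less noisy estimate
    scores = cross_val_score(model, X, y, cv=5)
    print('5-fold CV accuracy: %.3f +/- %.3f' % (scores.mean(), scores.std()))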
