Deep Learning with TensorFlow and Keras – 3rd edition
Deep Learning with TensorFlow and Keras – 3rd edition: Build and deploy supervised, unsupervised, deep, and reinforcement learning models, Third Edition

Dr. Amita Kapoor, Sujit Pal, Antonio Gulli
Paperback Oct 2022 698 pages 3rd Edition
Regression and Classification

Regression and classification are two fundamental tasks ubiquitously present in almost all machine learning applications. They find application in varied fields ranging from engineering, physical science, biology, and the financial market, to the social sciences. They are the fundamental tools in the hands of statisticians and data scientists. In this chapter, we will cover the following topics:

  • Regression
  • Classification
  • Difference between classification and regression
  • Linear regression
  • Different types of linear regression
  • Classification using the TensorFlow Keras API
  • Applying linear regression to estimate the price of a house
  • Applying logistic regression to identify handwritten digits

All the code files for this chapter can be found at https://packt.link/dltfchp2

Let us start by understanding what regression really is.

What is regression?

Regression is normally the first algorithm that people in machine learning work with. It allows us to make predictions from data by learning about the relationship between a given set of dependent and independent variables. It has its use in almost every field; anywhere that has an interest in drawing relationships between two or more things will find a use for regression.

Consider the case of house price estimation. There are many factors that can have an impact on the house price: the number of rooms, the floor area, the locality, the availability of amenities, the parking space, and so on. Regression analysis can help us in finding the mathematical relationship between these factors and the house price.

Let us imagine a simpler world where only the area of the house determines its price. Using regression, we could determine the relationship between the area of the house (independent variable: these are the variables that do not depend upon any other variables) and its price (dependent variable: these variables depend upon one or more independent variables). Later, we could use this relationship to predict the price of any house, given its area. To learn more about dependent and independent variables and how to identify them, you can refer to this post: https://medium.com/deeplearning-concepts-and-implementation/independent-and-dependent-variables-in-machine-learning-210b82f891db. In machine learning, the independent variables are normally input into the model and the dependent variables are output from our model.

Depending upon the number of independent variables, the number of dependent variables, and the relationship type, we have many different types of regression. There are two important components of regression: the relationship between independent and dependent variables, and the strength of impact of different independent variables on dependent variables. In the following section, we will learn in detail about the widely used linear regression technique.

Prediction using linear regression

Linear regression is one of the most widely known modeling techniques. Existing for more than 200 years, it has been explored from almost all possible angles. Linear regression assumes a linear relationship between the input variable (X) and the output variable (Y). The basic idea of linear regression is to use training data to build a model that predicts the output given the input, such that the predicted output $\hat{Y}$ is as near as possible to the observed training output Y for the input X. It involves finding a linear equation for the predicted value $\hat{Y}$ of the form:

$$\hat{Y} = W^T X + b$$

where $X = \{x_1, x_2, \ldots, x_n\}$ are the n input variables, and $W = \{w_1, w_2, \ldots, w_n\}$ are the linear coefficients, with b as the bias term. We can also expand the preceding equation to:

$$\hat{Y} = \sum_{i=1}^{n} x_i w_i + b$$

The bias term allows our regression model to provide an output even in the absence of any input; it provides us with an option to shift our data for a better fit. The error between the observed values (Y) and predicted values ($\hat{Y}$) for an input sample i is:

$$E_i = Y_i - \hat{Y}_i$$

The goal is to find the best estimates for the coefficients W and bias b, such that the error between the observed values Y and the predicted values $\hat{Y}$ is minimized. Let’s go through some examples to better understand this.
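As a minimal sketch of the linear model described above, the NumPy snippet below computes the predicted values and the per-sample errors for a tiny made-up dataset. The arrays X, W, and Y and the bias b are illustrative values chosen for this sketch, not data from the chapter.

```python
import numpy as np

# Hypothetical data: 3 samples, n = 2 input variables (illustrative values only)
X = np.array([[1.0, 2.0],
              [2.0, 0.5],
              [3.0, 1.0]])
W = np.array([0.5, -1.0])        # assumed linear coefficients
b = 2.0                          # assumed bias term
Y = np.array([0.5, 2.5, 2.0])    # assumed observed outputs

Y_hat = X @ W + b    # predicted value for each sample
E = Y - Y_hat        # error between observed and predicted value per sample
print(Y_hat, E)
```

Here the first two samples are predicted exactly and the third has a non-zero error; fitting the model means choosing W and b so that such errors are as small as possible overall.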

Simple linear regression

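If we consider only one independent variable and one dependent variable, what we get is a simple linear regression. Consider the case of house price prediction, defined in the preceding section; the area of the house (A) is the independent variable, and the price (Y) of the house is the dependent variable. We want to find a linear relationship between the predicted price $\hat{Y}$ and A, of the form:

$$\hat{Y} = A \cdot W + b$$

where W is the weight and b is the bias term. Thus, we need to determine W and b, such that the error between the price Y and the predicted price $\hat{Y}$ is minimized. The standard method used to estimate W and b is called the method of least squares, that is, we try to minimize the sum of the squares of the errors (S). For the preceding case, the expression becomes:

$$S(W, b) = \sum_{i=1}^{N} (Y_i - \hat{Y}_i)^2 = \sum_{i=1}^{N} (Y_i - A_i W - b)^2$$

We want to estimate the regression coefficients, W and b, such that S is minimized. We use the fact that the derivative of a function is 0 at its minimum to get these two equations:

$$\frac{\partial S}{\partial W} = -2 \sum_{i=1}^{N} (Y_i - A_i W - b) A_i = 0$$

$$\frac{\partial S}{\partial b} = -2 \sum_{i=1}^{N} (Y_i - A_i W - b) = 0$$

These two equations can be solved to find the two unknowns. To do so, we first expand the summation in the second equation:

$$\sum_{i=1}^{N} Y_i - \sum_{i=1}^{N} A_i W - \sum_{i=1}^{N} b = 0$$

Take a look at the last term on the left-hand side; it just sums a constant N times. Thus, we can rewrite it as:

$$\sum_{i=1}^{N} Y_i - W \sum_{i=1}^{N} A_i - N b = 0$$

Reordering the terms, we get:

$$b = \frac{1}{N} \sum_{i=1}^{N} Y_i - \frac{W}{N} \sum_{i=1}^{N} A_i$$

The two terms on the right-hand side can be replaced by $\bar{Y}$, the average price (output), and $\bar{A}$, the average area (input), respectively, and thus we get:

$$b = \bar{Y} - W \bar{A}$$

In a similar fashion, we expand the partial derivative of S with respect to the weight W:

$$\sum_{i=1}^{N} (Y_i A_i - W A_i^2 - b A_i) = 0$$

Substitute the expression for the bias term b:

$$\sum_{i=1}^{N} (Y_i A_i - W A_i^2 - (\bar{Y} - W \bar{A}) A_i) = 0$$

Reordering the equation:

$$\sum_{i=1}^{N} (Y_i A_i - \bar{Y} A_i) - W \sum_{i=1}^{N} (A_i^2 - \bar{A} A_i) = 0$$

Playing around with the mean definition, we can get from this the value of weight W as:

$$W = \frac{\sum_{i=1}^{N} Y_i (A_i - \bar{A})}{\sum_{i=1}^{N} (A_i - \bar{A})^2}$$

where $\bar{Y}$ and $\bar{A}$ are the average price and area, respectively. Let us try this on some simple sample data: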

  1. We import the necessary modules. It is a simple example, so we’ll be using only NumPy, pandas, and Matplotlib:
    import numpy as np
    import matplotlib.pyplot as plt
    import pandas as pd
    
  2. Next, we generate random data with a linear relationship. To make it more realistic, we also add a random noise element. You can see the two variables (the cause, area, and the effect, price) follow a positive linear dependence:
    # Generate random data
    np.random.seed(0)
    area = 2.5 * np.random.randn(100) + 25
    price = 25 * area + 5 + np.random.randint(20,50, size = len(area))
    data = np.array([area, price])
    data = pd.DataFrame(data = data.T, columns=['area','price'])
    plt.scatter(data['area'], data['price'])
    plt.show()
    

Figure 2.1: Scatter plot between the area of the house and its price

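  3. Now, we calculate the two regression coefficients using the equations we defined. The estimated weight is very near the slope of 25 that we simulated; the estimated bias is larger than 5 because the added noise is always positive (an integer between 20 and 49), which shifts the intercept upward: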
    W = sum(price*(area-np.mean(area))) / sum((area-np.mean(area))**2)
    b = np.mean(price) - W*np.mean(area)
    print("The regression coefficients are", W,b)
    
    -----------------------------------------------
    The regression coefficients are 24.815544052284988 43.4989785533412
    
  4. Let us now try predicting the new prices using the obtained weight and bias values:
    y_pred = W * area + b
    
  5. Next, we plot the predicted prices along with the actual price. You can see that predicted prices follow a linear relationship with the area:
    plt.plot(area, y_pred, color='red',label="Predicted Price")
    plt.scatter(data['area'], data['price'], label="Training Data")
    plt.xlabel("Area")
    plt.ylabel("Price")
    plt.legend()
    

    Figure 2.2: Predicted values vs the actual price

From Figure 2.2, we can see that the predicted values follow the same trend as the actual house prices.
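As a quick sanity check on the closed-form estimates used in the steps above, the sketch below recomputes W and b on the same synthetic data and compares them with NumPy's `np.polyfit`, which fits a degree-1 polynomial by least squares. Using `np.polyfit` here is a choice made for this sketch; it is not part of the chapter's code.

```python
import numpy as np

# Same synthetic data as in the steps above
np.random.seed(0)
area = 2.5 * np.random.randn(100) + 25
price = 25 * area + 5 + np.random.randint(20, 50, size=len(area))

# Closed-form least-squares estimates of the weight and bias
W = np.sum(price * (area - area.mean())) / np.sum((area - area.mean()) ** 2)
b = price.mean() - W * area.mean()

# Independent check: np.polyfit returns the slope first, then the intercept
slope, intercept = np.polyfit(area, price, deg=1)
print(W, b)
print(slope, intercept)
```

Both approaches minimize the same sum of squared errors, so the two pairs of numbers should agree up to floating-point precision.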

Multiple linear regression

The preceding example was simple, but real problems rarely are. In most problems, the dependent variable depends upon multiple independent variables. Multiple linear regression finds a linear relationship between the many independent input variables (X) and the dependent output variable (Y), such that the predicted value $\hat{Y}$ is of the form:

$$\hat{Y} = W^T X + b$$

where $X = \{x_1, x_2, \ldots, x_n\}$ are the n independent input variables, and $W = \{w_1, w_2, \ldots, w_n\}$ are the linear coefficients, with b as the bias term.

As before, the linear coefficients W are estimated using the method of least squares, that is, minimizing the sum of squared differences between predicted values ($\hat{Y}$) and observed values (Y). Thus, we try to minimize the loss function (also called the squared error; if we divide it by the number of training samples, it is the mean squared error):

$$loss = \sum_{i} (Y_i - \hat{Y}_i)^2$$

where the sum is over all the training samples.

As you might have guessed, now, instead of two, we will have n+1 equations, which we will need to simultaneously solve. An easier alternative will be to use the TensorFlow Keras API. We will learn shortly how to use the TensorFlow Keras API to perform the task of regression.
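Before turning to Keras, it may help to see the n+1 unknowns solved directly. The sketch below is one possible way to do this with NumPy's `np.linalg.lstsq`; the data, the coefficient values in `true_W` and `true_b`, and the sample sizes are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N, n = 200, 3                          # assumed: 200 samples, 3 input variables
X = rng.normal(size=(N, n))
true_W = np.array([2.0, -1.0, 0.5])    # made-up coefficients
true_b = 4.0                           # made-up bias
Y = X @ true_W + true_b + 0.01 * rng.normal(size=N)

# Append a column of ones so the bias is estimated as one more coefficient
X_aug = np.hstack([X, np.ones((N, 1))])

# Least-squares solution for the n + 1 unknowns (n weights and the bias)
coeffs, *_ = np.linalg.lstsq(X_aug, Y, rcond=None)
W_est, b_est = coeffs[:-1], coeffs[-1]
print(W_est, b_est)
```

With little noise, the estimates should land very close to the made-up coefficients used to generate the data.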

Multivariate linear regression

There can be cases where the independent variables affect more than one dependent variable. For example, consider the case where we want to predict a rocket’s speed and its carbon dioxide emission; these two will now be our dependent variables, and both will be affected by the sensors reading the fuel amount, engine type, rocket body, and so on. This is a case of multivariate linear regression. Mathematically, a multivariate regression model can be represented as:

$$\hat{Y}_{ij} = w_{0j} + \sum_{k=1}^{p} w_{kj} x_{ik}$$

where $i \in [1, \ldots, n]$ and $j \in [1, \ldots, m]$. The term $\hat{Y}_{ij}$ represents the jth predicted output value corresponding to the ith input sample, w represents the regression coefficients, and $x_{ik}$ is the kth of the p features of the ith input sample. The number of equations needed to solve in this case will now be n x m. While we can solve these equations using matrices, the process will be computationally expensive as it will involve calculating the inverse and determinants. An easier way would be to use gradient descent with the sum of squared errors as the loss function and to use one of the many optimizers that the TensorFlow API includes.
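To make the gradient descent idea concrete, here is a minimal NumPy sketch for a multivariate model with m = 2 outputs. It uses plain batch gradient descent on the mean squared error (the sum of squared errors divided by the number of samples) rather than a TensorFlow optimizer, and the data, sizes, and learning rate are assumptions made for this sketch.

```python
import numpy as np

rng = np.random.default_rng(1)
n_samples, p, m = 500, 3, 2            # assumed sizes: samples, features, outputs
X = rng.normal(size=(n_samples, p))
true_W = rng.normal(size=(p, m))       # made-up coefficients w_kj
true_b = np.array([1.0, -2.0])         # made-up intercepts w_0j
Y = X @ true_W + true_b

W = np.zeros((p, m))
b = np.zeros(m)
lr = 0.1                               # learning rate (assumed)

for _ in range(500):
    Y_hat = X @ W + b
    err = Y_hat - Y
    # Gradients of the mean squared error with respect to W and b
    grad_W = 2 * X.T @ err / n_samples
    grad_b = 2 * err.mean(axis=0)
    W -= lr * grad_W
    b -= lr * grad_b

print(np.abs(W - true_W).max(), np.abs(b - true_b).max())
```

No matrix inverse is computed at any point; each step only needs matrix products, which is what makes this approach attractive for larger problems.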

In the next section, we will delve deeper into the TensorFlow Keras API, a versatile higher-level API to develop your model with ease.

Neural networks for linear regression

In the preceding sections, we used mathematical expressions for calculating the coefficients of a linear regression equation. In this section, we will see how we can use neural networks to perform the task of regression and build a neural network model using the TensorFlow Keras API.

Before performing regression using neural networks, let us first review what a neural network is. Simply speaking, a neural network is a network of many artificial neurons. From Chapter 1, Neural Network Foundations with TF, we know that the simplest neural network, the (simple) perceptron, can be mathematically represented as:

$$y = f(W^T x + b)$$

where f is the activation function. If f is a linear function, then the above expression is similar to the expression of linear regression that we learned in the previous section. In other words, we can say that a neural network, which is also called a function approximator, is a generalized regressor. Let us try to build a simple neural network regressor next using the TensorFlow Keras API.
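The relationship between a perceptron and linear regression can be sketched in a few lines of NumPy. The helper `perceptron` and the input, weight, and bias values below are hypothetical and exist only for this illustration.

```python
import numpy as np

def perceptron(x, W, b, f=lambda z: z):
    """Single neuron: apply the activation f to the weighted sum W.x + b."""
    return f(np.dot(W, x) + b)

x = np.array([1.0, 2.0, 3.0])       # illustrative input
W = np.array([0.2, -0.4, 0.1])      # illustrative weights
b = 0.1                             # illustrative bias

# With the identity activation, the neuron is exactly a linear regression model
linear_out = perceptron(x, W, b)

# With a non-linear activation such as ReLU, the output is no longer linear in x
relu_out = perceptron(x, W, b, f=lambda z: np.maximum(z, 0.0))
print(linear_out, relu_out)
```

The only difference between the two calls is the choice of f, which is why a neural network can be seen as a generalization of the linear regressor.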

Simple linear regression using TensorFlow Keras

In the first chapter, we learned about how to build a model in TensorFlow Keras. Here, we will use the same Sequential API to build a single-layered perceptron (fully connected neural network) using the Dense class. We will continue with the same problem, that is, predicting the price of a house given its area:

  1. We start with importing the packages we will need. Notice the addition of the Keras module and the Dense layer in importing packages:
    import tensorflow as tf
    import numpy as np
    import matplotlib.pyplot as plt
    import pandas as pd
    import tensorflow.keras as K
    from tensorflow.keras.layers import Dense
    
  2. Next, we generate the data, as in the previous case:
    # Generate random data
    np.random.seed(0)
    area = 2.5 * np.random.randn(100) + 25
    price = 25 * area + 5 + np.random.randint(20,50, size = len(area))
    data = np.array([area, price])
    data = pd.DataFrame(data = data.T, columns=['area','price'])
    plt.scatter(data['area'], data['price'])
    plt.show()
    
  3. The input to neural networks should be normalized; this is because input gets multiplied with weights, and if we have very large numbers, the result of multiplication will be large, and soon our metrics may overflow (exceed the largest number your computer can represent):
    data = (data - data.min()) / (data.max() - data.min())  #Normalize
    
  4. Let us now build the model; since it is a simple linear regressor, we use a Dense layer with only one unit:
    model = K.Sequential([
                          Dense(1, input_shape = [1,], activation=None)
    ])
    model.summary()
    
    Model: "sequential"
    ____________________________________________________________
     Layer (type)           Output Shape              Param #   
    ============================================================
     dense (Dense)          (None, 1)                 2         
                                                                
    ============================================================
    Total params: 2
    Trainable params: 2
    Non-trainable params: 0
    ____________________________________________________________
    
  5. To train a model, we will need to define the loss function and optimizer. The loss function defines the quantity that our model tries to minimize, and the optimizer decides the minimization algorithm we are using. Additionally, we can also define metrics, which are the quantities we want to log as the model is trained. We define the loss function and optimizer (see Chapter 1, Neural Network Foundations with TF) using the compile function; here, we do not pass any additional metrics:
    model.compile(loss='mean_squared_error', optimizer='sgd')
    
  6. Now that the model is defined, we just need to train it using the fit function. Observe that we are using a batch_size of 32 and splitting the data into training and validation datasets using the validation_split argument of the fit function:
    model.fit(x=data['area'],y=data['price'], epochs=100, batch_size=32, verbose=1, validation_split=0.2)
    
    Epoch 1/100
    3/3 [==============================] - 0s 78ms/step - loss: 1.2643 - val_loss: 1.4828
    Epoch 2/100
    3/3 [==============================] - 0s 13ms/step - loss: 1.0987 - val_loss: 1.3029
    Epoch 3/100
    3/3 [==============================] - 0s 13ms/step - loss: 0.9576 - val_loss: 1.1494
    Epoch 4/100
    3/3 [==============================] - 0s 16ms/step - loss: 0.8376 - val_loss: 1.0156
    Epoch 5/100
    3/3 [==============================] - 0s 15ms/step - loss: 0.7339 - val_loss: 0.8971
    Epoch 6/100
    3/3 [==============================] - 0s 16ms/step - loss: 0.6444 - val_loss: 0.7989
    Epoch 7/100
    3/3 [==============================] - 0s 14ms/step - loss: 0.5689 - val_loss: 0.7082
    .
    .
    .
    Epoch 96/100
    3/3 [==============================] - 0s 22ms/step - loss: 0.0827 - val_loss: 0.0755
    Epoch 97/100
    3/3 [==============================] - 0s 17ms/step - loss: 0.0824 - val_loss: 0.0750
    Epoch 98/100
    3/3 [==============================] - 0s 14ms/step - loss: 0.0821 - val_loss: 0.0747
    Epoch 99/100
    3/3 [==============================] - 0s 21ms/step - loss: 0.0818 - val_loss: 0.0740
    Epoch 100/100
    3/3 [==============================] - 0s 15ms/step - loss: 0.0815 - val_loss: 0.0740
    <keras.callbacks.History at 0x7f7228d6a790>
    
  7. Well, you have successfully trained a neural network to perform the task of linear regression. The mean squared error after training for 100 epochs is 0.0815 on training data and 0.074 on validation data. We can get the predicted value for a given input using the predict function:
    y_pred = model.predict(data['area'])
    
  8. Next, we plot a graph of the predicted and the actual data:
    plt.plot(data['area'], y_pred, color='red',label="Predicted Price")
    plt.scatter(data['area'], data['price'], label="Training Data")
    plt.xlabel("Area")
    plt.ylabel("Price")
    plt.legend()
    
  9. Figure 2.3 shows the plot between the predicted data and the actual data. You can see that, just like the linear regressor, we have got a nice linear fit:

Figure 2.3: Predicted price vs actual price

  10. In case you are interested in knowing the coefficients W and b, you can print the weights of the model using model.weights:
    [<tf.Variable 'dense/kernel:0' shape=(1, 1) dtype=float32, numpy=array([[-0.33806288]], dtype=float32)>,
    <tf.Variable 'dense/bias:0' shape=(1,) dtype=float32, numpy=array([0.68142694], dtype=float32)>]
    

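In the result above, the kernel holds the weight W (about -0.338 in this run) and the bias variable holds b (about 0.681). These values apply to the normalized data and, because the weights are randomly initialized, will vary from run to run. Thus, using linear regression, we can find a linear relationship between the house price and its area. In the next section, we explore multiple and multivariate linear regression using the TensorFlow Keras API.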
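Because the model was trained on min-max normalized data, its weight and bias are expressed in normalized units. The sketch below shows one way to map such coefficients back to the original units. To keep it free of TensorFlow, the hypothetical helper `fit_line` (the closed-form least-squares fit from earlier in the chapter) stands in for the trained Keras model.

```python
import numpy as np

np.random.seed(0)
area = 2.5 * np.random.randn(100) + 25
price = 25 * area + 5 + np.random.randint(20, 50, size=len(area))

def fit_line(x, y):
    """Closed-form simple linear regression (stand-in for the trained model)."""
    W = np.sum(y * (x - x.mean())) / np.sum((x - x.mean()) ** 2)
    return W, y.mean() - W * x.mean()

# Min-max normalize both columns, as in the normalization step above
x_min, x_rng = area.min(), area.max() - area.min()
y_min, y_rng = price.min(), price.max() - price.min()
W_n, b_n = fit_line((area - x_min) / x_rng, (price - y_min) / y_rng)

# Map the normalized coefficients back to the original units
W = W_n * y_rng / x_rng
b = y_min + y_rng * b_n - W * x_min
print(W, b)
```

The conversion follows from substituting the two normalization formulas into the normalized model and solving for the price in its original units.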

Multiple and multivariate linear regression using the TensorFlow Keras API

The example in the previous section had only one independent variable, the area of the house, and one dependent variable, the price of the house. However, problems in real life are not that simple; we may have more than one independent variable, and we may need to predict more than one dependent variable. As you must have realized from the discussion on multiple and multivariate regression, they involve solving multiple equations. We can make our tasks easier by using the Keras API for both tasks.

Additionally, we can have more than one neural network layer, that is, we can build a deep neural network. A deep neural network is like applying multiple function approximators:

$$f(x) = f_L(f_{L-1}(\ldots f_1(x) \ldots))$$

with $f_L$ being the function at layer L. From the expression above, we can see that if f were a linear function, adding multiple layers to a neural network would not be useful, because a composition of linear functions is itself linear; however, using a non-linear activation function (see Chapter 1, Neural Network Foundations with TF, for more details) allows us to apply neural networks to the regression problems where dependent and independent variables are related in some non-linear fashion. In this section, we will use a deep neural network, built using TensorFlow Keras, to predict the fuel efficiency of a car, given its number of cylinders, displacement, acceleration, and so on. The data we use is available from the UCI ML repository (Blake, C., & Merz, C. (1998), UCI repository of machine learning databases: http://www.ics.uci.edu/~mlearn/MLRepository.html):

  1. We start by importing the modules that we will need. In the previous example, we normalized our data using the DataFrame operations. In this example, we will make use of the Keras Normalization layer. The Normalization layer shifts the data to a zero mean and one standard deviation. Also, since we have more than one independent variable, we will use Seaborn to visualize the relationship between different variables:
    import tensorflow as tf
    import numpy as np
    import matplotlib.pyplot as plt
    import pandas as pd
    import tensorflow.keras as K
    from tensorflow.keras.layers import Dense, Normalization
    import seaborn as sns
    
  2. Let us first download the data from the UCI ML repo.
    url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data'
    column_names = ['mpg', 'cylinders', 'displacement', 'horsepower', 'weight', 'acceleration', 'model_year', 'origin']
    data = pd.read_csv(url, names=column_names, na_values='?', comment='\t', sep=' ', skipinitialspace=True)
    
  3. The data consists of eight columns: mpg, cylinders, displacement, horsepower, weight, acceleration, model year, and origin. Though the origin of the vehicle can also affect the fuel efficiency “mpg” (miles per gallon), we drop it and use only the remaining six features to predict the mpg value. Also, we drop any rows with NaN values:
    data = data.drop(columns='origin')
    print(data.isna().sum())
    data = data.dropna()
    
  4. We divide the dataset into training and test datasets. Here, we are keeping 80% of the 392 datapoints as training data and 20% as test dataset:
    train_dataset = data.sample(frac=0.8, random_state=0)
    test_dataset = data.drop(train_dataset.index)
    
  5. Next, we use Seaborn’s pairplot to visualize the relationship between the different variables:
    sns.pairplot(train_dataset[['mpg', 'cylinders', 'displacement','horsepower', 'weight', 'acceleration', 'model_year']], diag_kind='kde')
    
  6. We can see that mpg (fuel efficiency) has dependencies on all the other variables, and the dependency relationship is non-linear, as none of the curves are linear:

Figure 2.4: Relationship among different variables of auto-mpg data

  7. For convenience, we also separate the variables into input variables and the label that we want to predict:
    train_features = train_dataset.copy()
    test_features = test_dataset.copy() 
    train_labels = train_features.pop('mpg')
    test_labels = test_features.pop('mpg')
    
  8. Now, we use the Normalization layer of Keras to normalize our data. Note that while we normalize our inputs to have mean 0 and standard deviation 1, the output prediction 'mpg' remains as it is:
    # Normalize
    data_normalizer = Normalization(axis=1)
    data_normalizer.adapt(np.array(train_features))
    
  9. We build our model. The model has two hidden layers, with 64 and 32 neurons, respectively. For the hidden layers, we have used Rectified Linear Unit (ReLU) as our activation function; this should help in approximating the non-linear relation between fuel efficiency and the rest of the variables:
    model = K.Sequential([
        data_normalizer,
        Dense(64, activation='relu'),
        Dense(32, activation='relu'),
        Dense(1, activation=None)
    ])
    model.summary()
    
  10. Earlier, we used stochastic gradient descent as the optimizer; this time, we try the Adam optimizer (see Chapter 1, Neural Network Foundations with TF, for more details). The loss function we chose for the regression is the mean squared error again:
    model.compile(optimizer='adam', loss='mean_squared_error')
    
  11. Next, we train the model for 100 epochs:
    history = model.fit(x=train_features,y=train_labels, epochs=100, verbose=1, validation_split=0.2)
    
  12. Cool, now that the model is trained, we can check if our model is overfitted, underfitted, or properly fitted by plotting the loss curve. Both validation loss and training loss are near each other as we increase the training epochs; this suggests that our model is properly trained:
    plt.plot(history.history['loss'], label='loss')
    plt.plot(history.history['val_loss'], label='val_loss')
    plt.xlabel('Epoch')
    plt.ylabel('Error [MPG]')
    plt.legend()
    plt.grid(True)
    

Figure 2.5: Model error

  13. Let us finally compare the predicted fuel efficiency and the true fuel efficiency on the test dataset. Remember that the model has never seen the test dataset, thus this prediction reflects the model’s ability to generalize the relationship between inputs and fuel efficiency. If the model has learned the relationship well, the points should lie close to the diagonal line:
    y_pred = model.predict(test_features).flatten()
    a = plt.axes(aspect='equal')
    plt.scatter(test_labels, y_pred)
    plt.xlabel('True Values [MPG]')
    plt.ylabel('Predictions [MPG]')
    lims = [0, 50]
    plt.xlim(lims)
    plt.ylim(lims)
    plt.plot(lims, lims)
    

Figure 2.6: Plot between predicted fuel efficiency and actual values

  14. Additionally, we can also plot the error between the predicted and true fuel efficiency:
    error = y_pred - test_labels
    plt.hist(error, bins=30)
    plt.xlabel('Prediction Error [MPG]')
    plt.ylabel('Count')
    

Figure 2.7: Prediction error
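Returning to the remark at the start of this section that stacking layers is not useful when f is linear, the following NumPy sketch illustrates why. Two layers without activation collapse into a single linear layer, while inserting ReLU between them breaks that equivalence. The layer sizes mirror the 6 input features and 64 hidden units used above, but the weights are random and hypothetical, not the trained ones.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 6))              # 5 samples, 6 features (illustrative)

# Two stacked layers with no activation (hypothetical random weights)
W1, b1 = rng.normal(size=(6, 64)), rng.normal(size=64)
W2, b2 = rng.normal(size=(64, 1)), rng.normal(size=1)
two_layers = (x @ W1 + b1) @ W2 + b2

# The same mapping written as a single linear layer
W_eq = W1 @ W2
b_eq = b1 @ W2 + b2
one_layer = x @ W_eq + b_eq

# With ReLU between the layers, the collapse no longer holds
with_relu = np.maximum(x @ W1 + b1, 0.0) @ W2 + b2
print(np.allclose(two_layers, one_layer), np.allclose(with_relu, one_layer))
```

This is the reason the hidden layers in the model above use a non-linear activation while only the output layer is left linear.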

In case we want to make more than one prediction, that is, dealing with a multivariate regression problem, the only change would be that instead of one unit in the last dense layer, we will have as many units as the number of variables to be predicted. Consider, for example, we want to build a model which takes into account a student’s SAT score, attendance, and some family parameters, and wants to predict the GPA score for all four undergraduate years; then we will have the output layer with four units. Now that you are familiar with regression, let us move toward the classification tasks.
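The effect of widening the output layer can be sketched with NumPy. The example assumes a hypothetical batch of 8 activations coming from a 32-unit hidden layer and an output layer with four units, one per predicted GPA value; the helper `dense_param_count` is defined here only for illustration.

```python
import numpy as np

def dense_param_count(n_inputs, n_units):
    """Number of weights plus biases in a fully connected layer."""
    return n_inputs * n_units + n_units

rng = np.random.default_rng(0)
hidden = rng.normal(size=(8, 32))    # hypothetical batch of 8 hidden activations

# Output layer with four units, for example one GPA prediction per year
W_out, b_out = rng.normal(size=(32, 4)), np.zeros(4)
predictions = hidden @ W_out + b_out

print(predictions.shape)
print(dense_param_count(32, 1), dense_param_count(32, 4))
```

Each sample now yields four predicted values, and only the output layer grows in size; the rest of the network is unchanged.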

Classification tasks and decision boundaries

Till now, the focus of the chapter was on regression. In this section, we will talk about another important task: the task of classification. Let us first understand the difference between regression (also sometimes referred to as prediction) and classification:

  • In classification, the data is grouped into classes/categories, while in regression, the aim is to get a continuous numerical value for given data. For example, identifying handwritten digits is a classification task; every handwritten digit belongs to one of the ten classes from 0 to 9. The task of predicting the price of a house depending upon different input variables is a regression task.
  • In a classification task, the model finds the decision boundaries separating one class from another. In the regression task, the model approximates a function that fits the input-output relationship.
  • Classification can be viewed as a special case of regression in which the predicted quantity is a class; regression is much more general.

Figure 2.8 shows how classification and regression tasks differ. In classification, we need to find a line (or a plane or hyperplane in multidimensional space) separating the classes. In regression, the aim is to find a line (or plane or hyperplane) that fits the given input points:

Figure 2.8: Classification vs regression
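The contrast can also be sketched in code. In the NumPy example below, a hand-picked line is first used as a decision boundary that assigns each point to a class, and then a line is fitted through the same points as a regression. The points, the weights w, and the offset b are made-up values for illustration.

```python
import numpy as np

# Hypothetical 2-D points and a hand-picked boundary w.x + b = 0
points = np.array([[1.0, 1.0], [2.0, 3.0], [4.0, 1.0], [5.0, 4.0]])
w = np.array([1.0, 1.0])
b = -4.5

# Classification: the sign of the score tells us the side of the boundary
scores = points @ w + b
labels = (scores > 0).astype(int)

# Regression: fit a line y = m * x + c through the same points instead
m, c = np.polyfit(points[:, 0], points[:, 1], deg=1)
fitted = m * points[:, 0] + c        # continuous predictions
print(labels, fitted)
```

The classifier returns discrete labels, whereas the regression returns a continuous value for every input.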

In the following section, we will explain logistic regression, which is a very common and useful classification technique.

Logistic regression

Logistic regression is used to determine the probability of an event. Conventionally, the event is represented as a categorical dependent variable. The probability of the event is expressed using the sigmoid (or logistic) function, which is the inverse of the logit function:

$$P(\hat{Y} = 1 \mid X = x) = \frac{1}{1 + e^{-(b + W^T x)}}$$

The goal now is to estimate the weights $W = \{w_1, w_2, \ldots, w_n\}$ and the bias term b. In logistic regression, the coefficients are estimated using either the maximum likelihood estimator or stochastic gradient descent. If p is the total number of input data points, the loss is conventionally defined as a cross-entropy term given by:

$$loss = -\sum_{i=1}^{p} \left[ Y_i \log(\hat{Y}_i) + (1 - Y_i) \log(1 - \hat{Y}_i) \right]$$

Logistic regression is used in classification problems. For example, when looking at medical data, we can use logistic regression to classify whether a person has cancer or not. If the output categorical variable has more than two levels, we can use multinomial logistic regression. Another common technique used for more than two output classes is one versus all.
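For the two-class case, the sigmoid probability and the cross-entropy loss described above can be sketched in NumPy as follows. The one-feature dataset, the learning rate of 0.1, and the 200 iterations are assumptions chosen for this sketch, and plain batch gradient descent is used in place of a library optimizer.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Sum of the cross-entropy terms over all data points."""
    y_prob = np.clip(y_prob, eps, 1 - eps)    # avoid log(0)
    return -np.sum(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))

# Illustrative one-feature data: label 1 tends to occur for larger x
x = np.array([-2.0, -1.0, 0.5, 1.5, 3.0])
y = np.array([0, 0, 1, 1, 1])

W, b = 0.0, 0.0
initial_loss = binary_cross_entropy(y, sigmoid(W * x + b))
for _ in range(200):
    p = sigmoid(W * x + b)
    # Gradient of the cross-entropy loss for a sigmoid output
    W -= 0.1 * np.sum((p - y) * x)
    b -= 0.1 * np.sum(p - y)
final_loss = binary_cross_entropy(y, sigmoid(W * x + b))
print(initial_loss, final_loss)
```

Thresholding the predicted probability at 0.5 then turns the model into a classifier.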

For multiclass logistic regression, the cross-entropy loss function is modified as:

$$loss = -\sum_{i=1}^{p} \sum_{j=1}^{K} Y_{ij} \log(\hat{Y}_{ij})$$

where K is the total number of classes. You can read more about logistic regression at https://en.wikipedia.org/wiki/Logistic_regression.
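The multiclass loss can be sketched in NumPy as well. The example below assumes that class scores are turned into probabilities with a softmax, which is the usual choice for multinomial logistic regression but is not spelled out in the text above; the scores and labels are illustrative.

```python
import numpy as np

def softmax(logits):
    shifted = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def categorical_cross_entropy(y_onehot, y_prob, eps=1e-12):
    """Cross-entropy summed over all data points and classes."""
    return -np.sum(y_onehot * np.log(np.clip(y_prob, eps, 1.0)))

# Illustrative scores for p = 2 data points and K = 3 classes
logits = np.array([[2.0, 1.0, 0.1],
                   [0.0, 0.0, 0.0]])
labels = np.array([0, 2])
y_onehot = np.eye(3)[labels]          # one-hot encoding of the integer labels

probs = softmax(logits)
loss = categorical_cross_entropy(y_onehot, probs)
print(probs, loss)
```

Because the labels are one-hot, only the log-probability of the correct class contributes to the loss for each data point.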

Now that you have some idea about logistic regression, let us see how we can apply it to any dataset.

Logistic regression on the MNIST dataset

Next, we will use TensorFlow Keras to classify handwritten digits using logistic regression. We will be using the MNIST (Modified National Institute of Standards and Technology) dataset. For those working in the field of deep learning, MNIST is not new; it is like the ABC of machine learning. It contains images of handwritten digits and a label for each image, indicating which digit it is. The label contains a value from 0 to 9, depending on the handwritten digit. Thus, it is a multiclass classification problem.

To implement the logistic regression, we will make a model with only one dense layer. Each class will be represented by a unit in the output, so since we have 10 classes, the number of units in the output would be 10. The probability function used in the logistic regression is similar to the sigmoid activation function; therefore, we use sigmoid activation.
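As a rough sketch of what this one-layer model computes, the NumPy code below flattens MNIST-sized inputs and passes them through a single layer of 10 sigmoid units. The images are random stand-ins and the weights are untrained, hypothetical values; only the shapes and the parameter count carry over to the Keras model.

```python
import numpy as np

rng = np.random.default_rng(0)
images = rng.random(size=(4, 28, 28))         # stand-in for four 28 x 28 images

# Flatten each image: (4, 28, 28) -> (4, 784)
flat = images.reshape(len(images), -1)

# One dense layer with 10 units (hypothetical random weights, not trained)
W = rng.normal(scale=0.01, size=(784, 10))
b = np.zeros(10)
outputs = 1.0 / (1.0 + np.exp(-(flat @ W + b)))   # sigmoid activation per unit

n_params = W.size + b.size                    # 784 * 10 weights + 10 biases
predicted_digit = outputs.argmax(axis=1)      # most active unit per image
print(flat.shape, outputs.shape, n_params)
```

The parameter count of 7,850 comes entirely from the dense layer, since flattening has no weights of its own.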

Let us build our model:

  1. The first step is, as always, importing the modules needed. Notice that here we are using another useful layer from the Keras API, the Flatten layer. The Flatten layer helps us to reshape the 28 x 28 two-dimensional input images of the MNIST dataset into a flattened array of 784 values:
    import tensorflow as tf
    import numpy as np
    import matplotlib.pyplot as plt
    import pandas as pd
    import tensorflow.keras as K
    from tensorflow.keras.layers import Dense, Flatten
    
  2. We take the input data of MNIST from the tensorflow.keras dataset:
    ((train_data, train_labels),(test_data, test_labels)) = tf.keras.datasets.mnist.load_data()
    
  3. Next, we preprocess the data. We normalize the images; the MNIST dataset images are grayscale images with the intensity value of each pixel lying between 0 and 255. We divide it by 255, so that the values lie between 0 and 1:
    train_data = train_data/np.float32(255)
    train_labels = train_labels.astype(np.int32)  
    test_data = test_data/np.float32(255)
    test_labels = test_labels.astype(np.int32)
    
  4. Now, we define a very simple model; it has only one Dense layer with 10 units, and it takes an input of size 784. You can see from the output of the model summary that only the Dense layer has trainable parameters:
    model = K.Sequential([
        Flatten(input_shape=(28, 28)),
        Dense(10, activation='sigmoid')
    ])
    model.summary()
    
    Model: "sequential"
    ____________________________________________________________
     Layer (type)           Output Shape              Param #   
    ============================================================
     flatten (Flatten)      (None, 784)               0         
                                                                
     dense (Dense)          (None, 10)                7850      
                                                                
    ============================================================
    Total params: 7,850
    Trainable params: 7,850
    Non-trainable params: 0
    ____________________________________________________________
    
  5. Since the labels are integers rather than one-hot vectors, we will use the SparseCategoricalCrossentropy loss with from_logits set to True. Note that from_logits=True makes the loss apply a softmax to the model outputs, so here the softmax is computed over the sigmoid activations of the Dense layer rather than over raw logits. The optimizer selected is Adam. Additionally, we log accuracy as a metric while the model is trained. We train our model for 50 epochs, with a train-validation split of 80:20:
    model.compile(
        optimizer='adam',
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=['accuracy'])
    history = model.fit(x=train_data, y=train_labels, epochs=50,
                        verbose=1, validation_split=0.2)
    
  6. Let us see how our simple model has fared by plotting the loss. The training loss keeps decreasing while the validation loss increases; the two curves diverge, which indicates that the model is overfitting. You can improve the model performance by adding hidden layers:
    plt.plot(history.history['loss'], label='loss')
    plt.plot(history.history['val_loss'], label='val_loss')
    plt.xlabel('Epoch')
    plt.ylabel('Loss')
    plt.legend()
    plt.grid(True)
    

Figure 2.9: Loss plot
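The preprocessing, flattening, and loss computation in the steps above can be sketched in plain NumPy. This is an illustrative sketch only: the images, labels, and weights below are random stand-ins rather than MNIST data or trained parameters, and the helper `sparse_categorical_crossentropy` is a hypothetical hand-written function, not the Keras implementation. It shows how integer labels index directly into the log-probabilities obtained from raw logits:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a small batch of MNIST images and integer labels
images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)
labels = np.array([3, 0, 9, 4], dtype=np.int32)

# Scale pixel intensities from the 0-255 range to the 0-1 range
scaled = images / np.float32(255)

# What the Flatten layer does: (batch, 28, 28) -> (batch, 784)
flat = scaled.reshape(len(scaled), -1)

# A single dense layer with 784 inputs and 10 units (random, untrained)
weights = rng.normal(scale=0.01, size=(784, 10))
bias = np.zeros(10)
logits = flat @ weights + bias

def sparse_categorical_crossentropy(logits, labels):
    """Mean cross-entropy for integer labels, computed from raw logits."""
    # Log-softmax, shifted by the row maximum for numerical stability
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Pick the log-probability of the true class for each sample
    return -log_probs[np.arange(len(labels)), labels].mean()

loss = sparse_categorical_crossentropy(logits, labels)
print(flat.shape, float(loss))
```

With untrained weights close to zero, the loss is near ln(10), the value for a uniform guess over 10 classes; training drives it down from there.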

  1. To better understand the result, we build two utility functions; these functions help us in visualizing the handwritten digits and the probability of the 10 units in the output:
    def plot_image(i, predictions_array, true_label, img):
        # Show image i with its predicted label, confidence, and true label
        true_label, img = true_label[i], img[i]
        plt.grid(False)
        plt.xticks([])
        plt.yticks([])
        plt.imshow(img, cmap=plt.cm.binary)
        predicted_label = np.argmax(predictions_array)
        # Blue caption for a correct prediction, red for a wrong one
        if predicted_label == true_label:
            color = 'blue'
        else:
            color = 'red'
        plt.xlabel("Pred {} Conf: {:2.0f}% True ({})".format(predicted_label,
                                      100*np.max(predictions_array),
                                      true_label),
                                      color=color)
    
    def plot_value_array(i, predictions_array, true_label):
        # Bar chart of the 10 output values for image i
        true_label = true_label[i]
        plt.grid(False)
        plt.xticks(range(10))
        plt.yticks([])
        thisplot = plt.bar(range(10), predictions_array,
                           color="#777777")
        plt.ylim([0, 1])
        predicted_label = np.argmax(predictions_array)
        # Predicted bar in red, true bar in blue (blue wins if they coincide)
        thisplot[predicted_label].set_color('red')
        thisplot[true_label].set_color('blue')
    
  2. Using these utility functions, we plot the predictions:
    predictions = model.predict(test_data)
    i = 56
    plt.figure(figsize=(10,5))
    plt.subplot(1,2,1)
    plot_image(i, predictions[i], test_labels, test_data)
    plt.subplot(1,2,2)
    plot_value_array(i, predictions[i],  test_labels)
    plt.show()
    
  3. The plot on the left is the image of the handwritten digit, with the predicted label, the confidence in the prediction, and the true label. The image on the right shows the probability (logistic) output of the 10 units; we can see that the unit which represents the number 4 has the highest probability:

Figure 2.10: Predicted digit and confidence value of the prediction
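The label and confidence shown under the image come from two NumPy calls on the model output for that image. The sketch below reproduces that caption logic without plotting; the `predictions_array` values are hypothetical and chosen only so that the unit for the digit 4 is the largest:

```python
import numpy as np

# Hypothetical output of the 10 units for a single test image
predictions_array = np.array(
    [0.01, 0.02, 0.05, 0.01, 0.93, 0.03, 0.02, 0.10, 0.04, 0.20])
true_label = 4

# The predicted digit is the index of the largest output
predicted_label = int(np.argmax(predictions_array))
# The reported confidence is that largest output, as a percentage
confidence = 100 * float(np.max(predictions_array))

caption = "Pred {} Conf: {:2.0f}% True ({})".format(
    predicted_label, confidence, true_label)
# Same color rule as plot_image: blue if correct, red otherwise
color = 'blue' if predicted_label == true_label else 'red'

print(caption, color)
```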

  1. In this code, to stay true to logistic regression, we used a sigmoid activation function and only one Dense layer. For better performance, adding dense layers and using softmax as the final activation function will be helpful. For example, the following model gives 97% accuracy on the validation dataset:
    better_model = K.Sequential([
        Flatten(input_shape=(28, 28)),
        Dense(128,  activation='relu'),
        Dense(10, activation='softmax')
    ])
    better_model.summary()
    

You can experiment by adding more layers, or by changing the number of neurons in each layer, and even changing the optimizer. This will give you a better understanding of how these parameters influence the model performance.
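When experimenting with layer sizes, it helps to know how the parameter count changes. A Dense layer with n inputs and m units has n x m weights plus m biases. The helper functions `dense_params` and `mlp_params` below are hypothetical names introduced for this sketch; they reproduce the 7,850 parameters reported by the model summary above and estimate the size of the model with a 128-unit hidden layer:

```python
def dense_params(n_inputs, n_units):
    # One weight per input-unit pair, plus one bias per unit
    return n_inputs * n_units + n_units

def mlp_params(layer_sizes):
    # Sum the parameters of each consecutive pair of layer sizes
    return sum(dense_params(n_in, n_out)
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

# Flatten has no parameters, so only the Dense layers count
simple = mlp_params([784, 10])        # single Dense(10) layer
better = mlp_params([784, 128, 10])   # Dense(128) followed by Dense(10)
wider = mlp_params([784, 256, 10])    # a hypothetical wider variant

print(simple, better, wider)
```

Doubling the hidden layer from 128 to 256 units roughly doubles the parameter count, since almost all parameters sit between the 784 inputs and the hidden layer.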


Key benefits

  • Understand the fundamentals of deep learning and machine learning through clear explanations and extensive code samples
  • Implement graph neural networks, transformers using Hugging Face and TensorFlow Hub, and joint and contrastive learning
  • Learn cutting-edge machine and deep learning techniques

Description

Deep Learning with TensorFlow and Keras teaches you neural networks and deep learning techniques using TensorFlow (TF) and Keras. You'll learn how to write deep learning applications in the most powerful, popular, and scalable machine learning stack available. TensorFlow 2.x focuses on simplicity and ease of use, with updates like eager execution, intuitive higher-level APIs based on Keras, and flexible model building on any platform. This book uses the latest TF 2.0 features and libraries to present an overview of supervised and unsupervised machine learning models and provides a comprehensive analysis of deep learning and reinforcement learning models using practical examples for the cloud, mobile, and large production environments. This book also shows you how to create neural networks with TensorFlow, runs through popular algorithms (regression, convolutional neural networks (CNNs), transformers, generative adversarial networks (GANs), recurrent neural networks (RNNs), natural language processing (NLP), and graph neural networks (GNNs)), covers working example apps, and then dives into TF in production, TF mobile, and TensorFlow with AutoML.

Who is this book for?

This hands-on machine learning book is for Python developers and data scientists who want to build machine learning and deep learning systems with TensorFlow. This book gives you the theory and practice required to use Keras, TensorFlow, and AutoML to build machine learning systems. Some machine learning knowledge would be useful. We don’t assume TF knowledge.

What you will learn

  • Learn how to use the popular GNNs with TensorFlow to carry out graph mining tasks
  • Discover the world of transformers, from pretraining to fine-tuning to evaluating them
  • Apply self-supervised learning to natural language processing, computer vision, and audio signal processing
  • Combine probabilistic and deep learning models using TensorFlow Probability
  • Train your models on the cloud and put TF to work in real environments
  • Build machine learning and deep learning systems with TensorFlow 2.x and the Keras API
Product Details

Publication date : Oct 06, 2022
Length: 698 pages
Edition : 3rd
Language : English
ISBN-13 : 9781803232911

Table of Contents

22 Chapters
Neural Network Foundations with TF
Regression and Classification
Convolutional Neural Networks
Word Embeddings
Recurrent Neural Networks
Transformers
Unsupervised Learning
Autoencoders
Generative Models
Self-Supervised Learning
Reinforcement Learning
Probabilistic TensorFlow
An Introduction to AutoML
The Math Behind Deep Learning
Tensor Processing Unit
Other Useful Deep Learning Libraries
Graph Neural Networks
Machine Learning Best Practices
TensorFlow 2 Ecosystem
Advanced Convolutional Neural Networks
Other Books You May Enjoy
Index

Customer reviews

Rating distribution
4.6 out of 5 (45 Ratings)
5 star 73.3%
4 star 17.8%
3 star 4.4%
2 star 0%
1 star 4.4%
Carlo Estopia Feb 18, 2024
5 out of 5
Verified review (Feefo)
hawkinflight Oct 06, 2022
5 out of 5
This is the third edition of the book, updated and seasoned, and my first time looking at it. Why learn and use Deep Learning? "DL techniques can solve problems with a level of accuracy that was not possible using previous methods." The book is nicely concise and thorough, well-written. Following the code example in the first chapter, I quickly fit the Sentiment Analysis model of IMDB reviews. I had not really used Google Colab before, it was easy and similar to Jupyter notebooks. You can choose to run on a CPU, GPU, or TPU. This first example uses the simplest of three methods of model building with tf.keras, the Sequential() model. Skimming the code made me curious - what is this and that?, so I searched online for the documentation, quickly found it at tensorflow dot org, where they also have tutorials. There are many code examples in the book and they use Python which uses "TensorFlow 2.x, a modular network library based on Keras-like APIs". I like the chapter divisions and the offerings; there are 20, which includes one focusing on the math behind DL. Other topics of interest to me are: Transformers, Probabilistic Tensorflow, Intro to AutoML, Four generations of TPUs, Other Useful DL libraries, ML Best Practices, and TensorFlow Lite. I like that there is a list of references and resources at the end of each chapter. I think this book will be an excellent companion on a further journey of exploration of DL model building. The library comes with datasets, if you want to avoid preparing your own at the start.
Verified review (Amazon)
Lydia Jan 23, 2023
5 out of 5
Absolutely amazing book which delivers insights on machine learning and NLP models. The mathematical and structural descriptions are well motivated and followed by code that is well documented using standard packages. It is rare to find such a reference on even one of the topics, but this reference delivers across a wide range of techniques. I was especially impressed with the chapters devoted to natural language processing. After well written chapters on basic concepts such as word vectors, the authors provide excellent coverage of transformers which are the current state of the art for language processing. The authors cover the basics of transformers and then illuminate the differences amongst the many transformer variates with their target uses and particular strengths. As in the other chapters, the discussion of transformers is capped by a detailed walk through of code insuring that the reader understands the steps needed to construct the processing pipeline through to model training and output. The ending chapters make up an excellent reference manual of concept and techniques such as parameter turning using AutoML, the mathematical methods used to optimize model coefficients by backpropagation, hardware decisions, and an introduction to other deep learning libraries. I highly recommend this book regardless of your level of modeling experience. Elliot Noma, Lead Data Scientist, The Financial Regulatory Authority
Verified review (Amazon)
Nivas Dec 21, 2022
5 out of 5
Really enjoyed reading through 'Deep Learning with TensorFlow and Keras'. The authors have delivered a comprehensive and detailed book on how to use TensorFlow and Keras. Not only will you get familiar with using ML platforms and open-source libraries, you will learn when and why you should use certain ML techniques. There is so much useful content here that I will plan to continue to use this book as a reference!
Verified review (Amazon)
SACHIN SINGH Nov 30, 2022
5 out of 5
This textbook is really good, and contains from scratch knowledge about deep learning framework and implementation. Consider this textbook for the serious life long learners of deep learning, and also helpful in clearing the tensorflow developer exam.
Verified review (Amazon)
