Sequential API
Keras's Sequential API is straightforward to understand and implement. It lets us build a neural network layer by layer: we initialize a sequential model and then stack a series of hidden and output layers on it.
Getting ready
Before creating a neural network using the Sequential API, let's load the keras library into our environment:
library(keras)
Now, let's simulate some dummy data for this exercise:
x_data <- matrix(rnorm(1000*784), nrow = 1000, ncol = 784)
y_data <- matrix(rnorm(1000), nrow = 1000, ncol = 1)
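Note that rnorm() draws random values, so the simulated data will differ on every run. If you want reproducible draws, a seed can be set before the calls above (an optional addition to the recipe):
# Optional: call before generating x_data and y_data for reproducible draws
set.seed(123)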
We can check the dimension of the x and y data by executing the following commands:
dim(x_data)
# [1] 1000  784
dim(y_data)
# [1] 1000    1
The dimension of x_data is 1,000×784, whereas the dimension of y_data is 1,000×1.
How to do it...
Now, we can build our first sequential keras model and train it:
- Let's start by defining a sequential model:
model_sequential <- keras_model_sequential()
- We need to add layers to the model we defined in the preceding code block:
model_sequential %>%
  layer_dense(units = 16, input_shape = c(784)) %>%
  layer_activation('relu') %>%
  layer_dense(units = 1)
- After adding the layers to our model, we need to compile it:
model_sequential %>% compile(
  loss = "mse",
  optimizer = optimizer_sgd(),
  metrics = list("mean_absolute_error")
)
- Now, let's visualize the summary of the model we created:
model_sequential %>% summary()
The summary of the model is as follows: the first dense layer produces an output of shape (None, 16) with 12,560 parameters (784 × 16 weights plus 16 biases), the activation layer adds no parameters, and the output layer produces a shape of (None, 1) with 17 parameters (16 weights plus 1 bias), for a total of 12,577 trainable parameters.
- Now, let's train the model and store the training stats in a variable in order to plot the model's metrics:
history <- model_sequential %>% fit(
  x_data,
  y_data,
  epochs = 30,
  batch_size = 128,
  validation_split = 0.2
)
# Plotting model metrics
plot(history)
The preceding code generates a plot showing the loss and mean absolute error for both the training and validation data across the 30 epochs.
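The plot method for keras training-history objects also accepts a few arguments; for example, the following sketch (assuming the standard plot.keras_training_history() interface) restricts the plot to a single metric and disables smoothing:
# Plot only the mean absolute error, without the smoothed trend line
plot(history, metrics = "mean_absolute_error", smooth = FALSE)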
How it works...
In step 1, we initialized a sequential model by calling the keras_model_sequential() function. In the next step, we stacked hidden and output layers using a series of layer functions. The layer_dense() function adds a densely connected layer to the defined model. The first layer of a sequential model needs to know what input shape to expect, so we passed a value to the input_shape argument of the first layer; in our case, the input shape was equal to the number of features in the dataset (784).
Note that when we add layers to a keras sequential model, the model object is modified in place, so we do not need to assign the updated object back to the original. This behavior is unlike most R objects, which are typically immutable.
For our hidden layer, we used the relu activation function. The layer_activation() function creates an activation layer that applies the chosen activation to the output of the preceding hidden layer. We can also use other functions, such as leaky ReLU, softmax, and more (activation functions are discussed in the Implementing a single-layer neural network recipe). No activation was applied in the output layer of our model.
We can also apply an activation function to each layer by passing a value to the activation argument of layer_dense(), instead of adding an activation layer explicitly. In that case, the layer applies the following operation:
output = activation(dot(input, kernel) + bias)
Here, activation refers to the element-wise activation function that's passed, kernel is the weights matrix created by the layer, and bias is the bias vector produced by the layer.
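For example, a model equivalent to the one built in this recipe can be defined without explicit activation layers (model_alt is a hypothetical name used only for illustration):
# Equivalent definition: activation folded into layer_dense()
model_alt <- keras_model_sequential()
model_alt %>%
  layer_dense(units = 16, activation = 'relu', input_shape = c(784)) %>%
  layer_dense(units = 1)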
To train a model, we need to configure the learning process. We did this in step 3 using the compile() function. In our training process, we applied a stochastic gradient descent optimizer to find the weights and biases that minimize our objective loss function; that is, the mean squared error. The metrics argument specifies the metrics to be evaluated by the model during training and testing.
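The optimizer itself can also be configured when it is constructed. The following sketch passes an explicit learning rate and momentum to optimizer_sgd(); note that older versions of the R keras package name the learning-rate argument lr, while newer versions use learning_rate:
# Sketch: compiling with an explicitly configured SGD optimizer
model_sequential %>% compile(
  loss = "mse",
  optimizer = optimizer_sgd(lr = 0.01, momentum = 0.9),
  metrics = list("mean_absolute_error")
)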
In step 4, we looked at the summary of the model; it showed us information about each layer, such as the shape of its output and the number of parameters it contains.
In the last step, we trained our model for a fixed number of passes over the dataset; the epochs argument defines the number of passes. The validation_split argument takes a float value between 0 and 1 and specifies the fraction of the training data to be set aside as validation data. Finally, batch_size defines the number of samples that propagate through the network before each weight update.
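The history object returned by fit() also exposes the training statistics directly. Assuming the standard keras training-history structure, the per-epoch values can be inspected like this:
# Per-epoch metrics are stored as plain numeric vectors
str(history$metrics)
# For example, the validation loss of the final epoch:
tail(history$metrics$val_loss, 1)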
There's more...
Training a deep learning model is a time-consuming task. If training stops unexpectedly, we can lose a lot of work. The keras library in R provides us with the functionality to save a model's progress during and after training. A saved model contains the weight values, the model's configuration, and the optimizer's configuration, so if the training process is interrupted, we can resume from where it left off.
The following code block shows how we can save the model after training:
# Save model
model_sequential %>% save_model_hdf5("my_model.h5")
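The counterpart function, load_model_hdf5(), restores the saved model so that training or prediction can continue:
# Restore the model: architecture, weights, and optimizer state are reloaded
model_restored <- load_model_hdf5("my_model.h5")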
If we want to save the model at the end of each epoch during training, we need to create a checkpoint object. To perform this task, we use the callback_model_checkpoint() function. The value of the filepath argument defines the naming pattern for the models that are saved at the end of each epoch. For example, if filepath is {epoch:02d}-{val_loss:.2f}.hdf5, the model will be saved with the epoch number and the validation loss in the filename.
The following code block demonstrates how to save a model after each epoch:
checkpoint_dir <- "checkpoints"
dir.create(checkpoint_dir, showWarnings = FALSE)
filepath <- file.path(checkpoint_dir, "{epoch:02d}.hdf5")
# Create checkpoint callback
cp_callback <- callback_model_checkpoint(
  filepath = filepath,
  verbose = 1
)
# Fit the model and save it after each epoch
model_sequential %>% fit(
  x_data,
  y_data,
  epochs = 30,
  batch_size = 128,
  validation_split = 0.2,
  callbacks = list(cp_callback)
)
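A common variation, sketched below, keeps only the single best model according to validation loss rather than one file per epoch (cp_best is an illustrative name):
# Save only the checkpoint with the lowest validation loss so far
cp_best <- callback_model_checkpoint(
  filepath = file.path(checkpoint_dir, "best.hdf5"),
  monitor = "val_loss",
  save_best_only = TRUE,
  verbose = 1
)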
By doing this, you've learned how to save models using checkpoints and callbacks.
See also
- To find out more about writing custom layers in Keras, go to https://tensorflow.rstudio.com/keras/articles/custom_layers.html.