Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Conferences
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
Java Deep Learning Cookbook

You're reading from   Java Deep Learning Cookbook Train neural networks for classification, NLP, and reinforcement learning using Deeplearning4j

Arrow left icon
Product type Paperback
Published in Nov 2019
Publisher Packt
ISBN-13 9781788995207
Length 304 pages
Edition 1st Edition
Languages
Arrow right icon
Author (1):
Arrow left icon
Rahul Raj Rahul Raj
Author Profile Icon Rahul Raj
Rahul Raj
Arrow right icon
View More author details
Toc

Table of Contents (14) Chapters Close

Preface 1. Introduction to Deep Learning in Java 2. Data Extraction, Transformation, and Loading FREE CHAPTER 3. Building Deep Neural Networks for Binary Classification 4. Building Convolutional Neural Networks 5. Implementing Natural Language Processing 6. Constructing an LSTM Network for Time Series 7. Constructing an LSTM Neural Network for Sequence Classification 8. Performing Anomaly Detection on Unsupervised Data 9. Using RL4J for Reinforcement Learning 10. Developing Applications in a Distributed Environment 11. Applying Transfer Learning to Network Models 12. Benchmarking and Neural Network Optimization 13. Other Books You May Enjoy

Deep learning intuition

If you're a newbie to deep learning, you may be wondering how exactly it is differs from machine learning; or is it the same? Deep learning is a subset of the larger domain of machine learning. Let's think about this in the context of an automobile image classification problem:

As you can see in the preceding diagram, we need to perform feature extraction ourselves as legacy machine learning algorithms cannot do that on their own. They might be super-efficient with accurate results, but they cannot learn signals from data. In fact, they don't learn on their own and still rely on human effort:

On the other hand, deep learning algorithms learn to perform tasks on their own. Neural networks under the hood are based on the concept of deep learning and it trains on their own to optimize the results. However, the final decision process is hidden and cannot be tracked. The intent of deep learning is to imitate the functioning of a human brain.

Backpropagation

The backbone of a neural network is the backpropagation algorithm. Refer to the sample neural network structure shown as follows:

For any neural network, data flows from the input layer to the output layer during the forward pass. Each circle in the diagram represents a neuron. Every layer has a number of neurons present. Our data will pass through the neurons across layers. The input needs to be in a numerical format to support computational operations in neurons. Each neuron in the neural network is assigned a weight (matrix) and an activation function. Using the input data, weight matrix, and an activation function, a probabilistic value is generated at each neuron. The error (that is, a deviation from the actual value) is calculated at the output layer using a loss function. We utilize the loss score during the backward pass (that is, from the output layer to the input layer ) by reassigning weights to the neurons to reduce the loss score. During this stage, some output layer neurons will be assigned with high weights and vice versa depending upon the loss score results. This process will continue backward as far as the input layer by updating the weights of neurons. In a nutshell, we are tracking the rate of change of loss with respect to the change in weights across all neurons. This entire cycle (a forward and backward pass) is called an epoch. We perform multiple epochs during a training session. A neural network will tend to optimize the results after every training epoch.

Multilayer Perceptron (MLP)

An MLP is a standard feed-forward neural network with at least three layers: an input layer, a hidden layer, and an output layer. Hidden layers come after the input layer in the structure. Deep neural networks have two or more hidden layers in the structure, while an MLP has only one.

Convolutional Neural Network (CNN)

CNNs are generally used for image classification problems, but can also be exposed in Natural Language Processing (NLP), in conjunction with word vectors, because of their proven results. Unlike a regular neural network, a CNN will have additional layers such as convolutional layers and subsampling layers. Convolutional layers take input data (such as images) and apply convolution operations on top of them. You can think of it as applying a function to the input. Convolutional layers act as filters that pass a feature of interest to the upcoming subsampling layer. A feature of interest can be anything (for example, a fur, shade and so on in the case of an image) that can be used to identify the image. In the subsampling layer, the input from convolutional layers is further smoothed. So, we end up with a much smaller image resolution and reduced color contrast, preserving only the important information. The input is then passed on to fully connected layers. Fully connected layers resemble regular feed-forward neural networks.

Recurrent Neural Network (RNN)

An RNN is a neural network that can process sequential data. In a regular feed-forward neural network, the current input is considered for neurons in the next layer. On the other hand, an RNN can accept previously received inputs as well. It can also use memory to memorize previous inputs. So, it is capable of preserving long-term dependencies throughout the training session. RNN is a popular choice for NLP tasks such as speech recognition. In practice, a slightly variant structure called Long Short-Term Memory (LSTM) is used as a better alternative to RNN.

Why is DL4J important for deep learning?

The following points will help you understand why DL4J is important for deep learning:

  • DL4J provides commercial support. It is the first commercial-grade, open source, deep learning library in Java.
  • Writing training code is simple and precise. DL4J supports Plug and Play mode, which means switching between hardware (CPU to GPU) is just a matter of changing the Maven dependencies and no modifications are needed on the code.
  • DL4J uses ND4J as its backend. ND4J is a computation library that can run twice as fast as NumPy (a computation library in Python) in large matrix operations. DL4J exhibits faster training times in GPU environments compared to other Python counterparts.
  • DL4J supports training on a cluster of machines that are running in CPU/GPU using Apache Spark. DL4J brings in automated parallelism in distributed training. This means that DL4J bypasses the need for extra libraries by setting up worker nodes and connections.
  • DL4J is a good production-oriented deep learning library. As a JVM-based library, DL4J applications can be easily integrated/deployed with existing corporate applications that are running in Java/Scala.
lock icon The rest of the chapter is locked
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $19.99/month. Cancel anytime
Banner background image