Answers to FAQs

Answer to question 1: There are many ways to solve this problem:

  1. A ⊕ B = (A ∧ ¬B) ∨ (¬A ∧ B)
  2. A ⊕ B = (A ∨ B) ∧ ¬(A ∧ B)
  3. A ⊕ B = (A ∨ B) ∧ (¬A ∨ ¬B), and so on

If we go with the first approach, the resulting ANNs would look like this:

From the computer science literature, we know that the XOR operation takes two inputs and produces one output. With inputs (0, 0) or (1, 1), the network outputs 0; with inputs (0, 1) or (1, 0), it outputs 1. We can formally represent this behavior with the following truth table:

X0  X1  Y
0   0   0
0   1   1
1   0   1
1   1   0

Here, each input pattern is classified into one of two classes. Patterns whose classes can be separated by a single line L are known as linearly separable patterns; as the preceding truth table shows, the XOR patterns cannot be separated this way, which is why a single-layer perceptron fails and at least one hidden layer is needed.
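As a quick sanity check (my own snippet, not from the book), the second decomposition can be verified in plain Java; it reproduces the truth table above:

public class XorDecomposition {
    public static void main(String[] args) {
        boolean[] values = {false, true};
        for (boolean a : values) {
            for (boolean b : values) {
                // A ⊕ B = (A ∨ B) ∧ ¬(A ∧ B)
                boolean xor = (a || b) && !(a && b);
                System.out.printf("%d %d -> %d%n", a ? 1 : 0, b ? 1 : 0, xor ? 1 : 0);
            }
        }
    }
}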

Answer to question 2: The most significant progress in ANNs and DL can be described in the following timeline. We have already seen how artificial neurons and perceptrons provided the base in 1943 and 1958, respectively. Then, in 1969, Minsky et al. formulated XOR as a linearly non-separable problem. Later, in 1974, Werbos et al. demonstrated the backpropagation algorithm for training the perceptron.

The most significant advancements, however, happened in the 1980s: John Hopfield et al. proposed the Hopfield network in 1982, and Hinton, one of the godfathers of neural networks and deep learning, and his team proposed the Boltzmann machine in 1985. Probably the biggest step came in 1986, when Hinton et al. successfully trained the MLP with backpropagation and Jordan et al. proposed RNNs. In the same year, Smolensky et al. also proposed an improved version of the RBM.

In the 1990s, the most significant year was 1997. LeCun et al. proposed LeNet in 1990, and Hochreiter and Schmidhuber proposed the LSTM in 1997. In the same year, Schuster et al. proposed the bidirectional RNN, an improved version of the original RNN.

Despite significant advances in computing, there was not much progress from 1997 to 2005, until Hinton struck again in 2006: he and his team proposed the DBN by stacking multiple RBMs. Then, in 2012, Hinton and his collaborators invented dropout, which significantly improved regularization and reduced overfitting in DNNs.

After that, in 2014, Ian Goodfellow et al. introduced GANs, a significant milestone in image generation. In 2017, Hinton proposed CapsNets to overcome the limitations of regular CNNs, so far one of the most significant milestones.

Answer to question 3: Yes, you can use other deep learning frameworks described in the Deep learning frameworks section. However, since this book is about using Java for deep learning, I would suggest going for DeepLearning4J. We will see how flexibly we can create networks by stacking input, hidden, and output layers using DeepLearning4J in the next chapter.
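As a small preview (my own sketch, not the book's example; the layer sizes here are purely illustrative), stacking layers in Deeplearning4j looks roughly like this:

import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.layers.DenseLayer;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.lossfunctions.LossFunctions;

// A tiny network: 2 inputs -> 4 hidden neurons -> 1 output (for example, for XOR)
MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
        .seed(1234)
        .list()
        .layer(0, new DenseLayer.Builder()
                .nIn(2).nOut(4)
                .activation(Activation.RELU)
                .build())
        .layer(1, new OutputLayer.Builder(LossFunctions.LossFunction.XENT)
                .nIn(4).nOut(1)
                .activation(Activation.SIGMOID)
                .build())
        .build();

MultiLayerNetwork model = new MultiLayerNetwork(conf);
model.init();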

Answer to question 4: Yes, you can, since the title contained in a passenger's name (for example, Mr., Mrs., Miss, Master, and so on) could be significant too. For example, we can imagine that being a woman (that is, Mrs. or Miss) or being a child (that is, Master) could mean a higher chance of survival.

Even after watching the famous movie Titanic (1997), we might imagine that a girl in a relationship has a good chance of survival, since her boyfriend would try to save her! Anyway, this is just imagination, so do not take it too seriously. We can write a user-defined function to encode the title using Apache Spark. Let's take a look at the following UDF in Java:

import org.apache.spark.sql.api.java.UDF1;
import scala.Option;
import scala.Some;

// Extracts the title from a passenger's name
private static final UDF1<String, Option<String>> getTitle = (String name) -> {
    if (name.contains("Mr.")) {            // If it has Mr.
        return Some.apply("Mr.");
    } else if (name.contains("Mrs.")) {    // Or if it has Mrs.
        return Some.apply("Mrs.");
    } else if (name.contains("Miss.")) {   // Or if it has Miss.
        return Some.apply("Miss.");
    } else if (name.contains("Master.")) { // Or if it has Master.
        return Some.apply("Master.");
    } else {                               // None of the above
        return Some.apply("Untitled");
    }
};
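Note that the UDF returns a scala.Option rather than a plain String; when the UDF is registered with a StringType return type, Spark unwraps the Option (Some becomes the string value and an empty Option becomes null), which is a convenient way to handle names with no recognizable title.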

Next, we register the preceding UDF and use it to transform the Name column, as follows:

import static org.apache.spark.sql.functions.callUDF;
import static org.apache.spark.sql.functions.col;

// Register the UDF under the name "getTitle" with a String return type
spark.sqlContext().udf().register("getTitle", getTitle, DataTypes.StringType);

Dataset<Row> categoricalDF = df.select(callUDF("getTitle", col("Name")).alias("Name"),
        col("Sex"), col("Ticket"), col("Cabin"), col("Embarked"));
categoricalDF.show();

The resulting Name column now contains the extracted title for each passenger (Mr., Mrs., Miss., Master., or Untitled).
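To actually feed such a categorical column into an MLP, it still has to be converted into numbers. A minimal sketch using Spark's StringIndexer (the output column and variable names are my own, not from the book):

import org.apache.spark.ml.feature.StringIndexer;

// Map each title (Mr., Mrs., Miss., Master., Untitled) to a numeric index
StringIndexer titleIndexer = new StringIndexer()
        .setInputCol("Name")
        .setOutputCol("TitleIndex");

Dataset<Row> indexedDF = titleIndexer.fit(categoricalDF).transform(categoricalDF);
indexedDF.select("Name", "TitleIndex").show();

If you do not want the model to interpret the indices as ordered values, a OneHotEncoder can then be applied to the indexed column.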

Answer to question 5: For many problems, you can start with just one or two hidden layers. Two hidden layers with the same total number of neurons (continue reading for guidance on choosing the number of neurons) will often work just fine, in roughly the same amount of training time. As a naive guideline for the number of hidden layers:

  • 0: Only capable of representing linearly separable functions
  • 1: Can approximate any function that contains a continuous mapping from one finite space to another
  • 2: Can represent an arbitrary decision boundary to arbitrary accuracy

However, for a more complex problem, you can gradually increase the number of hidden layers until you start overfitting the training set. Similarly, you can increase the number of neurons gradually until the network starts to overfit. This gives an upper bound on the number of hidden neurons that will not result in overfitting:

Nh = Ns / (α × (Ni + No))

In the preceding equation:

  • Nh = upper bound on the number of hidden neurons
  • Ni = number of input neurons
  • No = number of output neurons
  • Ns = number of samples in the training dataset
  • α = an arbitrary scaling factor, usually between 2 and 10

Note that the preceding equation does not come from any research but from my personal working experience.
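For example (hypothetical numbers, purely to illustrate the formula): with Ni = 10 input neurons, No = 2 output neurons, Ns = 900 training samples, and α = 5, the bound is Nh = 900 / (5 × (10 + 2)) = 15 hidden neurons.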

Answer to question 6: Of course we can. We can cross-validate the training and use a grid search to find the best hyperparameters. Let's give it a try.

First, we define the layers. Unfortunately, we cannot cross-validate the layers parameter; whether that is a bug or an intentional restriction on the Spark side, we stick to a single layer configuration:

int[] layers = new int[] {10, 16, 16, 2}; // 10 input features, two hidden layers of 16 neurons, 2 output classes

Then we create the trainer and set only the layer and seed parameters:

MultilayerPerceptronClassifier mlp = new MultilayerPerceptronClassifier()
        .setLayers(layers)
        .setSeed(1234L);

We search through the MLP's different hyperparameters for the best model:

ParamMap[] paramGrid = new ParamGridBuilder()
        .addGrid(mlp.blockSize(), new int[] {32, 64, 128})
        .addGrid(mlp.maxIter(), new int[] {10, 50})
        .addGrid(mlp.tol(), new double[] {1E-2, 1E-4, 1E-6})
        .build();

MulticlassClassificationEvaluator evaluator = new MulticlassClassificationEvaluator()
        .setLabelCol("label")
        .setPredictionCol("prediction");
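Note that this grid contains 3 × 2 × 3 = 18 hyperparameter combinations; with the 10-fold cross-validation we set up next, Spark will fit 180 models in total, so expect this step to take a while.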

We then set up the cross-validator for 10-fold cross-validation:

int numFolds = 10;
CrossValidator crossval = new CrossValidator()
        .setEstimator(mlp)
        .setEvaluator(evaluator)
        .setEstimatorParamMaps(paramGrid)
        .setNumFolds(numFolds);

Then we fit the cross-validator on the training data:

CrossValidatorModel cvModel = crossval.fit(trainingData);
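If you want to inspect which hyperparameters won, a quick (illustrative) way is to look at the average metrics and at the parent estimator of the best model:

// Average evaluation metric for each ParamMap in the grid
double[] avgMetrics = cvModel.avgMetrics();
// Hyperparameters of the estimator that produced the best model
System.out.println(cvModel.bestModel().parent().extractParamMap());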

Finally, we evaluate the cross-validated model on the held-out validation set, as follows:

Dataset<Row> predictions = cvModel.transform(validationData);

Now we can compute and show the performance metrics, similar to our previous example. The code below assumes four evaluators (evaluator1 through evaluator4), one per metric, configured as before.
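A minimal sketch of how those evaluators might be defined (the metric names are the standard MulticlassClassificationEvaluator options; the variable names are only assumptions):

MulticlassClassificationEvaluator evaluator1 = new MulticlassClassificationEvaluator()
        .setLabelCol("label").setPredictionCol("prediction").setMetricName("accuracy");
MulticlassClassificationEvaluator evaluator2 = new MulticlassClassificationEvaluator()
        .setLabelCol("label").setPredictionCol("prediction").setMetricName("weightedPrecision");
MulticlassClassificationEvaluator evaluator3 = new MulticlassClassificationEvaluator()
        .setLabelCol("label").setPredictionCol("prediction").setMetricName("weightedRecall");
MulticlassClassificationEvaluator evaluator4 = new MulticlassClassificationEvaluator()
        .setLabelCol("label").setPredictionCol("prediction").setMetricName("f1");

With those in place, the metrics are computed and printed as follows: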

double accuracy = evaluator1.evaluate(predictions);
double precision = evaluator2.evaluate(predictions);
double recall = evaluator3.evaluate(predictions);
double f1 = evaluator4.evaluate(predictions);

// Print the performance metrics
System.out.println("Accuracy = " + accuracy);
System.out.println("Precision = " + precision);
System.out.println("Recall = " + recall);
System.out.println("F1 = " + f1);
System.out.println("Test Error = " + (1 - accuracy));
>>>
Accuracy = 0.7810132575757576
Precision = 0.7810132575757576
Recall = 0.7810132575757576
F1 = 0.7810132575757576
Test Error = 0.21898674242424243