In this chapter, we developed a complete deep learning application that classifies a large collection of video clips from the UCF101 dataset. Using DL4J, we applied a combined CNN-LSTM network that overcomes the limitations of standalone CNN or LSTM networks.
Finally, we saw how to perform training in parallel and in a distributed way across multiple devices (CPUs and GPUs). In summary, this end-to-end project can be treated as a primer for human activity recognition from video. Although we did not achieve high accuracy after training, training the network on the full video dataset and tuning its hyperparameters should improve the accuracy.
The next chapter is all about designing a machine learning system driven by criticisms and rewards. We will see how to develop a demo GridWorld game using DL4J, RL4J, and neural Q-learning, in which a neural network acts as the Q-function. We will...