All of the examples have been implemented in Java using Deeplearning4j (DL4J) together with a number of open source libraries. More specifically, the following APIs/tools are required (a sketch of the corresponding Maven dependencies follows the list):
- Java/JDK version 1.8
- Spark version 2.3.0
- spark-csv_2.11 version 1.3.0
- ND4J backend: nd4j-cuda-9.0-platform for GPU, otherwise nd4j-native for CPU
- ND4J version 1.0.0-alpha or higher
- DL4J version 1.0.0-alpha or higher
- DataVec version 1.0.0-alpha or higher
- Arbiter version 1.0.0-alpha or higher
- Logback version 1.2.3
- JavaCV platform version 1.4.1
- Apache HttpClient version 4.3.5
- JFreeChart version 1.0.13
- JCodec version 0.2.3
- Eclipse Mars or Luna (latest version) or IntelliJ IDEA
- Maven Eclipse plugin (2.9 or higher)
- Maven compiler plugin for Eclipse (2.3.2 or higher)
- Maven assembly plugin for Eclipse (2.4.1 or higher)
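The versions above translate directly into Maven coordinates. The following is a minimal sketch of the corresponding pom.xml fragment; the `dl4j.version` property name is arbitrary, and the group/artifact IDs shown (for example, `deeplearning4j-core` and `arbiter-deeplearning4j`) are the ones commonly published on Maven Central, so double-check them against the exact modules your project uses:

```xml
<properties>
  <!-- ND4J, DL4J, DataVec, and Arbiter are released together and share a version -->
  <dl4j.version>1.0.0-alpha</dl4j.version>
</properties>

<dependencies>
  <dependency>
    <groupId>org.deeplearning4j</groupId>
    <artifactId>deeplearning4j-core</artifactId>
    <version>${dl4j.version}</version>
  </dependency>
  <dependency>
    <groupId>org.nd4j</groupId>
    <!-- Swap in nd4j-cuda-9.0-platform here to train on an NVIDIA GPU -->
    <artifactId>nd4j-native</artifactId>
    <version>${dl4j.version}</version>
  </dependency>
  <dependency>
    <groupId>org.datavec</groupId>
    <artifactId>datavec-api</artifactId>
    <version>${dl4j.version}</version>
  </dependency>
  <dependency>
    <groupId>org.deeplearning4j</groupId>
    <artifactId>arbiter-deeplearning4j</artifactId>
    <version>${dl4j.version}</version>
  </dependency>
  <!-- The remaining libraries (Spark, Logback, JavaCV, and so on) are declared in the same way -->
  <dependency>
    <groupId>ch.qos.logback</groupId>
    <artifactId>logback-classic</artifactId>
    <version>1.2.3</version>
  </dependency>
</dependencies>
```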
Regarding the operating system: Linux distributions are preferable (including Debian, Ubuntu, Fedora, RHEL, and CentOS). For Ubuntu, for example, a 14.04 (LTS) 64-bit complete installation (or later) is recommended, or alternatively VMware Player 12 or VirtualBox. You can also run Spark jobs on Windows (XP/7/8/10) or Mac OS X (10.4.7+).
Regarding hardware configuration: a machine or server with a Core i5 processor, about 100 GB of disk space, and at least 16 GB of RAM. In addition, if you want to perform training on a GPU, an NVIDIA GPU driver has to be installed, with CUDA and cuDNN configured. Enough storage is needed to run heavy jobs (depending on the size of the datasets you will be handling), preferably at least 50 GB of free disk space (for standalone mode and for the SQL warehouse).
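If you do configure the CUDA backend, it is worth confirming at runtime which ND4J backend was actually picked up from the classpath. The short sketch below (the class name `BackendCheck` is arbitrary) simply creates a small array, which triggers backend initialization, and prints the executioner class; its name indicates whether the CPU or CUDA implementation is in use:

```java
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;

public class BackendCheck {
    public static void main(String[] args) {
        // Creating a small array forces ND4J to initialize its backend
        INDArray x = Nd4j.create(2, 2);
        // The executioner's class name shows whether the native (CPU)
        // or CUDA implementation is executing the operations
        System.out.println("Executioner: " + Nd4j.getExecutioner().getClass().getName());
        System.out.println("Test array:\n" + x);
    }
}
```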