Setting up the environment for Spark NLP
Because Spark NLP runs on Apache Spark, you first need to set up Apache Spark. Apache Spark in turn requires Java, so you also need to install Java. Optionally, you can install Scala, the programming language in which Apache Spark itself is written.
To use Spark NLP, you should install the following software, in this order:
- Python
- Java
- Scala (optional)
- Apache Spark
- PySpark and Spark NLP
We have already installed Python following the procedure described in the Technical requirements section, so we can start from the second item in the list, Java.
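Before installing anything, it can help to see which items from the list are already present on your machine. The following is a minimal sketch, not part of the official setup procedure: it assumes that the command-line tools are exposed on your PATH as java, scala, and spark-submit, and that the Python packages are importable as pyspark and sparknlp. The check_prerequisites function name is made up for this illustration.

```python
import importlib.util
import shutil
import sys

# Executables we assume the command-line tools expose on the PATH.
TOOLS = {"java": "java", "scala (optional)": "scala", "spark": "spark-submit"}
# Importable module names we assume for the Python packages.
PACKAGES = {"pyspark": "pyspark", "spark-nlp": "sparknlp"}


def check_prerequisites():
    """Return a dict mapping each prerequisite to True (found) or False."""
    status = {"python": True}  # this script is already running under Python
    for label, executable in TOOLS.items():
        # shutil.which returns None when the executable is not on the PATH
        status[label] = shutil.which(executable) is not None
    for label, module in PACKAGES.items():
        # find_spec returns None when a top-level module cannot be imported
        status[label] = importlib.util.find_spec(module) is not None
    return status


if __name__ == "__main__":
    print(f"Running Python {sys.version_info.major}.{sys.version_info.minor}")
    for name, found in check_prerequisites().items():
        print(f"{name:18} {'found' if found else 'missing'}")
```

A tool reported as missing may still be installed but not added to the PATH, so treat the output as a hint rather than a definitive answer.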
Installing Java
Spark NLP is built on top of Apache Spark, which can be installed on any operating system that supports Java. Apache Spark requires Java 8 to work properly:
- To check whether Java is already installed on your computer, and which version it is, open a terminal and run the following command:
Java –...