Chapter 6. NLP with Spark
In this chapter, we will see how to run natural language processing (NLP) algorithms over Spark. The chapter covers the following recipes:
- Installing NLTK on Linux
- Installing Anaconda on Linux
- Anaconda for cluster management
- POS tagging with PySpark on an Anaconda cluster
- Named Entity Recognition with IPython over Spark
- Implementing OpenNLP - chunker over Spark
- Implementing OpenNLP - sentence detector over Spark
- Implementing Stanford NLP - lemmatization over Spark
- Implementing sentiment analysis using Stanford NLP over Spark