Tools for text summarization
Since this book is about data mining with Python, we will concentrate on the tools, libraries, and applications designed for text summarization in a Python environment. However, if you ever find yourself working outside Python, or if you have a special case that calls for an off-the-shelf solution, you will be glad to know that there are dozens of text summarization tools for other programming environments, and many of them require no programming at all. In fact, the autotldr bot we discussed at the beginning of this chapter uses a service called SMMRY, which exposes a REST API that returns JSON. You can read more about SMMRY at http://smmry.com/api.
Here we will discuss three Python solutions: a simple NLTK-based method, a Gensim-based method, and a Python summarization package called Sumy.
Naive text summarization using NLTK
So far in this book, we have used NLTK for a variety of tasks including...