Key Input Parameters for TextRank
We'll be using the TextRank implementation provided by the gensim library. Its key input parameters are as follows:
text: This is the input text.
ratio: This is the required ratio of the number of sentences in the summary to the number of sentences in the input text.
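To make the meaning of ratio concrete, here is a minimal sketch. The helper name expected_summary_size is hypothetical, and it assumes the sentence count is truncated toward zero, which is our understanding of gensim's behavior rather than a documented guarantee. The wrapper textrank_summary is also an illustrative name; it calls gensim.summarization.summarize, which is only available in gensim releases before 4.0 (the summarization module was removed in 4.0).

```python
def expected_summary_size(n_sentences, ratio):
    """Hypothetical helper: approximate number of sentences kept for a ratio.

    Assumes the count is truncated toward zero (an assumption made for
    illustration, not a documented guarantee of the library).
    """
    if not 0 < ratio <= 1:
        raise ValueError("ratio must be in the interval (0, 1]")
    return int(n_sentences * ratio)


def textrank_summary(text, ratio=0.2):
    """Thin wrapper around gensim's TextRank summarizer (needs gensim < 4.0)."""
    # Imported lazily so the helper above works even without gensim installed.
    from gensim.summarization import summarize
    return summarize(text, ratio=ratio)


# A 20-sentence text summarized at 20% and then at 25%.
sizes = [expected_summary_size(20, r) for r in (0.2, 0.25)]
print(sizes)  # [4, 5]
```

With an older gensim installed, textrank_summary(text, ratio=0.2) would return the summary as a single string; the exact sentences chosen depend on the input text.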
The gensim implementation of the TextRank algorithm uses BM25, a probabilistic variation of TF-IDF, to compute the similarity between sentences, in place of the similarity measure described in step 3 of the algorithm. This will be clearer in the following exercise, in which you will summarize text using TextRank.
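The idea behind BM25 can be sketched in a few lines of plain Python. This is a simplified illustration and not gensim's code: the function name bm25_scores, the toy sentences, the parameter values k1=1.5 and b=0.75, and the non-negative IDF variant used here are all assumptions chosen for the example, and gensim's handling of IDF may differ.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document in `docs` against the tokenized `query`."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: the number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Non-negative IDF variant (an assumption made for this sketch).
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Made-up toy sentences, already tokenized on whitespace.
sentences = [
    "the policeman walked up the avenue".split(),
    "the man in the doorway struck a match".split(),
    "the policeman twirled his club".split(),
]
# Similarity of the first sentence to every sentence, including itself.
scores = bm25_scores(sentences[0], sentences)
```

In this sketch, the third sentence scores higher than the second because it shares the relatively rare word "policeman" with the query, while the second shares only the very common word "the". TextRank uses pairwise scores of this kind as edge weights in the sentence graph.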
Exercise 7.02: Performing Summarization Using TextRank
In this exercise, we will use the classic short story, After Twenty Years by O. Henry, which is available on Project Gutenberg, and the first section of the Wikipedia article on Oscar Wilde. We will summarize each text separately so that we have 20% of the sentences in the original text and then have 25% of...