Summary
As usual, this is the author speaking at the end of the chapter. How was your first experience with internal audit? It seems you actually got a lot out of those PDFs.
You first learned how to iteratively read text from PDFs and store it in a single data frame.
Then you discovered how to prepare the data frame for text mining by removing irrelevant words and transforming it from a list of sentences into a list of words. Finally, you learned how to perform sentiment analysis, build word clouds, and run n-gram analysis on it.
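As a quick refresher, the preparation steps above can be sketched in a few lines. This is only a minimal illustration in plain Python, not the code used in the chapter: the sample sentences, the tiny `STOP_WORDS` set, and the helper names `tokenize` and `ngrams` are all made up for the example, and the n-grams are built from the already filtered word list, which is a simplification.

```python
import re
from collections import Counter

# Hypothetical stop-word list, for illustration only;
# a real analysis would rely on a much fuller lexicon.
STOP_WORDS = {"the", "a", "is", "of", "and", "to", "in", "with"}


def tokenize(sentences):
    """Turn a list of sentences into a flat list of lowercase words,
    dropping the words listed in STOP_WORDS."""
    words = []
    for sentence in sentences:
        for word in re.findall(r"[a-z']+", sentence.lower()):
            if word not in STOP_WORDS:
                words.append(word)
    return words


def ngrams(words, n=2):
    """Count every run of n consecutive words in the list."""
    return Counter(zip(*(words[i:] for i in range(n))))


# Invented sample sentences standing in for the text read from the PDFs.
sentences = [
    "The customer is late with payments",
    "Late payments and a bad credit history",
]

words = tokenize(sentences)      # list of sentences -> list of words
word_counts = Counter(words)     # frequencies, the raw input of a word cloud
bigrams = ngrams(words, n=2)     # counts of adjacent word pairs

print(word_counts.most_common(2))
print(bigrams.most_common(1))
```

The word frequencies are what a word cloud visualizes, while the bigram counts show which words tend to appear together.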
From these analyses, you discovered that the companies you predicted would default are in fact regarded as bad customers by your colleagues in the commercial department.
This helped you gain knowledge from unstructured data.
Moving on to the more structured data contained in the same PDFs, you learned how to transform it into an edge list in order to perform network analysis, which mainly consisted of computing the degree of each node. This resulted in...