Word2Vec for 10-K financial documents filed with the SEC
Financial documents also contain a large amount of unstructured text. This use case is included in this book because it applies NLP techniques to such documents, and professionals in the financial services industry may find it helpful.
Background
A 10-K financial document is an annual report filed by a publicly traded company about its financial performance. It is required by the US Securities and Exchange Commission (SEC). While a 10-K report contains many numbers and tables, it also has textual sections: Risk Factors (Item 1A), Management’s Discussion and Analysis (Item 7), and Quantitative and Qualitative Disclosures About Market Risk (Item 7A). These textual sections present management's perspective on the company's business.
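Before any NLP technique can be applied, the textual sections have to be separated from the rest of the filing. The following is a minimal sketch of that step, assuming the filing is already available as plain text and that each section starts on its own line with a heading such as "Item 1A." The function name `extract_text_sections`, the heading pattern, and the sample text are all hypothetical; real filings vary in layout, and the same headings often appear in the table of contents as well, which is why the sketch keeps the longest body found for each item.

```python
import re

# Assumed heading shape: "Item 1A. Risk Factors" on its own line.
# Real filings are less regular, so treat this pattern as illustrative only.
ITEM_HEADING = re.compile(
    r"^[ \t]*item[ \t]+(\d+[a-z]?)[ \t]*[.:][^\n]*",
    re.IGNORECASE | re.MULTILINE,
)


def extract_text_sections(filing_text, wanted=("1A", "7", "7A")):
    """Return {item id: body text} for the requested 10-K items."""
    headings = list(ITEM_HEADING.finditer(filing_text))
    sections = {}
    for i, match in enumerate(headings):
        item_id = match.group(1).upper()
        if item_id not in wanted:
            continue
        # A section runs from the end of its heading line to the next heading.
        end = headings[i + 1].start() if i + 1 < len(headings) else len(filing_text)
        body = filing_text[match.end():end].strip()
        # Keep the longest candidate so a short table-of-contents entry
        # does not replace the real section.
        if len(body) > len(sections.get(item_id, "")):
            sections[item_id] = body
    return sections


# Invented sample text, not taken from any real filing.
sample = """Item 1. Business
We make widgets.
Item 1A. Risk Factors
Demand for widgets may fall.
Item 1B. Unresolved Staff Comments
None.
Item 7. Management's Discussion and Analysis
Revenue grew during the year.
Item 7A. Quantitative and Qualitative Disclosures About Market Risk
We are exposed to interest rate changes.
Item 8. Financial Statements
See the tables.
"""

sections = extract_text_sections(sample)
for item_id, body in sections.items():
    print(item_id, "->", body)
```

The extracted bodies can then be tokenized and passed to whatever embedding model is used downstream.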
Questions
How do we apply NLP techniques to financial documents?
NLP solution
The author of [8] applied Word2Vec to get the word embeddings. The author then performed...
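To make the embedding step concrete, here is a toy sketch of a skip-gram model trained with a full softmax, which is the basic formulation behind Word2Vec. It is not the implementation used in [8], and the source does not say which library or settings were used there. The function names `train_skipgram` and `cosine`, the hyperparameters, and the four-sentence corpus are assumptions made for illustration. A real project would more likely rely on an optimized library implementation and a much larger corpus of filings.

```python
import math
import random


def train_skipgram(sentences, dim=8, window=2, epochs=50, lr=0.05, seed=0):
    """Train toy skip-gram embeddings with a full softmax (no negative sampling)."""
    rng = random.Random(seed)
    vocab = sorted({word for sent in sentences for word in sent})
    index = {word: i for i, word in enumerate(vocab)}
    size = len(vocab)
    # Input (word) vectors start small and random; output vectors start at zero.
    w_in = [[rng.uniform(-0.5, 0.5) / dim for _ in range(dim)] for _ in range(size)]
    w_out = [[0.0] * dim for _ in range(size)]

    for _ in range(epochs):
        for sent in sentences:
            ids = [index[word] for word in sent]
            for pos, center in enumerate(ids):
                lo, hi = max(0, pos - window), min(len(ids), pos + window + 1)
                for ctx_pos in range(lo, hi):
                    if ctx_pos == pos:
                        continue
                    target = ids[ctx_pos]
                    h = w_in[center]
                    # Softmax over the whole vocabulary for this (center, context) pair.
                    scores = [sum(a * b for a, b in zip(h, w_out[j])) for j in range(size)]
                    top = max(scores)
                    exps = [math.exp(s - top) for s in scores]
                    total = sum(exps)
                    grad_h = [0.0] * dim
                    for j in range(size):
                        err = exps[j] / total - (1.0 if j == target else 0.0)
                        for k in range(dim):
                            grad_h[k] += err * w_out[j][k]   # uses the pre-update value
                            w_out[j][k] -= lr * err * h[k]
                    for k in range(dim):
                        h[k] -= lr * grad_h[k]
    return {word: w_in[i] for word, i in index.items()}


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


# Invented toy sentences in the style of 10-K text, not from any real filing.
corpus = [
    "interest rate risk may affect our market results".split(),
    "market risk includes interest rate and currency risk".split(),
    "management discusses revenue and operating results".split(),
    "currency changes may affect revenue".split(),
]

vectors = train_skipgram(corpus)
ranked = sorted(
    (word for word in vectors if word != "risk"),
    key=lambda word: cosine(vectors["risk"], vectors[word]),
    reverse=True,
)
print("words closest to 'risk':", ranked[:3])
```

With a corpus this small the neighbours carry little meaning; the point is only to show how each word ends up with a dense vector that can be compared with others.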