Part 2: Latent Semantic Analysis/Latent Semantic Indexing
In this part, you will learn how Latent Semantic Analysis (LSA) developed from Singular Value Decomposition (SVD) and Truncated SVD. We will use scikit-learn to trace the progression from Truncated SVD to LSA.
LSA is also known as Latent Semantic Indexing (LSI), so this book follows convention and uses the name LSA/LSI. In this part, we will also introduce one of the core measures of vector similarity: cosine similarity. We will then use Gensim to build an LSA/LSI model. You will learn how to determine the optimal number of topics, set up your model as an information retrieval tool, and use cosine similarity to search for the documents most relevant to a set of keywords.
This part contains the following chapters:
- Chapter 4, Latent Semantic Analysis with scikit-learn
- Chapter 5, Cosine Similarity
- Chapter 6, Latent Semantic Indexing with Gensim