Non-negative matrix factorization (NMF) relies heavily on linear algebra. It factorizes an input matrix, V, into a product of two smaller matrices, W and H, in such a way that these three matrices have no negative values. In the context of NLP, these three matrices have the following meanings:
- The input matrix V is the term counts or tf-idf matrix of size n * m, where n is the number of documents or samples, and m is the number of terms.
- The first decomposition output matrix W is the feature matrix of size t * m, where t is the number of topics specified. Each row of W represents a topic with each element in the row representing the rank of a term in the topic.
- The second decomposition output matrix H is the coefficient matrix of size n * t. Each row of H represents a document, with each element in the row representing the weight of a topic within the...