The introduction to this chapter touched upon the challenge of representing text data in a mathematical form. Two of the most popular data structures used with text data are vectors and matrices. We will now look at each of these in detail.
Vectors
A vector is a one-dimensional array of numbers in which each number can be identified by its index. Vectors are typically written as a column enclosed in square brackets, as shown here:
In this example, the vector x has three elements, which together describe the vector. Mathematicians treat a vector as an object in space, where each element represents the projection of the vector along a given axis. We often write Rn for the space a vector belongs to, where R denotes the set of real numbers and n the number of dimensions used to describe the vector. In general, Rn is the set of all n-tuples of real numbers.
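The idea that each element is a projection along an axis can be checked with a few lines of plain Python. This is only a minimal sketch: the values in x are made up for illustration, and the helper names dot and unit_axis are hypothetical, not taken from any library. In practice, a numerical library would typically be used instead of lists.

```python
# Hypothetical three-element vector; the values are illustrative only.
x = [3.0, 1.0, 2.0]

# The number of elements is the number of dimensions, so x lives in R^3.
n = len(x)


def dot(u, v):
    """Return the dot product of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(u, v))


def unit_axis(i, n):
    """Return the unit vector along axis i in n dimensions."""
    return [1.0 if j == i else 0.0 for j in range(n)]


# Projecting x onto each coordinate axis recovers its elements one by one.
projections = [dot(x, unit_axis(i, n)) for i in range(n)]

# Each element is also reachable directly through its index (zero-based here).
first_element = x[0]
```

Here, projections holds the same numbers as x, which is what the geometric reading of a vector predicts: the i-th element is the length of the vector's shadow on the i-th axis.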
In the preceding example, the...