Introducing sequential learning
The machine learning problems we have solved so far in this book have been time independent. For example, our earlier ad click-through model did not use the user’s historical ad clicks; in face classification, the model only takes in the current face image, not previous ones. However, many real-world problems depend on time. For example, in financial fraud detection, we can’t just look at the present transaction; we should also consider previous transactions so that we can model how the current one deviates from them. Another example is Part-of-Speech (PoS) tagging, where we assign a PoS (verb, noun, adverb, and so on) to a word. Instead of focusing solely on the given word, we must look at some of the preceding words, and sometimes the following words too.
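To make the contrast concrete, here is a minimal sketch based on the fraud detection example. It is only an illustration, not a real fraud model: the function names, the fixed limit, the window size, and the ratio threshold are all assumptions chosen for the example. The first rule judges each transaction on its own, while the second compares each transaction with the average of the few transactions before it.

```python
from statistics import mean


def flag_independent(amounts, limit=1000.0):
    """Time-independent rule: each transaction is judged on its own."""
    return [a > limit for a in amounts]


def flag_sequential(amounts, window=3, ratio=5.0):
    """Time-dependent rule: compare each transaction with the mean
    of up to `window` transactions that came before it."""
    flags = []
    for i, a in enumerate(amounts):
        history = amounts[max(0, i - window):i]
        # With no history there is nothing to compare against,
        # so the first transaction is never flagged.
        flags.append(bool(history) and a > ratio * mean(history))
    return flags


# Hypothetical transaction amounts, in time order.
amounts = [20.0, 25.0, 22.0, 400.0, 24.0]
print(flag_independent(amounts))
print(flag_sequential(amounts))
```

With these made-up numbers, the fourth transaction stays under the fixed limit, so the time-independent rule misses it, but it stands out clearly against the transactions that precede it. Only the rule that looks at earlier inputs catches it.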
In time-dependent cases like those just mentioned, the current output depends not only on the current input but also on the previous inputs; note that the length of the...