Tutorial – developing a scientific data search engine using transformers
So far, we have worked with text from a word-by-word perspective, keeping it in its raw form without converting or embedding it in any way. Converting words into numerical values, or embeddings, opens many new doors, especially when it comes to deep learning. Our main objective in this tutorial is to develop a search engine that finds and retrieves scientific data. We will do so by implementing an important and useful deep learning NLP architecture known as the transformer. The main benefit is that the result is a semantic search engine: we can search for ideas and meaning rather than only keywords.
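To make the idea of semantic search concrete before we build the real thing, here is a minimal sketch. It uses small hand-written vectors as stand-ins for the embeddings a transformer would produce, and ranks documents by cosine similarity to the query embedding; the document titles and vector values are invented for illustration only.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy 4-dimensional embeddings standing in for transformer outputs.
doc_embeddings = {
    "ocean temperature dataset": np.array([0.9, 0.1, 0.0, 0.2]),
    "sea surface warming records": np.array([0.8, 0.2, 0.1, 0.3]),
    "particle physics collision logs": np.array([0.0, 0.9, 0.8, 0.1]),
}

def search(query_embedding, docs):
    """Return document titles ranked by similarity to the query embedding."""
    ranked = sorted(
        docs.items(),
        key=lambda kv: cosine_similarity(query_embedding, kv[1]),
        reverse=True,
    )
    return [title for title, _ in ranked]

# Pretend embedding of a query like "marine heat data".
query = np.array([0.85, 0.15, 0.05, 0.25])
print(search(query, doc_embeddings))
```

Even though the query shares no keywords with the marine documents, it lands near them in the embedding space, which is exactly the behavior a keyword search cannot provide.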
We can think of transformers as deep learning models designed to solve sequence-based tasks using a mechanism known as self-attention. We can think of self-attention as a method that lets the model weigh how relevant each token in a sequence is to every other token when building its representation of the sequence.
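The weighing described above can be sketched numerically. Below is a minimal single-head scaled dot-product self-attention in numpy, with randomly initialized projection matrices standing in for learned weights; the matrix names (Wq, Wk, Wv) and dimensions are illustrative assumptions, not part of any specific library.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence of token vectors X."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)       # how strongly each token attends to each other token
    weights = softmax(scores, axis=-1)    # each row is a probability distribution over tokens
    return weights @ V, weights           # output: attention-weighted mix of the value vectors

rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4))               # a sequence of 3 tokens, embedding dimension 4
Wq, Wk, Wv = (rng.normal(size=(4, 4)) for _ in range(3))

out, weights = self_attention(X, Wq, Wk, Wv)
print(out.shape)            # one output vector per input token
print(weights.sum(axis=1))  # each row of attention weights sums to 1
```

The key point is that each output row is a mixture of all the value vectors, with the mixing proportions (the attention weights) computed from the tokens themselves rather than fixed in advance.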