Synthetic data as a solution for NLP problems
In this section, you will learn how companies use synthetic data to solve their NLP problems. We will look at four case studies:
- SYSTRAN Soft’s use of synthetic data
- Telefónica’s use of synthetic data
- Clinical text mining utilizing synthetic data
- The Alexa virtual assistant model
SYSTRAN Soft’s use of synthetic data
Neural Machine Translation (NMT) is a promising approach in NLP. It uses neural networks to learn a statistical model that maps text in a source language to text in a target language. The typical architecture is an encoder-decoder: the encoder turns the source sentence into an internal representation, and the decoder generates the translation from that representation. These models are usually trained on large-scale datasets and have been shown to achieve excellent results in practice. However, they also have some limitations, as we will see with the SYSTRAN case study.
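To make the encoder-decoder idea concrete, here is a minimal sketch in plain Python. It is not SYSTRAN's system or any production NMT model: the vocabularies, the hidden size, and the helper names (`rand_matrix`, `rnn_step`, `encode`, `decode`) are all hypothetical, and the weights are random and untrained, so the output tokens are meaningless. The sketch only illustrates the data flow: the encoder folds the source tokens into a context vector, and the decoder greedily generates target tokens from it.

```python
import math
import random

random.seed(0)  # fixed seed so the untrained toy model is reproducible

# Hypothetical toy vocabularies and hidden size (assumptions for illustration).
SRC_VOCAB = ["<pad>", "hello", "world"]
TGT_VOCAB = ["<bos>", "<eos>", "bonjour", "monde"]
HIDDEN = 8


def rand_matrix(rows, cols):
    """Return a rows x cols matrix of small random weights."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    """Multiply matrix m by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def rnn_step(w_in, w_h, x, h):
    """One vanilla RNN step: h' = tanh(W_in x + W_h h)."""
    return [math.tanh(a + b) for a, b in zip(matvec(w_in, x), matvec(w_h, h))]


# Randomly initialised (untrained) parameters.
src_emb = rand_matrix(len(SRC_VOCAB), HIDDEN)
tgt_emb = rand_matrix(len(TGT_VOCAB), HIDDEN)
enc_w_in, enc_w_h = rand_matrix(HIDDEN, HIDDEN), rand_matrix(HIDDEN, HIDDEN)
dec_w_in, dec_w_h = rand_matrix(HIDDEN, HIDDEN), rand_matrix(HIDDEN, HIDDEN)
out_proj = rand_matrix(len(TGT_VOCAB), HIDDEN)


def encode(src_ids):
    """Encoder: read the source tokens one by one into a single context vector."""
    h = [0.0] * HIDDEN
    for i in src_ids:
        h = rnn_step(enc_w_in, enc_w_h, src_emb[i], h)
    return h


def decode(context, max_len=5):
    """Decoder: start from the context and greedily pick one target token per step."""
    h, token, output = context, TGT_VOCAB.index("<bos>"), []
    for _ in range(max_len):
        h = rnn_step(dec_w_in, dec_w_h, tgt_emb[token], h)
        scores = matvec(out_proj, h)  # one score per target-vocabulary entry
        token = max(range(len(scores)), key=scores.__getitem__)
        if TGT_VOCAB[token] == "<eos>":
            break
        output.append(TGT_VOCAB[token])
    return output


source = [SRC_VOCAB.index(w) for w in ["hello", "world"]]
context = encode(source)
translation = decode(context)
print(translation)
```

In a real NMT system, the parameters would be learned from large parallel corpora, and the recurrent steps would typically be replaced by more capable layers. The dependence on such training data is where the limitations mentioned above come from.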
SYSTRAN is one of the few pioneering companies in the field of machine translation technology (https://www.systransoft...