Domain-specific issues limiting the usability of synthetic data
In addition to general issues that limit the usability of synthetic data in practice, there are also domain-specific issues related to that. In this section, we explore these common domain-specific issues limiting the usability of synthetic data. Let’s study synthetic data usability issues in the following three fields: healthcare, finance, and autonomous cars.
Healthcare
ML in healthcare requires large-scale training data. Usually, the data is unstructured, comes from different sensors and sources, is longitudinal (data collected over a long period), is highly imbalanced, and contains sensitive information. The illnesses and diseases that patients suffer from are diverse and complex and depend on a multitude of factors, such as genes, geographic location, medical conditions, and occupation. Thus, to generate useful synthetic training data in the healthcare field, domain experts are usually needed...