Unveiling the challenges of generating and utilizing synthetic data
In this section, you will learn about the most common issues, seen across different domains, that limit the benefits and usability of synthetic data.
We can roughly group these limiting factors into four main categories:
- Domain gap
- Data representation
- Privacy, security, and validation
- Trust and credibility
They can be represented as shown in Figure 13.1:
Figure 13.1 – Main factors that limit the usability of synthetic data in practice
Next, let’s delve into each of these categories in more detail.
Domain gap
While neural networks are very successful at learning hidden patterns, correlations, and structures in large datasets, they can suffer from the domain gap problem. Domain gap usually refers to the difference between the source and target domains’ data. The source domain refers to the training data’s domain on...
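To make the idea of a domain gap concrete, here is a minimal sketch using a toy setup that is not from this chapter. It assumes that the source domain is the data a model is fit on and the target domain is the data it is later evaluated on. The two-class 1-D Gaussian data, the shift of 3.0, and the helper names `make_domain`, `fit_threshold`, and `accuracy` are all hypothetical choices made purely for illustration:

```python
import random
import statistics


def make_domain(n_per_class, shift, seed):
    """Draw 1-D features for two classes; `shift` moves the whole domain."""
    rng = random.Random(seed)
    data = []
    # Hypothetical class centers at 0.0 and 4.0 with unit standard deviation.
    for label, center in ((0, 0.0), (1, 4.0)):
        for _ in range(n_per_class):
            data.append((rng.gauss(center + shift, 1.0), label))
    return data


def fit_threshold(data):
    """Nearest-centroid rule in 1-D: the midpoint between the class means."""
    mean0 = statistics.fmean(x for x, y in data if y == 0)
    mean1 = statistics.fmean(x for x, y in data if y == 1)
    return (mean0 + mean1) / 2.0


def accuracy(threshold, data):
    """Fraction of samples where 'x above threshold' matches label 1."""
    correct = sum((x > threshold) == bool(y) for x, y in data)
    return correct / len(data)


# Source domain: used for fitting and for an in-domain test.
source_train = make_domain(500, shift=0.0, seed=0)
source_test = make_domain(500, shift=0.0, seed=1)
# Target domain: same classes, but every feature is shifted (assumed gap).
target_test = make_domain(500, shift=3.0, seed=2)

threshold = fit_threshold(source_train)
source_acc = accuracy(threshold, source_test)
target_acc = accuracy(threshold, target_test)

# A crude gap measure: distance between the overall feature means.
mean_gap = abs(
    statistics.fmean(x for x, _ in target_test)
    - statistics.fmean(x for x, _ in source_test)
)

print(f"decision threshold: {threshold:.2f}")
print(f"source accuracy:    {source_acc:.3f}")
print(f"target accuracy:    {target_acc:.3f}")
print(f"mean feature gap:   {mean_gap:.2f}")
```

In this sketch, the decision rule learned on the source domain stays fixed, so the shifted target samples fall on the wrong side of the threshold more often and accuracy drops. Real domain gaps are rarely a simple shift of one feature, so the mean difference here is only a stand-in for more careful distribution comparisons.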