Real data challenges and issues
So far in this chapter, we have presented different approaches for mitigating the privacy issues in real data. As we have seen, these approaches have limitations and are not always practical. One fundamental issue is that ML models memorize their training data: given a trained ML model, it may be possible to retrieve some of the data it was trained on. Many researchers have recently raised a red flag about the privacy of ML models, even after standard privacy solutions are applied. For more information, please refer to How To Break Anonymity of the Netflix Prize Dataset (https://arxiv.org/abs/cs/0610105) and The Secret Sharer: Evaluating and Testing Unintended Memorization in Neural Networks (https://arxiv.org/abs/1802.08232).
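To make the memorization problem concrete, here is a minimal sketch in the spirit of the extraction attacks described in The Secret Sharer. It is not the paper's method: instead of a neural network, it uses a toy character-level n-gram model, and the "secret" (a made-up card number) and all strings are invented for illustration. The point it demonstrates is the same, though: a model that has memorized its training text lets an attacker who knows only the surrounding context recover the secret.

```python
from collections import defaultdict, Counter

# Training corpus containing a fictitious secret (a made-up card number).
training_text = (
    "the weather is nice today. "
    "my credit card number is 4929-1234-5678, please keep it safe. "
    "the weather is nice today."
)

K = 4  # context length: the model conditions on the last K characters

# "Training" is just counting which character follows each K-gram.
counts = defaultdict(Counter)
for i in range(len(training_text) - K):
    counts[training_text[i:i + K]][training_text[i + K]] += 1

def generate(prefix, length=30):
    """Greedy decoding: repeatedly emit the most likely next character."""
    out = prefix
    for _ in range(length):
        nxt = counts.get(out[-K:])
        if not nxt:
            break
        out += nxt.most_common(1)[0][0]
    return out

# Extraction attack: the attacker knows only the surrounding context,
# probes each possible first digit, and lets the model complete the rest.
for digit in "0123456789":
    candidate = "my credit card number is " + digit
    if candidate[-K:] in counts:
        print(generate(candidate))  # the memorized secret is reproduced
```

Deep networks are far more complex than this counting model, but the papers cited above show that they exhibit the same failure mode: rare, out-of-distribution training sequences can be stored nearly verbatim in the model's parameters and later extracted.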
The nature of real data is the essence of the problem. For instance, if you are given real human faces and you perform some operations to anonymize this data, or if you apply state-of-the-art approaches for PPML training...