Managing data privacy
In the Understanding the security requirements of a data mesh architecture section, we learned that data security and data privacy are related, overlapping topics. If you have implemented all aspects of data security, you have covered most of data privacy, too. Only two topics remain to complete the data security and privacy picture: data masking and data retention.
Once again, we will consider the two dominant data stores—SQL Database and Azure Data Lake Gen2.
Data masking
Data engineers and data scientists often need to query sensitive personally identifiable information (PII) as part of an experiment or pipeline. This data must still flow through the system, but it should not be readable by the people who query it, to prevent malicious use or data leaks. Sometimes sensitive data, such as social security numbers, is even part of a table relationship and, hence, takes part in join operations.
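To make the idea concrete, here is a minimal Python sketch of the two needs described above: hiding a sensitive value from human readers while keeping it usable as a join key. It is an illustration only and does not represent a built-in feature of any data store. The function names mask_ssn and join_token, the nine-digit format check, and the demo key are all assumptions made for this example.

```python
import hashlib
import hmac


def mask_ssn(ssn: str) -> str:
    """Hide all but the last four digits of a nine-digit social security number."""
    digits = [c for c in ssn if c.isdigit()]
    if len(digits) != 9:
        raise ValueError("expected a nine-digit social security number")
    return "XXX-XX-" + "".join(digits[-4:])


def join_token(ssn: str, key: bytes) -> str:
    """Return a keyed hash of the digits.

    Equal inputs give equal tokens under the same key, so the token can
    replace the raw value in join operations without exposing it.
    """
    digits = "".join(c for c in ssn if c.isdigit())
    return hmac.new(key, digits.encode("utf-8"), hashlib.sha256).hexdigest()


if __name__ == "__main__":
    demo_key = b"demo-key-not-for-production"  # hypothetical key for illustration
    print(mask_ssn("123-45-6789"))
    # The same number in two formats yields the same join token.
    print(join_token("123-45-6789", demo_key) == join_token("123456789", demo_key))
```

In this sketch, the masked form is what a person sees in query results, and the token is what tables join on. In practice, the key would have to be stored in a secret store and kept away from the people running the queries.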
For...