Pseudonymizing data in Power BI
Unlike anonymization, pseudonymization preserves the statistical characteristics of the dataset, because the same input string is always transformed into the same output string. It also keeps track of the replacements it has made, so anyone with access to this mapping information can recover the original dataset.
Moreover, pseudonymization replaces sensitive data with fake strings (pseudonyms) that have the same form as the originals, which makes the de-identified data look more realistic.
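The two properties just described (consistent replacement and a stored mapping that allows reversal) can be illustrated with a minimal sketch in plain Python. This is not the implementation used in this chapter: the `Pseudonymizer` class, its method names, and the pool of fake names are all hypothetical and exist only to make the idea concrete.

```python
class Pseudonymizer:
    """Hypothetical helper: consistent pseudonyms plus a reversible mapping."""

    def __init__(self, fake_values):
        # Pool of replacement values (assumed to be large enough and
        # not to overlap with the real values)
        self._pool = iter(fake_values)
        self.forward = {}   # original value -> pseudonym
        self.reverse = {}   # pseudonym -> original value

    def pseudonymize(self, value):
        # Draw a new pseudonym only the first time a value is seen,
        # so the same input always yields the same output
        if value not in self.forward:
            pseudonym = next(self._pool)
            self.forward[value] = pseudonym
            self.reverse[pseudonym] = value
        return self.forward[value]

    def restore(self, pseudonym):
        # Only someone holding the mapping can get the original back
        return self.reverse[pseudonym]


# Made-up sample data, for illustration only
fake_names = ["Alex Morgan", "Jamie Lee", "Sam Carter"]
names = ["Mary Smith", "John Doe", "Mary Smith"]

p = Pseudonymizer(fake_names)
masked = [p.pseudonymize(n) for n in names]
restored = [p.restore(m) for m in masked]

print(masked)    # ['Alex Morgan', 'Jamie Lee', 'Alex Morgan']
print(restored)  # ['Mary Smith', 'John Doe', 'Mary Smith']
```

Because both occurrences of "Mary Smith" map to the same pseudonym, counts and groupings computed on the masked column match those of the original column, and the `reverse` dictionary is exactly the mapping information that must be protected.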
Depending on the analytical language used, the available packages offer different solutions that lead to the same result. Let's see how to apply pseudonymization in Power BI to the contents of the same Excel file used in the previous sections with Python.
Pseudonymizing data using Python
The modules and the code structure you will use are quite similar to those already used for anonymization. One difference is that, once the sensitive entities...