Sensitive data masking and encryption using Hadoop
Many companies handle sensitive information such as Social Security numbers (SSNs), names, and credit card numbers. In this recipe, we are going to take a look at how to use Hadoop to mask or encrypt this data in order to secure it. This recipe is relevant to any domain that handles critical information, such as finance, retail, and telecom.
Getting ready
To perform this recipe, you should have a running Hadoop cluster.
How to do it...
Before jumping into the solution, let's first try to understand the problem statement.
Problem statement
Handling sensitive information is a critical part of today's data operations. Here, the problem is to transform sensitive fields into either masked data, where part or all of each value is hidden, or fully encrypted data, which can be recovered only with the key.
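To make the two transformations concrete, here is a minimal sketch in plain Java, independent of any Hadoop job. It is an illustration under stated assumptions, not the recipe's own implementation: the class name SensitiveFieldDemo and its methods are hypothetical, the masking rule (keep the last four digits) is one common convention, and AES in GCM mode from the standard javax.crypto package is simply one reasonable cipher choice. The key is generated in memory for the demo; in practice it would come from a secure key store.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Base64;

// Hypothetical helper class, for illustration only.
class SensitiveFieldDemo {
    private static final int IV_BYTES = 12;   // IV length commonly used with GCM
    private static final int TAG_BITS = 128;  // authentication tag length

    // Masking (one-way): replace every digit except the last four with '*'.
    // Separators such as '-' are left untouched.
    public static String mask(String value) {
        StringBuilder out = new StringBuilder(value);
        int digitsSeen = 0;
        for (int i = value.length() - 1; i >= 0; i--) {
            if (Character.isDigit(value.charAt(i))) {
                digitsSeen++;
                if (digitsSeen > 4) {
                    out.setCharAt(i, '*');
                }
            }
        }
        return out.toString();
    }

    // Encryption (reversible with the key): AES-GCM with a fresh random IV.
    // The IV is prepended to the ciphertext and the result is Base64-encoded
    // so that it can be stored as text in a flat file.
    public static String encrypt(String plain, SecretKey key) throws Exception {
        byte[] iv = new byte[IV_BYTES];
        new SecureRandom().nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
        byte[] ct = cipher.doFinal(plain.getBytes(StandardCharsets.UTF_8));
        byte[] out = new byte[iv.length + ct.length];
        System.arraycopy(iv, 0, out, 0, iv.length);
        System.arraycopy(ct, 0, out, iv.length, ct.length);
        return Base64.getEncoder().encodeToString(out);
    }

    // Decryption: split off the IV, then decrypt and verify the remainder.
    public static String decrypt(String encoded, SecretKey key) throws Exception {
        byte[] in = Base64.getDecoder().decode(encoded);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, key,
                new GCMParameterSpec(TAG_BITS, in, 0, IV_BYTES));
        byte[] pt = cipher.doFinal(in, IV_BYTES, in.length - IV_BYTES);
        return new String(pt, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        // Demo key only; a real job would load the key from a secure store.
        KeyGenerator keyGen = KeyGenerator.getInstance("AES");
        keyGen.init(128);
        SecretKey key = keyGen.generateKey();

        System.out.println(mask("4111-1111-1111-1234")); // ****-****-****-1234
        String enc = encrypt("4111-1111-1111-1234", key);
        System.out.println(enc);               // differs on every run (random IV)
        System.out.println(decrypt(enc, key)); // original value restored
    }
}
```

The design difference matters when choosing between the two: a masked value cannot be restored, which suits reports and test data, whereas an encrypted value can be recovered by anyone holding the key, which suits data that must later be read back in full.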
Solution
Here, we assume that the data is already available in flat files and has been loaded into HDFS.
Let's say we have some sample data, as shown here, which has the name and credit...