Encryption
Sensitive and legally protected data, such as personally identifiable information (PII), must be stored in encrypted form on the filesystem. However, Hive does not yet natively support encryption and decryption (see https://issues.apache.org/jira/browse/HIVE-5207).
One option is to use third-party tools to encrypt and decrypt data after exporting it from Hive, but this requires additional postprocessing. The new HDFS encryption (see https://issues.apache.org/jira/browse/HDFS-6134) provides transparent encryption and decryption of data on HDFS, which meets the requirement when the whole dataset in HDFS must be encrypted. However, it cannot be applied at the level of selected columns or rows in a Hive table, and in most cases the PII that needs encryption is only a part of the raw data. In this case, the best solution for now is to use Hive UDFs to plug in encryption and decryption implementations on selected columns or partial data in Hive tables.
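To make the UDF approach more concrete, the following is a minimal sketch of the kind of column-level cipher logic such a UDF could delegate to. It is an illustration under stated assumptions, not the implementation used in this book: the class name ColumnCipher, the AES-GCM mode, and the passphrase-based key derivation are all hypothetical choices, and the class is deliberately kept free of Hive dependencies so that it runs with the JDK alone.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Arrays;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper: encrypts and decrypts a single column value.
final class ColumnCipher {
    private static final int IV_BYTES = 12;   // common IV length for GCM
    private static final int TAG_BITS = 128;  // authentication tag length
    private static final SecureRandom RANDOM = new SecureRandom();

    private ColumnCipher() {}

    // Assumption for illustration only: derive a 128-bit AES key by hashing
    // a passphrase. A real deployment would fetch keys from a key store.
    private static SecretKeySpec deriveKey(String passphrase)
            throws GeneralSecurityException {
        byte[] digest = MessageDigest.getInstance("SHA-256")
                .digest(passphrase.getBytes(StandardCharsets.UTF_8));
        return new SecretKeySpec(Arrays.copyOf(digest, 16), "AES");
    }

    // Returns Base64(IV || ciphertext), or null for a NULL column value.
    public static String encrypt(String plain, String passphrase) {
        if (plain == null) {
            return null;
        }
        try {
            byte[] iv = new byte[IV_BYTES];
            RANDOM.nextBytes(iv); // fresh random IV for every value
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, deriveKey(passphrase),
                    new GCMParameterSpec(TAG_BITS, iv));
            byte[] ct = cipher.doFinal(plain.getBytes(StandardCharsets.UTF_8));
            byte[] out = ByteBuffer.allocate(iv.length + ct.length)
                    .put(iv).put(ct).array();
            return Base64.getEncoder().encodeToString(out);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("encryption failed", e);
        }
    }

    // Reverses encrypt(): splits off the IV, then decrypts and verifies.
    public static String decrypt(String encoded, String passphrase) {
        if (encoded == null) {
            return null;
        }
        try {
            byte[] in = Base64.getDecoder().decode(encoded);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, deriveKey(passphrase),
                    new GCMParameterSpec(TAG_BITS, in, 0, IV_BYTES));
            byte[] pt = cipher.doFinal(in, IV_BYTES, in.length - IV_BYTES);
            return new String(pt, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("decryption failed", e);
        }
    }
}
```

In a Hive deployment, a thin wrapper class extending org.apache.hadoop.hive.ql.exec.UDF would expose evaluate() methods that call these helpers, and the packaged JAR would be registered with ADD JAR and CREATE TEMPORARY FUNCTION before use in queries. Note that because this sketch uses a random IV, encrypting the same value twice gives different ciphertexts, so the encrypted column cannot be used directly for joins or equality filters.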
Sample UDF implementations...