Importing data into HDFS from Mainframes
Mainframes have been among the most widely used data platforms in the financial domain for a long time. Sqoop supports importing datasets from mainframes into HDFS. This recipe is important for anyone looking to migrate from mainframes to Hadoop-based systems.
Getting ready
To perform this recipe, you should have a running Hadoop cluster with the latest version of Sqoop installed on it. Here I am using Sqoop 1.4.6. We also need a mainframe host that is reachable over the network. Installing Sqoop is easy: download the Sqoop tarball, extract it, and add its bin directory to the system path.
How to do it...
Sqoop provides a tool called import-mainframe, which connects to a given mainframe host and imports the selected dataset. The following command connects to a mainframe host with the provided credentials and then imports the specified dataset into the HDFS target directory:
sqoop import-mainframe --connect <mainframes-host> \ --dataset <...
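To make the shape of the command concrete, here is a minimal sketch of a filled-in invocation. The host name, dataset name, user name, and target directory are hypothetical placeholders, not values from this recipe. The options used (--connect, --dataset, --username, -P, --target-dir) are documented import-mainframe arguments in the Sqoop 1.4.x user guide. The script only prints the command by default and runs it when RUN_IMPORT=1 is set and sqoop is on the path.

```shell
#!/bin/sh
# Hypothetical values for illustration only; replace with your own.
MAINFRAME_HOST="mainframe.example.com"          # assumed mainframe host name
DATASET="EMPLOYEES"                             # assumed partitioned dataset name
MF_USER="mfuser"                                # assumed mainframe user
TARGET_DIR="/user/hadoop/mainframe/employees"   # assumed HDFS target directory

# -P makes Sqoop prompt for the password instead of taking it on the command line.
SQOOP_CMD="sqoop import-mainframe --connect $MAINFRAME_HOST --dataset $DATASET --username $MF_USER -P --target-dir $TARGET_DIR"

# Print the command so it can be reviewed before anything is executed.
echo "$SQOOP_CMD"

# Execute only when explicitly requested and when sqoop is installed.
if [ "${RUN_IMPORT:-0}" = "1" ] && command -v sqoop >/dev/null 2>&1; then
  # Left unquoted on purpose so the shell splits the string into arguments.
  $SQOOP_CMD
fi
```

Running the script as is prints the assembled command. Running it with RUN_IMPORT=1 starts the import and prompts for the mainframe password.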