Hadoop supports many file formats, and the right choice depends on the use case. Each format has its own trade-offs in storage and performance. Let's discuss each file format in detail.
Hadoop file formats
Text/CSV file
Text and CSV files are very common in Hadoop data processing. Each line in the file is treated as a new record, and each line typically ends with the \n (newline) character. These files carry no schema or metadata, so a header row is not recognized as such: it is read as an ordinary record, and the processing code needs an extra step to skip it. CSV files do not support block-level compression, so they are typically compressed as a whole with a codec such as GZIP. A GZIP-compressed file cannot be split across tasks, which adds to the processing cost because a single task has to read the entire file. Needless to mention they do not...
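The header handling mentioned above can be sketched in plain Python. This is only an illustration using the standard library, not Hadoop code: the function names `read_records` and `make_sample` and the sample columns are made up for this example, and the GZIP payload is built in memory rather than read from HDFS.

```python
import csv
import gzip
import io


def make_sample():
    """Build a small GZIP-compressed CSV payload in memory (hypothetical data)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["id", "name"])  # header row, stored like any other line
    writer.writerow(["1", "alice"])
    writer.writerow(["2", "bob"])
    return gzip.compress(buffer.getvalue().encode("utf-8"))


def read_records(raw_bytes, has_header=True):
    """Decompress a GZIP-compressed CSV payload and return its data rows."""
    with gzip.open(io.BytesIO(raw_bytes), mode="rt", encoding="utf-8", newline="") as handle:
        reader = csv.reader(handle)
        if has_header:
            # The format itself does not mark this line as a header,
            # so the reader has to discard it explicitly.
            next(reader, None)
        return list(reader)


if __name__ == "__main__":
    for record in read_records(make_sample()):
        print(record)
```

Note that the whole payload is decompressed as one stream from start to finish, which mirrors why a GZIP-compressed CSV file ends up being read by a single task.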