Managing files in Hadoop
Hadoop has its own file management concepts that come with many different mechanisms for data storage and retrieval. Hadoop is designed to manage large volumes of data distributed across many nodes built with commodity hardware. As such, Hadoop manages the distribution of large volumes of data using techniques designed to divide, compress, and share the data all while dealing with the possibilities of node failures and numerous processes accessing the same data simultaneously. Many of the filesystem concepts in Hadoop are exactly the same as in other systems, such as directory structures. However, other concepts, such as MapFiles and Hadoop Archive Files, are unique to Hadoop. This section covers many of the file management concepts that are unique to Hadoop.
File permissions
HDFS uses a standard file permission approach. The three types of permissions for files and directories are:
- Read (
r
): Read a file and list a directory's contents - Write (
w
): Write to a file...