
Securing Hadoop: Implement robust end-to-end security for your Hadoop ecosystem


Chapter 2. Hadoop Security Design

In Chapter 1, Hadoop Security Overview, we discussed the security considerations for an end-to-end Hadoop-based Big Data ecosystem. In this chapter, we narrow our focus and take a deep dive into the security design of the Hadoop platform. Hadoop security was implemented as part of the HADOOP-4487 Jira issue, starting in late 2009 (https://issues.apache.org/jira/browse/HADOOP-4487). Efforts are also under way to implement SSO authentication in Hadoop, but this work is not yet production-ready and is therefore out of the scope of this book.

The Hadoop security implementation is based on Kerberos. In this chapter, we first give a high-level overview of the key Kerberos terms and concepts, and then look into the details of the Hadoop security implementation.

The following are the topics we'll be covering in this chapter:

  • What is Kerberos?

  • The Hadoop default security model

  • The Hadoop Kerberos security implementation

What is Kerberos?


In any distributed system, when two parties (the client and the server) have to communicate over the network, the first step is to establish trust between them. This is usually done through an authentication process, in which the client presents its password to the server and the server verifies it. If the client sends the password over an unsecured network, the password can be intercepted and compromised in transit.

Kerberos is a secure network authentication protocol that provides strong authentication for client/server applications without transferring the password over the network. Kerberos works by using time-sensitive tickets that are generated using symmetric-key cryptography. The name comes from Greek mythology, in which Kerberos was the three-headed dog that guarded the gates of Hades. The three heads of Kerberos in the security paradigm are:

  • The user who is trying to authenticate.

  • The service to...
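The idea of a time-sensitive ticket protected by a shared secret key can be sketched as follows. This is a toy illustration only, not the Kerberos protocol: real Kerberos encrypts its tickets, whereas this sketch merely seals them with an HMAC so that tampering and expiry can be detected. The names `issue_ticket`, `verify_ticket`, and `service_key` are hypothetical.

```python
import hashlib
import hmac
import json


def issue_ticket(service_key: bytes, user: str, now: float, lifetime_s: float) -> bytes:
    """Build a ticket naming the user and an expiry time, sealed with the shared key."""
    body = json.dumps({"user": user, "expires": now + lifetime_s}).encode()
    tag = hmac.new(service_key, body, hashlib.sha256).hexdigest().encode()
    return body + b"." + tag


def verify_ticket(service_key: bytes, ticket: bytes, now: float):
    """Return the user name if the ticket is authentic and unexpired, else None."""
    body, sep, tag = ticket.rpartition(b".")
    expected = hmac.new(service_key, body, hashlib.sha256).hexdigest().encode()
    if not sep or not hmac.compare_digest(tag, expected):
        return None  # forged, tampered with, or sealed with a different key
    claims = json.loads(body)
    if now > claims["expires"]:
        return None  # the ticket has outlived its validity window
    return claims["user"]
```

Note that the user's password never appears in the ticket; the service only needs the key it shares with the ticket issuer to check that the ticket is genuine and still valid.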

The Hadoop default security model without Kerberos


Now that we understand how the Kerberos security protocol works, let us look at the details of the Hadoop default security model and its limitations.

Hadoop implements a security model similar to that of a POSIX filesystem, which makes it possible to apply file permissions and restrict read-write access to files and directories in HDFS. Users and administrators can use the chmod and chown commands to change the permissions and ownership of files and directories, just as on a POSIX filesystem. Hadoop does not provide any user management functionality of its own; it relies on the operating system user.
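A POSIX-style permission check of the kind described above can be sketched in a few lines. This is a simplified model under stated assumptions, not HDFS code: `FileStatus` and `is_allowed` are hypothetical names, and special cases such as a superuser that bypasses checks are not modeled.

```python
from dataclasses import dataclass

# Permission bits within each octal digit, as in POSIX mode strings (rwx).
READ, WRITE, EXECUTE = 4, 2, 1


@dataclass(frozen=True)
class FileStatus:
    owner: str
    group: str
    mode: int  # for example 0o640 means rw- for owner, r-- for group, --- for others


def is_allowed(status: FileStatus, user: str, groups: set, wanted: int) -> bool:
    """Pick the owner, group, or other permission class, then test the wanted bits."""
    if user == status.owner:
        bits = (status.mode >> 6) & 0o7
    elif status.group in groups:
        bits = (status.mode >> 3) & 0o7
    else:
        bits = status.mode & 0o7
    return bits & wanted == wanted
```

With a file owned by `alice:analysts` and mode `0o640`, the owner can read and write, members of `analysts` can only read, and everyone else is denied. A chmod changes `mode`, and a chown changes `owner` and `group`.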

By default, Hadoop doesn't support any authentication of users or Hadoop services. A user only authenticates with the operating system during the logon process. After that, when the user invokes the Hadoop command, the user ID and group are determined by executing whoami and bash -c groups respectively. So if a user writes their own whoami script and adds it to the...
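Why trusting the output of whoami is fragile can be illustrated with a small sketch. It assumes the command is located through the PATH search order, which is an assumption made here for illustration, and it only resolves the command without running it. The helper names `resolve_command` and `demo_shadowed_whoami` are hypothetical, and the sketch expects a POSIX system.

```python
import os
import shutil
import stat
import tempfile


def resolve_command(name: str, search_path: str):
    """Return the executable that a PATH-style lookup would pick for `name`."""
    return shutil.which(name, path=search_path)


def demo_shadowed_whoami():
    """Show that a user-controlled directory placed first shadows the real whoami."""
    with tempfile.TemporaryDirectory() as user_dir:
        fake = os.path.join(user_dir, "whoami")
        with open(fake, "w") as f:
            # A stand-in script that would report an arbitrary user name.
            f.write("#!/bin/sh\necho another_user\n")
        os.chmod(fake, stat.S_IRWXU)  # make the stand-in executable
        system_path = os.environ.get("PATH", os.defpath)
        shadowed_path = user_dir + os.pathsep + system_path
        resolved = resolve_command("whoami", shadowed_path)
        return resolved, fake
```

Because the lookup returns the first match, the stand-in script wins over the system binary, and any process that trusts its output would accept whatever identity the script prints.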

Hadoop Kerberos security implementation


Enforcing security within a distributed system such as Hadoop is complex. The detailed requirements for securing Hadoop were identified by Owen O'Malley and others as part of the Hadoop security design. The detailed document is attached to the ticket HADOOP-4487 at https://issues.apache.org/jira/browse/HADOOP-4487. These requirements are summarized in this section.

User-level access controls

The user-level access control requirements are:

  • Users of Hadoop should only be able to access data that is authorized for them

  • Only authenticated users should be able to submit jobs to the Hadoop cluster

  • Users should be able to view, modify, and kill only their own jobs

  • Only authenticated services should be able to register themselves as DataNodes or TaskTrackers

  • Data block access within DataNode needs to be secured, and only authenticated users should be able to access the data stored in the Hadoop cluster
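The job-related requirements in the list above can be expressed as a pair of checks. This is a minimal sketch of the stated rules only, not how Hadoop implements them; `Job`, `can_submit_job`, and `can_manage_job` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    job_id: str
    owner: str


def can_submit_job(authenticated: bool) -> bool:
    """Only authenticated users may submit jobs to the cluster."""
    return authenticated


def can_manage_job(job: Job, user: str, authenticated: bool) -> bool:
    """A user may view, modify, or kill a job only if authenticated and its owner."""
    return authenticated and user == job.owner
```

For example, `can_manage_job(Job("job_1", "alice"), "bob", True)` is false because bob does not own the job, even though bob is authenticated.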

Service-level access controls

Here's a gist of the service...

Summary


In this chapter, we looked at the Kerberos authentication protocol and the key concepts involved in implementing Kerberos. We examined the default security implementation in Hadoop and how a Hadoop process gets the logged-in user and group details. The default security implementation has many gaps and can't be used in production.

In a production scenario, securing Hadoop with Kerberos is essential. We therefore looked at the requirements that Hadoop supports at the user and Hadoop service level to secure the Hadoop cluster. We looked at the various internal secret keys (Delegation Token, Block Access Token, and Job Token) that the Hadoop processes exchange to keep the ecosystem secure. Understanding the need for and use of these tokens is vital for debugging and troubleshooting configuration issues in a secured Hadoop cluster. In the next chapter, we will detail the procedure for securing a Hadoop cluster.


Key benefits

  • Master the key concepts behind Hadoop security as well as how to secure a Hadoop-based Big Data ecosystem
  • Understand and deploy authentication, authorization, and data encryption in a Hadoop-based Big Data platform
  • Administer the auditing and security event monitoring system

Description

Security of Big Data is one of the biggest concerns for enterprises today. How do we protect the sensitive information in a Hadoop ecosystem? How can we integrate Hadoop security with existing enterprise security systems? What are the challenges in securing Hadoop and its ecosystem? These are the questions which need to be answered in order to ensure effective management of Big Data. Hadoop, along with Kerberos, provides security features which enable Big Data management and which keep data secure. This book is a practitioner's guide for securing a Hadoop-based Big Data platform. This book provides you with a step-by-step approach to implementing end-to-end security along with a solid foundation of knowledge of the Hadoop and Kerberos security models. This practical, hands-on guide looks at the security challenges involved in securing sensitive data in a Hadoop-based Big Data platform and also covers the Security Reference Architecture for securing Big Data. It will take you through the internals of the Hadoop and Kerberos security models and will provide detailed implementation steps for securing Hadoop. You will also learn how the internals of the Hadoop security model are implemented, how to integrate Enterprise Security Systems with Hadoop security, and how you can manage and control user access to a Hadoop ecosystem seamlessly. You will also get acquainted with implementing audit logging and security incident monitoring within a Big Data platform.

Who is this book for?

This book is great for Hadoop practitioners (solution architects, Hadoop administrators, developers, and Hadoop project managers) who are looking to get a good grounding in what Kerberos is all about and who wish to learn how to implement end-to-end Hadoop security within an enterprise setup. It's assumed that you will have some basic understanding of Hadoop as well as be familiar with some basic security concepts.

What you will learn

  • Understand the challenges of securing Hadoop and Big Data and master the reference architecture for Big Data security
  • Demystify Kerberos and the Hadoop security model
  • Learn the steps to secure a Hadoop platform with Kerberos
  • Integrate Enterprise Security Systems with Hadoop security and build an integrated security model
  • Get detailed insights into securing sensitive data in a Hadoop Big Data platform
  • Implement audit logging and a security event monitoring system for your Big Data platform
  • Discover the various industry tools and vendors that can be used to build a secured Hadoop platform
  • Recognize how the various Hadoop components interact with each other and what protocols and security they implement
  • Design a secure Hadoop infrastructure and implement the various security controls within the enterprise

Product Details

Publication date: Nov 22, 2013
Length: 116 pages
Edition: 1st
Language: English
ISBN-13: 9781783285266
Vendor: Apache



Table of Contents

7 Chapters

  1. Hadoop Security Overview
  2. Hadoop Security Design
  3. Setting Up a Secured Hadoop Cluster
  4. Securing the Hadoop Ecosystem
  5. Integrating Hadoop with Enterprise Security Systems
  6. Securing Sensitive Data in Hadoop
  7. Security Event and Audit Logging in Hadoop

Customer reviews

Rating distribution: 4.5 out of 5 (2 ratings)

  • 5 star: 50%
  • 4 star: 50%
  • 3 star: 0%
  • 2 star: 0%
  • 1 star: 0%

Jeff, Feb 13, 2014, 5 out of 5 (Amazon verified review)

Great presentation of information on how to secure Hadoop. Covers all relevant areas of Hadoop security in great detail while offering solid best practices.

MiaMadea, Mar 05, 2014, 4 out of 5 (Amazon verified review)

Before reading the book, I had a minimal understanding of Hadoop, let alone knowing how to secure it. The book was comprehensive regarding security configurations, yet was straightforward to understand if you have a basic knowledge of Hadoop. I now refer to the book occasionally when reviewing Hadoop implementations. The reader should be aware that different vendors offering Hadoop (Cloudera, IBM, etc.) may have their own extensions of Hadoop, and thus the security features available may differ from what is described in the book.
