Reading and writing files
Data mostly lives in files on the filesystem: semi-structured text files, structured delimited files, or more sophisticated formats such as Avro and Parquet. Log files, SQL exports, JSON, XML, and virtually any other type of file can be processed with Scalding.
Scalding can read and write many file formats, the most common of which are:
- The TextLine format reads and writes raw text files. It returns tuples with two fields, named 'offset and 'line by default, which are inherited from Hadoop's text input format. After reading a text file, we usually apply a schema to the data by parsing each line with regular expressions, as shown in the first sketch after this list.
- Delimited files such as Tab-Separated Values (TSV), Comma-Separated Values (CSV), and One-Separated Values (OSV), the latter using the Ctrl-A (\1) character as the delimiter and commonly used in Pig and Hive, are already structured text files and are therefore easier to work with; see the second sketch after this list.
- Advanced serialization formats such as Avro, Parquet, Thrift, and Protocol Buffers offer their own capabilities. Avro, for example, is a data serialization framework that stores the schema together with the data in the same file.
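To illustrate the TextLine case, here is a minimal sketch of a Scalding job that reads raw text and applies a schema with a regular expression. The job name, input and output paths, log pattern, and the 'date, 'level, and 'message field names are hypothetical; TextLine, flatMap, project, and Tsv are part of Scalding's fields-based API.

```scala
import com.twitter.scalding._

class ParseLogJob(args: Args) extends Job(args) {
  // Hypothetical pattern for lines such as: "2014-05-01 10:00:00 INFO Job started"
  val logLine = """^(\S+ \S+)\s+(\S+)\s+(.*)$""".r

  TextLine(args("input"))                        // emits tuples of ('offset, 'line)
    .flatMap('line -> ('date, 'level, 'message)) { line: String =>
      // Keep only the lines that match the pattern; drop the rest
      logLine.findFirstMatchIn(line)
        .map(m => (m.group(1), m.group(2), m.group(3)))
        .toList
    }
    .project('date, 'level, 'message)            // discard 'offset and 'line
    .write(Tsv(args("output")))
}
```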
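For delimited files, a schema can be attached directly to the source, so no parsing step is needed. The sketch below assumes a hypothetical tab-separated sales export with four columns; the job and field names are made up for the example, while Tsv, Csv, groupBy, sum, and size are standard fields-based API calls.

```scala
import com.twitter.scalding._

class SalesReportJob(args: Args) extends Job(args) {
  // Hypothetical input: a tab-separated file with four columns
  Tsv(args("input"), ('date, 'product, 'quantity, 'price))
    .groupBy('product) { group =>
      group.sum[Double]('price -> 'totalRevenue)   // total revenue per product
           .size('transactions)                    // number of rows per product
    }
    .write(Csv(args("output")))
}
```

The same pipeline would work unchanged with Csv or Osv as the input source, since only the delimiter differs between these sources.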