Artificial Intelligence for Big Data: Complete guide to automating Big Data solutions using Artificial Intelligence techniques

Product type: Paperback

Published: May 2018

Publisher: Packt

ISBN-13: 9781788472173

Length: 384 pages

Edition: 1st Edition

Languages: Java

Tools: TensorFlow

Concepts: Artificial Intelligence

Authors (2): Anand Deshpande, Manish Kumar

Table of Contents (14 chapters)

Preface

1. Big Data and Artificial Intelligence Systems

2. Ontology for Big Data

3. Learning from Big Data

4. Neural Network for Big Data

5. Deep Big Data Analytics

6. Natural Language Processing

7. Fuzzy Systems

8. Genetic Programming

9. Swarm Intelligence

10. Reinforcement Learning

11. Cyber Security

12. Cognitive Computing

13. Other Books You May Enjoy

Building data preparation pipelines

Deep neural networks are best suited to supervised learning problems where we have access to historical datasets, which are used to train the network. As seen in diagram 5.1, the more training data we have at our disposal, the better the deep neural network becomes at generalizing from the training datasets and accurately predicting outcomes for new, unseen data values. For deep neural networks to perform optimally, we need to carefully procure, transform, scale, normalize, join, and split the data. This is very similar to building a data pipeline in a data warehouse or a data lake with the help of ETL (Extract, Transform, and Load, used with a traditional data warehouse) and ELTTT (Extract, Load, and Transform multiple times, used in modern data lakes) pipelines.
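The preparation steps named above (join, scale, normalize, and split) can be illustrated with a minimal Python sketch that uses only the standard library. It is not the book's implementation: the function names, the toy `ages` and `incomes` records, the 25% test fraction, and the fixed seed are all assumptions chosen for illustration.

```python
import random
from statistics import mean, pstdev


def join_on_key(left, right):
    """Inner-join two dicts that map a record id to a list of features."""
    return {k: left[k] + right[k] for k in left if k in right}


def min_max_scale(column):
    """Rescale a numeric column linearly into the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:  # constant column: avoid division by zero
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def standardize(column):
    """Normalize a numeric column to zero mean and unit variance."""
    mu, sigma = mean(column), pstdev(column)
    if sigma == 0:  # constant column: avoid division by zero
        return [0.0 for _ in column]
    return [(v - mu) / sigma for v in column]


def train_test_split(rows, test_fraction=0.25, seed=42):
    """Shuffle a copy of the rows and cut it into train and test parts."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical toy sources keyed by record id (assumed data).
ages = {1: [25.0], 2: [35.0], 3: [45.0], 4: [55.0]}
incomes = {1: [30000.0], 2: [50000.0], 3: [70000.0], 4: [90000.0], 5: [10000.0]}

# Join: record 5 has no age, so the inner join drops it.
joined = join_on_key(ages, incomes)
rows = [joined[k] for k in sorted(joined)]

# Transform row-wise records into columns, then scale and normalize.
age_column, income_column = (list(c) for c in zip(*rows))
scaled_age = min_max_scale(age_column)
std_income = standardize(income_column)

# Reassemble the rows and split them for training and evaluation.
prepared = [list(pair) for pair in zip(scaled_age, std_income)]
train, test = train_test_split(prepared)
```

In this sketch each step is a plain function, so the steps can be reordered or repeated, which loosely mirrors the "transform multiple times" idea behind ELTTT. At real Big Data scale the same steps would normally run on a distributed engine rather than on in-memory lists.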

We are going to deal with data...

Authors (2)

Deshpande

Tanmay Deshpande is a Hadoop and big data evangelist. He currently works with Schlumberger as a Big Data Architect in Pune, India. He has interest in a wide range of technologies, such as Hadoop, Hive, Pig, NoSQL databases, Mahout, Sqoop, Java, cloud computing, and so on. He has vast experience in application development in various domains, such as oil and gas, finance, telecom, manufacturing, security, and retail. He enjoys solving machine-learning problems and spends his time reading anything that he can get his hands on. He has great interest in open source technologies and has been promoting them through his talks. Before Schlumberger, he worked with Symantec, Lumiata, and Infosys. Through his innovative thinking and dynamic leadership, he has successfully completed various projects. He regularly blogs on his website http://hadooptutorials.co.in. You can connect with him on LinkedIn at https://www.linkedin.com/in/deshpandetanmay/. He has also authored Mastering DynamoDB, published in August 2014, DynamoDB Cookbook, published in September 2015, Hadoop Real World Solutions Cookbook-Second Edition, published in March 2016, Hadoop: Data Processing and Modelling, published in August, 2016, and Hadoop Blueprints, published in September 2016, all by Packt Publishing.

Kumar

Ashish Kumar is a seasoned data science professional, a published author, and a thought leader in the field of data science and machine learning. An IIT Madras graduate and a Young India Fellow, he has around 7 years of experience in implementing and deploying data science and machine learning solutions for challenging industry problems in both hands-on and leadership roles. Natural Language Processing, IoT Analytics, R Shiny product development, and Ensemble ML methods are his core areas of expertise. He is fluent in Python and R and teaches a popular ML course at Simplilearn. When not crunching data, Ashish sneaks off to the next hip beach around and enjoys the company of his Kindle. He also trains and mentors data science aspirants and fledgling start-ups.