50 Hours of Big Data, PySpark, AWS, Scala, and Scraping

50 Hours of Big Data, PySpark, AWS, Scala, and Scraping: Big Data with Scala and Spark, PySpark and AWS, Data Scraping and Data Mining with Python, Mastering MongoDB for Beginners

Author: AI Sciences
Price: €8.99 (regular price €37.99)
Rating: 5 out of 5 (1 rating)
Video, Mar 2022, 54hrs 32mins, 1st Edition

What do you get with a video?

  • Download this video in MP4 format
  • Access this title in our online reader with advanced features
  • DRM FREE - Read whenever, wherever and however you want

Key benefits

  • Data scraping and data mining with Python, from beginner to pro
  • Clear unfolding of concepts with examples in Python, Scrapy, Scala, PySpark, and MongoDB
  • Master Big Data with PySpark and AWS

Description

The Scala and Spark part is designed to reflect the most in-demand Scala skills and provides an in-depth understanding of core Scala concepts. It wraps up with a discussion of MapReduce and of ETL pipelines that use Spark to move data from AWS S3 to AWS RDS, and it includes six mini-projects and one Scala Spark project.

The PySpark and AWS part uses PySpark to perform data analysis. You will explore Spark RDDs, DataFrames, a little Spark SQL, the transformations and actions that can be performed on data using RDDs and DataFrames, and the Spark and Hadoop ecosystem and its underlying architecture. You will also learn how to leverage AWS storage, databases, and compute, and how Spark can communicate with different AWS services.

The data scraping and data mining part covers important concepts such as how a browser executes and communicates with the server, synchronous and asynchronous requests, parsing the data in the server's response, tools for data scraping, the Python requests module, and more.

The MongoDB part uses MongoDB to develop an understanding of NoSQL databases. You will explore the basic operations and the MongoDB query, projection, and update operators. This part winds up with two projects: developing a CRUD-based application using Django and MongoDB, and implementing an ETL pipeline using PySpark to load data into MongoDB.

By the end of this course, you will be able to relate the concepts and practical aspects of the technologies you have learned to real-world problems. All the resources for this course are available at https://github.com/PacktPublishing/50-Hours-of-Big-Data-PySpark-AWS-Scala-and-Scraping
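
To make the extract, transform, and load steps of an ETL pipeline concrete, here is a minimal sketch in plain Python. It is not the course's code and it does not use Spark, S3, or RDS: as stand-ins, an in-memory CSV string plays the role of the source file and an in-memory SQLite database plays the role of the relational target. The sample rows, the `orders` table, and all function names are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical sample data standing in for a CSV object in cloud storage.
RAW_CSV = """order_id,customer,amount
1,alice,10.50
2,bob,
3,carol,7.25
"""


def extract(raw_text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def transform(rows):
    """Transform: drop rows with a missing amount and cast column types."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # skip incomplete records
        cleaned.append(
            (int(row["order_id"]), row["customer"].title(), float(row["amount"]))
        )
    return cleaned


def load(conn, records):
    """Load: write the cleaned records into a relational table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", records)
    conn.commit()


def run_pipeline(raw_text, conn):
    """Run the three stages in order and return (row count, total amount)."""
    load(conn, transform(extract(raw_text)))
    return conn.execute("SELECT COUNT(*), SUM(amount) FROM orders").fetchone()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")  # stand-in for the target database
    print(run_pipeline(RAW_CSV, connection))
```

Running the script prints `(2, 17.75)`: the row with the missing amount is dropped during the transform step, and the two remaining rows are loaded. In a Spark version, the same three stages would typically map to reading a DataFrame, applying transformations, and writing to the database over JDBC.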

Who is this course for?

This course is designed for absolute beginners who want to create intelligent solutions, study with actual data, and enjoy learning theory and then putting it into practice. Data scientists, machine learning experts, and drop shippers will all benefit from this training. A basic understanding of programming, HTML tags, Python, SQL, and Node.js is required. However, no prior knowledge of data scraping or Scala is needed.

What you will learn

  • Build an ETL pipeline from AWS S3 to AWS RDS using Spark
  • Explore Spark/Hadoop applications, ecosystem, and architecture
  • Learn collaborative filtering in PySpark
  • Recognize the distinction between synchronous and asynchronous requests
  • Understand MongoDB CRUD, query operators, projection operators, and update operators
  • Build APIs for CRUD operations in MongoDB through Django
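
The difference between synchronous and asynchronous requests listed above can be sketched with Python's standard library. This is an illustration under assumptions, not material from the course: `time.sleep` and `asyncio.sleep` stand in for network latency so that no real server is contacted, and the `DELAY` value, page names, and function names are hypothetical.

```python
import asyncio
import time

DELAY = 0.2  # simulated server response time in seconds (illustrative value)


def fetch_sync(page):
    """Blocking request stand-in: the caller waits until it returns."""
    time.sleep(DELAY)
    return f"response for {page}"


async def fetch_async(page):
    """Non-blocking request stand-in: yields control while it waits."""
    await asyncio.sleep(DELAY)
    return f"response for {page}"


def run_sync(pages):
    """Issue the requests one after another; waits add up."""
    start = time.perf_counter()
    results = [fetch_sync(page) for page in pages]
    return results, time.perf_counter() - start


async def _gather(pages):
    # Start all requests at once and collect results in input order.
    return await asyncio.gather(*(fetch_async(page) for page in pages))


def run_async(pages):
    """Issue the requests concurrently; waits overlap."""
    start = time.perf_counter()
    results = asyncio.run(_gather(pages))
    return list(results), time.perf_counter() - start


if __name__ == "__main__":
    pages = ["page1", "page2", "page3"]
    _, sync_time = run_sync(pages)
    _, async_time = run_async(pages)
    print(f"synchronous:  {sync_time:.2f}s")
    print(f"asynchronous: {async_time:.2f}s")
```

With three pages, the synchronous version takes roughly three times `DELAY` because each call blocks until it completes, while the asynchronous version takes roughly one `DELAY` because the waits overlap. Both return the same responses in the same order.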

Product Details

Publication date: Mar 30, 2022
Length: 54hrs 32mins
Edition: 1st
Language: English
ISBN-13: 9781803237039

Packt Subscriptions

€18.99 billed monthly
  • Unlimited access to Packt's library of 7,000+ practical books and videos
  • Constantly refreshed with 50+ new titles a month
  • Exclusive early access to books as they're written
  • Solve problems while you work with advanced search and reference features
  • Offline reading on the mobile app
  • Simple pricing, no contract
€189.99 billed annually
  • Unlimited access to Packt's library of 7,000+ practical books and videos
  • Constantly refreshed with 50+ new titles a month
  • Exclusive early access to books as they're written
  • Solve problems while you work with advanced search and reference features
  • Offline reading on the mobile app
  • Choose a DRM-free eBook or Video every month to keep
  • PLUS own as many other DRM-free eBooks or Videos as you like for just €5 each
  • Exclusive print discounts
€264.99 billed in 18 months
  • Unlimited access to Packt's library of 7,000+ practical books and videos
  • Constantly refreshed with 50+ new titles a month
  • Exclusive early access to books as they're written
  • Solve problems while you work with advanced search and reference features
  • Offline reading on the mobile app
  • Choose a DRM-free eBook or Video every month to keep
  • PLUS own as many other DRM-free eBooks or Videos as you like for just €5 each
  • Exclusive print discounts

Frequently bought together


  • 50 Hours of Big Data, PySpark, AWS, Scala, and Scraping: €8.99 (regular price €37.99)
  • Solutions Architect's Handbook: €67.99
  • Modern Time Series Forecasting with Python: €39.99
Total: €116.97 (regular price €145.97, €29.00 saved)

Table of Contents

35 Chapters
Part 1 - Data Scraping and Data Mining for Beginners to Pro with Python
  • Requests
  • Beautiful Soup 4 (BS4)
  • CSS Selectors
  • Scrapy
  • Scrapy Project
  • Selenium
  • Project Selenium
Part 2 - Scala and Spark - Master Big Data with Scala and Spark
  • Scala Overview
  • Flow Control
  • Functions
  • Classes
  • Data Structures
  • Project for Scala and Spark
Part 3 - PySpark and AWS - Master Big Data with PySpark and AWS
  • Introduction to Hadoop, Spark Ecosystems and Architectures
  • Spark RDDs
  • Spark DFs
  • Collaborative Filtering
  • Spark Streaming
  • ETL Pipeline
  • Project - Change Data Capture / Replication On Going
Part 4 - MongoDB - Mastering MongoDB for Beginners (Theory and Projects)
  • Overview
  • Basic Mongo Operations
  • Basic Update Operation
  • Basic Read Operation
  • Basic Delete Operation
  • Query and projection operators
  • Update Operators
  • Mongo with Node
  • Mongo with Python
  • Django with Mongo
  • Spark with Mongo

Customer reviews

Rating distribution
Overall: 5 out of 5 (1 rating)
  • 5 star: 100%
  • 4 star: 0%
  • 3 star: 0%
  • 2 star: 0%
  • 1 star: 0%
Tadeo, Jan 09, 2024 (subscriber review)
Rating: 5 out of 5
it is a complete and good course !!!

FAQs

How can I download a video package for offline viewing?
  1. Log in to your account at Packtpub.com.
  2. Click on "My Account" and then click on the "My Videos" tab to access your videos.
  3. Click on the "Download Now" link to start your video download.
How can I extract my video file?

All modern operating systems ship with ZIP file extraction built in. If you'd prefer to use a dedicated compression application, we've tested WinRAR / 7-Zip for Windows, Zipeg / iZip / UnRarX for Mac, and 7-Zip / PeaZip for Linux. These applications support all of the relevant file extensions.
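
If you would rather script the extraction, a ZIP archive can also be unpacked with Python's standard `zipfile` module. The following is a minimal sketch, assuming the download is an ordinary ZIP file; the `extract_package` function and the file names in the demo are hypothetical, and the demo builds a throwaway archive so that it runs without a real download.

```python
import tempfile
import zipfile
from pathlib import Path


def extract_package(archive_path, target_dir):
    """Extract every file in a ZIP archive and return the member names."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        # testzip() returns the name of the first corrupt member, or None.
        bad_member = archive.testzip()
        if bad_member is not None:
            raise zipfile.BadZipFile(f"corrupt member: {bad_member}")
        archive.extractall(target)
        return archive.namelist()


if __name__ == "__main__":
    # Demo with a throwaway archive; pass the path of your own download instead.
    with tempfile.TemporaryDirectory() as tmp:
        demo_zip = Path(tmp) / "demo.zip"
        with zipfile.ZipFile(demo_zip, "w") as archive:
            archive.writestr("section1/intro.txt", "placeholder")
        print(extract_package(demo_zip, Path(tmp) / "extracted"))
```

The integrity check before extraction is a design choice: an interrupted download often produces a damaged archive, and failing early with a clear error is easier to diagnose than a partially extracted course.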

How can I get help and support around my video package?

If your video course doesn't give you what you were expecting, either because of functionality problems or because the content isn't up to scratch, please mail customercare@packt.com with details of the problem. In addition, so that we can best provide the support you need, please include the following information for our support team.

  1. Video
  2. Format watched (HTML, MP4, streaming)
  3. Chapter or section that issue relates to (if relevant)
  4. System being played on
  5. Browser used (if relevant)
  6. Details of the support you need
Why can’t I download my video package?

In the event that you are having issues downloading your video package, please follow these instructions:

  1. Disable all your browser plugins and extensions: Some security and download manager extensions can cause issues during the download.
  2. Download the video course using a different browser: We've tested that downloads operate correctly in current versions of Chrome, Firefox, Internet Explorer, and Safari.