Python: End-to-end Data Analysis

Leverage the power of Python to clean, scrape, analyze, and visualize your data

Product type: Course
Published in: May 2017
Publisher: Packt
ISBN-13: 9781788394697
Length: 931 pages
Edition: 1st Edition
Authors (5):
  • Luiz Felipe Martins
  • Ivan Idris
  • Phuong Vo.T.H
  • Martin Czygan
  • Magnus Vilhelm Persson

Chapter 5. Web Mining, Databases, and Big Data

On the menu for this chapter are the following recipes:

  • Simulating web browsing
  • Scraping the Web
  • Dealing with non-ASCII text and HTML entities
  • Implementing association tables
  • Setting up database migration scripts
  • Adding a table column to an existing table
  • Adding indices after table creation
  • Setting up a test web server
  • Implementing a star schema with fact and dimension tables
  • Using HDFS
  • Setting up Spark
  • Clustering data with Spark

Introduction

This chapter is light on math and focuses instead on technical topics. Technology has a lot to offer data analysts. Databases have been around for a long time, and the relational databases that most people are familiar with can be traced back to the 1970s. Edgar Codd came up with a number of ideas that later led to the creation of the relational model and SQL. Relational databases have been a dominant technology ever since. In the 1980s, object-oriented programming languages caused a paradigm shift and...
