You're reading from Practical Data Science with Python: Learn tools and techniques from hands-on examples to extract insights from data

Product type: Paperback

Published in: Sep 2021

Publisher: Packt

ISBN-13: 9781801071970

Length: 620 pages

Edition: 1st Edition

Languages: Python

Tools: Excel

Concepts: Data Science

Author (1): Nathan George

Table of Contents (30) Chapters

Preface

1. Part I - An Introduction and the Basics

2. Introduction to Data Science

3. Getting Started with Python

4. Part II - Dealing with Data

5. SQL and Built-in File Handling Modules in Python

6. Loading and Wrangling Data with Pandas and NumPy

7. Exploratory Data Analysis and Visualization

8. Data Wrangling Documents and Spreadsheets

9. Web Scraping

10. Part III - Statistics for Data Science

11. Probability, Distributions, and Sampling

12. Statistical Testing for Data Science

13. Part IV - Machine Learning

14. Preparing Data for Machine Learning: Feature Selection, Feature Engineering, and Dimensionality Reduction

15. Machine Learning for Classification

16. Evaluating Machine Learning Classification Models and Sampling for Classification

17. Machine Learning with Regression

18. Optimizing Models and Using AutoML

19. Tree-Based Machine Learning Models

20. Support Vector Machine (SVM) Machine Learning Models

21. Part V - Text Analysis and Reporting

22. Clustering with Machine Learning

23. Working with Text

24. Part VI - Wrapping Up

25. Data Storytelling and Automated Reporting/Dashboarding

26. Ethics and Privacy

27. Staying Up to Date and the Future of Data Science

28. Other Books You May Enjoy

29. Index

Understanding the structure of the internet

Before undertaking web scraping, it's important to have a rudimentary understanding of how the internet works. Most of us interact with the web only through browsers and never see the details behind webpages, but behind the images and text shown in the browser, a lot of code runs and a lot of data is exchanged.

When we visit a webpage, we type a web address into the browser's address bar, and the browser requests a file from a remote server. The server returns the file, and the browser displays it. The web addresses we use are URLs, or Uniform Resource Locators. An example we'll use here is

https://subscription.packtpub.com/book/IoT-and-Hardware/9781789958034

which is a page for the book MicroPython Projects, by Jacob Beningo. URLs follow a pattern, which we can use when web scraping:

[scheme]://[authority][path_to_resource]?[parameters]

In our example:

Scheme – https
Authority...
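
As a rough illustration of the URL pattern above, Python's standard-library urllib.parse module can split the example URL into its parts. This is a minimal sketch rather than the chapter's own code. The second URL (example.com, with the parameters q and page) is hypothetical and is included only to show what the [parameters] part looks like, since the book URL has none.

```python
from urllib.parse import urlsplit, parse_qs

# The example URL from the text above.
url = "https://subscription.packtpub.com/book/IoT-and-Hardware/9781789958034"
parts = urlsplit(url)

print(parts.scheme)  # [scheme]
print(parts.netloc)  # [authority]
print(parts.path)    # [path_to_resource]
print(parts.query)   # [parameters]; an empty string for this URL

# Hypothetical URL (an assumption, not from the book) that carries parameters.
search_url = "https://example.com/search?q=python&page=2"
search_parts = urlsplit(search_url)

# parse_qs turns the query string into a dict mapping each name to a list of values.
params = parse_qs(search_parts.query)
print(params)
```

Because URLs follow this regular pattern, a scraper can often generate the addresses of many pages by varying only the path or the parameters while keeping the scheme and authority fixed.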
