Getting Started with Beautiful Soup

Learn how to extract information from websites using Beautiful Soup and the Python urllib2 module. This practical, hands-on guide covers everything you need to know to get a head start in website scraping.

Product type Paperback

Published in Jan 2014

Publisher Packt

ISBN-13 9781783289554

Length 130 pages

Edition 1st Edition

Languages

Python

Tools

BeautifulSoup

Concepts

Web Utilities

Author (1):

Vineeth G Nair

Table of Contents (10 Chapters)

Preface

1. Installing Beautiful Soup

2. Creating a BeautifulSoup Object

3. Search Using Beautiful Soup

4. Navigation Using Beautiful Soup

5. Modifying Content Using Beautiful Soup

6. Encoding Support in Beautiful Soup

7. Output in Beautiful Soup

8. Creating a Web Scraper

Index

Chapter 6. Encoding Support in Beautiful Soup

Every web page has an encoding associated with it. Websites use different encodings, such as UTF-8 and Latin-1, although UTF-8 is now the standard encoding on the web. When scraping such pages, it is important that the scraper understands these encodings. Otherwise, characters that display correctly in a web browser will come out of the scraper as gibberish. For example, consider a sample of web content from Wikipedia that contains the Spanish character ñ.

If we run the same content through a scraper that does not support the encoding used by the website, we might end up with the following content:

The Spanish language is written using the Spanish alphabet, which is the Latin alphabet with one additional letter, e&#21336;e (&#21336;), for a total of 27 letters.
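The effect described here can be reproduced with the Python standard library alone. The sketch below is an illustration, not the chapter's own code: it assumes the page was served as UTF-8, and the helper name `decode_as` is hypothetical. The excerpt does not say which wrong encoding produced the garbled output; EUC-JP is used here only because it happens to turn the UTF-8 bytes of ñ into the character 単 (code point 21336) that appears in the sample, and Latin-1 is shown as a second common mismatch.

```python
# A minimal sketch (Python 3) of how a wrong encoding guess garbles text.
# Assumption: the page bytes are UTF-8; decode_as is a hypothetical helper.

def decode_as(raw_bytes, encoding):
    """Decode raw page bytes using the given encoding name."""
    return raw_bytes.decode(encoding)

# "eñe (ñ)" written with escapes so the source file stays pure ASCII.
original = "e\u00f1e (\u00f1)"

# What a server would send over the wire for a UTF-8 page.
raw = original.encode("utf-8")

correct = decode_as(raw, "utf-8")            # right guess: text survives
garbled_latin1 = decode_as(raw, "latin-1")   # each byte becomes one character
garbled_eucjp = decode_as(raw, "euc_jp")     # byte pair read as one kanji

# ascii() keeps the printout safe on terminals that cannot show these characters.
print(ascii(correct))
print(ascii(garbled_latin1))
print(ascii(garbled_eucjp))
```

The bytes never change between the three calls; only the interpretation does, which is why a scraper has to know (or detect) the encoding before turning a response into text.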

We see the Spanish character &...

Authors (1)

Vineeth G Nair

Vineeth G. Nair completed his bachelor's degree in Computer Science and Engineering at Model Engineering College, Cochin, Kerala. He is currently working with Oracle India Pvt. Ltd. as a Senior Applications Engineer. He developed an interest in Python during his college days and began working as a freelance programmer. This led him to work on several web scraping projects using Beautiful Soup, which helped him gain a fair level of mastery of the technology and a good reputation in the freelance arena. He can be reached at vineethgnair.mec@gmail.com. You can visit his website at www.kochi-coders.com.