You're reading from Effective Python Penetration Testing: Pen test your system like a pro and overcome vulnerabilities by leveraging Python scripts, libraries, and tools

Product type Paperback

Published in Jun 2016

Publisher Packt

ISBN-13 9781785280696

Length 164 pages

Edition 1st Edition

Languages

Python

Concepts

Penetration Testing

Author (1):

Rejah Rehim

Table of Contents (11) Chapters

Preface

1. Python Scripting Essentials

2. Analyzing Network Traffic with Scapy

3. Application Fingerprinting with Python

4. Attack Scripting with Python

5. Fuzzing and Brute-Forcing

6. Debugging and Reverse Engineering

7. Crypto, Hash, and Conversion Functions

8. Keylogging and Screen Grabbing

9. Attack Automation

10. Looking Forward

Parsing HTML with lxml

Another powerful, fast, and flexible parser is the HTML parser that comes with lxml. lxml is an extensive library for parsing both XML and HTML documents, and its HTML parser can recover from malformed markup, such as unclosed or badly nested tags, while it builds the document tree.
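As a minimal sketch of that tolerance, the snippet below feeds lxml a deliberately broken fragment. The markup is invented for illustration and is not taken from the book; it only assumes that lxml is installed.

```python
from lxml import html

# Deliberately broken markup (invented for this sketch):
# unclosed <li> and <b> tags, and no closing </li> before </ul>.
broken = "<ul><li>First<li><b>Second</ul>"

# fromstring() repairs the structure while parsing instead of raising an error.
root = html.fromstring(broken)

# Each list item is recovered as its own element.
items = [li.text_content() for li in root.iter("li")]
print(items)

# Serialising the tree shows the markup after repair.
print(html.tostring(root, encoding="unicode"))
```

The exact shape of the repaired tree is decided by the underlying parser, so it is worth printing the serialised result when a query returns something unexpected.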

Let's start with an example.

Here, we will use the requests module to retrieve the web page and parse it with lxml:

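# Import the lxml HTML parser and the requests HTTP client
from lxml import html
import requests

# Fetch the page and parse the raw response bytes into an element tree
response = requests.get('http://packtpub.com/')
tree = html.fromstring(response.content)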

The whole document is now stored in tree as an element tree that we can query in two ways: XPath expressions or CSS selectors (CSSSelect). XPath is a query language for navigating the elements and attributes of structured documents such as HTML or XML.
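The two query styles can be compared on a small inline document. This is a sketch only: the markup and class names are hypothetical stand-ins for a downloaded page, and the CSS selector path assumes the separate cssselect package is installed, so it is guarded.

```python
from lxml import html

# Hypothetical markup standing in for a downloaded page.
page = """
<html><body>
  <h1 id="title">Catalogue</h1>
  <a class="book" href="/book/1">First book</a>
  <a class="book" href="/book/2">Second book</a>
</body></html>
"""
tree = html.fromstring(page)

# XPath: select text nodes and attribute values directly.
titles = tree.xpath('//a[@class="book"]/text()')
links = tree.xpath('//a[@class="book"]/@href')

# CSS selectors: same elements, but lxml delegates to the cssselect package,
# which is an optional dependency and may be missing.
try:
    css_titles = [a.text for a in tree.cssselect("a.book")]
except ImportError:
    css_titles = None  # cssselect is not installed

print(titles, links, css_titles)
```

XPath can return text and attribute values directly, while a CSS selector always returns elements, which is why the second query reads `.text` from each match.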

We can use any page inspection tool, such as Firebug or Chrome developer tools, to get the XPath of an element.
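lxml can also report the XPath of an element it has already parsed, which is a handy cross-check against what a browser tool shows. The sketch below uses an invented document; `getpath()` is a method of the element tree wrapper, which is why the element's root tree is fetched first.

```python
from lxml import html

# Invented document for illustration.
doc = html.fromstring(
    "<html><body><div><p>one</p><p id='target'>two</p></div></body></html>"
)

# Locate an element, then ask its tree for an absolute XPath to it.
target = doc.get_element_by_id("target")
path = doc.getroottree().getpath(target)
print(path)

# The generated path selects the same element again.
found = doc.xpath(path)[0]
print(found.text)
```

Paths copied from a browser describe the browser's own view of the page, which can differ from the tree lxml builds from the raw source, so it is safer to test them against the parsed tree before relying on them.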

To get the book names and prices from the list, first find the following section in the page source:

<div...
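The real markup is cut off in this excerpt, so the following is only a sketch of the general approach under assumed markup. Every tag structure and class name here (`book-list`, `book`, `name`, `price`) is hypothetical and would need to be replaced with what the page inspector actually shows.

```python
from lxml import html

# Hypothetical markup: the structure and class names are invented for
# illustration and do not come from the real page.
listing = """
<div class="book-list">
  <div class="book"><span class="name">Book A</span><span class="price">$10.00</span></div>
  <div class="book"><span class="name">Book B</span><span class="price">$24.99</span></div>
</div>
"""
tree = html.fromstring(listing)

books = []
# Select each book container first, then query relative to it so that
# every name stays paired with its own price.
for node in tree.xpath('//div[@class="book"]'):
    name = node.xpath('string(.//span[@class="name"])').strip()
    price = node.xpath('string(.//span[@class="price"])').strip()
    books.append((name, price))

print(books)
```

Querying relative to each container (the leading `.` in the inner expressions) avoids the mismatches that can occur when names and prices are collected in two separate page-wide lists and one entry is missing a price.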
