You're reading from Python for Security and Networking Leverage Python modules and tools in securing your network and applications

Product type Paperback

Published in Jun 2023

Publisher Packt

ISBN-13 9781837637553

Length 586 pages

Edition 3rd Edition

Languages

Python

Concepts

Information Security

Author (1):

José Manuel Ortega



Extracting metadata with PyMuPDF

Another way to extract text from PDF documents is the PyMuPDF module (https://github.com/pymupdf/PyMuPDF), which is available from the PyPI repository. You can install it with the following command:

$ pip install PyMuPDF

Viewing document information and extracting text from a PDF document works much as it does with PyPDF2. The module to import is called fitz. Its document object provides a load_page() method for loading a specific page, and the resulting page object provides a get_text() method for extracting that page's text. The following script prints the text of a page number entered by the user. You can find the code in the extractTextFromPDF_fitz.py file in the pymupdf folder:

import fitz
pdf_document = "pdf/XMPSpecificationPart3.pdf"
doc = fitz.open(pdf_document)
print("number of pages: %i" % doc.page_count)
page_number = input("Enter page number:")
page = doc.load_page(int...
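The listing above is cut off, so the following is only a minimal sketch of how the same idea could be written, not the book's actual script. It relies on the PyMuPDF calls named in the text (fitz.open(), page_count, load_page(), get_text()) plus the document's metadata dictionary for the document information. The helper names to_page_index and read_page are hypothetical, and the sketch assumes the user types a 1-based page number, which may differ from the original script:

```python
def to_page_index(raw: str, page_count: int) -> int:
    """Convert a 1-based page number typed by the user into a 0-based index.

    Assumption: the user counts pages from 1; load_page() counts from 0.
    """
    try:
        number = int(raw.strip())
    except ValueError:
        raise ValueError(f"not a whole number: {raw!r}") from None
    if not 1 <= number <= page_count:
        raise ValueError(f"page must be between 1 and {page_count}")
    return number - 1


def read_page(pdf_path: str, raw_page_number: str) -> tuple[dict, str]:
    """Return (metadata, page text) for one page of the PDF at pdf_path."""
    # PyMuPDF is imported here so that to_page_index() can be used
    # (and tested) on a machine where the package is not installed.
    import fitz

    with fitz.open(pdf_path) as doc:
        index = to_page_index(raw_page_number, doc.page_count)
        page = doc.load_page(index)
        # doc.metadata holds entries such as title, author, and producer.
        return dict(doc.metadata or {}), page.get_text()
```

Calling read_page("pdf/XMPSpecificationPart3.pdf", "3") would return the metadata dictionary together with the text of the third page, assuming that file exists. Validating the input before calling load_page() means a typo produces a readable ValueError rather than a failure inside the library.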
