
Mastering spaCy: Build structured NLP solutions with custom components and models powered by spacy-llm, Second Edition

By Déborah Mesquita and Duygu Altınok
eBook, Feb 2025, 238 pages, 2nd Edition
eBook: $27.99 $31.99
Paperback: $39.99
Subscription: Free Trial, renews at $19.99 per month

What do you get with eBook?

  • Instant access to your Digital eBook purchase
  • Download this book in EPUB and PDF formats
  • Access this title in our online reader with advanced features
  • DRM FREE - Read whenever, wherever and however you want
  • AI Assistant (beta) to help accelerate your learning

Getting Started with spaCy

This chapter gives a comprehensive introduction to natural language processing (NLP) application development with Python and spaCy. First, we will see how NLP development goes hand in hand with Python, along with an overview of what spaCy offers as a Python library.

After the warm-up, you will quickly get started with spaCy by downloading the library and loading the models. You will then explore spaCy’s popular visualizer, displaCy, to visualize language data and explore its various features.

By the end of this chapter, you will know what you can achieve with spaCy and have an overview of some of its key features. You will also have set up your development environment, which will be used in all the chapters of this book.

We’re going to cover the following topics:

  • Overview of spaCy
  • Installing spaCy
  • Installing spaCy’s language models
  • Visualization with displaCy

Technical requirements

The code of this chapter can be found at https://github.com/PacktPublishing/Mastering-spaCy-Second-Edition.

Overview of spaCy

NLP is a subfield of AI that analyzes text, speech, and other forms of human-generated language data. Human language is complicated – even a short paragraph contains references to the previous words, pointers to real-world objects, cultural references, and the writer’s or speaker’s personal experiences. Figure 1.1 shows such an example sentence, which includes a reference to a relative date (recently), phrases that can be resolved only by another person who knows the speaker (regarding the city that the speaker’s parents live in), and who has general knowledge about the world (a city is a place where human beings live together):

Figure 1.1 – An example of human language, containing many cognitive and cultural aspects


How do we process such a complicated structure using computers? With spaCy, we can easily model natural language with statistical models, and process linguistic features to turn the text...

Installing spaCy

Let’s get started by installing and setting up spaCy. spaCy is compatible with 64-bit CPython 3.7+ and runs on Unix/Linux, macOS/OS X, and Windows. CPython is the reference implementation of Python, written in C. If you already have Python running on your system, it is most probably CPython, so you don’t need to worry about this detail. The newest spaCy releases are always installable via pip (https://pypi.org/) and conda (https://conda.io/en/latest/), two of the most popular package managers for Python.
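If you would rather confirm these requirements than assume them, a minimal sketch along the following lines can help. It uses only the standard library; the function name check_spacy_compat and the MIN_VERSION constant are hypothetical helpers for illustration, and the 3.7 minimum is taken from the text above (newer spaCy releases may require a later Python).

```python
import platform
import struct
import sys

# Minimum version quoted in the text above; treat it as an assumption,
# since newer spaCy releases may raise the requirement.
MIN_VERSION = (3, 7)


def check_spacy_compat(implementation, version, pointer_bits):
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    if implementation != "CPython":
        problems.append(f"expected CPython, found {implementation}")
    if tuple(version[:2]) < MIN_VERSION:
        problems.append(f"Python {version[0]}.{version[1]} is older than 3.7")
    if pointer_bits != 64:
        problems.append(f"expected a 64-bit interpreter, found {pointer_bits}-bit")
    return problems


# Inspect the interpreter that is running this script.
found = check_spacy_compat(
    platform.python_implementation(),
    sys.version_info,
    struct.calcsize("P") * 8,  # pointer size in bits: 64 on a 64-bit build
)
print(found or "This interpreter meets the requirements listed above.")
```

Keeping the check as a pure function that receives the interpreter details makes it easy to try against values other than those of the current machine.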

It’s always a good idea to create a virtual environment to isolate the independent set of Python packages for each project. On Windows, we can create a virtual environment and install spacy with pip using these commands:

python -m venv .env
.env\Scripts\activate
pip install -U pip setuptools wheel
pip install -U spacy

If your machine has a GPU available, you can install spaCy with GPU support with this...

Installing spaCy’s language models

The spaCy installation doesn’t come with the statistical language models needed for the spaCy pipeline tasks. spaCy language models contain knowledge about a specific language collected from a set of resources. Language models let us perform a variety of NLP tasks, including part-of-speech tagging (popularly called POS tagging) and named entity recognition (NER).
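As a rough illustration of those two tasks, the sketch below loads a pipeline and collects POS tags and named entities. It assumes spaCy and the en_core_web_sm model are installed (for example with python -m spacy download en_core_web_sm); the helper names is_installed and tag_and_extract and the sample sentence are hypothetical and not taken from the book.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package (spaCy or a model package) is importable."""
    return importlib.util.find_spec(package_name) is not None


def tag_and_extract(text, model_name="en_core_web_sm"):
    """Run a spaCy pipeline and return (POS tags, named entities) for the text."""
    import spacy  # imported lazily so this module loads even without spaCy

    nlp = spacy.load(model_name)
    doc = nlp(text)
    pos_tags = [(token.text, token.pos_) for token in doc]
    entities = [(ent.text, ent.label_) for ent in doc.ents]
    return pos_tags, entities


if __name__ == "__main__":
    model = "en_core_web_sm"
    if is_installed("spacy") and is_installed(model):
        tags, ents = tag_and_extract("I visited my parents in Berlin last week.", model)
        print(tags)
        print(ents)
    else:
        print(f"Install spaCy and the model first: python -m spacy download {model}")
```

The availability check relies on the fact that spaCy models are installed as ordinary Python packages, so the standard import machinery can find them.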

Different languages have different models that are language-specific. There are also different models available for the same language. The naming convention of the models is [lang]_[name]. The [name] part usually contains information about the model capabilities, the genre, and the size. For example, the pt_core_news_sm model is a small Portuguese pipeline trained on news text. Large models can require a lot of disk space: for example, en_core_web_lg takes up 382 MB, while en_core_web_md needs 31 MB and en_core_web_sm takes only 12 MB.
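The naming convention can be made concrete with a small parser. This is only a sketch: parse_model_name and the ModelName tuple are hypothetical helpers rather than part of spaCy, and the split of [name] into three parts assumes the usual capabilities, genre, and size layout described above.

```python
from typing import NamedTuple, Optional


class ModelName(NamedTuple):
    lang: str                    # language code, e.g. "en"
    name: str                    # everything after the first underscore
    capabilities: Optional[str]  # e.g. "core"
    genre: Optional[str]         # e.g. "web" or "news"
    size: Optional[str]          # e.g. "sm", "md", "lg"


def parse_model_name(model: str) -> ModelName:
    """Split a model name of the form [lang]_[name] into its parts."""
    lang, sep, name = model.partition("_")
    if not sep or not lang or not name:
        raise ValueError(f"{model!r} does not follow the [lang]_[name] convention")
    parts = name.split("_")
    if len(parts) == 3:
        capabilities, genre, size = parts
    else:
        # The name has an unusual shape; leave the details unset instead of guessing.
        capabilities = genre = size = None
    return ModelName(lang, name, capabilities, genre, size)


print(parse_model_name("en_core_web_sm"))
```

Parsing a name this way only inspects the string; it does not tell you whether the corresponding model exists or is installed.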

It is a good practice to match the model...

Visualization with displaCy

Visualization is the easiest way to explain some concepts to your colleagues, your boss, and any technical or non-technical audience. Visualization of language data is especially useful and allows you to identify patterns in your data at a glance.

There are many Python visualization libraries and plugins, such as Matplotlib, seaborn, and TensorBoard. spaCy also comes with its own visualizer – displaCy. In this subsection, you’ll learn how to spin up a displaCy server on your machine, in a Jupyter notebook, and in a web application. We’ll start by exploring the easiest way – using displaCy’s interactive demo.
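For the programmatic route, a minimal sketch might look like the following. It uses displaCy’s manual mode, so spaCy must be installed but no language model is needed; the helpers to_displacy_ents and render_entities, the example sentence, and the hand-written entity spans are all assumptions made for illustration.

```python
import importlib.util


def to_displacy_ents(text, spans):
    """Build the dictionary that displaCy's manual mode expects for style="ent".

    spans is an iterable of (start_char, end_char, label) tuples.
    """
    ents = [
        {"start": start, "end": end, "label": label}
        for start, end, label in sorted(spans)  # order the spans by start offset
    ]
    return {"text": text, "ents": ents, "title": None}


def render_entities(text, spans):
    """Return displaCy's HTML markup for hand-annotated entities as a string."""
    from spacy import displacy  # imported lazily; requires spaCy

    data = to_displacy_ents(text, spans)
    return displacy.render(data, style="ent", manual=True, jupyter=False)


if __name__ == "__main__":
    text = "Ada moved to Berlin in 2021."
    spans = [
        (text.index("2021"), text.index("2021") + len("2021"), "DATE"),
        (text.index("Berlin"), text.index("Berlin") + len("Berlin"), "GPE"),
    ]
    if importlib.util.find_spec("spacy") is not None:
        print(render_entities(text, spans)[:200])
    else:
        print(to_displacy_ents(text, spans))
```

The returned string can be written to an HTML file or embedded in a web page. To start a local server instead, displacy.serve accepts the same kind of input, and in a Jupyter notebook displacy.render displays the visualization in the output cell.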

Getting started with displaCy

Go ahead and navigate to https://demos.explosion.ai/displacy to use the interactive demo. Enter your text in the Text to parse box and then click the search icon on the right to generate the visualization. The result might look like Figure 1.3.

Figure 1.3 – displaCy’s online demo


Summary

This chapter gave you an introduction to NLP with Python and spaCy. You now have a good idea of why to use Python for language processing and why to prefer spaCy for creating your NLP applications. We also got started on our spaCy journey by installing spaCy and downloading language models. Finally, the chapter introduced the visualization tool, displaCy.

In the next chapter, we will continue our exciting spaCy journey with spaCy core operations such as tokenization and lemmatization. It’ll be our first encounter with spaCy features in detail. See you there!


Key benefits

  • Build end-to-end NLP workflows, from local development to production, using Weasel and FastAPI
  • Master no-training NLP development with spacy-llm, covering everything from prompt engineering to custom tasks
  • Create advanced NLP solutions, including custom components and neural coreference resolution
  • Purchase of the print or Kindle book includes a free PDF eBook

Description

Mastering spaCy, Second Edition is your comprehensive guide to building sophisticated NLP applications using the spaCy ecosystem. This revised edition builds on the expertise of Duygu Altınok, a seasoned NLP engineer and spaCy contributor, and introduces new chapters by Déborah Mesquita, a data science educator and consultant known for making complex concepts accessible. This edition embraces the latest advancements in NLP, featuring chapters on large language models with spacy-llm, transformer integration, and end-to-end workflow management with Weasel. You’ll learn how to enhance NLP tasks using LLMs, streamline workflows using Weasel, and integrate spaCy with third-party libraries like Streamlit, FastAPI, and DVC. From training custom Named Entity Recognition (NER) pipelines to categorizing emotions in Reddit posts, this book covers advanced topics such as text classification and coreference resolution. Starting with the fundamentals (tokenization, NER, and dependency parsing), you’ll explore more advanced topics like creating custom components, training domain-specific models, and building scalable NLP workflows. Through practical examples, clear explanations, tips, and tricks, this book will equip you to build robust NLP pipelines and seamlessly integrate them into web applications for end-to-end solutions.

Who is this book for?

This book is for NLP engineers, machine learning developers, and LLM engineers looking to build production-grade language processing solutions. Software engineers transitioning into NLP development will find it valuable too, alongside professionals already working with language models and NLP pipelines. Basic Python programming knowledge and familiarity with NLP concepts are recommended to make the most of spaCy's latest capabilities.

What you will learn

  • Apply transformer models and fine-tune them for specialized NLP tasks
  • Master spaCy core functionalities including data structures and processing pipelines
  • Develop custom pipeline components and semantic extractors for domain-specific needs
  • Build scalable applications by integrating spaCy with FastAPI, Streamlit, and DVC
  • Master advanced spaCy features including coreference resolution and neural pipeline components
  • Train domain-specific models, including NER and coreference resolution
  • Prototype rapidly with spacy-llm and develop custom LLM tasks

Product Details

Publication date: Feb 14, 2025
Length: 238 pages
Edition: 2nd
Language: English
ISBN-13: 9781835880470

Packt Subscriptions

$19.99 billed monthly
  • Unlimited access to Packt's library of 7,000+ practical books and videos
  • Constantly refreshed with 50+ new titles a month
  • Exclusive Early access to books as they're written
  • Solve problems while you work with advanced search and reference features
  • Offline reading on the mobile app
  • Simple pricing, no contract

$199.99 billed annually
  • Unlimited access to Packt's library of 7,000+ practical books and videos
  • Constantly refreshed with 50+ new titles a month
  • Exclusive Early access to books as they're written
  • Solve problems while you work with advanced search and reference features
  • Offline reading on the mobile app
  • Choose a DRM-free eBook or Video every month to keep
  • PLUS own as many other DRM-free eBooks or Videos as you like for just $5 each
  • Exclusive print discounts

$279.99 billed in 18 months
  • Unlimited access to Packt's library of 7,000+ practical books and videos
  • Constantly refreshed with 50+ new titles a month
  • Exclusive Early access to books as they're written
  • Solve problems while you work with advanced search and reference features
  • Offline reading on the mobile app
  • Choose a DRM-free eBook or Video every month to keep
  • PLUS own as many other DRM-free eBooks or Videos as you like for just $5 each
  • Exclusive print discounts

Table of Contents

16 Chapters
Part 1: Getting Started with spaCy
Chapter 1: Getting Started with spaCy
Chapter 2: Core Operations with spaCy
Part 2: Advanced Linguistic and Semantic Analysis
Chapter 3: Extracting Linguistic Features
Chapter 4: Mastering Rule-Based Matching
Chapter 5: Extracting Semantic Representations with spaCy Pipelines
Chapter 6: Utilizing spaCy with Transformers
Part 3: Customizing and Integrating NLP Workflows
Chapter 7: Enhancing NLP Tasks Using LLMs with spacy-llm
Chapter 8: Training an NER Component with Your Own Data
Chapter 9: Creating End-to-End spaCy Workflows with Weasel
Chapter 10: Training an Entity Linker Model with spaCy
Chapter 11: Integrating spaCy with Third-Party Libraries
Index
Other Books You May Enjoy

FAQs

How do I buy and download an eBook?

Where there is an eBook version of a title available, you can buy it from the book details for that title. Add either the standalone eBook or the eBook and print book bundle to your shopping cart. Your eBook will show in your cart as a product on its own. After completing checkout and payment in the normal way, you will receive your receipt on the screen containing a link to a personalised PDF download file. This link will remain active for 30 days. You can download backup copies of the file by logging in to your account at any time.

If you already have Adobe Reader installed, then clicking on the link will download and open the PDF file directly. If you don't, then save the PDF file on your machine and download the Reader to view it.

Please Note: Packt eBooks are non-returnable and non-refundable.

Packt eBook and Licensing

When you buy an eBook from Packt Publishing, completing your purchase means you accept the terms of our licence agreement. Please read the full text of the agreement. In it we have tried to balance the need for the eBook to be usable for you the reader with our needs to protect the rights of us as Publishers and of our authors. In summary, the agreement says:

  • You may make copies of your eBook for your own use onto any machine
  • You may not pass copies of the eBook on to anyone else

How can I make a purchase on your website?

If you want to purchase a video course, eBook, or Bundle (Print+eBook), please follow the steps below:

  1. Register on our website using your email address and a password.
  2. Search for the title by name or ISBN using the search option.
  3. Select the title you want to purchase.
  4. Choose the format you wish to purchase the title in; if you order the Print Book, you get a free eBook copy of the same title. 
  5. Proceed with the checkout process (payment can be made using a credit card, debit card, or PayPal).

Where can I access support around an eBook?

  • If you experience a problem with using or installing Adobe Reader, then contact Adobe directly.
  • To view the errata for the book, see www.packtpub.com/support and view the pages for the title you have.
  • To view your account details or to download a new copy of the book go to www.packtpub.com/account
  • To contact us directly if a problem is not resolved, use www.packtpub.com/contact-us

What eBook formats does Packt support?

Our eBooks are currently available in a variety of formats such as PDF and ePub. In the future, this may well change with trends and development in technology, but please note that our PDFs are not Adobe eBook Reader format, which has greater restrictions on security.

You will need to use Adobe Reader v9 or later in order to read Packt's PDF eBooks.

What are the benefits of eBooks?

  • You can get the information you need immediately
  • You can easily take them with you on a laptop
  • You can download them an unlimited number of times
  • You can print them out
  • They are copy-paste enabled
  • They are searchable
  • There is no password protection
  • They are priced lower than print editions
  • They save resources and space

What is an eBook?

Packt eBooks are a complete electronic version of the print edition, available in PDF and ePub formats. Every piece of content down to the page numbering is the same. Because we save the costs of printing and shipping the book to you, we are able to offer eBooks at a lower cost than print editions.

When you have purchased an eBook, simply log in to your account and click on the link in Your Download Area. We recommend saving the file to your hard drive before opening it.

For optimal viewing of our eBooks, we recommend you download and install the free Adobe Reader version 9.