Natural Language Processing with Java

You're reading from Natural Language Processing with Java: Techniques for building machine learning and neural network models for NLP

Product type Paperback

Published in Jul 2018

Publisher

ISBN-13 9781788993494

Length 318 pages

Edition 2nd Edition

Languages

Java

Tools

Processing

Concepts

Machine Learning

Authors (2):

Ashish Bhatia

Richard M. Reese

Table of Contents (14) Chapters

Preface

1. Introduction to NLP

2. Finding Parts of Text

3. Finding Sentences

4. Finding People and Things

5. Detecting Part of Speech

6. Representing Text with Features

7. Information Retrieval

8. Classifying Texts and Documents

9. Topic Modeling

10. Using Parsers to Extract Relationships

11. Combined Pipeline

12. Creating a Chatbot

13. Other Books You May Enjoy

Dictionaries and tolerant retrieval

A dictionary data structure stores the term vocabulary and, for each term, the list of documents that contain it. This list is also known as a posting list (or postings).
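As a minimal sketch of this idea, the following Java class maps each term to the sorted set of document IDs that contain it. The class name `PostingsDictionary`, its method names, and the simple tokenization on non-word characters are assumptions made for illustration; they are not taken from the book.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical illustration: a term dictionary with a posting list per term.
class PostingsDictionary {
    // term -> sorted set of IDs of the documents that contain the term
    private final Map<String, TreeSet<Integer>> postings = new HashMap<>();

    // Tokenize the text naively and record docId in each term's posting list.
    void addDocument(int docId, String text) {
        for (String token : text.toLowerCase().split("\\W+")) {
            if (token.isEmpty()) {
                continue; // split can yield an empty leading token
            }
            postings.computeIfAbsent(token, t -> new TreeSet<>()).add(docId);
        }
    }

    // Return the posting list for a term, or an empty list if it is unknown.
    List<Integer> lookup(String term) {
        TreeSet<Integer> ids = postings.get(term.toLowerCase());
        return ids == null ? new ArrayList<>() : new ArrayList<>(ids);
    }

    public static void main(String[] args) {
        PostingsDictionary dict = new PostingsDictionary();
        dict.addDocument(1, "Java makes NLP practical");
        dict.addDocument(2, "NLP with Java and trees");
        System.out.println(dict.lookup("java"));  // [1, 2]
        System.out.println(dict.lookup("trees")); // [2]
    }
}
```

A real IR system would typically add normalization, stop-word handling, and a more compact posting representation; this sketch only shows the term-to-postings mapping.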

Dictionaries can be implemented in two main ways: with hash tables or with trees. A naive approach to storing the dictionary leads to performance problems as the corpus grows. Some IR systems use hashing, whereas others use trees; each approach has its pros and cons.

Hash tables map each vocabulary term to an integer obtained by hashing. Lookups in a hash table are fast, taking constant time, O(1), on average. However, a prefix-based search, such as finding all terms that start with "abc", cannot be served directly by a hash table, because hashing does not preserve the ordering of terms: terms that share a prefix end up in unrelated positions. For the same reason, it is not easy to find minor variants of a term. In addition, as the vocabulary grows, rehashing becomes expensive.
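The trade-off described above can be sketched with Java's standard `HashMap`. The class `HashDictionary` and its methods are hypothetical names used only for this illustration. An exact lookup is a single hash probe, whereas the prefix query has no choice but to scan every stored term.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration: a hash-based term dictionary.
class HashDictionary {
    // term -> integer term ID, assigned in insertion order
    private final Map<String, Integer> termIds = new HashMap<>();

    void add(String term) {
        termIds.putIfAbsent(term, termIds.size());
    }

    // Exact-match lookup: expected constant time.
    boolean contains(String term) {
        return termIds.containsKey(term);
    }

    // Prefix search: the hash table gives no ordering to exploit,
    // so every term has to be inspected (linear in the vocabulary size).
    List<String> withPrefix(String prefix) {
        List<String> matches = new ArrayList<>();
        for (String term : termIds.keySet()) {
            if (term.startsWith(prefix)) {
                matches.add(term);
            }
        }
        Collections.sort(matches); // HashMap iteration order is unspecified
        return matches;
    }

    public static void main(String[] args) {
        HashDictionary dict = new HashDictionary();
        for (String t : new String[] {"abc", "abcd", "abd", "xyz"}) {
            dict.add(t);
        }
        System.out.println(dict.contains("abd"));   // true
        System.out.println(dict.withPrefix("abc")); // [abc, abcd]
    }
}
```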

A tree-based approach uses a tree structure, normally a binary tree, which is very...
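As a rough sketch of a tree-based dictionary, the example below uses Java's `TreeSet`, which is backed by a balanced binary search tree (a red-black tree). Because the terms are kept in sorted order, a prefix query becomes a range query over the tree. The class name `TreeDictionary` is hypothetical, and the upper bound built with `Character.MAX_VALUE` is a simplifying assumption: it would miss terms that contain that character immediately after the prefix.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical illustration: a tree-based term dictionary.
class TreeDictionary {
    // Terms kept in sorted order by a balanced binary search tree.
    private final TreeSet<String> terms = new TreeSet<>();

    void add(String term) {
        terms.add(term);
    }

    // Exact-match lookup: O(log n) comparisons instead of O(1) hashing.
    boolean contains(String term) {
        return terms.contains(term);
    }

    // Prefix search as a range query: all terms t with
    // prefix <= t < prefix + '\uffff' (simplifying assumption, see lead-in).
    List<String> withPrefix(String prefix) {
        return new ArrayList<>(terms.subSet(prefix, prefix + Character.MAX_VALUE));
    }

    public static void main(String[] args) {
        TreeDictionary dict = new TreeDictionary();
        for (String t : new String[] {"xyz", "abd", "abcd", "abc"}) {
            dict.add(t);
        }
        System.out.println(dict.contains("abd"));   // true
        System.out.println(dict.withPrefix("abc")); // [abc, abcd]
    }
}
```

Compared with the hash-based variant, exact lookups cost a logarithmic number of comparisons, but the prefix query only visits the matching range rather than the whole vocabulary.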

Authors (1)

Richard M. Reese

Richard Reese has worked in industry and academia for the past 29 years. For 10 years, he provided software development support at Lockheed and at one point developed a C-based network application. He was a contract instructor providing software training to industry for 5 years. Richard is currently an Associate Professor at Tarleton State University in Stephenville, Texas. Richard is the author of various books and video courses, including Natural Language Processing with Java, Java for Data Science, and Getting Started with Natural Language Processing in Java.
