Natural Language Processing with Java

Explore various approaches to organize and extract useful text from unstructured data using Java

Product type Paperback
Published in Mar 2015
ISBN-13 9781784391799
Length 262 pages
Edition 1st Edition
Authors (2):
Richard M. Reese
Richard M Reese

Why use NLP?

NLP is used in a wide variety of disciplines to solve many different types of problems. Text analysis is performed on text that ranges from a few words of user input for an Internet query to multiple documents that need to be summarized. We have seen a large growth in the amount and availability of unstructured data in recent years. This has taken forms such as blogs, tweets, and various other social media. NLP is ideal for analyzing this type of information.

Machine learning and text analysis are used frequently to enhance an application's utility. A brief list of application areas follows:

  • Searching: This identifies specific elements of text. It can be as simple as finding the occurrence of a name in a document, or it might involve the use of synonyms and alternate spellings or misspellings to find entries that are close to the original search string.
  • Machine translation: This typically involves the translation of one natural language into another.
  • Summarization: Paragraphs, articles, documents, or collections of documents may need to be summarized. NLP has been used successfully for this purpose.
  • Named Entity Recognition (NER): This involves extracting names of locations, people, and things from text. Typically, this is used in conjunction with other NLP tasks such as processing queries.
  • Information grouping: This is an important activity that takes textual data and creates a set of categories that reflect the content of the document. You have probably encountered numerous websites that organize data based on your needs and have categories listed on the left-hand side of the website.
  • Part-of-Speech (POS) tagging: In this task, text is split up into different grammatical elements such as nouns and verbs. This is useful in analyzing the text further.
  • Sentiment analysis: People's feelings and attitudes regarding movies, books, and other products can be determined using this technique. This is useful in providing automated feedback with regard to how well a product is perceived.
  • Answering queries: This type of processing was illustrated when IBM's Watson won a Jeopardy! competition. However, its use is not restricted to winning game shows; it has been applied in a number of other fields, including medicine.
  • Speech recognition: Human speech is difficult to analyze. Many of the advances that have been made in this field are the result of NLP efforts.
  • Natural Language Generation: This is the process of generating text from a data or knowledge source, such as a database. It can automate reporting of information such as weather reports, or summarize medical reports.
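
To make the searching item above more concrete, the following is a minimal sketch of matching misspelled or alternately spelled names with edit distance. It is not taken from the book and uses only the standard Java library; the class name `FuzzySearch`, the method names, the sample sentence, and the distance threshold of 1 are all assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration: find words within a small edit distance of a query.
public class FuzzySearch {

    // Levenshtein distance: the minimum number of single-character insertions,
    // deletions, or substitutions needed to turn a into b.
    public static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning the empty prefix of a into b[0..j) takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into the empty string takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev; // reuse the two rows instead of allocating a full table
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Returns the lowercased words of text whose distance to query is at most maxDistance.
    public static List<String> findClose(String text, String query, int maxDistance) {
        List<String> hits = new ArrayList<>();
        String target = query.toLowerCase();
        for (String token : text.toLowerCase().split("\\W+")) {
            if (!token.isEmpty() && editDistance(token, target) <= maxDistance) {
                hits.add(token);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String text = "Mr. Smith met Smyth and Jones; Smithe was absent.";
        System.out.println(findClose(text, "smith", 1)); // prints [smith, smyth, smithe]
    }
}
```

Splitting on non-word characters is a deliberately crude stand-in for real tokenization, and a production search would normally rely on an index rather than scanning every word.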

NLP tasks frequently use different machine learning techniques. A common approach starts with training a model to perform a task, verifying that the model is correct, and then applying the model to a problem. We will examine this process further in Understanding NLP models later in the chapter.
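
The train, verify, and apply steps can be sketched with a deliberately simple word-scoring classifier. This is only an illustration under stated assumptions: the class `TrainVerifyApply`, its methods, the `pos`/`neg` labels, and the tiny datasets are all hypothetical, and the scoring scheme is a toy stand-in for the machine learning techniques a real NLP library would provide.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical toy model: each word gets +1 per positive example and -1 per negative one.
public class TrainVerifyApply {

    private final Map<String, Integer> wordScores = new HashMap<>();

    // Train: accumulate a score per word from labeled {text, label} pairs.
    public void train(String[][] labeled) {
        for (String[] example : labeled) {
            int delta = example[1].equals("pos") ? 1 : -1;
            for (String word : example[0].toLowerCase().split("\\W+")) {
                if (!word.isEmpty()) {
                    wordScores.merge(word, delta, Integer::sum);
                }
            }
        }
    }

    // Apply: sum the scores of known words; a negative total means "neg".
    public String classify(String text) {
        int total = 0;
        for (String word : text.toLowerCase().split("\\W+")) {
            total += wordScores.getOrDefault(word, 0);
        }
        return total >= 0 ? "pos" : "neg";
    }

    // Verify: fraction of held-out examples the model labels correctly.
    public double accuracy(String[][] labeled) {
        int correct = 0;
        for (String[] example : labeled) {
            if (classify(example[0]).equals(example[1])) {
                correct++;
            }
        }
        return (double) correct / labeled.length;
    }

    public static void main(String[] args) {
        String[][] training = {
            {"a great and enjoyable film", "pos"},
            {"great acting, loved it", "pos"},
            {"a dull and boring film", "neg"},
            {"boring plot, hated it", "neg"}
        };
        String[][] heldOut = {
            {"enjoyable and great", "pos"},
            {"dull, boring", "neg"}
        };

        TrainVerifyApply model = new TrainVerifyApply();
        model.train(training);                                   // 1. train
        System.out.println("accuracy = " + model.accuracy(heldOut)); // 2. verify
        System.out.println(model.classify("a boring film"));     // 3. apply
    }
}
```

The important design point is that verification uses examples the model did not see during training; measuring accuracy on the training data itself would overstate how well the model generalizes.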
