Mastering Java Machine Learning: A Java developer's guide to implementing machine learning and big data architectures

Product type: Paperback
Published: Jul 2017
Publisher: Packt
ISBN-13: 9781785880513
Length: 556 pages
Edition: 1st Edition
Authors (2):
Uday Kamath
Krishna Choppella

Table of Contents (13 chapters)

Preface
1. Machine Learning Review
2. Practical Approach to Real-World Supervised Learning
3. Unsupervised Machine Learning Techniques
4. Semi-Supervised and Active Learning
5. Real-Time Stream Machine Learning
6. Probabilistic Graph Modeling
7. Deep Learning
8. Text Mining and Natural Language Processing
9. Big Data Machine Learning – The Final Frontier
A. Linear Algebra
B. Probability
Index

Issues with mining unstructured data


Humans can read, parse, and understand unstructured text and documents far more easily than computer programs can. Some of the reasons why text mining is more complicated than general supervised or unsupervised learning are given here:

  • Ambiguity in terms and phrases. The word bank has multiple meanings, which a human reader can correctly associate based on context. A program, by contrast, needs preprocessing steps such as POS tagging and word sense disambiguation to do the same, as we have seen. According to the Oxford English Dictionary, the word run has no fewer than 645 different uses in the verb form alone, so such words can present real problems in resolving the intended meaning (between them, the words run, put, set, and take have more than a thousand meanings).

  • Context and background knowledge associated with the text. Consider a sentence that uses a neologism with the suffix gate to signify a political scandal, as in, With cries for impeachment and popularity...
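To make the ambiguity problem concrete, the following is a minimal sketch of word sense disambiguation for the word bank, loosely in the spirit of the simplified Lesk approach: the sense whose dictionary gloss shares the most words with the surrounding sentence is chosen. The class name `SimplifiedLesk`, the two hand-written glosses, and the tiny stop word list are assumptions made for illustration only; a practical system would rely on a lexical resource and far richer context than word overlap.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

class SimplifiedLesk {

    // Hypothetical, hand-written glosses for two senses of "bank";
    // a real system would take these from a lexical resource.
    static final Map<String, String> BANK_SENSES = new LinkedHashMap<>();
    static {
        BANK_SENSES.put("financial institution",
                "an institution that accepts deposits of money and makes loans");
        BANK_SENSES.put("river edge",
                "the sloping land beside a river or lake");
    }

    // A deliberately tiny stop word list (an assumption for this sketch).
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList(
            "a", "an", "the", "of", "to", "and", "or", "that", "from"));

    // Lowercase the text, split on non-letters, and drop stop words.
    static Set<String> tokens(String text) {
        Set<String> result = new HashSet<>(
                Arrays.asList(text.toLowerCase().split("[^a-z]+")));
        result.removeAll(STOP_WORDS);
        result.remove("");
        return result;
    }

    // Return the sense label whose gloss overlaps most with the sentence.
    // Ties are resolved in favor of the sense listed first.
    static String disambiguate(String sentence, Map<String, String> senses) {
        Set<String> context = tokens(sentence);
        String best = null;
        int bestOverlap = -1;
        for (Map.Entry<String, String> sense : senses.entrySet()) {
            Set<String> overlap = tokens(sense.getValue());
            overlap.retainAll(context);
            if (overlap.size() > bestOverlap) {
                bestOverlap = overlap.size();
                best = sense.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // prints "financial institution"
        System.out.println(disambiguate(
                "She went to the bank to deposit money", BANK_SENSES));
        // prints "river edge"
        System.out.println(disambiguate(
                "They fished from the bank of the river", BANK_SENSES));
    }
}
```

Note that the first sentence is resolved only through the shared word money: deposit and deposits do not match without stemming, which hints at why preprocessing matters even for this toy case.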
