R for Data Science

Learn and explore the fundamentals of data science with R

Product type Paperback
Published in Dec 2014
ISBN-13 9781784390860
Length 364 pages
Edition 1st Edition
Author: Dan Toomey

Packages

The standard R system ships with many features and functions, but one of R's strengths is that packages can add further functionality. A package bundles a set of functions (and sometimes sample data) aimed at a particular problem. Packages are developed by interested groups and shared for the benefit of all R developers. In this chapter, we will be using the following packages:

  • tm: This contains text mining tools
  • XML: This contains XML processing tools
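As a minimal sketch of how these packages might be made available in a session: the helper `missing_packages` below is a hypothetical name introduced for illustration, not something taken from the book. It only reports what is missing, and the `install.packages` call is left commented out because it needs network access.

```r
# Packages named in this chapter.
required <- c("tm", "XML")

# Hypothetical helper: return the subset of `pkgs` that is not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(required)
if (length(to_install) > 0) {
  # Uncomment to install from CRAN (needs network access):
  # install.packages(to_install)
  message("Not installed: ", paste(to_install, collapse = ", "))
} else {
  # Everything is present, so attach both packages.
  library(tm)
  library(XML)
}
```

Once a package is attached with `library()`, its functions can be called directly by name.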

Text processing

R provides functions for manipulating text, both in base R and through packages such as tm. These include transformations that make the text easier to analyze (such as reducing words to their stems or removing punctuation) and building a document-term matrix that records how often each word occurs in each document. Once these steps are done, we can submit our documents to analysis and clustering.
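The kind of preparation described here could be sketched with the tm package as follows. The two sample sentences are made up for illustration, and stemming is applied only if the SnowballC package (which `stemDocument` relies on) is installed; this is one possible sequence of transformations, not necessarily the one the chapter goes on to use.

```r
library(tm)

# Two made-up documents, used only for illustration.
docs <- c("The cats are chasing mice, quickly!",
          "A cat chased the mouse; the mouse escaped.")

# Build a corpus from the character vector.
corpus <- VCorpus(VectorSource(docs))

# Normalize the text: lower case, no punctuation, no common stop words.
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeWords, stopwords("english"))
corpus <- tm_map(corpus, stripWhitespace)

# Reduce words to their stems when SnowballC is available.
if (requireNamespace("SnowballC", quietly = TRUE)) {
  corpus <- tm_map(corpus, stemDocument)
}

# Rows are documents, columns are terms, cells are term counts.
dtm <- DocumentTermMatrix(corpus)
inspect(dtm)
```

The resulting matrix is the usual starting point for later steps such as frequency analysis or clustering.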

Example

In this example, we will perform the following steps:

  1. We will take an HTML document from the Internet.
  2. We will clean up the document using text processing...
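As a rough sketch of the first two steps, the XML package can parse HTML and pull out the text of selected elements. To keep the example self-contained, an inline HTML string stands in for a page downloaded from the Internet; the markup and the choice to extract only `<p>` elements are assumptions made for illustration.

```r
library(XML)

# Inline HTML standing in for a downloaded page (illustrative content).
html <- paste0(
  "<html><body><h1>Sample page</h1>",
  "<p>First paragraph, with punctuation!</p>",
  "<p>Second paragraph.</p></body></html>"
)

# Parse the string as HTML rather than treating it as a file name or URL.
doc <- htmlParse(html, asText = TRUE)

# Extract the text content of every <p> element.
paragraphs <- xpathSApply(doc, "//p", xmlValue)

# A first cleanup pass: strip punctuation and lower-case the text.
cleaned <- tolower(gsub("[[:punct:]]", "", paragraphs))
print(cleaned)
```

The cleaned character vector can then be handed to tm, for example through `VectorSource`, for further processing.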