Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
Machine Learning with R

You're reading from   Machine Learning with R Expert techniques for predictive modeling

Arrow left icon
Product type Paperback
Published in Apr 2019
Publisher Packt
ISBN-13 9781788295864
Length 458 pages
Edition 3rd Edition
Languages
Tools
Arrow right icon
Author (1):
Arrow left icon
Brett Lantz Brett Lantz
Author Profile Icon Brett Lantz
Brett Lantz
Arrow right icon
View More author details
Toc

Table of Contents (16) Chapters Close

Preface 1. Introducing Machine Learning 2. Managing and Understanding Data FREE CHAPTER 3. Lazy Learning – Classification Using Nearest Neighbors 4. Probabilistic Learning – Classification Using Naive Bayes 5. Divide and Conquer – Classification Using Decision Trees and Rules 6. Forecasting Numeric Data – Regression Methods 7. Black Box Methods – Neural Networks and Support Vector Machines 8. Finding Patterns – Market Basket Analysis Using Association Rules 9. Finding Groups of Data – Clustering with k-means 10. Evaluating Model Performance 11. Improving Model Performance 12. Specialized Machine Learning Topics Other Books You May Enjoy
Leave a review - let other readers know what you think
Index

Machine learning with R

Many of the algorithms needed for machine learning are not included as part of the base R installation. Instead, the algorithms are available via a large community of experts who have shared their work freely. These must be installed on top of base R manually. Thanks to R's status as free open-source software, there is no additional charge for this functionality.

A collection of R functions that can be shared among users is called a package. Free packages exist for each of the machine learning algorithms covered in this book. In fact, this book only covers a small portion of all of R's machine learning packages.

If you are interested in the breadth of R packages, you can view a list at Comprehensive R Archive Network (CRAN), a collection of web and FTP sites located around the world to provide the most up-to-date versions of R software and packages. If you obtained the R software via download, it was most likely from CRAN. The CRAN website is available at http://cran.r-project.org/index.html.

Tip

If you do not already have R, the CRAN website also provides installation instructions and information on where to find help if you have trouble.

The Packages link on the left side of the CRAN page will take you to a page where you can browse the packages in alphabetical order or sorted by publication date. At the time of this writing, a total of 13,904 packages were available—over two times the number since the second edition of this book was written, and over three times since the first edition! Clearly, the R community is thriving, and this trend shows no sign of slowing!

The Task Views link on the left side of the CRAN page provides a curated list of packages by subject area. The task view for machine learning, which lists the packages covered in this book (and many more), is available at https://CRAN.R-project.org/view=MachineLearning.

Installing R packages

Despite the vast set of available R add-ons, the package format makes installation and use a virtually effortless process. To demonstrate the use of packages, we will install and load the RWeka package developed by Kurt Hornik, Christian Buchta, and Achim Zeileis (see Open-Source Machine Learning: R Meets Weka, Computational Statistics Vol. 24, pp 225-232 for more information). The RWeka package provides a collection of functions that give R access to the machine learning algorithms in the Java-based Weka software package by Ian H. Witten and Eibe Frank. For more information on Weka, see http://www.cs.waikato.ac.nz/~ml/weka/.

Tip

To use the RWeka package, you will need to have Java installed, if it isn't already (many computers come with Java preinstalled). Java is a set of programming tools, available for free, that allow for the use of cross-platform applications such as Weka. For more information and to download Java for your system, visit http://www.java.com.

The most direct way to install a package is via the install.packages() function. To install the RWeka package, at the R command prompt simply type:

> install.packages("RWeka")

R will then connect to CRAN and download the package in the correct format for your operating system. Some packages, such as RWeka, require additional packages to be installed before they can be used. These are called dependencies. By default, the installer will automatically download and install any dependencies.

Tip

The first time you install a package, R may ask you to choose a CRAN mirror. If this happens, choose the mirror residing at a location close to you. This will generally provide the fastest download speed.

The default installation options are appropriate for most systems. However, in some cases, you may want to install a package to another location. For example, if you do not have root or administrator privileges on your system, you may need to specify an alternative installation path. This can be accomplished using the lib option as follows:

> install.packages("RWeka", lib = "/path/to/library")

The installation function also provides additional options for installing from a local file, installing from source, or using experimental versions. You can read about these options in the help file by using the following command:

> ?install.packages

More generally, the question mark operator can be used to obtain help on any R function. Simply type ? before the name of the function.

Loading and unloading R packages

In order to conserve memory, R does not load every installed package by default. Instead, packages are loaded by users with the library() function as they are needed.

Tip

The name of this function leads some people to incorrectly use the terms "library" and "package" interchangeably. However, to be precise, a library refers to the location where packages are installed and never to a package itself.

To load the RWeka package installed previously, you can type the following:

> library(RWeka)

Aside from RWeka, there are several other R packages that will be used in later chapters. Installation instructions will be provided as these additional packages are needed.

To unload an R package, use the detach() function. For example, to unload the RWeka package shown previously, use the following command:

> detach("package:RWeka", unload = TRUE)

This will free up any resources used by the package.

Installing RStudio

Before you begin working with R, it is highly recommended to also install the open-source RStudio desktop application. RStudio is an additional interface to R that includes functionalities that make it far easier, more convenient, and more interactive to work with R code. It is available free of charge at https://www.rstudio.com/.

Installing RStudio

Figure 1.10: The RStudio desktop environment makes R easier and more convenient to use

The RStudio interface includes an integrated code editor, an R command-line console, a file browser, and an R object browser. R code syntax is automatically colorized, and the code's output, plots, and graphics are displayed directly within the environment, which makes it much easier to follow long or complex statements and programs. More advanced features allow R project and package management; integration with source control or version control tools, such as Git and Subversion; database connection management; and the compilation of R output to HTML, PDF, or Microsoft Word formats.

RStudio is a key reason why R is a top choice for data scientists today. It wraps the power of R programming and its tremendous library of machine learning and statistical packages in an easy-to-use and easy-to-install development interface. It is not only ideal for learning R, but can also grow with you as you learn R's more advanced functionality.

lock icon The rest of the chapter is locked
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $19.99/month. Cancel anytime
Banner background image