R Data Analysis Cookbook, Second Edition

Customizable R Recipes for data mining, data visualization and time series analysis

Product type: Paperback
Published in: Sep 2017
Publisher: Packt
ISBN-13: 9781787124479
Length: 560 pages
Edition: 2nd Edition
Authors (3): Kuntal Ganguly, Shanthi Viswanathan, Viswa Viswanathan
Table of Contents (14 chapters)

Preface
1. Acquire and Prepare the Ingredients - Your Data
2. What's in There - Exploratory Data Analysis
3. Where Does It Belong? Classification
4. Give Me a Number - Regression
5. Can you Simplify That? Data Reduction Techniques
6. Lessons from History - Time Series Analysis
7. How does it look? - Advanced data visualization
8. This may also interest you - Building Recommendations
9. It's All About Your Connections - Social Network Analysis
10. Put Your Best Foot Forward - Document and Present Your Analysis
11. Work Smarter, Not Harder - Efficient and Elegant R Code
12. Where in the World? Geospatial Analysis
13. Playing Nice - Connecting to Other Systems

Reading data from fixed-width formatted files

In fixed-width formatted files, columns have fixed widths; if a data element does not use up the entire allotted column width, then the element is padded with spaces to make up the specified width. To read fixed-width text files, specify the columns either by column widths or by starting positions.
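
Base R's read.fwf() accepts column widths rather than starting positions, so a layout described by starting positions has to be converted first. The following is a minimal sketch of that conversion; the starting positions and the record length of 58 characters are assumptions chosen to match the widths 4, 15, 20, 15, and 4 used in this recipe.

```r
# Assumed 1-based starting position of each column
starts <- c(1, 5, 20, 40, 55)

# Assumed total number of characters in one record
record_length <- 58

# Each width is the gap between consecutive starting positions;
# the last column runs to the end of the record
widths <- diff(c(starts, record_length + 1))
widths
```

The resulting vector can be passed directly as the widths argument of read.fwf().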

Getting ready

Download the files for this chapter and store the student-fwf.txt file in your R working directory.

How to do it...

Read the fixed-width formatted file as follows:

> student <- read.fwf("student-fwf.txt",
+     widths=c(4,15,20,15,4),
+     col.names=c("id","name","email","major","year"))

How it works...

In the student-fwf.txt file, the first column occupies 4 character positions, the second 15, and so on. The c(4,15,20,15,4) expression specifies the widths of the 5 columns in the data file.

We can use the optional col.names argument to supply our own variable names.
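
To try the call without the downloaded file, the following minimal sketch writes two records to a temporary file and reads them back. The records, the temporary path, and the extra strip.white and stringsAsFactors arguments (which read.fwf() passes on to read.table()) are illustrative assumptions, not the contents of student-fwf.txt.

```r
# Hypothetical records padded to the recipe's widths: 4, 15, 20, 15, 4
fmt <- "%-4s%-15s%-20s%-15s%-4s"
records <- c(
  sprintf(fmt, "1001", "Ann Lee", "ann@example.com", "Statistics", "2016"),
  sprintf(fmt, "1002", "Raj Patel", "raj@example.com", "Economics", "2017")
)

# Write the records to a temporary file so the example is self-contained
tmp <- tempfile(fileext = ".txt")
writeLines(records, tmp)

student <- read.fwf(tmp,
                    widths = c(4, 15, 20, 15, 4),
                    col.names = c("id", "name", "email", "major", "year"),
                    strip.white = TRUE,        # drop the padding spaces
                    stringsAsFactors = FALSE)  # keep text columns as character
str(student)

unlink(tmp)  # remove the temporary file
```

Without strip.white=TRUE, the text columns keep their trailing padding, which can make later comparisons such as name == "Ann Lee" fail unexpectedly.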

There's more...

The read.fwf() function has several optional arguments that come in handy. We discuss a few of these, as follows:

Files with headers

To read a file that has a header row, use the following command:

> student <- read.fwf("student-fwf-header.txt",
+     widths=c(4,15,20,15,4), header=TRUE, sep="\t", skip=2)

If header=TRUE, the first row read from the file is interpreted as the column headers. Unlike the data rows, the header row is not fixed-width: its column names must be separated by the character given in the sep argument, which is a tab ("\t") by default.

The skip argument denotes the number of lines to skip at the start of the file, before the header row is read; in this recipe, the first two lines are skipped.
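
The interaction between skip, header, and sep can be sketched with a small temporary file. The two leading lines, the single record, and the extra strip.white and stringsAsFactors arguments are assumptions made for illustration; the actual layout of student-fwf-header.txt may differ.

```r
# Hypothetical file: two lines to skip, a tab-separated header row,
# then one fixed-width data row
lines <- c(
  "Student roster",
  "Exported for illustration",
  paste(c("id", "name", "email", "major", "year"), collapse = "\t"),
  sprintf("%-4s%-15s%-20s%-15s%-4s",
          "1001", "Ann Lee", "ann@example.com", "Statistics", "2016")
)
tmp <- tempfile(fileext = ".txt")
writeLines(lines, tmp)

student <- read.fwf(tmp,
                    widths = c(4, 15, 20, 15, 4),
                    header = TRUE,   # third line holds the column names
                    sep = "\t",      # separator used within the header row
                    skip = 2,        # discard the first two lines
                    strip.white = TRUE,
                    stringsAsFactors = FALSE)
names(student)

unlink(tmp)  # remove the temporary file
```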

Excluding columns from data

To exclude a column, make the column width negative. Thus, to exclude the email column, we will specify its width as -20 and also remove the column name from the col.names vector, as follows:

> student <- read.fwf("student-fwf.txt",
+     widths=c(4,15,-20,15,4),
+     col.names=c("id","name","major","year"))
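
As a quick check of the negative-width behavior, the sketch below reads one hypothetical record and confirms that the 20-character email field is skipped. The record and the temporary file are illustrative assumptions.

```r
# Hypothetical record padded to widths 4, 15, 20, 15, 4
rec <- sprintf("%-4s%-15s%-20s%-15s%-4s",
               "1001", "Ann Lee", "ann@example.com", "Statistics", "2016")
tmp <- tempfile(fileext = ".txt")
writeLines(rec, tmp)

# The -20 tells read.fwf() to skip 20 characters instead of reading them,
# so col.names lists only the four columns that remain
student <- read.fwf(tmp,
                    widths = c(4, 15, -20, 15, 4),
                    col.names = c("id", "name", "major", "year"),
                    strip.white = TRUE,
                    stringsAsFactors = FALSE)
names(student)

unlink(tmp)  # remove the temporary file
```

The widths vector must still account for every character in the record, including skipped ones; otherwise the columns after the excluded field are read from the wrong positions.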