One of the main benefits of this package is that it streamlines workflows that span different development and productivity tools.
For instance, you can produce tables directly in SAS and make them available to the R environment without any particular export procedure in SAS. You can also acquire data in R as it is produced or entered into an Excel spreadsheet.
We first import the dataset using the import() function. To understand the structure of the import() function, we can leverage a useful behavior of the R console: typing a function name without parentheses and running the command prints the function's definition.
Running import in the R console will produce the following output:
function (file, format, setclass, ...)
{
    if (missing(format))
        fmt <- get_ext(file)
    else fmt <- tolower(format)
    if (grepl("^http.*://", file)) {
        temp_file <- tempfile(fileext = fmt)
        on.exit(unlink(temp_file))
        curl_download(file, temp_file, mode = "wb")
        file <- temp_file
    }
    x <- switch(fmt, r = dget(file = file), tsv = import.delim(file = file,
        sep = "\t", ...), txt = import.delim(file = file, sep = "\t",
        ...), fwf = import.fwf(file = file, ...), rds = readRDS(file = file,
        ...), csv = import.delim(file = file, sep = ",", ...),
        csv2 = import.delim(file = file, sep = ";", dec = ",",
            ...), psv = import.delim(file = file, sep = "|",
            ...), rdata = import.rdata(file = file, ...), dta = import.dta(file = file,
            ...), dbf = read.dbf(file = file, ...), dif = read.DIF(file = file,
            ...), sav = import.sav(file = file, ...), por = read_por(path = file),
        sas7bdat = read_sas(b7dat = file, ...), xpt = read.xport(file = file),
        mtp = read.mtp(file = file, ...), syd = read.systat(file = file,
            to.data.frame = TRUE), json = fromJSON(txt = file,
            ...), rec = read.epiinfo(file = file, ...), arff = read.arff(file = file),
        xls = read_excel(path = file, ...), xlsx = import.xlsx(file = file,
            ...), fortran = import.fortran(file = file, ...),
        zip = import.zip(file = file, ...), tar = import.tar(file = file,
            ...), ods = import.ods(file = file, ...), xml = import.xml(file = file,
            ...), clipboard = import.clipboard(...), gnumeric = stop(stop_for_import(fmt)),
        jpg = stop(stop_for_import(fmt)), png = stop(stop_for_import(fmt)),
        bmp = stop(stop_for_import(fmt)), tiff = stop(stop_for_import(fmt)),
        sss = stop(stop_for_import(fmt)), sdmx = stop(stop_for_import(fmt)),
        matlab = stop(stop_for_import(fmt)), gexf = stop(stop_for_import(fmt)),
        npy = stop(stop_for_import(fmt)), stop("Unrecognized file format"))
    if (missing(setclass)) {
        return(set_class(x))
    }
    else {
        a <- list(...)
        if ("data.table" %in% names(a) && isTRUE(a[["data.table"]]))
            setclass <- "data.table"
        return(set_class(x, class = setclass))
    }
}
As you can see, when no format argument is supplied, the first thing the import() function does is call the get_ext() function, which retrieves the extension from the filename.
Once the file format is known, the import() function selects the matching format-specific import function (for example, import.delim() for CSV files) and returns its result after passing it through set_class().
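The dispatch logic described above can be sketched in a few lines of base R. This is only an illustration of the idea, not the package's own code: the sketch_import() function and the three formats it handles are hypothetical, and tools::file_ext() stands in for get_ext().

```r
# Hypothetical, simplified re-creation of the dispatch idea in base R.
# sketch_import() is an illustrative name, not a function from the package.
sketch_import <- function(file, format, ...) {
  # Use the explicit format if given; otherwise fall back to the file extension
  fmt <- if (missing(format)) tolower(tools::file_ext(file)) else tolower(format)

  # Pick the reader that matches the detected format
  switch(fmt,
    csv = utils::read.csv(file, ...),
    tsv = utils::read.delim(file, ...),
    rds = readRDS(file),
    stop("Unrecognized file format: ", fmt)
  )
}

# Usage: write a small CSV to a temporary file and read it back
tmp <- tempfile(fileext = ".csv")
utils::write.csv(head(mtcars), tmp, row.names = FALSE)
cars <- sketch_import(tmp)
str(cars)
```

Because the reader is chosen from the extension, the caller never has to remember which reading function belongs to which file type. Passing format explicitly overrides the guess when a file carries a misleading extension.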
Next, we visualize the result with the RStudio viewer. One of the most powerful RStudio tools is the data viewer, which gives you a spreadsheet-like view of your data.frame objects. With RStudio 0.99, this tool became even more powerful: the previous 1000-row limit was removed, and the ability to filter and sort your data was added.
When using this viewer, be aware that filtering and ordering do not affect the original data.frame object you are visualizing.
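If you want the filtered or sorted result to persist, you need to do it in code rather than in the viewer. The following is a minimal sketch in base R, assuming the built-in mtcars data set as a stand-in for the imported data; the mpg threshold is arbitrary and chosen only for illustration.

```r
# The built-in mtcars data set stands in for the imported data
cars <- mtcars

# Filtering and sorting in code returns a new object; `cars` itself is untouched
efficient <- cars[cars$mpg > 25, ]
efficient <- efficient[order(efficient$mpg, decreasing = TRUE), ]

nrow(cars)       # still the full data set
nrow(efficient)  # only the rows that passed the filter
```

In an interactive RStudio session, calling View(efficient) would then open the derived object in the data viewer, while cars remains available unchanged.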