Importing R data
We will use pyreadr to read an R data file into pandas. Since pyreadr cannot capture the metadata, we will write code to reconstruct value labels (analogous to R factors) and column headings. This is similar to what we did in the Importing data from SQL databases recipe.
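As a rough illustration of what reconstructing value labels and column headings can look like, here is a minimal sketch that uses pandas only. The DataFrame, the column names, the label mappings, and the apply_metadata helper are all hypothetical and stand in for whatever a real R file and its codebook would provide; this is not the code used later in the recipe.

```python
import pandas as pd

# Hypothetical stand-in for a frame loaded from an R file, for example
# with pyreadr.read_r("data.rds")[None]. In this sketch the columns have
# generic names and the categorical values are bare numeric codes.
df = pd.DataFrame({"V1": [1, 2, 2, 3], "V2": [34, 51, 28, 45]})

# Assumed metadata, typically supplied separately (a codebook, for instance).
column_names = {"V1": "maritalstatus", "V2": "age"}
value_labels = {
    "maritalstatus": {1: "Never married", 2: "Married", 3: "Divorced"}
}


def apply_metadata(frame, names, labels):
    """Return a copy with readable column names and labeled categories."""
    out = frame.rename(columns=names)
    for col, mapping in labels.items():
        # Map each code to its label, then store the result as a pandas
        # categorical, the closest analogue to an R factor.
        out[col] = out[col].map(mapping).astype("category")
    return out


labeled = apply_metadata(df, column_names, value_labels)
print(labeled.dtypes)
print(labeled["maritalstatus"].value_counts())
```

Codes that are missing from a mapping become NaN under this approach, so it is worth checking value counts before and after applying the labels.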
The R statistical package is, in many ways, similar to the combination of Python and pandas, at least in its scope. Both have strong tools across a range of data preparation and data analysis tasks. Some data scientists work with both R and Python, perhaps doing data manipulation in Python and statistical analysis in R, or vice versa, depending on their preferred packages. But there is currently a scarcity of tools for reading data saved in R, as rds or rdata files, into Python. The analyst often saves the data as a CSV file first, and then loads the CSV file into Python. We will use pyreadr, from the same author as pyreadstat, because it does not require an installation of R.
When we receive...