Analyzing data
In practical data analysis, most of the time is spent on data cleansing, that is, filtering and transforming the original (raw) data into a form that is easier to analyze. This filtering and transforming process is also called data manipulation. We will dedicate an entire chapter to this topic.
In this section, we assume that the data is ready for analysis. We won't go deep into the models, but will apply a few simple ones to give you an impression of how to fit a model to data, how to interact with a fitted model, and how to apply a fitted model to make predictions.
Fitting a linear model
The simplest model in R is the linear model; that is, we use a linear function to describe the relationship between two random variables under a certain set of assumptions. In the following example, we will create a linear function that maps x to 3 + 2 * x. Then we generate a normally distributed random numeric vector x, and generate y by f(x) plus some independent noise:
f <- function...
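As a rough illustration of the steps described in this section, the following is a minimal sketch, not the original listing. The seed, the sample size of 100, and the noise standard deviation of 0.5 are assumptions chosen only for this example; the function f follows the description above (it maps x to 3 + 2 * x). The sketch generates the data, fits a linear model with lm(), inspects the estimated coefficients, and predicts y for a few new values of x.

```r
# Assumed values for illustration only (not taken from the text)
set.seed(42)   # makes the random numbers reproducible
n <- 100       # sample size

# Linear function as described: maps x to 3 + 2 * x
f <- function(x) 3 + 2 * x

# Normally distributed x, and y = f(x) plus independent normal noise
x <- rnorm(n)
y <- f(x) + 0.5 * rnorm(n)   # 0.5 is an assumed noise scale

# Fit a linear model of y on x
model <- lm(y ~ x)

# Interact with the fitted model: estimated intercept and slope
estimates <- coef(model)
print(estimates)

# Apply the fitted model to new values of x
new_data <- data.frame(x = c(-1, 0, 1))
predictions <- predict(model, newdata = new_data)
print(predictions)
```

If the model fits well, the estimated intercept and slope should be close to 3 and 2, the values used to generate the data, and the predictions should be close to f applied to the new values of x.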