Detecting missing values
Missing values reduce the representativeness of the sample and can distort inferences about the population. This recipe focuses on detecting missing values in the airquality dataset.
Getting ready
You need to have completed the previous recipes, which involve converting Month into a factor type.
Note
In R, a missing value is denoted by the symbol NA (not available), and an impossible value, such as the result of 0/0, is denoted by NaN (not a number).
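As a minimal sketch of the distinction in the note above, the following snippet builds a small vector (the name x and its contents are made up for illustration) and compares how is.na and is.nan treat the two symbols:

```r
# Illustrative vector: one ordinary value, one missing value, one impossible value
x <- c(1, NA, 0 / 0)    # 0 / 0 evaluates to NaN

na_flags  <- is.na(x)   # flags both NA and NaN
nan_flags <- is.nan(x)  # flags only NaN

print(na_flags)
print(nan_flags)
```

Because is.na also returns TRUE for NaN, it is usually the safer choice when the goal is to find every unusable value in a column.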
How to do it...
Perform the following steps to detect missing values:
- The is.na function returns a logical vector that is TRUE at every index where the attribute contains an NA value. Here, we apply it to the Ozone attribute first:
> is.na(mydata$Ozone)
Output
 [1] FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE
[14] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE TRUE
[27] TRUE FALSE FALSE FALSE FALSE TRUE TRUE TRUE TRUE TRUE TRUE FALSE TRUE...
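A long logical vector is hard to read by eye, so it can help to summarize it. The sketch below is one possible way to do that, not necessarily the approach the recipe goes on to take. It assumes mydata is a copy of the built-in airquality dataset with Month converted to a factor, as described in Getting ready; the variable names ozone_missing, n_missing, missing_idx, and prop_missing are illustrative.

```r
# Assumption: mydata mirrors the setup from the previous recipes
mydata <- airquality
mydata$Month <- factor(mydata$Month)

# Logical vector: TRUE where Ozone is missing
ozone_missing <- is.na(mydata$Ozone)

# TRUE counts as 1, so sum() gives the number of missing values
n_missing <- sum(ozone_missing)

# which() converts the logical vector into the row positions of the NAs
missing_idx <- which(ozone_missing)

# mean() of a logical vector gives the proportion of missing values
prop_missing <- mean(ozone_missing)

print(n_missing)
print(head(missing_idx))
print(round(prop_missing, 3))
```

The first positions returned by which() should match the TRUE entries in the output above (for example, rows 5 and 10).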