Models for count data
Logistic regression can handle only binary responses. If you have count data, such as the number of deaths or failures in a given period of time or in a given geographical area, you can use Poisson or negative binomial regression. Such data are particularly common when working with aggregated datasets, which are provided as the number of events classified into different categories.
Poisson regression
Poisson regression models are generalized linear models with the logarithm as the link function, and they assume that the response follows a Poisson distribution. The Poisson distribution takes only non-negative integer values. It is appropriate for count data, such as the number of events occurring over a fixed period of time, particularly when the events are rather rare, such as the number of hard drive failures per day.
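To make the log link concrete, here is a minimal sketch of a Poisson regression with a single predictor, fitted by Newton-Raphson on the log-likelihood. It is written in plain Python with only the standard library. The data and the function name fit_poisson are made up for illustration and are not taken from the hard drive dataset or from any particular library.

```python
import math

# Made-up illustration data (NOT from the hard drive dataset):
# x is a hypothetical predictor, y is the observed event count.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [1, 1, 2, 4, 6, 11]


def fit_poisson(x, y, iterations=25):
    """Fit log(E[y]) = b0 + b1 * x by Newton-Raphson (hypothetical helper)."""
    # Start from an intercept-only model: the log of the mean count.
    b0 = math.log(sum(y) / len(y))
    b1 = 0.0
    for _ in range(iterations):
        # Log link: the expected count is the exponential of the linear predictor.
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector (gradient of the Poisson log-likelihood).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Information matrix entries; for Poisson the variance equals the mean.
        h00 = sum(mu)
        h01 = sum(xi * mi for xi, mi in zip(x, mu))
        h11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system by hand.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


b0, b1 = fit_poisson(x, y)
fitted = [math.exp(b0 + b1 * xi) for xi in x]
print("intercept:", b0, "slope:", b1)
print("multiplicative change in expected count per unit of x:", math.exp(b1))
```

Because the model is linear on the log scale, the exponentiated slope reads as a rate ratio: each one-unit increase in x multiplies the expected count by exp(b1). At the maximum likelihood estimate, the fitted counts also sum to the observed total.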
In the following example, we will use the Hard Drive Data Sets for the year of 2013. The dataset was downloaded from https://docs.backblaze.com/public/hard-drive-data/2013_data.zip, but we polished...