Business case
For this chapter, we will stick with cancer—prostate cancer in this case. It is a small dataset of 97 observations and nine variables but allows you to fully grasp what is going on with regularization techniques by allowing a comparison with the traditional techniques. We will start by performing best subsets regression to identify the features and use this as a baseline for the comparison.
Business understanding
The Stanford University Medical Center has provided the preoperative Prostate Specific Antigen (PSA) data on 97 patients who are about to undergo radical prostatectomy (complete prostate removal) for the treatment of prostate cancer. The American Cancer Society (ACS) estimates that nearly 30,000 American men died of prostate cancer in 2014 (http://www.cancer.org/). PSA is a protein that is produced by the prostate gland and is found in the bloodstream. The goal is to develop a predictive model of PSA among the provided set of clinical measures. PSA can be an...