Before implementing the logistic regression pipeline, refer back to the table in the Breast cancer dataset at a glance section, which lists nine breast cancer tissue sample characteristics (features) along with one class column. To recap, the nine features are as follows:
- clump_thickness
- size_uniformity
- shape_uniformity
- marginal_adhesion
- epithelial_size
- bare_nucleoli
- bland_chromatin
- normal_nucleoli
- mitoses
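To make the split between the nine features and the class column concrete, here is a minimal sketch, written in Python purely for illustration. The label column name `class`, the helper `split_record`, and the sample values are assumptions made for this example; they are not taken from the dataset or from the pipeline itself.

```python
# The nine feature names, exactly as listed above.
FEATURE_COLUMNS = [
    "clump_thickness",
    "size_uniformity",
    "shape_uniformity",
    "marginal_adhesion",
    "epithelial_size",
    "bare_nucleoli",
    "bland_chromatin",
    "normal_nucleoli",
    "mitoses",
]

# Assumed name for the single class (label) column.
LABEL_COLUMN = "class"


def split_record(record):
    """Split one sample (a dict keyed by column name) into (features, label).

    Hypothetical helper: features come back as a list of floats in the
    order of FEATURE_COLUMNS; the label is returned unchanged.
    """
    missing = [c for c in FEATURE_COLUMNS + [LABEL_COLUMN] if c not in record]
    if missing:
        raise KeyError(f"missing columns: {missing}")
    features = [float(record[c]) for c in FEATURE_COLUMNS]
    return features, record[LABEL_COLUMN]


# Made-up sample for illustration only, not a row from the dataset.
sample = {name: 1 for name in FEATURE_COLUMNS}
sample[LABEL_COLUMN] = 2
features, label = split_record(sample)
```

Keeping the feature names in one ordered list means every later stage that builds a feature vector sees the columns in the same order.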
Next, let's formulate the logistic regression approach at a high level, in terms of what it is meant to achieve. The following diagram represents the elements of such a formulation:
![](https://static.packt-cdn.com/products/9781788624114/graphics/assets/ba37e64b-0234-427d-bfa4-589093753c8a.png)
Breast cancer classification formulation
The preceding diagram represents a high-level formulation of a logistic classifier pipeline that we are aware...