Chapter 6. SciPy for Data Mining
This chapter covers those branches of mathematics and statistics that treat the collection, organization, analysis, and interpretation of data. The relevant applications and operations are spread over several modules and submodules: scipy.stats (for purely statistical tools), scipy.ndimage.measurements (for analysis and organization of data), scipy.spatial (for spatial algorithms and data structures), and finally the clustering package scipy.cluster. The scipy.cluster clustering package consists of two submodules: scipy.cluster.vq (vector quantization) and scipy.cluster.hierarchy (for hierarchical and agglomerative clustering).
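As a quick orientation, the following minimal sketch touches each of the modules named above once. The sample data, the tiny binary image, and all variable names are made up for illustration. The measurement functions are also exposed directly in scipy.ndimage, which is how label is called here.

```python
import numpy as np
from scipy import ndimage, spatial, stats
from scipy.cluster import hierarchy, vq

# Illustrative data: two well-separated 2-D blobs of 50 points each.
rng = np.random.default_rng(0)
points = np.vstack([
    rng.normal(0.0, 0.5, size=(50, 2)),
    rng.normal(5.0, 0.5, size=(50, 2)),
])

# scipy.stats: descriptive statistics of the first coordinate.
summary = stats.describe(points[:, 0])

# scipy.ndimage: label the connected regions of a small binary image.
image = np.array([[1, 1, 0, 0],
                  [0, 0, 0, 1],
                  [0, 0, 1, 1]])
labels, n_regions = ndimage.label(image)

# scipy.spatial: nearest neighbor of the origin through a k-d tree.
tree = spatial.KDTree(points)
distance, index = tree.query([0.0, 0.0])

# scipy.cluster.vq: k-means, seeded with one point from each blob
# so that the result does not depend on random initialization.
codebook, distortion = vq.kmeans(points, points[[0, 50]])
codes, _ = vq.vq(points, codebook)

# scipy.cluster.hierarchy: Ward linkage, cut into two flat clusters.
Z = hierarchy.linkage(points, method="ward")
flat = hierarchy.fcluster(Z, t=2, criterion="maxclust")

print(summary.nobs, n_regions, index, codebook.shape, set(flat))
```

Each call above is only the entry point of its module. The arguments chosen here (the initial centroids for kmeans, the "ward" method, and the "maxclust" criterion) are one reasonable choice among several rather than a recommendation.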
As in the previous chapters, fluency with the subject matter is assumed. Our emphasis is on showing you some of the SciPy functions available for statistical computations, not on teaching the subject itself. Accordingly, you are welcome to read this chapter alongside your preferred book(s) on the subject so that you can fully explore...