7.1.1 Description
This project’s intent is to inspect raw data to determine whether it is actually cardinal data. In some cases, floating-point values may have been used to represent nominal data; the data appears to be a measurement but is actually a code.
Spreadsheet software tends to transform all data into floating-point numbers; as a result, many data items may look like cardinal data when they are not.
One example is US postal codes, which are strings of digits but may be transformed into numeric values by a spreadsheet, losing any leading zeros in the process.
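As a rough illustration of the postal code problem, the following sketch mimics what happens when a string of digits is parsed as a floating-point number and later written back out as text. The function name `zip_round_trip` and the sample codes are assumptions made for this example; they are not part of the project. The sketch also assumes plain five-digit codes, not the longer ZIP+4 form.

```python
def zip_round_trip(code: str) -> str:
    """Parse a digit string as a float, then format it back as digits.

    This imitates a spreadsheet that stores the value as a number.
    """
    return str(int(float(code)))


# Hypothetical sample values; two of them begin with zeros.
codes = ["02134", "90210", "00501"]

# Leading zeros do not survive the trip through a float.
damaged = [zip_round_trip(c) for c in codes]

# Assuming five-digit codes, the zeros can be restored by padding.
repaired = [d.zfill(5) for d in damaged]

print(damaged)
print(repaired)
```

A column of "numbers" in which some values have fewer digits than the rest is one hint that the field may really be a nominal code that was converted along the way.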
Another example is bank account numbers, which, while very long, can be converted into floating-point numbers. A floating-point value uses 8 bytes of storage and will comfortably represent about 15 decimal digits exactly. While this can be a net saving in storage compared with a string of digits, it is a potential confusion of data types, and there is a (small) possibility that a longer account number will be altered by floating-point rounding.
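The limit of about 15 digits can be checked directly. The sketch below assumes the usual 64-bit floating-point representation that Python uses for `float`; the helper name `survives_float` and the account numbers are made up for the example.

```python
def survives_float(account: str) -> bool:
    """Return True if the digit string is unchanged after a trip through float."""
    return str(int(float(account))) == account


# A 15-digit value fits within the 53 bits of precision of a 64-bit float.
fifteen_digits = "123456789012345"

# 2**53 + 1 is the smallest positive integer a 64-bit float cannot hold exactly.
sixteen_digits = str(2**53 + 1)

print(survives_float(fifteen_digits))
print(survives_float(sixteen_digits))
print(int(float(sixteen_digits)))  # the value after rounding
```

The same helper also reports a change for any account number that begins with a zero, since the leading zero is dropped during conversion. Either kind of change suggests the field should have been kept as a string.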
The user experience is a Jupyter Lab notebook that can be used to examine...