In Summary
Spreadsheets can be understood at two levels: external and internal. Understanding the internal level is necessary to grasp what kinds of data analysis are possible.
Three special characters are found at the internal level: xlstab, linefeed, and eold. The spreadsheet as defined by the user is translated into a string of text that contains the spreadsheet's content.
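As a rough illustration of a spreadsheet becoming one delimited string, here is a minimal Python sketch. The roles given to the three characters are assumptions made for the example, not definitions taken from the text: xlstab is treated as the separator between cells, linefeed as the end of a row, and eold as the marker that closes the data. The byte values and the function names `to_internal` and `from_internal` are hypothetical.

```python
# Assumed stand-ins for the three special characters. Their real codes and
# exact roles are not specified here; these values are illustrative only.
XLSTAB = "\t"     # assumed: separates one cell from the next within a row
LINEFEED = "\n"   # assumed: ends each row
EOLD = "\x1a"     # assumed: marks the end of the spreadsheet data


def to_internal(rows):
    """Flatten a grid of cells into a single delimited string."""
    body = "".join(
        XLSTAB.join(str(cell) for cell in row) + LINEFEED for row in rows
    )
    return body + EOLD


def from_internal(text):
    """Recover the grid of cell strings from the delimited string."""
    data = text.split(EOLD, 1)[0]      # ignore anything after the end marker
    lines = data.split(LINEFEED)
    if lines and lines[-1] == "":      # drop the empty piece after the last linefeed
        lines.pop()
    return [line.split(XLSTAB) for line in lines]


grid = [["Region", "Q1"], ["East", 100]]
internal = to_internal(grid)
restored = from_internal(internal)
```

Note that every value comes back as text after the round trip, so the number 100 is restored as the string "100". Recovering types would need a separate step.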
Spreadsheets can and should be formatted by the end user in a “standard” format. Once a spreadsheet is in that format, it can be converted into a corporate database.
For a spreadsheet to be converted into a corporate database, each value must have context. Context is determined by the intersection of the column name and the row identifier.
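The idea of context can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the grid is a list of rows, the position of the column-name row is supplied by the caller, and the first column holds the row identifier. The function name `values_with_context` and the record keys are hypothetical.

```python
def values_with_context(grid, header_row=0, id_col=0):
    """Pair each value with its column name and row identifier.

    Assumes `header_row` is the index of the row holding column names and
    `id_col` is the index of the column holding row identifiers.
    """
    header = grid[header_row]
    records = []
    for row in grid[header_row + 1:]:
        row_id = row[id_col]
        for col_idx, (column_name, value) in enumerate(zip(header, row)):
            if col_idx == id_col:
                continue  # the identifier supplies context; it is not a value
            records.append(
                {"row_id": row_id, "column": column_name, "value": value}
            )
    return records


grid = [
    ["Region", "Q1", "Q2"],
    ["East", 100, 120],
    ["West", 90, 95],
]
records = values_with_context(grid)
```

Each record now stands on its own: the value 120 is no longer just a number in a cell but the Q2 figure for East, which is the form a database row needs.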
The row containing column names is identified manually by the end user and placed in a database table called the ssdef table. A spreadsheet can have multiple column...