Predicting with Tabular Data
Most of the available data that can be easily found is not composed of images or text documents, but it is instead made of relational tables, each one possibly containing numbers, dates, and short text, which can be all joined together. This is because of the widespread adoption of database applications based on the relational paradigm (data tables that can be combined together by the values of certain columns that act as joining keys). These tables are the main source of tabular data nowadays and because of that, there are certain challenges.
Here are the challenges commonly faced by Deep Neural Networks (DNNs) when applied to tabular data:
- Mixed features data types
- Data in a sparse format (there are more zeros than non-zero data), which is not the best for a DNN converging to an optimum solution
- No state-of-the-art architecture has emerged yet, there are just some various best practices
- Less data is available...