Converting tabular data to a TensorFlow dataset
Tabular data, such as comma-separated values (CSV) files, with a fixed schema and fixed data types is commonly encountered. We typically load it into a pandas DataFrame. We saw in the previous chapter how easily this can be done when the data is hosted in a BigQuery table: the BigQuery magic command returns a query result as a pandas DataFrame by default.
Let's take a look at how to handle data that fits into memory. In this example, we are going to read a public dataset using the BigQuery magic command, so we can easily obtain the data in a pandas DataFrame. Then we are going to convert it to a TensorFlow dataset. A TensorFlow dataset (tf.data.Dataset) is the data structure for streaming training data in batches, so that the training process consumes one batch at a time instead of using up the compute node's runtime memory.
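Before turning to BigQuery, the conversion itself can be illustrated with a minimal sketch. It assumes the data is already in a pandas DataFrame; the small DataFrame and its column names (age, fare, survived) are hypothetical stand-ins for a query result, not the dataset used in this chapter.

```python
import pandas as pd
import tensorflow as tf

# Hypothetical stand-in for a DataFrame returned by a BigQuery query.
df = pd.DataFrame({
    "age": [22.0, 38.0, 26.0, 35.0],
    "fare": [7.25, 71.28, 7.92, 53.10],
    "survived": [0, 1, 1, 1],
})

# Separate the label column (assumed here to be "survived") from the features.
features = df.copy()
labels = features.pop("survived")

# Build a dataset of (feature dictionary, label) pairs from the in-memory columns.
feature_arrays = {name: column.to_numpy() for name, column in features.items()}
dataset = tf.data.Dataset.from_tensor_slices((feature_arrays, labels.to_numpy()))

# Shuffle the rows, then group them into batches of two.
dataset = dataset.shuffle(buffer_size=len(df)).batch(2)

# Inspect the first batch.
for feature_batch, label_batch in dataset.take(1):
    print(feature_batch["age"].numpy(), label_batch.numpy())
```

Each element of the resulting dataset is a batch: a dictionary that maps column names to tensors, paired with a tensor of labels. Because the rows are shuffled, the values printed for the first batch vary from run to run.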
Converting a BigQuery table to a TensorFlow dataset
Each of the following steps is executed in a cell. Again, use any of the AI platforms you prefer (AI Notebook, Deep Learning...