Additional considerations
There are other services offered by Google that we wish to highlight:
- Dataprep: This is a web application that allows us to define preparation rules for our data by interacting with a sample of the data. Like many of the other services we have discussed, Dataprep is serverless, meaning there is no infrastructure to deploy or manage upfront. By default, Dataprep jobs are executed on a Dataflow pipeline. Refer to https://cloud.google.com/dataprep/ for more information.
- Datalab: This is an interactive data analysis and machine learning environment built on Jupyter (formerly IPython), an open source web application. We can use it to visualize and explore data interactively using Python and SQL. It belongs to the data usage stage of our end-to-end solution and works with data passed from BigQuery. Datalab itself is free of charge, but it runs on Compute Engine instances, so charges for those resources apply. For more information...
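To make the "Python and SQL" style of exploration described for Datalab more concrete, here is a minimal sketch of what a notebook cell might do: SQL handles the aggregation in BigQuery, and Python post-processes the small result. Several things here are assumptions for illustration and not from the text above: the public sample table `bigquery-public-data.samples.shakespeare`, the helper names `build_top_words_query`, `run_query`, and `add_share`, and the use of the general-purpose `google-cloud-bigquery` client package instead of any Datalab-specific helpers.

```python
"""Sketch: SQL aggregates in BigQuery, Python explores the result."""

# Assumption: a public sample table, used purely for illustration.
SAMPLE_TABLE = "bigquery-public-data.samples.shakespeare"


def build_top_words_query(limit=10, table=SAMPLE_TABLE):
    """Return standard SQL that ranks words by total occurrences."""
    return (
        "SELECT word, SUM(word_count) AS total "
        f"FROM `{table}` "
        "GROUP BY word "
        "ORDER BY total DESC "
        f"LIMIT {int(limit)}"
    )


def run_query(sql, client=None):
    """Run the SQL and return the rows as plain dictionaries.

    If no client is passed, one is created with the google-cloud-bigquery
    package, which is assumed to be installed and authenticated.
    """
    if client is None:
        from google.cloud import bigquery  # imported lazily on purpose
        client = bigquery.Client()
    return [dict(row.items()) for row in client.query(sql).result()]


def add_share(rows):
    """Python-side step: add each row's share of the returned total."""
    grand_total = sum(r["total"] for r in rows)
    return [
        dict(r, share=(r["total"] / grand_total if grand_total else 0.0))
        for r in rows
    ]
```

In a notebook, one might call `add_share(run_query(build_top_words_query(5)))` and pass the result to a plotting library. Keeping the client as an optional argument is a design choice made here so that the Python-side logic can be tried without a cloud project.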