Applying our learning
Now that we have learned about the feature engineering components of the DI platform, let’s put them into action and extend the example project datasets with new features that will enhance the data science projects.
Technical requirements
Here are the technical requirements for completing the hands-on examples in this chapter:
- The Streaming Transactions project requires more compute power than a single-node cluster provides. We created a multi-node cluster to address this. See Figure 5.9 for the multi-node CPU configuration we used.
Figure 5.9 – Multi-node CPU cluster configuration (on AWS) used for this book
- We will use managed volumes to store cleaned, featurized data.
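As a rough illustration of the managed volume requirement, the following minimal sketch builds the SQL statement that creates a Unity Catalog managed volume and the corresponding `/Volumes/...` path for files. The catalog, schema, and volume names are hypothetical placeholders rather than the names used in the project, and the two helper functions are our own for illustration. The `spark.sql` call is left as a comment because it only runs on a Databricks cluster.

```python
def managed_volume_sql(catalog: str, schema: str, volume: str) -> str:
    """Build a CREATE VOLUME statement.

    Omitting a LOCATION clause makes the volume managed rather than external.
    """
    return f"CREATE VOLUME IF NOT EXISTS {catalog}.{schema}.{volume}"


def volume_path(catalog: str, schema: str, volume: str, *parts: str) -> str:
    """Build the /Volumes/... path used to read and write files in a volume."""
    return "/".join(["/Volumes", catalog, schema, volume, *parts])


# Hypothetical names, for illustration only (not taken from the project).
CATALOG, SCHEMA, VOLUME = "ml_in_action", "streaming_transactions", "features"

create_stmt = managed_volume_sql(CATALOG, SCHEMA, VOLUME)
features_dir = volume_path(CATALOG, SCHEMA, VOLUME, "cleaned")

# In a Databricks notebook, where `spark` is predefined, you would run:
# spark.sql(create_stmt)
print(create_stmt)
print(features_dir)
```

Building the path in one place keeps later reads and writes of the cleaned, featurized data pointing at the same volume.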
Project – Streaming Transactions
If you’ve been following the code in the previous chapters, at this point, you have the streaming data you need. In this chapter, we will augment that...