Dataset
Building a data pipeline is as important as designing the architecture of your network, especially when you train your network in real time. The data that you get from the wild is never going to be clean, and you'll have to process it before throwing it at your network. For example, if we were to collect data for predicting whether a person buys a product or not, we would end up with outliers. Outliers can be of any kind, and they are unpredictable. Somebody could have placed an order accidentally, for example, or they could have given account access to a friend who then placed the order, and so on.
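To make the outlier problem concrete, here is a minimal sketch of one common cleaning step: dropping values that fall outside the interquartile range (IQR) fences. The function name `drop_outliers_iqr`, the multiplier `k=1.5`, and the order totals are all hypothetical choices for illustration. An IQR filter is only one of many possible approaches and may not suit every dataset.

```python
import statistics


def drop_outliers_iqr(values, k=1.5):
    """Keep only values inside [Q1 - k*IQR, Q3 + k*IQR].

    k=1.5 is a conventional default, assumed here for illustration.
    """
    # quantiles(n=4) returns the three quartile cut points: Q1, median, Q3
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


# Hypothetical order totals; 950.0 stands in for an accidental order
order_totals = [20.0, 22.5, 19.0, 21.0, 23.0, 20.5, 950.0]
cleaned = drop_outliers_iqr(order_totals)
print(cleaned)
```

Whether an extreme value is a mistake or a genuine signal depends on the problem, so a filter like this is best treated as a starting point rather than a rule.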
In theory, deep neural networks are well suited to finding patterns in your dataset because they are loosely modeled on the human brain. In practice, however, this is often not quite the case. Your network will find the pattern and solve the problem much more easily if your data is clean and properly formatted. PyTorch gives data preprocessing wrappers out of the box...
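As an illustration of feeding clean, properly formatted data to a network, the following is a minimal sketch that assumes the `Dataset` and `DataLoader` classes from `torch.utils.data` are the kind of wrappers meant here. The `PurchaseDataset` class, the two feature columns, and all values are hypothetical. The sketch standardizes each feature column once, then serves shuffled mini-batches.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class PurchaseDataset(Dataset):
    """Hypothetical dataset: numeric features and a bought / not-bought label."""

    def __init__(self, features, labels):
        x = torch.tensor(features, dtype=torch.float32)
        # Standardize each column to zero mean and unit variance;
        # the small epsilon guards against division by zero
        self.features = (x - x.mean(dim=0)) / (x.std(dim=0) + 1e-8)
        self.labels = torch.tensor(labels, dtype=torch.float32)

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        return self.features[idx], self.labels[idx]


# Made-up rows: [age, items viewed]; label 1.0 means the person bought
features = [[25.0, 3.0], [41.0, 1.0], [33.0, 7.0], [58.0, 2.0]]
labels = [1.0, 0.0, 1.0, 0.0]

dataset = PurchaseDataset(features, labels)
loader = DataLoader(dataset, batch_size=2, shuffle=True)

# Each iteration yields one mini-batch of (features, labels)
batch_shapes = [(tuple(x.shape), tuple(y.shape)) for x, y in loader]
print(batch_shapes)
```

Keeping the preprocessing inside the dataset class means the training loop only ever sees normalized tensors, which keeps the pipeline and the model code separate.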