There are many ethical implications and risks when manipulating data that you need to know. We live in a world where most deep learning algorithms will have to be corrected, by re-training them, because it was found that they were biased or unfair. That is very unfortunate; you want to be a person who exercises responsible AI and produces carefully thought out models.
When manipulating data, be careful about removing outliers from the data just because you think they are decreasing your model's performance. Sometimes, outliers represent information about protected groups or minorities, and removing those perpetuates unfairness and introduces bias toward the majority groups. Avoid removing outliers unless you are absolutely sure that they are errors caused by faulty sensors or human error.
Be careful of the way you transform the distribution of the data. Altering the distribution is fine in most cases, but if you are dealing with demographic...