The first chapter of this book is dedicated to a very important (if not the most important) part of any data science or quantitative finance project: gathering and working with data. In line with the "garbage in, garbage out" maxim, we should strive to obtain data of the highest possible quality and preprocess it correctly for later use with statistical and machine learning algorithms. The reason is simple: the results of our analyses depend heavily on the input data, and no model, however sophisticated, can compensate for poor-quality input.
In this chapter, we cover the entire process of gathering financial data and preprocessing it into the form most commonly used in real-life projects. We begin by presenting a few possible sources of high-quality data, then show how to convert prices into returns (which have properties desired...