Transforming Data
We have read the data from files and databases. In this section, we will perform some operations to consolidate, filter, aggregate, and transform them. We will start with consolidation operations.
Joining and Concatenating
The web activity dataset from the old system comes from a CSV file and, after column renaming, consists of two data columns: CustomerKey and First_WebActivity_. First_WebActivity_ ranks how active a customer is on the company's web site: 0 means not active at all and 3 means very active.
The web activity dataset from the new web system comes from the SQLite database and consists of three columns: CustomerKey, First_WebActivity_, and Count. Count is just a progressive number associated with the data rows. It is not important for the upcoming analysis. We can decide later whether to remove it or keep it.
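To make the two tables described above more concrete, here is a minimal sketch in plain Python of how they might look and how they could be matched on CustomerKey. It is an illustration only, not the tool or method used in this section. The row values are invented, the output names WebActivity_old and WebActivity_new are hypothetical, and keeping only customers present in both tables (an inner join) is an assumption.

```python
# Illustrative rows only; the values are invented, not taken from the real data.
old_system = [
    {"CustomerKey": 11000, "First_WebActivity_": 0},
    {"CustomerKey": 11001, "First_WebActivity_": 3},
    {"CustomerKey": 11002, "First_WebActivity_": 1},
]
new_system = [
    {"CustomerKey": 11001, "First_WebActivity_": 2, "Count": 1},
    {"CustomerKey": 11002, "First_WebActivity_": 3, "Count": 2},
    {"CustomerKey": 11003, "First_WebActivity_": 0, "Count": 3},
]


def inner_join(left, right, key="CustomerKey"):
    """Keep only customers present in both tables, one output row per match."""
    # Index the right table by key; assumes each key appears once on the right.
    right_by_key = {row[key]: row for row in right}
    joined = []
    for row in left:
        match = right_by_key.get(row[key])
        if match is None:
            continue  # customer missing from the new system: skipped
        joined.append({
            key: row[key],
            "WebActivity_old": row["First_WebActivity_"],    # hypothetical name
            "WebActivity_new": match["First_WebActivity_"],  # hypothetical name
        })
    return joined


combined = inner_join(old_system, new_system)
for row in combined:
    print(row)
```

Because only the key and the two rankings are copied into each output row, the Count column is left out of the result. Copying it across as well would be a one-line change if it turns out to be needed.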
It would be nice to have both rankings for the web activity, from the old and the new system, together in one single data table. For...