Joining DataFrames
To demonstrate joining, we will use two CSV files: dest.csv and tips.csv. The use case is that we are running a taxi company. Every time a passenger is dropped off at his or her destination, we add a row to the dest.csv file with the employee number of the driver and the destination:
EmpNr,Dest
5,The Hague
3,Amsterdam
9,Rotterdam
Sometimes drivers get a tip, so we want that registered in the tips.csv file (if this doesn't seem realistic, please feel free to come up with your own story):
EmpNr,Amount
5,10
9,5
7,2.5
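The two files can be loaded into DataFrames with read_csv(). The following is a minimal sketch that assumes the file contents shown above; to keep it self-contained it reads the data from in-memory strings instead of files on disk, and the names DEST_CSV, TIPS_CSV, dests, and tips are chosen here for illustration only.

```python
import io

import pandas as pd

# Contents of dest.csv and tips.csv, inlined so the example runs without files.
DEST_CSV = """EmpNr,Dest
5,The Hague
3,Amsterdam
9,Rotterdam
"""

TIPS_CSV = """EmpNr,Amount
5,10
9,5
7,2.5
"""

# read_csv() accepts a file path or any file-like object.
dests = pd.read_csv(io.StringIO(DEST_CSV))
tips = pd.read_csv(io.StringIO(TIPS_CSV))

print(dests)
print(tips)
```

With the real files in the working directory, pd.read_csv('dest.csv') and pd.read_csv('tips.csv') would presumably give the same result.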
Database-like joins in Pandas can be done with either the merge() function or the join() DataFrame method. The join() method joins on indices by default, which might not be what you want. In SQL, a relational database query language, we have the inner join, left outer join, right outer join, and full outer join.
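As a rough sketch of how these join types map onto Pandas, the example below rebuilds the taxi data as small DataFrames (the variable names are illustrative, not part of the original files) and uses the how parameter of merge() to select the join type. It also shows why the index-based default of join() can surprise: rows are paired by position in the index, not by employee number.

```python
import pandas as pd

# Same data as dest.csv and tips.csv, built inline for a self-contained example.
dests = pd.DataFrame({"EmpNr": [5, 3, 9],
                      "Dest": ["The Hague", "Amsterdam", "Rotterdam"]})
tips = pd.DataFrame({"EmpNr": [5, 9, 7],
                     "Amount": [10.0, 5.0, 2.5]})

# merge() matches rows on the EmpNr column; `how` selects the join type.
inner = pd.merge(dests, tips, on="EmpNr", how="inner")  # drivers in both tables
left = pd.merge(dests, tips, on="EmpNr", how="left")    # every row of dests
right = pd.merge(dests, tips, on="EmpNr", how="right")  # every row of tips
outer = pd.merge(dests, tips, on="EmpNr", how="outer")  # rows from either table

# join() aligns on the index by default. Both frames have an EmpNr column,
# so suffixes are required, and rows are paired by index label (0, 1, 2).
by_index = dests.join(tips, lsuffix="Dest", rsuffix="Tips")

# To join on employee number with join(), make EmpNr the index first.
by_key = dests.set_index("EmpNr").join(tips.set_index("EmpNr"), how="inner")

print(inner)
print(outer)
print(by_index)
print(by_key)
```

Rows that have no match in the other table get NaN in the missing columns, which is how the outer variants differ from the inner join.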
Note
An inner join selects rows from two tables, if and only if values match, for columns specified in the join condition. Outer joins do not require a...