Understanding the differences between concat, join, and merge
The merge and join DataFrame methods (Series has neither) and the concat function all provide similar functionality for combining multiple pandas objects. Because they are so similar and can replicate one another in certain situations, it can be confusing to know when and how to use each one correctly. To help clarify their differences, take a look at the following outline:
concat
- pandas function
- Combines two or more pandas objects vertically or horizontally
- Aligns only on the index
- Typically raises an error when the index it has to align on contains duplicate labels
- Defaults to outer join with option for inner
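The concat behavior outlined above can be sketched as follows. This is a minimal illustration, not a recipe from the text: the frames `left` and `right` and their values are hypothetical and exist only to show alignment on the index and the outer/inner options.

```python
import pandas as pd

# Hypothetical toy frames; names and values are illustrative only.
left = pd.DataFrame({"a": [1, 2]}, index=["x", "y"])
right = pd.DataFrame({"b": [3, 4]}, index=["y", "z"])

# Horizontal concat aligns on the row index; the default is an outer join,
# so every label from either frame is kept and gaps become NaN.
outer = pd.concat([left, right], axis=1)

# join="inner" keeps only the labels present in every object.
inner = pd.concat([left, right], axis=1, join="inner")

# Vertical concat (the default axis) stacks rows and aligns on column labels.
stacked = pd.concat([left, right])

print(outer)
print(inner)
print(stacked)
```

Note that the vertical result keeps the repeated label `y` in its row index, because rows are stacked rather than aligned in that direction.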
join
- DataFrame method
- Combines two or more pandas objects horizontally
- Aligns the calling DataFrame's column(s) or index with the other objects' index (and not the columns)
- Handles duplicate values on the joining columns/index by performing a Cartesian product of the matching rows
- Defaults to left join with options for inner, outer, and right
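A small sketch of the join behavior listed above, assuming two hypothetical frames (`left` with a `key` column, `right` indexed by the same kind of key). The duplicated `k1` label in `right` is there deliberately, to show how matching rows multiply.

```python
import pandas as pd

# Hypothetical toy frames; names and values are illustrative only.
left = pd.DataFrame({"key": ["k1", "k2", "k3"], "a": [1, 2, 3]})
right = pd.DataFrame({"b": [10, 20, 30]}, index=["k1", "k1", "k2"])

# Default left join: the calling frame's "key" column is aligned with the
# other frame's index. "k1" matches two rows in right, so it appears twice;
# "k3" has no match and gets NaN.
joined = left.join(right, on="key")

# how="inner" drops the unmatched "k3" row.
inner = left.join(right, on="key", how="inner")

# Index-on-index join: move "key" into the index of the calling frame first.
by_index = left.set_index("key").join(right)

print(joined)
print(inner)
print(by_index)
```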
merge
- DataFrame method
- Combines exactly two...