Comparing data frames for identity
In this recipe, we show you how to check if two data frames are identical and if they contain unique rows.
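As a rough preview of the two kinds of checks mentioned above, here is a minimal sketch. It assumes DataFrames.jl is already installed and uses two small hypothetical data frames (left and right, invented for illustration) rather than the grades data.

```julia
using DataFrames

# Hypothetical toy data, not the grades dataset
left = DataFrame(x = [1, 2, 2], y = ["a", "b", "b"])
right = copy(left)

# Identity checks: same column names and equal values
same_values = left == right           # value comparison
same_isequal = isequal(left, right)   # like ==, but treats missing values as equal
same_object = left === right          # false here, because copy creates a new object

# Uniqueness checks: flag and drop duplicated rows
dup_flags = nonunique(left)           # Bool vector, true for rows that repeat an earlier row
deduplicated = unique(left)           # data frame with duplicate rows removed

println(same_values, " ", same_isequal, " ", same_object)
println(dup_flags)
println(nrow(deduplicated))
```

In this toy example the third row repeats the second, so it is the only row flagged by nonunique, and unique returns two rows.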
Getting ready
In this recipe, we will use the grades dataset, which we have already employed in the Working with categorical data recipe.
Make sure you have the CSV.jl and DataFrames.jl packages installed. If they are missing, add them using the following commands:
julia> using Pkg
julia> Pkg.add("DataFrames")
julia> Pkg.add("CSV")
Before we begin, start the Julia command line and load the grades.csv file into a data frame, using the following commands:
julia> using CSV, DataFrames
julia> df1 = CSV.read("grades.csv")
99×6 DataFrame
│ Row │ Prefix │ Assignment │ Tutorial │ Midterm │ TakeHome │ Final   │
│     │ Int64  │ Float64    │ Float64  │ Float64 │ Float64  │ Float64 │
│-----│--------│------------│----------│---------│----------│---------│
│ 1   │ 5      │ 57.14      │ 34.09    │ 64.38   │ 51.48    │ 52.5    │
│ 2   │ 8      │ 95.05      │ 105.49   │ 67.5    │ 99.07    │ 68.33   │
│ 3   │ 8      │ 83.7...