Data understanding, preparation, and recommendations
The one library that we will need for this exercise is recommenderlab. The package was developed by Southern Methodist University's Lyle Engineering Lab, which maintains an excellent website with supporting documentation at https://lyle.smu.edu/IDA/recommenderlab/:
> library(recommenderlab)
> data(Jester5k)
> Jester5k
5000 x 100 rating matrix of class 'realRatingMatrix' with 362106 ratings.
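The printed dimensions and rating count also tell us how densely the matrix is filled. The following is a minimal sketch of that arithmetic in base R; the variable names are hypothetical and the three numbers are copied from the output above rather than read from the Jester5k object:

```r
# Figures taken from the printed summary of Jester5k above.
n_users   <- 5000
n_jokes   <- 100
n_ratings <- 362106

# Share of user/joke cells that actually hold a rating.
density <- n_ratings / (n_users * n_jokes)

# Average number of jokes rated per user.
ratings_per_user <- n_ratings / n_users

print(round(density, 3))
print(round(ratings_per_user, 1))
```

Under these figures, roughly 72 percent of the possible ratings are present, which is about 72 jokes per user on average.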
The rating matrix contains 362106 total ratings. It is quite easy to get a list of a user's ratings. Let's look at user number 10. The following output is abbreviated for the first five jokes:
> as(Jester5k[10,], "list")
$u12843
   j1    j2    j3    j4    j5 ...
-1.99 -6.89  2.09 -4.42 -4.90 ...
You can also look at the mean rating for a user (user 10) and/or the mean rating for a specific joke (joke 1), as follows:
> rowMeans(Jester5k[10,])
u12843
  -1.6
> colMeans(Jester5k[,1])
  j1
0.92
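To make the idea of a per-user and per-joke mean concrete, here is a minimal sketch in base R on a small, made-up matrix. The data, the object names, and the use of NA for an unrated joke are all assumptions for illustration. It also assumes that the means should be taken over rated jokes only, which is our understanding of how a rating matrix is meant to be summarized, and it does not use recommenderlab itself:

```r
# Hypothetical toy ratings: 3 users x 4 jokes, NA marks an unrated joke.
ratings <- matrix(
  c( 2.0,   NA, -4.0, 1.0,
      NA,  3.5,   NA, 0.5,
    -1.0, -2.0,  6.0,  NA),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("u1", "u2", "u3"), c("j1", "j2", "j3", "j4"))
)

# Mean rating per user, ignoring the jokes that user did not rate.
user_means <- rowMeans(ratings, na.rm = TRUE)

# Mean rating per joke, ignoring the users who did not rate it.
joke_means <- colMeans(ratings, na.rm = TRUE)

# Number of ratings each user supplied.
ratings_per_user <- rowSums(!is.na(ratings))

print(user_means)
print(joke_means)
print(ratings_per_user)
```

Dropping the missing entries matters here: treating an unrated joke as a rating of zero would pull every mean toward zero and blur the difference between a user who dislikes most jokes and one who simply rated few of them.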
One method to get a better understanding...