Search icon CANCEL
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Conferences
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
R Data Analysis Projects

You're reading from   R Data Analysis Projects Build end to end analytics systems to get deeper insights from your data

Arrow left icon
Product type Paperback
Published in Nov 2017
Publisher Packt
ISBN-13 9781788621878
Length 366 pages
Edition 1st Edition
Languages
Tools
Arrow right icon
Author (1):
Arrow left icon
Gopi Subramanian Gopi Subramanian
Author Profile Icon Gopi Subramanian
Gopi Subramanian
Arrow right icon
View More author details
Toc

Table of Contents (9) Chapters Close

Preface 1. Association Rule Mining FREE CHAPTER 2. Fuzzy Logic Induced Content-Based Recommendation 3. Collaborative Filtering 4. Taming Time Series Data Using Deep Neural Networks 5. Twitter Text Sentiment Classification Using Kernel Density Estimates 6. Record Linkage - Stochastic and Machine Learning Approaches 7. Streaming Data Clustering Analysis in R 8. Analyze and Understand Networks Using R

Wrapping up

The final step in any data analysis project is documentation—either generating a report of the findings or documenting the scripts and data used. In our case, we are going to wrap up with a small application. We will use RShiny, an R web application framework. RShiny is a powerful framework for developing interactive web applications using R. We will leverage the code that we have written to generate a simple, yet powerful, user interface for our retail customers.

To keep things simple, we have a set of three screens. The first screen, as shown in the following screenshot, allows the user to vary support and confidence thresholds and view the rules generated. It also has additional interest measures, lift, conviction, and leverage. The user can sort the rule by any of these interest measures:

Another screen is a scatter plot representation of the rules:

Finally, a graph representation to view the product grouping for the easy selection of products for the cross-selling campaign is as follows:

The complete source code is available in <../App.R>:

########################################################################
#
# R Data Analysis Projects
#
# Chapter 1
#
# Building Recommender System
# A step step approach to build Association Rule Mining
#
# Script:
#
# Rshiny app
#
# Gopi Subramanian
#########################################################################
library(shiny)
library(plotly)
library(arules)
library(igraph)
library(arulesViz)
get.txn <- function(data.path, columns){
# Get transaction object for a given data file
#
# Args:
# data.path: data file name location
# columns: transaction id and item id columns.
#
# Returns:
# transaction object
transactions.obj <- read.transactions(file = data.path, format = "single",
sep = ",",
cols = columns,
rm.duplicates = FALSE,
quote = "", skip = 0,
encoding = "unknown")
return(transactions.obj)
}
get.rules <- function(support, confidence, transactions){
# Get Apriori rules for given support and confidence values
#
# Args:
# support: support parameter
# confidence: confidence parameter
#
# Returns:
# rules object
parameters = list(
support = support,
confidence = confidence,
minlen = 2, # Minimal number of items per item set
maxlen = 10, # Maximal number of items per item set
target = "rules"

)

rules <- apriori(transactions, parameter = parameters)
return(rules)
}
find.rules <- function(transactions, support, confidence, topN = 10){
# Generate and prune the rules for given support confidence value
#
# Args:
# transactions: Transaction object, list of transactions
# support: Minimum support threshold
# confidence: Minimum confidence threshold
# Returns:
# A data frame with the best set of rules and their support and confidence values


# Get rules for given combination of support and confidence
all.rules <- get.rules(support, confidence, transactions)

rules.df <-data.frame(rules = labels(all.rules)
, all.rules@quality)

other.im <- interestMeasure(all.rules, transactions = transactions)

rules.df <- cbind(rules.df, other.im[,c('conviction','leverage')])


# Keep the best rule based on the interest measure
best.rules.df <- head(rules.df[order(-rules.df$leverage),],topN)

return(best.rules.df)
}
plot.graph <- function(cross.sell.rules){
# Plot the associated items as graph
#
# Args:
# cross.sell.rules: Set of final rules recommended
# Returns:
# None
edges <- unlist(lapply(cross.sell.rules['rules'], strsplit, split='=>'))
g <- graph(edges = edges)
return(g)

}
columns <- c("order_id", "product_id") ## columns of interest in data file
data.path = '../../data/data.csv' ## Path to data file
transactions.obj <- get.txn(data.path, columns) ## create txn object
server <- function(input, output) {
cross.sell.rules <- reactive({
support <- input$Support
confidence <- input$Confidence
cross.sell.rules <- find.rules( transactions.obj, support, confidence )
cross.sell.rules$rules <- as.character(cross.sell.rules$rules)
return(cross.sell.rules)

})

gen.rules <- reactive({
support <- input$Support
confidence <- input$Confidence
gen.rules <- get.rules( support, confidence ,transactions.obj)
return(gen.rules)

})


output$rulesTable <- DT::renderDataTable({
cross.sell.rules()
})

output$graphPlot <- renderPlot({
g <-plot.graph(cross.sell.rules())
plot(g)
})

output$explorePlot <- renderPlot({
plot(x = gen.rules(), method = NULL,
measure = "support",
shading = "lift", interactive = FALSE)
})


}
ui <- fluidPage(
headerPanel(title = "X-Sell Recommendations"),
sidebarLayout(
sidebarPanel(
sliderInput("Support", "Support threshold:", min = 0.01, max = 1.0, value = 0.01),
sliderInput("Confidence", "Support threshold:", min = 0.05, max = 1.0, value = 0.05)

),
mainPanel(
tabsetPanel(
id = 'xsell',
tabPanel('Rules', DT::dataTableOutput('rulesTable')),
tabPanel('Explore', plotOutput('explorePlot')),
tabPanel('Item Groups', plotOutput('graphPlot'))
)
)
)
)
shinyApp(ui = ui, server = server)

We have described the get.txn, get.rules, and find.rules functions in the previous section. We will not go through them again here. The preceding code is a single page RShiny app code; both the server and the UI component reside in the same file.

The UI component is as follows:

 ui <- fluidPage(
headerPanel(title = "X-Sell Recommendations"),
sidebarLayout(
sidebarPanel(
sliderInput("Support", "Support threshold:", min = 0.01, max = 1.0, value = 0.01),
sliderInput("Confidence", "Support threshold:", min = 0.05, max = 1.0, value = 0.05)

),
mainPanel(
tabsetPanel(
id = 'xsell',
tabPanel('Rules', DT::dataTableOutput('rulesTable')),
tabPanel('Explore', plotOutput('explorePlot')),
tabPanel('Item Groups', plotOutput('graphPlot'))
)
)
)
)

We define the screen layout in this section. This section can also be kept in a separate file called UI.R. The page is defined by two sections, a panel in the left, defined by sidebarPanel, and a main section defined under mainPanel. Inside the side bar, we have defined two slider controls for the support and confidence thresholds respectively. The main panel contains a tab-separated window, defined by tabPanel.

The main panel has three tabs; each tab has a slot defined for the final set of rules, with their interest measures, a scatter plot for the rules, and finally the graph plot of the rules.

The server component is as follows:

server <- function(input, output) {
cross.sell.rules <- reactive({
support <- input$Support
confidence <- input$Confidence
cross.sell.rules <- find.rules( transactions.obj, support, confidence )
cross.sell.rules$rules <- as.character(cross.sell.rules$rules)
return(cross.sell.rules)

})

The cross.sell.rules data frame is defined as a reactive component. When the values of the support and confidence thresholds change in the UI, cross.sell.rules data frame will be recomputed. This frame will be served to the first page, where we have defined a slot for this table, called rulesTable:

 gen.rules <- reactive({
support <- input$Support
confidence <- input$Confidence
gen.rules <- get.rules( support, confidence ,transactions.obj)
return(gen.rules)
})

This reactive component retrieves the calculations and returns the rules object every time the support or/and confidence threshold is changed by the user in the UI:

 output$rulesTable <- DT::renderDataTable({
cross.sell.rules()
})

The preceding code renders the data frame back to the UI:

 output$graphPlot <- renderPlot({
g <-plot.graph(cross.sell.rules())
plot(g)
})

output$explorePlot <- renderPlot({
plot(x = gen.rules(), method = NULL,
measure = "support",
shading = "lift", interactive = FALSE)
})


}

The preceding two pieces of code render the plot back to the UI.

You have been reading a chapter from
R Data Analysis Projects
Published in: Nov 2017
Publisher: Packt
ISBN-13: 9781788621878
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $19.99/month. Cancel anytime