The ggplot2 package has become the dominant R package for creating serious plots, largely because of the polished look of its output. It lets the user build a plot additively, one layer at a time, and this syntax has contributed to its enormous success. As you would expect, the package can handle a wide variety of plots.
Creating barplots using ggplot
Getting ready
To run this example, you will need the ggplot2 and reshape packages. Both can be installed using the install.packages() command.
How to do it...
In this example, we will use a dataset in a wide format (multiple columns for each record) and transform it into a long format (one row per company and measure). We will then use the ggplot2 package to make a stacked bar plot from the transformed data. The data covers five companies. The adjusted sales are the sales with taxes removed, and the unadjusted sales are the raw sales. Naturally, the unadjusted sales are always greater than the adjusted ones, as shown in the following table:
Company | Adjusted sales | Unadjusted sales |
Company1 | 298 | 394 |
Company2 | 392 | 454 |
Company3 | 453 | 499 |
Company4 | 541 | 598 |
Company5 | 674 | 762 |
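The recipe does not show the contents of the CSV file, so the following is a minimal sketch for readers who do not have it: it types the table in as a data frame and writes a stand-in ctgs.csv to the working directory. The column names Adjusted and Unadjusted are assumptions made for this illustration; the headers of the real file may differ.

```r
# Stand-in for the recipe's data, typed in from the table above.
# The column names "Adjusted" and "Unadjusted" are assumed, not taken from the real file.
datag <- data.frame(
  Company    = paste0("Company", 1:5),
  Adjusted   = c(298, 392, 453, 541, 674),
  Unadjusted = c(394, 454, 499, 598, 762)
)

# Write the wide-format table to the path that the recipe reads from.
write.csv(datag, "./ctgs.csv", row.names = FALSE)
```

With this file in place, the read.csv() call in the steps below returns one row per company and one column per sales measure.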
- Import the ggplot2 and reshape libraries as follows:
library(ggplot2)
library(reshape)
- Then load the dataset:
datag = read.csv("./ctgs.csv")
- Transform the data into a long format:
transformed_data = melt(datag, id.vars = "Company")
- Use the ggplot function to create the plot:
ggplot(transformed_data, aes(x = Company, y = value, fill = variable)) + geom_bar(stat = "identity")
This results in the following output:
How it works...
The reshape package provides the melt() function, which transforms the data into the long format that ggplot2 expects: one row per company and measure, with the measure name in the variable column and the number in the value column. To build a stacked plot, we supply three mappings to the aes() function: x is the variable on the x-axis, y is the bar height, and fill is the variable that determines the color. The geom_bar() function adds the bar layer to the plot. The stat = "identity" argument tells ggplot2 not to apply any transformation (by default, geom_bar() counts rows) and to use the values as they are.
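To make the wide-to-long step concrete, here is a rough base R sketch that builds a table with the same three columns as the melted data (Company, variable, value). It is not how melt() is implemented, only an illustration of the shape it returns, and the column names Adjusted and Unadjusted are assumptions.

```r
# Wide format: one row per company, one column per measure (assumed column names).
datag <- data.frame(
  Company    = paste0("Company", 1:5),
  Adjusted   = c(298, 392, 453, 541, 674),
  Unadjusted = c(394, 454, 499, 598, 762)
)

value_cols <- c("Adjusted", "Unadjusted")

# Long format: each company appears once per measure.
long <- data.frame(
  Company  = rep(datag$Company, times = length(value_cols)),
  variable = factor(rep(value_cols, each = nrow(datag)), levels = value_cols),
  value    = unlist(datag[value_cols], use.names = FALSE)
)

# 5 companies x 2 measures = 10 rows
nrow(long)
```

Each row of the long table now carries everything that one bar segment needs: where it goes (Company), which color it gets (variable), and how tall it is (value).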
The result has one bar for each company, split into two colored segments. The red segment corresponds to the adjusted sales and the green segment to the unadjusted sales.
There's more...
We can also place the bars side by side instead of stacking them. This is done by passing position = position_dodge() to geom_bar(), as shown in the following code block:
ggplot(transformed_data, aes(x = Company, y = value, fill = variable)) + geom_bar(stat = "identity", position = position_dodge())
This results in the following output:
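The practical difference between the two layouts is where each bar starts. In a stacked plot the segments sit on top of each other, so the top of a bar is the sum of its values; in a dodged plot every bar starts at zero. The sketch below builds both plots from data typed in by hand (assumed column names, long format assembled with base R rather than melt()) and uses layer_data() from ggplot2 to read back the heights that would be drawn.

```r
library(ggplot2)

# Long-format data typed in from the table; the variable labels are assumed names.
long <- data.frame(
  Company  = rep(paste0("Company", 1:5), times = 2),
  variable = rep(c("Adjusted", "Unadjusted"), each = 5),
  value    = c(298, 392, 453, 541, 674,
               394, 454, 499, 598, 762)
)

base_plot <- ggplot(long, aes(x = Company, y = value, fill = variable))

# Default position: segments are stacked.
p_stack <- base_plot + geom_bar(stat = "identity")

# Dodged position: bars are drawn side by side.
p_dodge <- base_plot + geom_bar(stat = "identity", position = position_dodge())

# Highest point drawn in each layout.
top_stack <- max(layer_data(p_stack)$ymax)  # 674 + 762 = 1436
top_dodge <- max(layer_data(p_dodge)$ymax)  # 762
```

Calling print(p_stack) or print(p_dodge) draws the corresponding plot. Because the stacked total here adds a value to its own tax-adjusted version, the dodged layout is probably the easier one to read for this dataset.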
See also
An excellent ggplot2 tutorial can be found at http://r-statistics.co/Complete-Ggplot2-Tutorial-Part2-Customizing-Theme-With-R-Code.html.