ggplot2)
March 4, 2026
facilitates understanding of data (i.e., makes data “accessible”)
reveals “hidden” structures, trends and patterns
identifies outliers
helps to communicate results clearly (storytelling)
…
But there are more, such as dot plots, Venn diagrams, maps, networks, and many more!
ggplot2 package
First of all, there are other graphical R packages, such as:
base (R Core Team, 2025) solution (e.g., plot() function)
lattice package (Sarkar, 2008)
plotly package (Sievert et al., 2026)
The ggplot2 package (Wickham et al., 2026) …
is based on the Grammar of Graphics (Wilkinson & Wills, 2005) → graphs can be composed by independent components
has great flexibility (and add-on packages) → nothing is impossible
has broad community support (e.g., https://stackoverflow.com/), and AI tools can help to generate ggplot2 code
belongs to the tidyverse (Wickham, 2023) package collection
Every ggplot needs 3 components to produce a plot:
the data
the aesthetic mappings (aes())
a geom layer (e.g., geom_point for a scatter plot)
Note
Whereas the data and the aesthetic mappings (aes()) are stated within the ggplot() function, the geom_point layer is added with a +.
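The three components can be sketched as follows. This is a minimal example, not the original slide code: the data frame dat is hypothetical toy data, and only the variable names ses and mathach are borrowed from the later slides.

```r
library(ggplot2)

# Hypothetical toy data; only the variable names mirror the later slides
dat <- data.frame(
  ses     = c(-1.2, -0.5, 0.0, 0.4, 0.9, 1.3),
  mathach = c(5.1, 9.8, 12.4, 13.0, 17.6, 20.2)
)

# 1) data and 2) aesthetic mappings go into ggplot();
# 3) the geom layer is added with +
p_scp <- ggplot(data = dat, aes(x = ses, y = mathach)) +
  geom_point()

p_scp  # printing the object draws the plot
```

Storing the plot in an object (here p_scp, an assumed name) makes it easy to add further layers later.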
ggplot2
Setting a theme (optional). This can be done with the theme_set() function. There are several built-in themes, such as theme_minimal(), theme_bw(), or theme_classic().
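A short sketch of how a default theme could be set and restored. The object names old_theme and current_theme are assumptions for illustration.

```r
library(ggplot2)

# theme_set() changes the default theme for all subsequent plots
# and invisibly returns the theme that was active before
old_theme <- theme_set(theme_minimal())

# theme_get() returns the currently active default theme
current_theme <- theme_get()

# Restore the previous default when it is no longer needed
theme_set(old_theme)
```

A theme can also be added to a single plot with +, for example + theme_bw(), which leaves the global default untouched.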
geom_* Layers
geom_bar layer (basic)
geom_bar() layer
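A basic bar plot might look like the sketch below. The data are hypothetical, and the 0/1 indicator female is an assumption borrowed from the scatter plot code later in these slides.

```r
library(ggplot2)

# Hypothetical toy data: a 0/1 indicator variable
dat <- data.frame(female = c(0, 1, 1, 0, 1, 1, 0, 1))

# geom_bar() counts the rows per category;
# factor() makes the x-axis discrete
p_bar <- ggplot(data = dat, aes(x = factor(female))) +
  geom_bar()

p_bar
```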
geom_bar layer (basic, print)
geom_bar layer (customized)
color argument: Customize the color of the border
linewidth argument: Customize the thickness of the border
fill argument: Customize the color of the bars
labels argument in the scale_x_discrete function: Provide labels for the categories
labs function: Customize the text on the x- and y-axes
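The arguments listed above could be combined as in this sketch. The data, colors, and labels are assumptions chosen for illustration, and the linewidth argument requires ggplot2 3.4.0 or later.

```r
library(ggplot2)

# Hypothetical toy data: a 0/1 indicator variable
dat <- data.frame(female = c(0, 1, 1, 0, 1, 1, 0, 1))

p_bar_c <- ggplot(data = dat, aes(x = factor(female))) +
  geom_bar(color = "black",       # border color
           linewidth = 1,         # border thickness
           fill = "steelblue") +  # bar color
  scale_x_discrete(labels = c("0" = "Male", "1" = "Female")) +
  labs(x = "Gender", y = "Count")

p_bar_c
```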
geom_bar layer (customized, print)
geom_histogram layer (basic)
geom_histogram layer (basic, print)
geom_histogram layer (customized)
color argument: Color of the borders of the bars
fill argument: Color of the bars (do not use it together with the fill argument in aes()!)
bins argument: Number of bins
binwidth argument: Width of the bins
labs layer: Provide axis names and title
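A sketch of a customized histogram using these arguments. The simulated data and the colors are assumptions; the object name p_hist_c is the one used for the customized histogram in the faceting slide.

```r
library(ggplot2)

# Hypothetical simulated data standing in for the real data set
set.seed(1)
dat <- data.frame(mathach = rnorm(200, mean = 13, sd = 7))

p_hist_c <- ggplot(data = dat, aes(x = mathach)) +
  geom_histogram(color = "black",   # border color of the bars
                 fill = "grey80",   # fill color of the bars
                 bins = 20) +       # number of bins
  labs(x = "Math Achievement",
       y = "Count",
       title = "A customized histogram")

p_hist_c
```

Specify either bins or binwidth, not both: if both are given, binwidth takes precedence.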
geom_histogram layer (customized, print)
geom_point layer (basic)
geom_point layer (basic, print)
geom_point layer (customized)
color argument: Color of the borders of the points (if fill \(\neq\) NULL).
fill argument: Fill color of the points; it only has an effect for shapes 21 to 25 (do not use it together with the fill argument in aes()!).
size argument: Size of the points.
shape argument: Shape of the points (for more see https://ggplot2.tidyverse.org/articles/ggplot2-specs.html).
labs layer: Provide axis names and title.
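The arguments above might be used as follows. This is a sketch with hypothetical data; the object name p_scp_c and the chosen colors are assumptions. Shape 21 is a filled circle, one of the shapes for which fill has an effect.

```r
library(ggplot2)

# Hypothetical toy data; only the variable names mirror the later slides
dat <- data.frame(
  ses     = c(-1.2, -0.5, 0.0, 0.4, 0.9, 1.3),
  mathach = c(5.1, 9.8, 12.4, 13.0, 17.6, 20.2)
)

p_scp_c <- ggplot(data = dat, aes(x = ses, y = mathach)) +
  geom_point(shape = 21,        # filled circle with a border
             color = "black",   # border color
             fill = "orange",   # fill color (shapes 21 to 25 only)
             size = 3) +        # point size
  labs(x = "Socioeconomic status",
       y = "Math Achievement",
       title = "A customized scatterplot")

p_scp_c
```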
geom_point layer (customized, print)
geom_point layer (colored by group)
p_scp_c2 <- ggplot(data = dat,
                   aes(y = mathach, x = ses)) +
  geom_point(aes(color = factor(female)),
             alpha = .5,
             size = 2.5) +
  scale_color_manual(
    values = c("black", "red"),
    labels = c("0" = "Male",
               "1" = "Female")) +
  labs(y = "Math Achievement",
       x = "Socioeconomic status",
       color = "Gender",
       title = "A customized scatterplot")
color argument (within the aes() function): Specifies the variable by which the points are colored.
alpha argument: Refers to the opacity of the points.
size argument: Size of the points.
scale_color_manual layer: Customize the discrete color scale.
values argument: Change color
labels argument: Provide labels (alternatively, you could use the factor() function in advance)
labs layer: Provide axis names, legend title, and plot title
geom_point layer (colored by group, print)
geom_boxplot layer (basic)
geom_boxplot layer (basic, print)
geom_boxplot layer (customized)
color argument: Color of the borders of the boxes.
fill argument: Fill color of the boxes (do not use it together with the fill argument in aes()!).
linewidth argument: Thickness of the borders (called size before ggplot2 3.4.0).
width argument: Controls the width of the boxes.
staplewidth argument: Controls the width of the staples (the caps at the ends of the whiskers).
labs layer: Provide axis names and title.
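A customized boxplot could be sketched like this. The data, the grouping by female, and the object name p_box_c are assumptions; staplewidth requires ggplot2 3.5.0 or later, and linewidth requires 3.4.0 or later.

```r
library(ggplot2)

# Hypothetical toy data: five observations per group
dat <- data.frame(
  female  = rep(c(0, 1), each = 5),
  mathach = c(4, 8, 11, 15, 19, 6, 9, 12, 14, 20)
)

p_box_c <- ggplot(data = dat, aes(x = factor(female), y = mathach)) +
  geom_boxplot(color = "black",      # border color
               fill = "grey80",      # fill color of the boxes
               linewidth = 0.8,      # border thickness
               width = 0.5,          # width of the boxes
               staplewidth = 0.3) +  # width of the whisker caps
  labs(x = "Gender",
       y = "Math Achievement",
       title = "A customized boxplot")

p_box_c
```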
geom_boxplot layer (customized, print)
Faceting splits the data into subsets and displays them in separate panels. This can be done with the facet_wrap() or facet_grid() functions. To demonstrate the facet_grid() option, I used the customized histogram (p_hist_c) that was generated above.
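A sketch of faceting a histogram with facet_grid(). The simulated data, the faceting variable female, and the object name p_hist_f are assumptions; p_hist_c is rebuilt here only so that the example runs on its own.

```r
library(ggplot2)

# Hypothetical simulated data with a 0/1 grouping variable
set.seed(1)
dat <- data.frame(
  mathach = rnorm(200, mean = 13, sd = 7),
  female  = rep(c(0, 1), times = 100)
)

# Stand-in for the customized histogram
p_hist_c <- ggplot(data = dat, aes(x = mathach)) +
  geom_histogram(color = "black", fill = "grey80", bins = 20)

# facet_grid(rows ~ columns): one panel column per value of female
p_hist_f <- p_hist_c + facet_grid(. ~ female)

p_hist_f
```

With a single faceting variable, facet_wrap(~ female) gives a similar result; facet_grid() is most useful when rows and columns are defined by two variables.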
Labeling factors may be helpful for data visualization (not necessarily for data analysis!), because ggplot2 then uses the labels directly. This can be done with the levels and labels arguments of the factor() function.
female variable
Gender variable
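A minimal sketch of labeling a factor with base R. The values and the object names female and gender are hypothetical; the coding 0 = Male, 1 = Female follows the labels used in the scatter plot code.

```r
# Hypothetical 0/1 indicator
female <- c(0, 1, 1, 0)

# levels: the values as stored in the data; labels: the text to display
gender <- factor(female,
                 levels = c(0, 1),
                 labels = c("Male", "Female"))

levels(gender)
table(gender)
```

When such a labeled factor is mapped to an aesthetic, the labels appear on the axis or in the legend without an extra labels argument in the scale function.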