Add group mean comparisons to a ggplot. The comparisons can be performed using the t-test, Wilcoxon rank-sum test, one-way ANOVA, or Kruskal-Wallis test.
Usage
stat_compare(
  mapping = NULL,
  data = NULL,
  position = "identity",
  ...,
  nudge = 0,
  start = NULL,
  breaks = NULL,
  labels = NULL,
  cutoff = NULL,
  method = NULL,
  ref_group = NULL,
  tip_length = 0.02,
  parametric = FALSE,
  correction = "none",
  panel_indep = FALSE,
  method_args = NULL,
  comparisons = NULL,
  step_increase = 0.1,
  inherit.aes = TRUE
)
Arguments
- mapping
Set of aesthetic mappings created by aes(). If specified and inherit.aes = TRUE (the default), it is combined with the default mapping at the top level of the plot. You must supply mapping if there is no plot mapping.
- data
The data to be displayed in this layer. There are three options:
If NULL, the default, the data is inherited from the plot data as specified in the call to ggplot().
A data.frame, or other object, will override the plot data. All objects will be fortified to produce a data frame. See fortify() for which variables will be created.
A function will be called with a single argument, the plot data. The return value must be a data.frame, and will be used as the layer data. A function can be created from a formula (e.g. ~ head(.x, 10)).
- position
A position adjustment to use on the data for this layer. This can be used in various ways, including to prevent overplotting and improving the display. The position argument accepts the following:
The result of calling a position function, such as position_jitter(). This method allows for passing extra arguments to the position.
A string naming the position adjustment. To give the position as a string, strip the function name of the position_ prefix. For example, to use position_jitter(), give the position as "jitter".
For more information and other ways to specify the position, see the layer position documentation.
- ...
Additional arguments passed on to geom_bracket().
- nudge
numeric, the nudge of the start position, as a fraction of the scale range.
- start
numeric, the bracket start position. Defaults to the maximum value of y.
- breaks
numeric, the breaks for p-value labels, like c(0, 0.001, 0.01, 0.05, 1).
- labels
character, the labels for p-value breaks, like c("***", "**", "*", "ns").
- cutoff
numeric, the cutoff for p-values; labels for p-values above this value are removed.
- method
function, the method for the test; it should support the formula interface and return a list with components p.value and method (the test name).
- ref_group
character, the reference group for comparison. Other groups will be compared to this group.
- tip_length
numeric, the length of the bracket tips, as a fraction of the scale range.
- parametric
logical, whether to use a parametric test (t-test, one-way ANOVA) or a non-parametric test (Wilcoxon rank-sum test, Kruskal-Wallis test). Applicable only when method is NULL.
- correction
character, the method for p-value adjustment; options include p.adjust.methods, with "none" as the default.
- panel_indep
logical, whether to correct the p-value only at the panel level. If FALSE, the p-value will be corrected at the layer level.
- method_args
list, additional arguments to be passed to the test method.
- comparisons
list, a list of comparisons to be made. Each element should contain two groups to be compared.
- step_increase
numeric, the step increase, as a fraction of the scale range, for every additional comparison, in order to avoid overlapping brackets.
- inherit.aes
If FALSE, overrides the default aesthetics, rather than combining with them. This is most useful for helper functions that define both data and aesthetics and shouldn't inherit behaviour from the default plot specification, e.g. borders().
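To make the breaks, labels, and cutoff arguments more concrete, here is a rough sketch in base R of how p-values could be turned into significance labels. It is not the package's internal code: the p-values are made up, and the choice of right-closed intervals via cut() is an assumption for illustration.

```r
# Hypothetical p-values, chosen only for illustration.
p_values <- c(0.0004, 0.003, 0.03, 0.2)

# Example values taken from the argument descriptions above.
breaks <- c(0, 0.001, 0.01, 0.05, 1)
labels <- c("***", "**", "*", "ns")

# Assumption: each interval is closed on the right, e.g. (0.001, 0.01].
stars <- as.character(
  cut(p_values, breaks = breaks, labels = labels, include.lowest = TRUE)
)

# cutoff: keep only the labels whose p-value does not exceed the cutoff.
cutoff <- 0.01
shown <- stars[p_values <= cutoff]

print(stars)
print(shown)
```

Note that breaks needs one more element than labels, since n labels describe the n intervals between n + 1 break points.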
Details
Usually you do not need to specify the test method. You only need to tell stat_compare() whether to perform a parametric or a nonparametric test, and stat_compare() will automatically choose the appropriate test method based on your data.
For comparisons between two groups, the p-value is calculated by t-test (parametric) or Wilcoxon rank sum test (nonparametric). For comparisons among more than two groups, the p-value is calculated by One-way ANOVA (parametric) or Kruskal-Wallis test (nonparametric).
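The selection rule described above can be sketched with base R's stats functions. The helper choose_test() is hypothetical and only mirrors the rule as stated here; it is not how stat_compare() is necessarily implemented. Using oneway.test() with var.equal = TRUE as the one-way ANOVA, and the built-in sleep and PlantGrowth data sets, are assumptions made for the sake of a runnable example.

```r
# Hypothetical helper (not part of the package): choose a test from the
# number of groups and the `parametric` flag.
choose_test <- function(formula, data, parametric = FALSE) {
  # For a formula of the form y ~ g, the second column of the model frame
  # is the grouping variable.
  groups <- factor(model.frame(formula, data)[[2]])
  if (nlevels(groups) == 2) {
    if (parametric) {
      t.test(formula, data = data)
    } else {
      # exact = FALSE uses the normal approximation, which avoids a
      # warning about ties in this toy data.
      wilcox.test(formula, data = data, exact = FALSE)
    }
  } else {
    if (parametric) {
      # Assumption: classic one-way ANOVA with equal variances.
      oneway.test(formula, data = data, var.equal = TRUE)
    } else {
      kruskal.test(formula, data = data)
    }
  }
}

two_par  <- choose_test(extra ~ group, sleep, parametric = TRUE)
two_non  <- choose_test(extra ~ group, sleep, parametric = FALSE)
many_par <- choose_test(weight ~ group, PlantGrowth, parametric = TRUE)
many_non <- choose_test(weight ~ group, PlantGrowth, parametric = FALSE)

print(c(two_par$method, two_non$method, many_par$method, many_non$method))
```

Each result is an "htest" object, so it carries both a p.value and a method component.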
Computed variables
- p: p-value of the test.
- q: adjusted p-value of the test.
- label: the label of the bracket.
- method: the method name of the test.
- xmin, xmax, ymin, ymax: the position of the bracket.
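The relationship between p and q, and the effect of panel_indep, can be illustrated with p.adjust() from base R. The data frame below is invented for illustration, and the grouping by panel is only a sketch of what layer-level versus panel-level correction means, not the package's actual code.

```r
# Hypothetical raw p-values from two panels (facets) of a single layer.
res <- data.frame(
  panel = c("A", "A", "B", "B", "B"),
  p     = c(0.01, 0.04, 0.03, 0.002, 0.20)
)

# Layer-level correction: all p-values in the layer are adjusted together.
res$q_layer <- p.adjust(res$p, method = "fdr")

# Panel-level correction: p-values are adjusted separately within each panel.
res$q_panel <- ave(res$p, res$panel,
                   FUN = function(p) p.adjust(p, method = "fdr"))

print(res)
```

Because fewer tests are pooled within a panel, panel-level adjusted values are often smaller than layer-level ones, though this depends on the data.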
Examples
if (FALSE) { # interactive()
library(ggplot2)
p <- ggplot(mpg, aes(class, displ, color = class)) +
  geom_boxplot(show.legend = FALSE) +
  theme_test()
# Global comparison: Each x has only one group.
p + stat_compare()
# If you just want to display text, you can set the parameter "bracket" to FALSE.
p + stat_compare(bracket = FALSE)
# If you want to display the test method, you can do this.
p + stat_compare(aes(label = after_stat(sprintf("%s: %s", method, label))))
# Comparison between two groups: specify a reference group.
p + stat_compare(ref_group = "minivan")
# If you only want to display p-values less than or equal to 0.01, you can do this.
p + stat_compare(ref_group = "minivan", cutoff = 0.01)
# If you want to display the significance level, you can do this.
p + stat_compare(ref_group = "minivan", breaks = c(0, 0.001, 0.01, 0.05, 1))
# Comparison between two groups: specify the comparison group.
p + stat_compare(tip_length = 0.05,
                 step_increase = 0,
                 comparisons = list(c("compact", "midsize"), c("pickup", "suv")),
                 arrow = grid::arrow(type = "closed", length = unit(0.1, "inches")))
# Yeah, this supports adding arrows.
# Within-group (grouped by the x-axis) population comparison.
ggplot(mpg, aes(drv, displ, fill = class)) +
  geom_boxplot() +
  stat_compare() +
  stat_compare(aes(group = drv), nudge = 0.1, color = "gray") + # add global comparison
  theme_test()
# Better adaptation to faceting.
ggplot(mpg, aes(drv, displ)) +
  geom_boxplot() +
  stat_compare(comparisons = combn(unique(mpg$drv), 2, simplify = FALSE)) +
  facet_grid(cols = vars(class), scales = "free") +
  theme_test()
# P-value correction
p <- ggplot(mpg, aes(class, displ)) +
  geom_boxplot() +
  facet_grid(cols = vars(cyl), scales = "free") +
  theme_test()
# Layer-level P-value correction
p + stat_compare(ref_group = 1, correction = "fdr")
# Panel-level P-value correction
p + stat_compare(ref_group = 1, correction = "fdr", panel_indep = TRUE)
}
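The method argument is described as a function that supports the formula interface and returns a list with components p.value and method. A minimal sketch of a function with that shape is given below. The name student_t is hypothetical, and the assumption that the function is called with a formula and a data argument is a reading of the argument description, not something the documentation spells out.

```r
# Hypothetical custom test: Student's t-test with equal variances.
# Assumption: the function receives a formula and a `data` argument.
student_t <- function(formula, data, ...) {
  fit <- t.test(formula, data = data, var.equal = TRUE, ...)
  # Return only the two components named in the `method` description.
  list(p.value = fit$p.value, method = "Student's t-test")
}

# Stand-alone check on a built-in data set with two groups.
res <- student_t(extra ~ group, data = sleep)
print(res$method)
print(res$p.value)
```

Under the same assumption, such a function would be supplied as stat_compare(method = student_t), with any extra arguments given through method_args.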