---
title: "Compute weighted mean with `stat_weighted_mean()`"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{Compute weighted mean with `stat_weighted_mean()`}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
```
```{r setup}
library(ggstats)
library(ggplot2)
```
`stat_weighted_mean()` computes mean value of **y** (taking into account any **weight** aesthetic if provided) for each value of **x**. More precisely, it will return a new data frame with one line per unique value of **x** with the following new variables:
- **y**: mean value of the original **y** (i.e. **numerator**/**denominator**)
- **numerator**
- **denominator**
Let's take an example. The following plot shows all tips received according to the day of the week.
```{r}
data(tips, package = "reshape")
ggplot(tips) +
aes(x = day, y = tip) +
geom_point()
```
To plot their mean value per day, simply use `stat_weighted_mean()`.
```{r}
ggplot(tips) +
aes(x = day, y = tip) +
stat_weighted_mean()
```
We can specify the geometry we want using `geom` argument. Note that for lines, we need to specify the **group** aesthetic as well.
```{r}
ggplot(tips) +
aes(x = day, y = tip, group = 1) +
stat_weighted_mean(geom = "line")
```
An alternative is to specify the statistic in `ggplot2::geom_line()`.
```{r}
ggplot(tips) +
aes(x = day, y = tip, group = 1) +
geom_line(stat = "weighted_mean")
```
Of course, it could be use with other geometries. Here a bar plot.
```{r}
p <- ggplot(tips) +
aes(x = day, y = tip, fill = sex) +
stat_weighted_mean(geom = "bar", position = "dodge") +
ylab("mean tip")
p
```
It is very easy to add facets. In that case, computation will be done separately for each facet.
```{r}
p + facet_grid(rows = vars(smoker))
```
`stat_weighted_mean()` could be also used for computing proportions as a proportion is technically a mean of binary values (0 or 1).
```{r}
ggplot(tips) +
aes(x = day, y = as.integer(smoker == "Yes"), fill = sex) +
stat_weighted_mean(geom = "bar", position = "dodge") +
scale_y_continuous(labels = scales::percent) +
ylab("proportion of smoker")
```
Finally, you can use the **weight** aesthetic to indicate weights to take into account for computing means / proportions.
```{r}
d <- as.data.frame(Titanic)
ggplot(d) +
aes(x = Class, y = as.integer(Survived == "Yes"), weight = Freq, fill = Sex) +
geom_bar(stat = "weighted_mean", position = "dodge") +
scale_y_continuous(labels = scales::percent) +
labs(y = "Proportion who survived")
```