The R package m61r provides functions similar to those in dplyr and tidyr, written entirely in base R with no external dependencies. All functions work directly on standard data frames.
The filter_ function subsets a data frame, keeping only the rows that satisfy the condition given as a formula.
tmp <- filter_(CO2, ~Plant == "Qn1")
head(tmp)
## Plant Type Treatment conc uptake
## 1 Qn1 Quebec nonchilled 95 16.0
## 2 Qn1 Quebec nonchilled 175 30.4
## 3 Qn1 Quebec nonchilled 250 34.8
## 4 Qn1 Quebec nonchilled 350 37.2
## 5 Qn1 Quebec nonchilled 500 35.3
## 6 Qn1 Quebec nonchilled 675 39.2
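For comparison, the same row filter can be sketched in plain base R with logical indexing. This is only an illustration of what the result contains, not a description of how filter_ is implemented; the variable name qn1 is chosen here for the example.

```r
# Base R sketch: keep only the rows of the built-in CO2 dataset
# where Plant equals "Qn1" (logical row indexing).
qn1 <- CO2[CO2$Plant == "Qn1", ]
head(qn1)
```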
The select_ function allows you to zoom in on specific columns of interest.
tmp <- select_(CO2, ~c(Plant, Type))
head(tmp, 2)
## Plant Type
## 1 Qn1 Quebec
## 2 Qn1 Quebec
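A rough base R counterpart of this column selection, shown only to make the result concrete, indexes the data frame by column name. The name cols is an illustrative choice.

```r
# Base R sketch: select the Plant and Type columns of CO2 by name.
cols <- CO2[, c("Plant", "Type")]
head(cols, 2)
```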
mutate_ adds new variables while preserving existing ones, whereas transmutate_ keeps only the newly created variables.
tmp <- mutate_(CO2, z = ~conc / uptake)
head(tmp, 2)
## Plant Type Treatment conc uptake z
## 1 Qn1 Quebec nonchilled 95 16.0 5.937500
## 2 Qn1 Quebec nonchilled 175 30.4 5.756579
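The difference between the two behaviours can be sketched in base R, without relying on m61r itself: one result keeps every original column and appends z, the other holds z alone. The names with_z and only_z are hypothetical and used only for this illustration.

```r
# Behaviour comparable to mutate_: original columns plus the new one.
with_z <- CO2
with_z$z <- with_z$conc / with_z$uptake

# Behaviour comparable to transmutate_: only the newly computed column.
only_z <- data.frame(z = CO2$conc / CO2$uptake)

names(with_z)
names(only_z)
```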
summarise_ creates a new data frame with aggregated statistics. It can be used on the whole data frame or on specific groups.
# Global summary
summarise_(CO2, mean = ~mean(uptake), sd = ~sd(uptake))
## mean sd
## 1 27.2131 10.81441
# Grouped summary
g_info <- get_group_indices_(CO2, ~c(Type, Treatment))
summarise_(CO2, group_info = g_info, mean = ~mean(uptake))
## Type Treatment mean
## 1 Quebec nonchilled 35.33333
## 2 Mississippi nonchilled 25.95238
## 3 Quebec chilled 31.75238
## 4 Mississippi chilled 15.81429
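As a cross-check, both summaries can be approximated with base R alone. The sketch below uses aggregate for the grouped case; it is one possible equivalent, not necessarily how summarise_ computes its result, and the names overall and by_group are illustrative.

```r
# Global summary: one row with the mean and standard deviation of uptake.
overall <- data.frame(mean = mean(CO2$uptake), sd = sd(CO2$uptake))

# Grouped summary: mean uptake for each Type / Treatment combination.
by_group <- aggregate(uptake ~ Type + Treatment, data = CO2, FUN = mean)
names(by_group)[3] <- "mean"  # rename the aggregated column

overall
by_group
```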
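m61r provides a full suite of join functions. Here is an example of an inner join between two data frames, authors and books (assumed to exist already; they are not constructed in this section), matching the surname column of authors against the name column of books.
inner_join_(authors, books, by.x = "surname", by.y = "name")
## surname nationality title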
## 1 Tukey US EDA
## 2 Venables Australia MASS
## 3 Tierney US LISP-STAT
## 4 Ripley UK Spatial
## 5 McNeil Australia Interactive
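To make the example reproducible, the sketch below rebuilds two small input data frames from the printed result and joins them with base R's merge. The inputs are an assumption: the data actually used above is not shown and may contain extra rows or columns. Note also that merge sorts by the key by default, so the row order can differ from the output above.

```r
# Hypothetical inputs, reconstructed from the printed join result.
authors <- data.frame(
  surname     = c("Tukey", "Venables", "Tierney", "Ripley", "McNeil"),
  nationality = c("US", "Australia", "US", "UK", "Australia")
)
books <- data.frame(
  name  = c("Tukey", "Venables", "Tierney", "Ripley", "McNeil"),
  title = c("EDA", "MASS", "LISP-STAT", "Spatial", "Interactive")
)

# Base R inner join: keep only rows whose keys match in both inputs.
joined <- merge(authors, books, by.x = "surname", by.y = "name")
joined
```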
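The gather_ function transforms data from a “wide” format to a “long” format, making it easier to analyse certain types of data. The columns named in pivot are kept as identifiers, and the remaining columns are stacked into a parameters column (the original column name) and a values column.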
df3 <- data.frame(id = 1:2, age = c(40, 50), dose.a1 = c(1, 2), dose.a2 = c(2, 1))
df4 <- gather_(df3, pivot = c("id", "age"))
df4
## id age parameters values
## 1 1 40 dose.a1 1
## 2 2 50 dose.a1 2
## 3 1 40 dose.a2 2
## 4 2 50 dose.a2 1
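The reshaping itself can be sketched by hand in base R: repeat the identifier columns once per measured column, then stack the measured values. This is an illustrative equivalent under that reading of the output, not the package's implementation; the names measure and long are hypothetical.

```r
df3 <- data.frame(id = 1:2, age = c(40, 50), dose.a1 = c(1, 2), dose.a2 = c(2, 1))

pivot   <- c("id", "age")               # identifier columns to keep
measure <- setdiff(names(df3), pivot)   # columns to stack: dose.a1, dose.a2

long <- data.frame(
  # repeat the identifier rows once for every measured column
  df3[rep(seq_len(nrow(df3)), times = length(measure)), pivot],
  # name of the source column for each stacked value
  parameters = rep(measure, each = nrow(df3)),
  # the stacked values, column after column
  values = unlist(df3[measure], use.names = FALSE)
)
rownames(long) <- NULL  # reset row names to 1..n
long
```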