Generators are a simple way of creating iterator functions, i.e. functions that you can call to return a new value. The iteration protocol is described in ?iterator
. Here is a simple iterator that iterates over the elements of 1:3
:
library(coro)
<- as_iterator(1:3)
iterator
# Call the iterator to retrieve new values
iterator()
#> [1] 1
iterator()
#> [1] 2
Once the iterator is exhausted, it returns a sentinel value that signals to its caller that there are no more values available:
# This is the last value
iterator()
#> [1] 3
# This is the exhaustion sentinel
iterator()
#> .__exhausted__.
In R we normally don’t use this sort of iteration to work with vectors. Instead, we use the idiomatic techniques of vectorised programming. Iterator functions are useful for very specific tasks:
Iterating over chunks of data when the whole data doesn’t fit in memory.
Generating sequences when you don’t know in advance how many elements you will need. These sequences may be complex and even infinite.
The iterator protocol is designed to be free of dependency. However, the easiest way to create an iterator is by using the generator factories provided in this package.
Generators create functions that can yield, i.e. suspend themselves. When a generator reaches a yield(value)
statement it returns the value as if you called return(value)
. However, calling the generator again resumes the function right where it left off. Because they preserve their state between invokations, generators are ideal for creating iterator functions.
<- generator(function() {
generate_abc for (x in letters[1:3]) {
yield(x)
} })
generator()
creates an iterator factory. This is a function that returns fresh iterator functions:
# Create the iterator
<- generate_abc()
abc
# Use the iterator by invoking it
abc()
#> [1] "a"
abc()
#> [1] "b"
Once the last loop in a generator has finished iterating (here there is only one), it returns the exhaustion sentinel:
# Last value
abc()
#> [1] "c"
# Exhaustion sentinel
abc()
#> .__exhausted__.
abc()
#> .__exhausted__.
You can also create infinite iterators that can’t be exhausted:
<- generator(function(from = 1L) {
generate_natural_numbers <- from
x repeat {
yield(x)
<- x + 1L
x
}
})
<- generate_natural_numbers(from = 10L)
natural_numbers
# The iterator generates new numbers forever
natural_numbers()
#> [1] 10
natural_numbers()
#> [1] 11
Iterating manually over an iterator function is a bit tricky because you have to watch out for the exhaustion sentinel:
<- generate_abc()
abc
while (!is_exhausted(x <- abc())) {
print(x)
}#> [1] "a"
#> [1] "b"
#> [1] "c"
A simpler way is to iterate with a for
loop using the iterate()
helper. Within iterate()
, for
understands the iterator protocol:
<- generate_abc()
abc
loop(for (x in abc) {
print(x)
})#> [1] "a"
#> [1] "b"
#> [1] "c"
You can also collect all remaning values of an iterator in a list with collect()
:
<- generate_abc()
abc
collect(abc)
#> [[1]]
#> [1] "a"
#>
#> [[2]]
#> [1] "b"
#>
#> [[3]]
#> [1] "c"
Beware that trying to exhaust an infinite iterator is a programming error. This causes an infinite loop that never returns, forcing the user to interrupt R with ctrl-c
. Make sure that you iterate over an infinite iterator only a finite amount of time:
for (x in 1:3) {
print(natural_numbers())
}#> [1] 12
#> [1] 13
#> [1] 14
collect(natural_numbers, n = 3)
#> [[1]]
#> [1] 15
#>
#> [[2]]
#> [1] 16
#>
#> [[3]]
#> [1] 17
A generator factory can take another iterator as argument to modify its values. This pattern is called adapting:
library(magrittr)
<- generator(function(i) {
adapt_toupper for (x in i) {
yield(toupper(x))
}
})
<- generate_abc() %>% adapt_toupper()
ABC
ABC()
#> [1] "A"
ABC()
#> [1] "B"
Once the modified iterator is exhausted, the adaptor automatically closes as well:
ABC()
#> [1] "C"
ABC()
#> .__exhausted__.
As a user, you might not want to create an iterator factory for a one-off adaptor. In this case you can use gen()
instead of generator()
. This enables a more pythonic style of working with iterators:
<- generate_abc()
abc <- gen(for (x in abc) yield(toupper(x)))
ABC
collect(ABC)
#> [[1]]
#> [1] "A"
#>
#> [[2]]
#> [1] "B"
#>
#> [[3]]
#> [1] "C"
Or you can use the general purpose adaptor adapt_map()
. It maps a function over each value of an iterator:
<- generator(function(.i, .fn, ...) {
adapt_map for (x in .i) {
yield(.fn(x, ...))
}
})
<- generate_abc() %>% adapt_map(toupper)
ABC
ABC()
#> [1] "A"