Package website: release | dev
Deep Learning with torch and mlr3.
# Install from CRAN
install.packages("mlr3torch")
# Install the development version from GitHub:
::pak("mlr-org/mlr3torch") pak
Afterwards, you also need to run the command below:
torch::install_torch()
More information about installing torch
can be found here.
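To check that the backend was set up correctly, you can query torch directly. This is just a quick sanity check, not part of the official installation instructions:
# returns TRUE if the libtorch backend is available
torch::torch_is_installed()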
mlr3torch is a deep learning framework for the mlr3 ecosystem built on top of torch. It allows you to easily build, train, and evaluate deep learning models in a few lines of code, without needing to worry about low-level details. Off-the-shelf learners are readily available, but custom architectures can be defined by connecting PipeOpTorch operators in an mlr3pipelines::Graph.
Using predefined learners such as a simple multi-layer perceptron (MLP) works just like any other mlr3 Learner.
library(mlr3torch)

learner_mlp = lrn("classif.mlp",
  # defining network parameters
  activation = nn_relu,
  neurons = c(20, 20),
  # training parameters
  batch_size = 16,
  epochs = 50,
  device = "cpu",
  # proportion of data to use for validation
  validate = 0.3,
  # defining the optimizer, loss, and callbacks
  optimizer = t_opt("adam", lr = 0.1),
  loss = t_loss("cross_entropy"),
  callbacks = t_clbk("history"), # this saves the history in the learner
  # measures to track
  measures_valid = msrs(c("classif.logloss", "classif.ce")),
  measures_train = msrs(c("classif.acc")),
  # predict type (required by logloss)
  predict_type = "prob"
)
Below, we train this learner on the sonar example task:
$train(tsk("sonar")) learner_mlp
Next, we construct the same architecture using
PipeOpTorch
objects. The first pipeop – a
PipeOpTorchIngress
– defines the entrypoint of the network.
All subsequent pipeops define the neural network layers.
= po("torch_ingress_num") %>>%
architecture po("nn_linear", out_features = 20) %>>%
po("nn_relu") %>>%
po("nn_head")
To turn this into a learner, we configure the loss, optimizer, callbacks as well as the training arguments.
graph_mlp = architecture %>>%
  po("torch_loss", loss = t_loss("cross_entropy")) %>>%
  po("torch_optimizer", optimizer = t_opt("adam", lr = 0.1)) %>>%
  po("torch_callbacks", callbacks = t_clbk("history")) %>>%
  po("torch_model_classif",
    batch_size = 16, epochs = 50, device = "cpu")

graph_lrn = as_learner(graph_mlp)
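The resulting graph learner can be used like any other mlr3 learner, for example in a resampling. This is a small usage sketch on the same task as above, not part of the original example:
# 3-fold cross-validation of the graph learner on the sonar task
rr = resample(tsk("sonar"), graph_lrn, rsmp("cv", folds = 3))
rr$aggregate(msr("classif.ce"))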
To work with generic tensors, the lazy_tensor type can be used. It wraps a torch::dataset, but allows the data to be preprocessed (lazily) using PipeOp objects. Below, we flatten the MNIST task so that we can then train a multi-layer perceptron on it. Note that this does not transform the data in memory; the transformation is only applied when the data is actually loaded.
# load the predefined mnist task
mnist = tsk("mnist")
mnist$head(3L)
#>     label           image
#>    <fctr>   <lazy_tensor>
#> 1:      5 <tnsr[1x28x28]>
#> 2:      0 <tnsr[1x28x28]>
#> 3:      4 <tnsr[1x28x28]>

# flatten the images
flattener = po("trafo_reshape", shape = c(-1, 28 * 28))
mnist_flat = flattener$train(list(mnist))[[1L]]

mnist_flat$head(3L)
#>     label         image
#>    <fctr> <lazy_tensor>
#> 1:      5   <tnsr[784]>
#> 2:      0   <tnsr[784]>
#> 3:      4   <tnsr[784]>
To actually access the tensors, we can call
materialize()
. We only show a slice of the resulting tensor
for readability:
materialize(
  mnist_flat$data(1:2, cols = "image")[[1L]],
  rbind = TRUE
)[1:2, 1:4]
#> torch_tensor
#>  0  0  0  0
#>  0  0  0  0
#> [ CPUFloatType{2,4} ]
Below, we define a more complex architecture that has a single input, which is a lazy_tensor. For that, we first define a single residual block:
layer = list(
  po("nop"),
  po("nn_linear", out_features = 50L) %>>%
    po("nn_dropout") %>>%
    po("nn_relu")
) %>>% po("nn_merge_sum")
Next, we create a neural network that takes as input a
lazy_tensor
(po("torch_ingress_ltnsr")
). It
first applies a linear layer and then repeats the above layer using the
special PipeOpTorchBlock
, followed by the network’s head.
After that, we configure the loss, optimizer, and training parameters. Note that po("nn_linear_0") is equivalent to po("nn_linear", id = "nn_linear_0"); we need this here to avoid an ID clash with the linear layer inside po("nn_block").
= po("torch_ingress_ltnsr") %>>%
deep_network po("nn_linear_0", out_features = 50L) %>>%
po("nn_block", layer, n_blocks = 5L) %>>%
po("nn_head") %>>%
po("torch_loss", loss = t_loss("cross_entropy")) %>>%
po("torch_optimizer", optimizer = t_opt("adam")) %>>%
po("torch_model_classif",
epochs = 100L, batch_size = 32
)
Next, we prepend the preprocessing step that flattens the images so we can directly apply this learner to the unflattened MNIST task.
deep_learner = as_learner(
  flattener %>>% deep_network
)
deep_learner$id = "deep_network"
In order to keep track of the performance during training, we use 20% of the data for validation and evaluate it using classification accuracy.
set_validate(deep_learner, 0.2)
deep_learner$param_set$set_values(
  torch_model_classif.measures_valid = msr("classif.acc")
)
All that is left is to train the learner:
deep_learner$train(mnist)
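As a quick follow-up (not part of the original walkthrough), the trained learner can then be used for prediction like any other mlr3 learner:
# predict on the MNIST task and score the result
prediction = deep_learner$predict(mnist)
prediction$score(msr("classif.acc"))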
Feature overview:
- Off-the-shelf architectures are readily available as mlr3::Learners.
- Custom architectures can be defined using the Graph language from mlr3pipelines.
- Non-tabular data such as images is supported via the lazy_tensor type.
- Multi-modal data can be handled, because lazy_tensor objects can be stored alongside tabular data.
- The package is fully integrated into the mlr3 ecosystem.
- Hyperparameter tuning works via mlr3tuning and friends (see the sketch below).
- To run the full test suite, the environment variable TEST_TORCH = 1 must be set, e.g. by adding it to .Renviron.

Acknowledgements:
- Without the R package torch, none of this would have been possible.
- The way neural networks are built from PipeOpTorch operators is inspired by keras.
inspired by keras.mlr3torch is a free and open source software project that encourages participation and feedback. If you have any issues, questions, suggestions or feedback, please do not hesitate to open an “issue” about it on the GitHub page!
In case of problems / bugs, it is often helpful if you provide a “minimum working example” that showcases the behaviour (but don’t worry about this if the bug is obvious).
Please understand that the resources of the project are limited: response may sometimes be delayed by a few days, and some feature suggestions may be rejected if they are deemed too tangential to the vision behind the project.