In this document we show how to use the likelihood
method to obtain function handlers for the objective function, its
gradient, and its Hessian (the latter only when using a Kalman filter),
for instance to use an optimization algorithm other than
stats::nlminb.
We use the common Ornstein-Uhlenbeck process to showcase the use of
likelihood:
\[ \mathrm{d}X_{t} = \theta (\mu - X_{t}) \, \mathrm{d}t \, + \sigma_{X} \, \mathrm{d}B_{t} \]
\[ Y_{t_{k}} = X_{t_{k}} + e_{t_{k}}, \qquad e_{t_{k}} \sim \mathcal{N}\left(0,\sigma_{Y}^{2}\right) \]
We first create data by simulating the process:
# Simulate data using Euler-Maruyama
set.seed(10)
theta=10; mu=1; sigma_x=1; sigma_y=1e-1
#
dt.sim = 1e-3
t.sim = seq(0,1,by=dt.sim)
dw = rnorm(length(t.sim)-1,sd=sqrt(dt.sim))
x = 3
for(i in 1:(length(t.sim)-1)) {
x[i+1] = x[i] + theta*(mu-x[i])*dt.sim + sigma_x*dw[i]
}
# Extract observations and add noise
dt.obs = 1e-2
t.obs = seq(0,1,by=dt.obs)
y = x[t.sim %in% t.obs] + sigma_y * rnorm(length(t.obs))
# Create data
.data = data.frame(
t = t.obs,
y = y
)
We now construct the ctsmTMB model object:
# Create model object
obj = ctsmTMB$new()
# Add system equations
obj$addSystem(
dx ~ theta * (mu-x) * dt + sigma_x*dw
)
# Add observation equations
obj$addObs(
y ~ x
)
# Set observation equation variances
obj$setVariance(
y ~ sigma_y^2
)
# Specify algebraic relations
obj$setAlgebraics(
theta ~ exp(logtheta),
sigma_x ~ exp(logsigma_x),
sigma_y ~ exp(logsigma_y)
)
# Specify parameter initial values and lower/upper bounds in estimation
obj$setParameter(
logtheta = log(c(initial = 5, lower = 0, upper = 20)),
mu = c( initial = 0, lower = -10, upper = 10),
logsigma_x = log(c(initial = 1e-1, lower = 1e-5, upper = 5)),
logsigma_y = log(c(initial = 1e-1, lower = 1e-5, upper = 5))
)
# Set initial state mean and covariance
obj$setInitialState(list(x[1], 1e-1*diag(1)))
We are in principle ready to call the estimate method to
run the optimization scheme using the built-in optimization, which uses
stats::nlminb, i.e.:
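A minimal call could look as follows (a sketch; assuming the data frame
is the only required argument and all other options are left at their
defaults):
# Estimate parameters using the built-in optimizer
fit = obj$estimate(.data)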
## Building model...
## Checking data...
## Constructing objective function and derivative tables...
## ...took: 0.043 seconds.
## Minimizing the negative log-likelihood...
## 0: 936.11682: 1.60944 0.00000 -2.30259 -2.30259
## 1: 87.083269: 1.05839 0.612612 -1.83165 -1.98751
## 2: 30.831804: 1.42872 0.571166 -1.47012 -1.13285
## 3: 27.093246: 1.22027 1.42418 -1.16175 -1.49867
## 4: 0.42802599: 1.21559 1.07263 -0.807822 -1.53233
## 5: -29.837043: 1.68587 0.793054 0.591933 -2.65227
## 6: -32.378284: 1.71155 0.797170 0.422376 -2.67022
## 7: -33.322237: 1.80730 0.816713 0.294476 -2.60826
## 8: -35.484031: 2.11436 0.813309 0.261892 -2.45452
## 9: -36.943433: 2.20642 0.984187 0.199473 -2.38963
## 10: -38.269996: 2.39678 1.01007 0.128045 -2.32821
## 11: -38.662773: 2.46559 1.03990 0.118501 -2.31645
## 12: -39.208765: 2.61559 1.05578 0.0954213 -2.30526
## 13: -39.271267: 2.67270 1.09444 0.113168 -2.30196
## 14: -39.341842: 2.73859 1.07795 0.123778 -2.32079
## 15: -39.346399: 2.71897 1.07807 0.124258 -2.33025
## 16: -39.347572: 2.71957 1.07662 0.127387 -2.32720
## 17: -39.347641: 2.72307 1.07756 0.127613 -2.32681
## 18: -39.347669: 2.72144 1.07762 0.127756 -2.32677
## 19: -39.347672: 2.72159 1.07745 0.127677 -2.32686
## 20: -39.347672: 2.72164 1.07749 0.127690 -2.32684
## 21: -39.347672: 2.72163 1.07749 0.127690 -2.32684
## Optimization finished!:
## Elapsed time: 0.004 seconds.
## The objective value is: -3.934767e+01
## The maximum gradient component is: 8.4e-06
## The convergence message is: relative convergence (4)
## Iterations: 21
## Evaluations: Fun: 30 Grad: 22
## See stats::nlminb for available tolerance/control arguments.
## Returning results...
## Finished!
Inside the package we optimise the objective function with respect to
the fixed parameters using the function handlers constructed by
TMB::MakeADFun and passing them to
stats::nlminb, i.e.:
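Conceptually this amounts to the following (a sketch only, not the
package's exact internal code; nll denotes the object returned by
TMB::MakeADFun):
# Sketch of the internal optimisation:
# nll = TMB::MakeADFun(data, parameters, ...)
# opt = stats::nlminb(start = nll$par, objective = nll$fn, gradient = nll$gr)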
The likelihood method allows you to retrieve the
nll object that holds the negative log-likelihood and its
derivatives. The method takes arguments similar to those of
estimate.
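A minimal call (assuming, as with estimate, that the data frame is
passed as the first argument):
# Retrieve the negative log-likelihood function handlers
nll = obj$likelihood(.data)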
## Checking data...
## Succesfully returned function handlers
The initial parameters (supplied by the user) are stored here
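nll$par # initial parameter vector, as returned by TMB::MakeADFun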
## logtheta mu logsigma_x logsigma_y
## 1.609438 0.000000 -2.302585 -2.302585
The objective function can be evaluated by
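nll$fn(nll$par) # negative log-likelihood at the initial parameters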
## [1] 936.1168
The gradient can be evaluated by
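nll$gr(nll$par) # gradient (a 1 x 4 matrix) at the initial parameters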
## [,1] [,2] [,3] [,4]
## [1,] 1430.881 -1590.748 -1222.864 -818.151
The Hessian can be evaluated by
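nll$he(nll$par) # Hessian (a 4 x 4 matrix) at the initial parameters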
## [,1] [,2] [,3] [,4]
## [1,] 2348.091 -2949.2028 -1700.6171 -1167.7123
## [2,] -2949.203 1691.7601 2308.7901 874.6161
## [3,] -1700.617 2308.7901 938.3781 1516.2869
## [4,] -1167.712 874.6161 1516.2869 311.1072
We can now use these to optimize the function using
e.g. stats::optim instead.
You can extract the parameter bounds specified in the
setParameter() call by using the
getParameters method (note that nll$par
and pars$initial are identical):
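# Extract the parameter table (a sketch; assuming getParameters is
# called on the model object)
pars = obj$getParameters()
pars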
## type estimate initial lower upper
## logtheta free 2.7216294 1.609438 -Inf 2.995732
## mu free 1.0774882 0.000000 -10.00000 10.000000
## logsigma_x free 0.1276898 -2.302585 -11.51293 1.609438
## logsigma_y free -2.3268411 -2.302585 -11.51293 1.609438
stats::optim
We supply the initial parameter values, the objective function and
gradient handlers, and the parameter bounds to optim.
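A sketch of such a call, using method "L-BFGS-B" so that the box
constraints can be passed along (lower and upper are taken from the
pars table above):
# Optimize the extracted handlers with stats::optim
opt = stats::optim(
par = nll$par,
fn = nll$fn,
gr = nll$gr,
method = "L-BFGS-B",
lower = pars$lower,
upper = pars$upper
)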
Let's compare the results from using stats::optim with
the extracted function handlers versus the internal optimisation that
uses stats::nlminb, stored in fit:
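A sketch of the comparison (assuming opt from the stats::optim call
above; the internal estimates are taken from the estimate column of
the pars table, which reflects the fit):
# Compare parameter estimates
cbind(external = opt$par, internal = pars$estimate)
# Compare objective values
c(external = opt$value, internal = nll$fn(pars$estimate))
# Compare gradients at the respective optima
cbind(external = as.vector(nll$gr(opt$par)),
internal = as.vector(nll$gr(pars$estimate)))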
## external internal
## logtheta 2.7216300 2.7216294
## mu 1.0774878 1.0774882
## logsigma_x 0.1276904 0.1276898
## logsigma_y -2.3268419 -2.3268411
## external internal
## 1 -39.34767 -39.34767
## external internal
## 1 7.587709e-06 -8.417709e-06
## 2 -5.872425e-05 8.215885e-06
## 3 1.062722e-05 4.106731e-06
## 4 -1.917425e-05 1.643487e-06