---
title: "delaydiscount"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{delaydiscount}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r setup}
library(delaydiscount)
```

The delaydiscount package is intended to be used to analyze delay discounting data using
the linearized hyperbolic model proposed by Hinds, et al. (2026). To analyze discounting data with this package, you should first get your data in the format expected by the package.

Our package expects to take data in long format, meaning that for each subject 
within group, for each delay, the measurement of the indifference point 
constitutes one observation in the data frame. It expects the data frame to contain
variables named 

* group: Intervention identifier.

* subj: Subject identifier.

* delay: The length of time for which the indifference point is measured.
 This must be a positive value.

* indiff: A value between 0 and 1 (exclusive) expressing the proportion of a
 maximum reward the subject would need to receive to not favor waiting the delay
 to receive the maximum reward.

For each subject within group, the indifference point is expected to be measured for the same set of delays, exactly once for each delay.

Samples are assumed to be independent between different groups. If a subject is in multiple groups, this assumption would be violated. The code will still run, but the analysis may not be valid. The analysis performed by this package would treat a subject in multiple groups as though the observations for that subject in different groups had come from different subjects.

We include the remedi dataset in our package as an example of a properly formatted dataset.

```{r}
head(remedi, n = 10L)
```

Once your delay discounting data is in the expected format, the next step is to 
run the `prepare_data_frame` function on the dataset, which performs helper 
calculations for our other functions.

```{r}
prep_remedi <- prepare_data_frame(remedi)
```

One of the things that the `prepare_data_frame` function does is transform the delay
and indifference variables to linearize the model.

Note that the hyperbolic model for the discounting curve is $D(t) = \frac{1}{1+kt}$.

Then we will have $\log(\frac{1}{D(t)} - 1) = \log(k) + \log(t)$.

We assume that the transformed observed indifference point $\tilde D(t_{ijc})$ will follow the model

$\log(\frac{1}{\tilde D(t_{ijc})} - 1) = \log(k_{ic}) + \log(t_{ijc}) + \epsilon_{ijc},$

where $\epsilon_{ijc} \sim N(0, \sigma^2)$, $i$ is an index for subject, $j$ an index for time point, and $c$ is an index for group, and

$\log(k_{ic}) \sim N(\theta_c, \frac{g\sigma^2}{T})$, where $\theta_c$ is the population mean of the $\log(k_{ic})$ for subjects in group $c$, and $T$ is the number of time points observed for each subject.

Using the prepared data frame, you can get estimates of the $log(k)$ values for
each subject using the `get_subj_est_ln_k` function.

```{r}
ln_k_ests <- get_subj_est_ln_k(prep_remedi)
head(ln_k_ests)
```

Our model treats the $\log(k)$ values for individual subjects as the result of a combination
of a fixed group effect and a random subject effect. To get the fit of the linearized
hyperbolic model, use the `dd_hyperbolic_model` function.

```{r}
model_fit <- dd_hyperbolic_model(prep_remedi)
```

The object produced by this method is a list containing several components.

The group mean $\log(k)$ estimates (i.e. the $\theta_c$) can be gotten from the `ln_k_mean` component.

```{r}
model_fit$ln_k_mean
```

These are estimates of the mean of the distribution from which the $\log(k_{ic})$ values for subjects in group $c$ have been realized.

The variance component estimates can be obtained from the `var` component.

```{r}
model_fit$var
```

`sigma_sq` is the estimate of the variance of the transformed indifference value
conditional on the subject effect.

`g` is related to the variance of the subject random effects: specifically, the
variance of the subject random effect is equal to $\frac{g\sigma^2}{T}$, where 
$T$ is the number of time points observed for each subject.

These mean and variance parameters are estimated through maximum likelihood 
estimation.

An overall F-test for the equality of all group mean parameters can be found in
the `model_test` component.

```{r}
model_fit$model_test
```

Pairwise F-tests for equality of group mean parameters can be obtained from the
`pairwise_f_tests` component. The data frame contains an F-test for each pair.

```{r}
model_fit$pairwise_f_tests
```

Note that, as with other R functions such as `lm` and `glm`, these p-values are not adjusted for multiplicity.

## Citation

Hinds, et al. (2026). "To linearize or not to linearize: That is the
Mazur delay discounting question", submitted to *Journal of Mathematical Psychology*.
