Skip to contents

This vignette describes the parametric delay distributions that are currently available in epinowcast and explains how they are internally discretised.

## 
## Attaching package: 'data.table'
## The following object is masked from 'package:base':
## 
##     %notin%

Available distributions

The currently available parametric delay distributions are continuous probability distributions with (up to) two parameters \(\mu_{g,t}\) and \(\upsilon_{g,t}\). The table below provides a link to the definition of each distribution, specifies how the parameters \(\mu_{g,t}\) and \(\upsilon_{g,t}\) are mapped to the parameters of the distribution (according to the referenced definition), and states the resulting mean of the distribution (before discretization and adjustment for the assumed maximum delay).

Distribution Parametrization Mean
Log-normal \(\mu=\mu_{g,t}\), \(\sigma = \upsilon_{g,t}\) \(\exp(\mu_{g,t}+\frac{\upsilon_{g,t}^2}{2})\)
Exponential \(\beta = \exp(-\mu_{g,t})\) \(\exp(\mu_{g,t})\)
Gamma \(\alpha = \exp(\mu_{g,t})\), \(\beta = \upsilon_{g,t}\) \(\exp(\mu_{g,t})/\upsilon_{g,t}\)

The log-logistic distribution was previously available but has been dropped pending log-logistic support in primarycensored (epinowcast/primarycensored#321).

Discretisation and adjustment for maximum delay

In epinowcast, delays are modelled in discrete time and with an assumed maximum delay (specified via the max_delay argument). The continuous delay distributions must therefore be discretised and adjusted for the maximum delay.

It is helpful to separate two distinct adjustments. The first is discretisation: turning the continuous delay into a probability mass over integer delays \(d = 0, 1, 2, \dots\), with each \(p_d\) defined for an infinite maximum delay so that \(\sum_{d=0}^{\infty} p_d = 1\). The second is conditioning on the maximum delay \(D\): restricting attention to delays \(d \le D\) and renormalising so that the truncated probabilities sum to 1, i.e. \(p^{\prime}_{d} = p_d / \sum_{j=0}^{D} p_j\). The first step is about how a continuous distribution becomes discrete; the second is about right truncation at \(D\).

Double interval censoring with primarycensored

epinowcast discretises the parametric reference delay using the double interval censoring approach from the primarycensored package[1]. This accounts for the primary event window, the secondary (reporting) interval, and right truncation at the maximum delay \(D\).

Let \(F^{\mu_{g,t}, \upsilon_{g,t}}\) be the cumulative distribution function of the continuous delay distribution. The primary event (e.g. infection) is not observed exactly but is assumed uniform over a window of width 1 day. Censoring the continuous delay by this primary window gives \[Q(t) = \int_0^1 F^{\mu_{g,t}, \upsilon_{g,t}}(t - s) \, \mathrm{d}s,\] the cumulative probability that the delay, measured from the start of the primary window, is at most \(t\). The secondary event is observed in a daily reporting interval, so the mass on an integer delay \(d\) is the increment of \(Q\) over that interval, conditioned on the maximum delay \(D\), \[p_{g,t,d} = \frac{Q(d + 1) - Q(d)}{Q(D)}, \qquad d = 0, 1, \dots, D - 1.\] The denominator \(Q(D)\) applies the right truncation, so the discretised probabilities sum to 1.

primarycensored evaluates \(Q\) with analytical solutions for the supported distributions (the exponential is handled as a gamma with shape one), and the Stan implementation is vendored directly from the package. This is applied automatically to all available parametric distributions (lognormal, gamma and exponential); no argument is needed. See the primarycensored documentation for the full derivation, including arbitrary primary and secondary window widths.

The discretised mass function for a lognormal delay (meanlog = 1, sdlog = 0.5) truncated at a maximum delay of 15 days, obtained directly from primarycensored::dprimarycensored():

dmax <- 15
pmf <- data.table(
  delay = 0:(dmax - 1),
  probability = primarycensored::dprimarycensored(
    0:(dmax - 1), plnorm, pwindow = 1, swindow = 1, D = dmax,
    meanlog = 1, sdlog = 0.5
  )
)

ggplot(pmf, aes(x = delay, y = probability)) +
  geom_col(fill = "#3182bd") +
  labs(
    x = "Delay (days)", y = "Probability",
    title = "Discretised lognormal delay (meanlog = 1, sdlog = 0.5)"
  ) +
  theme_bw()

The same primarycensored machinery underpins delay handling elsewhere in the ecosystem. epidist estimates delay distributions from individual line-list data, and EpiNow2::estimate_dist() fits them from aggregated count data; both are powered by primarycensored, with the main difference being the data structure they expect. Estimating a delay with one of those tools and then passing it to enw_reference() keeps the censoring assumptions consistent across the workflow.

1.
Abbott, S., Brand, S., Azam, J. M., Pearson, C., Funk, S., & Charniga, K. (2024). Primarycensored: Primary event censored distributions. https://doi.org/10.5281/zenodo.13632839