
Identify report dates with complete (i.e., up to the maximum delay) reference dates
Source: R/model-module-helpers.R
enw_reps_with_complete_refs.Rd
Arguments
- new_confirm
new_confirm data.frame output from enw_preprocess_data().
- max_delay
The maximum delay to model in the delay distribution, specified in units of the timestep (e.g., if timestep = "week", then max_delay = 3 means 3 weeks). If not specified the maximum observed delay is assumed to be the true maximum delay in the model. Otherwise, an integer greater than or equal to 1 can be specified. Observations with delays larger than the maximum delay will be dropped. If the specified maximum delay is too short, nowcasts can be biased as important parts of the true delay distribution are cut off. At the same time, computational cost scales non-linearly with this setting, so you want the maximum delay to be as long as necessary, but not much longer.
Steps to take to determine the maximum delay:
Consider what is realistic and relevant for your application.
Check the proportion of observations reported (prop_reported) by delay in the new_confirm output of enw_preprocess_obs.
Use check_max_delay() to check the coverage of a candidate max_delay.
If in doubt, check if increasing the maximum delay noticeably changes the delay distribution or nowcasts as estimated by epinowcast. If it does, your maximum delay may still be too short.
Note that delays are zero indexed and so include the reference date and max_delay - 1 other intervals (i.e. a max_delay of 1 corresponds to no delay).
- by
A character vector describing the stratification of observations. This defaults to no grouping. This should be used when modelling multiple time series in order to identify them for downstream modelling.
- copy
A logical; if TRUE (the default) creates a copy; otherwise, modifies obs in place.
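The guidance on choosing max_delay can be illustrated with a small base R sketch. The toy counts and the coverage() helper below are hypothetical and are not part of the package (check_max_delay() is the package's own tool for this); the sketch only shows how zero indexing affects which delays a candidate max_delay keeps.

```r
# Hypothetical counts of newly reported cases by delay (not package data).
# Delay 0 is the reference date itself.
toy <- data.frame(
  delay = 0:5,
  new_confirm = c(50, 25, 15, 6, 3, 1)
)

# Cumulative proportion of all observations reported by each delay.
toy$prop_reported <- cumsum(toy$new_confirm) / sum(toy$new_confirm)

# Hypothetical helper: share of observations kept by a candidate max_delay.
# Delays are zero indexed, so max_delay = 3 keeps delays 0, 1 and 2.
coverage <- function(max_delay, data) {
  kept <- data$delay < max_delay
  sum(data$new_confirm[kept]) / sum(data$new_confirm)
}

coverage(1, toy) # only delay 0, i.e. no delay
coverage(3, toy) # delays 0 to 2
coverage(6, toy) # all observed delays
```

If coverage is still rising noticeably at the candidate value, that suggests the candidate may be too short for this toy data.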
Value
A data.frame containing a report_date variable and any grouping variables specified in by, for report dates that have complete reporting.
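To make the returned value concrete, here is a minimal base R sketch of the idea. It assumes that a report date counts as complete when it has rows for max_delay reference dates (delays 0 to max_delay - 1); the toy data and variable names are hypothetical, and this is not presented as the package's implementation.

```r
max_delay <- 3

# Hypothetical long-format data: one row per reference date and delay.
reference_date <- rep(as.Date("2024-01-01") + 0:4, each = max_delay)
delay <- rep(0:(max_delay - 1), times = 5)
obs <- data.frame(reference_date, delay)
obs$report_date <- obs$reference_date + obs$delay

# Keep only reports that could have been observed by the last reference date.
obs <- obs[obs$report_date <= max(obs$reference_date), ]

# Count how many reference dates contribute to each report date.
n_refs <- table(format(obs$report_date))

# Report dates with at least max_delay contributing reference dates.
complete <- data.frame(
  report_date = as.Date(names(n_refs)[as.vector(n_refs) >= max_delay])
)
complete
```

In this toy example the first two report dates are dropped because fewer than three reference dates report on them. With the package itself, the corresponding call would use the arguments documented above, for example enw_reps_with_complete_refs(new_confirm, max_delay = 3).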