Ensures that all reference and report dates are present for
all groups based on the maximum and minimum dates found in the data.
This function may be useful when preprocessing data. In general, all
features that you may consider using as grouping variables or as
covariates need to be included in the by variable.
Usage
enw_complete_dates(
obs,
by = NULL,
max_delay,
missing_reference = TRUE,
completion_beyond_max_report = FALSE
)
Arguments
- obs
A data.frame containing at least the following variables: reference_date (index date of interest), report_date (report date for observations), confirm (cumulative observations by reference and report date).
- by
A character vector describing the stratification of observations. This defaults to no grouping. This should be used when modelling multiple time series in order to identify them for downstream modelling.
- max_delay
Numeric, defaults to 20. Needs to be an integer greater than or equal to 1 (internally it will be coerced to one using as.integer()). The maximum number of days to include in the delay distribution. Computation scales non-linearly with this setting, so consider carefully what maximum makes sense for your data. Note that this is zero indexed and so includes the reference date and max_delay - 1 other days (i.e. a max_delay of 1 corresponds with no delay).
- missing_reference
Logical, should entries for cases with missing reference date be completed as well? Default: TRUE
- completion_beyond_max_report
Logical, should entries be completed beyond the maximum date found in the data? Default: FALSE
Value
A data.table
with completed entries for all combinations of
reference dates, groups and possible report dates.
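The idea of "all combinations of reference dates and possible report dates" can be illustrated with a small base R sketch. This is a conceptual illustration only and not the package's implementation: it ignores groups, missing reference dates and max_delay, and the names reference_dates, max_report_date and grid are chosen here purely for illustration.

```r
# Conceptual sketch only: base R, not the epinowcast implementation.
reference_dates <- seq(as.Date("2021-10-01"), as.Date("2021-10-03"), by = "day")
max_report_date <- max(reference_dates)

# For each reference date, list every report date from that date up to the
# latest date in the data (a report cannot precede its reference date).
grid <- do.call(rbind, lapply(seq_along(reference_dates), function(i) {
  ref <- reference_dates[i]
  data.frame(
    reference_date = ref,
    report_date = seq(ref, max_report_date, by = "day")
  )
}))

print(grid)
```

For three consecutive reference dates this yields 3 + 2 + 1 = 6 rows, which corresponds to the rows with a non-missing reference date in the first example below.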
See also
Preprocessing functions
enw_add_delay(), enw_add_max_reported(), enw_add_metaobs_features(), enw_assign_group(), enw_construct_data(), enw_delay_filter(), enw_delay_metadata(), enw_extend_date(), enw_filter_reference_dates(), enw_filter_report_dates(), enw_latest_data(), enw_metadata(), enw_missing_reference(), enw_preprocess_data(), enw_reporting_triangle_to_long(), enw_reporting_triangle()
Examples
obs <- data.frame(
report_date = c("2021-10-01", "2021-10-03"), reference_date = "2021-10-01",
confirm = 1
)
enw_complete_dates(obs)
#> report_date reference_date confirm
#> 1: 2021-10-01 <NA> 0
#> 2: 2021-10-02 <NA> 0
#> 3: 2021-10-03 <NA> 0
#> 4: 2021-10-01 2021-10-01 1
#> 5: 2021-10-02 2021-10-01 1
#> 6: 2021-10-03 2021-10-01 1
#> 7: 2021-10-02 2021-10-02 0
#> 8: 2021-10-03 2021-10-02 0
#> 9: 2021-10-03 2021-10-03 0
# Allow completion beyond the maximum date found in the data
enw_complete_dates(obs, completion_beyond_max_report = TRUE, max_delay = 10)
#> report_date reference_date confirm
#> 1: 2021-10-01 <NA> 0
#> 2: 2021-10-02 <NA> 0
#> 3: 2021-10-03 <NA> 0
#> 4: 2021-10-01 2021-10-01 1
#> 5: 2021-10-02 2021-10-01 1
#> 6: 2021-10-03 2021-10-01 1
#> 7: 2021-10-04 2021-10-01 1
#> 8: 2021-10-05 2021-10-01 1
#> 9: 2021-10-06 2021-10-01 1
#> 10: 2021-10-07 2021-10-01 1
#> 11: 2021-10-08 2021-10-01 1
#> 12: 2021-10-09 2021-10-01 1
#> 13: 2021-10-10 2021-10-01 1
#> 14: 2021-10-11 2021-10-01 1
#> 15: 2021-10-02 2021-10-02 0
#> 16: 2021-10-03 2021-10-02 0
#> 17: 2021-10-04 2021-10-02 0
#> 18: 2021-10-05 2021-10-02 0
#> 19: 2021-10-06 2021-10-02 0
#> 20: 2021-10-07 2021-10-02 0
#> 21: 2021-10-08 2021-10-02 0
#> 22: 2021-10-09 2021-10-02 0
#> 23: 2021-10-10 2021-10-02 0
#> 24: 2021-10-11 2021-10-02 0
#> 25: 2021-10-12 2021-10-02 0
#> 26: 2021-10-03 2021-10-03 0
#> 27: 2021-10-04 2021-10-03 0
#> 28: 2021-10-05 2021-10-03 0
#> 29: 2021-10-06 2021-10-03 0
#> 30: 2021-10-07 2021-10-03 0
#> 31: 2021-10-08 2021-10-03 0
#> 32: 2021-10-09 2021-10-03 0
#> 33: 2021-10-10 2021-10-03 0
#> 34: 2021-10-11 2021-10-03 0
#> 35: 2021-10-12 2021-10-03 0
#> 36: 2021-10-13 2021-10-03 0
#> report_date reference_date confirm
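The by argument described above can be used when the data contain more than one time series. The following is a minimal sketch, assuming the epinowcast package is installed. The location column and its values are hypothetical and are not part of the original examples, and no output is shown because it has not been verified here.

```r
# Hypothetical data with two strata identified by a `location` column
# (the column name and values are assumptions made for illustration).
obs <- data.frame(
  location = c("A", "A", "B", "B"),
  report_date = c("2021-10-01", "2021-10-03", "2021-10-01", "2021-10-02"),
  reference_date = "2021-10-01",
  confirm = 1
)

# Only run the package call if epinowcast is available.
if (requireNamespace("epinowcast", quietly = TRUE)) {
  # Complete reference and report dates separately within each location
  completed <- epinowcast::enw_complete_dates(obs, by = "location")
  print(completed)
}
```

Any column that will later be used as a grouping variable or covariate should be listed in by so that it is carried through the completion step.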