This function is used internally by enw_preprocess_data() to combine various pieces of processed observed data into a single object. It is exposed to the user to allow for modular data preprocessing, though this is not currently recommended. See the documentation and code of enw_preprocess_data() for more on the expected inputs.
Usage
enw_construct_data(
  obs,
  new_confirm,
  latest,
  missing_reference,
  reporting_triangle,
  metareport,
  metareference,
  metadelay,
  by,
  max_delay
)
Arguments
- obs
Observations with the addition of empirical reporting proportions and restricted to the specified maximum delay.
- new_confirm
Incidence of notifications by reference and report date. Empirical reporting distributions are also added.
- latest
The latest available observations.
- missing_reference
A data.frame of reported observations that are missing the reference date.
- reporting_triangle
Incident observations by report and reference date in the standard reporting triangle matrix format.
- metareport
Metadata for report dates.
- metareference
Metadata for reference dates derived from observations.
- metadelay
Metadata for reporting delays produced using enw_delay_metadata().
- by
A character vector describing the stratification of observations. This defaults to no grouping. This should be used when modelling multiple time series in order to identify them for downstream modelling.
- max_delay
Numeric, defaults to 20. Must be an integer greater than or equal to 1 (internally it will be coerced to one using as.integer()). The maximum number of days to include in the delay distribution. Computation scales non-linearly with this setting, so consider carefully what maximum makes sense for your data. Note that this is zero indexed and so includes the reference date and max_delay - 1 other days (i.e. a max_delay of 1 corresponds with no delay).
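The zero indexing of max_delay described above can be illustrated with a small base R sketch. It does not call any package function; the starting value of 3.7 is a hypothetical input chosen only to show the effect of as.integer() coercion, and the delays vector is an illustrative name rather than anything the package exposes.

```r
# Hypothetical non-integer input, used only for illustration.
max_delay <- 3.7

# as.integer() truncates toward zero, so 3.7 becomes 3.
max_delay <- as.integer(max_delay)

# Zero-indexed delays covered by this setting: the reference date (delay 0)
# plus max_delay - 1 further days.
delays <- seq_len(max_delay) - 1L
print(delays)

# With max_delay = 1 only delay 0 remains, i.e. no reporting delay.
no_delay <- seq_len(1L) - 1L
print(no_delay)
```

Under this reading, a max_delay of 20 would cover delays 0 through 19.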
Value
A data.table containing processed observations as a series of nested data.frames as well as variables containing metadata. These are:
- obs: Observations with the addition of empirical reporting proportions and restricted to the specified maximum delay.
- new_confirm: Incidence of notifications by reference and report date. Empirical reporting distributions are also added.
- latest: The latest available observations.
- missing_reference: Observations missing reference dates.
- reporting_triangle: Incident observations by report and reference date in the standard reporting triangle matrix format.
- metareference: Metadata for reference dates derived from observations.
- metareport: Metadata for report dates.
- metadelay: Metadata for reporting delays produced using enw_delay_metadata().
- time: Numeric, number of timepoints in the data.
- snapshots: Numeric, number of available data snapshots to use for nowcasting.
- groups: Numeric, number of groups/strata in the supplied observations (set using by).
- max_delay: Numeric, the maximum delay in the processed data.
- max_date: The maximum available report date.
See also
Preprocessing functions
enw_add_delay(), enw_add_max_reported(), enw_add_metaobs_features(), enw_assign_group(), enw_complete_dates(), enw_delay_filter(), enw_delay_metadata(), enw_extend_date(), enw_filter_reference_dates(), enw_filter_report_dates(), enw_latest_data(), enw_metadata(), enw_missing_reference(), enw_preprocess_data(), enw_reporting_triangle_to_long(), enw_reporting_triangle()
Examples
pobs <- enw_example("preprocessed")
enw_construct_data(
  obs = pobs$obs[[1]],
  new_confirm = pobs$new_confirm[[1]],
  latest = pobs$latest[[1]],
  missing_reference = pobs$missing_reference[[1]],
  reporting_triangle = pobs$reporting_triangle[[1]],
  metareport = pobs$metareport[[1]],
  metareference = pobs$metareference[[1]],
  metadelay = enw_delay_metadata(max_delay = 20),
  by = c(),
  max_delay = pobs$max_delay[[1]]
)
#> obs new_confirm latest
#> 1: <data.table[671x9]> <data.table[630x11]> <data.table[41x10]>
#> missing_reference reporting_triangle metareference
#> 1: <data.table[41x6]> <data.table[41x22]> <data.table[41x9]>
#> metareport metadelay time snapshots by groups max_delay
#> 1: <data.table[60x12]> <data.table[20x4]> 41 41 1 20
#> max_date
#> 1: 2021-08-22