This function is used internally by enw_preprocess_data() to combine the various pieces of processed observed data into a single object. It is exposed to the user to allow for modular data preprocessing, though this is not currently recommended. See the documentation and code of enw_preprocess_data() for more on the expected inputs.

Usage

enw_construct_data(
  obs,
  new_confirm,
  latest,
  missing_reference,
  reporting_triangle,
  metareport,
  metareference,
  metadelay,
  max_delay,
  timestep,
  by
)

Arguments

obs

Observations with the addition of empirical reporting proportions and restricted to the specified maximum delay.

new_confirm

Incidence of notifications by reference and report date. Empirical reporting distributions are also added.

latest

The latest available observations.

missing_reference

A data.frame of reported observations that are missing the reference date.

reporting_triangle

Incident observations by report and reference date in the standard reporting triangle matrix format.

metareport

Metadata for report dates.

metareference

Metadata for reference dates derived from observations.

metadelay

Metadata for reporting delays produced using enw_metadata_delay().

max_delay

Maximum delay to be modelled by epinowcast.

timestep

The timestep to be used in the process model (i.e. the reference date model). This can be a string ("day", "week", "month") or a whole number giving the number of days. If your data does not have this timestep, you may wish to use enw_aggregate_cumulative() to aggregate your data to the desired timestep.

by

A character vector describing the stratification of observations. This defaults to no grouping. It should be used when modelling multiple time series so that each series can be identified in downstream modelling.
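As a rough illustration of how by identifies multiple time series, the sketch below stratifies by age group through enw_preprocess_data(), which calls this function internally. It assumes the germany_covid19_hosp dataset bundled with epinowcast and its age_group column; the date window and max_delay are arbitrary choices made to keep the example small.

```r
library(epinowcast)

# National data for all age groups (germany_covid19_hosp ships with epinowcast)
obs <- germany_covid19_hosp[location == "DE"]

# Restrict to a short window so preprocessing stays fast (arbitrary choice)
obs <- enw_filter_report_dates(obs, latest_date = "2021-10-01")
obs <- enw_filter_reference_dates(obs, include_days = 40)

# Treat each age group as its own time series
pobs <- enw_preprocess_data(obs, by = "age_group", max_delay = 20)

# Number of strata identified from the by variables
pobs$groups
```

With by left at its default, all observations are treated as a single time series and groups is 1.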

Value

A data.table containing processed observations as a series of nested data.frames as well as variables containing metadata. These are:

  • obs: Observations with the addition of empirical reporting proportions and restricted to the specified maximum delay.

  • new_confirm: Incidence of notifications by reference and report date. Empirical reporting distributions are also added.

  • latest: The latest available observations.

  • missing_reference: Observations missing reference dates.

  • reporting_triangle: Incident observations by report and reference date in the standard reporting triangle matrix format.

  • metareference: Metadata for reference dates derived from observations.

  • metareport: Metadata for report dates.

  • metadelay: Metadata for reporting delays produced using enw_metadata_delay().

  • max_delay: Maximum delay to be modelled by epinowcast.

  • time: Numeric, number of timepoints in the data.

  • snapshots: Numeric, number of available data snapshots to use for nowcasting.

  • groups: Numeric, number of groups/strata in the supplied observations (set using by).

  • max_date: The maximum available report date.
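Because the nested data.frames are stored in list columns, each one has to be extracted with [[1]] before use, while the scalar metadata can be read directly. A minimal sketch, using the preprocessed example object returned by enw_example("preprocessed"):

```r
library(epinowcast)

# Preprocessed example data shipped with the package
pobs <- enw_example("preprocessed")

# List columns hold nested data.tables; extract one with [[1]]
obs <- pobs$obs[[1]]
head(obs)

# Scalar metadata columns can be read directly
pobs$max_delay
pobs$groups
pobs$max_date
```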

Examples

pobs <- enw_example("preprocessed")
enw_construct_data(
  obs = pobs$obs[[1]],
  new_confirm = pobs$new_confirm[[1]],
  latest = pobs$latest[[1]],
  missing_reference = pobs$missing_reference[[1]],
  reporting_triangle = pobs$reporting_triangle[[1]],
  metareport = pobs$metareport[[1]],
  metareference = pobs$metareference[[1]],
  metadelay = pobs$metadelay[[1]],
  max_delay = pobs$max_delay,
  timestep = pobs$timestep[[1]],
  by = c()
)
#>                    obs          new_confirm              latest
#>                 <list>               <list>              <list>
#> 1: <data.table[671x9]> <data.table[630x11]> <data.table[41x10]>
#>     missing_reference  reporting_triangle      metareference
#>                <list>              <list>             <list>
#> 1: <data.table[41x6]> <data.table[41x22]> <data.table[41x9]>
#>             metareport          metadelay max_delay  time snapshots     by
#>                 <list>             <list>     <num> <int>     <int> <list>
#> 1: <data.table[60x12]> <data.table[20x5]>        20    41        41       
#>    groups   max_date timestep
#>     <int>     <IDat>   <char>
#> 1:      1 2021-08-22      day