
This function aggregates observations over a specified timestep, ensuring that report and reference dates are aligned on the same day of the week. It is useful for aggregating data to a weekly timestep, for example, which may be desirable when testing with a weekly timestep or when runtime is a concern. Note that the start of the timestep is determined by the minimum reference date plus a single timestep (i.e. the first timestep will be "2022-10-23" if the minimum reference date is "2022-10-16").

Usage

enw_aggregate_cumulative(
  obs,
  timestep = "day",
  by = NULL,
  min_reference_date = min(obs$reference_date, na.rm = TRUE),
  copy = TRUE
)

Arguments

obs

An object coercible to a data.table (such as a data.frame) which must have a confirm numeric column, and report_date and reference_date date columns. The input must have a timestep of a day and be complete. See enw_complete_dates() for more information. If NA values are present in the confirm column, they will be set to zero before aggregation. This may not be desirable if the missingness is meaningful.
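Because missing counts are zeroed before aggregation, it can be worth checking for them first. The following is a minimal base R sketch, not part of the package: the toy obs data frame and the n_missing variable are hypothetical and only mirror the column layout described above.

```r
# Hypothetical toy data in the layout described above: one row per
# reference_date/report_date pair with a numeric confirm column.
obs <- data.frame(
  reference_date = as.Date("2022-10-16") + c(0, 0, 1),
  report_date = as.Date("2022-10-16") + c(0, 1, 1),
  confirm = c(3, NA, 2)
)

# Count missing values so the decision to treat them as zero is explicit.
n_missing <- sum(is.na(obs$confirm))
if (n_missing > 0) {
  message(n_missing, " missing confirm value(s) would be set to zero")
}

# Inspect the affected rows before deciding whether zero is appropriate.
obs[is.na(obs$confirm), ]
```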

timestep

The timestep to use. This can be a string ("day", "week", "month") or a whole number giving the number of days.

by

A character vector of variables to also aggregate by (i.e. as well as using the reference_date and report_date). If not supplied then the function will aggregate by just the reference_date and report_date.

min_reference_date

The minimum reference date to start the aggregation from. Note that the timestep will start from the minimum reference date + a single timestep (i.e. the first timestep will be "2022-10-23" if the minimum reference date is "2022-10-16"). The default is the minimum reference date in the obs object. Another sensible value is the minimum report date in the obs object + 1 day, if reporting is already weekly and you wish to ensure that the timestep of the output matches the reporting timestep.
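The date arithmetic in this description can be reproduced in base R. This is only a sketch of the rule as documented here, not the function's internal code. The variable names are illustrative, and the 7-day value for a weekly timestep is an assumption.

```r
# Values taken from the documented example.
min_reference_date <- as.Date("2022-10-16")
timestep_days <- 7 # assumed number of days for timestep = "week"

# First timestep: minimum reference date plus a single timestep.
first_timestep <- min_reference_date + timestep_days
format(first_timestep)
# "2022-10-23"

# Subsequent timesteps follow at the same spacing.
timesteps <- seq(first_timestep, by = timestep_days, length.out = 3)

# ISO weekday number (1 = Monday, ..., 7 = Sunday): every date shares the
# weekday of the minimum reference date.
format(c(min_reference_date, timesteps), "%u")
```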

copy

Should obs be copied (default) or modified in place?

Value

A data.table with aggregated observations.

Examples

nat_hosp <- germany_covid19_hosp[location == "DE"][age_group == "00+"]
enw_aggregate_cumulative(nat_hosp, timestep = "week")
#>      reference_date location age_group confirm report_date
#>              <IDat>   <fctr>    <fctr>   <int>      <IDat>
#>   1:     2021-04-12       DE       00+    3505  2021-04-12
#>   2:     2021-04-12       DE       00+    5276  2021-04-19
#>   3:     2021-04-12       DE       00+    5993  2021-04-26
#>   4:     2021-04-12       DE       00+    6362  2021-05-03
#>   5:     2021-04-12       DE       00+    6594  2021-05-10
#>  ---                                                      
#> 266:     2021-10-04       DE       00+    2091  2021-10-11
#> 267:     2021-10-04       DE       00+    2346  2021-10-18
#> 268:     2021-10-11       DE       00+    1312  2021-10-11
#> 269:     2021-10-11       DE       00+    2163  2021-10-18
#> 270:     2021-10-18       DE       00+    1597  2021-10-18
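
As noted under Arguments, timestep can also be a whole number of days. The sketch below reuses the example data with a 14-day timestep. It assumes the epinowcast package is installed, and fortnightly is just an illustrative variable name. No output is shown because it has not been run here.

```r
# Only run if epinowcast (which provides enw_aggregate_cumulative() and the
# germany_covid19_hosp example data) is available.
if (requireNamespace("epinowcast", quietly = TRUE)) {
  library(epinowcast)

  # Same national, all-ages subset as in the example above.
  nat_hosp <- germany_covid19_hosp[location == "DE"][age_group == "00+"]

  # Aggregate to a timestep given as a whole number of days.
  fortnightly <- enw_aggregate_cumulative(nat_hosp, timestep = 14)
  print(fortnightly)
}
```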