Removes observations where the reference_date is earlier than the minimum report_date within each group. Rows with missing reference_date are retained. This is useful for ensuring that observations are only included from the first available report date onwards.

This function is typically called before enw_add_incidence() so that the incidence calculation starts from a valid reporting window. Without this step, reference dates that predate any report date produce spurious leading entries in the incidence output.

Usage

enw_filter_reference_dates_by_report_start(obs, by = NULL, copy = TRUE)

Arguments

obs

A data.frame with reference_date and report_date columns.

by

A character vector describing the stratification of observations. Defaults to NULL (no grouping). Use this when modelling multiple time series so that each series can be identified in downstream modelling.

copy

Should obs be copied (default) or modified in place?

Value

A data.table filtered so that each reference_date is on or after the minimum report_date in its group. Rows with NA reference_date are kept.

Examples

library(data.table)
#> 
#> Attaching package: ‘data.table’
#> The following object is masked from ‘package:base’:
#> 
#>     %notin%
obs <- data.table(
  reference_date = as.IDate(c(
    "2021-10-01", "2021-10-02", "2021-10-03"
  )),
  report_date = as.IDate(c(
    "2021-10-02", "2021-10-02", "2021-10-03"
  ))
)
# The first row has reference_date before the minimum
# report_date, so it is removed
enw_filter_reference_dates_by_report_start(obs)
#>    reference_date report_date
#>            <IDat>      <IDat>
#> 1:     2021-10-02  2021-10-02
#> 2:     2021-10-03  2021-10-03
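
The example above covers only the ungrouped case. The sketch below illustrates the grouped behaviour and the retention of missing reference dates described under Value, using plain data.table operations. It is not the package's implementation, only a minimal approximation of the documented rule; the `location` column, the data values, and the helper column `min_report` are hypothetical.

```r
library(data.table)

# Hypothetical data: two locations, each with its own first report date
obs <- data.table(
  location = c("A", "A", "A", "B", "B", "B"),
  reference_date = as.IDate(c(
    "2021-10-01", "2021-10-03", NA,
    "2021-10-01", "2021-10-02", "2021-10-04"
  )),
  report_date = as.IDate(c(
    "2021-10-02", "2021-10-03", "2021-10-04",
    "2021-10-03", "2021-10-03", "2021-10-04"
  ))
)

# Work on a copy so that obs itself is left unchanged
filtered <- copy(obs)
# Minimum report_date within each group (helper column, hypothetical name)
filtered[, min_report := min(report_date), by = "location"]
# Keep rows whose reference_date is missing or on/after the group minimum
filtered <- filtered[is.na(reference_date) | reference_date >= min_report]
# Drop the helper column again
filtered[, min_report := NULL]

# Location A keeps the 2021-10-03 row and the NA row;
# location B keeps only the 2021-10-04 row.
# With epinowcast loaded, the documented call for the same grouping is:
# enw_filter_reference_dates_by_report_start(obs, by = "location")
print(filtered)
```

Because each location is compared against its own minimum report_date, a reference date that is valid in one group can still be dropped in another.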