Split CSV output by Patch and Aggregate by Mate or Dwell-Stage

This function reads the output files written by sim_trajectory_CSV and splits them into smaller files, one per patch. Each file is labeled with the appropriate patch number for mosquitoes or humans, and specific stages are aggregated by a given metric.

split_aggregate_CSV(
  read_dir,
  write_dir = read_dir,
  stage = c("E", "L", "P", "M", "U", "FS", "FE", "FI", "H"),
  spn_P,
  tmax,
  dt,
  erlang = FALSE,
  sum_fem = FALSE,
  rem_file = FALSE,
  verbose = TRUE
)

Arguments

read_dir	Directory where output was written to
write_dir	Directory to write output to. Default is read_dir
stage	Life stage to print, see details
spn_P	Places object, see details
tmax	The final time to end simulation
dt	The time-step at which to return output (not the time-step of the sampling algorithm)
erlang	Boolean. Default is FALSE, which returns summaries by genotype; see details
sum_fem	Boolean. If TRUE, in addition to the FS, FE, and FI output by node and repetition, write an additional file F that sums over infection states (S, E, I). Does nothing if the simulation did not include epi dynamics.
rem_file	Remove original output? Default is FALSE
verbose	Chatty? Default is TRUE

Value

Writes output to files in write_dir

Details

Given the read_dir, this function assumes the following file structure:

read_dir
- repetition 1
  - M.csv
  - FS.csv
  - ...
- repetition 2
  - M.csv
  - FS.csv
  - ...
- repetition 3
- ...
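
As a rough illustration of the layout above, the following sketch builds a throwaway directory with two repetition folders and lists what it finds. The folder names ("001", "002") and the demo path are assumptions made for the example; the actual names written by the simulation may differ.

```r
# Build a temporary directory that mimics the assumed layout.
# The folder names "001" and "002" are hypothetical stand-ins for repetitions.
read_dir <- file.path(tempdir(), "sim_output_demo")
rep_dirs <- file.path(read_dir, c("001", "002"))
for (d in rep_dirs) {
  dir.create(d, recursive = TRUE, showWarnings = FALSE)
  file.create(file.path(d, c("M.csv", "FS.csv")))
}

# One sub-directory per repetition, one CSV per stage inside each.
found_reps <- list.dirs(read_dir, full.names = TRUE, recursive = FALSE)
found_files <- lapply(found_reps, list.files, pattern = "\\.csv$")
names(found_files) <- basename(found_reps)
```

Here found_files is a named list with one character vector of stage files per repetition folder.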

This function expects the write_dir to be empty, and it sets up the same file structure as the read_dir. For a 2-node simulation, the output will be organized similarly to:

write_dir
- repetition 1
  - M_0001.csv
  - M_0002.csv
  - FS_0001.csv
  - FS_0002.csv
  - ...
- repetition 2
  - M_0001.csv
  - M_0002.csv
  - FS_0001.csv
  - FS_0002.csv
  - ...
- repetition 3
- ...
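
The per-patch naming pattern can be sketched in a few lines of R. This assumes each stage gets one file per node with a four-digit, zero-padded node index, as the listing suggests; the padding width is inferred from the example file names, not from the package source.

```r
# Stages requested and number of nodes in the (hypothetical) simulation.
stages <- c("M", "FS")
n_nodes <- 2L

# One file per stage and node: <stage>_<zero-padded node index>.csv
out_files <- unlist(lapply(stages, function(s) {
  sprintf("%s_%04d.csv", s, seq_len(n_nodes))
}))
```

For two stages and two nodes this yields four file names per repetition folder.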

stage defines which life stages the function will analyze. It may be any combination of: "E", "L", "P", "M", "U", "FS", "FE", "FI", "H", but the stages must come from the set provided to sim_trajectory_CSV via its stage argument. Fewer stages may be requested than were printed by the simulation. Any stage that is requested but was not printed throws a warning and is then ignored.
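
The warn-and-ignore behaviour described above might look roughly like the following. This is a minimal sketch, not the package's implementation; check_stages is a hypothetical helper, and the list of printed stages is made up for the example.

```r
# Hypothetical helper: keep only requested stages that the simulation printed,
# warning about any that are missing.
check_stages <- function(requested, printed) {
  extra <- setdiff(requested, printed)
  if (length(extra) > 0) {
    warning("Stages not in the simulation output, ignoring: ",
            paste(extra, collapse = ", "), call. = FALSE)
  }
  intersect(requested, printed)
}

# "H" was not printed in this made-up run, so it triggers a warning and is dropped.
kept <- check_stages(c("M", "FS", "H"), printed = c("E", "L", "P", "M", "FS"))
```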

erlang defines how aquatic stages (eggs, larvae, and pupae) and adult females (mated females only) are aggregated. By default, erlang is FALSE, and these stages are summarized by genotype only, combining any Erlang-distributed dwell stages (for eggs, larvae, and pupae) or latent infection stages (for adult females). If erlang is TRUE, summaries are returned by dwell stage or infection status, combining over genotype.
Female summaries always combine over mate genotype, so only female genotypes are returned.
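
To make the two aggregation modes concrete, here is a toy sketch using base R. The column labels ("E1_AA", "E2_AA", ...) are hypothetical and chosen only so that a dwell stage and a genotype can be read off each name; the real place names depend on the spn_P object.

```r
# Toy counts: 2 time points (rows), 2 egg dwell stages x 2 genotypes (columns).
# Column labels are hypothetical: <dwell stage>_<genotype>.
counts <- matrix(
  c(10, 5, 4, 1,
     8, 6, 3, 2),
  nrow = 2, byrow = TRUE,
  dimnames = list(NULL, c("E1_AA", "E2_AA", "E1_Aa", "E2_Aa"))
)

dwell <- sub("_.*$", "", colnames(counts))        # "E1" "E2" "E1" "E2"
genotype <- sub("^[^_]*_", "", colnames(counts))  # "AA" "AA" "Aa" "Aa"

# Analogue of erlang = FALSE: sum over dwell stages, keep genotypes.
by_genotype <- t(rowsum(t(counts), group = genotype, reorder = FALSE))

# Analogue of erlang = TRUE: sum over genotypes, keep dwell stages.
by_dwell <- t(rowsum(t(counts), group = dwell, reorder = FALSE))
```

Both results keep one row per time point; only the grouping of columns changes.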

The places (spn_P) object is generated from one of the following: spn_P_lifecycle_node, spn_P_lifecycle_network, spn_P_epiSIS_node, spn_P_epiSIS_network, spn_P_epiSEIR_node, or spn_P_epiSEIR_network.

tmax and dt define the sampling grid: tmax is the last sampling time, and dt is the interval between consecutive sampling times.
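
As a small worked example, the sampling times implied by tmax and dt can be written as a regular sequence. This sketch assumes that sampling starts at time 0, which is not stated here; the values of tmax and dt are arbitrary.

```r
tmax <- 10  # last sampling time (arbitrary example value)
dt <- 2     # spacing between samples (arbitrary example value)

# Assumes the first sample is taken at time 0.
sample_times <- seq(from = 0, to = tmax, by = dt)
```

With these values there are six sampling times, ending at tmax.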

For more details about using this function to process CSV output see: vignette("data-analysis", package = "MGDrivE2")
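
A call might be wrapped as in the sketch below. The wrapper process_output and its argument names are hypothetical; only the arguments passed to split_aggregate_CSV come from the usage shown above. It assumes that the places object was already built by one of the spn_P_* functions and that out_dir holds output from sim_trajectory_CSV run with the same tmax and dt.

```r
# Hypothetical wrapper around split_aggregate_CSV; not part of the package.
process_output <- function(out_dir, places, tmax, dt) {
  if (!requireNamespace("MGDrivE2", quietly = TRUE)) {
    stop("MGDrivE2 is required for this sketch")
  }
  MGDrivE2::split_aggregate_CSV(
    read_dir = out_dir,
    stage = c("M", "FS"),  # must be stages the simulation actually printed
    spn_P = places,
    tmax = tmax,
    dt = dt,
    erlang = FALSE,        # summarize by genotype
    verbose = FALSE
  )
}
```

The wrapper leaves write_dir at its default, so output is written alongside the original files.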