This function reads the output files from sim_trajectory_CSV and splits them into smaller files. The files are written one per patch, named with the appropriate patch number for mosquitoes or humans, and the requested stages are aggregated either by genotype or by dwell stage/infection status (see the erlang argument).

split_aggregate_CSV(
  read_dir,
  write_dir = read_dir,
  stage = c("E", "L", "P", "M", "U", "FS", "FE", "FI", "H"),
  spn_P,
  tmax,
  dt,
  erlang = FALSE,
  sum_fem = FALSE,
  rem_file = FALSE,
  verbose = TRUE
)

Arguments

read_dir

Directory where the simulation output was written

write_dir

Directory to write output to. Default is read_dir

stage

Life stage to print, see details

spn_P

Places object, see details

tmax

The final time to end simulation

dt

The time-step at which to return output (not the time-step of the sampling algorithm)

erlang

Boolean. Default is FALSE, which returns summaries by genotype; see details

sum_fem

Boolean. If TRUE, in addition to the FS, FE, and FI output by node and repetition, write an additional file F that sums over infection states (S, E, I). Does nothing if the simulation did not include epi dynamics.

rem_file

Remove original output? Default is FALSE

verbose

Chatty? Default is TRUE

Value

Writes output to files in write_dir

Details

Given the read_dir, this function assumes the following file structure:

  • read_dir

    • repetition 1

      • M.csv

      • FS.csv

      • ...

    • repetition 2

      • M.csv

      • FS.csv

      • ...

    • repetition 3

    • ...

This function expects the write_dir to be empty, and it sets up the same file structure as the read_dir. For a 2-node simulation, the output will be organized similarly to:

  • write_dir

    • repetition 1

      • M_0001.csv

      • M_0002.csv

      • FS_0001.csv

      • FS_0002.csv

      • ...

    • repetition 2

      • M_0001.csv

      • M_0002.csv

      • FS_0001.csv

      • FS_0002.csv

      • ...

    • repetition 3

    • ...
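
The naming pattern for a 2-node simulation can be sketched in base R. This is only an illustration: patch_file_names is a hypothetical helper, not part of the package, and it assumes the stage name is followed by a four-digit, zero-padded patch number as in the listing.

```r
# Hypothetical helper (not part of MGDrivE2): build per-patch file names,
# assuming a "<stage>_<4-digit patch number>.csv" pattern.
patch_file_names <- function(stages, n_patch) {
  # expand.grid varies its first column fastest, so patches cycle within each stage
  grid <- expand.grid(patch = seq_len(n_patch), stage = stages,
                      stringsAsFactors = FALSE)
  sprintf("%s_%04d.csv", grid$stage, grid$patch)
}

file_names <- patch_file_names(c("M", "FS"), n_patch = 2)
print(file_names)
```

Each repetition folder in write_dir would then hold one such file per requested stage and patch.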

stage defines which life-stages the function will analyze. These stages must be any combination of: "E", "L", "P", "M", "U", "FS", "FE", "FI", "H". They must come from the set of stages provided to sim_trajectory_CSV via the stage argument. The set can be a subset of what was printed by the simulation, but any stages requested here that were not printed will throw a warning and then be ignored.
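
The stage-matching rule described above can be sketched as follows. This is a minimal illustration in base R, not the package's implementation; check_stages and its warning text are hypothetical.

```r
# Hypothetical helper (not part of MGDrivE2): keep only the requested stages
# that the simulation actually printed, and warn about the rest.
check_stages <- function(requested, printed) {
  extra <- setdiff(requested, printed)
  if (length(extra) > 0) {
    warning("Stages requested but not printed, ignoring: ",
            paste(extra, collapse = ", "))
  }
  # intersect() keeps the order of its first argument
  intersect(requested, printed)
}

# "H" was not printed by this hypothetical simulation, so it is dropped
kept <- suppressWarnings(
  check_stages(requested = c("M", "FS", "H"),
               printed = c("M", "FS", "FE", "FI"))
)
print(kept)
```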

erlang defines how aquatic (eggs, larvae, and pupae) stages and adult females (only mated females) are aggregated. By default, erlang is FALSE, and all of these stages are summarized by genotype only, combining any Erlang-distributed dwell stages (for eggs, larvae, and pupae) or latent infection (for adult females) stages. If erlang is TRUE, summaries are returned by dwell stage or infection status, combining any genotype information.
Female summaries always combine over mate-genotype, so only female genotypes are returned.
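
The two aggregation modes can be illustrated on a toy matrix. This is a sketch under stated assumptions: the column labels ("E1_AA" and so on) are hypothetical and chosen only to encode a dwell stage and a genotype, and the code does not reproduce the package's internals.

```r
# Toy egg counts at 3 time points. Column labels are hypothetical and follow
# an assumed "<dwell stage>_<genotype>" pattern for illustration only.
eggs <- matrix(1:12, nrow = 3,
               dimnames = list(NULL, c("E1_AA", "E2_AA", "E1_Aa", "E2_Aa")))

parts <- do.call(rbind, strsplit(colnames(eggs), "_"))
dwell <- parts[, 1]  # "E1" "E2" "E1" "E2"
geno  <- parts[, 2]  # "AA" "AA" "Aa" "Aa"

# Analogue of erlang = FALSE: sum over dwell stages, keep genotypes
by_genotype <- sapply(unique(geno), function(g) {
  rowSums(eggs[, geno == g, drop = FALSE])
})

# Analogue of erlang = TRUE: sum over genotypes, keep dwell stages
by_dwell <- sapply(unique(dwell), function(d) {
  rowSums(eggs[, dwell == d, drop = FALSE])
})

print(by_genotype)
print(by_dwell)
```

Both results have one row per time point; only the grouping of columns differs.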

The places (spn_P) object is generated from one of the following: spn_P_lifecycle_node, spn_P_lifecycle_network, spn_P_epiSIS_node, spn_P_epiSIS_network, spn_P_epiSEIR_node, or spn_P_epiSEIR_network.

tmax and dt define the last sampling time and the interval between successive sampling times.
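
As a small worked example, the sampling grid implied by tmax and dt can be written in base R. This assumes sampling starts at time 0, which is not stated on this page; the values of tmax and dt are arbitrary.

```r
# Assumption: output is sampled from time 0 up to tmax in steps of dt.
tmax <- 10
dt <- 2

sample_times <- seq(from = 0, to = tmax, by = dt)
print(sample_times)
print(length(sample_times))  # tmax / dt + 1 sampling times under this assumption
```

These values should match the tmax and dt used when the simulation output was generated.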

For more details about using this function to process CSV output see: vignette("data-analysis", package = "MGDrivE2")