Package 'ards'

Title: Creates Analysis Results Datasets
Description: Contains functions to help create an Analysis Results Dataset. The dataset follows industry recommended structure. The dataset can be created in multiple passes, using different data frames as input. Analysis Results Datasets are used in the pharmaceutical and biotech industries to capture analysis in a common tabular data structure.
Authors: David Bosak [aut, cre], Kevin Kramer [ctb], Jack Fuller [ctb], Matt Baldwin [ctb]
Maintainer: David Bosak <[email protected]>
License: CC0
Version: 0.1.1
Built: 2024-11-04 05:34:42 UTC
Source: https://github.com/dbosak01/ards

Help Index


Adds data to an Analysis Results Dataset

Description

The add_ards function dumps data from an input analysis dataset to the ARDS dataset. The function is designed to be pipe-friendly, and will return the input dataset unaltered. The parameters on the function define how to extract the desired data from the analysis dataset. The "statvars" parameter defines which columns contain desired analysis results. The values in these columns will be used to populate the "statval" variable in the output dataset. Other parameters are used to define identifying information for the statistics values, and are optional.

The add_ards function should be called immediately after any calculations, while the analysis results are still in numeric form. This recommendation is to ensure that the ARDS will contain full precision of the analysis values. Once the analysis values are dumped into the ARDS, you may proceed to transform and format your analysis data, without affecting the values captured in the ARDS.

Usage

add_ards(
  data,
  statvars,
  statdesc = NULL,
  byvars = NULL,
  trtvar = NULL,
  paramcd = NULL,
  anal_var = NULL,
  anal_val = NULL
)

Arguments

data

The input data to create analysis results for. This parameter is required.

statvars

A vector of column names that identify the desired results. Statvar columns must be numeric. This parameter is required.

statdesc

A vector of values or a column name that identifies a description for each statvar. If passed as a vector of values, the number of values should correspond to the number of 'statvar' variables.

byvars

A vector of column names to use for by variables.

trtvar

A column name to use for the treatment variable.

paramcd

A character string that describes the analysis parameter code or column name that contains the parameter code. If supplied as a column name, the function will populate the 'paramcd' column in the ARDS with the value of the 'paramcd' column.

anal_var

A column name for the analysis variable or a string that identifies the analysis variable.

anal_val

The analysis variable value. Can be identified by a column name or a vector of values. By default, the analysis values will be taken from the values of the variable passed in 'anal_var'. This parameter exists so that you may pass in the values from a different variable, if desired.

Value

The input data frame, unaltered.

See Also

Other ards: get_ards(), init_ards()

Examples

library(ards)
library(dplyr)

# Initialize the ARDS
init_ards(studyid = "MTCARS",
          tableid = "01", adsns = "mtcars",
          population = "all cars",
          time = "1973")

# Perform analysis on MPG
# - Add to ARDS from within continuous variable pipeline
mpgdf <- mtcars |> 
  select(cyl, mpg) |> 
  group_by(cyl) |> 
  summarize(n = n(),
            mean = mean(mpg),
            std = sd(mpg),
            min = min(mpg),
            max = max(mpg)) |> 
  add_ards(statvars = c("n", "mean", "std", "min", "max"),
         anal_var = "mpg", trtvar = "cyl")
            
# Perform analysis on GEAR
# - Add to ARDS from within categorical variable pipeline
geardf <- mtcars |> 
  mutate(denom = n()) |> 
  select(cyl, gear, denom) |> 
  group_by(cyl, gear) |> 
  summarize(cnt = n(), 
            denom = max(denom)) |>
  mutate(pct = cnt / denom * 100) |> 
  add_ards(statvars = c("cnt", "pct", "denom"),
         anal_var = "gear", trtvar = "cyl")

# Get the ARDS
ards <- get_ards() 

# Uncomment to view ards
# View(ards)

Returns the current Analysis Results Dataset

Description

The get_ards function returns the current state of the Analysis Results Dataset (ARDS) as an R data frame. This data frame may be saved to disk, saved in a database, or examined from code. The function takes no parameters.

Usage

get_ards()

Value

A data frame of the current analysis results.

See Also

Other ards: add_ards(), init_ards()

Examples

library(ards)
library(dplyr)

# Initialize the ARDS
# - These values will be common through the dataset
init_ards(studyid = "IRIS",
          tableid = "01", adsns = "iris",
          population = "all flowers",
          time = "1973")

# Perform analysis on Petal.Length
# - Using Species as a by-group
analdf1 <- iris |> 
  select(Petal.Length, Species) |> 
  group_by(Species) |> 
  summarize(n = n(),
            mean = mean(Petal.Length),
            std = sd(Petal.Length),
            min = min(Petal.Length),
            max = max(Petal.Length)) |> 
  add_ards(statvars = c("n", "mean", "std", "min", "max"),
           statdesc = c("Count", "Mean", "STD", "Minimum", "Maximum"),
           anal_var = "Petal.Length", trtvar = "Species")
           
# Perform analysis on Petal.Width
# - Using Species as a by-group
analdf2 <- iris |> 
  select(Petal.Width, Species) |> 
  group_by(Species) |> 
  summarize(n = n(),
            mean = mean(Petal.Width),
            std = sd(Petal.Width),
            min = min(Petal.Width),
            max = max(Petal.Width)) |> 
  add_ards(statvars = c("n", "mean", "std", "min", "max"),
           statdesc = c("Count", "Mean", "STD", "Minimum", "Maximum"),
           anal_var = "Petal.Width", trtvar = "Species")

# Get the ARDS
ards <- get_ards() 

# Uncomment to view ards
# View(ards)

Initialize the Analysis Results Dataset

Description

A function to initialize the Analysis Results Dataset (ARDS). This function will first create a data template in the desired structure, and then populate common values across the dataset from that template. These common values will be repeated on each row of the analysis data frame for subsequent inserts from the add_ards function.

Usage

init_ards(
  studyid = NA,
  tableid = NA,
  adsns = NA,
  population = NA,
  time = NA,
  where = NA,
  reset = TRUE
)

Arguments

studyid

The study for which the analysis was performed. This parameter is optional.

tableid

A table identifier to use for the results. This value identifies the table within the study. Optional string value.

adsns

A vector of source dataset names. This parameter is used to identify the input data for the analysis. This parameter is optional.

population

A description of the analysis population. This parameter is used to identify the population for analysis. This parameter is optional.

time

A description of the time frame used in the analysis. For example, in a clinical study, the "time" value may identify the visit on which the analysis is based.

where

An optional description of the criteria used to subset the data for analysis.

reset

If true, clears out the existing ARDS dataset and replaces with an empty template. Otherwise, just assign new parameter values to the existing template. The default value is TRUE, meaning the ARDS in memory will be cleared every time init_ards is called. If you wish to assign new initialization values, but keep appending to the existing ARDS dataset, set this parameter to FALSE. This feature is used when you are creating two different tables in the same program.

Value

The initialized analysis dataset.

See Also

Other ards: add_ards(), get_ards()

Examples

library(ards)
library(dplyr)

# Initialize the ARDS
# - These values will be common through the dataset
init_ards(studyid = "MTCARS",
          tableid = "01", adsns = "mtcars",
          population = "all cars",
          time = "1973")

# Perform analysis on MPG
# - Using cylinders as a by-group
analdf <- mtcars |> 
  select(cyl, mpg) |> 
  group_by(cyl) |> 
  summarize(n = n(),
            mean = mean(mpg),
            std = sd(mpg),
            min = min(mpg),
            max = max(mpg))

# View analysis data
analdf
#     cyl     n  mean   std   min   max
#   <dbl> <int> <dbl> <dbl> <dbl> <dbl>
# 1     4    11  26.7  4.51  21.4  33.9
# 2     6     7  19.7  1.45  17.8  21.4
# 3     8    14  15.1  2.56  10.4  19.2

# Add analysis data to ARDS
# - These values will be unique per row
add_ards(analdf, 
         statvars = c("n", "mean", "std", "min", "max"),
         anal_var = "mpg", trtvar = "cyl")


# Get the ARDS
ards <- get_ards() 

# Uncomment to view ards
# View(ards)