Calculates secondary statistics (discrimination efficiency, DE, and z-score) from metric statistics, for use in developing a multi-metric index.

metric.stats2(
  data_metval,
  data_metstat,
  col_metval_RefStatus = "RefStatus",
  col_metval_DataType = "DataType",
  col_metval_Subset = "INDEX_CLASS",
  col_metstat_RefStatus = "RefStatus",
  col_metstat_DataType = "DataType",
  col_metstat_Subset = "INDEX_CLASS",
  RefStatus_Ref = "Ref",
  RefStatus_Str = "Str",
  RefStatus_Oth = "Oth",
  DataType_Cal = "Cal",
  DataType_Ver = "Ver",
  Subset_Value = NULL
)

Arguments

data_metval

Data frame of metric values (output of metric.values).

data_metstat

Data frame of metric statistics (output of metric.stats).

col_metval_RefStatus

Column name for Reference Status in data_metval. Default = "RefStatus"

col_metval_DataType

Column name for Data Type (Calibration vs. Verification) in data_metval. Default = "DataType"

col_metval_Subset

Column name for the subset (INDEX_CLASS) in data_metval. Default = "INDEX_CLASS"

col_metstat_RefStatus

Column name for Reference Status in data_metstat. Default = "RefStatus"

col_metstat_DataType

Column name for Data Type (Calibration vs. Verification) in data_metstat. Default = "DataType"

col_metstat_Subset

Column name for the subset (INDEX_CLASS) in data_metstat. Default = "INDEX_CLASS"

RefStatus_Ref

RefStatus value for Reference. Default = "Ref"

RefStatus_Str

RefStatus value for Stressed. Default = "Str"

RefStatus_Oth

RefStatus value for Other. Default = "Oth"

DataType_Cal

DataType value for Calibration. Default = "Cal"

DataType_Ver

DataType value for Verification. Default = "Ver"

Subset_Value

Subset value of INDEX_CLASS (site class). Default = NULL

Value

A data frame of the metric.stats input is returned with new columns (z_score, DE25 and DE75).

Details

Secondary metric statistics are calculated for the data.

Inputs are the outputs of metric.values and metric.stats.

Metric values are in wide format, with one column per metric. Only a single Subset is assumed.

Metric stats are in wide format, with one column per statistic and all metrics in a single column. Only a single Subset is assumed.

Required fields are RefStatus, DataType, and INDEX_CLASS. Users can supply their own column names for these fields for each input data frame.

The two statistics calculated are z-score and discrimination efficiency (DE) for each metric within each DataType (calibration / verification).

Z-scores are calculated using the calibration (or development) data set for a given INDEX_CLASS (or Site Class).

* (mean Ref - mean Str) / sd Ref
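The formula above can be illustrated with a minimal sketch in base R. The vectors `ref_vals` and `str_vals` are hypothetical values of one metric for reference and stressed calibration samples. They are not taken from the package data, and the sketch does not claim to reproduce the function's internal code.

```r
# Hypothetical values of a single metric (calibration data only)
ref_vals <- c(30, 32, 28, 35, 31)  # reference samples
str_vals <- c(20, 18, 25, 22, 19)  # stressed samples

# z-score = (mean Ref - mean Str) / sd Ref
z_score <- (mean(ref_vals) - mean(str_vals)) / sd(ref_vals)
z_score
```

A larger absolute z-score indicates greater separation between reference and stressed samples relative to the variability among reference samples.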

DE is calculated without knowing the expected direction of response for each metric for a given INDEX_CLASS (or Site Class), so both DE25 and DE75 are reported. DE is the percentage (0-100) of **stressed** samples that fall **below** the **25th** quantile (for decreaser metrics, e.g., total taxa) or **above** the **75th** quantile (for increaser metrics, e.g., HBI) of the **reference** samples.
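As a rough illustration of the DE definition, the sketch below computes both values in base R for one hypothetical metric. The vectors are made up, and two details are assumptions rather than confirmed package behavior: the default quantile method of `quantile()` (type 7) and the use of strict inequalities.

```r
# Hypothetical values of a single metric
ref_vals <- c(30, 32, 28, 35, 31)  # reference samples
str_vals <- c(20, 18, 25, 22, 33)  # stressed samples

# 25th and 75th quantiles of the reference samples
# (assumption: default quantile type)
q25 <- quantile(ref_vals, 0.25, names = FALSE)
q75 <- quantile(ref_vals, 0.75, names = FALSE)

# Percentage of stressed samples below q25 (decreaser metrics)
# and above q75 (increaser metrics); strict inequality is an assumption
de25 <- 100 * mean(str_vals < q25)
de75 <- 100 * mean(str_vals > q75)
```

Here four of the five stressed samples fall below the reference 25th quantile and one falls above the 75th, so this metric behaves like a decreaser.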

A data frame of the metric.stats input is returned with new columns (z_score, DE25 and DE75). The z-score is added for each RefStatus. DE25 and DE75 are only added where RefStatus is labeled as Stressed.

Examples

# data, benthos
df_bugs <- data_mmi_dev_small

# Munge Names
names(df_bugs)[names(df_bugs) %in% "BenSampID"]   <- "SAMPLEID"
names(df_bugs)[names(df_bugs) %in% "TaxaID"]      <- "TAXAID"
names(df_bugs)[names(df_bugs) %in% "Individuals"] <- "N_TAXA"
names(df_bugs)[names(df_bugs) %in% "Exclude"]     <- "EXCLUDE"
names(df_bugs)[names(df_bugs) %in% "Class"]       <- "INDEX_CLASS"
names(df_bugs)[names(df_bugs) %in% "Unique_ID"]   <- "SITEID"

# Add Missing Columns
df_bugs$ELEVATION_ATTR <- NA_character_
df_bugs$GRADIENT_ATTR  <- NA_character_
df_bugs$WSAREA_ATTR    <- NA_character_
df_bugs$HABSTRUCT      <- NA_character_
df_bugs$BCG_ATTR2      <- NA_character_
df_bugs$AIRBREATHER    <- NA
df_bugs$UFC            <- NA_real_

# Calc Metrics
cols_keep <- c("Ref_v1",
               "CalVal_Class4",
               "SITEID",
               "CollDate",
               "CollMeth")
# INDEX_NAME and INDEX_CLASS kept by default
df_metval <- metric.values(df_bugs, "bugs", fun.cols2keep = cols_keep)
#> Joining with `by = join_by(SAMPLEID, INDEX_NAME, INDEX_CLASS)`

# Calc Stats
col_metrics   <- names(df_metval)[9:ncol(df_metval)]
col_SampID    <- "SAMPLEID"
col_RefStatus <- "REF_V1"
RefStatus_Ref <- "Ref"
RefStatus_Str <- "Strs"
RefStatus_Oth <- "Other"
col_DataType  <- "CALVAL_CLASS4"
DataType_Cal  <- "cal"
DataType_Ver  <- "verif"
col_Subset    <- "INDEX_CLASS"
Subset_Value  <- "CentralHills"
df_stats <- metric.stats(df_metval,
                         col_metrics,
                         col_SampID,
                         col_RefStatus,
                         RefStatus_Ref,
                         RefStatus_Str,
                         RefStatus_Oth,
                         col_DataType,
                         DataType_Cal,
                         DataType_Ver,
                         col_Subset,
                         Subset_Value)
#> Working on item 1/6; Ref___cal
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Working on item 2/6; Ref___verif
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Working on item 3/6; Strs___cal
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Working on item 4/6; Strs___verif
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Working on item 5/6; Other___cal
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Working on item 6/6; Other___verif
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to max; returning -Inf
#> Warning: no non-missing arguments to max; returning -Inf

# Calc Stats2 (z-scores and DE)
data_metval           <- df_metval
data_metstat          <- df_stats
col_metval_RefStatus  <- "REF_V1"
col_metval_DataType   <- "CALVAL_CLASS4"
col_metval_Subset     <- "INDEX_CLASS"
col_metstat_RefStatus <- "REF_V1"
col_metstat_DataType  <- "CALVAL_CLASS4"
col_metstat_Subset    <- "INDEX_CLASS"
RefStatus_Ref         <- "Ref"
RefStatus_Str         <- "Strs"
RefStatus_Oth         <- "Other"
DataType_Cal          <- "cal"
DataType_Ver          <- "verif"
Subset_Value          <- "CentralHills"
df_stats2 <- metric.stats2(data_metval,
                           data_metstat,
                           col_metval_RefStatus,
                           col_metval_DataType,
                           col_metval_Subset,
                           col_metstat_RefStatus,
                           col_metstat_DataType,
                           col_metstat_Subset,
                           RefStatus_Ref,
                           RefStatus_Str,
                           RefStatus_Oth,
                           DataType_Cal,
                           DataType_Ver,
                           Subset_Value)
#> Error in metric.stats2(data_metval, data_metstat, col_metval_RefStatus,     col_metval_DataType, col_metval_Subset, col_metstat_RefStatus,     col_metstat_DataType, col_metstat_Subset, RefStatus_Ref,     RefStatus_Str, RefStatus_Oth, DataType_Cal, DataType_Ver,     Subset_Value): object 'col_Subset' not found

# \donttest{
# Save Results
write.table(df_stats2,
            file.path(tempdir(), "metric.stats2.tsv"),
            col.names = TRUE,
            row.names = FALSE,
            sep = "\t")
#> Error: object 'df_stats2' not found
# }