```r
library(Distance)
library(kableExtra)   # provides kable(), kable_paper(), column_spec() used below
data(amakihi)
```
Alternative model selection metrics
The other metrics
AICc
Defined as:
\[AICc = -2\ln(\mathscr{L}) + 2k + \frac{2k(k+1)}{n-k-1}\]
where \(k\) is the number of parameters in the model and \(n\) is the number of observations. In general, if \(n\) is many times larger than \(k^2\), the extra term will be negligible. What do we know, in general, about the magnitudes of \(k\) and \(n\) in typical distance sampling situations? We encourage the collection of 60 to 80 detections (\(n\)). A hazard rate key function with a four-level factor covariate has \(k=5\) parameters; the value of the \(c\) term added to the regular AIC then ranges from 1.11 with \(n=60\) to 0.81 with \(n=80\).
For a two-parameter model (a hazard rate without covariates, or a half normal with a single continuous covariate), the additional term ranges from 0.21 with \(n=60\) to 0.16 with \(n=80\). Recognise that the magnitudes of the other terms in the AIC are in the hundreds or thousands (see below).
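To see these magnitudes concretely, here is a small sketch (the helper name `aicc_correction` is ours, for illustration only) that evaluates the correction term for the parameter counts and sample sizes quoted above:

```r
# AICc correction term, 2k(k+1)/(n-k-1)
aicc_correction <- function(k, n) (2 * k * (k + 1)) / (n - k - 1)
aicc_correction(k = 5, n = c(60, 80))   # 1.11 0.81  (hazard rate + 4-level factor)
aicc_correction(k = 2, n = c(60, 80))   # 0.21 0.16  (two-parameter models)
```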
BIC
Defined as:
\[BIC = -2\ln(\mathscr{L}) + k \cdot \ln(n)\]
The penalty term changes as a function of sample size; the larger the number of detections, the greater the penalty. Under AIC, a two-parameter model incurs a penalty of 4 (\(2 \times 2\)). Under BIC, a model with the same number of parameters fitted to 80 detections incurs a penalty of \(2 \times \ln(80) \approx 8.8\).
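A one-line check of that arithmetic:

```r
# Fixed AIC penalty (2k) versus the sample-size-dependent BIC penalty (k*ln(n))
k <- 2
2 * k         # AIC penalty: 4
k * log(80)   # BIC penalty with n = 80 detections: ~8.8
```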
Producing the alternative metrics with the Distance R package
The AIC for any model (or set of models) can be extracted from an object created by `ds()` via the `AIC()` function. To compute AICc, however, you must compute the third term manually in your code and add it to the value that a call to `AIC()` produces for the model. The `AIC()` function can also be convinced to return the BIC value for a model by setting its argument `k=` to the \(\ln(n)\) multiplier rather than the default value of 2; e.g. `AIC(object, k=log(length(object$ddf$fitted)))`, where `object` is the name of the object created by `ds()`.
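For example, a minimal sketch of computing all three metrics by hand (assuming `fit` holds a fitted model returned by `ds()`; the same pattern appears in the helper function below):

```r
# `fit` is assumed to be a model object returned by ds().
# AIC() applied to such an object returns a data frame with
# columns `df` (number of parameters) and `AIC`.
aic_tab <- AIC(fit)
n <- length(fit$ddf$fitted)   # detections retained after truncation
k <- aic_tab$df
aic_tab$AIC + (2 * k * (k + 1)) / (n - k - 1)   # AICc: correction added by hand
AIC(fit, k = log(n))$AIC                        # BIC: ln(n) multiplier
```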
Demonstration with amakihi
The amakihi analysis of Practical 8 (and of Marques et al. (2007)) demonstrates how the model rankings compare under the differing model selection metrics.
Below is a short function that computes all three model selection metrics for a given fitted `ds` object and returns them as a vector of three scores.
```r
altmetric <- function(modelobj) {
  # number of detections retained after truncation
  nval <- length(modelobj$ddf$fitted)
  # AIC() on a ds object returns a data frame with columns `df` and `AIC`
  AICcall <- AIC(modelobj)
  k <- AICcall$df
  AICval <- AICcall$AIC
  # AICc: add the small-sample correction term to AIC
  AICc <- AICval + (2 * k * (k + 1)) / (nval - k - 1)
  # BIC: swap the default multiplier of 2 for ln(n)
  BIC <- AIC(modelobj, k = log(nval))$AIC
  return(c(AICval, AICc, BIC))
}
```
Fit a series of models to the amakihi data set, as described in Marques et al. (2007). Then call the `altmetric()` function for each of the fitted models and combine the resulting vectors into a data frame.
```r
conv <- convert_units("meter", NULL, "hectare")
amak.hn <- ds(amakihi, transect="point", key="hn", convert_units = conv, truncation=82.5)
amak.hr <- ds(amakihi, transect="point", key="hr", convert_units = conv, truncation=82.5)
amak.hr.obs <- ds(amakihi, transect="point", key="hr", formula=~OBs, convert_units = conv,
                  truncation=82.5)
amak.hr.mas <- ds(amakihi, transect="point", key="hr", formula=~MAS, convert_units = conv,
                  truncation=82.5)
amak.hr.obs.mas <- ds(amakihi, transect="point", key="hr", formula=~OBs+MAS, convert_units = conv,
                      truncation=82.5)
```
```r
results <- data.frame(NULL)
results <- rbind(results, altmetric(amak.hn))
results <- rbind(results, altmetric(amak.hr))
results <- rbind(results, altmetric(amak.hr.obs))
results <- rbind(results, altmetric(amak.hr.mas))
results <- rbind(results, altmetric(amak.hr.obs.mas))
colnames(results) <- c("AIC", "AICc", "BIC")
rownames(results) <- c("HN", "HR", "HR(obs)", "HR(mas)", "HR(obs+mas)")
```
```r
kable(results[order(results$AIC), ], caption = "Models ordered by AIC") %>%
  kable_paper(full_width = FALSE) %>%
  column_spec(2, bold = TRUE, background = "lightblue")
```
|             |      AIC |     AICc |      BIC |
|:------------|---------:|---------:|---------:|
| HR(obs+mas) | 10777.38 | 10777.42 | 10803.00 |
| HR(obs)     | 10778.45 | 10778.48 | 10798.95 |
| HN          | 10799.02 | 10799.07 | 10824.64 |
| HR(mas)     | 10805.63 | 10805.65 | 10821.00 |
| HR          | 10807.55 | 10807.56 | 10817.80 |

Table: Models ordered by AIC
```r
kable(results[order(results$AICc), ], caption = "Models ordered by AICc") %>%
  kable_paper(full_width = FALSE) %>%
  column_spec(3, bold = TRUE, background = "lightblue")
```
|             |      AIC |     AICc |      BIC |
|:------------|---------:|---------:|---------:|
| HR(obs+mas) | 10777.38 | 10777.42 | 10803.00 |
| HR(obs)     | 10778.45 | 10778.48 | 10798.95 |
| HN          | 10799.02 | 10799.07 | 10824.64 |
| HR(mas)     | 10805.63 | 10805.65 | 10821.00 |
| HR          | 10807.55 | 10807.56 | 10817.80 |

Table: Models ordered by AICc
```r
kable(results[order(results$BIC), ], caption = "Models ordered by BIC") %>%
  kable_paper(full_width = FALSE) %>%
  column_spec(4, bold = TRUE, background = "lightblue")
```
|             |      AIC |     AICc |      BIC |
|:------------|---------:|---------:|---------:|
| HR(obs)     | 10778.45 | 10778.48 | 10798.95 |
| HR(obs+mas) | 10777.38 | 10777.42 | 10803.00 |
| HR          | 10807.55 | 10807.56 | 10817.80 |
| HR(mas)     | 10805.63 | 10805.65 | 10821.00 |
| HN          | 10799.02 | 10799.07 | 10824.64 |

Table: Models ordered by BIC
Comparison of selection from the three metrics
There is no difference in model order whether AIC or AICc is used, and there is barely any difference between the computed AIC and AICc values for these data: with this many detections, \(n\) dwarfs \(k^2\) and the correction term is negligible.
With the use of BIC, the top two models switch places: the model with `OBs` has a smaller BIC than the model with `OBs+MAS`. Models further down the ranking also trade places, with the hazard rate model without covariates now out-performing the model with the `MAS` covariate. A partial explanation for this re-ordering is the \(\ln(n)\) multiplier in the penalty term. With 1243 detections in this data set, the BIC penalty applied a multiplier of 7.13, rather than AIC's multiplier of 2, to each parameter added to the detection function model. With smaller data sets, the effect of the \(\ln(n)\) multiplier is less profound: when the number of observations is on the order of 60 to 80, the multiplier is on the order of 4, making each parameter roughly twice as “expensive” in BIC terms as in AIC terms.
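As a quick check, the multipliers quoted above follow directly from \(\ln(n)\):

```r
# BIC per-parameter multiplier, ln(n), at the sample sizes discussed above
log(c(60, 80, 1243))   # ~4.09 4.38 7.13
```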