```r
library(Distance)
library(kableExtra)   # provides kable(), kable_paper(), column_spec() used below
data(amakihi)
```
Alternative model selection metrics
The other metrics
AICc
Defined as:
\[AICc = -2\ln(\mathscr{L}) + 2k + \frac{2k(k+1)}{n-k-1}\]
where \(k\) is the number of parameters in the model and \(n\) is the number of observations. In general, if \(n\) is many times larger than \(k^2\), the extra term will be negligible. What do we know, in general, about the magnitudes of \(k\) and \(n\) in typical distance sampling situations? We encourage the collection of 60 to 80 detections (\(n\)). A hazard rate key function with a four-level factor covariate has \(k=5\) parameters; the value of the \(c\) term added to the regular AIC then ranges from 1.11 with \(n=60\) to 0.81 with \(n=80\).
For a two-parameter model (a hazard rate without covariates, or a half normal with a single continuous covariate), the additional term ranges from 0.21 with \(n=60\) to 0.16 with \(n=80\). Recognise that the magnitudes of the other terms in the AIC are in the hundreds or thousands (see below).
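To see these magnitudes concretely, here is a small sketch (the helper name `aicc_correction` is ours, for illustration only) that evaluates the correction term for the parameter counts and sample sizes quoted above:

```r
# AICc correction term, 2k(k+1)/(n-k-1)
aicc_correction <- function(k, n) (2 * k * (k + 1)) / (n - k - 1)
aicc_correction(k = 5, n = c(60, 80))   # 1.11 0.81  (hazard rate + 4-level factor)
aicc_correction(k = 2, n = c(60, 80))   # 0.21 0.16  (two-parameter models)
```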
BIC
Defined as:
\[BIC = -2\ln(\mathscr{L}) + k \cdot \ln(n)\]
The penalty term changes as a function of sample size; the larger the number of detections, the greater the penalty. Under AIC, a two-parameter model incurs a penalty of 4 (\(2 \times 2\)). Under BIC, a model with the same number of parameters fitted to 80 detections incurs a penalty of \(2 \times \ln(80) \approx 8.8\).
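A one-line check of that arithmetic:

```r
# Fixed AIC penalty (2k) versus the sample-size-dependent BIC penalty (k*ln(n))
k <- 2
2 * k         # AIC penalty: 4
k * log(80)   # BIC penalty with n = 80 detections: ~8.8
```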
Producing the alternative metrics with the Distance R package
The AIC for any model (or set of models) can be extracted from an object created by `ds()` via the `AIC()` function. To compute AICc, however, you must compute the third term manually in your code and add it to the value that a call to `AIC()` produces for the model. The `AIC()` function can also be convinced to return the BIC value for a model by setting its argument `k=` to the \(\ln(n)\) multiplier rather than the default value of 2; e.g. `AIC(object, k=log(length(object$ddf$fitted)))`, where `object` is the name of the object created by `ds()`.
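For example, a minimal sketch of computing all three metrics by hand (assuming `fit` holds a fitted model returned by `ds()`; the same pattern appears in the helper function below):

```r
# `fit` is assumed to be a model object returned by ds().
# AIC() applied to such an object returns a data frame with
# columns `df` (number of parameters) and `AIC`.
aic_tab <- AIC(fit)
n <- length(fit$ddf$fitted)   # detections retained after truncation
k <- aic_tab$df
aic_tab$AIC + (2 * k * (k + 1)) / (n - k - 1)   # AICc: correction added by hand
AIC(fit, k = log(n))$AIC                        # BIC: ln(n) multiplier
```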
Demonstration with amakihi
The amakihi analysis of Practical 8 (and of Marques et al. (2007)) demonstrates how the model rankings compare under the differing model selection metrics.
Below is a short function that computes all three model selection metrics for a given fitted `ds` object and returns them as a vector of three scores.
```r
altmetric <- function(modelobj) {
  # number of detections retained after truncation
  nval <- length(modelobj$ddf$fitted)
  # AIC() on a ds object returns a data frame with columns `df` and `AIC`
  AICcall <- AIC(modelobj)
  k <- AICcall$df
  AICval <- AICcall$AIC
  # AICc: add the small-sample correction term to AIC
  AICc <- AICval + (2 * k * (k + 1)) / (nval - k - 1)
  # BIC: swap the default multiplier of 2 for ln(n)
  BIC <- AIC(modelobj, k = log(nval))$AIC
  return(c(AICval, AICc, BIC))
}
```
Fit a series of models to the amakihi data set, as described in Marques et al. (2007). Then call the `altmetric()` function for each of the fitted models and combine the resulting vectors into a data frame.
```r
conv <- convert_units("meter", NULL, "hectare")
amak.hn <- ds(amakihi, transect="point", key="hn", convert_units = conv, truncation=82.5)
amak.hr <- ds(amakihi, transect="point", key="hr", convert_units = conv, truncation=82.5)
amak.hr.obs <- ds(amakihi, transect="point", key="hr", formula=~OBs, convert_units = conv,
                  truncation=82.5)
amak.hr.mas <- ds(amakihi, transect="point", key="hr", formula=~MAS, convert_units = conv,
                  truncation=82.5)
amak.hr.obs.mas <- ds(amakihi, transect="point", key="hr", formula=~OBs+MAS, convert_units = conv,
                      truncation=82.5)
```
```r
results <- data.frame(NULL)
results <- rbind(results, altmetric(amak.hn))
results <- rbind(results, altmetric(amak.hr))
results <- rbind(results, altmetric(amak.hr.obs))
results <- rbind(results, altmetric(amak.hr.mas))
results <- rbind(results, altmetric(amak.hr.obs.mas))
colnames(results) <- c("AIC", "AICc", "BIC")
rownames(results) <- c("HN", "HR", "HR(obs)", "HR(mas)", "HR(obs+mas)")
```
```r
kable(results[order(results$AIC), ], caption = "Models ordered by AIC") %>%
  kable_paper(full_width = FALSE) %>%
  column_spec(2, bold = TRUE, background = "lightblue")
```
|             |      AIC |     AICc |      BIC |
|:------------|---------:|---------:|---------:|
| HR(obs+mas) | 10777.38 | 10777.42 | 10803.00 |
| HR(obs)     | 10778.45 | 10778.48 | 10798.95 |
| HN          | 10799.02 | 10799.07 | 10824.64 |
| HR(mas)     | 10805.63 | 10805.65 | 10821.00 |
| HR          | 10807.55 | 10807.56 | 10817.80 |

Table: Models ordered by AIC
```r
kable(results[order(results$AICc), ], caption = "Models ordered by AICc") %>%
  kable_paper(full_width = FALSE) %>%
  column_spec(3, bold = TRUE, background = "lightblue")
```
|             |      AIC |     AICc |      BIC |
|:------------|---------:|---------:|---------:|
| HR(obs+mas) | 10777.38 | 10777.42 | 10803.00 |
| HR(obs)     | 10778.45 | 10778.48 | 10798.95 |
| HN          | 10799.02 | 10799.07 | 10824.64 |
| HR(mas)     | 10805.63 | 10805.65 | 10821.00 |
| HR          | 10807.55 | 10807.56 | 10817.80 |

Table: Models ordered by AICc
```r
kable(results[order(results$BIC), ], caption = "Models ordered by BIC") %>%
  kable_paper(full_width = FALSE) %>%
  column_spec(4, bold = TRUE, background = "lightblue")
```
|             |      AIC |     AICc |      BIC |
|:------------|---------:|---------:|---------:|
| HR(obs)     | 10778.45 | 10778.48 | 10798.95 |
| HR(obs+mas) | 10777.38 | 10777.42 | 10803.00 |
| HR          | 10807.55 | 10807.56 | 10817.80 |
| HR(mas)     | 10805.63 | 10805.65 | 10821.00 |
| HN          | 10799.02 | 10799.07 | 10824.64 |

Table: Models ordered by BIC
Comparison of selection from the three metrics
There is no difference in model order whether AIC or AICc is used, and there is barely any difference between the computed AIC and AICc values for these data: with this many detections, \(n\) dwarfs \(k^2\) and the correction term is negligible.
With the use of BIC, the top two models switch places: the model with `OBs` has a smaller BIC than the model with `OBs+MAS`. Models further down the ranking also trade places, with the hazard rate model without covariates now out-performing the model with the `MAS` covariate. A partial explanation for this re-ordering is the \(\ln(n)\) multiplier in the penalty term. With 1243 detections in this data set, the BIC penalty applied a multiplier of 7.13, rather than AIC's multiplier of 2, to each parameter added to the detection function model. With smaller data sets, the effect of the \(\ln(n)\) multiplier is less profound: when the number of observations is on the order of 60 to 80, the multiplier is on the order of 4, making each parameter roughly twice as “expensive” in BIC terms as in AIC terms.
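As a quick check, the multipliers quoted above follow directly from \(\ln(n)\):

```r
# BIC per-parameter multiplier, ln(n), at the sample sizes discussed above
log(c(60, 80, 1243))   # ~4.09 4.38 7.13
```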