Tutorial–analysis of stratified surveys ✏️

Author

Centre for Research into Ecological and Environmental Modelling
University of St Andrews

Published

August 7, 2024

Exercise 7 – Analysis of stratified surveys

This set of questions is entirely about the practicalities of interpreting output from ds. It is far too easy to be beguiled by the rafts of output generated by a distance sampling analysis; we often just look at the estimated density or abundance and scurry along, particularly if the data are not our own.

Closely examine the results from this particular analysis, where the strata are analysed separately (a distinct detection function for each stratum). Recall that strata were introduced with this data set because the southern portion of the study area (Southern Ocean) likely possessed a richer food source; hence there was a belief that minke whale densities would be higher in the south than in the north. Let’s examine the output to see if this belief was supported by the data.

Answer these questions by examining the output above

Before answering the questions, you’ll need to orient yourself to the order in which the stratum results are presented. Hint: the stratum label appears in each piece of output.

  • How many times larger (geographically) is the northern stratum compared to the southern stratum? (to the nearest 0.5)
  • How many times larger is the estimated abundance in the northern stratum, compared to the south? (to the nearest 0.5)
  • Explain the paradox: how can the southern stratum have a higher density but a lower abundance of minke whales?
Comments about minke stratified analysis
  • It is common for high-quality habitat to be scarcer than low-quality habitat.
  • This serves as a lesson for future survey design:
    • Do not design a survey such that all survey effort is allocated to the high-quality habitat where you think most animals reside. The lower-quality habitat may actually contain most of your study animals.
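
The arithmetic behind this point is simply abundance = density × area. The sketch below uses made-up numbers (they are not the minke estimates, and the stratum names are hypothetical) to show how a small, dense stratum can hold fewer animals than a large, sparse one.

```r
# Hypothetical values chosen only to illustrate the arithmetic;
# they are NOT taken from the minke whale analysis.
strata <- data.frame(
  stratum = c("high quality", "low quality"),
  density = c(0.10, 0.04),   # animals per unit area
  area    = c(1000, 6000)    # stratum area, same area units
)

# Abundance is density multiplied by the area of the stratum
strata$abundance <- strata$density * strata$area

print(strata)
```

Here the low-quality stratum has less than half the density but six times the area, so it ends up with more animals in total.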

About the possible difference in detectability

You can see that the estimate of \(\hat{P}_a\) is different for the two strata. Now examine the output to find the estimates of the scale parameter \(\sigma\) for the two strata. Notice that both appear to be negative; this is because they are reported on the log scale. Exponentiate those estimates to convert them into biologically interpretable values of \(\hat{\sigma}\). Enter values with 2 decimal places.
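
A minimal sketch of that conversion, assuming the scale parameter is reported on the log scale (which is why the printed values can be negative). The two numbers below are hypothetical placeholders, as are the names `log_sigma` and `sigma`; substitute the scale coefficient printed for each stratum in your own output.

```r
# Hypothetical log-scale estimates; replace these with the scale
# coefficients reported for each stratum in the ds output.
log_sigma <- c(stratum_1 = -0.35, stratum_2 = -0.90)

# exp() back-transforms from the log scale to the scale of the
# perpendicular distances (same units as the distance data)
sigma <- round(exp(log_sigma), 2)

print(sigma)
```

A larger \(\hat{\sigma}\) means the detection function declines more slowly with distance, so animals remain detectable farther from the transect.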

  • With your knowledge of \(\hat{\sigma}\) for each stratum, describe the visibility in the northern vs southern strata.

Weather conditions are poorer in the south, which is closer to the ice edge.

Statistical difference in density between strata supplement

It is trivially easy to calculate the difference between two density estimates. The messy part is comparing the magnitude of that difference against its uncertainty, that is, the ratio of signal (the difference) to noise (the uncertainty in the difference). This signal-to-noise ratio is the traditional t-statistic: the numerator is the estimated difference and the denominator is the standard error of that difference.

The challenge is to estimate the variance of the difference between the two density estimates. When the estimates are independent, the variance of the difference is the sum of the two variances, \(\widehat{\mathrm{Var}}(\hat{D}_1 - \hat{D}_2) = \widehat{\mathrm{Var}}(\hat{D}_1) + \widehat{\mathrm{Var}}(\hat{D}_2)\); this is the case when each stratum's density is estimated with its own detection function. The calculation is trickier when the two density estimates share a common detection function, because the estimates are then no longer independent. That is why there are two different formulas for the test statistic, and the function that computes it copes with both situations.

> density.difference.ds(hazard.pooled)
  n.detect  cv.ER line.length n.transects group.size group.size.cv
1       39 0.2323         484          13     2.1538        0.1397
2       49 0.3724        1370          12     2.3265        0.2303
  nparm.detect  D.hat D.hat.cv     f0   f0cv
1            2 0.0929   0.2916 1.0711 0.1073
2            2 0.0446   0.4508 1.0711 0.1073
  D1.minus.D2 SE.difference t.statistic df.t.stat P.value     LCB    UCB
1      0.0484        0.0402      1.2048   49.2919   0.234 -0.0323 0.1291

> density.difference.ds(ideal.hr, marginal.hr)
  n.detect  cv.ER line.length n.transects group.size group.size.cv
1       39 0.2323         484          13     2.1538        0.1397
2       49 0.3724        1370          12     2.3265        0.2303
  nparm.detect  D.hat D.hat.cv     f0   f0cv
1            2 0.1167   0.3182 1.3450 0.1338
2            2 0.0365   0.4451 0.8781 0.1315
  D1.minus.D2 SE.difference t.statistic df.t.stat P.value     LCB    UCB
1      0.0802        0.0405      1.9779   27.6636   0.058 -0.0029 0.1633
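
For the independent case (separate detection functions), the standard error and t-statistic in the second output can be approximately reproduced by hand. The sketch below copies `D.hat`, `D.hat.cv` and `df.t.stat` from that output; the degrees of freedom are taken as reported rather than recomputed, and the variable names are illustrative only. Small discrepancies from the printed values are expected because the inputs are rounded to four decimal places.

```r
# Values copied from the second output above (separate detection functions)
D_hat <- c(0.1167, 0.0365)   # D.hat for strata 1 and 2
cv_D  <- c(0.3182, 0.4451)   # D.hat.cv for strata 1 and 2
df    <- 27.6636             # df.t.stat, taken from the output as given

# Standard error of each density estimate: SE = CV * estimate
se_D <- cv_D * D_hat

# Signal: the difference between the two estimates
diff_D <- D_hat[1] - D_hat[2]

# Noise: for independent estimates the variances add
se_diff <- sqrt(sum(se_D^2))

# t-statistic, two-sided P-value and 95% confidence interval
t_stat  <- diff_D / se_diff
p_value <- 2 * pt(-abs(t_stat), df)
ci      <- diff_D + c(-1, 1) * qt(0.975, df) * se_diff

print(round(c(difference = diff_D, se = se_diff, t = t_stat, p = p_value), 4))
print(round(ci, 4))
```

This simple sum of variances applies only when the two estimates are independent; it should not be used for the pooled analysis, where the shared detection function links the two density estimates.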

  • Why does the CV(detection function) increase when separate detection functions are fitted to each stratum?
  • Why does the CV(encounter rate) not increase when separate detection functions are fitted to each stratum?
  • What is the relative magnitude of CV(detection function) to CV(encounter rate) for either pooled or separate analyses?
  • What factors contribute to the P-value being smaller for the analysis with separate detection functions (0.058) than for the pooled analysis (0.234)?