Spatially explicit models for distance sampling data

Aims

By the end of this practical, you should feel comfortable:

Making a prediction using the predict() function
Summing the prediction cells to obtain a total abundance for a given area
Plotting a map of predictions
Knowing when to use dsm.var.prop and when to use dsm.var.gam
Estimating variance for a given prediction area
Estimating variance per-cell for a prediction grid
Interpreting the summary() output for uncertainty estimates
Making maps of the coefficient of variation in R

Loading the packages and data

library(knitr)
library(dsm)

## Loading required package: mgcv

## Loading required package: nlme

## This is mgcv 1.8-33. For overview type 'help("mgcv-package")'.

## Loading required package: mrds

## This is mrds 2.2.3
## Built: R 4.0.2; ; 2020-08-01 10:33:56 UTC; unix

## Loading required package: numDeriv

## This is dsm 2.3.0
## Built: R 4.0.2; ; 2020-07-16 23:56:50 UTC; unix

library(ggplot2)
library(patchwork)

Load our models and data:

load("spermwhale.RData")
# load models
load("dsms.RData")
load("dsms-checked.RData")
# rename one model
dsm_nb_xy_ms <- dsm_nb_xy_ms_p1_4

Prediction data

Our prediction data lives in the predgrid object. This data was prepared from various sources and the since some of the covariates were dynamic (SST, NPP and EKE), the dates for these were arbitrarily chosen as 2 June 2004, or Julian date 153.

If you’re preparing this data yourself, the rerddap package can be used to obtain marine data from NOAA for both segments and predictions.

We can take a look at our data using head() to see the first few rows:

head(predgrid)

##            x      y      Depth      SST      NPP DistToCAS          EKE off.set
## 126 547984.6 788254  153.59825 12.04609 1462.521 11788.974 0.0008329031   1e+08
## 127 557984.6 788254  552.31067 12.81379 1465.410  5697.248 0.0009806611   1e+08
## 258 527984.6 778254   96.81992 12.90251 1429.432 13722.626 0.0011575423   1e+08
## 259 537984.6 778254  138.23763 13.21393 1424.862  9720.671 0.0013417297   1e+08
## 260 547984.6 778254  505.14386 13.75655 1379.351  8018.690 0.0026881567   1e+08
## 261 557984.6 778254 1317.59521 14.42525 1348.544  3775.462 0.0045683752   1e+08
##          long      lat
## 126 -66.52252 40.94697
## 127 -66.40464 40.94121
## 258 -66.76551 40.86781
## 259 -66.64772 40.86227
## 260 -66.52996 40.85662
## 261 -66.41221 40.85087

As we saw in the slides, the predict() function will fill in the variables in the model formula with these values to obtain the predictions at each cell.

For example using the model dsm_nb_xy_ms:

pp <- predict(dsm_nb_xy_ms, predgrid, off.set=predgrid$off.set)

This is just a list of numbers – the predicted abundance per cell. We can sum these to get the estimated abundance for the study area:

sum(pp, na.rm=TRUE)

## [1] 1580.121

We can also plot this to get a spatial representation of the predictions:

# assign the predictions to the prediction grid data.frame
predgrid$Nhat_nb_xy <- pp

# plot!
p <- ggplot(predgrid) +
  geom_tile(aes(x=x, y=y, fill=Nhat_nb_xy, width=10*1000, height=10*1000)) +
  coord_equal() + 
  # better colour scheme  
  scale_fill_viridis_c() +
  # cleaner plot
  theme_minimal() +
  # set the label for the fill colour
  labs(fill="Abundance")
print(p)

Predicted surface for abundance estimates with bivariate spatial smooth along with environmental covariates.

Use the chunks above as a template and make predictions for the other models you saved in the previous exercises. In particular, compare the models with only spatial terms to those with environmental covariates included.

Making predictions

Let’s try making predictions with a few models. Let’s compare two count models (negative binomial and Tweedie) an two estimated abundance models (negative binomial and Tweedie).

I’ll make the plots then use the patchwork package to add together plots and put them together at the end.

First the Tweedie count, with a smooth of space.

predgrid$Nhat_tw_xy <- predict(dsm_tw_xy_ms, predgrid, 
                               off.set=predgrid$off.set)

# plot
p_tw_xy <- ggplot(predgrid) +
  geom_tile(aes(x=x, y=y, fill=Nhat_tw_xy,
                width=10*1000, height=10*1000)) +
  coord_equal() + 
  scale_fill_viridis_c() +
  theme_minimal() +
  ggtitle("dsm_tw_xy_ms") +
  labs(fill="Abundance")

Now make a plot for the negative binomial habitat count model:

predgrid$Nhat_nb_noxy <- predict(dsm_nb_noxy_ms_refit,
                                 predgrid, 
                                 off.set=predgrid$off.set)

# plot
p_nb_noxy <- ggplot(predgrid) +
  geom_tile(aes(x=x, y=y, fill=Nhat_nb_noxy,
                width=10*1000, height=10*1000)) +
  coord_equal() + 
  scale_fill_viridis_c() +
  theme_minimal() +
  ggtitle("dsm_nb_noxy_ms_refit") +
  labs(fill="Abundance")

Now the estimated abundance, with negative binomial response:

predgrid$Nhat_nb_ae <- predict(dsm_nb_xy_ms_ae, predgrid, 
                               off.set=predgrid$off.set)

# plot
p_nb_ae <- ggplot(predgrid) +
  geom_tile(aes(x=x, y=y, fill=Nhat_nb_ae,
                width=10*1000, height=10*1000)) +
  coord_equal() + 
  scale_fill_viridis_c() +
  theme_minimal() +
  ggtitle("dsm_nb_xy_ms_ae") +
  labs(fill="Abundance")

Finally estimated abundance, with Tweedie response:

predgrid$Nhat_tw_ae <- predict(dsm_tw_xy_ms_ae, predgrid, 
                               off.set=predgrid$off.set)

# plot
p_tw_ae <- ggplot(predgrid) +
  geom_tile(aes(x=x, y=y, fill=Nhat_tw_ae,
                width=10*1000, height=10*1000)) +
  coord_equal() + 
  scale_fill_viridis_c() +
  theme_minimal() +
  ggtitle("dsm_tw_xy_ms_ae") +
  labs(fill="Abundance")

Now plotting these all together:

(p_tw_xy + p_nb_noxy)/(p_tw_ae + p_nb_ae)

We can see a similar pattern generally in all the plots with high values at the shelf edge, though we can see that dsm_nb_noxy_ms_refit has much more fine detail in it than the other models.

Now make some abundance estimates:

sum(predgrid$Nhat_tw_xy)

## [1] 1595.535

sum(predgrid$Nhat_nb_noxy)

## [1] 1198.791

sum(predgrid$Nhat_tw_ae)

## [1] 2373.881

sum(predgrid$Nhat_nb_ae)

## [1] 2402.684

We do see a big difference here between the estimated abundance models and the count models. Bear in mind that the count models were both fitted just using the half-normal detection function.

load("df-models.RData")

# need to build the binned Beaufort covariate as was
# in the detection function
segs$BeaufortCut <- cut(segs$Beaufort, seq(0, 5, by=1),
                        include.lowest=TRUE,
                        ordered.result=TRUE)

# do a quick model selection, based off previous results
dsm_nb_xy_ms_refit <- dsm(count~s(x,y, bs="ts", k=60) +
                             s(Depth, bs="ts", k=20) +
                             #s(DistToCAS, bs="ts") + # 1
                             #s(SST, bs="ts") + # 2
                             s(EKE, bs="ts", k=20),# +
                             #s(NPP, bs="ts"), # 3
                      df_hr_beau, segs, obs,
                      family=nb())

## Warning in make.data(response, ddf.obj, segment.data, observation.data, : Some
## observations are outside of detection function truncation!

summary(dsm_nb_xy_ms_refit)

## 
## Family: Negative Binomial(0.114) 
## Link function: log 
## 
## Formula:
## count ~ s(x, y, bs = "ts", k = 60) + s(Depth, bs = "ts", k = 20) + 
##     s(EKE, bs = "ts", k = 20) + offset(off.set)
## 
## Parametric coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -20.4266     0.2262  -90.31   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##             edf Ref.df Chi.sq  p-value    
## s(x,y)   1.8750     59 19.804 2.29e-05 ***
## s(Depth) 3.4571     19 48.197  < 2e-16 ***
## s(EKE)   0.8012     19  3.473   0.0365 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.089   Deviance explained = 38.9%
## -REML = 380.46  Scale est. = 1         n = 949

Plotting that resulting model:

predgrid$Nhat_nb_xy <- predict(dsm_nb_xy_ms_refit,
                               predgrid, 
                               off.set=predgrid$off.set)

# plot
p_nb_xy <- ggplot(predgrid) +
  geom_tile(aes(x=x, y=y, fill=Nhat_nb_xy,
                width=10*1000, height=10*1000)) +
  coord_equal() + 
  scale_fill_viridis_c() +
  theme_minimal() +
  labs(fill="Abundance")
print(p_nb_xy)

Now we get something more like the estimated abundance models, so we can see that including sea state can have a really big effect!

sum(predgrid$Nhat_nb_xy)

## [1] 2206.73

And let’s do the same thing for the Tweedie:

# do a quick model selection, based off previous results
dsm_tw_xy_ms_refit <- dsm(count~s(x,y, bs="ts", k=60) +
                             s(Depth, bs="ts", k=20) +
                             #s(DistToCAS, bs="ts", k=20) + # 1
                             s(SST, bs="ts", k=20),# + 
                             #s(EKE, bs="ts", k=20),# +
                             #s(NPP, bs="ts", k=20), # 2
                      df_hr_beau, segs, obs,
                      family=tw())

## Warning in make.data(response, ddf.obj, segment.data, observation.data, : Some
## observations are outside of detection function truncation!

summary(dsm_tw_xy_ms_refit)

## 
## Family: Tweedie(p=1.254) 
## Link function: log 
## 
## Formula:
## count ~ s(x, y, bs = "ts", k = 60) + s(Depth, bs = "ts", k = 20) + 
##     s(SST, bs = "ts", k = 20) + offset(off.set)
## 
## Parametric coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  -20.572      0.248  -82.94   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##            edf Ref.df     F  p-value    
## s(x,y)   1.761     59 0.191   0.0014 ** 
## s(Depth) 3.851     19 2.301  < 2e-16 ***
## s(SST)   5.566     19 1.400 2.91e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.221   Deviance explained = 40.9%
## -REML = 381.55  Scale est. = 4.0286    n = 949

So now with SST in the model, let’s see what that looks like:

predgrid$Nhat_tw_xy <- predict(dsm_tw_xy_ms_refit,
                               predgrid, 
                               off.set=predgrid$off.set)

# plot
p_tw_xy <- ggplot(predgrid) +
  geom_tile(aes(x=x, y=y, fill=Nhat_tw_xy,
                width=10*1000, height=10*1000)) +
  coord_equal() + 
  scale_fill_viridis_c() +
  theme_minimal() +
  labs(fill="Abundance")
print(p_tw_xy)

However this doesn’t change the predicted abundance.

sum(predgrid$Nhat_tw_xy)

## [1] 1461.226

However, let’s remember that this was not our favoured model previously and remains so: the quantile-quantile plot shows the negative binomial model has a better fit:

par(mfrow=c(1,2))
qq.gam(dsm_nb_xy_ms_refit, rep=200, main="negative binomial")
qq.gam(dsm_tw_xy_ms_refit, rep=200, main="negative binomial")

So we won’t pursue this model (dsm_tw_xy_ms_refit) further.

Variance estimation

Depending on the covariates in the model, we can use either dsm.var.prop or dsm.var.gam. Remember: - dsm.var.gam - assumes spatial model and detection function are independent - dsm.var.prop - propagates uncertainty from detection function to spatial model - only works for count models - covariates can only vary at segment level

If there are no covariates in the detection function, use dsm.var.gam since there’s no covariance between the detection function and the spatial model.

dsm_nb_xy_ms doesn’t include any covariates at the observer level in the detection function, so we can use dsm.var.gam to estimate the uncertainty.

# now estimate variance
var_nb_xy_ms <- dsm.var.gam(dsm_nb_xy_ms, predgrid, 
                             off.set=predgrid$off.set)

To summarise the results of this variance estimate:

summary(var_nb_xy_ms)

## Summary of uncertainty in a density surface model calculated
##  analytically for GAM, with delta method
## 
## Approximate asymptotic confidence interval:
##     2.5%     Mean    97.5% 
## 1111.871 1580.121 2245.569 
## (Using log-Normal approximation)
## 
## Point estimate                 : 1580.121 
## CV of detection function       : 0.06670757 
## CV from GAM                    : 0.168 
## Total standard error           : 285.6377 
## Total coefficient of variation : 0.1808

Try this out for some of the other models you’ve saved.

Variance estimation

We can use either the delta method (dsm.var.gam) or variance propagation (dsm.var.prop) with our count model:

dsm_nb_vargam <- dsm.var.gam(dsm_nb_xy_ms_refit, predgrid,
                             off.set=predgrid$off.set)
summary(dsm_nb_vargam)

## Summary of uncertainty in a density surface model calculated
##  analytically for GAM, with delta method
## 
## Approximate asymptotic confidence interval:
##     2.5%     Mean    97.5% 
## 1234.617 2206.730 3944.263 
## (Using log-Normal approximation)
## 
## Point estimate                 : 2206.73 
## CV of detection function       : 0.2503225 
## CV from GAM                    : 0.1706 
## Total standard error           : 668.4868 
## Total coefficient of variation : 0.3029

dsm_nb_varprop <- dsm.var.prop(dsm_nb_xy_ms_refit, predgrid,
                             off.set=predgrid$off.set)
summary(dsm_nb_varprop)

## Summary of uncertainty in a density surface model calculated
##  by variance propagation.
## 
## Probability of detection in fitted model and variance model
##   BeaufortCut Fitted.model Fitted.model.se Refitted.model
## 4       [0,1]   0.78346539      0.32022998     0.51500357
## 3       (1,2]   0.36995577      0.08459385     0.54063974
## 1       (2,3]   0.70626890      0.09957053     0.64889155
## 2       (3,4]   0.28615108      0.08823118     0.26889664
## 5       (4,5]   0.04465498      0.03348484     0.05885038
## 
## Approximate asymptotic confidence interval:
##     2.5%     Mean    97.5% 
## 1301.437 1944.461 2905.193 
## (Using log-Normal approximation)
## 
## Point estimate                 : 1944.461 
## Standard error                 : 402.5554 
## Coefficient of variation       : 0.207

Looking at these summaries we see that we actually get a reduction in the CV for our predictions! This is due to negative covariance between the spatial and detection parts of the model.

The table at the top of the dsm.var.prop output lets us check whether anything went wrong during the variance estimation. Values in the Refitted.model column should be no more than twice Fitted.model.se away from their corresponding Fitted.model values.

For the estimated abundance models, we must use dsm.var.gam:

dsm_nb_vargam_ae <- dsm.var.gam(dsm_nb_xy_ms_ae, predgrid,
                             off.set=predgrid$off.set)
summary(dsm_nb_vargam_ae)

## Summary of uncertainty in a density surface model calculated
##  analytically for GAM, with delta method
## 
## Approximate asymptotic confidence interval:
##     2.5%     Mean    97.5% 
## 1328.487 2402.684 4345.464 
## (Using log-Normal approximation)
## 
## Point estimate                 : 2402.684 
## CV of detection function       : 0.2081943 
## CV from GAM                    : 0.2288 
## Total standard error           : 743.3098 
## Total coefficient of variation : 0.3094

dsm_tw_vargam_ae <- dsm.var.gam(dsm_tw_xy_ms_ae, predgrid,
                             off.set=predgrid$off.set)
summary(dsm_tw_vargam_ae)

## Summary of uncertainty in a density surface model calculated
##  analytically for GAM, with delta method
## 
## Approximate asymptotic confidence interval:
##     2.5%     Mean    97.5% 
## 1484.918 2373.881 3795.032 
## (Using log-Normal approximation)
## 
## Point estimate                 : 2373.881 
## CV of detection function       : 0.2081943 
## CV from GAM                    : 0.125 
## Total standard error           : 576.4863 
## Total coefficient of variation : 0.2428

Again the Tweedie model here is just for example, we didn’t prefer this model based on quantile-quantile plots.

Plotting

We can plot a map of the coefficient of variation, but we first need to estimate the variance per prediction cell, rather than over the whole area.

# use the split function to make each row of the predictiond data.frame into
# an element of a list
predgrid$height <- predgrid$width <- 10*1000
predgrid_var_split <- split(predgrid, 1:nrow(predgrid))
var_split_nb_xy_ms <- dsm.var.gam(dsm_nb_xy_ms, predgrid_var_split, 
                                   off.set=predgrid$off.set)

Now we have the per-cell coefficients of variation, we assign that to a column of the prediction grid data and plot it as usual:

cv <- sqrt(var_split_nb_xy_ms$pred.var)/unlist(var_split_nb_xy_ms$pred)
predgrid$CV <- cv
p <- ggplot(predgrid) +
  geom_tile(aes(x=x, y=y, fill=CV, width=10*1000, height=10*1000)) +
  scale_fill_viridis_c() +
  coord_equal() +
  geom_point(aes(x=x,y=y, size=count), alpha=0.4,
             data=dsm_nb_xy_ms$data[dsm_nb_xy_ms$data$count>0,]) +
  theme_minimal()
print(p)

Uncertainty (CV) in prediction surface from bivariate spatial smooth with environmental covariates. Sightings overlaid.

Note that here we overplot the segments where sperm whales were observed (and scale the size of the point according to the number observed), using geom_point(). The alpha= argument makes the points semi-transparent.

Try this with the other models you fitted and see what the differences are between the maps of coefficient of variation.

Try plotting the effort over a CV plot, using the data in the model.

Plotting uncertainty

Let’s make plots to compare the maps using the delta method and variance propagation for the count model dsm_nb_xy_ms_refit:

# vargam plot
dsm_nb_vargam_plot <- dsm.var.gam(dsm_nb_xy_ms_refit, predgrid_var_split,
                             off.set=predgrid$off.set)
# can use plot=FALSE to get the ggplot object that we can modify
pl_vg <- plot(dsm_nb_vargam_plot, plot=FALSE, observations=FALSE)
# modify plot
pl_vg <- pl_vg + 
  geom_point(aes(x=x, y=y), data=dsm_nb_noxy_ms_refit$data,
             colour="lightgrey", alpha=.5, size=0.2) +
  theme(legend.position="bottom") +
  scale_fill_viridis_c()

## Scale for 'fill' is already present. Adding another scale for 'fill', which
## will replace the existing scale.

# varprop plot
dsm_nb_varprop_plot <- dsm.var.prop(dsm_nb_xy_ms_refit, predgrid_var_split,
                             off.set=predgrid$off.set)
pl_vp <- plot(dsm_nb_varprop_plot, plot=FALSE, observations=FALSE)
pl_vp <- pl_vp +
  geom_point(aes(x=x, y=y), data=dsm_nb_noxy_ms_refit$data,
             colour="lightgrey", alpha=.5, size=0.2) +
  theme(legend.position="bottom") +
  scale_fill_viridis_c()

## Scale for 'fill' is already present. Adding another scale for 'fill', which
## will replace the existing scale.

Putting that together in a plot:

pl_vg + pl_vp

We can see some differences here in the spatial pattern of the ucnertainty as well as the overall size we see in the summary difference above.

One more thing…

What about using an interaction of EKE and Depth? I’ll try this just for the count model with negative binomial response.

We can construct this using a tensor product which takes two (or more) univariate smooths and create a 2 (or more) dimensional smooth. We can set k for each dimension (we can even set different basis functions in each direction) but note that it’s the term will use the product of these ks number of parameters (k=c(10,10) leads to a 100 parameter term!). The tensor product includes the univariate terms (it’s a full interaction), so we need to remove them from the model:

dsm_nb_te <- dsm(count~s(x,y, bs="ts", k=30) +
                       #s(DistToCAS, bs="ts") + # 1
                       #s(SST, bs="ts") + # 3
                       te(EKE, Depth, bs="ts", k=c(10, 10)),# +
                       #s(NPP, bs="ts"), # 2
                      df_hr_beau, segs, obs,
                      family=nb())

## Warning in make.data(response, ddf.obj, segment.data, observation.data, : Some
## observations are outside of detection function truncation!

summary(dsm_nb_te)

## 
## Family: Negative Binomial(0.117) 
## Link function: log 
## 
## Formula:
## count ~ s(x, y, bs = "ts", k = 30) + te(EKE, Depth, bs = "ts", 
##     k = c(10, 10)) + offset(off.set)
## 
## Parametric coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  -20.551      0.273  -75.27   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                 edf Ref.df Chi.sq  p-value    
## s(x,y)        1.865     29  15.66  0.00017 ***
## te(EKE,Depth) 6.153     87  53.05 6.63e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.0954   Deviance explained = 40.6%
## -REML = 380.29  Scale est. = 1         n = 949

We can plot this just like any other model:

plot(dsm_nb_te, scheme=2, shade=TRUE)

A useful test here of whether we need the tensor would be using the ti form of the tensor. This allows us to look at the parts of the interaction on their own. We can write a tensor product te(x, y) as ti(x) + ti(y) + ti(x, y), so we get the main effects and interactions separately in the model summary table. Doing that:

dsm_nb_ti <- dsm(count~s(x,y, bs="ts", k=30) +
                       #s(DistToCAS, bs="ts") +
                       #s(SST, bs="ts") + 
                       ti(EKE, bs="ts", k=10) +
                       ti(Depth, bs="ts", k=10) +
                       ti(EKE, Depth, bs="ts", k=c(10, 10)),# +
                       #s(NPP, bs="ts"),
                      df_hr_beau, segs, obs,
                      family=nb())

## Warning in make.data(response, ddf.obj, segment.data, observation.data, : Some
## observations are outside of detection function truncation!

summary(dsm_nb_ti)

## 
## Family: Negative Binomial(0.115) 
## Link function: log 
## 
## Formula:
## count ~ s(x, y, bs = "ts", k = 30) + ti(EKE, bs = "ts", k = 10) + 
##     ti(Depth, bs = "ts", k = 10) + ti(EKE, Depth, bs = "ts", 
##     k = c(10, 10)) + offset(off.set)
## 
## Parametric coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -20.4445     0.2281  -89.62   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                    edf Ref.df Chi.sq  p-value    
## s(x,y)        1.880157     29 19.094 3.23e-05 ***
## ti(EKE)       1.606484      9  6.188    0.020 *  
## ti(Depth)     3.302363      9 49.849  < 2e-16 ***
## ti(EKE,Depth) 0.000435     70  0.000    0.872    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.0831   Deviance explained = 39.5%
## -REML =  376.6  Scale est. = 1         n = 949

We can see that the interaction term ti(EKE, Depth) has a very small EDF and isn’t significant at the 0.05 level. So I think this one isn’t a go-er unfortunately, even if the biological/oceanographic story is more compelling.