Aims

By the end of this practical, you should feel comfortable:

Load data and packages

library(Distance)
## Loading required package: mrds
## This is mrds 2.2.2
## Built: R 4.0.0; ; 2020-06-08 11:39:43 UTC; unix
## 
## Attaching package: 'Distance'
## The following object is masked from 'package:mrds':
## 
##     create.bins
library(dsm)
## Loading required package: mgcv
## Loading required package: nlme
## This is mgcv 1.8-31. For overview type 'help("mgcv-package")'.
## Loading required package: numDeriv
## This is dsm 2.3.0
## Built: R 4.0.0; ; 2020-06-08 19:51:53 UTC; unix
library(ggplot2)
library(knitr)

Load the data and the fitted dsm objects from the previous exercises:

load("spermwhale.RData")
load("dsms.RData")
load("df-models.RData")

We’re going to check the DSMs that we fitted in the previous practical, then save those that we think are good!

Looking at how changing k changes smooths

First, we check that k is big enough. We should really do this during model fitting, but we’ve separated the steps for the practical exercises.

Start with the text output of gam.check: are the values of k' for your models close to the edf in the output table? Here’s a silly example where I’ve deliberately set k too small:

dsm_k_check_eg <- dsm(count ~ s(Depth, k=4),
                    df_hn, segs, obs,
                    family=tw())
## Warning in make.data(response, ddf.obj, segment.data, observation.data, : Some
## observations are outside of detection function truncation!
gam.check(dsm_k_check_eg)

## 
## Method: REML   Optimizer: outer newton
## full convergence after 8 iterations.
## Gradient range [-1.66079e-07,1.171305e-07]
## (score 392.2823 & scale 5.216272).
## Hessian positive definite, eigenvalue range [0.8702242,302.9225].
## Model rank =  4 / 4 
## 
## Basis dimension (k) checking results. Low p-value (k-index<1) may
## indicate that k is too low, especially if edf is close to k'.
## 
##           k' edf k-index p-value
## s(Depth) 3.0 2.9    0.78    0.13

Note that here again I’m using a “chunk” option to suppress the plots printed by gam.check.

Generally, if the EDF is close to k' (here 3, one less than the k=4 we supplied), it is worth doubling k and refitting to see what happens. You can always switch back to the smaller k if there is little difference. The ?choose.k manual page offers some guidance.
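The doubling rule can also be run in a loop, which puts the k' and EDF values side by side. The following is only a minimal sketch under stated assumptions: it uses data simulated by mgcv's gamSim and a plain gam fit in place of the sperm whale data and dsm, and the names ks and k_table are made up for illustration. k.check returns the same table that gam.check prints, but as a matrix.

```r
library(mgcv)

set.seed(1)
# ASSUMPTION: simulated data standing in for the segment data used in the practical
dat <- gamSim(1, n = 400, dist = "normal", scale = 2, verbose = FALSE)

# candidate basis sizes: start small, then double twice
ks <- c(4, 8, 16)

# refit for each k and pull k' and the EDF out of the k.check() table
k_table <- do.call(rbind, lapply(ks, function(k) {
  b <- gam(y ~ s(x2, k = k), data = dat, method = "REML")
  kc <- k.check(b)
  data.frame(k = k,
             k_prime = unname(kc[1, "k'"]),
             edf = unname(kc[1, "edf"]))
}))

k_table
```

If the EDF stops growing as k doubles, the smaller basis was probably adequate. If the EDF keeps tracking k', it is worth trying a larger k still.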

Continuing with that example, if we double k:

dsm_k_check_eg <- dsm(count ~ s(Depth, k=8),
                    df_hn, segs, obs,
                    family=tw())
## Warning in make.data(response, ddf.obj, segment.data, observation.data, : Some
## observations are outside of detection function truncation!
gam.check(dsm_k_check_eg)

## 
## Method: REML   Optimizer: outer newton
## full convergence after 8 iterations.
## Gradient range [-5.675949e-08,4.752562e-08]
## (score 392.2128 & scale 5.216315).
## Hessian positive definite, eigenvalue range [1.313321,300.6316].
## Model rank =  8 / 8 
## 
## Basis dimension (k) checking results. Low p-value (k-index<1) may
## indicate that k is too low, especially if edf is close to k'.
## 
##            k'  edf k-index p-value
## s(Depth) 7.00 4.47    0.78    0.15

We get something much more reasonable: the EDF (4.47) is now well below k' (7). Doubling again:

dsm_k_check_eg <- dsm(count ~ s(Depth, k=16),
                    df_hn, segs, obs,
                    family=tw())
## Warning in make.data(response, ddf.obj, segment.data, observation.data, : Some
## observations are outside of detection function truncation!
gam.check(dsm_k_check_eg)