By the end of this practical, you should feel comfortable with how k changes smooths, with gam.check and rqgam.check, and with obs_exp.
library(Distance)
## Loading required package: mrds
## This is mrds 2.2.3
## Built: R 4.0.2; ; 2020-08-01 10:33:56 UTC; unix
##
## Attaching package: 'Distance'
## The following object is masked from 'package:mrds':
##
## create.bins
library(dsm)
## Loading required package: mgcv
## Loading required package: nlme
## This is mgcv 1.8-33. For overview type 'help("mgcv-package")'.
## Loading required package: numDeriv
## This is dsm 2.3.0
## Built: R 4.0.2; ; 2020-07-16 23:56:50 UTC; unix
library(ggplot2)
library(knitr)
Load the data and the fitted dsm objects from the previous exercises:
load("spermwhale.RData")
load("dsms.RData")
load("df-models.RData")
We’re going to check the DSMs that we fitted in the previous practical, then save those that we think are good!
How k changes smooths
First, we check that k is big enough. We should really do this during model fitting, but we’ve separated the two steps for the practical exercises.
First look at the text output of gam.check: are the values of k' for your models close to the edf in the table it prints? Here’s a silly example where I’ve deliberately set k too small:
dsm_k_check_eg <- dsm(count ~ s(Depth, k=4),
df_hn, segs, obs,
family=tw())
## Warning in make.data(response, ddf.obj, segment.data, observation.data, : Some
## observations are outside of detection function truncation!
gam.check(dsm_k_check_eg)
##
## Method: REML Optimizer: outer newton
## full convergence after 8 iterations.
## Gradient range [-1.66079e-07,1.171305e-07]
## (score 392.2823 & scale 5.216272).
## Hessian positive definite, eigenvalue range [0.8702242,302.9225].
## Model rank = 4 / 4
##
## Basis dimension (k) checking results. Low p-value (k-index<1) may
## indicate that k is too low, especially if edf is close to k'.
##
## k' edf k-index p-value
## s(Depth) 3.0 2.9 0.78 0.18
Note that here, again, I’m using a “chunk” option to suppress the plots printed by gam.check.
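If you only want the basis dimension table and not the diagnostic plots, one option is to call k.check from mgcv directly. The following is a minimal sketch, not part of the practical's own workflow: it assumes your mgcv version exports k.check, and it uses simulated Poisson data (x, y, dat and m are made-up names) rather than the sperm whale data, so that it runs on its own.

```r
library(mgcv)

# Simulated stand-in data (an assumption; not the sperm whale segments)
set.seed(2)
n <- 300
dat <- data.frame(x = runif(n))
dat$y <- rpois(n, exp(1 + sin(2 * pi * dat$x)))

# Deliberately small basis, as in the Depth example above
m <- gam(y ~ s(x, k = 4), data = dat, family = poisson(), method = "REML")

# k.check returns the table that gam.check prints, as a matrix, without plots
kc <- k.check(m)
print(kc)

# Columns are in the printed order: k', edf, k-index, p-value.
# Ratio of EDF to k' for each smooth; values near 1 suggest trying a larger k.
edf_ratio <- kc[, 2] / kc[, 1]
print(edf_ratio)
```

Because the result is an ordinary matrix, it can also be passed to kable if you want a formatted table in a report.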
Generally, if the EDF is close to the value of k you supplied, it is worth doubling k and refitting to see what happens. You can always switch back to the smaller k if there is little difference. The ?choose.k manual page can offer some guidance.
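The doubling strategy can also be written as a short loop. This is a rough sketch under stated assumptions: it fits plain mgcv gam models to simulated Poisson data instead of calling dsm on the practical's objects, and the names dat, ks and edfs are invented for illustration. The idea carries over to dsm fits, since they are GAMs underneath.

```r
library(mgcv)

# Simulated stand-in data (an assumption; not the sperm whale segments)
set.seed(1)
n <- 400
dat <- data.frame(x = runif(n))
dat$y <- rpois(n, exp(1 + sin(2 * pi * dat$x)))

# Refit the same model, doubling k each time, and record the EDF of s(x)
ks <- c(4, 8, 16)
edfs <- sapply(ks, function(k) {
  m <- gam(y ~ s(x, k = k), data = dat, family = poisson(), method = "REML")
  # m$edf holds one value per coefficient; the first is the intercept
  sum(m$edf[-1])
})

# Compare the EDF with k' = k - 1 for each fit
print(data.frame(k = ks, k_prime = ks - 1, edf = round(edfs, 2)))
```

If the EDF stops growing as k doubles, the smaller basis was already flexible enough and there is little to gain from the larger one.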
Continuing with the Depth example above, if we double k:
dsm_k_check_eg <- dsm(count ~ s(Depth, k=8),
df_hn, segs, obs,
family=tw())
## Warning in make.data(response, ddf.obj, segment.data, observation.data, : Some
## observations are outside of detection function truncation!
gam.check(dsm_k_check_eg)
##
## Method: REML Optimizer: outer newton
## full convergence after 8 iterations.
## Gradient range [-5.675949e-08,4.752562e-08]
## (score 392.2128 & scale 5.216315).
## Hessian positive definite, eigenvalue range [1.313321,300.6316].
## Model rank = 8 / 8
##
## Basis dimension (k) checking results. Low p-value (k-index<1) may
## indicate that k is too low, especially if edf is close to k'.
##
## k' edf k-index p-value
## s(Depth) 7.00 4.47 0.78 0.2
We get something much more reasonable: the EDF (4.47) is now well below k' (7). Doubling again:
dsm_k_check_eg <- dsm(count ~ s(Depth, k=16),
df_hn, segs, obs,
family=tw())
## Warning in make.data(response, ddf.obj, segment.data, observation.data, : Some
## observations are outside of detection function truncation!
gam.check(dsm_k_check_eg)