class: title-slide, inverse, center, middle # Lecture 3: Multivariate smoothing<br/>&<br/>model selection <div style="position: absolute; bottom: 15px; vertical-align: center; left: 10px"> <img src="images/02-foundation-vertical-white.png" height="200"> </div> --- # The story so far... - How GAMs work - How to include detection info - Simple spatial-only models --- # Life isn't that simple - Which enivronmental covariates? - Which response distribution? - Which response? <p align="center"><b>How to select between possible models?</b></p> --- class: inverse, middle, center # Adding covariates --- # Model formulation - Pure spatial, pure environmental, mixed? - Prior knowledge of biology/ecology of species - What are drivers of distribution? - What data is available? --- # Sperm whale covariates <img src="dsm3-multiple_smooths-section_files/figure-html/plotall-1.png" height="800" /> --- class: inverse, middle, center # Tobler's first law of geography ## *"Everything is related to everything else, but near things are more related than distant things"* ### Tobler (1970) --- # Implications of Tobler's law ![](dsm3-multiple_smooths-section_files/figure-html/pairrrrs-1.png)<!-- --> --- # Adding smooths - Already know that `+` is our friend - Can build a big model... ```r dsm_all <- dsm(count~s(x, y) + s(Depth) + s(DistToCAS) + s(SST) + s(EKE) + s(NPP), ddf.obj=df_hr, segment.data=segs, observation.data=obs, family=tw()) ``` --- # Each `s()` has its own options - `s(..., k=...)` to adjust basis size - `s(..., bs="...")` for basis type - lots more options (we'll see a few here) --- class: inverse, middle, center # Now we have a huge model, what do we do? --- # Term selection .pull-left[ Two popular approaches (using `\(p\)`-values) **Stepwise selection** - path dependence **All possible subsets** - computationally expensive (fishing?) ] .pull-right[ <img src="images/gnome.jpg"> ] --- # p-values - Test for *zero effect* of a smooth - They are **approximate** for GAMs (but useful) - Reported in `summary` --- # `summary(dsm_all)` ``` ## ## Family: Tweedie(p=1.25) ## Link function: log ## ## Formula: ## count ~ s(x, y) + s(Depth) + s(DistToCAS) + s(SST) + s(EKE) + ## s(NPP) + offset(off.set) ## ## Parametric coefficients: ## Estimate Std. Error t value Pr(>|t|) ## (Intercept) -20.6368 0.2751 -75 <2e-16 *** ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## Approximate significance of smooth terms: ## edf Ref.df F p-value ## s(x,y) 5.225 7.153 1.233 0.2920 ## s(Depth) 3.568 4.439 6.641 1.82e-05 *** ## s(DistToCAS) 1.000 1.000 1.504 0.2204 ## s(SST) 5.927 6.986 2.068 0.0407 * ## s(EKE) 1.763 2.225 2.579 0.0693 . ## s(NPP) 2.393 3.068 0.856 0.4678 ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## R-sq.(adj) = 0.224 Deviance explained = 44.9% ## -REML = 368.97 Scale est. = 3.9337 n = 949 ``` --- # Path dependence is an issue here - (silly) Strategy: want all `\(p\approx 0\)` (`***`), remove terms 1-by-1 - Two different universes appear: ![](images/pathology.png) This isn't very satisfactory! --- # Term selection during fitting .pull-left[ - Already selecting wigglyness of terms - (via a penalty) - What about using it to remove the whole term? ] .pull-right[ ![animation of penalty in action](images/shrinky_dink.gif) ] --- # Shrinkage approach .pull-left[ - Basis `s(..., bs="ts")` - thin plate splines *with shrinkage* - remove the wiggles **then** remove the "linear" bits - nullspace should be shrunk less than the wiggly part ] .pull-right[ ![animation of penalty in action](images/shrinky_dink_ts.gif) ] --- # Shrinkage example ```r dsm_ts_all <- dsm(count~s(x, y, bs="ts") + s(Depth, bs="ts") + s(DistToCAS, bs="ts") + s(SST, bs="ts") + s(EKE, bs="ts") + s(NPP, bs="ts"), ddf.obj=df_hr, segment.data=segs, observation.data=obs, family=tw()) ``` --- # Model with no shrinkage ![](dsm3-multiple_smooths-section_files/figure-html/smooth-no-shrinkage-1.png)<!-- --> --- # ... with shrinkage ![](dsm3-multiple_smooths-section_files/figure-html/smooth-shrinkage-1.png)<!-- --> --- # `summary(dsm_ts_all)` ``` ## ## Family: Tweedie(p=1.277) ## Link function: log ## ## Formula: ## count ~ s(x, y, bs = "ts") + s(Depth, bs = "ts") + s(DistToCAS, ## bs = "ts") + s(SST, bs = "ts") + s(EKE, bs = "ts") + s(NPP, ## bs = "ts") + offset(off.set) ## ## Parametric coefficients: ## Estimate Std. Error t value Pr(>|t|) ## (Intercept) -20.260 0.234 -86.59 <2e-16 *** ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## Approximate significance of smooth terms: ## edf Ref.df F p-value ## s(x,y) 1.8875209 29 0.705 4.33e-06 *** ## s(Depth) 3.6794182 9 4.811 < 2e-16 *** ## s(DistToCAS) 0.0000934 9 0.000 0.6797 ## s(SST) 0.3826654 9 0.063 0.2160 ## s(EKE) 0.8196256 9 0.499 0.0178 * ## s(NPP) 0.0003570 9 0.000 0.8372 ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## R-sq.(adj) = 0.11 Deviance explained = 35.1% ## -REML = 385.04 Scale est. = 4.5486 n = 949 ``` --- # EDF comparison <table> <thead> <tr> <th style="text-align:left;"> </th> <th style="text-align:right;"> tp </th> <th style="text-align:right;"> ts </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> s(x,y) </td> <td style="text-align:right;"> 5.2245 </td> <td style="text-align:right;"> 1.8875 </td> </tr> <tr> <td style="text-align:left;"> s(Depth) </td> <td style="text-align:right;"> 3.5679 </td> <td style="text-align:right;"> 3.6794 </td> </tr> <tr> <td style="text-align:left;"> s(DistToCAS) </td> <td style="text-align:right;"> 1.0001 </td> <td style="text-align:right;"> 0.0001 </td> </tr> <tr> <td style="text-align:left;"> s(SST) </td> <td style="text-align:right;"> 5.9267 </td> <td style="text-align:right;"> 0.3827 </td> </tr> <tr> <td style="text-align:left;"> s(EKE) </td> <td style="text-align:right;"> 1.7631 </td> <td style="text-align:right;"> 0.8196 </td> </tr> <tr> <td style="text-align:left;"> s(NPP) </td> <td style="text-align:right;"> 2.3931 </td> <td style="text-align:right;"> 0.0004 </td> </tr> </tbody> </table> --- # Removing terms? 1. EDF - Terms with EDF<1 may not be useful (can we remove?) 2. non-significant `\(p\)`-value - Decide on a significance level and use that as a rule (In some sense leaving "shrunk" terms in is more "consistent" in terms of variance estimation, but can be computationally annoying) --- class: inverse, middle, center # Comparing models --- # Comparing models - Usually have >1 option - How can we pick? - Even if we have 1 model, is it any good? (This can be subtle, more in model checking tomorrow!) --- # Akaike's "An Information Criterion" - As for many other models, we can get an AIC from our model - Comparison of AIC fine **but**: - can't compare Tweedie (continuous) and negative binomial (discrete) distributions! - (*within* distribution is fine) ```r AIC(dsm_all) ``` ``` ## [1] 1238.288 ``` ```r AIC(dsm_ts_all) ``` ``` ## [1] 1225.822 ``` --- class: inverse, middle, center # Selecting between response distributions --- # Goodness of fit - Q-Q plots - Closer to the line is better - But what does "close" mean? ![](dsm3-multiple_smooths-section_files/figure-html/gof-qq-1.png)<!-- --> --- # Using reference bands - What is down to random variation? - Where does the model actually fail? - Resampling the response, generate bands ``` qq.gam(dsm_all, asp=1, main="Tweedie", cex=5, rep=100) ``` ![](dsm3-multiple_smooths-section_files/figure-html/gof-qq-ref-1.png)<!-- --> --- class: inverse, middle, center # Which response type? --- # Count model `count~...` - Effort is effective effort - Response is count per segment ![](images/esw_change.png) --- # Estimated abundance `abundance.est~...` - Effort is area of each segment - Response is estimated abundance per segment ![](images/Nhat_change.png) --- # When to use each approach? - *Practical choice* - 2 detection function covariate "levels" - "Observer"/"observation" -- change **within** segment - "Segment" -- change **between** segments - "Count model" only lets us use segment-level covariates - "Estimated abundance" lets us use either --- # Sperm whale response example (either) .pull-left[ - Detection covariate: Beaufort - Changes at segment level - `count` or `abundance.est` ] .pull-right[ ![](dsm3-multiple_smooths-section_files/figure-html/plot-beauf-1.png)<!-- --> ] --- # Sperm whale response example (`abundance.est`) .pull-left[ - Detection covariate: group size (`size`) - Changes at observation level - `abundance.est` only ] .pull-right[ ![](dsm3-multiple_smooths-section_files/figure-html/plot-size-1.png)<!-- --> ] --- class: inverse, middle, center # Recap --- # Recap - Adding smooths - Path dependence - Removing smooths - `\(p\)`-values - shrinkage - Comparing models - Comparing response distributions