class: title-slide, inverse, center, middle

# Lecture 1: distance sampling & density surface models

---
class: inverse, middle, center
# Why model abundance spatially?

---
class: inverse, middle, center
# Maps

---

.pull-left[
![](images/ak-talk.png)
![](images/bigbears.png)
]

.pull-right[
- Black bears in Alaska
- Heterogeneous spatial distribution

]

---
class: inverse, middle, center
# Spatial decision making

---

.pull-left[
![](images/block_island.png)
![](images/plot-loon-preds.png)
]

.pull-right[
- Block Island, Rhode Island
- First offshore wind in the USA
- Spatial impact assessment

![](images/block-island-windfarms.png)
]

---
class: inverse, middle, center
# Back to regular distance sampling

---
# How many animals are there? (500!)

![](dsm1-refresher-what_is_a_dsm_files/figure-html/plot-1.png)

---

# Plot sampling

![](dsm1-refresher-what_is_a_dsm_files/figure-html/plotsampling-1.png)

- Surveyed 10 quadrats (each `$0.1^2$` units)
  - Total covered area `$a=10 * 0.1^2 =$` 0.1
- Saw `$n=$` 59 animals
- Estimated density `$\hat{D}=n/a=$` 590
- Total area `$A=1$`
- Estimated abundance `$\hat{N}=\hat{D}A=$` 590
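
The arithmetic on this slide can be checked in a few lines of R. This is a minimal sketch using the numbers above; the variable names are illustrative only.

```r
# Plot sampling: scale the count in the covered area up to the study area
n_quadrats   <- 10      # number of quadrats surveyed
quadrat_side <- 0.1     # side length of each (square) quadrat
n            <- 59      # animals counted in the quadrats
A            <- 1       # total study area

a     <- n_quadrats * quadrat_side^2  # covered area
D_hat <- n / a                        # estimated density
N_hat <- D_hat * A                    # estimated abundance

c(covered_area = a, density = D_hat, abundance = N_hat)
```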

---

# Strip transect

![](dsm1-refresher-what_is_a_dsm_files/figure-html/strip-1.png)

- Surveyed 4 lines (each `$1*0.025$` units)
  - Total covered area `$a=4*1*0.025 =$` 0.1
- Saw `$n=$` 57 animals
- Estimated density `$\hat{D}=n/a=$` 570
- Total area `$A=1$`
- Estimated abundance `$\hat{N}=\hat{D}A=$` 570
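
To see why this estimator works when detection is certain, here is a small simulation sketch. The uniform animal locations, the line positions and the seed are assumptions made for illustration; they are not the data behind the figure.

```r
set.seed(1)

N_true <- 500        # true abundance in the unit square
x <- runif(N_true)   # animal x coordinates (y does not matter for
                     # vertical lines that span the whole square)

line_x     <- c(0.2, 0.4, 0.6, 0.8)  # x positions of 4 vertical lines
half_width <- 0.025 / 2              # each strip is 0.025 wide in total

# an animal is counted if it falls inside any strip (certain detection)
in_strip <- sapply(x, function(xi) any(abs(xi - line_x) <= half_width))
n <- sum(in_strip)

a     <- length(line_x) * 1 * (2 * half_width)  # covered area
N_hat <- (n / a) * 1                            # density times study area A = 1

c(n = n, covered_area = a, N_hat = N_hat)
```

Re-running with different seeds gives estimates that scatter around the true value of 500.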

---
# Detectability matters!

- We've assumed certain detection so far
- This rarely happens in the field
- Distance to the **object** is important
- Detectability should decrease with increasing distance

---
# Distance and detectability

<small>Credit <a href="http://www.nordhavn.com/egret/captains_log_sept11.php">Scott and Mary Flanders</a></small>

---
# Line transect

![](dsm1-refresher-what_is_a_dsm_files/figure-html/lt-1.png)

---
# Line transects - distances

![](dsm1-refresher-what_is_a_dsm_files/figure-html/distance-hist-1.png)
 
- Distances from the **line** (sampler) to the animal
- Now that we have recorded distances, what do they look like?
- "Fold" the distribution over: left/right doesn't matter
- Number of observations drops off with increasing distance

---
# Distance sampling animation

![Animation of line transect survey](images/distanim.gif)

---
# Detection function

![](dsm1-refresher-what_is_a_dsm_files/figure-html/df-fit-1.png)

---
# Distance sampling estimate

- Surveyed 5 lines (each area 1 `$*$` 2 `$*$` 0.025)
  - Total covered area `$a=$` 5 `$*$` 1 `$*$` (2 `$*$` 0.025) = 0.25
- Probability of detection `$\hat{p} =$` 0.546
- Saw `$n=$` 76 animals
- Inflate to `$n/\hat{p}=$` 139.198
- Estimated density `$\hat{D}=\frac{n/\hat{p}}{a}=$` 556.8
- Total area `$A=1$`
- Estimated abundance `$\hat{N}=\hat{D}A=$` 556.8
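
The same calculation as a short R sketch, using the numbers on this slide. Because `$\hat{p}$` is entered here rounded to three decimal places, the intermediate values agree with the slide only to about one decimal place (the slide's figures were presumably computed from an unrounded value). Variable names are illustrative.

```r
n     <- 76      # animals detected
p_hat <- 0.546   # estimated detection probability (rounded, from the slide)
L     <- 5 * 1   # total line length: 5 lines of length 1
w     <- 0.025   # strip half-width on each side of the line
A     <- 1       # study area

a      <- 2 * w * L     # covered area
n_infl <- n / p_hat     # detections inflated to account for missed animals
D_hat  <- n_infl / a    # estimated density
N_hat  <- D_hat * A     # estimated abundance

round(c(covered_area = a, inflated_n = n_infl, N_hat = N_hat), 1)
```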

---
# Reminder of assumptions

1. Animals are distributed independently of the lines

2. On the line, detection is certain

3. Distances are recorded correctly

4. Animals don't move before detection

---
# What are detection functions?

- `$\mathbb{P}\left( \text{detection } \vert \text{ animal at distance } x \right)$`
- "Integrate out distance" == "area under curve" == `$\hat{p}$`
- Many different forms, depending on the data
- All share some characteristics

![](dsm1-refresher-what_is_a_dsm_files/figure-html/df-hn-1.png)
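
As a concrete case of "integrating out distance", the sketch below computes `$\hat{p}$` for a half-normal detection function by numerical integration. The values of `sigma` and `w` are made up for illustration; in practice the scale parameter is estimated from the data.

```r
# Half-normal detection function: g(0) = 1, decreasing with distance
g_hn <- function(x, sigma) exp(-x^2 / (2 * sigma^2))

sigma <- 0.012   # scale parameter (hypothetical value)
w     <- 0.025   # truncation distance (hypothetical value)

# average detection probability: area under g(x) on [0, w], divided by w
area_under_curve <- integrate(g_hn, lower = 0, upper = w, sigma = sigma)$value
p_hat <- area_under_curve / w

p_hat
```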

---
# Fitting detection functions (in R!)

- Using the package `Distance`
- Function `ds()` does most of the work
- More on this in the practical!

```r
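library(Distance)
# fit a detection function to the distance data, truncating at 6000
# (the half-normal key function is the ds() default)
df_hn <- ds(distdata, truncation=6000)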
```
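
To give a feel for what fitting involves, here is a base R sketch that fits a half-normal detection function to simulated distances by maximum likelihood. It is a simplified illustration of the idea, not how `Distance` is implemented; the simulated data and the values of `sigma_true` and `w` are assumptions.

```r
set.seed(42)

g_hn <- function(x, sigma) exp(-x^2 / (2 * sigma^2))

sigma_true <- 0.01    # hypothetical true scale parameter
w          <- 0.025   # hypothetical truncation distance

# simulate: animals uniformly distributed in distance from the line,
# each detected with probability g(x)
x_all    <- runif(5000, 0, w)
detected <- runif(length(x_all)) < g_hn(x_all, sigma_true)
x_obs    <- x_all[detected]

# negative log-likelihood: the pdf of observed distances is
# g(x) divided by the integral of g over [0, w]
nll <- function(log_sigma, x, w) {
  sigma <- exp(log_sigma)
  mu <- integrate(g_hn, 0, w, sigma = sigma)$value
  -sum(-x^2 / (2 * sigma^2) - log(mu))
}

# one-dimensional optimisation on the log scale keeps sigma positive
fit <- optimize(nll, interval = log(c(0.002, 0.1)), x = x_obs, w = w)
sigma_hat <- exp(fit$minimum)
p_hat     <- integrate(g_hn, 0, w, sigma = sigma_hat)$value / w

c(sigma_hat = sigma_hat, p_hat = p_hat)
```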

---
# Horvitz-Thompson-like estimators

- Once we have `$\hat{p}$` how do we get `$\hat{N}$`?
- Rescale the (flat) density and extrapolate

$$
\hat{N} = \frac{\text{study area}}{\text{covered area}}\sum_{i=1}^n \frac{s_i}{\hat{p}_i}
$$

- `$s_i$` are group/cluster sizes
- `$\hat{p}_i$` is the detection probability (from detection function)
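
The estimator above translates directly into R. This is a minimal sketch: the function name `ht_abundance` and the example detections are hypothetical.

```r
# Horvitz-Thompson-like abundance estimate
ht_abundance <- function(size, p, covered_area, study_area) {
  (study_area / covered_area) * sum(size / p)
}

# hypothetical detections: group sizes and their detection probabilities
size <- c(1, 2, 1, 3)
p    <- c(0.5, 0.5, 0.25, 0.5)

N_hat <- ht_abundance(size, p, covered_area = 0.25, study_area = 1)
N_hat
```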

---
# Hidden in this formula is a simple assumption

- Probability of sampling every point in the study area is equal
- Is this true? Sometimes.
- If (and only if) the design is randomised

---
# Many faces of randomisation

---
# Randomisation & coverage probability

- The H-T equation above assumes even coverage
  - (otherwise the coverage probability has to be estimated)

---
# Extra information

![](dsm1-refresher-what_is_a_dsm_files/figure-html/plottracks-1.png)

---
# Extra information - depth

![](dsm1-refresher-what_is_a_dsm_files/figure-html/plotdepth-1.png)

---
# Extra information - SST

![](dsm1-refresher-what_is_a_dsm_files/figure-html/plotsst-1.png)

---
class: inverse, middle, center
# We should model that!

---
# DSM flow diagram

---
# Modelling requirements

- Account for effort
- Flexible/interpretable effects
- Predictions over an arbitrary area
- Include detectability

---
class: inverse, middle, center
# Accounting for effort

---
# Effort

.pull-left[
![](dsm1-refresher-what_is_a_dsm_files/figure-html/tracks2-1.png)
]

.pull-right[
- Have transects
- Variation in counts and covariates along them
- Want a sample unit with minimal variation
- "Segments": chunks of effort
]
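
One simple way to chop a transect into segments is sketched below: pick a target segment length and split the transect into equal pieces close to it. The function name, the column names (`Sample.Label`, `Effort`) and the lengths are illustrative assumptions, not a prescribed method.

```r
# Split a transect of a given length into equal segments of roughly target_length
segment_breaks <- function(transect_length, target_length) {
  n_seg <- max(1, round(transect_length / target_length))
  seq(0, transect_length, length.out = n_seg + 1)
}

breaks <- segment_breaks(transect_length = 10.3, target_length = 2)

segs <- data.frame(
  Sample.Label = paste0("T1-", seq_len(length(breaks) - 1)),  # segment ID
  start        = head(breaks, -1),   # distance along transect at segment start
  end          = tail(breaks, -1)    # distance along transect at segment end
)
segs$Effort <- segs$end - segs$start  # segment length

segs
```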

---
# Chopping up transects

[Physeter catodon by Noah Schlottman](http://phylopic.org/image/dc76cbdb-dba5-4d8f-8cf3-809515c30dbd/)

---
class: inverse, middle, center
# Flexible, interpretable effects

---
# Smooth response

![](dsm1-refresher-what_is_a_dsm_files/figure-html/plotsmooths-1.png)

---
# Explicit spatial effects

![](dsm1-refresher-what_is_a_dsm_files/figure-html/plot-spat-smooths-1.png)

---
class: inverse, middle, center
# Predictions

---
# Predictions over an arbitrary area

.pull-left[

![](dsm1-refresher-what_is_a_dsm_files/figure-html/predplot-1.png)
]

.pull-right[
- Don't want to be restricted to predicting at the segments
- Predict within survey area
- Extrapolate outside (with caution)
- Working on a grid of cells
]
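
A sketch of grid prediction using `mgcv`: fit a spatial smooth to segment counts with area as an offset, then predict for each grid cell (with the cell's area in the offset) and sum. The data are simulated and all names and values are assumptions for illustration.

```r
library(mgcv)
set.seed(1)

# simulated segments: location, area and counts (hypothetical data)
seg <- data.frame(x = runif(200), y = runif(200), area = 0.01)
seg$count <- rpois(200, lambda = seg$area * 500 * exp(seg$x - 0.5))

# counts modelled as a smooth function of space, with area as an offset
m <- gam(count ~ s(x, y, k = 20) + offset(log(area)),
         family = poisson(), data = seg, method = "REML")

# prediction grid: centres of 20 x 20 cells covering the unit square
centres <- (seq_len(20) - 0.5) / 20
grid <- expand.grid(x = centres, y = centres)
grid$area <- (1 / 20)^2   # each cell's area enters through the offset

# predicted abundance per cell
grid$N_hat <- as.numeric(predict(m, newdata = grid, type = "response"))

sum(grid$N_hat)   # estimated abundance over the whole grid
```

Because the offset is part of the model formula, `predict()` picks up the `area` column of the grid, so each prediction is an abundance for that cell rather than for a segment.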

---
class: inverse, middle, center
# Detection information

---
# Including detection information

- Two options:
  - adjust areas to account for **effective effort**
  - use **Horvitz-Thompson estimates** as response

---
# Count model

- Area of each segment, `$A_j$`
  - use `$A_j\hat{p}_j$`
- 💭 effective strip width ( `$\hat{\mu} = w\hat{p}$` )
- Response is counts per segment
- "Adjusting for effort"

![](images/esw_change.png)
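
A sketch of the count model idea with `mgcv`: the offset is the effective area `$A_j\hat{p}_j$` of each segment. The data are simulated, and `w`, `p_hat` and the `depth` covariate are hypothetical. A Poisson response is used only because the counts are simulated as Poisson; overdispersed families such as `tw()` or `nb()` are common alternatives for real counts.

```r
library(mgcv)
set.seed(2)

w     <- 0.025   # truncation distance (hypothetical)
p_hat <- 0.55    # detection probability from a detection function (hypothetical)

# simulated segment table (hypothetical data)
seg <- data.frame(Effort = rep(0.1, 300),      # segment lengths
                  depth  = runif(300, 0, 1))   # a segment-level covariate
seg$area     <- 2 * w * seg$Effort             # A_j
seg$eff_area <- seg$area * p_hat               # A_j * p_hat_j

# true density declines with depth; expected counts use the effective area
dens      <- 2000 * exp(-2 * seg$depth)
seg$count <- rpois(300, dens * seg$eff_area)

# counts per segment as the response, effective area as the offset
m_count <- gam(count ~ s(depth) + offset(log(eff_area)),
               family = poisson(), data = seg, method = "REML")

summary(m_count)$s.table
```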

---

# Estimated abundance

- Effort is area of each segment
- Estimate H-T abundance per segment

$$
\hat{n}_j = \sum_i \frac{s_i}{\hat{p}_i}
$$

(where the `$i$` observations are in segment `$j$`)

![](images/Nhat_change.png)
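
The per-segment sum can be computed from an observation table as sketched below. The tables and column names are hypothetical; segments without detections get an estimate of zero.

```r
# observation table: one row per detection (hypothetical data)
obs <- data.frame(
  Sample.Label = c("s1", "s1", "s2", "s4"),  # segment of each detection
  size         = c(1, 3, 2, 1),              # group size s_i
  p            = c(0.5, 0.6, 0.4, 0.5)       # detection probability p_i
)
segs <- data.frame(Sample.Label = paste0("s", 1:5))

# Horvitz-Thompson estimate per segment: sum of s_i / p_i
n_hat <- tapply(obs$size / obs$p, obs$Sample.Label, sum)

# segments with no detections keep a value of zero
segs$n_hat <- 0
segs$n_hat[match(names(n_hat), segs$Sample.Label)] <- n_hat

segs
```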

---
# Detectability and covariates

- 2 covariate "levels" in detection function
  - "Observer"/"observation" -- change **within** segment
  - "Segment" -- change **between** segments
- "Count model" only lets us use segment-level covariates
- "Estimated abundance" lets us use either

---
# When to use each approach?

- Generally "nicer" to adjust effort
- Keep response (counts) close to what was observed
- **Unless** you want observation-level covariates

---
class: inverse, middle, center
# Data requirements

---
# What do we need?

- Need to "link" data
  - ✅ Distance data/detection function
  - ✅ Segment data
  - ✅ Observation data (segments 🔗 detections)

More info on course website.
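
A toy sketch of how the three pieces link together. The column names (`Sample.Label`, `Effort`, `object`) and all values are assumptions for illustration; check the course material for the names the software actually requires.

```r
# segment table: one row per segment
segdata <- data.frame(
  Sample.Label = c("s1", "s2", "s3"),   # segment ID
  Effort       = c(2.0, 2.1, 1.9),      # segment length
  depth        = c(120, 340, 800)       # a segment-level covariate
)

# distance data: one row per detection, used to fit the detection function
distdata <- data.frame(
  object   = 1:3,                       # detection ID
  distance = c(120, 2500, 900),         # perpendicular distance
  size     = c(1, 2, 1)                 # group size
)

# observation table: links each detection (object) to a segment
obsdata <- data.frame(
  object       = 1:3,
  Sample.Label = c("s1", "s1", "s3")
)

# check that every link resolves
links_ok <- all(obsdata$object %in% distdata$object) &&
  all(obsdata$Sample.Label %in% segdata$Sample.Label)
links_ok

# joined view: detections together with their segment covariates
linked <- merge(merge(obsdata, distdata, by = "object"),
                segdata, by = "Sample.Label")
linked
```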

---

![](images/dsm_tables.png)

---
class: inverse, middle, center
# Example data

---
# Example data

---
# Example data

---
# Sperm whales

.pull-left[
<img src="images/spermwhale.png" width="100%">
]
.pull-right[

- Hang out near canyons, eat squid
- Surveys in 2004, US east coast
- Thanks to Debi Palka (NOAA NEFSC) and Lance Garrison (NOAA SEFSC) for data, and to Jason Roberts (Duke University) for data prep.
]

---
# Recap

- Model counts or estimated abundance
- The effort is accounted for differently
- Flexible models are good
- Incorporate detectability
- 2 tables + detection function needed