---
title: "Methods for Reconstructing Paleo Food Webs"
author:
  - name: Tanya Strydom
    id: ts
    orcid: 0000-0001-6067-1349
    corresponding: true
    email: t.strydom@sheffield.ac.uk
    role: 
    - conceptualization: lead
    - methodology: lead
    affiliation:
      - id: sheffield
        name: School of Biosciences, University of Sheffield, Sheffield, UK
  - name: Andrew P. Beckerman
    id: apb
    orcid: 0000-0002-7859-8394
    corresponding: false
    role: 
    - conceptualization: lead
    - methodology: supporting
    affiliations:
      - ref: sheffield
funding: "The author(s) received no specific funding for this work. Well they did I just haven't done the homework"
keywords:
  - food web
  - network construction
abstract: |
  TODO.
date: last-modified
bibliography: references.bib
citation:
  container-title: TBD
number-sections: true
---

# Why build paleo food webs?

-   Because it's interesting?

-   Value in using hindcasting to aid in forecasting. *e.g.,* the Toarcian ms [@dunhill2024] shows how we can use these paleo communities to understand trophic-level responses to extinctions.

# How do we do it?

-   There is an evolving body of work that focuses on developing tools specifically for the task of predicting food webs.

-   There are a handful that have been developed specifically in the context of paleo settings *e.g.,* TODO but we can also talk about those that might have been developed/tested in contemporary settings but still have applicability in paleo ones.

-   Different underlying theory though

    -   Focus here on the idea of different 'currencies' but also aggregations - energy vs compatibility.

-   Insert brief overview of the different methods as they pertain to approach (so the T4T triangle)

-   Challenges we face (even in contemporary settings)?

    -   keep high level - I think the argument here should fall more in the data trade offs...

# Understanding how networks are different

It is important to be aware that networks can be configured in different ways depending on how the interactions are defined (Strydom, in prep). Broadly, we distinguish metawebs, which represent *potential* interactions, from realised networks, which represent the subset of potential interactions that are realised as a result of community and environmental context.

# Challenges specific to paleo communities/networks

Although there is a suite of tools and methods that have been developed to predict species interactions and networks, they will not all be suitable for predicting paleo communities. One reason is that the fossil record is incomplete and preservation is biased \[REF\], which means that we have an incomplete picture of the entire community. Fossils are 2D and only represent specific 'parts' of an individual (the hard and bony bits), so we do not have a complete picture of the physical traits of species (*e.g.,* no body mass, although we do have size) or of their behaviours, and our ability to construct well-resolved phylogenetic trees declines the deeper we go back in time. Owing to the patchy nature of fossils one often has to aggregate over large spatial scales, and because fossils are preserved in 2D we have no *real* idea of spatial arrangements. This is compounded by the fact that fossils are not necessarily preserved or found 'in situ' but can be moved (*e.g.,* in alluvial deposits). Methodologically speaking, some tools that 'learn' from contemporary communities (*e.g.,* @strydom2023, @caron2022) will become 'worse' the further one goes back in time, since species then look very different from those alive now, but they can still be useful for 'recent' communities (*e.g.,* @fricke2022). Something about the intersectionality of the data we don't have for paleo communities and the data we need for some of the different modelling approaches.

# Dataset Overview

-   Species

-   Time/space

-   And probably some other paleo things that will be relevant...

![It would be very sexy if we could get a figure that looks something like this together...](figures/data_mock.png)

# Methods

## Models

| Model | Predicts | Notes |
|-------------------|---------------------|--------------------------------|
| Allometric diet breadth model | Realised network |  |
| Body size ratio model | Metaweb (?) |  |
| Niche model | Structural network | Is not species specific - cannot apply species metadata |
| Paleo food web inference model | Realised network (if downsampling) |  |

: A summary of the different families of tools that can be used to generate paleo food webs. {#tbl-models}

### Paleo food web inference model

The Paleo food web inference model (PFIM; @shaw2024) uses a series of rules for a set of trait categories (such as habitat and body size) to determine if an interaction can feasibly occur between a species pair. If all conditions are met for the different rule classes then an interaction is deemed to be feasible. The original work put forward in @shaw2024 also includes a 'downsampling' step developed by @roopnarine2006 that uses a power law, defined by the link distribution, to 'prune' down some of the links. This approach is similar to that developed by @roopnarine2017, with the exception that Shaw does not specifically bin species into guilds. We choose to use the method developed by Shaw since both methods should produce extremely similar networks, as they are built on the same underlying philosophy.
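
The 'all rule classes must pass' logic can be sketched as below. This is only a minimal illustration: the three rules, the `reachable` lookup, and the dictionary layout are hypothetical placeholders that we made up for the example, and they are not the rule set used by @shaw2024.

```python
SIZE_ORDER = ["tiny", "small", "medium", "large", "very large", "gigantic"]

# HYPOTHETICAL rules for illustration only; not the published PFIM rule set.
def size_rule(consumer, resource):
    """Assumed rule: a consumer cannot take a resource in a larger size class."""
    return SIZE_ORDER.index(consumer["size"]) >= SIZE_ORDER.index(resource["size"])

def tiering_rule(consumer, resource):
    """Assumed rule: a consumer only reaches resources in certain tiers."""
    reachable = {
        "pelagic": {"pelagic"},
        "surficial": {"surficial", "semi-infaunal"},
    }
    return resource["tiering"] in reachable.get(consumer["tiering"], set())

def feeding_rule(consumer, resource):
    """Assumed rule: only predators establish links in this toy example."""
    return consumer["feeding"] == "predator"

RULES = [size_rule, tiering_rule, feeding_rule]

def feasible(consumer, resource):
    """An interaction is feasible only if every rule class is satisfied."""
    return all(rule(consumer, resource) for rule in RULES)

# Toy species records (hypothetical trait assignments).
ammonite = {"size": "medium", "tiering": "pelagic", "feeding": "predator"}
small_fish = {"size": "small", "tiering": "pelagic", "feeding": "predator"}
bivalve = {"size": "small", "tiering": "surficial", "feeding": "suspension feeder"}

print(feasible(ammonite, small_fish))
print(feasible(ammonite, bivalve))
```

Applying `feasible` to every ordered species pair would give a metaweb of potential links, which is what the downsampling step would then prune.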

#### Defining organism ecologies, feeding interactions and trophic guilds

> This is currently verbatim from the Dunhill ms...

Modes of life were defined for each fossil species based on the ecological traits defined in the Bambach ecospace model [@bambach2007]. Ecological traits were assigned based on interpretations from the published literature, which are largely based on functional morphology and information from extant relatives. Information on the body size of each species was also recorded by summarising mean specimen sizes from the section into a categorical classification. The following ecological characteristics were recorded for each fossil species: motility (fast, slow, facultative, non-motile), tiering (pelagic, erect, surficial, semi-infaunal, shallow infaunal, deep infaunal), feeding (predator, suspension feeder, deposit feeder, mining, grazer), and size: gigantic (\>500 mm), very large (\>300–500 mm), large (\>100–300 mm), medium (\>50–100 mm), small (\>10–50 mm), tiny (≤10 mm). Size categories are defined by the longest axis of the fossil, estimates of tracemaker size from trace fossils based on literature accounts, or by extrapolating the total length for belemnites from the preserved guard using established approaches.
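
As a small illustration, the size binning above could be expressed as a lookup on the longest axis. The function name `size_class` is our own, and we assume the upper bound of each class is inclusive (as implied by '\>300–500 mm' and so on).

```python
def size_class(length_mm: float) -> str:
    """Map longest-axis length (mm) to the categorical size classes listed above."""
    if length_mm <= 0:
        raise ValueError("length must be positive")
    # (inclusive upper bound in mm, label), smallest class first
    bounds = [
        (10, "tiny"),
        (50, "small"),
        (100, "medium"),
        (300, "large"),
        (500, "very large"),
    ]
    for upper, label in bounds:
        if length_mm <= upper:
            return label
    return "gigantic"  # anything above 500 mm

for length in (8, 75, 620):
    print(length, size_class(length))
```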

### Allometric diet breadth model

The Allometric diet breadth model (ADBM; @petchey2008) is rooted in feeding theory and allocates the links between species based on energetics, predicting the diet of a consumer from its energy intake. This means that the model predicts not only the number of links in a network but also the arrangement of these links based on the diet breadth of a species, where the rate of energy intake ($K$) for a diet of $k$ resources is defined as follows:

$$
K = \frac{\sum_{i=1}^{k}\lambda_{ij}E_{i}}{1+\sum_{i=1}^{k}\lambda_{ij}H_{ij}}
$$ {#eq-adbm}

where $\lambda_{ij}$ is the encounter rate, which is the product of the attack rate $A_{i}$ and resource density $N_{i}$, $E_{i}$ is the energy content of the resource, and $H_{ij}$ is the ratio handling time, which depends on the ratio of resource ($M_{i}$) to consumer ($M_{j}$) body mass as follows:

$$
H_{ij} =
\begin{cases}
\dfrac{h}{b - \frac{M_{i}}{M_{j}}} & \text{if } \frac{M_{i}}{M_{j}} < b \\
\infty & \text{if } \frac{M_{i}}{M_{j}} \geq b
\end{cases}
$$

Refer to @petchey2008 for more details as to how these different terms are parametrised.
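
To make the moving parts concrete, here is a minimal sketch of the handling-time and energy-intake calculations for a single consumer. It assumes that resources are added to the diet in order of profitability ($E_{i}/H_{ij}$) and that the diet is the set that maximises $K$, which is our reading of the optimal-foraging step. The function names, the parameter values, and the choice to set energy content equal to body mass are illustrative assumptions, not the parametrisation of @petchey2008.

```python
import math

def handling_time(m_resource, m_consumer, h, b):
    """Ratio handling time H_ij: finite only while M_i / M_j < b."""
    ratio = m_resource / m_consumer
    return h / (b - ratio) if ratio < b else math.inf

def intake_rate(diet, encounter, energy, handling):
    """Rate of energy intake K for the resource indices in `diet`."""
    gain = sum(encounter[i] * energy[i] for i in diet)
    cost = 1.0 + sum(encounter[i] * handling[i] for i in diet)
    return gain / cost

def best_diet(encounter, energy, handling):
    """Rank resources by profitability and keep the prefix that maximises K."""
    usable = [i for i in range(len(energy)) if math.isfinite(handling[i])]
    ranked = sorted(usable, key=lambda i: energy[i] / handling[i], reverse=True)
    best, best_rate = [], 0.0
    for k in range(1, len(ranked) + 1):
        rate = intake_rate(ranked[:k], encounter, energy, handling)
        if rate > best_rate:
            best, best_rate = ranked[:k], rate
    return best, best_rate

# Illustrative values only (hypothetical masses and parameters).
resource_masses = [1.0, 5.0, 50.0]
consumer_mass = 20.0
handling = [handling_time(m, consumer_mass, h=2.0, b=2.0) for m in resource_masses]
energy = resource_masses          # assumption: energy content scales with mass
encounter = [0.5, 0.5, 0.5]       # assumption: equal encounter rates

diet, rate = best_diet(encounter, energy, handling)
print(diet, round(rate, 3))
```

Repeating this for every consumer in the community gives one row of the predicted network per consumer.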

### Body size ratio model

The body size ratio model [@rohr2010] determines feeding interactions using the ratio between consumer ($M_{i}$) and resource ($M_{j}$) body sizes - which supposedly stems from niche theory (still trying to reconcile that). The probability of a link existing between a consumer and resource (in its most basic form) is defined as follows:

$$
P_{ij} = \frac{p}{1+p}
$$

where

$$
p = \exp\left[\alpha + \beta \log\left(\frac{M_{i}}{M_{j}}\right) + \gamma \log^{2}\left(\frac{M_{i}}{M_{j}}\right)\right]
$$ {#eq-bodymass}

The original latent-trait model developed by @rohr2010 also included an additional latent trait term $v_{i} \delta f_{j}$; however, for simplicity we will use @eq-bodymass as per @yeakel2014. Based on @rohr2010 it is possible to estimate the parameters $\alpha$, $\beta$, and $\gamma$ using a GLM, but we will use the parameters from @yeakel2014, which were 'trained' on the Serengeti food web data and are as follows: $\alpha$ = 1.41, $\beta$ = 3.75, and $\gamma$ = 1.87.
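
A minimal sketch of the link probability calculation is given below. We assume natural logarithms, and we use the algebraically equivalent logistic form $P_{ij} = 1/(1 + e^{-x})$ so that extreme body size ratios do not overflow. The function name and the parameter values in the usage lines are placeholders for illustration, not the fitted values.

```python
import math

def link_probability(m_consumer, m_resource, alpha, beta, gamma):
    """P_ij = p / (1 + p), with p = exp(alpha + beta*log(r) + gamma*log(r)^2)."""
    log_ratio = math.log(m_consumer / m_resource)  # assumption: natural log
    x = alpha + beta * log_ratio + gamma * log_ratio ** 2
    # p / (1 + p) with p = exp(x) equals the logistic function of x;
    # branch on the sign of x to avoid overflow in exp().
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    p = math.exp(x)
    return p / (1.0 + p)

# Illustrative parameters only (hypothetical, chosen to give a hump-shaped curve).
for consumer_mass in (1.0, 10.0, 100.0, 10000.0):
    prob = link_probability(consumer_mass, 1.0, alpha=0.0, beta=1.0, gamma=-0.2)
    print(consumer_mass, round(prob, 3))
```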

### L matrix

For now we can link to the ATNr package [@gauzens2023] until I can find a more suitable manuscript that breaks down this construction method [see @schneider2016]. Interactions are determined by allometric rules (the ratio of consumer ($M_{i}$) and resource ($M_{j}$) body sizes) and a Ricker function defined by $R_{opt}$ and $\gamma$, which returns the probability of a link ($P_{ij}$) existing between a consumer and resource as follows:

$$
P_{ij} = (L \times \exp(1 - L))^{\gamma}
$$

where

$$
L = \frac{M_{i}}{M_{j} \times R_{opt}}
$$

It is also possible to apply a threshold value to $P_{ij}$, whereby any probabilities below that threshold are set to zero.
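
The two equations and the optional threshold can be sketched as follows. This is a plain re-implementation for illustration and does not call the ATNr package; the function name and the values used in the usage lines are our own assumptions.

```python
import math

def ricker_link_probability(m_consumer, m_resource, r_opt, gamma, threshold=0.0):
    """Ricker-shaped link probability that peaks (P = 1) when M_i / M_j = R_opt."""
    l = m_consumer / (m_resource * r_opt)
    p = (l * math.exp(1.0 - l)) ** gamma
    # Probabilities below the threshold are set to zero.
    return p if p >= threshold else 0.0

# Illustrative values only (hypothetical R_opt, gamma, and threshold).
for resource_mass in (0.1, 1.0, 10.0, 100.0):
    prob = ricker_link_probability(100.0, resource_mass, r_opt=100.0, gamma=2.0,
                                   threshold=0.01)
    print(resource_mass, round(prob, 4))
```

Larger values of $\gamma$ narrow the curve around $R_{opt}$, so fewer consumer-resource pairs clear a given threshold.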

### Niche model

The niche model [@williams2000] introduces the idea that species interactions are based on the 'feeding niche' of a species. Broadly, all species are randomly assigned a 'feeding niche' range and all species that fall in this range can be consumed by that species (thereby allowing for cannibalism). The niche of each species is randomly assigned and the range of each species' niche is (in part) constrained by the specified connectance of the network. The niche model has also been modified, although it appears that adding to the 'complexity' of the niche model does not improve on its ability to generate a more ecologically 'correct' network [@williams2008].
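
For reference, a minimal sketch of the niche model is given below. It follows our understanding of the usual formulation: niche values are drawn uniformly on $[0, 1]$, the range is the niche value multiplied by a Beta$(1, \beta)$ draw with $\beta = 1/(2C) - 1$, and the range centre is drawn uniformly between half the range and the niche value. Refinements such as forcing the species with the smallest niche value to be basal, or rejecting disconnected webs, are left out, and the function name is our own.

```python
import random

def niche_model(n_species, connectance, seed=None):
    """Return (niche values, adjacency); adjacency[i][j] is True if i eats j."""
    if not 0.0 < connectance < 0.5:
        raise ValueError("connectance must lie in (0, 0.5)")
    rng = random.Random(seed)
    beta = 1.0 / (2.0 * connectance) - 1.0
    niche = sorted(rng.random() for _ in range(n_species))
    adjacency = []
    for n_i in niche:
        width = n_i * rng.betavariate(1.0, beta)   # feeding range
        centre = rng.uniform(width / 2.0, n_i)     # range centre at or below n_i
        low, high = centre - width / 2.0, centre + width / 2.0
        # i eats every species whose niche value falls inside its range,
        # which can include i itself (cannibalism).
        adjacency.append([low <= n_j <= high for n_j in niche])
    return niche, adjacency

niche, adjacency = niche_model(n_species=30, connectance=0.15, seed=42)
n_links = sum(sum(row) for row in adjacency)
print(n_links, n_links / 30 ** 2)
```

Because species identity plays no role, the output can only be compared to other networks in terms of structure, not on a link-by-link basis.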

## Assessing model performance

In order to assess and compare the different models it is beneficial to think about the different aspects of the network, as well as the interactions, that are being predicted by each model. It is perhaps beneficial to think of these across different 'scales' of organisation within the network, namely macro (the entire network), meso (smaller interacting units within the network), and micro (species-level attributes). Although there are a myriad of possible ways to 'measure' and analyse ecological networks [@delmas2018], we still lack a clear set of guidelines for assessing how well models recover network structure [@allesina2008], and it is beneficial to use a small subset of metrics that can clearly be tied to broader aspects of network function or to capturing an ecological process.

### Macro network properties

**Connectance** [@martinez1992] has been shown to be the feature of networks that underpins a series of other properties and functions [@strydom2021b], and so it is perhaps the most important structural attribute for a model to be able to retrieve correctly. Additionally, we consider the **complexity** of networks by calculating their SVD entropy (this gives us an estimate of the physical as opposed to behavioural complexity of networks; @strydom2021). We could also look at the rank/rank deficiency of networks, which (theoretically) represents the number of unique interaction strategies in the network [@strydom2021]. This may be specifically interesting in terms of comparing pre- and post-extinction networks, but also as a way to unpack the 'functional redundancy' that some models may introduce.
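
For reference, and as we understand the definitions in the cited works (these should be checked against the sources), connectance for a network of $S$ species and $L$ links, and SVD entropy for an adjacency matrix with $k$ non-zero singular values $\sigma_{i}$, can be written as:

$$
Co = \frac{L}{S^{2}}
$$

$$
J = -\frac{1}{\ln k}\sum_{i=1}^{k} s_{i} \ln s_{i}, \qquad s_{i} = \frac{\sigma_{i}}{\sum_{j=1}^{k}\sigma_{j}}
$$

Under this reading $k$ is the rank of the adjacency matrix, so the rank deficiency is $S - k$.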

### Meso network properties

Motifs represent smaller subsets of interactions between three species, and are argued to capture dynamics that are likely to be ecologically relevant [@milo2002; @stouffer2007]. Here we specifically look at the number of **linear chains**, **omnivory**, **apparent competition**, and **direct competition** motifs. In the broader context, the ability of a model to capture these smaller motifs will indicate its suitability for understanding the more dynamic components of network ecology.
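
A brute-force sketch of counting these four motifs is shown below. We assume each motif is counted as an induced three-species subgraph, that cannibalistic links are ignored, that 'direct competition' means two consumers sharing one resource, and that 'apparent competition' means two resources sharing one consumer. The representation of a network as `(consumer, resource)` pairs and the function name are our own choices.

```python
from itertools import combinations, permutations

# Each motif is a set of (consumer, resource) links among placeholder species 0, 1, 2.
MOTIFS = {
    "linear chain": {(0, 1), (1, 2)},
    "omnivory": {(0, 1), (1, 2), (0, 2)},
    "direct competition": {(0, 2), (1, 2)},    # two consumers, one shared resource
    "apparent competition": {(0, 1), (0, 2)},  # two resources, one shared consumer
}

def count_motifs(links):
    """Count three-species induced subgraphs that match each motif exactly."""
    links = {(c, r) for c, r in links if c != r}  # assumption: drop cannibalism
    species = sorted({s for link in links for s in link})
    counts = dict.fromkeys(MOTIFS, 0)
    for trio in combinations(species, 3):
        induced = {(c, r) for c, r in links if c in trio and r in trio}
        for name, pattern in MOTIFS.items():
            # Try every assignment of the three species to the motif roles.
            if any({(p[a], p[b]) for a, b in pattern} == induced
                   for p in permutations(trio)):
                counts[name] += 1
    return counts

web = [("a", "c"), ("b", "c"), ("c", "d")]
print(count_motifs(web))
```

This enumerates every species triple, so it is only practical for small networks; it is meant to pin down the definitions rather than to be efficient.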

### Micro network properties

The number of interactions established (**generality**) or received (**vulnerability**) by each species [@schoener1989] is (broadly) indicative of consumer-resource relationships and the diet breadth of species [ref]. Although this is usually determined at the species level, the standard deviation of the generality and vulnerability of species is often used when benchmarking predicted networks [*e.g.,* @williams2008; @petchey2008].
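
As a minimal sketch, generality and vulnerability can be tallied directly from a list of `(consumer, resource)` links. The function name is our own, and the use of the population standard deviation on raw (unnormalised) counts is an assumption; some benchmarks normalise by the mean number of links per species first.

```python
from statistics import pstdev

def generality_vulnerability(links):
    """Return two dicts: resources per consumer and consumers per resource."""
    unique_links = set(links)
    species = {s for link in unique_links for s in link}
    generality = dict.fromkeys(species, 0)
    vulnerability = dict.fromkeys(species, 0)
    for consumer, resource in unique_links:
        generality[consumer] += 1      # interactions established
        vulnerability[resource] += 1   # interactions received
    return generality, vulnerability

web = [("fish", "zoo"), ("fish", "phyto"), ("zoo", "phyto")]
gen, vul = generality_vulnerability(web)
print(gen, vul)
print(pstdev(gen.values()), pstdev(vul.values()))
```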

The **specificity** of species in a network is measured as a function of the proportion of resources they effectively use [@poisot2012].

> **Shape:** to determine if the 'shape' of the network is correct we look at the ratio of 'top':'basal' species (where 'top' species are those that have a vulnerability of 0 and 'basal' species have a generality of 0) as well as the distance to base from one of the top species (this will represent the shortest path, but a large discrepancy between the real and predicted network would indicate that the model is not predicting a similar 'shape'). This will allow us to see if the models construct tall 'pencil' vs flat 'pancake' networks (Beckerman 2024, pers comms). A small (\< 1) top:basal ratio will thus be indicative of a 'bottom-heavy' network, and the opposite for larger ratios.
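
The two 'shape' quantities can be sketched as follows, again using `(consumer, resource)` pairs. Ignoring cannibalistic links when identifying top and basal species is our assumption, as are the function names; the distance is found with a breadth-first search down the feeding links.

```python
from collections import deque

def top_and_basal(links):
    """Top species have a vulnerability of 0; basal species a generality of 0."""
    links = {(c, r) for c, r in links if c != r}  # assumption: ignore cannibalism
    species = {s for link in links for s in link}
    top = species - {r for _, r in links}
    basal = species - {c for c, _ in links}
    return top, basal

def distance_to_base(links, start):
    """Shortest number of feeding links from `start` down to any basal species."""
    diet = {}
    for c, r in links:
        if c != r:
            diet.setdefault(c, set()).add(r)
    queue, seen = deque([(start, 0)]), {start}
    while queue:
        sp, dist = queue.popleft()
        if sp not in diet:       # eats nothing, so it is basal
            return dist
        for r in diet[sp] - seen:
            seen.add(r)
            queue.append((r, dist + 1))
    return None                  # no basal species reachable (e.g. a closed loop)

web = [("bird", "fish"), ("fish", "zoo"), ("fish", "phyto"), ("zoo", "phyto")]
top, basal = top_and_basal(web)
print(len(top) / len(basal), distance_to_base(web, "bird"))
```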

### Interactions

**Interaction turnover** [@poisot2012] tells us which interactions are 'conserved' (shared) across the networks from the same period but constructed using different models.
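
A minimal sketch of comparing the links of two networks built for the same community is given below. It splits links into those shared by both networks and those unique to each, and summarises the difference with Whittaker's dissimilarity; the choice of that particular index, and the function name, are our assumptions rather than something fixed by the text above.

```python
def interaction_turnover(links_a, links_b):
    """Return shared links, links unique to each network, and Whittaker's beta."""
    set_a, set_b = set(links_a), set(links_b)
    shared = set_a & set_b
    only_a = set_a - set_b
    only_b = set_b - set_a
    a, b, c = len(shared), len(only_a), len(only_b)
    if a + b + c == 0:
        raise ValueError("both networks are empty")
    # 0 when the two link sets are identical, 1 when they share no links.
    beta_w = (a + b + c) / ((2 * a + b + c) / 2) - 1
    return shared, only_a, only_b, beta_w

model_one = [("fish", "zoo"), ("zoo", "phyto")]
model_two = [("fish", "zoo"), ("fish", "phyto")]
shared, only_one, only_two, beta = interaction_turnover(model_one, model_two)
print(shared, only_one, only_two, beta)
```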

# Results

## Comparing predicted networks

![stuff... For display purposes the counts for the different motifs are log transformed](figures/summary.png){#fig-summary}

## Comparing inference

## Extinctions

![Dashed line indicates the (mean) extinction simulation results (post value, start values are those estimated by the relevant model). For display purposes the counts for the different motifs are log transformed](figures/extinction.png){#fig-extinction}

![Dark line indicates 'real' extinction simulation results; the lighter lines show each model individually, which is also denoted by the linetype. For display purposes the counts for the different motifs are log transformed](figures/extinction_all_results.png)

# References {.unnumbered}

::: {#refs}
:::