# Model Occurrence of Multiple Radiocarbon Samples as a Variable-Rate Poisson Process

Source:`R/PPcalibrate.R`

`PPcalibrate.Rd`

This function calibrates a set of radiocarbon (\({}^{14}\)C) samples, and provides an estimate
of how the underlying rate at which the samples occurred varies over calendar time (including any
specific changepoints in the rate). The method can be used as an alternative approach to summarise
calendar age information contained in a set of related \({}^{14}\)C samples, enabling inference
on the latent *activity* rate which led to their creation.

It takes as an input a set of \({}^{14}\)C determinations and associated \(1\sigma\) uncertainties, as well as the radiocarbon age calibration curve to be used. The (calendar age) occurrence of these radiocarbon (\({}^{14}\)C) samples is modelled as a Poisson process. The underlying rate of this Poisson process \(\lambda(t)\), which represents the rate at which the samples occurred, is considered unknown and assumed to vary over calendar time.

Specifically, the sample occurrence rate \(\lambda(t)\) is modelled as piecewise constant, but with an unknown number of changepoints, which can occur at unknown times. The value of \(\lambda(t)\) between any set of changepoints is also unknown. The function jointly calibrates the given \({}^{14}\)C samples under this model, and simultaneously provides an estimate of \(\lambda(t)\). Fitting is performed using reversible-jump MCMC within Gibbs.

It returns estimates for the calendar age of each individual radiocarbon sample in the input set; and broader output on the estimated value of \(\lambda(t)\) which can be used by other library functions. Analysis of the estimated changepoints in the piecewise \(\lambda(t)\) permits wider inference on whether the occurrence rate of samples changed significantly at any particular calendar time and, if so, when and by how much.

For more information read the vignette: `vignette("Poisson-process-modelling", package = "carbondate")`

## Usage

```
PPcalibrate(
rc_determinations,
rc_sigmas,
calibration_curve,
F14C_inputs = FALSE,
n_iter = 1e+05,
n_thin = 10,
use_F14C_space = TRUE,
show_progress = TRUE,
calendar_age_range = NA,
calendar_grid_resolution = 1,
prior_h_shape = NA,
prior_h_rate = NA,
prior_n_internal_changepoints_lambda = 3,
k_max_internal_changepoints = 30,
rescale_factor_rev_jump = 0.9,
bounding_range_prob_cutoff = 0.001,
initial_n_internal_changepoints = 10,
grid_extension_factor = 0.1,
use_fast = TRUE,
fast_approx_prob_cutoff = 0.001
)
```

## Arguments

- rc_determinations
A vector of observed radiocarbon determinations. Can be provided either as \({}^{14}\)C ages (in \({}^{14}\)C yr BP) or as F\({}^{14}\)C concentrations.

- rc_sigmas
A vector of the (1-sigma) measurement uncertainties for the radiocarbon determinations. Must be the same length as

`rc_determinations`

and given in the same units.- calibration_curve
A dataframe which must contain one column

`calendar_age_BP`

, and also columns`c14_age`

and`c14_sig`

or`f14c`

and`f14c_sig`

(or both sets). This format matches the curves supplied with this package, e.g., intcal20, intcal13, which contain all 5 columns.- F14C_inputs
`TRUE`

if the provided`rc_determinations`

are F\({}^{14}\)C concentrations and`FALSE`

if they are radiocarbon ages. Defaults to`FALSE`

.- n_iter
The number of MCMC iterations (optional). Default is 100,000.

- n_thin
How much to thin the MCMC output (optional). Will store every

`n_thin`

\({}^\textrm{th}\) iteration. 1 is no thinning, while a larger number will result in more thinning. Default is 10. Must choose an integer greater than 1. Overall number of MCMC realisations stored will be \(n_{\textrm{out}} = \textrm{floor}( n_{\textrm{iter}}/n_{\textrm{thin}}) + 1\) so do not choose`n_thin`

too large to ensure there are enough samples from the posterior to use for later inference.- use_F14C_space
If

`TRUE`

(default) the calculations within the function are carried out in F\({}^{14}\)C space. If`FALSE`

they are carried out in \({}^{14}\)C age space. We recommend selecting`TRUE`

as, for very old samples, calibrating in F\({}^{14}\)C space removes the potential affect of asymmetry in the radiocarbon age uncertainty.*Note:*This flag can be set independently of the format/scale on which`rc_determinations`

were originally provided.- show_progress
Whether to show a progress bar in the console during execution. Default is

`TRUE`

.- calendar_age_range
(Optional) Overall minimum and maximum calendar ages (in cal yr BP) permitted for the set of radiocarbon samples, i.e.,

`calendar_age_range[1]`

< \(\theta_1, \ldots, \theta_n\) <`calendar_age_range[1]`

. This is used to bound the start and end of the Poisson process (so no events will be permitted to occur outside this interval). If not selected then automated selection will be made based on given`rc_determinations`

and value of`bounding_range_prob_cutoff`

- calendar_grid_resolution
The spacing of the calendar age grid on which to restrict the potential calendar ages of the samples, e.g., calendar ages of samples are limited to being one of

`t, t + resolution, t + 2 * resolution, ...`

Default is 1 (i.e., all calendar years in the overall calendar range are considered). Primarily used to speed-up code if have large range, when may select larger resolution.- prior_h_shape, prior_h_rate
(Optional) Prior for the value of the Poisson Process rate (the height

`rate_h`

) in any specific interval: $$\textrm{rate}\_\textrm{h} \sim \textrm{Gamma}( \textrm{shape} = \textrm{prior}\_\textrm{h}\_\textrm{shape}, \textrm{rate} = \textrm{prior}\_\textrm{h}\_\textrm{rate}).$$ If they are both`NA`

then`prior_h_shape`

is selected to be 1 (so`rate_h`

follows an Exponential distribution) and`prior_h_rate`

chosen adaptively (internally) to match`n_observations`

.- prior_n_internal_changepoints_lambda
Prior mean for the number of internal changepoints in the rate \(\lambda(t)\). $$\textrm{n}\_\textrm{internal}\_\textrm{changepoints} \sim \textrm{Po}(\textrm{prior}\_\textrm{n}\_\textrm{internal}\_\textrm{changepoints}\_\textrm{lambda})$$

- k_max_internal_changepoints
Maximum permitted number of internal changepoints

- rescale_factor_rev_jump
Factor weighting probability of dimension change in the reversible jump update step for Poisson process

`rate_h`

and`rate_s`

- bounding_range_prob_cutoff
Probability cut-off for choosing the bounds for the potential calendar ages for the observations

- initial_n_internal_changepoints
Number of internal changepoints to initialise MCMC sampler with. The default is 10 (so initialise with diffuse state). Will place these initial changepoints uniformly at random within overall calendar age range.

- grid_extension_factor
If you adaptively select the

`calendar_age_range`

from`rc_determinations`

, how far you wish to extend the grid beyond this adaptive minimum and maximum. The final range will be extended (equally on both sides) to cover`(1 + grid_extension_factor) * (calendar_age_range)`

- use_fast, fast_approx_prob_cutoff
A flag to allow trimming the calendar age likelihood (i.e., reducing the range of calendar ages) for each individual sample to speed up the sampler. If

`TRUE`

(default), for each individual sample, those tail calendar ages (in the overall`calendar_age_grid`

) with very small likelihoods will be trimmed (speeding up the updating of the calendar ages). If`TRUE`

the probability cut-off to remove the tails is`fast_approx_prob_cutoff`

.

## Value

A list with 7 items. The first 4 items contain output of the model, each of which has one dimension of size \(n_{\textrm{out}} = \textrm{floor}( n_{\textrm{iter}}/n_{\textrm{thin}}) + 1\). The rows in these items store the state of the MCMC from every \(n_{\textrm{thin}}\)\({}^\textrm{th}\) iteration:

`rate_s`

A list of length \(n_{\textrm{out}}\) each entry giving the current set of (calendar age) changepoint locations in the piecewise-constant rate \(\lambda(t)\).

`rate_h`

A list of length \(n_{\textrm{out}}\) each entry giving the current set of heights (values for the rate) in each piecewise-constant section of \(\lambda(t)\).

`calendar_ages`

An \(n_{\textrm{out}}\) by \(n_{\textrm{obs}}\) matrix. Gives the current estimate for the calendar age of each individual observation.

`n_internal_changes`

A vector of length \(n_{\textrm{out}}\) giving the current number of internal changes in the value of \(\lambda(t)\).

where \(n_{\textrm{obs}}\) is the number of radiocarbon observations, i.e.,
the length of `rc_determinations`

.

The remaining items give information about input data, input parameters (or those calculated) and update_type

`update_type`

A string that always has the value "RJPP".

`input_data`

A list containing the \({}^{14}\)C data used, and the name of the calibration curve used.

`input_parameters`

A list containing the values of the fixed parameters

`pp_cal_age_range`

,`prior_n_internal_changepoints_lambda`

,`k_max_internal_changepoints`

,`prior_h_shape`

,`prior_h_rate`

,`rescale_factor_rev_jump`

,`calendar_age_grid`

,`calendar_grid_resolution`

,`n_iter`

and`n_thin`

.