Package 'bayesmove' reference manual

Title:	Non-Parametric Bayesian Analyses of Animal Movement
Description:	Methods for assessing animal movement from telemetry and biologging data using non-parametric Bayesian methods. This includes features for pre-processing and analysis of data, as well as the visualization of results from the models. This framework does not rely on standard parametric density functions, which provides flexibility during model fitting. Further details regarding part of this framework can be found in Cullen et al. (2022) <doi:10.1111/2041-210X.13745>.
Authors:	Joshua Cullen [aut, cre, cph], Denis Valle [aut, cph]
Maintainer:	Joshua Cullen <[email protected]>
License:	GPL-3
Version:	0.2.3
Built:	2025-03-07 05:36:41 UTC
Source:	https://github.com/joshcullen/bayesmove

Assign behavior estimates to observations

Description

Assign behavior estimates to observations

Usage

assign_behavior(dat.orig, dat.seg.list, theta.estim.long, behav.names = NULL)

Arguments

`dat.orig`	A data frame that contains all of the original data for all animal IDs. Must be same as was used to originally segment the tracks. Must have columns `obs` and `time1` generated by `filter_time`.
`dat.seg.list`	A list of data associated with each animal ID where names of list elements are the ID names and tracks have already been segmented. Must have columns `obs` and `time1` generated by `filter_time`.
`theta.estim.long`	A data frame in long format where each observation (time1) of each track segment (tseg) of each animal ID (id) has separate rows for behavior proportion estimates per state. Columns for behavior and proportion estimates should be labeled behavior and prop, respectively. Date (in POSIXct format) should also be included as a column labeled date.
`behav.names`	deprecated. Now taken from the `theta.estim.long` object.

Value

A data frame of all animal IDs where columns (with names from behav.names) include proportions of each behavioral state per observation, as well as a column (behav) that stores the dominant behavior of the track segment to which the observation belongs. This is merged with the original data frame dat.orig, so any observations that were excluded (not at the primary time interval) will show NA for behavior estimates.
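
The shape of this return value can be illustrated with a small base R sketch. The toy data, the wide layout of `props`, and the per-row choice of dominant behavior are assumptions made for illustration; they are not the internals of assign_behavior, which reports the dominant behavior per track segment.

```r
# Toy original data: five observations, two of which were filtered out
# (hypothetical values; 'obs' mimics the column added by filter_time)
dat.orig <- data.frame(id = "id1", obs = 1:5)

# Toy behavior proportions for the three retained observations
props <- data.frame(id = "id1", obs = c(1, 3, 5),
                    Encamped = c(0.7, 0.2, 0.1),
                    ARS      = c(0.2, 0.5, 0.1),
                    Transit  = c(0.1, 0.3, 0.8))

# Dominant behavior = column with the largest proportion in each row
behav.cols <- c("Encamped", "ARS", "Transit")
props$behav <- behav.cols[max.col(as.matrix(props[, behav.cols]),
                                  ties.method = "first")]

# Left join onto the original data; unmatched observations get NA
dat.out <- merge(dat.orig, props, by = c("id", "obs"), all.x = TRUE)
dat.out <- dat.out[order(dat.out$obs), ]
```

Observations 2 and 4 have no behavior estimates in this toy example, so their proportion and behav columns are NA after the merge.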

Examples



#load original and segmented data
data(tracks)
data(tracks.seg)

#convert segmented dataset into list
tracks.list<- df_to_list(dat = tracks.seg, ind = "id")

#select only id, tseg, SL, and TA columns
tracks.seg2<- tracks.seg[,c("id","tseg","SL","TA")]

#summarize data by track segment
obs<- summarize_tsegs(dat = tracks.seg2, nbins = c(5,8))

#cluster data with LDA
res<- cluster_segments(dat = obs, gamma1 = 0.1, alpha = 0.1, ngibbs = 1000,
                       nburn = 500, nmaxclust = 7, ndata.types = 2)

#Extract proportions of behaviors per track segment
theta.estim<- extract_prop(res = res, ngibbs = 1000, nburn = 500, nmaxclust = 7)

#Create augmented matrix by replicating rows (tsegs) according to obs per tseg
theta.estim.long<- expand_behavior(dat = tracks.seg, theta.estim = theta.estim, obs = obs,
                               nbehav = 3, behav.names = c("Encamped","ARS","Transit"),
                               behav.order = c(1,2,3))

#Run function
dat.out<- assign_behavior(dat.orig = tracks, dat.seg.list = tracks.list,
                          theta.estim.long = theta.estim.long)



Add segment numbers to observations

Description

After breakpoints have been extracted for each animal ID, this function assigns the associated segment number to observations for each animal ID. These segments of observations will be used in the second stage of the model framework to perform mixed-membership clustering by Latent Dirichlet Allocation.

Usage

assign_tseg(dat, brkpts)

Arguments

`dat`	A list where each element stores the data for a unique animal ID. Each element is a data frame that contains all data associated with a given animal ID and must include a column labeled `time1` that numbers each of the observations in consecutive order. This variable is automatically generated by the `filter_time` function during data preparation.
`brkpts`	A data frame of breakpoints for each animal ID (as generated by `get_breakpts`).

Value

A data frame that updates the original data object by including the segment number associated with each observation in relation to the extracted breakpoints.
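
The mapping from breakpoints to segment numbers can be sketched in base R as follows. This is a minimal illustration under the assumption that each breakpoint is the last observation of its segment; the manual does not state which side of a breakpoint the boundary falls on, and the toy values are hypothetical.

```r
# Hypothetical track of 10 observations numbered by 'time1'
dat <- data.frame(id = "id1", time1 = 1:10)

# Hypothetical breakpoints; each is assumed to close a segment
brkpts <- c(3, 7)

# left.open = TRUE gives intervals (-Inf, 3], (3, 7], (7, Inf),
# which become segments 1, 2, and 3 after adding 1
dat$tseg <- findInterval(dat$time1, brkpts, left.open = TRUE) + 1
```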

Examples


#load data
data(tracks.list)

#subset only first track
tracks.list<- tracks.list[1]

#only retain id and discretized step length (SL) and turning angle (TA) columns
tracks.list2<- purrr::map(tracks.list,
                   subset,
                  select = c(id, SL, TA))


set.seed(1)

# Define model params
alpha<- 1
ngibbs<- 1000
nbins<- c(5,8)

#future::plan(future::multisession)  #run all MCMC chains in parallel
dat.res<- segment_behavior(data = tracks.list2, ngibbs = ngibbs, nbins = nbins,
                           alpha = alpha)


# Determine MAP iteration for selecting breakpoints and store breakpoints
MAP.est<- get_MAP(dat = dat.res$LML, nburn = ngibbs/2)
brkpts<- get_breakpts(dat = dat.res$brkpts, MAP.est = MAP.est)


# Assign track segments to all observations by ID
tracks.seg<- assign_tseg(dat = tracks.list, brkpts = brkpts)


Internal function that adds segment numbers to observations

Description

Usage

assign_tseg_internal(dat, brkpts)

Arguments

`dat`	A data frame that contains all data associated with a given animal ID. Must include a column labeled `time1` that numbers each of the observations in consecutive order, which is automatically generated by `filter_time`.
`brkpts`	A data frame of breakpoints for each animal ID (as generated by `get_breakpts`).

Value

A data frame that updates the original data object by including the segment number associated with each observation in relation to the extracted breakpoints.

Internal function that runs RJMCMC on a single animal ID

Description

This function serves as a wrapper for samp_move by running this sampler for each iteration of the MCMC chain. It is called by segment_behavior to run the RJMCMC on all animal IDs simultaneously.

Usage

behav_gibbs_sampler(dat, ngibbs, nbins, alpha, breakpt, p)

Arguments

`dat`	A data frame that only contains columns for the animal IDs and for each of the discretized movement variables.
`ngibbs`	numeric. The total number of iterations of the MCMC chain.
`nbins`	numeric. A vector of the number of bins used to discretize each movement variable. These must be in the same order as the columns within `dat`.
`alpha`	numeric. A single value used to specify the hyperparameter for the prior distribution. A standard value for `alpha` is typically 1, which corresponds with a vague prior on the Dirichlet distribution.
`breakpt`	numeric. A vector of breakpoints if pre-specifying where they may occur, otherwise `NULL`.
`p`	An object storing information from `progressr::progressor` to produce a progress bar.

Value

A list of the breakpoints, the number of breakpoints, and the log marginal likelihood at each MCMC iteration, as well as the time it took the model to finish running. This is only provided for the data of a single animal ID.

Internal function that transforms a vector of bin numbers to a presence-absence matrix

Description

Transforms vectors of bin numbers into full matrices for plotting as a heatmap.

Usage

behav_seg_image(dat, nbins)

Arguments

`dat`	A data frame for a single animal ID that contains only columns for the ID and each of the movement variables that were analyzed by `segment_behavior`. The ID column must be first.
`nbins`	numeric. A vector of the number of bins used to discretize each movement variable. These must be in the same order as the columns within `dat`.

Value

A list where each element stores the presence-absence matrix for each of the movement variables.
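
As a rough illustration of the transformation for a single movement variable, the base R sketch below builds a presence-absence matrix from a vector of bin numbers. The orientation (bins as rows, observations as columns) and the handling of missing values are assumptions, not a description of the internal function.

```r
# Hypothetical bin assignments for one movement variable (NA = missing value)
bins <- c(1, 3, 3, NA, 2)
nbins <- 3

# Rows = bins, columns = observations (this orientation is an assumption)
pa <- matrix(0, nrow = nbins, ncol = length(bins))

# Mark the occupied bin of each non-missing observation with a 1
keep <- !is.na(bins)
pa[cbind(bins[keep], which(keep))] <- 1
```

Each column then holds a single 1 in the row of the occupied bin, and a column of zeros where the observation is missing.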

Cluster observations into behavioral states

Description

This function uses a Gibbs sampler within a mixture model to estimate the optimal number of behavioral states and the state-dependent distributions, and to assign behavioral states to each observation. This model does not assume an underlying mechanistic process.

Usage

cluster_obs(dat, alpha, ngibbs, nmaxclust, nburn)

Arguments

`dat`	A data frame that only contains columns for the discretized movement variables.
`alpha`	numeric. A single value used to specify the hyperparameter for the prior distribution.
`ngibbs`	numeric. The total number of iterations of the MCMC chain.
`nmaxclust`	numeric. A single number indicating the maximum number of clusters to test.
`nburn`	numeric. The length of the burn-in phase.

Details

The mixture model analyzes all animal IDs pooled together, thus providing a population-level estimate of behavioral states.

Value

A list of model results is returned where elements include the phi matrix for each data stream, the theta matrix, log likelihood estimates for each iteration of the MCMC chain (loglikel), a list of the MAP estimates of the latent states for each observation (z.MAP), a matrix of the whole posterior of state assignments per observation (z.posterior), and a vector (gamma1) of estimates for the gamma hyperparameter.

Examples


data(tracks.list)

#convert from list to data frame
tracks.list<- dplyr::bind_rows(tracks.list)

#only retain id and discretized step length (SL) and turning angle (TA) columns
tracks<- subset(tracks.list, select = c(SL, TA))


set.seed(1)

# Define model params
alpha=0.1
ngibbs=1000
nburn=ngibbs/2
nmaxclust=7

dat.res<- cluster_obs(dat = tracks, alpha = alpha, ngibbs = ngibbs,
                           nmaxclust = nmaxclust, nburn = nburn)


Cluster time segments into behavioral states

Description

This function performs a Gibbs sampler within the Latent Dirichlet Allocation (LDA) model to estimate proportions of each behavioral state for all time segments generated by segment_behavior. This is the second stage of the two-stage Bayesian model that estimates proportions of behavioral states by first segmenting individual tracks into relatively homogeneous segments of movement.

Usage

cluster_segments(dat, gamma1, alpha, ngibbs, nmaxclust, nburn, ndata.types)

Arguments

`dat`	A data frame returned by `summarize_tsegs` that summarizes the counts of observations per bin and movement variable for all animal IDs.
`gamma1`	numeric. A hyperparameter for the truncated stick-breaking prior for estimating the `theta` matrix.
`alpha`	numeric. A hyperparameter for the Dirichlet distribution when estimating the `phi` matrix.
`ngibbs`	numeric. The total number of iterations of the MCMC chain.
`nmaxclust`	numeric. A single number indicating the maximum number of clusters to test.
`nburn`	numeric. The length of the burn-in phase.
`ndata.types`	numeric. A vector of the number of bins used to discretize each movement variable. These must be in the same order as the columns within `dat`.

Details

The LDA model analyzes all animal IDs pooled together, thereby providing population-level estimates of behavioral states.
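
To give a feel for the truncated stick-breaking prior governed by gamma1, the sketch below draws one vector of state proportions. It assumes the common Beta(1, gamma1) construction with the final stick fraction fixed at 1; the exact parameterization used inside cluster_segments is not stated in this manual.

```r
set.seed(1)

nmaxclust <- 7   # truncation level (maximum number of clusters)
gamma1 <- 0.1    # smaller values concentrate mass on the first few clusters

# Stick-breaking fractions; the last one is fixed at 1 so the weights sum to 1
v <- c(rbeta(nmaxclust - 1, shape1 = 1, shape2 = gamma1), 1)

# theta_k = v_k * prod_{j < k} (1 - v_j)
theta <- v * cumprod(c(1, 1 - v[-nmaxclust]))
```

With a small gamma1 such as 0.1, most of the mass tends to fall on the first one or two states, which is why the prior discourages superfluous clusters.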

Value

Examples


#load data
data(tracks.seg)

#select only id, tseg, SL, and TA columns
tracks.seg2<- tracks.seg[,c("id","tseg","SL","TA")]

#summarize data by track segment
obs<- summarize_tsegs(dat = tracks.seg2, nbins = c(5,8))

#cluster data with LDA
res<- cluster_segments(dat = obs, gamma1 = 0.1, alpha = 0.1, ngibbs = 1000,
                       nburn = 500, nmaxclust = 7, ndata.types = 2)


Internal function that calculates the inverted cumsum

Description

Internal function that calculates the inverted cumsum

Usage

CumSumInv(ntsegm, nmaxclust, z)

Arguments

`ntsegm`	An integer.
`nmaxclust`	An integer.
`z`	An integer matrix.
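
One plausible reading of "inverted cumsum" is a cumulative sum taken from the last cluster back to the first within each row of `z`. The base R sketch below follows that reading; it is an assumption for illustration, and `cumsum_inv` is a hypothetical helper rather than the package routine.

```r
# Hypothetical count matrix: rows = track segments, columns = clusters
z <- matrix(c(2L, 0L, 1L,
              1L, 3L, 0L), nrow = 2, byrow = TRUE)

# Reverse cumulative sum within each row:
# entry [i, k] is the sum of z[i, k], z[i, k + 1], ..., z[i, ncol(z)]
cumsum_inv <- function(z) {
  t(apply(z, 1, function(r) rev(cumsum(rev(r)))))
}

res <- cumsum_inv(z)
```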

Convert data frame to a list by animal ID

Description

Converts an object of class data.frame to a list where each element is a separate animal ID. This prepares the data for further analysis and for mapping other functions over the separate animal IDs.

Usage

df_to_list(dat, ind)

Arguments

`dat`	A data frame containing the data for each animal ID.
`ind`	character. The name of the column storing the animal IDs.

Value

A list where each element stores the data for a separate animal ID.
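
For readers who want to see the resulting structure without the package, base R's split() gives a roughly comparable result. This is an analogy only; the ordering of list elements and any extra handling done by df_to_list may differ, and the toy data are hypothetical.

```r
# Hypothetical tracking data with two animal IDs
dat <- data.frame(id = c("id1", "id1", "id2"),
                  x = c(0, 1, 5),
                  y = c(0, 2, 5))

# One data frame per ID, with list elements named by ID
dat.list <- split(dat, dat$id)
```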

Examples

#load data
data(tracks)

#convert to list
dat.list<- df_to_list(dat = tracks, ind = "id")


Discretize movement variables

Description

Convert movement variables from continuous to discrete values for analysis by segment_behavior.

Usage

discrete_move_var(dat, lims, varIn, varOut)

Arguments

`dat`	A data frame that contains the variable(s) of interest to convert from continuous to discrete values.
`lims`	A list of the bin limits for each variable. Each element of the list should be a vector of real numbers.
`varIn`	A vector of names for the continuous variables stored as columns within `dat`.
`varOut`	A vector of names for the storage of the discrete variables returned by the function.

Value

A data frame with new columns of discretized variables as labeled by varOut.
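
The idea of binning a continuous variable by a vector of limits can be sketched with base R's cut(). The toy step lengths and limits are hypothetical, and the treatment of values that fall exactly on a limit is an assumption that may not match discrete_move_var.

```r
# Hypothetical step lengths and bin limits (4 bins)
step <- c(0, 0.5, 1.5, 3, 10, NA)
dist.bin.lims <- c(0, 1, 2, 5, 10)

# labels = FALSE returns integer bin numbers; include.lowest = TRUE keeps the
# minimum value in bin 1; missing values stay NA
SL <- cut(step, breaks = dist.bin.lims, labels = FALSE, include.lowest = TRUE)
```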

Examples

#load data
data(tracks)

#subset only first track
tracks<- tracks[tracks$id == "id1",]

#calculate step lengths and turning angles
tracks<- prep_data(dat = tracks, coord.names = c("x","y"), id = "id")

#round times to nearest interval of interest (e.g. 3600 s or 1 hr)
tracks<- round_track_time(dat = tracks, id = "id", int = 3600, tol = 180, time.zone = "UTC",
                                 units = "secs")

#create list from data frame
tracks.list<- df_to_list(dat = tracks, ind = "id")

#filter observations to only 1 hr (or 3600 s)
tracks_filt.list<- filter_time(dat.list = tracks.list, int = 3600)

#define bin number and limits for turning angles and step lengths
angle.bin.lims=seq(from=-pi, to=pi, by=pi/4)  #8 bins
dist.bin.lims=quantile(tracks[tracks$dt == 3600,]$step,
                      c(0,0.25,0.50,0.75,0.90,1), na.rm=TRUE)  #5 bins


# Assign bins to observations
tracks_disc.list<- purrr::map(tracks_filt.list,
                      discrete_move_var,
                      lims = list(dist.bin.lims, angle.bin.lims),
                      varIn = c("step", "angle"),
                      varOut = c("SL", "TA"))


Expand behavior estimates from track segments to observations

Description

Expand behavior estimates from track segments to observations

Usage

expand_behavior(dat, theta.estim, obs, nbehav, behav.names, behav.order)

Arguments

`dat`	A data frame of the animal ID, track segment labels, and all other data per observation. Animal ID, date, track segment, and observation number columns must be labeled id, date, tseg, and time1, respectively.
`theta.estim`	A matrix (returned by `extract_prop`) containing the proportions of each behavioral state as separate columns for each track segment (rows).
`obs`	A data frame summarizing the number of observations within each bin per movement variable that is returned by `summarize_tsegs`.
`nbehav`	numeric. The number of behavioral states to retain, between 1 and nmaxclust.
`behav.names`	character. A vector of names to label each state (in order).
`behav.order`	numeric. A vector that identifies the order in which the user would like to rearrange the behavioral states. This must be specified even if the order returned by the LDA model is satisfactory.

Value

A new data frame that expands behavior proportions for each observation within all track segments, including the columns labeled time1 and date from the original dat data frame.
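
The core step, replicating each segment's proportions once per observation in that segment, can be sketched in base R. The toy matrix and observation counts are hypothetical, and the sketch stops short of the long format that expand_behavior returns.

```r
# Hypothetical proportions for two track segments and two states
theta.estim <- matrix(c(0.9, 0.1,
                        0.3, 0.7), nrow = 2, byrow = TRUE,
                      dimnames = list(NULL, c("Encamped", "Transit")))

# Hypothetical number of observations in each segment
n.obs <- c(3, 2)

# Repeat each segment's row once per observation
theta.obs <- theta.estim[rep(seq_len(nrow(theta.estim)), times = n.obs), ]
```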

Examples



#load data
data(tracks.seg)

#select only id, tseg, SL, and TA columns
tracks.seg2<- tracks.seg[,c("id","tseg","SL","TA")]

#summarize data by track segment
obs<- summarize_tsegs(dat = tracks.seg2, nbins = c(5,8))

#cluster data with LDA
res<- cluster_segments(dat = obs, gamma1 = 0.1, alpha = 0.1, ngibbs = 1000,
                       nburn = 500, nmaxclust = 7, ndata.types = 2)

#Extract proportions of behaviors per track segment
theta.estim<- extract_prop(res = res, ngibbs = 1000, nburn = 500, nmaxclust = 7)

#Create augmented matrix by replicating rows (tsegs) according to obs per tseg
theta.estim.long<- expand_behavior(dat = tracks.seg, theta.estim = theta.estim, obs = obs,
                               nbehav = 3, behav.names = c("Encamped","ARS","Transit"),
                               behav.order = c(1,2,3))



Extract behavior proportion estimates for each track segment

Description

Calculates the mean of the posterior for the proportions of each behavior within track segments. These results can be explored to determine the optimal number of latent behavioral states.

Usage

extract_prop(res, ngibbs, nburn, nmaxclust)

Arguments

`res`	A list of results returned by `cluster_segments`. Element `theta` stores the estimates of behavior proportions for all time segments.
`ngibbs`	numeric. The total number of iterations of the MCMC chain.
`nburn`	numeric. The length of the burn-in phase.
`nmaxclust`	numeric. A single number indicating the maximum number of clusters to test.

Value

A matrix that stores the proportions of each state/cluster (columns) per track segment (rows).
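
The posterior mean calculation can be sketched as an average over the post-burn-in draws. The layout assumed here (one row per MCMC iteration, with the segment-by-cluster matrix flattened column-wise) and the random stand-in for the theta element are assumptions for illustration.

```r
set.seed(1)

ngibbs <- 1000
nburn <- 500
ntseg <- 4
nmaxclust <- 3

# Hypothetical stand-in for the stored theta draws: one row per iteration,
# with the (segment x cluster) matrix flattened column-wise (assumed layout)
theta.samples <- matrix(runif(ngibbs * ntseg * nmaxclust), nrow = ngibbs)

# Average the post-burn-in draws, then restore the segment x cluster shape
post.mean <- colMeans(theta.samples[(nburn + 1):ngibbs, ])
theta.estim <- matrix(post.mean, nrow = ntseg, ncol = nmaxclust)
```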

Examples



#load data
data(tracks.seg)

#select only id, tseg, SL, and TA columns
tracks.seg2<- tracks.seg[,c("id","tseg","SL","TA")]

#summarize data by track segment
obs<- summarize_tsegs(dat = tracks.seg2, nbins = c(5,8))

#cluster data with LDA
res<- cluster_segments(dat = obs, gamma1 = 0.1, alpha = 0.1, ngibbs = 1000,
                       nburn = 500, nmaxclust = 7, ndata.types = 2)

#Extract proportions of behaviors per track segment
theta.estim<- extract_prop(res = res, ngibbs = 1000, nburn = 500, nmaxclust = 7)



Filter observations for time interval of interest

Description

Selects observations that belong to the time interval of interest and removes all others. This function also removes entire IDs from the dataset when they have one or fewer observations at this time interval. It works closely with round_track_time to retain only observations sampled at a regular time interval, which is important for analyzing step lengths and turning angles. The column storing the time intervals must be labeled dt.

Usage

filter_time(dat.list, int)

Arguments

`dat.list`	A list of data associated with each animal ID where names of list elements are the ID names.
`int`	numeric. The time interval of interest.

Value

A list where the observations for each animal ID (element) have been filtered for int. Two columns (obs and time1) are added to each list element (ID), which store the original observation number before filtering and the new observation number after filtering, respectively.
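
The relationship between obs and time1 can be seen in a small base R sketch for a single ID. The toy dt values are hypothetical, and the sketch omits the removal of IDs with too few observations.

```r
# Hypothetical track with a 'dt' column of time steps in seconds
dat <- data.frame(id = "id1", dt = c(3600, 3600, 7200, 3600, NA))

int <- 3600

# Observation number before filtering
dat$obs <- seq_len(nrow(dat))

# Keep only rows at the interval of interest; which() drops NA comparisons
dat.filt <- dat[which(dat$dt == int), ]

# Observation number after filtering
dat.filt$time1 <- seq_len(nrow(dat.filt))
```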

Examples

#load data
data(tracks)

#subset only first track
tracks<- tracks[tracks$id == "id1",]

#calculate step lengths and turning angles
tracks<- prep_data(dat = tracks, coord.names = c("x","y"), id = "id")

#round times to nearest interval of interest (e.g. 3600 s or 1 hr)
tracks<- round_track_time(dat = tracks, id = "id", int = 3600, tol = 180, time.zone = "UTC",
                              units = "secs")

#create list from data frame
tracks.list<- df_to_list(dat = tracks, ind = "id")

#filter observations to only 1 hr (or 3600 s)
tracks_filt.list<- filter_time(dat.list = tracks.list, int = 3600)

Find changes for integer variable

Description

Identify changes within a discrete variable. These values can be used to pre-specify breakpoints within the segmentation model using segment_behavior.

Usage

find_breaks(dat, ind)

Arguments

`dat`	A data frame containing the data for each animal ID.
`ind`	character. The name of the column storing the discrete variable of interest.

Value

A vector of breakpoints is returned based on the data provided. If wishing to identify separate breakpoints per animal ID, this function should be mapped onto a list generated by df_to_list.
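
Detecting changes in a discrete variable can be sketched with run-length encoding in base R. Whether a breakpoint is reported at the last index of a run (as here) or at the first index of the next run is an assumption; the manual does not say which convention find_breaks uses.

```r
# Hypothetical discrete variable (e.g., a coded habitat or phase label)
x <- c(1, 1, 2, 2, 2, 3, 1, 1)

# The cumulative run lengths give the index of the last element of each run;
# the final run ends the series, so it is not treated as a breakpoint
run.ends <- cumsum(rle(x)$lengths)
breaks <- run.ends[-length(run.ends)]
```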

Examples

#simulate data
var<- sample(1:3, size = 50, replace = TRUE)
var<- rep(var, each = 20)
id<- rep(1:10, each = 100)

#create data frame
dat<- data.frame(id, var)

#create list
dat.list<- df_to_list(dat = dat, ind = "id")

#run function using purrr::map()
breaks<- purrr::map(dat.list, ~find_breaks(dat = ., ind = "var"))

#or with lapply()
breaks1<- lapply(dat.list, find_breaks, ind = "var")

Extract bin estimates from Latent Dirichlet Allocation or mixture model

Description

Pulls model results for the estimates of bin proportions per movement variable from the posterior distribution. This can be used for visualization of movement variable distribution for each behavior estimated.

Usage

get_behav_hist(dat, nburn, ngibbs, nmaxclust, var.names)

Arguments

`dat`	The list object returned by the LDA model (`cluster_segments`) or mixture model (`cluster_obs`). Used for extracting the element phi.
`nburn`	numeric. The length of the burn-in phase.
`ngibbs`	numeric. The total number of iterations of the MCMC chain.
`nmaxclust`	numeric. The maximum number of clusters to which behaviors are attributed.
`var.names`	character. A vector of names used for each of the movement variables. Must be in the same order as were listed within the data frame returned by `summarize_tsegs` (if running LDA model).

Value

A data frame that contains columns for bin number, behavioral state, proportion represented by a given bin, and movement variable name. This is displayed in a long format, which is easier to visualize using ggplot2.

Examples



#load data
data(tracks.seg)

#select only id, tseg, SL, and TA columns
tracks.seg2<- tracks.seg[,c("id","tseg","SL","TA")]

#summarize data by track segment
obs<- summarize_tsegs(dat = tracks.seg2, nbins = c(5,8))

#cluster data with LDA
res<- cluster_segments(dat = obs, gamma1 = 0.1, alpha = 0.1, ngibbs = 1000,
                       nburn = 500, nmaxclust = 7, ndata.types = 2)

#Extract proportions of behaviors per track segment
theta.estim<- extract_prop(res = res, ngibbs = 1000, nburn = 500, nmaxclust = 7)

#run function for clustered segments
behav.res<- get_behav_hist(dat = res, nburn = 500, ngibbs = 1000, nmaxclust = 7,
                           var.names = c("Step Length","Turning Angle"))


Extract breakpoints for each animal ID

Description

Extract breakpoints for each animal ID

Usage

get_breakpts(dat, MAP.est)

Arguments

`dat`	A list of lists where animal IDs are separated as well as the breakpoints estimated for each iteration of the MCMC chain. This is stored within `brkpts` of the model results returned by `segment_behavior`.
`MAP.est`	numeric. A vector of values at which the maximum a posteriori (MAP) estimate was identified for each of the animal IDs as returned by `get_MAP`. These must be in the same order as the data for the IDs supplied to `segment_behavior()`.

Value

A data frame where breakpoints are returned per animal ID within each row. For animal IDs that have fewer breakpoints than the maximum number that were estimated, NA values are used as placeholders for the breakpoints that do not exist.
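
The NA padding can be illustrated with a base R sketch that turns a ragged list of breakpoints into one row per ID. The toy breakpoints and the resulting column names are hypothetical and are not meant to match the output of get_breakpts exactly.

```r
# Hypothetical breakpoints for three animal IDs (ragged lengths)
brk.list <- list(id1 = c(12, 40, 77), id2 = c(25), id3 = c(8, 60))

# Pad every vector with NA up to the length of the longest one
max.len <- max(lengths(brk.list))
padded <- lapply(brk.list, function(b) c(b, rep(NA, max.len - length(b))))

# One row per ID, one column per breakpoint position
brkpts <- data.frame(id = names(brk.list), do.call(rbind, padded))
```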

Examples


#load data
data(tracks.list)

#subset only first track
tracks.list<- tracks.list[1]

#only retain id and discretized step length (SL) and turning angle (TA) columns
tracks.list2<- purrr::map(tracks.list,
                   subset,
                  select = c(id, SL, TA))


set.seed(1)

# Define model params
alpha<- 1
ngibbs<- 1000
nbins<- c(5,8)

#future::plan(future::multisession)  #run all MCMC chains in parallel
dat.res<- segment_behavior(data = tracks.list2, ngibbs = ngibbs, nbins = nbins,
                           alpha = alpha)


# Determine MAP iteration for selecting breakpoints and store breakpoints
MAP.est<- get_MAP(dat = dat.res$LML, nburn = ngibbs/2)
brkpts<- get_breakpts(dat = dat.res$brkpts, MAP.est = MAP.est)


Find the maximum a posteriori (MAP) estimate of the MCMC chain

Description

Identify the MCMC iteration that holds the MAP estimate. This iteration is passed to get_breakpts to determine which set of breakpoints should be retained when assigning track segments to the observations of each animal ID.

Usage

get_MAP(dat, nburn)

Arguments

`dat`	A data frame where each row holds the log marginal likelihood values at each iteration of the MCMC chain.
`nburn`	numeric. The size of the burn-in phase after which the MAP estimate will be identified.

Value

A numeric vector of iterations at which the MAP estimate was found for each animal ID.
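
As a rough illustration of the idea, the sketch below picks, for each animal ID, the post-burn-in iteration with the largest log marginal likelihood. The helper `map_iter_sketch` and the toy matrix are hypothetical; this is an assumption about the general approach, not the code used by get_MAP.

```r
# Hypothetical helper (not the package's implementation): for each row of a
# matrix of log marginal likelihoods (rows = animal IDs, columns = iterations),
# return the iteration after the burn-in with the largest value.
map_iter_sketch <- function(lml, nburn) {
  keep <- (nburn + 1):ncol(lml)
  apply(lml[, keep, drop = FALSE], 1, which.max) + nburn
}

# Toy values: 2 IDs, 10 iterations
lml.toy <- matrix(-100, nrow = 2, ncol = 10)
lml.toy[1, 8] <- -10   # best iteration for ID 1
lml.toy[2, 2] <- -5    # peak inside the burn-in, ignored for ID 2
lml.toy[2, 6] <- -20   # best post-burn-in iteration for ID 2

map.toy <- map_iter_sketch(lml.toy, nburn = 5)
map.toy
```

Note that the peak at iteration 2 for the second ID is skipped because it falls within the burn-in phase.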

Examples


#load data
data(tracks.list)

#subset only first track
tracks.list<- tracks.list[1]

#only retain id and discretized step length (SL) and turning angle (TA) columns
tracks.list2<- purrr::map(tracks.list,
                   subset,
                  select = c(id, SL, TA))


set.seed(1)

# Define model params
alpha<- 1
ngibbs<- 1000
nbins<- c(5,8)

#future::plan(future::multisession)  #run all MCMC chains in parallel
dat.res<- segment_behavior(data = tracks.list2, ngibbs = ngibbs, nbins = nbins,
                           alpha = alpha)


# Determine MAP iteration for selecting breakpoints and store breakpoints
MAP.est<- get_MAP(dat = dat.res$LML, nburn = ngibbs/2)


Internal function that calculates the sufficient statistics for the segmentation model

Description

An internal function that calculates the sufficient statistics to be used within the reversible-jump MCMC Gibbs sampler called by samp_move.

Usage

get_summary_stats(breakpt, dat, max.time, nbins, ndata.types)

Arguments

`breakpt`	numeric. A vector of breakpoints.
`dat`	A matrix that only contains columns storing discretized data for each of the movement variables.
`max.time`	numeric. The number of the last observation of `dat`.
`nbins`	numeric. A vector of the number of bins used to discretize each movement variable. These must be in the same order as the columns within `dat`.
`ndata.types`	numeric. The length of `nbins`.

Value

Returns the sufficient statistics associated with the provided breakpoints for a given animal ID.
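
For a model on discretized data, a natural form for these sufficient statistics is the number of observations per bin within each segment. The base R sketch below illustrates that idea for a single movement variable. The helper `seg_counts_sketch` is hypothetical, and the convention that a breakpoint at observation t closes the segment ending at t is an assumption made for the example.

```r
# Hypothetical helper (not the package's code): count observations per bin
# within each segment defined by a set of breakpoints.
seg_counts_sketch <- function(breakpt, x, max.time, nbins) {
  # Segment label per observation; a breakpoint at t is assumed to close
  # the segment that ends at observation t
  seg <- findInterval(seq_len(max.time), breakpt + 1) + 1
  table(segment = factor(seg, levels = seq_len(length(breakpt) + 1)),
        bin = factor(x, levels = seq_len(nbins)))
}

x.toy <- c(1, 1, 2, 1, 3, 3, 3, 2, 3, 3)   # one discretized variable, 3 bins
counts <- seg_counts_sketch(breakpt = 4, x = x.toy, max.time = 10, nbins = 3)
counts
```

With one breakpoint at observation 4, the first segment covers observations 1 to 4 and the second covers observations 5 to 10.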

Internal function to calculate the log-likelihood for iteration of mixture model

Description

Calculates the log-likelihood of the mixture model based on estimates for theta and phi.

Usage

get.llk.mixmod(phi, theta, ndata.types, dat, nobs, nmaxclust)

Arguments

`phi`	A list of proportion estimates that characterize distributions (bins) for each data stream and possible behavioral state.
`theta`	numeric. A vector of values that sum to one.
`ndata.types`	numeric. The number of data streams being analyzed.
`dat`	A data frame containing only columns of the discretized data streams for all observations.
`nobs`	numeric. The total number of rows in the dataset.
`nmaxclust`	numeric. A single number indicating the maximum number of clusters to test.

Value

A numeric value of the log-likelihood based upon the current values for phi and theta.
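
A mixture-model log-likelihood of this kind can be sketched in base R as follows. The helper `llk_mixmod_sketch` is hypothetical. It assumes that each element of `phi` is a clusters-by-bins matrix, that `dat` holds bin indices with no missing values, and that data streams are independent given the cluster; the package's internal handling may differ.

```r
# Hypothetical helper (not the package's code): log-likelihood of a mixture
# of categorical distributions.
# phi:   list (one element per data stream) of clusters x bins matrices
# theta: vector of cluster weights that sum to one
# dat:   matrix of bin indices (observations x data streams), no NAs assumed
llk_mixmod_sketch <- function(phi, theta, dat) {
  lik <- matrix(theta, nrow = nrow(dat), ncol = length(theta), byrow = TRUE)
  for (j in seq_along(phi)) {
    # probability of each observed bin under each cluster (obs x clusters)
    lik <- lik * t(phi[[j]][, dat[, j], drop = FALSE])
  }
  sum(log(rowSums(lik)))
}

# Toy example: 2 clusters, 1 data stream with 2 bins
phi.toy <- list(rbind(c(0.9, 0.1), c(0.2, 0.8)))
theta.toy <- c(0.5, 0.5)
dat.toy <- matrix(c(1, 2, 2), ncol = 1)
llk.toy <- llk_mixmod_sketch(phi.toy, theta.toy, dat.toy)
llk.toy
```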

Internal function to calculate theta parameter

Description

Calculates values of theta matrix within Gibbs sampler. Not for calling directly by users.

Usage

get.theta(v, nmaxclust, ntsegm)

Arguments

`v`	A matrix returned by `sample.v`
`nmaxclust`	numeric. A single number indicating the maximum number of clusters to test.
`ntsegm`	numeric. The total number of time segments from all animal IDs.

Value

A matrix of proportion estimates that represent proportions of different behavioral states per time segment.
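
Under a truncated stick-breaking prior, proportions are usually built from the latent v values as theta_k = v_k * prod_{l<k} (1 - v_l). The sketch below shows that standard construction in base R. The helper `theta_from_v_sketch` is hypothetical, and whether get.theta follows exactly this form is an assumption.

```r
# Hypothetical helper (not the package's code): standard stick-breaking
# construction applied to each row (time segment) of v.
theta_from_v_sketch <- function(v) {
  # v: matrix (segments x clusters) of values in [0, 1]; the last column is
  # assumed to equal 1 so that each row of theta sums to one
  t(apply(v, 1, function(vi) vi * cumprod(c(1, 1 - vi[-length(vi)]))))
}

v.toy <- rbind(c(0.5, 0.5, 1),
               c(0.2, 0.75, 1))
theta.sb <- theta_from_v_sketch(v.toy)
theta.sb
rowSums(theta.sb)
```

Each row of the result sums to one because the final stick takes whatever proportion remains.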

Insert NA gaps to regularize a time series

Description

Insert NA gaps to regularize a time series

Usage

insert_NAs(data, int, units)

Arguments

`data`	A data frame that minimally contains columns for animal ID, date, and time step. These must be labeled `id`, `date`, and `dt`, respectively, where date is of class `POSIXct`.
`int`	integer. An integer that characterizes the desired interval on which to insert new rows.
`units`	character. The units of the selected time interval `int`, which can be selected from one of "secs", "mins", "hours", "days", or "weeks".

Value

A data frame where new rows have been inserted to regularize the date column. This results in values provided for id, date, and dt while inserting NAs for all other columns. Additionally, observations with duplicate date-times are removed.
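
The effect on a single track can be approximated in base R by merging the data onto a regular time grid. This is a different technique from the one in insert_NAs (it ignores the `dt` column), and the helper `regularize_sketch` and the toy data are hypothetical; the sketch is only meant to show what the regularized output looks like.

```r
# Hypothetical helper (not the package's code): regularize one track by
# merging it onto an evenly spaced grid of date-times.
regularize_sketch <- function(dat, int) {
  dat <- dat[!duplicated(dat$date), ]            # drop duplicate date-times
  grid <- data.frame(date = seq(min(dat$date), max(dat$date), by = int))
  out <- merge(grid, dat, by = "date", all.x = TRUE)
  out$id <- dat$id[1]                            # fill id on inserted rows
  out
}

# Toy track with one duplicated time and a 3-hour gap
toy <- data.frame(
  id = "a",
  date = as.POSIXct("2020-01-01 00:00:00", tz = "UTC") + 3600 * c(0, 1, 1, 4),
  x = c(10, 11, 11, 14)
)
reg <- regularize_sketch(toy, int = 3600)
reg
```

The two rows inserted for the missing hours carry NA in the `x` column.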

Examples

#load data
data(tracks)

#remove rows to show how function works (create irregular time series)
set.seed(1)
ind<- sort(sample(2:15003, 500))

tracks.red<- tracks[-ind,]

#calculate step lengths, turning angles, net-squared displacement, and time steps
tracks.red<- prep_data(dat = tracks.red, coord.names = c("x","y"), id = "id")

#round times to nearest interval
tracks.red<- round_track_time(dat = tracks.red, id = "id", int = c(3600, 7200, 10800, 14400),
                              tol = 300, units = "secs")

#insert NA gaps
dat.out<- insert_NAs(tracks.red, int = 3600, units = "secs")


Internal function that calculates the log marginal likelihood of each model being compared

Description

An internal function that is used to calculate the log marginal likelihood of models for the current and proposed sets of breakpoints. Called within samp_move.

Usage

log_marg_likel(alpha, summary.stats, nbins, ndata.types)

Arguments

`alpha`	numeric. A single value used to specify the hyperparameter for the prior distribution. A standard value for `alpha` is typically 1, which corresponds with a vague prior on the Dirichlet distribution.
`summary.stats`	A matrix of sufficient statistics returned from `get_summary_stats`.
`nbins`	numeric. A vector of the number of bins used to discretize each movement variable.
`ndata.types`	numeric. The length of `nbins`.

Value

The log marginal likelihood is calculated for a model with a given set of breakpoints and the discretized data.
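
For bin counts with a symmetric Dirichlet prior, the log marginal likelihood has a closed form (the Dirichlet-multinomial). The sketch below computes it for one movement variable and compares a model with no breakpoint against a model with one. The helper `lml_sketch` is hypothetical and omits any constant terms; it is an assumption about the general form rather than the package's exact calculation.

```r
# Hypothetical helper (not the package's code): Dirichlet-multinomial log
# marginal likelihood summed over segments.
# counts: matrix of bin counts (segments x bins); alpha: Dirichlet hyperparameter
lml_sketch <- function(counts, alpha) {
  B <- ncol(counts)
  per.seg <- lgamma(B * alpha) - lgamma(rowSums(counts) + B * alpha) +
    rowSums(lgamma(counts + alpha)) - B * lgamma(alpha)
  sum(per.seg)
}

# Same 10 observations treated as one segment or split into two
one.seg <- matrix(c(3, 2, 5), nrow = 1)
two.seg <- rbind(c(3, 1, 0),
                 c(0, 1, 5))

lml.one <- lml_sketch(one.seg, alpha = 1)
lml.two <- lml_sketch(two.seg, alpha = 1)
c(no_breakpoint = lml.one, one_breakpoint = lml.two)
```

Here the split model has the higher log marginal likelihood because the two segments have clearly different bin distributions.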

Plot breakpoints over a time series of each movement variable

Description

Visualize the breakpoints estimated by the segmentation model as they relate to either the original (continuous) or discretized data. These plots assist in determining whether too many or too few breakpoints were estimated as well as whether the user needs to redefine how they discretized their data before analysis.

Usage

plot_breakpoints(data, as_date = FALSE, var_names, var_labels = NULL, brkpts)

Arguments

`data`	A list where each element stores a data frame for a given animal ID. Each of these data frames contains columns for the ID, date or time1 generated by `filter_time`, as well as each of the movement variables analyzed by `segment_behavior`.
`as_date`	logical. If `TRUE`, plots breakpoints and data streams over the date. By default, this is set to `FALSE`.
`var_names`	A vector of the column names for the movement variables to be plotted over time.
`var_labels`	A vector of the labels to be plotted on the y-axis for each movement variable. Set to `NULL` by default.
`brkpts`	A data frame that contains the breakpoints associated with each animal ID. This data frame is returned by `get_breakpts`.

Value

A line plot per animal ID for each movement variable showing how the estimated breakpoints relate to the underlying data. Depending on the user input for var_names, this may either be on the scale of the original continuous data or the discretized data.

Examples



#load data
data(tracks.list)

#subset only first track
tracks.list<- tracks.list[1]

#only retain id and discretized step length (SL) and turning angle (TA) columns
tracks.list2<- purrr::map(tracks.list,
                   subset,
                  select = c(id, SL, TA))


set.seed(1)

# Define model params
alpha<- 1
ngibbs<- 1000
nbins<- c(5,8)

#future::plan(future::multisession)  #run all MCMC chains in parallel
dat.res<- segment_behavior(data = tracks.list2, ngibbs = ngibbs, nbins = nbins,
                           alpha = alpha)


# Determine MAP iteration for selecting breakpoints and store breakpoints
MAP.est<- get_MAP(dat = dat.res$LML, nburn = ngibbs/2)
brkpts<- get_breakpts(dat = dat.res$brkpts, MAP.est = MAP.est)


#run function
plot_breakpoints(data = tracks.list, as_date = FALSE, var_names = c("step","angle"),
    var_labels = c("Step Length (m)", "Turning Angle (rad)"), brkpts = brkpts)


Internal function for plotting breakpoints over each of the data streams

Description

An internal function for plotting the results of the segmentation model.

Usage

plot_breakpoints_behav(data, as_date, var_names, var_labels, brkpts)

Arguments

`data`	A data frame for a single animal ID that contains columns for the ID, date or time variable, and each of the movement variables that were analyzed by `segment_behavior`. Data streams can be in continuous or discrete form.
`as_date`	logical. If `TRUE`, plots breakpoints and data streams over the date. By default, this is set to `FALSE`.
`var_names`	A vector of the column names for the movement variables to be plotted over time.
`var_labels`	A vector of the labels to be plotted on the y-axis for each movement variable. Set to `NULL` by default.
`brkpts`	A data frame that contains the breakpoints associated with each animal ID. This data frame is returned by `get_breakpts`.

Value

A line plot for each movement variable showing how the estimated breakpoints relate to the underlying data. Depending on the user input for var_names, this may either be on the scale of the original continuous data or the discretized data.

Calculate step lengths, turning angles, net-squared displacement, and time steps

Description

Calculates step lengths, turning angles, and net-squared displacement based on coordinates for each animal ID and calculates time steps based on the date-time. Provides a self-contained method to calculate these variables without needing to rely on other R packages (e.g., adehabitatLT). However, functions from other packages can also be used to perform this step in data preparation.

Usage

prep_data(dat, coord.names, id)

Arguments

`dat`	A data frame that contains a column for animal IDs, the columns associated with the x and y coordinates, and a column for the date. For easier interpretation of the model results, it is recommended that coordinates be stored in a UTM projection (meters) as opposed to unprojected in decimal degrees (map units). Date-time should be of class `POSIXct` and be labeled `date` within the data frame.
`coord.names`	character. A vector of the column names under which the coordinates are stored. The name for the x coordinate should be listed first and the name for the y coordinate second.
`id`	character. The name of the column storing the animal IDs.

Value

A data frame where all original data are returned and new columns are added for step length (step), turning angle (angle), net-squared displacement (NSD), and time step (dt). Names for coordinates are changed to x and y. Units for step and NSD depend on the projection of the coordinates, angle is returned in radians, and dt is returned in seconds.
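
The quantities listed above can be computed for a single track with a few lines of base R. The helper `move_metrics_sketch` is hypothetical, and the row alignment used here (a step stored on the row where it starts, a turning angle stored on the row where the turn occurs) is an assumption for illustration.

```r
# Hypothetical helper (not the package's code): movement metrics for one track.
move_metrics_sketch <- function(x, y, date) {
  dx <- diff(x)
  dy <- diff(y)
  step <- sqrt(dx^2 + dy^2)             # distance between successive fixes
  heading <- atan2(dy, dx)              # absolute direction of each step
  turn <- diff(heading)
  turn <- atan2(sin(turn), cos(turn))   # wrap to the interval (-pi, pi]
  data.frame(
    step = c(step, NA),
    angle = c(NA, turn, NA),
    NSD = (x - x[1])^2 + (y - y[1])^2,  # squared distance from first fix
    dt = c(as.numeric(diff(date), units = "secs"), NA)
  )
}

# Toy track: three sides of a unit square, one fix per hour
t0 <- as.POSIXct("2020-01-01 00:00:00", tz = "UTC")
m <- move_metrics_sketch(x = c(0, 1, 1, 0), y = c(0, 0, 1, 1),
                         date = t0 + 3600 * 0:3)
m
```

In this toy track every step has length 1, both turns are pi/2 radians, and every time step is 3600 seconds.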

Examples

#load data
data(tracks)

#subset only first track
tracks<- tracks[tracks$id == "id1",]

#calculate step lengths and turning angles
tracks<- prep_data(dat = tracks, coord.names = c("x","y"), id = "id")

Internal function to calculate step lengths, turning angles, and time steps

Description

An internal function that calculates step lengths, turning angles, and time steps for a given animal ID.

Usage

prep_data_internal(dat, coord.names)

Arguments

`dat`	A data frame that contains the columns associated with the x and y coordinates as well as the date-time. For easier interpretation of the model results, it is recommended that coordinates be stored after UTM projection (meters) as opposed to unprojected in decimal degrees (map units). Date-time should be of class `POSIXct` and be labeled `date` within the data frame.
`coord.names`	character. A vector of the column names under which the coordinates are stored. The name for the x coordinate should be listed first and the name for the y coordinate second.

Value

A data frame where all original data are returned and new columns are added for step length (step), turning angle (angle), net-squared displacement (NSD), and time step (dt).

Internal function that samples z's from a categorical distribution

Description

Internal function that samples z's from a categorical distribution

Usage

rmultinom1(prob, randu)

Arguments

`prob`	A numeric matrix.
`randu`	A numeric vector.
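
Sampling from a categorical distribution with pre-drawn uniform values is commonly done by inverting the cumulative probabilities. The base R sketch below shows that technique. The helper `rcat_sketch` is hypothetical, and the layout of `prob` (one row per observation, one column per category, rows summing to one) is an assumption, since the internal C++ function is not documented in detail here.

```r
# Hypothetical helper (not the package's C++ code): inverse-CDF sampling of
# one category per row of prob, using one uniform draw per row.
rcat_sketch <- function(prob, randu) {
  cum <- t(apply(prob, 1, cumsum))   # cumulative probabilities per row
  # category = 1 + number of cumulative values strictly below the draw;
  # pmin guards against floating-point error in the last cumulative value
  pmin(rowSums(cum < randu) + 1, ncol(prob))
}

prob.toy <- rbind(c(0.2, 0.5, 0.3),
                  c(0.7, 0.2, 0.1))
z.toy <- rcat_sketch(prob.toy, randu = c(0.6, 0.1))
z.toy
```

In practice `randu` would come from `runif(nrow(prob))`; fixed values are used here so that the result is easy to follow.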

Internal function that samples z's from a multinomial distribution

Description

Internal function that samples z's from a multinomial distribution

Usage

rmultinom2(prob, n, randu, nmaxclust)

Arguments

`prob`	A numeric vector.
`n`	An integer.
`randu`	A numeric vector.
`nmaxclust`	An integer.

Round time to nearest interval

Description

Rounds sampling intervals that are close to, but not exactly, the time interval of interest (e.g., 240 s instead of 300 s). This can be performed on multiple time intervals, but only using a single tolerance value. This function prepares the data to be analyzed by segment_behavior, which requires that all time intervals exactly match the primary time interval when analyzing step lengths and turning angles. Columns storing the time intervals and dates must be labeled dt and date, respectively, where dates are of class POSIXct.

Usage

round_track_time(dat, id, int, tol, time.zone = "UTC", units)

Arguments

`dat`	A data frame that contains the sampling interval of the observations.
`id`	character. The name of the column storing the animal IDs.
`int`	numeric. A vector of the time interval(s) on which to perform rounding.
`tol`	numeric. A single tolerance value on which to round any `int` that were specified.
`time.zone`	character. Specify the time zone for which the date-times were recorded. Set to UTC by default. Refer to `base::OlsonNames` to view all possible time zones.
`units`	character. The units of the selected time interval `int`, which can be selected from one of "secs", "mins", "hours", "days", or "weeks".

Value

A data frame where dt and date are both adjusted based upon the rounding of time intervals according to the specified tolerance.
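
The rounding rule for the time steps can be sketched in base R as below. The helper `round_dt_sketch` is hypothetical and only adjusts a vector of `dt` values; it does not shift the matching dates as round_track_time does, and treating the tolerance as inclusive is an assumption.

```r
# Hypothetical helper (not the package's code): snap time steps to the
# nearest target interval when they fall within the tolerance.
round_dt_sketch <- function(dt, int, tol) {
  for (i in int) {
    near <- !is.na(dt) & abs(dt - i) <= tol
    dt[near] <- i
  }
  dt
}

dt.toy <- c(3590, 3605, 3900, 7150, NA)
dt.round <- round_dt_sketch(dt.toy, int = c(3600, 7200), tol = 180)
dt.round
```

The value 3900 is left unchanged because it lies more than 180 s from either target interval.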

Examples

#load data
data(tracks)

#subset only first track
tracks<- tracks[tracks$id == "id1",]

#calculate step lengths and turning angles
tracks<- prep_data(dat = tracks, coord.names = c("x","y"), id = "id")

#round times to nearest interval of interest (e.g. 3600 s or 1 hr)
tracks<- round_track_time(dat = tracks, id = "id", int = 3600, tol = 180, time.zone = "UTC",
                          units = "secs")

Internal function for the Gibbs sampler within the reversible-jump MCMC algorithm

Description

This is the RJMCMC algorithm that drives the proposal and selection of breakpoints for the data based on the difference in log marginal likelihood. This function is called within behav_gibbs_sampler.

Usage

samp_move(breakpt, max.time, dat, alpha, nbins, ndata.types)

Arguments

`breakpt`	numeric. A vector of breakpoints.
`max.time`	numeric. The number of the last observation of `dat`.
`dat`	A matrix that only contains columns storing discretized data for each of the movement variables used within `get_summary_stats`.
`alpha`	numeric. A single value used to specify the hyperparameter for the prior distribution. A standard value for `alpha` is typically 1, which corresponds with a vague prior on the Dirichlet distribution.
`nbins`	numeric. A vector of the number of bins used to discretize each movement variable. These must be in the same order as the columns within `dat`.
`ndata.types`	numeric. The length of `nbins`.

Value

The breakpoints and log marginal likelihood are retained from the selected model from the Gibbs sampler and returned as elements of a list. This is performed for each iteration of the MCMC algorithm.

Internal function to sample the gamma hyperparameter

Description

Internal function to sample the gamma hyperparameter

Usage

sample.gamma.mixmod(v, ngroup, gamma.possib)

Arguments

`v`	numeric. A vector of proportions for each of the possible clusters.
`ngroup`	numeric. The total number of possible clusters.
`gamma.possib`	numeric. A vector of possible values that gamma can take ranging between 0.1 and 1.

Value

A single numeric value for gamma that falls within gamma.possib for calculation of the log-likelihood.

Internal function to sample bin estimates for each movement variable

Description

Estimates values of phi matrix for use in characterizing distributions of the movement variables. Not for calling directly by users.

Usage

sample.phi(z.agg, alpha, nmaxclust, nbins, ndata.types)

Arguments

`z.agg`	A list of latent cluster estimates provided by `sample.z`.
`alpha`	numeric. A hyperparameter for the Dirichlet distribution.
`nmaxclust`	numeric. A single number indicating the maximum number of clusters to test.
`nbins`	numeric. A vector of the number of bins used to discretize each movement variable. These must be in the same order as the columns within `y`.
`ndata.types`	numeric. The number of data streams being analyzed.

Value

A matrix of proportion estimates that characterize distributions (bins) for each movement variable and possible behavioral state.

Internal function to sample bin estimates for each movement variable

Description

Estimates values of phi matrix for use in characterizing distributions of the movement variables. Not for calling directly by users.

Usage

sample.phi.mixmod(alpha, nmaxclust, nbins, ndata.types, nmat)

Arguments

`alpha`	numeric. A hyperparameter for the Dirichlet distribution.
`nmaxclust`	numeric. A single number indicating the maximum number of clusters to test.
`nbins`	numeric. A vector of the number of bins used to discretize each data stream. These must be in the same order as the columns within `dat`.
`ndata.types`	numeric. The number of data streams being analyzed.
`nmat`	A list based on `SummarizeDat` C++ function to help with multinomial draws.

Value

A list of proportion estimates that characterize distributions (bins) for each data stream and possible behavioral state.

Internal function to sample parameter for truncated stick-breaking prior

Description

This function samples the latent v parameter within the Gibbs sampler. Calls on the CumSumInv function written in C++. Not for calling directly by users.

Usage

sample.v(z.agg, gamma1, ntsegm, ndata.types, nmaxclust)

Arguments

`z.agg`	A list of latent cluster estimates provided by `sample.z`.
`gamma1`	numeric. Hyperparameter for the truncated stick-breaking prior.
`ntsegm`	numeric. The total number of time segments from all animal IDs.
`ndata.types`	numeric. The number of data streams being analyzed.
`nmaxclust`	numeric. A single number indicating the maximum number of clusters to test.

Value

A matrix with estimates for v for each of the number of time segments and possible states.

Internal function to sample parameter for truncated stick-breaking prior

Description

This function samples the latent v parameter within the Gibbs sampler. Not for calling directly by users.

Usage

sample.v.mixmod(z, gamma1, nmaxclust)

Arguments

`z`	A vector of latent cluster estimates provided by `sample.z.mixmod`.
`gamma1`	numeric. Hyperparameter for the truncated stick-breaking prior.
`nmaxclust`	numeric. A single number indicating the maximum number of clusters to test.

Value

A list with estimates for v and theta for each of the possible states.

Internal function to sample latent clusters

Description

This function samples the latent z parameter within the Gibbs sampler. Calls on the SampleZAgg function written in C++. Not for calling directly by users.

Usage

sample.z(ntsegm, nbins, y, nmaxclust, phi, ltheta, zeroes, ndata.types)

Arguments

`ntsegm`	numeric. The total number of time segments from all animal IDs.
`nbins`	numeric. A vector of the number of bins used to discretize each movement variable. These must be in the same order as the columns within `y`.
`y`	A list where each element stores separate aggregated count data per bin per time segment for each movement variable being analyzed. These are stored as matrices.
`nmaxclust`	numeric. A single number indicating the maximum number of clusters to test.
`phi`	A list where each element stores separate proportions per bin per time segment for each movement variable.
`ltheta`	A matrix storing the log-transformed values from the `theta` parameter.
`zeroes`	A list of arrays that contain only zero values which are three dimensional (ntsegm,nbins[i],nmaxclust).
`ndata.types`	numeric. The number of data streams being analyzed.

Value

A list with estimates for z where the number of elements is equal to the number of movement variables.

Internal function to sample latent clusters (for observations)

Description

This function samples the latent z parameter within the Gibbs sampler. Calls on the rmultinom1 function written in C++. Not for calling directly by users.

Usage

sample.z.mixmod(nobs, nmaxclust, dat, ltheta, lphi, ndata.types)

Arguments

`nobs`	numeric. The total number of rows in the dataset.
`nmaxclust`	numeric. A single number indicating the maximum number of clusters to test.
`dat`	A data frame containing only columns of the discretized data streams for all observations.
`ltheta`	numeric. A vector of log-transformed estimates for parameter theta.
`lphi`	A list containing log-transformed estimates for each data stream of the phi parameter.
`ndata.types`	numeric. The number of data streams being analyzed.

Value

A vector with estimates for z for each observation within dat.

Internal function that samples z1 aggregate

Description

Internal function that samples z1 aggregate

Usage

SampleZAgg(ntsegm, b1, y1, nmaxclust, lphi1, ltheta, zeroes)

Arguments

`ntsegm`	An integer.
`b1`	An integer.
`y1`	An integer matrix.
`nmaxclust`	An integer.
`lphi1`	A numeric matrix.
`ltheta`	A numeric matrix.
`zeroes`	A numeric vector.

Segmentation model to estimate breakpoints

Description

This function performs the reversible-jump MCMC algorithm using a Gibbs sampler, which estimates the breakpoints of the movement variables for each of the animal IDs. This is the first stage of the two-stage Bayesian model that estimates proportions of behavioral states by first segmenting individual tracks into relatively homogeneous segments of movement.

Usage

segment_behavior(
  data,
  ngibbs,
  nbins,
  alpha,
  breakpt = purrr::map(names(data), ~NULL)
)

Arguments

`data`	A list where each element stores the data for a separate animal ID. List elements are data frames that only contain columns for the animal ID and for each of the discretized movement variables.
`ngibbs`	numeric. The total number of iterations of the MCMC chain.
`nbins`	numeric. A vector of the number of bins used to discretize each movement variable. These must be in the same order as the columns within `data`.
`alpha`	numeric. A single value used to specify the hyperparameter for the prior distribution. A standard value for `alpha` is typically 1, which corresponds with a vague prior on the Dirichlet distribution.
`breakpt`	A list where each element stores a vector of breakpoints if pre-specifying where they may occur for each animal ID. By default this is set to `NULL`.

Details

This model can be run in parallel using the future package. To run in parallel, set the plan with future::multisession (suitable for most operating systems) before calling segment_behavior. If no plan is set, the model runs sequentially by default.

Value

A list of model results is returned where elements include the breakpoints, number of breakpoints, and log marginal likelihood at each iteration of the MCMC chain for all animal IDs. The time it took the model to finish running for each animal ID is also stored and returned.

Examples


#load data
data(tracks.list)

#subset only first track
tracks.list<- tracks.list[1]

#only retain id and discretized step length (SL) and turning angle (TA) columns
tracks.list2<- purrr::map(tracks.list,
                   subset,
                  select = c(id, SL, TA))


set.seed(1)

# Define model params
alpha<- 1
ngibbs<- 1000
nbins<- c(5,8)

future::plan(future::multisession, workers = 3)  #run all MCMC chains in parallel

dat.res<- segment_behavior(data = tracks.list2, ngibbs = ngibbs, nbins = nbins,
                               alpha = alpha)


future::plan(future::sequential)  #return to single core


Dynamically explore tracks within Shiny app

Description

This Shiny application allows for the exploration of animal movement patterns. Options are available to interactively filter the plotted tracks by a selected time period of a given variable, which is then displayed on an interactive map. Additionally, a data table is shown with options to filter and export this table once satisfied.

Usage

shiny_tracks(data, epsg)

Arguments

`data`	A data frame that must contain columns labeled `id, x, y, date`, but can include any other variables of interest.
`epsg`	numeric. The coordinate reference system (CRS) as an EPSG code or a PROJ string.

Details

Currently, the time series plot shown for the exploration of individual tracks cannot display variables of class character or factor. Therefore, these should be changed to numeric values if they are to be plotted.

If the data are stored as longitude and latitude (i.e., WGS84), the EPSG code is 4326. All other codes will need to be looked up if they are not already known.

Examples

## Not run: 
#load data
data(tracks)

#run Shiny app
shiny_tracks(data = tracks, epsg = 32617)


## End(Not run)

This function helps store z from all iterations after burn-in

Description

This function helps store z from all iterations after burn-in

Usage

StoreZ(z, store_z, nobs)

Arguments

`z`	An integer vector.
`store_z`	An integer matrix.
`nobs`	An integer.

Summarize observations within bins per track segment

Description

Prepares the data that has already been segmented for clustering by Latent Dirichlet Allocation. This function summarizes the counts observed per movement variable bin within each track segment per animal ID.

Usage

summarize_tsegs(dat, nbins)

Arguments

`dat`	A data frame of only the animal ID, track segment number, and the discretized data for each movement variable. Animal ID and time segment must be the first two columns of this data frame. This should be a simplified form of the output from `assign_tseg`.
`nbins`	numeric. A vector of the number of bins used to discretize each movement variable. These must be in the same order as the columns within `dat`.

Value

A new data frame that contains the animal ID, track segment number, and the counts per bin for each movement variable. The names for each of these bins are labeled according to the order in which the variables were provided to summarize_tsegs.

Examples

#load data
data(tracks.seg)

#select only id, tseg, SL, and TA columns
tracks.seg2<- tracks.seg[,c("id","tseg","SL","TA")]

#run function
obs<- summarize_tsegs(dat = tracks.seg2, nbins = c(5,8))

Internal function that summarizes bin distributions of track segments

Description

Internal function that summarizes bin distributions of track segments

Usage

summarize1(VecVals, Breakpts, nobs, nbins, nbreak)

Arguments

`VecVals`	A vector of bin values.
`Breakpts`	A vector of breakpoints.
`nobs`	The number of observations.
`nbins`	The number of bins for a given data stream.
`nbreak`	The number of estimated breakpoints.

Internal function that generates nmat matrix to help with multinomial draws

Description

Internal function that generates nmat matrix to help with multinomial draws

Usage

SummarizeDat(z, dat, ncateg, nbehav, nobs)

Arguments

`z`	An integer vector.
`dat`	An integer vector.
`ncateg`	An integer.
`nbehav`	An integer.
`nobs`	An integer.

View trace-plots of output from Bayesian segmentation model

Description

Visualize trace-plots of the number of breakpoints estimated by the model as well as the log marginal likelihood (LML) for each animal ID.

Usage

traceplot(data, type)

Arguments

`data`	A list of model results that is returned as output from `segment_behavior`.
`type`	character. The type of data that are being plotted from the Bayesian segmentation model results. Takes either 'nbrks' for the number of breakpoints or 'LML' for the log marginal likelihood.

Value

Trace-plots for the number of breakpoints or the log marginal likelihood are displayed for each of the animal IDs that were analyzed by the segmentation model.
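
For readers unfamiliar with trace-plots, the idea can be sketched in base R with a simulated chain. Everything here is an assumption for illustration: the chain is synthetic rather than output from `segment_behavior`, and the burn-in cutoff at half the iterations is an arbitrary choice, not a documented default.

```r
set.seed(1)

# Synthetic stand-in for a log marginal likelihood chain of one animal ID:
# it rises quickly and then fluctuates around a plateau
ngibbs <- 1000
iter <- seq_len(ngibbs)
lml.chain <- -5000 + 1000 * (1 - exp(-iter / 50)) + rnorm(ngibbs, sd = 5)

# Assumed burn-in cutoff (arbitrary for this sketch)
burn.in <- ngibbs / 2

# Trace-plot: sampled value against iteration, with the burn-in marked
plot(iter, lml.chain, type = "l",
     xlab = "Iteration", ylab = "Log marginal likelihood")
abline(v = burn.in, col = "red", lty = 2)

# Samples retained after discarding the burn-in period
post.burn <- lml.chain[(burn.in + 1):ngibbs]
```

A chain that is still trending after the cutoff suggests the sampler has not converged and that more iterations may be needed.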

Examples


#load data
data(tracks.list)

#only retain id and discretized step length (SL) and turning angle (TA) columns
tracks.list2<- purrr::map(tracks.list,
                          subset,
                          select = c(id, SL, TA))


set.seed(1)

# Define model params
alpha<- 1
ngibbs<- 1000
nbins<- c(5,8)

future::plan(future::multisession, workers = 3)  #run all MCMC chains in parallel

dat.res<- segment_behavior(data = tracks.list2, ngibbs = ngibbs, nbins = nbins,
                           alpha = alpha)

future::plan(future::sequential)  #return to single core


#run function
traceplot(data = dat.res, type = "nbrks")
traceplot(data = dat.res, type = "LML")

Simulated set of three tracks.

Description

A dataset containing the IDs as well as x and y coordinates for three tracks of 5001 observations each (15,003 in total).

Usage

tracks

Format

A data frame with 15003 rows and 4 variables:

id: ID for each simulated track
date: date, recorded as datetime
x: x coordinate of tracks
y: y coordinate of tracks

Tracks discretized and prepared for segmentation.

Description

A dataset containing the prepared data after discretizing step lengths and turning angles, as well as filtering observations at the primary time step.

Usage

tracks.list

Format

A list with three elements, each containing a data frame with ~4700 rows and 11 variables:

id: ID for each simulated track
date: date, recorded as datetime
x: x coordinate of tracks
y: y coordinate of tracks
step: the step length, calculated as the distance between successive locations in the units of the x and y coordinates
angle: the relative turning angle measured in radians
dt: the time step or sampling interval between datetimes of successive observations
obs: the ordered number of observations per ID before filtering for the primary time step
time1: the ordered number of observations per ID after filtering for the primary time step
SL: discretized step lengths, separated into five bins
TA: discretized turning angles, separated into eight bins
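
As a rough illustration of how the `step` and `angle` variables relate to `x` and `y`, the following base-R sketch computes step lengths and relative turning angles for a made-up path. It assumes planar (projected) coordinates and is not the package's own implementation; the toy coordinates are hypothetical.

```r
# Hypothetical path of five locations in planar coordinates
x <- c(0, 1, 2, 2, 1)
y <- c(0, 0, 1, 2, 2)

# Displacement between successive locations
dx <- diff(x)
dy <- diff(y)

# Step length: Euclidean distance, in the units of x and y
step <- sqrt(dx^2 + dy^2)

# Absolute heading of each step, in radians
heading <- atan2(dy, dx)

# Relative turning angle: change in heading between successive steps,
# wrapped to the interval [-pi, pi]
turn <- diff(heading)
turn <- atan2(sin(turn), cos(turn))
```

With n locations there are n - 1 step lengths and n - 2 turning angles, which is why the first and last observations of a track typically have missing values for one or both variables.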

Segmented tracks for all IDs.

Description

A dataset containing the filtered track data with time segments assigned to all observations on an individual basis.

Usage

tracks.seg

Format

A data frame with 14096 rows and 12 variables:

id: ID for each simulated track
date: date, recorded as datetime
x: x coordinate of tracks
y: y coordinate of tracks
step: the step length, calculated as the distance between successive locations in the units of the x and y coordinates
angle: the relative turning angle measured in radians
dt: the time step or sampling interval between datetimes of successive observations
obs: the ordered number of observations per ID before filtering for the primary time step
time1: the ordered number of observations per ID after filtering for the primary time step
SL: discretized step lengths, separated into five bins
TA: discretized turning angles, separated into eight bins
tseg: time segment assigned to a given set of observations per ID

Package 'bayesmove'

Help Index

Assign behavior estimates to observations
Add segment numbers to observations
Internal function that adds segment numbers to observations
Internal function that runs RJMCMC on a single animal ID
Internal function that transforms a vector of bin numbers to a presence-absence matrix
Cluster observations into behavioral states
Cluster time segments into behavioral states
Internal function that calculates the inverted cumsum
Convert data frame to a list by animal ID
Discretize movement variables
Expand behavior estimates from track segments to observations
Extract behavior proportion estimates for each track segment
Filter observations for time interval of interest
Find changes for integer variable