Title: | Significant Zero Crossings |
---|---|
Description: | Calculates and plots the SiZer map for scatterplot data. A SiZer map is a way of examining when the p-th derivative of a scatterplot-smoother is significantly negative, possibly zero or significantly positive across a range of smoothing bandwidths. |
Authors: | Derek Sonderegger [aut, cre] |
Maintainer: | Derek Sonderegger <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.1-8 |
Built: | 2025-03-14 04:44:20 UTC |
Source: | https://github.com/dereksonderegger/sizer |
A time series of 16 years (5 replicates per year) of mayfly (Ephemeroptera:Heptageniidae) abundance in the fall at the monitoring station AR1 on the Arkansas River in Colorado, USA.
data(Arkansas, package='SiZer')
A data frame with 90 observations on the following 2 variables.
year |
The year of observation. |
sqrt.mayflies |
The square root of observed abundance. |
Sonderegger, D.L., Wang, H., Clements, W.H., and Noon, B.R. 2009. Using SiZer to detect thresholds in ecological data. Frontiers in Ecology and the Environment 7:190-195.
require(ggplot2)
data(Arkansas)
ggplot(Arkansas, aes(x=year, y=sqrt.mayflies)) + geom_point()
Coerce SiZer object to a Data Frame
## S3 method for class 'SiZer'
as.data.frame(x, row.names = NULL, optional = FALSE, ...)
x |
An object produced by 'SiZer()'. |
row.names |
Required for generic compatibility. Not used. |
optional |
Required for generic compatibility. Not used. |
... |
Required for generic compatibility. Not used. |
Fits a bent-cable model to the given data by exhaustively searching the 2-dimensional parameter space to find the maximum likelihood estimators for $\alpha$ and $\gamma$.
bent.cable(x, y, grid.size = 100)
x |
The independent variable |
y |
The dependent variable |
grid.size |
How many $\alpha$ and $\gamma$ values to examine. |
Fit the model, which is essentially a piecewise linear model with a quadratic curve of length $2\gamma$ connecting the two linear pieces.
The reason for searching the space exhaustively is because the bent-cable model often has a likelihood surface with a very flat ridge instead of definite peak. While the exhaustive search is slow, at least it is possible to examine the contour plot of the likelihood surface.
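To make the exhaustive search concrete, here is a minimal base-R sketch. It is not the package's implementation: the helper names `bent_basis` and `grid_search_bent_cable`, the simulated data, and the grid limits for gamma are assumptions made for illustration. Gaussian errors are assumed, so maximizing the likelihood is equivalent to minimizing the sum of squared errors.

```r
# Bent-cable basis term: zero before the bend, quadratic inside
# [alpha - gamma, alpha + gamma], and linear afterwards.
bent_basis <- function(x, alpha, gamma) {
  ifelse(x < alpha - gamma, 0,
         ifelse(x > alpha + gamma, x - alpha,
                (x - alpha + gamma)^2 / (4 * gamma)))
}

# Hypothetical helper: evaluate every (alpha, gamma) pair on a grid.
grid_search_bent_cable <- function(x, y, grid.size = 20) {
  n <- length(x)
  alphas <- seq(min(x), max(x), length.out = grid.size)
  # Assumed grid for gamma: strictly positive, up to half the x-range
  gammas <- seq(0.01, diff(range(x)) / 2, length.out = grid.size)
  sse <- matrix(NA_real_, nrow = grid.size, ncol = grid.size)
  for (i in seq_along(alphas)) {
    for (j in seq_along(gammas)) {
      q <- bent_basis(x, alphas[i], gammas[j])
      # Given alpha and gamma, the model is linear in its coefficients
      sse[i, j] <- sum(resid(lm(y ~ x + q))^2)
    }
  }
  # Profile Gaussian log-likelihood; a decreasing function of the SSE
  loglik <- -n / 2 * (log(2 * pi * sse / n) + 1)
  best <- which(sse == min(sse), arr.ind = TRUE)[1, ]
  list(log.likelihood = loglik, SSE = sse,
       alphas = alphas, gammas = gammas,
       alpha = alphas[best[1]], gamma = gammas[best[2]])
}

# Simulated data with a bend centered at x = 5 (illustrative only)
set.seed(1)
x <- seq(0, 10, length.out = 100)
y <- 1 + 0.5 * x - 1.5 * bent_basis(x, 5, 1) + rnorm(100, sd = 0.2)
fit <- grid_search_bent_cable(x, y, grid.size = 20)
```

Because every grid cell is evaluated, the surface can be inspected directly, for example with `contour(fit$alphas, fit$gammas, fit$log.likelihood)`, which is where a flat ridge would show up.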
A list of 7 elements:
A matrix of log-likelihood values.
A matrix of sum-of-square-error values.
A vector of alpha values examined.
A vector of gamma values examined.
The MLE estimate of alpha.
The MLE estimate of gamma.
The lm fit after $\alpha$ and $\gamma$ are known.
Derek Sonderegger
Chiu, G. S., R. Lockhart, and R. Routledge. 2006. Bent-cable regression theory and applications. Journal of the American Statistical Association 101:542-553.
Toms, J. D., and M. L. Lesperance. 2003. Piecewise regression: a tool for identifying ecological thresholds. Ecology 84:2034-2041.
data(Arkansas)
x <- Arkansas$year
y <- Arkansas$sqrt.mayflies

# For a more accurate estimate, increase grid.size
model <- bent.cable(x,y, grid.size=20)
plot(x,y)
x.grid <- seq(min(x), max(x), length=200)
lines(x.grid, predict(model, x.grid), col='red')
Plot a 'SiZer' object that was created using 'SiZer()'
ggplot_SiZer(x, colorlist = c("red", "purple", "blue", "grey"))
x |
An object created using 'SiZer()' |
colorlist |
What colors should be used. This is a vector that corresponds to 'decreasing', 'possibly zero', 'increasing', and 'insufficient data'. |
The white lines in the SiZer map give a graphical representation of the bandwidth. The horizontal distance between the lines is $2h$.
Derek Sonderegger
Chaudhuri, P., and J. S. Marron. 1999. SiZer for exploration of structures in curves. Journal of the American Statistical Association 94:807-823.
Hannig, J., and J. S. Marron. 2006. Advanced distribution theory for SiZer. Journal of the American Statistical Association 101:484-499.
Sonderegger, D.L., Wang, H., Clements, W.H., and Noon, B.R. 2009. Using SiZer to detect thresholds in ecological data. Frontiers in Ecology and the Environment 7:190-195.
plot.SiZer, locally.weighted.polynomial
data('Arkansas')
x <- Arkansas$year
y <- Arkansas$sqrt.mayflies
plot(x,y)

# Calculate the SiZer map for the first derivative
SiZer.1 <- SiZer(x, y, h=c(.5,10), degree=1, derv=1, grid.length=21)
plot(SiZer.1)
plot(SiZer.1, ggplot2=TRUE)
ggplot_SiZer(SiZer.1)

# Calculate the SiZer map for the second derivative
SiZer.2 <- SiZer(x, y, h=c(.5,10), degree=2, derv=2, grid.length=21)
plot(SiZer.2)
plot(SiZer.2, ggplot2=TRUE)
ggplot_SiZer(SiZer.2)

# By setting the grid.length larger, we get a more detailed SiZer
# map but it takes longer to compute.
#
# SiZer.3 <- SiZer(x, y, h=c(.5,10), grid.length=100, degree=1, derv=1)
# plot(SiZer.3)
Smooths the given bivariate data using kernel regression.
locally.weighted.polynomial( x, y, h = NA, x.grid = NA, degree = 1, kernel.type = "Normal" )
x |
Vector of data for the independent variable |
y |
Vector of data for the dependent variable |
h |
The bandwidth for the kernel |
x.grid |
What x-values should the value of the smoother be calculated at. |
degree |
The degree of the polynomial to be fit at each x-value. The default is to fit a linear regression, i.e., degree=1. |
kernel.type |
What kernel to use. Valid choices are 'Normal', 'Epanechnikov', 'biweight', and 'triweight'. |
The confidence intervals are created using the row-wise method of Hannig and Marron (2006).
Notice that the derivative to be estimated must be less than or equal to the degree of the polynomial initially fit to the data.
If the bandwidth is not given, the Sheather-Jones bandwidth selection method is used.
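The idea behind a locally weighted polynomial can be sketched in a few lines of base R. This is an illustration under stated assumptions, not the package's code: `local_poly` is a hypothetical helper, only the Normal kernel is shown, the bandwidth is supplied by hand, and no confidence intervals are computed. At each grid point the data are centered, weighted by the kernel, and fit with a weighted polynomial regression. The j-th coefficient times j! then estimates the j-th derivative at that point, which is why the requested derivative cannot exceed the polynomial degree.

```r
# Hypothetical helper: weighted polynomial fit at each grid point
local_poly <- function(x, y, h, x.grid, degree = 1) {
  beta <- matrix(NA_real_, nrow = degree + 1, ncol = length(x.grid))
  for (i in seq_along(x.grid)) {
    u <- x - x.grid[i]              # center the data at the grid point
    w <- dnorm(u / h)               # Normal kernel weights with bandwidth h
    X <- outer(u, 0:degree, `^`)    # design matrix with columns 1, u, u^2, ...
    beta[, i] <- lm.wfit(X, y, w)$coefficients
  }
  beta
}

# Noise-free quadratic, so the estimates can be checked by hand
x <- seq(0, 10, length.out = 50)
y <- x^2
beta <- local_poly(x, y, h = 1, x.grid = c(2, 5, 8), degree = 2)

f.hat  <- beta[1, ]                 # estimate of f at the grid points
f1.hat <- beta[2, ] * factorial(1)  # estimate of the first derivative
f2.hat <- beta[3, ] * factorial(2)  # estimate of the second derivative
```

With noisy data a smaller `h` follows the points more closely and a larger `h` gives a smoother curve, which is the trade-off that SiZer examines across a range of bandwidths.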
Returns a LocallyWeightedPolynomial object that has the following elements:
A structure of the data used to generate the smoothing curve
The bandwidth used to generate the smoothing curve.
The grid of x-values that we have estimated function value and derivative(s) for.
The effective sample size at each grid point
A matrix of estimated beta values. The number of rows is degree+1, while the number of columns is the same as the length of x.grid. Notice that $\hat{f}(x_i) = \beta[1,i]$, $\hat{f}'(x_i) = \beta[2,i] \cdot 1!$, $\hat{f}''(x_i) = \beta[3,i] \cdot 2!$, and so on.
Matrix of estimated variances for Beta. Same structure as Beta.
Derek Sonderegger
Chaudhuri, P., and J. S. Marron. 1999. SiZer for exploration of structures in curves. Journal of the American Statistical Association 94:807-823.
Hannig, J., and J. S. Marron. 2006. Advanced distribution theory for SiZer. Journal of the American Statistical Association 101:484-499.
Sonderegger, D.L., Wang, H., Clements, W.H., and Noon, B.R. 2009. Using SiZer to detect thresholds in ecological data. Frontiers in Ecology and the Environment 7:190-195.
SiZer, plot.LocallyWeightedPolynomial, spm in package 'SemiPar', loess, smooth.spline, interpSpline in the splines package.
data(Arkansas)
x <- Arkansas$year
y <- Arkansas$sqrt.mayflies

layout(cbind(1,2,3))
model <- locally.weighted.polynomial(x,y)
plot(model, main='Smoothed Function', xlab='Year', ylab='Sqrt.Mayflies')

model2 <- locally.weighted.polynomial(x,y,h=.5)
plot(model2, main='Smoothed Function', xlab='Year', ylab='Sqrt.Mayflies')

model3 <- locally.weighted.polynomial(x,y, degree=1)
plot(model3, derv=1, main='First Derivative', xlab='Year', ylab='1st Derivative')
Return the log-Likelihood value for a fitted bent-cable model.
## S3 method for class 'bent_cable'
logLik(object, ...)
object |
A bent-cable model |
... |
Unused at this time. |
Calculates the log-Likelihood value
## S3 method for class 'PiecewiseLinear'
logLik(object, ...)
object |
A PiecewiseLinear object |
... |
Unused at this time. |
Fit a degree 1 spline with 1 knot point where the location of the knot point is unknown.
piecewise.linear( x, y, middle = 1, CI = FALSE, bootstrap.samples = 1000, sig.level = 0.05 )
x |
Vector of data for the x-axis. |
y |
Vector of data for the y-axis |
middle |
A scalar in |
CI |
Whether or not a bootstrap confidence interval should be calculated. Defaults to FALSE because the interval takes a non-trivial amount of time to calculate |
bootstrap.samples |
The number of bootstrap samples to take when calculating the CI. |
sig.level |
What significance level to use for the confidence intervals. |
The bootstrap samples are taken by resampling the raw data points. Sometimes a more appropriate bootstrap sample would be to calculate the residuals and then add a randomly selected residual to each y-value.
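The two resampling schemes described above can be compared with a short base-R sketch. Everything here is an assumption for illustration rather than the package's code: `fit_knot` is a hypothetical helper that finds the knot by a grid search over candidate locations, the data are simulated with a true knot at 6, and the number of bootstrap samples is kept small so the example runs quickly.

```r
# Hypothetical helper: choose the knot that minimizes the residual sum of squares
fit_knot <- function(x, y, n.candidates = 30) {
  # Keep candidates away from the extremes so both segments contain data
  q <- unname(quantile(x, c(0.1, 0.9)))
  knots <- seq(q[1], q[2], length.out = n.candidates)
  sse <- sapply(knots, function(k) sum(resid(lm(y ~ x + pmax(x - k, 0)))^2))
  knots[which.min(sse)]
}

# Simulated data: the slope changes from 1 to -1 at x = 6
set.seed(42)
x <- runif(100, 0, 10)
y <- 2 + 1 * x - 2 * pmax(x - 6, 0) + rnorm(100, sd = 0.5)
knot.hat <- fit_knot(x, y)

B <- 50  # deliberately small; a real analysis would use many more samples

# Scheme 1: resample the raw (x, y) pairs
boot.case <- replicate(B, {
  idx <- sample(length(x), replace = TRUE)
  fit_knot(x[idx], y[idx])
})

# Scheme 2: keep x fixed and add resampled residuals to the fitted values
fit <- lm(y ~ x + pmax(x - knot.hat, 0))
boot.resid <- replicate(B, {
  y.star <- fitted(fit) + sample(resid(fit), replace = TRUE)
  fit_knot(x, y.star)
})

# Percentile intervals for the knot at the 0.05 significance level
ci.case  <- quantile(boot.case,  c(0.025, 0.975))
ci.resid <- quantile(boot.resid, c(0.025, 0.975))
```

Resampling pairs makes no assumption about the error structure, while resampling residuals keeps the design points fixed and assumes the errors are exchangeable across x.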
A list of 5 elements is returned:
The estimate of the knot point location.
The resulting lm object once the knot point is known.
The x-values used.
The y-values used.
Whether or not the confidence interval was calculated.
If the CIs were calculated, this is a matrix of the upper and lower intervals.
Chiu, G. S., R. Lockhart, and R. Routledge. 2006. Bent-cable regression theory and applications. Journal of the American Statistical Association 101:542-553.
Toms, J. D., and M. L. Lesperance. 2003. Piecewise regression: a tool for identifying ecological thresholds. Ecology 84:2034-2041.
The package segmented has a much more general implementation of this analysis and users should preferentially use that package.
data(Arkansas)
x <- Arkansas$year
y <- Arkansas$sqrt.mayflies

model <- piecewise.linear(x,y, CI=FALSE)
plot(model)
print(model)
predict(model, 2001)
Creates a plot of an object created by locally.weighted.polynomial.
## S3 method for class 'LocallyWeightedPolynomial'
plot(x, derv = 0, CI.method = 2, alpha = 0.05, use.ess = TRUE, draw.points = TRUE, ...)
x |
LocallyWeightedPolynomial object |
derv |
Derivative to be plotted. Default is 0 - which plots the smoothed function. |
CI.method |
What method should be used to calculate the confidence interval about the estimated line. The methods are from Hannig and Marron (2006), where 1 is the point-wise estimate, and 2 is the row-wise estimate. |
alpha |
The alpha level such that the CI has a 1-alpha/2 level of significance. |
use.ess |
ESS stands for the effective sample size. If the ESS is too small at any point along the x-axis, that point is not plotted unless use.ess=FALSE. |
draw.points |
Should the data points be included in the graph? Defaults to TRUE. |
... |
Additional arguments to be passed to the graphing functions. |
Plots a piecewise linear model
## S3 method for class 'PiecewiseLinear'
plot(x, xlab = "X", ylab = "Y", ...)
x |
A PiecewiseLinear object |
xlab |
The label for the x-axis |
ylab |
The label for the y-axis |
... |
Any further options to be passed to the |
Plot a SiZer map
Plot a SiZer object that was created using SiZer()
## S3 method for class 'SiZer'
plot(x, ylab = expression(log[10](h)), colorlist = c("red", "purple", "blue", "grey"), ggplot2 = FALSE, ...)
x |
An object created using SiZer() |
ylab |
What the y-axis should be labeled. |
colorlist |
What colors should be used. This is a vector that corresponds to 'decreasing', 'possibly zero', 'increasing', and 'insufficient data'. |
ggplot2 |
Should the graphing be done using 'ggplot2'? Defaults to FALSE for backwards compatibility. |
... |
Any other parameters to be passed to the function |
The white lines in the SiZer map give a graphical representation of the bandwidth. The horizontal distance between the lines is $2h$.
Derek Sonderegger
Chaudhuri, P., and J. S. Marron. 1999. SiZer for exploration of structures in curves. Journal of the American Statistical Association 94:807-823.
Hannig, J., and J. S. Marron. 2006. Advanced distribution theory for SiZer. Journal of the American Statistical Association 101:484-499.
Sonderegger, D.L., Wang, H., Clements, W.H., and Noon, B.R. 2009. Using SiZer to detect thresholds in ecological data. Frontiers in Ecology and the Environment 7:190-195.
plot.SiZer, locally.weighted.polynomial
data('Arkansas')
x <- Arkansas$year
y <- Arkansas$sqrt.mayflies
plot(x,y)

# Calculate the SiZer map for the first derivative
SiZer.1 <- SiZer(x, y, h=c(.5,10), degree=1, derv=1, grid.length=21)
plot(SiZer.1)
plot(SiZer.1, ggplot2=TRUE)

# Calculate the SiZer map for the second derivative
SiZer.2 <- SiZer(x, y, h=c(.5,10), degree=2, derv=2, grid.length=21)
plot(SiZer.2)

# By setting the grid.length larger, we get a more detailed SiZer
# map but it takes longer to compute.
#
# SiZer.3 <- SiZer(x, y, h=c(.5,10), grid.length=100, degree=1, derv=1)
# plot(SiZer.3)
Return model predictions for fitted bent-cable model
## S3 method for class 'bent_cable'
predict(object, x, ...)
object |
A bent-cable model |
x |
The set of x-values for which predictions are desired |
... |
A placeholder that is currently ignored. |
Calculates predicted values from a piecewise linear object
## S3 method for class 'PiecewiseLinear'
predict(object, x, ...)
object |
A PiecewiseLinear object |
x |
A vector of x-values at which to calculate the predicted y-values. |
... |
Unused at this time. |
Prints out the model form for a Piecewise linear model
## S3 method for class 'PiecewiseLinear'
print(x, ...)
x |
A PiecewiseLinear object |
... |
Unused at this time. |
Calculates the SiZer map from a given set of X and Y variables.
SiZer( x, y, h = NA, x.grid = NA, degree = NA, derv = 1, grid.length = 41, quiet = TRUE )
x |
data vector for the independent axis |
y |
data vector for the dependent axis |
h |
An integer representing how many bandwidths should be considered, or vector of length 2 representing the upper and lower limits h should take, or a vector of length greater than two indicating which bandwidths to examine. |
x.grid |
An integer representing how many bins to use along the x-axis, or a vector of length 2 representing the upper and lower limits the x-axis should take, or a vector of length greater than two indicating which x-values the derivative should be evaluated at |
degree |
The degree of the local weighted polynomial used to smooth the data. This must be greater than or equal to derv. |
derv |
The order of derivative for which to make the SiZer map. |
grid.length |
The default length of the |
quiet |
Should diagnostic messages be suppressed? Defaults to TRUE. |
SiZer stands for the Significant Zero crossings of the derivative. There are two dominant approaches to smoothing bivariate data: locally weighted regression and penalized splines. Both approaches require a 'bandwidth' parameter that controls how much smoothing is done. Unfortunately, there is no uniformly best bandwidth selection procedure. SiZer (Chaudhuri and Marron, 1999) is a procedure that looks across a range of bandwidths and classifies the p-th derivative of the smoother into one of three states: significantly positive (blue), possibly zero (purple), or significantly negative (red).
Returns list object of type SiZer which has the following components:
Vector of x-values at which the derivative was evaluated.
Vector of bandwidth values for which a smoothing function was calculated.
Matrix of what category a particular x-value and bandwidth falls into (Increasing=1, Possibly Zero=0, Decreasing=-1, Not Enough Data=2).
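The category codes can be illustrated with a small base-R sketch of the classification step for a single bandwidth. This is a simplification under stated assumptions, not the package's code: `classify_derivative` is a hypothetical helper, it uses a plain Normal quantile rather than the row-wise quantile of Hannig and Marron (2006), and the minimum effective sample size of 5 is an assumed threshold. The derivative estimates and standard errors are made-up numbers.

```r
# Hypothetical helper: map derivative estimates to the SiZer category codes
#   1 = increasing, 0 = possibly zero, -1 = decreasing, 2 = not enough data
classify_derivative <- function(est, se, ess, q = qnorm(0.975), min.ess = 5) {
  lower <- est - q * se
  upper <- est + q * se
  state <- ifelse(lower > 0, 1L, ifelse(upper < 0, -1L, 0L))
  state[ess < min.ess] <- 2L   # too little data near this x-value (assumed cutoff)
  state
}

# Made-up derivative estimates at four x-values for one bandwidth
est <- c(1.2, 0.1, -0.9, 0.5)
se  <- c(0.3, 0.3, 0.2, 0.1)
ess <- c(20, 20, 20, 2)
states <- classify_derivative(est, se, ess)   # 1, 0, -1, 2
```

Repeating this for every bandwidth and stacking the resulting vectors gives a matrix with one classification per x-value and bandwidth, which is what the SiZer map colors.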
Derek Sonderegger
Chaudhuri, P., and J. S. Marron. 1999. SiZer for exploration of structures in curves. Journal of the American Statistical Association 94:807-823.
Hannig, J., and J. S. Marron. 2006. Advanced distribution theory for SiZer. Journal of the American Statistical Association 101:484-499.
Sonderegger, D.L., Wang, H., Clements, W.H., and Noon, B.R. 2009. Using SiZer to detect thresholds in ecological data. Frontiers in Ecology and the Environment 7:190-195.
plot.SiZer, locally.weighted.polynomial
data('Arkansas')
x <- Arkansas$year
y <- Arkansas$sqrt.mayflies
plot(x,y)

# Calculate the SiZer map for the first derivative
SiZer.1 <- SiZer(x, y, h=c(.5,10), degree=1, derv=1, grid.length=21)
plot(SiZer.1)
plot(SiZer.1, ggplot2=TRUE)

# Calculate the SiZer map for the second derivative
SiZer.2 <- SiZer(x, y, h=c(.5,10), degree=2, derv=2, grid.length=21)
plot(SiZer.2)

# By setting the grid.length larger, we get a more detailed SiZer
# map but it takes longer to compute.
#
# SiZer.3 <- SiZer(x, y, h=c(.5,10), grid.length=100, degree=1, derv=1)
# plot(SiZer.3)