Title: | Package to Calculate the Influence of the Data on a Changepoint Segmentation |
---|---|
Description: | Allows users to input their data, segmentation and function used for the segmentation (and additional arguments) and the package calculates the influence of the data on the changepoint locations, see Wilms et al. (2022) <doi:10.1080/10618600.2021.2000873>. Currently this can only be used with the changepoint package functions to identify changes, but we plan to extend this. There are options for different types of graphics to assess the influence. |
Authors: | Rebecca Killick [aut, cre], Ines Wilms [aut] |
Maintainer: | Rebecca Killick <[email protected]> |
License: | GPL |
Version: | 1.0.2 |
Built: | 2024-11-15 04:52:15 UTC |
Source: | https://github.com/rkillick/changepoint.influence |
The DESCRIPTION file:
Package: | changepoint.influence |
Type: | Package |
Title: | Package to Calculate the Influence of the Data on a Changepoint Segmentation |
Version: | 1.0.2 |
Date: | 2024-02-19 |
Authors@R: | c(person("Rebecca", "Killick", role=c("aut","cre"),email="[email protected]"), person("Ines", "Wilms", role="aut")) |
Maintainer: | Rebecca Killick <[email protected]> |
BugReports: | https://github.com/rkillick/changepoint.influence/issues |
URL: | https://github.com/rkillick/changepoint.influence/ |
Imports: | data.table, ggplot2, gridExtra, reshape, graphics, methods |
Depends: | R(>= 3.6), changepoint |
Suggests: | testthat, vdiffr |
Description: | Allows users to input their data, segmentation and function used for the segmentation (and additional arguments) and the package calculates the influence of the data on the changepoint locations, see Wilms et al. (2022) <doi:10.1080/10618600.2021.2000873>. Currently this can only be used with the changepoint package functions to identify changes, but we plan to extend this. There are options for different types of graphics to assess the influence. |
License: | GPL |
LazyData: | true |
Packaged: | 2024-02-19 15:55:29 UTC; killick |
Repository: | https://rkillick.r-universe.dev |
RemoteUrl: | https://github.com/rkillick/changepoint.influence |
RemoteRef: | HEAD |
RemoteSha: | 863e1a6fe11ec022dc77a412c4cf6935419a7aeb |
Author: | Rebecca Killick [aut, cre], Ines Wilms [aut] |
Index of help topics:
InfluenceMap | Influence Map Graphic |
LocationStability | Location Stability Graphic |
ParameterStability | Parameter Stability Graphic |
StabilityOverview | Stability Overview Graphic |
changepoint.influence-package | Package to Calculate the Influence of the Data on a Changepoint Segmentation |
welldata | Welllog data |
The package allows users to input their data, segmentation and function used for the segmentation (and additional arguments) and the package calculates the influence of the data on the changepoint locations.
The influence() function is the first port of call to calculate the influence. We provide two methods for influence detection, the "delete" and "outlier" options, which respectively consider the effect of deleting a data point or making it an outlier. Currently we provide these methods for cpt objects (as generated by the "changepoint" package) but plan to extend this to other objects in the future. Please add requests for objects to include to our GitHub issues.
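To make the two options concrete, the following base-R sketch builds the kind of modified series each option considers for a single index. It is an illustration only: the helper names `delete_point` and `make_outlier`, and the rule used to create the outlier, are assumptions made here and are not taken from the package internals.

```r
# Hypothetical helpers (not part of changepoint.influence); they only
# illustrate the two kinds of modification described above.
delete_point <- function(x, i) {
  x[-i]                      # series of length n - 1 with observation i removed
}

make_outlier <- function(x, i, shift = 2 * diff(range(x))) {
  # Assumption: an "outlier" is made by moving observation i well above
  # the observed range; the package may use a different rule.
  x[i] <- max(x) + shift
  x
}

set.seed(1)
x <- c(rnorm(20), rnorm(20, mean = 5))

x_del <- delete_point(x, 10)   # one fewer observation than x
x_out <- make_outlier(x, 10)   # same length as x, observation 10 is now extreme

length(x_del)
which.max(x_out)
```

An influence analysis repeats such a modification for every index and re-runs the segmentation each time, which is why the resulting graphics are organised by the index of the altered data point.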
Users are encouraged to explore the documentation for the StabilityOverview() graphic first, followed by the LocationStability() and ParameterStability() graphics for a more granular view, and finally the InfluenceMap() for the highest level of detail.
Rebecca Killick [aut, cre], Ines Wilms [aut]
Maintainer: Rebecca Killick <[email protected]>
Wilms I, Killick R, Matteson DS (2022) Graphical Influence Diagnostics for Changepoint Models, Journal of Computational and Graphical Statistics, 31:3, 753–765 DOI: 10.1080/10618600.2021.2000873
influence-methods, StabilityOverview, LocationStability, ParameterStability, InfluenceMap
    #### Load the data in the R package changepoint.influence ####
    data("welldata")
    welllog = welldata[1001:2000] # Extract the mid section of the data as analyzed in other papers
    n = length(welllog)
    var = NULL; for (i in 30:1000){var[i]=var(welllog[(i-29):i])}
    welllogs = welllog/sqrt(median(var, na.rm = TRUE))
    # rescale the data to have unit variance across time,
    # note that there may still be changes in variance across the series.

    #### Apply PELT to the welllog data ####
    out.PELT = cpt.mean(welllogs, method = 'PELT')

    #### Calculate the influence measures ####
    welllogs.inf = influence(out.PELT)
    # the code extracts all the details of the original cpt.mean() function call
    # and uses these in the calculation of the influence for the modified data.

    #### Stability Dashboards ####
    StabilityOverview(welllogs, cpts(out.PELT), welllogs.inf, las = 1,
      ylab='Nuclear-Magnetic Response',
      legend.args=list(display=TRUE,x="bottomright",y=NULL,cex=1.5,bty="n",horiz=FALSE,xpd=FALSE))
    # We can specify where the legend will sit in the graphic via the legend.args
    # which are passed to the legend() function. We can also include additional arguments
    # to pass to the plotting such as las=1 here.

    #### Location Stability plot ####
    exp.seg=LocationStability(cpts(out.PELT), welllogs.inf, type = 'Difference',
      cpt.lwd = 4, las = 1)
    # Note that if the expected segmentation is not provided, it will be calculated and then
    # returned so that the user can avoid calculating this again in other plot calls.

    #### Parameter Stability plot ####
    ParameterStability(welllogs.inf,
      original.mean = rep(param.est(out.PELT)$mean, times=diff(c(0,out.PELT@cpts))),
      las = 1, ylab = 'Nuclear-Magnetic Response')
    # Note that the original.mean argument is provided for each timepoint so is a length n vector.

    #### Influence Map ####
    ## Not run:
    library(ggplot2)
    welllogs.inf = influence(out.PELT, method = "delete")
    InfluenceMap(cpts(out.PELT),welllogs.inf,data=welllogs,include.data=TRUE,
      ylab='Nuclear-Magnetic\n Response',
      ggops=theme(axis.text=element_text(size=15),axis.title=element_text(size=20),
      plot.title=element_text(size=25)))
    # The InfluenceMap uses ggplot2 functions, thus you can add theme options via the ggops argument.
    # Here we change the text sizes to ensure readable titles and labels for a report.

    welllogs.inf = influence(out.PELT, method = "outlier")
    InfluenceMap(cpts(out.PELT), welllogs.inf, data = welllogs, include.data = TRUE,
      ylab='Nuclear-Magnetic\n Response')
    ## End(Not run)
Plots the highest detail level of the changepoint location stability according to the influence measure.
InfluenceMap(original.cpts, influence, resid=NULL,data=NULL,include.data=FALSE, influence.col=c("#0C4479","white","#AB9783"),cpt.col=c("#009E73", "#E69F00", "#E41A1C"), cpt.lty=c("dashed","dotdash","dotted"),ylab='',ggops=NULL)
original.cpts |
An ordered vector of the changepoint locations found by your favourite changepoint method. |
influence |
The influence as calculated the |
resid |
An nxn matrix containing the difference of the observed class ( |
data |
A vector containing the data on which you have run your changepoint method. |
include.data |
Is a plot of the data to be included above the histogram. Default is |
influence.col |
A length 3 vector giving the lower, middle (0) and upper bounds for the influence map colour grading. Note that you should choose these colours to not conflict with the colours used for |
cpt.col |
Colour of the |
cpt.lty |
Line type of the |
ylab |
The label for the y-axis, character vector expected. |
ggops |
Any other settings to be passed to the |
This function creates the highest detail graphic to display the results of a changepoint influence analysis on the location of the changepoints. The graphic is an nxn heatmap of the difference between the observed segmentations under the "delete" or "outlier" Influence analysis and the expected segmentation. Note that the expected segmentations take into account the fact that a changepoint at a timepoint, say 100, will move (to 99) when a timepoint prior to it is deleted and that adding an outlier will introduce new changepoints.
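The index shift described above can be sketched in a few lines of base R. This is only an illustration under stated assumptions: `expected_after_delete` is a hypothetical helper, not a package function, and it assumes each changepoint is the last index of its segment, so deleting the changepoint index itself is also treated as a shift.

```r
# Hypothetical helper: expected changepoint locations after deleting
# observation i, assuming each changepoint is the last index of its segment.
expected_after_delete <- function(cpts, i) {
  cpts - as.integer(i <= cpts)   # changepoints at or after i move one step left
}

expected_after_delete(c(50, 100), i = 30)    # both changepoints shift
expected_after_delete(c(50, 100), i = 75)    # only the second changepoint shifts
expected_after_delete(c(50, 100), i = 150)   # neither changepoint shifts
```

Comparing the observed segmentation against this shifted expectation, rather than against the original locations, avoids flagging a changepoint as moved when only its index has changed.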
Datapoints on the vertical axis without a single coloured co-ordinate on the horizontal axis can be considered as non-influential since they do not trigger any changepoint instability. Rows with coloured pixels correspond to data points which are instability triggers.
How to interpret the Influence Map (please also read the paper in the references for fuller details):
Colouring above the diagonal indicates that an alteration of the corresponding data point (on the vertical axis) affects earlier data points; colouring below the diagonal indicates that subsequent data points are affected.
A stop in colouring indicates that changepoints have moved, while a continuation of colouring to the last data point indicates that, in total, fewer or additional changepoints are detected.
Most colouring originates on the diagonal, indicating that a data point's alteration mainly affects neighbouring data points, which most often belong to the same segment. By contrast, in some cases a coloured pixel may originate away from the diagonal, indicating that the altered data point exercises global influence.
All data points (on the vertical axis) that appear in the coloured area are influential and assert influence over the corresponding data points on the horizontal axis. The height can be seen as the extent to which instability arises in this influential region.
The function returns a plot denoted the Influence Map. If resid=NULL then the residuals (observed class - expected class) are also returned.
Rebecca Killick
Wilms I, Killick R, Matteson DS (2022) Graphical Influence Diagnostics for Changepoint Models, Journal of Computational and Graphical Statistics, 31:3, 753–765 DOI: 10.1080/10618600.2021.2000873
influence-methods, StabilityOverview, ParameterStability, LocationStability
    #### Generate Simulated data example ####
    set.seed(30)
    x=c(rnorm(50),rnorm(50,mean=5),rnorm(1,mean=15),rnorm(49,mean=5),rnorm(50,mean=4))
    xcpt = cpt.mean(x,method='PELT') # Get the changepoints via PELT

    #### Influence Map ####
    ## Not run:
    library(ggplot2)
    x.inf = influence(xcpt, method = "delete")
    InfluenceMap(cpts(xcpt), x.inf, data = x, include.data = TRUE,
      ggops = theme(axis.text = element_text(size=15), axis.title = element_text(size=20),
      plot.title = element_text(size=25)))

    x.inf = influence(xcpt, method = "outlier")
    InfluenceMap(cpts(xcpt), x.inf, data=x, include.data = TRUE,
      ggops = theme(axis.text = element_text(size=15), axis.title = element_text(size=20),
      plot.title = element_text(size=25)))
    ## End(Not run)
Plots the middle detail level of the changepoint location stability according to the influence measure.
LocationStability(original.cpts, influence, expected.class=NULL, type=c("Difference","Global","Local"),data=NULL,include.data=FALSE,cpt.lwd=4, cpt.col=c("#009E73", "#E69F00", "#E41A1C"),cpt.lty=c("dashed","dotdash","dotted"), ylab='',xlab='Index',...)
original.cpts |
An ordered vector of the changepoint locations found by your favourite changepoint method. |
influence |
The influence as calculated the |
expected.class |
Only needed for |
type |
The type of Location Stability plot, can be |
data |
A vector containing the data on which you have run your changepoint method. |
include.data |
Is a plot of the data to be included above the histogram. Default is |
cpt.lwd |
The line width to be used when plotting the |
cpt.col |
Colour of the |
cpt.lty |
Line type of the |
ylab, xlab |
The labels for the x- and y-axis, character vector expected. |
... |
Any other arguments to be passed to the |
This function creates a more granular graphic to display the results of a changepoint influence analysis on the location of the changepoints. The graphic is a histogram of the observed segmentations under the "delete" or "outlier" Influence analysis. The colour and line type of the bars at the original.cpts locations reflect their stability. The first value of each of these arguments denotes a stable changepoint, which appears at the same location in all influence segmentations. The second value denotes an unstable changepoint, which doesn't appear at the same location in all influence segmentations: it either moves or is deleted. The third value denotes changepoint locations which are deemed outliers, as two changepoints occur at consecutive locations (surrounding the outlying observation). Please note that type="Global" uses only colour and not line type.
type="Difference" gives the difference between the observed and expected changepoint segmentations under the "delete" or "outlier" Influence analysis. A positive value can only occur where a changepoint is contained in the observed segmentations but is not present in the expected (an additional changepoint time). A negative value can only occur at the original changepoint location where the changepoint is not present in at least one of the observed segmentations. Note that the expected segmentations take into account the fact that a changepoint at a timepoint, say 100, will move (to 99) when a timepoint prior to it is deleted.
type="Global" histograms the observed segmentations. Colour is added to the original changepoint locations and a horizontal (light grey) line is added to the plot to denote the maximum count. Any original changepoint bar that does not meet this grey line indicates that the changepoint is unstable, as it either moves or is deleted in at least one of the observed segmentations. For large datasets it can be difficult to see what is happening at locations that appear as black bars, as these are typically small counts. Hence the inclusion of the "Local" option.
type="Local" histograms the observed segmentations with the original changepoint locations removed. This allows users to see the smaller counts that can be masked in larger datasets. These are the locations to which original changepoints move or at which additional changepoints are added.
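The three plot types can be read as three summaries of the same counts. The toy sketch below, in base R with made-up segmentations, shows one way such counts could be formed; it is not the package's implementation, all object names are hypothetical, and it assumes no index shift (so every run is expected to reproduce the original changepoints).

```r
# Toy illustration only; not the package's implementation.
n <- 10                       # length of the (hypothetical) series
original_cpts <- c(4, 7)      # changepoints found on the unmodified data

# Hypothetical segmentations observed on three modified series
observed <- list(c(4, 7), c(4, 6), c(4, 7, 9))
# Assumption: no index shift, so each run is expected to give original_cpts
expected <- rep(list(original_cpts), length(observed))

obs_count <- tabulate(unlist(observed), nbins = n)   # "Global": count per location
exp_count <- tabulate(unlist(expected), nbins = n)

difference <- obs_count - exp_count                  # "Difference": observed - expected
local_count <- obs_count
local_count[original_cpts] <- 0                      # "Local": original locations removed

which(difference > 0)   # locations of additional changepoints
which(difference < 0)   # original changepoints missing from at least one run
```

In this toy case the changepoint at 4 reaches the maximum count (it appears in all three runs), while the changepoint at 7 falls short of it, which is the pattern the Global plot uses to flag instability.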
The function returns plot(s) and a list containing the labels of the original.cpts as either "stable", "unstable", or "outlier". If type="Difference" and expected.class=NULL then the expected class is also returned as the first element of the list.
Rebecca Killick
Wilms I, Killick R, Matteson DS (2022) Graphical Influence Diagnostics for Changepoint Models, Journal of Computational and Graphical Statistics, 31:3, 753–765 DOI: 10.1080/10618600.2021.2000873
influence-methods, StabilityOverview, ParameterStability, InfluenceMap
    #### Generate Simulated data example ####
    set.seed(30)
    x = c(rnorm(50), rnorm(50, mean = 5), rnorm(1, mean = 15), rnorm(49, mean = 5), rnorm(50, mean = 4))
    xcpt = cpt.mean(x,method='PELT') # Get the changepoints via PELT

    #### Get the influence for both delete and outlier options ####
    x.inf = influence(xcpt)

    #### Location Stability Difference plot ####
    exp.class=LocationStability(cpts(xcpt), type = 'Difference', x.inf, cpt.lwd = 4, las = 1)
    # note that the expected.class is also returned

    #### Location Stability Global plot ####
    exp.class=LocationStability(cpts(xcpt), type = 'Global', x.inf, cpt.lwd = 4, las = 1)

    #### Location Stability Local plot ####
    exp.class=LocationStability(cpts(xcpt), type = 'Local', x.inf, cpt.lwd = 4, las = 1)
Plots the middle detail level of the changepoint parameter stability according to the influence measure.
ParameterStability(influence,original.mean=NULL,digits=6,ylab='',xlab='Index', cpt.col='red',cpt.width=3,...)
influence |
The influence as calculated the |
original.mean |
A vector, length n, of the mean under the original segmentation at each timepoint. |
digits |
The number of significant figures to round the mean values to before plotting. (Purely to reduce the number of points plotted to make the graphics smaller for storage and loading) |
ylab, xlab |
The labels for the x- and y-axis, character vector expected. |
cpt.col |
Colour of the original parameter vector when plotted. Any values accepted by the |
cpt.width |
Width of the original parameter vector when plotted. Any values accepted by the |
... |
Any other arguments to be passed to the |
This function creates a more granular graphic to display the results of a changepoint influence analysis on the estimated segment parameter. The graphic depicts the observed segment parameters under the "delete" or "outlier" Influence analysis. The intensity of the grey denotes how often that parameter value was seen across all segmentations. We overlay this with the original segment parameters.
The function returns a plot (silently).
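Because original.mean must hold one value per timepoint, the per-segment means need to be expanded to length n. The base-R sketch below shows that expansion with made-up numbers; `cpts_with_end` and `seg_means` are hypothetical stand-ins for the changepoint locations (including the final index n) and the fitted segment means.

```r
# Hypothetical inputs: segment end points (the last one equals n)
# and one fitted mean per segment.
cpts_with_end <- c(50, 100, 150)
seg_means <- c(0, 5, 4)

# diff(c(0, ...)) gives the segment lengths; rep() repeats each mean
# once for every timepoint in its segment.
seg_lengths <- diff(c(0, cpts_with_end))
original_mean <- rep(seg_means, times = seg_lengths)

length(original_mean)          # equals n
original_mean[c(50, 51, 101)]  # last point of segment 1, first of segments 2 and 3
```

This is the same rep()/diff() pattern used in the package examples, written out with plain vectors so the segment lengths are visible.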
Rebecca Killick
Wilms I, Killick R, Matteson DS (2022) Graphical Influence Diagnostics for Changepoint Models, Journal of Computational and Graphical Statistics, 31:3, 753–765 DOI: 10.1080/10618600.2021.2000873
influence-methods, StabilityOverview, LocationStability, InfluenceMap
    #### Generate Simulated data example ####
    set.seed(30)
    x=c(rnorm(50),rnorm(50,mean=5),rnorm(1,mean=15),rnorm(49,mean=5),rnorm(50,mean=4))
    xcpt = cpt.mean(x,method='PELT') # Get the changepoints via PELT

    #### Get the influence for both delete and outlier options ####
    x.inf = influence(xcpt)

    #### Parameter Stability plot ####
    ParameterStability(x.inf,
      original.mean = rep(param.est(xcpt)$mean, times=diff(c(0,xcpt@cpts))), las = 1)
    # note that the original mean is an n length vector and you can use the above code
    # to get this from the original changepoint locations.
Plots the overview of the stability according to the influence measure.
StabilityOverview(data, original.cpts, influence,cpt.lwd=2, cpt.col=c("#009E73", "#E69F00", "#E41A1C"),cpt.lty=c("dashed","dotdash","dotted"), ylab=' ',xlab='Index', legend.args=list(display=TRUE,x="left",y=NULL,cex = 1,bty="n", horiz=TRUE,xpd=FALSE), ...)
data |
A vector containing the data on which you have run your changepoint method. |
original.cpts |
An ordered vector of the changepoint locations found by your favourite changepoint method. |
influence |
The influence as calculated the |
cpt.lwd |
The line width to be used when plotting the |
cpt.col |
Colour of the |
cpt.lty |
Line type of the |
ylab, xlab |
The labels for the x- and y-axis, character vector expected. |
legend.args |
These arguments are passed to the |
... |
Any other arguments to be passed to the |
This function creates a first summary graphic to display the results of a changepoint influence analysis. The graphic is a plot of the original data with the changepoints as vertical lines at their respective positions. The colour and line type of the changepoint vertical lines reflect their stability. The first value of their arguments denotes a stable changepoint - which appears at the same location in all influence segmentations. The second argument denotes an unstable changepoint - which doesn't appear at the same location in all influence segmentations, either it moves or is deleted. The third argument denotes changepoint locations which are deemed outliers as two changepoints occur at consecutive locations (surrounding the outlying observation).
The function returns a plot and a list containing the labels of the original.cpts as either "stable", "unstable", or "outlier".
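The three labels can be illustrated with a small base-R sketch that applies the rules as described above to made-up counts. This is an assumption-laden toy: `label_cpts` is a hypothetical helper, the counts are invented, and the package's own labelling logic may differ in its details.

```r
# Hypothetical helper applying the labelling rules described in the text.
#   original_cpts: ordered original changepoint locations
#   obs_count:     how often each location was a changepoint across all runs
#   n_runs:        number of modified series that were segmented
label_cpts <- function(original_cpts, obs_count, n_runs) {
  # stable if the changepoint appears at that location in every run
  labels <- ifelse(obs_count[original_cpts] == n_runs, "stable", "unstable")
  # two original changepoints at consecutive locations are deemed outliers
  gaps <- diff(original_cpts)
  consecutive <- c(gaps == 1, FALSE) | c(FALSE, gaps == 1)
  labels[consecutive] <- "outlier"
  labels
}

obs_count <- integer(200)
obs_count[c(50, 100, 101)] <- 3L   # seen in all three runs
obs_count[150] <- 2L               # missing from one run

label_cpts(c(50, 100, 101, 150), obs_count, n_runs = 3)
```

Here the changepoints at 100 and 101 sit on either side of a single observation, which is the consecutive-location pattern the text associates with an outlying data point.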
Rebecca Killick
Wilms I, Killick R, Matteson DS (2022) Graphical Influence Diagnostics for Changepoint Models, Journal of Computational and Graphical Statistics, 31:3, 753–765 DOI: 10.1080/10618600.2021.2000873
influence-methods, LocationStability, ParameterStability, InfluenceMap
    #### Generate Simulated data example ####
    set.seed(30)
    x=c(rnorm(50),rnorm(50,mean=5),rnorm(1,mean=15),rnorm(49,mean=5),rnorm(50,mean=4))
    xcpt = cpt.mean(x,method='PELT') # Get the changepoints via PELT

    #### Get the influence for both delete and outlier options ####
    x.inf = influence(xcpt)

    #### Stability Dashboard ####
    StabilityOverview(x,cpts(xcpt),x.inf,las=1,
      legend.args=list(display=TRUE,x="topright",y=NULL,cex=1.5,bty="n",horiz=FALSE,xpd=FALSE))
This data has been used in previous changepoint papers and is described and provided in "On-line inference for hidden Markov models via particle filters" by Fearnhead and Clifford in 2003. The data consists of measurements of the nuclear magnetic response of underground rocks.
Please note that this is the original data. The data analyzed in the majority of publications has been standardized and/or had the outliers removed. Papers also typically analyze only a portion of the length-4050 vector.
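One simple form of standardization is to rescale by the median of a rolling-window variance, as the example for this dataset does with a window of 30. The sketch below wraps that idea in a function and applies it to simulated data; `rescale_unit_variance` is a hypothetical helper name and the simulated series is only a stand-in for the real measurements.

```r
# Hypothetical helper: divide a series by the square root of the median
# rolling-window variance, so a typical window has variance close to 1.
rolling_var <- function(x, window = 30) {
  vapply(window:length(x),
         function(i) var(x[(i - window + 1):i]),
         numeric(1))
}

rescale_unit_variance <- function(x, window = 30) {
  x / sqrt(median(rolling_var(x, window)))
}

set.seed(42)
# Simulated stand-in: one change in mean, constant standard deviation of 3
x <- c(rnorm(200, mean = 0, sd = 3), rnorm(200, mean = 10, sd = 3))
xs <- rescale_unit_variance(x)

median(rolling_var(xs))   # 1 up to floating point error, by construction
```

Using the median rather than the mean of the window variances keeps the scale estimate from being dominated by the few windows that straddle a change in mean.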
welldata
A vector of length 4050.
https://doi.org/10.1111/1467-9868.00421
    #### Load the data in the R package changepoint.influence ####
    data("welldata")
    welllog = welldata[1001:2000] # Extract the mid section of the data as analyzed in other papers
    n = length(welllog)
    var = NULL; for (i in 30:1000){var[i]=var(welllog[(i-29):i])}
    welllogs = welllog/sqrt(median(var, na.rm = TRUE))
    # rescale the data to have unit variance across time,
    # note that there may still be changes in variance across the series.

    #### Apply PELT to the welllog data ####
    out.PELT = cpt.mean(welllogs, method = 'PELT')

    #### Calculate the influence measures ####
    welllogs.inf = influence(out.PELT)
    # the code extracts all the details of the original cpt.mean() function call
    # and uses these in the calculation of the influence for the modified data.

    #### Stability Dashboards ####
    StabilityOverview(welllogs,cpts(out.PELT),welllogs.inf,las=1,
      ylab='Nuclear-Magnetic Response',
      legend.args=list(display=TRUE,x="bottomright",y=NULL,cex=1.5,bty="n",horiz=FALSE,xpd=FALSE))
    # We can specify where the legend will sit in the graphic via the legend.args
    # which are passed to the legend() function. We can also include additional
    # arguments to pass to the plotting such as las=1 here.

    #### Location Stability plot ####
    exp.seg=LocationStability(cpts(out.PELT), welllogs.inf, type = 'Difference',
      cpt.lwd = 4, las = 1)
    # Note that if the expected segmentation is not provided, it will be calculated
    # and then returned so that the user can avoid calculating this again in other plot calls.

    #### Parameter Stability plot ####
    ParameterStability(welllogs.inf,
      original.mean = rep(param.est(out.PELT)$mean, times=diff(c(0,out.PELT@cpts))),
      las = 1, ylab = 'Nuclear-Magnetic Response')
    # Note that the original.mean argument is provided for each timepoint so is a length n vector.

    #### Influence Map ####
    welllogs.inf = influence(out.PELT, method = "delete")
    inf.resid.del=InfluenceMap(cpts(out.PELT), welllogs.inf, data = welllogs,
      include.data = TRUE, ylab = 'Nuclear-Magnetic\n Response')

    welllogs.inf = influence(out.PELT, method = "outlier")
    inf.resid.out=InfluenceMap(cpts(out.PELT), welllogs.inf, data = welllogs,
      include.data = TRUE, ylab='Nuclear-Magnetic\n Response')