| Version: | 1.4 |
| Title: | Data Sharpening |
| Author: | W. John Braun <john.braun@ubc.ca> |
| Maintainer: | W.J. Braun <john.braun@ubc.ca> |
| Depends: | R (≥ 3.5.0), KernSmooth, stats, quadprog |
| Description: | Functions and data sets inspired by data sharpening - data perturbation to achieve improved performance in nonparametric estimation, as described in Choi, E., Hall, P. and Rousson, V. (2000). Capabilities for enhanced local linear regression function and derivative estimation are included, as well as an asymptotically correct iterated data sharpening estimator for any degree of local polynomial regression estimation. A cross-validation-based bandwidth selector is included which, in concert with the iterated sharpener, will often provide superior performance, according to a median integrated squared error criterion. Sample data sets are provided to illustrate function usage. |
| LazyLoad: | true |
| LazyData: | true |
| ZipData: | no |
| License: | Unlimited |
| NeedsCompilation: | yes |
| Packaged: | 2021-03-30 07:21:38 UTC; braun |
| Repository: | CRAN |
| Date/Publication: | 2021-03-30 07:40:02 UTC |
Cross-Validation Bandwidth Selector for Local Polynomial Regression
Description
Cross-validation bandwidth selector for iterated sharpened responses for bias reduction in function estimation.
Usage
CVsharp(x, y, deg, nsteps)
Arguments
x |
a numeric vector containing the predictor variable values. |
y |
a numeric vector containing the response variable values. |
deg |
a numeric vector containing the local polynomial degree used. |
nsteps |
a numeric vector containing the number of iteration steps. |
Details
If nsteps is specified to be 0, then the CV bandwidth for conventional local polynomial regression is provided.
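To illustrate the nsteps = 0 case, here is a minimal base-R sketch of leave-one-out cross-validation for a conventional local constant fit with a Gaussian kernel. It is not the code used by CVsharp; the helper name loo_cv_score, the simulated data and the bandwidth grid are all assumptions made for illustration.

```r
# Leave-one-out CV score for a local constant (Nadaraya-Watson) fit
# with a Gaussian kernel. loo_cv_score is a hypothetical helper name.
loo_cv_score <- function(x, y, h) {
  # n x n matrix of kernel weights K((x_i - x_j) / h)
  K <- dnorm(outer(x, x, "-") / h)
  diag(K) <- 0                       # leave each point out of its own fit
  fit <- as.vector(K %*% y) / rowSums(K)
  mean((y - fit)^2)                  # average squared prediction error
}

# Simulated data (assumption: any bivariate sample would do)
set.seed(1)
x <- seq(0, 1, length.out = 60)
y <- sin(2 * pi * x) + rnorm(60, sd = 0.2)

hgrid  <- seq(0.02, 0.3, length.out = 25)   # candidate bandwidths
scores <- sapply(hgrid, function(h) loo_cv_score(x, y, h))
h_cv   <- hgrid[which.min(scores)]          # bandwidth with smallest CV score
```

The three objects hgrid, scores and h_cv mirror the kind of output described under Value: candidate bandwidths, their CV scores, and the selected bandwidth.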
Value
a list containing 3 elements: the candidate bandwidths; the corresponding CV scores; the selected optimal bandwidth (the element CVh used in the examples below).
Author(s)
W.J. Braun
See Also
locpoly
Examples
speed <- MPG[, 1]
mpg <- MPG[, 2]
h <- CVsharp(speed, mpg, 0, 0)$CVh # conventional local constant regression bandwidth
mpg.l0 <- locpoly(speed, mpg, bandwidth=h, degree=0)
h <- CVsharp(speed, mpg, 0, 1)$CVh # 1-sharpened local constant regression bandwidth
mpgSharp <- sharpiteration(speed, mpg, 0, h, 1)
mpg.l1 <- locpoly(speed, mpgSharp[[1]], bandwidth=h, degree=0)
h <- CVsharp(speed, mpg, 0, 5)$CVh # 5-sharpened local constant regression bandwidth
mpgSharp <- sharpiteration(speed, mpg, 0, h, 5)
mpg.l5 <- locpoly(speed, mpgSharp[[5]], bandwidth=h, degree=0)
plot(mpg ~ speed)
lines(mpg.l0) # unsharpened function estimation
lines(mpg.l1, col=2, lty=2) # sharpened function estimation (1 step)
lines(mpg.l5, col=4, lty=3) # sharpened function estimation (5 steps)
Data Sharpening for Local Linear Regression
Description
Calculation of sharpened responses for bias reduction in function and first derivative estimation, assuming a Gaussian kernel is used in bivariate scatterplot smoothing.
Usage
LLsharpen(x, y, h)
Arguments
x |
a numeric vector containing the predictor variable values. |
y |
a numeric vector containing the response variable values. |
h |
a numeric vector containing the (scalar) bandwidth. |
Value
a vector containing the sharpened (i.e. perturbed) response values, ready for input into a local linear regression estimator.
Author(s)
W.J. Braun
References
Choi, E., Hall, P. and Rousson, V. (2000) Data sharpening methods for bias reduction in nonparametric regression. Annals of Statistics 28(5) 1339-1355.
See Also
locpoly
Examples
speed <- MPG[, 1]
mpg <- MPG[, 2]
h <- dpill(speed, mpg)*2
mpgSharp <- LLsharpen(speed, mpg, h)
mpg.lS <- locpoly(speed, mpgSharp, bandwidth=h, drv=1, degree=1)
mpg.lX <- locpoly(speed, mpg, bandwidth=h, drv=1, degree=1)
plot(mpg.lX, type="l") # unsharpened derivative estimation
lines(mpg.lS, col=2, lty=2) # sharpened derivative estimation
Mileage Data
Description
The MPG data frame has 15 rows and 10 columns.
Usage
data(MPG)
Format
This data frame contains the following columns:
- speed
a numeric vector of cruising speeds in miles per hour
- corsica88
miles per gallon for a 1988 Corsica
- legacy93
miles per gallon for a 1993 Legacy
- olds94
miles per gallon for a 1994 Oldsmobile
- cutlass94
miles per gallon for a 1994 Oldsmobile Cutlass
- chevpickup94
miles per gallon for a 1994 Chevrolet Pickup
- cherokee94
miles per gallon for a 1994 Jeep Cherokee
- villager94
miles per gallon for a 1994 Villager
- prizm95
miles per gallon for a 1995 Prizm
- celica97
miles per gallon for a 1997 Toyota Celica
Source
B.H. West, R.N. McGill, J.W. Hodgson, S.S. Sluder, D.E. Smith, Development and Verification of Light-Duty Modal Emissions and Fuel Consumption Values for Traffic Models, Washington, DC, April 1997, and additional project data, April 1998.
Examples
data(MPG)
plot(celica97 ~ speed, data = MPG)
Matrix of derivative coefficients for local polynomial estimates
Description
Computes the matrix of coefficients of the first derivatives used in the monotonic local polynomial sharpening problem.
Usage
MonoMat(xgrid, x, h, d)
Arguments
xgrid |
numeric vector of locations where monotonicity constraint is to be enforced |
x |
numeric explanatory vector |
h |
numeric bandwidth |
d |
local polynomial degree, can be either 0 or 1 |
Value
a list containing the A matrix and the number of rows in A.
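As a rough illustration of what such a coefficient matrix can look like, the sketch below builds one for the local constant case (d = 0) with a Gaussian kernel, so that multiplying it by the responses gives the slope of the fitted curve at each grid point. The helper name deriv_coef_nw is hypothetical, and the layout and scaling of the A matrix actually returned by MonoMat may differ.

```r
# Hypothetical helper: row j holds the derivatives, with respect to the
# evaluation point, of the Nadaraya-Watson weights at xgrid[j], so that
# (A %*% y)[j] is the slope of the local constant fit at xgrid[j].
deriv_coef_nw <- function(xgrid, x, h) {
  U  <- outer(xgrid, x, "-")     # xgrid[j] - x[i]
  K  <- dnorm(U / h)             # Gaussian kernel weights
  dK <- -U / h^2 * K             # derivative of K with respect to xgrid[j]
  S  <- rowSums(K)               # normalising sums, one per grid point
  dS <- rowSums(dK)
  (dK * S - K * dS) / S^2        # quotient rule applied row by row
}

x      <- seq(1, 10, length.out = 51)
xgrid  <- seq(2, 9, length.out = 15)
y      <- log(x)
A      <- deriv_coef_nw(xgrid, x, h = 0.6)
slopes <- as.vector(A %*% y)     # estimated first derivative on the grid
```

Because the kernel weights at each grid point sum to one, each row of A sums to zero; a monotonicity constraint of the kind described above then amounts to requiring every entry of A %*% y to be nonnegative.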
Author(s)
W.J. Braun
Monotonized Local Regression
Description
Local constant and local linear regression are applied to bivariate data. The response is ‘sharpened’ or perturbed in a way to render a monotonically increasing curve estimate.
Usage
Monolpoly(x, y, h, d=1, xgrid, numgrid = 401, ...)
Arguments
x |
a vector of explanatory variable observations |
y |
a numeric vector of responses (binary in the first example below)
h |
bandwidth |
d |
degree, can be either 0 or 1 |
xgrid |
gridpoints on x-axis where monotonicity constraint is enforced |
numgrid |
number of equally-spaced gridpoints (if xgrid not specified) |
... |
other arguments for locpoly |
Details
Data are perturbed the smallest possible L2 distance subject to the constraint that the local linear estimate is monotonically increasing.
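In symbols, the perturbation described above can be written as a constrained least-squares problem. The notation is introduced here for illustration only: y_i^* denotes the sharpened responses, t_1, ..., t_m the grid points where the constraint is enforced, and A the matrix of derivative coefficients of the local estimate.

```latex
\min_{y^{*} \in \mathbb{R}^{n}} \; \sum_{i=1}^{n} \left( y_i^{*} - y_i \right)^{2}
\quad \text{subject to} \quad
\hat{g}'(t_j; y^{*}) = \sum_{i=1}^{n} A_{ji}\, y_i^{*} \;\ge\; 0,
\qquad j = 1, \dots, m .
```

Because the local polynomial estimate is linear in the responses, the constraints are linear in y^* and the objective is quadratic, so this is a quadratic program. That is consistent with the package's dependence on quadprog, although the exact formulation used internally is not stated here.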
Value
x |
locations of function estimate evaluations |
y |
function estimate evaluations (sharpened - monotonized) |
ysharp |
sharpened responses |
Author(s)
W.J. Braun
References
Braun, W.J. and Hall, P. (2001) Data sharpening for nonparametric estimation subject to constraints. Journal of Computational and Graphical Statistics.
Examples
gridpts <- seq(1, 10, length=101)
x <- seq(1, 10, length=51)
p <- exp(-1 + .2*x)/(1 + exp(-1 + .2*x))
y <- rbinom(51, 1, p)
plot(x, y)
lines(Monolpoly(x, y, h=0.6, xgrid=gridpts))
##
plot(faithful)
with(faithful,
lines(Monolpoly(eruptions, waiting, h=0.1, d=1,
range=c(1.55,5.15))))
Firebrand Burning Properties
Description
The burnRate data frame contains laboratory data on the proportion of fuel remaining in a piece of wood after it has burned for a fixed period of time at a fixed windspeed.
Usage
data(burnRate)
Format
This data frame contains the following columns:
- proportionBurned
a numeric vector
- densityRatio
ratio of windspeed, multiplied by density of air, to density of firebrand
- species
factor listing tree species
- diameter
numeric vector of diameter of burned particle in cm
- windspeed
windspeed in cm per second
- testTime
length of test in seconds
Source
Albini, F. USDA Forest Service General Technical Report INT-56, 1979.
Iterated Data Sharpening for Local Polynomial Regression
Description
Calculation of sharpened responses for bias reduction in function estimation, assuming a Gaussian kernel is used in bivariate scatterplot smoothing.
Usage
sharpiteration(x, y, deg, h, nsteps, na.rm, ...)
Arguments
x |
a numeric vector containing the predictor variable values. |
y |
a numeric vector containing the response variable values. |
deg |
a numeric vector containing the local polynomial degree used. |
h |
a numeric vector containing the (scalar) bandwidth. |
nsteps |
a numeric vector containing the number of iteration steps. |
na.rm |
a logical value indicating whether to remove missing values from fitted vectors |
... |
additional arguments to locpoly |
Value
a list with elements containing the sharpened (i.e. perturbed) response values, ready for input into a local polynomial regression estimator. The ith list element corresponds to i steps of data sharpening.
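The structure of this return value can be illustrated with a small base-R sketch of one common iterated-residual recursion, in which each step adds back the residual of the current fit. Whether sharpiteration follows exactly this recursion is an assumption. The helper name sharpen_iter is hypothetical, and a local constant smoother is used instead of locpoly to keep the example self-contained.

```r
# Hypothetical helper illustrating an iterated-residual recursion with a
# local constant (Nadaraya-Watson) smoother; not the package's own code.
sharpen_iter <- function(x, y, h, nsteps) {
  K <- dnorm(outer(x, x, "-") / h)
  S <- K / rowSums(K)                # smoother matrix: fitted = S %*% y
  out  <- vector("list", nsteps)
  ycur <- y
  for (k in seq_len(nsteps)) {
    # add back the residual of the current fit relative to the raw data
    ycur <- ycur + (y - as.vector(S %*% ycur))
    out[[k]] <- ycur                 # kth element: k steps of sharpening
  }
  out
}

# Simulated data (assumption: any bivariate sample would do)
set.seed(2)
x  <- seq(0, 1, length.out = 40)
y  <- x^2 + rnorm(40, sd = 0.05)
ys <- sharpen_iter(x, y, h = 0.1, nsteps = 3)
```

Under this recursion the first element equals 2y minus the fitted values, and ys[[k]] would be passed to the smoother in place of y to obtain the k-step estimate.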
Author(s)
W.J. Braun
See Also
locpoly
Examples
speed <- MPG[, 1]
mpg <- MPG[, 2]
h <- dpill(speed, mpg)
mpgSharp <- sharpiteration(speed, mpg, 1, h, 2)
mpg.lS <- locpoly(speed, mpgSharp[[2]], bandwidth=h, degree=1)
mpg.lX <- locpoly(speed, mpg, bandwidth=h, degree=1)
plot(mpg ~ speed)
lines(mpg.lX) # unsharpened function estimation
lines(mpg.lS, col=2, lty=2) # sharpened function estimation (2 steps)
Whale data
Description
Nursing times for a baby beluga whale.
Usage
data(whale)
Format
A data frame with 228 observations on the following 3 variables.
- V1
a numeric vector
- V2
a numeric vector
- V3
a factor with a large number of levels
Source
Simonoff, J. Smoothing Methods in Statistics, Springer, 1996.