This long and incomplete vignette
is an attempt to catalog and illustrate the various capabilities in the
R statistical computing system to perform
differentiation. There are many traps and pitfalls for the unwary in
doing this. It is hoped that by collecting examples in this long
treatment will serve to record these and show how to avoid them, and how
to reliably compute the derivatives desired. Derivative capabilities of
R are in the base system (essentially the functions
D()
and deriv()
) and in different packages,
namely nlsr
(or more recently nlsr
),
Deriv
, and Ryacas
. General tools for
approximations to derivatives are found in the package
numDeriv
as well as optimx
. Other
approximations may be embedded in various packages, but not necessarily
exported for use in scripts or packages.
As a way of recording where attention is needed either to this document or to the functions and methods described, I have put double question marks in various places. Contributions and collaborations to extend and complete the treatment are welcome.
Note: To distinguish output results (which are prefaced
‘##
’ by knitr, I have attempted to put
comments in the R code with the preface
‘#-
’.)
R has a number of tools for finding analytic derivatives.
stats: tools D()
and
deriv()
(R Development Core Team
(2008))
nlsr: tools nlsDeriv()
,
fnDeriv()
, and the interface model2rjfun
(John C Nash and Duncan Murdoch
(2019))
Deriv: tools Deriv()
(Clausen and Sokol (2018))
Ryacas: tools ?? (Goedman et al. (2019))
In 2018, Changcheng Li conducted a Google Summer of Code project
to link R to Julia’s Automatic Differentiation tools, resulting in the
experimental package autodiffr
(see https://github.com/Non-Contradiction/autodiffr).
This is an overview section to give an idea of the capabilities. It is not intended to be exhaustive, but to give pointers to how the tools can be used quickly.
An important issue that may cause a lot of difficulty is the iterating of the tools. That is, we compute a derivative, then want to apply a tool to the derivative to get a second derivative. In doing so, we need to be careful that the type (class??) of the quantity output by the tool is passed back into the tool in a form that will generate a derivative expression. Some examples are presented.
We also note that the Deriv package will give a
result in cases when the input is undefined. This is clearly a bug.
There is an example below on the section for Deriv
.
D()
, deriv()
and deriv3()
: As
deriv3()
is stated to be the same as deriv()
but with argument hessian=TRUE
, we will for now only
consider the first two.
## expression({
## .value <- x^2
## .grad <- array(0, c(length(.value), 1L), list(NULL, c("x")))
## .grad[, "x"] <- 2 * x
## attr(.value, "gradient") <- .grad
## .value
## })
## [1] "expression"
## expression({ .value <- x^2 .grad <- array(0, c(length(.value), 1L), list(NULL, c("x"))) .grad[, "x"] <- 2 * x | __truncated__
## [1] 1 0 1 4
## attr(,"gradient")
## x
## [1,] -2
## [2,] 0
## [3,] 2
## [4,] 4
# the "gradient" attribute.
# Note that we cannot (easily) differentiate this again.
firstd <- attr(dx2x,"gradient")
str
## function (object, ...)
## UseMethod("str")
## <bytecode: 0x563247bf3010>
## <environment: namespace:utils>
## Error in deriv.default(firstd, "x") :
## invalid expression in 'FindSubexprs'
## 'try-error' chr "Error in deriv.default(firstd, \"x\") : \n invalid expression in 'FindSubexprs'\n"
## - attr(*, "condition")=List of 2
## ..$ message: chr "invalid expression in 'FindSubexprs'"
## ..$ call : language deriv.default(firstd, "x")
## ..- attr(*, "class")= chr [1:3] "simpleError" "error" "condition"
## [1] 1
## attr(,"gradient")
## x
## [1,] 2
## [1] 10.304
## attr(,"gradient")
## x
## [1,] 6.42
## [1] 1 4 9 16 25
## attr(,"gradient")
## x
## [1,] 2
## [2,] 4
## [3,] 6
## [4,] 8
## [5,] 10
## 2 * x
## [1] -2 0 2 4
## [1] 2
## [1] 2
#- Note how we handle an expression stored in a string via parse(text= ))
sx2 <- "x^2"
sDx2x <- D(parse(text=sx2), "x")
sDx2x
## 2 * x
#- But watch out! The following "seems" to work, but the answer is not as intended.
#- The problem is that the first argument is evaluated before being used. Since
#- x exists, it fails
x
## [1] -1 0 1 2
## [1] 0
## [1] 0
## -(cos(cos(x + y^2)) * sin(x + y^2))
## [1] TRUE
## expression({
## .expr2 <- x + y^2
## .expr3 <- cos(.expr2)
## .expr5 <- cos(.expr3)
## .expr6 <- sin(.expr2)
## .value <- sin(.expr3)
## .grad <- array(0, c(length(.value), 2L), list(NULL, c("x",
## "y")))
## .grad[, "x"] <- -(.expr5 * .expr6)
## .grad[, "y"] <- -(.expr5 * (.expr6 * (2 * y)))
## attr(.value, "gradient") <- .grad
## .value
## })
## [1] 0.84147 0.51440 -0.40424 -0.83602
## attr(,"gradient")
## x y
## [1,] 0.000000 0.00000
## [2,] -0.721606 -1.44321
## [3,] -0.831692 -1.66338
## [4,] -0.077432 -0.15486
## [1] 0.000000 -0.721606 -0.831692 -0.077432
## function (x, y)
## {
## .expr1 <- cos(x)
## .expr2 <- .expr1 * y
## .expr4 <- cos(.expr2)
## .value <- sin(.expr2)
## .grad <- array(0, c(length(.value), 2L), list(NULL, c("x",
## "y")))
## .grad[, "x"] <- -(.expr4 * (sin(x) * y))
## .grad[, "y"] <- .expr4 * .expr1
## attr(.value, "gradient") <- .grad
## .value
## }
#- ??#- Surely there is an error, since documentation says no lhs! i.e.,
#- "expr: a 'expression' or 'call' or (except 'D') a formula with no lhs."
#- function with defaulted arguments:
(fx <- deriv(y ~ b0 + b1 * 2^(-x/th), c("b0", "b1", "th"),
function(b0, b1, th, x = 1:7){} ) )
## function (b0, b1, th, x = 1:7)
## {
## .expr3 <- 2^(-x/th)
## .value <- b0 + b1 * .expr3
## .grad <- array(0, c(length(.value), 3L), list(NULL, c("b0",
## "b1", "th")))
## .grad[, "b0"] <- 1
## .grad[, "b1"] <- .expr3
## .grad[, "th"] <- b1 * (.expr3 * (log(2) * (x/th^2)))
## attr(.value, "gradient") <- .grad
## .value
## }
## [1] 4.5227 4.1213 3.7838 3.5000 3.2613 3.0607 2.8919
## attr(,"gradient")
## b0 b1 th
## [1,] 1 0.84090 0.10929
## [2,] 1 0.70711 0.18380
## [3,] 1 0.59460 0.23183
## [4,] 1 0.50000 0.25993
## [5,] 1 0.42045 0.27322
## [6,] 1 0.35355 0.27570
## [7,] 1 0.29730 0.27047
## 2 * x
#- stopifnot(D(as.name("x"), "x") == 1) #- A way of testing.
#- This works by coercing "x" to name/symbol, and derivative should be 1.
#- Would fail only if "x" cannot be so coerced. How could this happen??
#- Higher derivatives showing deriv3
myd3 <- deriv3(y ~ b0 + b1 * 2^(-x/th), c("b0", "b1", "th"),
c("b0", "b1", "th", "x") )
myd3(2,3,4, x=1:7)
## [1] 4.5227 4.1213 3.7838 3.5000 3.2613 3.0607 2.8919
## attr(,"gradient")
## b0 b1 th
## [1,] 1 0.84090 0.10929
## [2,] 1 0.70711 0.18380
## [3,] 1 0.59460 0.23183
## [4,] 1 0.50000 0.25993
## [5,] 1 0.42045 0.27322
## [6,] 1 0.35355 0.27570
## [7,] 1 0.29730 0.27047
## attr(,"hessian")
## , , b0
##
## b0 b1 th
## [1,] 0 0 0
## [2,] 0 0 0
## [3,] 0 0 0
## [4,] 0 0 0
## [5,] 0 0 0
## [6,] 0 0 0
## [7,] 0 0 0
##
## , , b1
##
## b0 b1 th
## [1,] 0 0 0.036429
## [2,] 0 0 0.061266
## [3,] 0 0 0.077278
## [4,] 0 0 0.086643
## [5,] 0 0 0.091073
## [6,] 0 0 0.091899
## [7,] 0 0 0.090157
##
## , , th
##
## b0 b1 th
## [1,] 0 0.036429 -0.049909
## [2,] 0 0.061266 -0.075974
## [3,] 0 0.077278 -0.085786
## [4,] 0 0.086643 -0.084923
## [5,] 0 0.091073 -0.077428
## [6,] 0 0.091899 -0.066187
## [7,] 0 0.090157 -0.053215
#- check against deriv()
myd3a <- deriv(y ~ b0 + b1 * 2^(-x/th), c("b0", "b1", "th"),
c("b0", "b1", "th", "x"), hessian=TRUE )
myd3a(2,3,4, x=1:7)
## [1] 4.5227 4.1213 3.7838 3.5000 3.2613 3.0607 2.8919
## attr(,"gradient")
## b0 b1 th
## [1,] 1 0.84090 0.10929
## [2,] 1 0.70711 0.18380
## [3,] 1 0.59460 0.23183
## [4,] 1 0.50000 0.25993
## [5,] 1 0.42045 0.27322
## [6,] 1 0.35355 0.27570
## [7,] 1 0.29730 0.27047
## attr(,"hessian")
## , , b0
##
## b0 b1 th
## [1,] 0 0 0
## [2,] 0 0 0
## [3,] 0 0 0
## [4,] 0 0 0
## [5,] 0 0 0
## [6,] 0 0 0
## [7,] 0 0 0
##
## , , b1
##
## b0 b1 th
## [1,] 0 0 0.036429
## [2,] 0 0 0.061266
## [3,] 0 0 0.077278
## [4,] 0 0 0.086643
## [5,] 0 0 0.091073
## [6,] 0 0 0.091899
## [7,] 0 0 0.090157
##
## , , th
##
## b0 b1 th
## [1,] 0 0.036429 -0.049909
## [2,] 0 0.061266 -0.075974
## [3,] 0 0.077278 -0.085786
## [4,] 0 0.086643 -0.084923
## [5,] 0 0.091073 -0.077428
## [6,] 0 0.091899 -0.066187
## [7,] 0 0.090157 -0.053215
## [1] TRUE
#- Higher derivatives:
DD <- function(expr, name, order = 1) {
if(order < 1) stop("'order' must be >= 1")
if(order == 1) D(expr, name)
else DD(D(expr, name), name, order - 1)
}
DD(expression(sin(x^2)), "x", 3)
## -(sin(x^2) * (2 * x) * 2 + ((cos(x^2) * (2 * x) * (2 * x) + sin(x^2) *
## 2) * (2 * x) + sin(x^2) * (2 * x) * 2))
## 2 * x
## [1] "call"
## language 2 * x
## [1] -2 0 2 4
## language 2 * x
## [1] 2
d2x2xnF <- nlsDeriv(firstdn, "x", do_substitute=FALSE)
d2x2xnF # in this case we get the same result
## [1] 2
## [1] 0
## [1] 0
## [1] 0
## [1] 0
## [1] 0
## 2 * x
## 2 * x
## [1] 0
#- Some things to check and explain??
#?? firstde <- quote(firstd)
#?? firstde
#?? firstde <- bquote(firstd)
#?? firstde
#?? nlsDeriv(firstde, "x")
d2 <- nlsDeriv(2 * x, "x")
str(d2)
## num 0
## [1] 0
#?? firstc <- as.call(firstd)
#?? nlsDeriv(firstc, "x")
#- Build a function from the expression
#?? fdx2xn<-function(x){eval(dx2xn)}
#?? fdx2xn(1)
#?? fdx2xn(3.21)
#?? fdx2xn(1:5)
The tool codeDeriv
returns an R expression to evaluate
the derivative efficiently. fnDeriv
wraps it in a function.
By default the arguments to the function are constructed from all
variables in the expression. In the example below this includes
x
.
## {
## .expr1 <- -x
## .expr2 <- .expr1/th
## .expr3 <- 2^.expr2
## .value <- b0 + b1 * 2^(.expr2)
## .grad <- array(0, c(length(.value), 3L), list(NULL, c("b0",
## "b1", "th")))
## .grad[, "b0"] <- 1
## .grad[, "b1"] <- .expr3
## .grad[, "th"] <- b1 * (.expr3 * 0.693147180559945 * -(.expr1/th^2))
## attr(.value, "gradient") <- .grad
## .value
## }
#- Include parameters as arguments
fj.1 <- fnDeriv(parse(text="b0 + b1 * 2^(-x/th)"), c("b0", "b1", "th"))
head(fj.1)
##
## 1 function (b0, b1, x, th)
## 2 {
## 3 .expr1 <- -x
## 4 .expr2 <- .expr1/th
## 5 .expr3 <- 2^.expr2
## 6 .value <- b0 + b1 * 2^(.expr2)
## [1] 2.1892
## attr(,"gradient")
## b0 b1 th
## [1,] 1 0.5946 0.15456
#- Get all parameters from the calling environment
fj.2 <- fnDeriv(parse(text="b0 + b1 * 2^(-x/th)"), c("b0", "b1", "th"),
args = character())
head(fj.2)
##
## 1 function ()
## 2 {
## 3 .expr1 <- -x
## 4 .expr2 <- .expr1/th
## 5 .expr3 <- 2^.expr2
## 6 .value <- b0 + b1 * 2^(.expr2)
## [1] 2.1892
## attr(,"gradient")
## b0 b1 th
## [1,] 1 0.5946 0.15456
#- Just use an expression
fje <- codeDeriv(parse(text="b0 + b1 * 2^(-x/th)"), c("b0", "b1", "th"))
eval(fje)
## [1] 2.1892
## attr(,"gradient")
## b0 b1 th
## [1,] 1 0.5946 0.15456
dx2xnf <- fnDeriv(~ x^2, "x") #- Use tilde
dx2xnf <- fnDeriv(expression(x^2), "x") #- or use expression()
dx2xnf
## function (x)
## {
## .value <- x^2
## .grad <- array(0, c(length(.value), 1L), list(NULL, "x"))
## .grad[, "x"] <- 2 * x
## attr(.value, "gradient") <- .grad
## .value
## }
## [1] "function"
## function (x)
x <- -1:2
#?? eval(dx2xnf) # This is evaluated at -1, 0, 1, 2, BUT result is returned directly,
#- NOT in "gradient" attribute
# Note that we cannot (easily) differentiate this again.
# firstd <- dx2xnf
# str(firstd)
# d2x2xnf <- try(nlsDeriv(firstd, "x")) #- this APPEARS to work, but WRONG answer
# str(d2x2xnf)
# d2x2xnf
# eval(d2x2xnf)
# dx2xnfh <- fnDeriv(expression(x^2), "x", hessian=TRUE) #- Try for second derivatives
# dx2xnfh
# mode(dx2xnfh)
# str(dx2xnfh)
# x <- -1
# eval(dx2xnfh) # This is evaluated at -1, 0, 1, 2, BUT result is returned directly,
The following examples are drawn from the example(Deriv)
contained in the Deriv
package.
## function (x)
## 2 * x
## function (x)
## 2 * x
## function (x)
## 2
## function (x, y)
## c(x = cos(x) * cos(y), y = -(sin(x) * sin(y)))
## x y
## 0.6471 0.1068
#- Should see
#- x y
#- [1,] 0.6471023 0.1068000
f2 <- Deriv(~ f(x, y^2), "y")
#- This has a tilde to render the 1st argument as a formula object
#- Also we are substituting in y^2 for y
f2 #- print it
## -(2 * (y * sin(x) * sin(y^2)))
## [1] "call"
## [1] "call"
## -(2 * (y * sin(x) * sin(y^2)))
## [1] -1 0 1 2
## [1] 1
## [1] -0.45465 0.00000 0.45465 0.49130
## [1] 1.4161 0.0000 -1.4161 -1.5303
f3 <- Deriv(quote(f(x, y^2)), c("x", "y"), cache.exp=FALSE) #- check cache.exp operation
#- Note that we need to quote or will get evaluation at current x, y values (if they exist)
f3 #- print it
## c(x = cos(x) * cos(y^2), y = -(2 * (y * sin(x) * sin(y^2))))
#- c(x = cos(x) * cos(y^2), y = -(2 * (y * sin(x) * sin(y^2))))
f3c <- Deriv(quote(f(x, y^2)), c("x", "y"), cache.exp=TRUE) #- check cache.exp operation
f3c #- print it
## {
## .e1 <- y^2
## c(x = cos(x) * cos(.e1), y = -(2 * (y * sin(x) * sin(.e1))))
## }
## x y
## 0.94808 0.32503
## x y
## 0.94808 0.32503
## x y
## -0.35317 2.54731
## x y
## 0.94808 0.32503
## expression(2 * (x * y * cos(x^2)))
#- should see
#- expression(2 * (x * y * cos(x^2)))
#- quoted string
Deriv("sin(x^2) * y", "x") # differentiate only by x
## [1] "2 * (x * y * cos(x^2))"
#- Should see
#- "2 * (x * y * cos(x^2))"
Deriv("sin(x^2) * y", cache.exp=FALSE) #- differentiate by all variables (here by x and y)
## [1] "c(x = 2 * (x * y * cos(x^2)), y = sin(x^2))"
#- Note that default is to differentiate by all variables.
#- Should see
#- "c(x = 2 * (x * y * cos(x^2)), y = sin(x^2))"
#- Compound function example (here abs(x) smoothed near 0)
#- Note that this introduces the possibilty of `if` statements in the code
#- BUT (JN) seems to give back quoted string, so we must parse.
fc <- function(x, h=0.1) if (abs(x) < h) 0.5*h*(x/h)**2 else abs(x)-0.5*h
efc1 <- Deriv("fc(x)", "x", cache.exp=FALSE)
#- "if (abs(x) < h) x/h else sign(x)"
#- A few checks on the results
efc1
## [1] "if (abs(x) < 0.1) 10 * x else sign(x)"
## function (x, h = 0.1)
## {
## eval(parse(text = efc1))
## }
## [1] 1
## [1] 0.01
## [1] -0.01
## [1] -1
## [1] 0.01
#- Example of a first argument that cannot be evaluated in the current environment:
try(suppressWarnings(rm("xx", "yy"))) #- Make sure there are no objects xx or yy
Deriv(~ xx^2+yy^2)
## c(xx = 2 * xx, yy = 2 * yy)
#- Should show
#- c(xx = 2 * xx, yy = 2 * yy)
#- ?? What is the meaning / purpose of this construct?
#- ?? Is following really AD?
#- Automatic differentiation (AD), note intermediate variable 'd' assignment
Deriv(~{d <- ((x-m)/s)^2; exp(-0.5*d)}, "x")
## {
## .e1 <- x - m
## -(exp(-(0.5 * (.e1/s)^2)) * .e1/s^2)
## }
# Note that the result we see does NOT match what follows in the example(Deriv) (JN ??)
#{
# d <- ((x - m)/s)^2
# .d_x <- 2 * ((x - m)/s^2)
# -(0.5 * (.d_x * exp(-(0.5 * d))))
#}
#- For some reason the intermediate variable d is NOT included.??
#- Custom derivative rule. Note that this needs explanations??
myfun <- function(x, y=TRUE) NULL #- do something useful
dmyfun <- function(x, y=TRUE) NULL #- myfun derivative by x.
drule[["myfun"]] <- alist(x=dmyfun(x, y), y=NULL) #- y is just a logical
Deriv(myfun(z^2, FALSE), "z")
## NULL
# 2 * (z * dmyfun(z^2, FALSE))
#- Differentiation by list components
theta <- list(m=0.1, sd=2.) #- Why do we set values??
x <- names(theta) #- and why these particular names??
names(x)=rep("theta", length(theta))
Deriv(~exp(-(x-theta$m)**2/(2*theta$sd)), x, cache.exp=FALSE)
## c(theta_m = exp(-((x - theta$m)^2/(2 * theta$sd))) * (x - theta$m)/theta$sd,
## theta_sd = 2 * (exp(-((x - theta$m)^2/(2 * theta$sd))) *
## (x - theta$m)^2/(2 * theta$sd)^2))
#- Should show the following (but why??)
#- c(theta_m = exp(-((x - theta$m)^2/(2 * theta$sd))) *
#- (x - theta$m)/theta$sd, theta_sd = 2 * (exp(-((x - theta$m)^2/
#- (2 * theta$sd))) * (x - theta$m)^2/(2 * theta$sd)^2))
lderiv <- Deriv(~exp(-(x-theta$m)**2/(2*theta$sd)), x, cache.exp=FALSE)
fld <- function(x){ eval(lderiv)} #- put this in a function
fld(2) #- and evaluate at a value
## theta_m theta_sd
## 0.38528 0.18301
Deriv has some design choices that can get the user into trouble. The following example shows one such problem.
## [1] 1
## 2 * x
## 2 * x
By comparison, nlsr
## Error : object 'x' not found
## [1] 1
## Error : object 'x' not found
## 2 * x
## 2 * x
There is at least one other symbolic package for R. Here we look at
Ryacas. The structures for using yacas tools do not
seem at the time of writing (2016-10-21) to be suitable for working with
nonlinear least squares or optimization facilities of
R. Thus, for the moment, we will not pursue the
derivatives available in Ryacas
beyond the following
example provided by Gabor Grothendieck.
## cos(x + y)
## [1] "call"
detach("package:nlsr", unload=TRUE)
detach("package:Deriv", unload=TRUE)
## New Ryacas mechanism as of 2019-8-29 from [email protected] (Mikkel Meyer Andersen)
yac_str("D(x) Sin(x+y)")
## [1] "Cos(x+y)"
## expression(cos(x + y))
## expression(cos(x + y))
## [1] -1.837e-16
See specific notes either in comments or at the end of the section.
The help page for D
lists the functions for which
derivatives are known: “The internal code knows about the arithmetic
operators +
, -
, *
, /
and ^
, and the single-variable functions exp
,
log
, sin
, cos
, tan
,
sinh
, cosh
, sqrt
,
pnorm
, dnorm
, asin
,
acos
, atan
, gamma
,
lgamma
, digamma
and trigamma
, as
well as psigamma
for one or two arguments (but derivative
only with respect to the first).”
nlsr
This package supports the derivatives that D
supports,
as well as a few others, and users can add their own definitions. The
current list is
## [1] "(" "*" "+" "-" "/" "^"
## [7] "abs" "acos" "asin" "atan" "cos" "cosh"
## [13] "digamma" "dnorm" "exp" "gamma" "lgamma" "log"
## [19] "pnorm" "psigamma" "sign" "sin" "sinh" "sqrt"
## [25] "tan" "trigamma" "~"
Here is a slightly expanded testing of the elements of the
nlsr
derivatives table.
## Loading required package: nlsr
## [1] "call"
## 1/x
## [1] "expression"
## expression({
## .value <- log(x)
## .grad <- array(0, c(length(.value), 1L), list(NULL, c("x")))
## .grad[, "x"] <- 1/x
## attr(.value, "gradient") <- .grad
## .value
## })
## [1] "call"
## 1/x
## but
## Error in D("~ log(x)", "x") : expression must not be type 'character'
## Error in D(~log(x), "x") : Function '`~`' is not in the derivatives table
## ~log(x)
## [1] "formula"
## [1] "expression"
## Error in D(interme, "x") : Function '`~`' is not in the derivatives table
## Error in deriv.default(interme, "x") :
## Function '`~`' is not in the derivatives table
## expression({
## .value <- log(x)
## .grad <- array(0, c(length(.value), 1L), list(NULL, c("x")))
## .grad[, "x"] <- 1/x
## attr(.value, "gradient") <- .grad
## .value
## })
## 1/(x * 1.09861228866811)
## Error in D(expression(log(x, base = 3)), "x") :
## only single-argument calls to log() are supported;
## maybe use log(x,a) = log(x)/log(a)
## Error in deriv.formula(~log(x, base = 3), "x") :
## only single-argument calls to log() are supported;
## maybe use log(x,a) = log(x)/log(a)
## Error in deriv.default(expression(log(x, base = 3)), "x") :
## only single-argument calls to log() are supported;
## maybe use log(x,a) = log(x)/log(a)
## Error in deriv3.default(expression(log(x, base = 3)), "x") :
## only single-argument calls to log() are supported;
## maybe use log(x,a) = log(x)/log(a)
## function (x)
## {
## .value <- log(x, base = 3)
## .grad <- array(0, c(length(.value), 1L), list(NULL, "x"))
## .grad[, "x"] <- 1/(x * 1.09861228866811)
## attr(.value, "gradient") <- .grad
## .value
## }
## exp(x)
## exp(x)
## expression({
## .expr1 <- exp(x)
## .value <- .expr1
## .grad <- array(0, c(length(.value), 1L), list(NULL, c("x")))
## .grad[, "x"] <- .expr1
## attr(.value, "gradient") <- .grad
## .value
## })
## function (x)
## {
## .expr1 <- exp(x)
## .value <- .expr1
## .grad <- array(0, c(length(.value), 1L), list(NULL, "x"))
## .grad[, "x"] <- .expr1
## attr(.value, "gradient") <- .grad
## .value
## }
## cos(x)
## cos(x)
## expression({
## .value <- sin(x)
## .grad <- array(0, c(length(.value), 1L), list(NULL, c("x")))
## .grad[, "x"] <- cos(x)
## attr(.value, "gradient") <- .grad
## .value
## })
## function (x)
## {
## .value <- sin(x)
## .grad <- array(0, c(length(.value), 1L), list(NULL, "x"))
## .grad[, "x"] <- cos(x)
## attr(.value, "gradient") <- .grad
## .value
## }
## -sin(x)
## -sin(x)
## expression({
## .value <- cos(x)
## .grad <- array(0, c(length(.value), 1L), list(NULL, c("x")))
## .grad[, "x"] <- -sin(x)
## attr(.value, "gradient") <- .grad
## .value
## })
## function (x)
## {
## .value <- cos(x)
## .grad <- array(0, c(length(.value), 1L), list(NULL, "x"))
## .grad[, "x"] <- -sin(x)
## attr(.value, "gradient") <- .grad
## .value
## }
## 1/cos(x)^2
## 1/cos(x)^2
## expression({
## .value <- tan(x)
## .grad <- array(0, c(length(.value), 1L), list(NULL, c("x")))
## .grad[, "x"] <- 1/cos(x)^2
## attr(.value, "gradient") <- .grad
## .value
## })
## function (x)
## {
## .value <- tan(x)
## .grad <- array(0, c(length(.value), 1L), list(NULL, "x"))
## .grad[, "x"] <- 1/cos(x)^2
## attr(.value, "gradient") <- .grad
## .value
## }
## cosh(x)
## cosh(x)
## expression({
## .value <- sinh(x)
## .grad <- array(0, c(length(.value), 1L), list(NULL, c("x")))
## .grad[, "x"] <- cosh(x)
## attr(.value, "gradient") <- .grad
## .value
## })
## function (x)
## {
## .value <- sinh(x)
## .grad <- array(0, c(length(.value), 1L), list(NULL, "x"))
## .grad[, "x"] <- cosh(x)
## attr(.value, "gradient") <- .grad
## .value
## }
## sinh(x)
## sinh(x)
## expression({
## .value <- cosh(x)
## .grad <- array(0, c(length(.value), 1L), list(NULL, c("x")))
## .grad[, "x"] <- sinh(x)
## attr(.value, "gradient") <- .grad
## .value
## })
## function (x)
## {
## .value <- cosh(x)
## .grad <- array(0, c(length(.value), 1L), list(NULL, "x"))
## .grad[, "x"] <- sinh(x)
## attr(.value, "gradient") <- .grad
## .value
## }
## 0.5/sqrt(x)
## 0.5 * x^-0.5
## expression({
## .value <- sqrt(x)
## .grad <- array(0, c(length(.value), 1L), list(NULL, c("x")))
## .grad[, "x"] <- 0.5 * x^-0.5
## attr(.value, "gradient") <- .grad
## .value
## })
## function (x)
## {
## .expr1 <- sqrt(x)
## .value <- .expr1
## .grad <- array(0, c(length(.value), 1L), list(NULL, "x"))
## .grad[, "x"] <- 0.5/.expr1
## attr(.value, "gradient") <- .grad
## .value
## }
## dnorm(q)
## dnorm(q)
## expression({
## .value <- pnorm(q)
## .grad <- array(0, c(length(.value), 1L), list(NULL, c("q")))
## .grad[, "q"] <- dnorm(q)
## attr(.value, "gradient") <- .grad
## .value
## })
## function (q)
## {
## .value <- pnorm(q)
## .grad <- array(0, c(length(.value), 1L), list(NULL, "q"))
## .grad[, "q"] <- dnorm(q)
## attr(.value, "gradient") <- .grad
## .value
## }
## dnorm(x - mean) * (x - mean)
## [1] 0
## expression({
## .value <- dnorm(x, mean)
## .grad <- array(0, c(length(.value), 1L), list(NULL, c("mean")))
## .grad[, "mean"] <- 0
## attr(.value, "gradient") <- .grad
## .value
## })
## function (x, mean)
## {
## .expr1 <- x - mean
## .value <- dnorm(x, mean)
## .grad <- array(0, c(length(.value), 1L), list(NULL, "mean"))
## .grad[, "mean"] <- dnorm(.expr1) * .expr1
## attr(.value, "gradient") <- .grad
## .value
## }
## 1/sqrt(1 + x^2)
## 1/sqrt(1 - x^2)
## expression({
## .value <- asin(x)
## .grad <- array(0, c(length(.value), 1L), list(NULL, c("x")))
## .grad[, "x"] <- 1/sqrt(1 - x^2)
## attr(.value, "gradient") <- .grad
## .value
## })
## function (x)
## {
## .value <- asin(x)
## .grad <- array(0, c(length(.value), 1L), list(NULL, "x"))
## .grad[, "x"] <- 1/sqrt(1 + x^2)
## attr(.value, "gradient") <- .grad
## .value
## }
## -1/sqrt(1 + x^2)
## -(1/sqrt(1 - x^2))
## expression({
## .value <- acos(x)
## .grad <- array(0, c(length(.value), 1L), list(NULL, c("x")))
## .grad[, "x"] <- -(1/sqrt(1 - x^2))
## attr(.value, "gradient") <- .grad
## .value
## })
## function (x)
## {
## .value <- acos(x)
## .grad <- array(0, c(length(.value), 1L), list(NULL, "x"))
## .grad[, "x"] <- -1/sqrt(1 + x^2)
## attr(.value, "gradient") <- .grad
## .value
## }
## 1/(1 + x^2)
## 1/(1 + x^2)
## expression({
## .value <- atan(x)
## .grad <- array(0, c(length(.value), 1L), list(NULL, c("x")))
## .grad[, "x"] <- 1/(1 + x^2)
## attr(.value, "gradient") <- .grad
## .value
## })
## function (x)
## {
## .value <- atan(x)
## .grad <- array(0, c(length(.value), 1L), list(NULL, "x"))
## .grad[, "x"] <- 1/(1 + x^2)
## attr(.value, "gradient") <- .grad
## .value
## }
## gamma(x) * digamma(x)
## gamma(x) * digamma(x)
## expression({
## .expr1 <- gamma(x)
## .value <- .expr1
## .grad <- array(0, c(length(.value), 1L), list(NULL, c("x")))
## .grad[, "x"] <- .expr1 * digamma(x)
## attr(.value, "gradient") <- .grad
## .value
## })
## function (x)
## {
## .expr1 <- gamma(x)
## .value <- .expr1
## .grad <- array(0, c(length(.value), 1L), list(NULL, "x"))
## .grad[, "x"] <- .expr1 * digamma(x)
## attr(.value, "gradient") <- .grad
## .value
## }
## digamma(x)
## digamma(x)
## expression({
## .value <- lgamma(x)
## .grad <- array(0, c(length(.value), 1L), list(NULL, c("x")))
## .grad[, "x"] <- digamma(x)
## attr(.value, "gradient") <- .grad
## .value
## })
## function (x)
## {
## .value <- lgamma(x)
## .grad <- array(0, c(length(.value), 1L), list(NULL, "x"))
## .grad[, "x"] <- digamma(x)
## attr(.value, "gradient") <- .grad
## .value
## }
## trigamma(x)
## trigamma(x)
## expression({
## .value <- digamma(x)
## .grad <- array(0, c(length(.value), 1L), list(NULL, c("x")))
## .grad[, "x"] <- trigamma(x)
## attr(.value, "gradient") <- .grad
## .value
## })
## function (x)
## {
## .value <- digamma(x)
## .grad <- array(0, c(length(.value), 1L), list(NULL, "x"))
## .grad[, "x"] <- trigamma(x)
## attr(.value, "gradient") <- .grad
## .value
## }
## psigamma(x, 2L)
## psigamma(x, 2L)
## expression({
## .value <- trigamma(x)
## .grad <- array(0, c(length(.value), 1L), list(NULL, c("x")))
## .grad[, "x"] <- psigamma(x, 2L)
## attr(.value, "gradient") <- .grad
## .value
## })
## function (x)
## {
## .value <- trigamma(x)
## .grad <- array(0, c(length(.value), 1L), list(NULL, "x"))
## .grad[, "x"] <- psigamma(x, 2L)
## attr(.value, "gradient") <- .grad
## .value
## }
## psigamma(x, 6)
## psigamma(x, 6L)
## expression({
## .value <- psigamma(x, deriv = 5)
## .grad <- array(0, c(length(.value), 1L), list(NULL, c("x")))
## .grad[, "x"] <- psigamma(x, 6L)
## attr(.value, "gradient") <- .grad
## .value
## })
## function (x)
## {
## .value <- psigamma(x, deriv = 5)
## .grad <- array(0, c(length(.value), 1L), list(NULL, "x"))
## .grad[, "x"] <- psigamma(x, 6)
## attr(.value, "gradient") <- .grad
## .value
## }
## y
## y
## expression({
## .value <- x * y
## .grad <- array(0, c(length(.value), 1L), list(NULL, c("x")))
## .grad[, "x"] <- y
## attr(.value, "gradient") <- .grad
## .value
## })
## function (x, y)
## {
## .value <- x * y
## .grad <- array(0, c(length(.value), 1L), list(NULL, "x"))
## .grad[, "x"] <- y
## attr(.value, "gradient") <- .grad
## .value
## }
## 1/y
## 1/y
## expression({
## .value <- x/y
## .grad <- array(0, c(length(.value), 1L), list(NULL, c("x")))
## .grad[, "x"] <- 1/y
## attr(.value, "gradient") <- .grad
## .value
## })
## function (x, y)
## {
## .value <- x/y
## .grad <- array(0, c(length(.value), 1L), list(NULL, "x"))
## .grad[, "x"] <- 1/y
## attr(.value, "gradient") <- .grad
## .value
## }
## y * x^(y - 1)
## x^(y - 1) * y
## expression({
## .value <- x^y
## .grad <- array(0, c(length(.value), 1L), list(NULL, c("x")))
## .grad[, "x"] <- x^(y - 1) * y
## attr(.value, "gradient") <- .grad
## .value
## })
## function (x, y)
## {
## .value <- x^y
## .grad <- array(0, c(length(.value), 1L), list(NULL, "x"))
## .grad[, "x"] <- y * x^(y - 1)
## attr(.value, "gradient") <- .grad
## .value
## }
## [1] 1
## [1] 1
## expression({
## .value <- (x)
## .grad <- array(0, c(length(.value), 1L), list(NULL, c("x")))
## .grad[, "x"] <- 1
## attr(.value, "gradient") <- .grad
## .value
## })
## function (x)
## {
## .value <- (x)
## .grad <- array(0, c(length(.value), 1L), list(NULL, "x"))
## .grad[, "x"] <- 1
## attr(.value, "gradient") <- .grad
## .value
## }
## [1] 1
## [1] 1
## expression({
## .value <- +x
## .grad <- array(0, c(length(.value), 1L), list(NULL, c("x")))
## .grad[, "x"] <- 1
## attr(.value, "gradient") <- .grad
## .value
## })
## function (x)
## {
## .value <- +x
## .grad <- array(0, c(length(.value), 1L), list(NULL, "x"))
## .grad[, "x"] <- 1
## attr(.value, "gradient") <- .grad
## .value
## }
## [1] -1
## -1
## expression({
## .value <- -x
## .grad <- array(0, c(length(.value), 1L), list(NULL, c("x")))
## .grad[, "x"] <- -1
## attr(.value, "gradient") <- .grad
## .value
## })
## function (x)
## {
## .value <- -x
## .grad <- array(0, c(length(.value), 1L), list(NULL, "x"))
## .grad[, "x"] <- -1
## attr(.value, "gradient") <- .grad
## .value
## }
## sign(x)
## Error in D(expression(abs(x)), "x") :
## Function 'abs' is not in the derivatives table
## Error in deriv.formula(~abs(x), "x") :
## Function 'abs' is not in the derivatives table
## function (x)
## {
## .value <- abs(x)
## .grad <- array(0, c(length(.value), 1L), list(NULL, "x"))
## .grad[, "x"] <- sign(x)
## attr(.value, "gradient") <- .grad
## .value
## }
## [1] 0
## Error in D(expression(sign(x)), "x") :
## Function 'sign' is not in the derivatives table
## Error in deriv.formula(~sign(x), "x") :
## Function 'sign' is not in the derivatives table
## function (x)
## {
## .value <- sign(x)
## .grad <- array(0, c(length(.value), 1L), list(NULL, "x"))
## .grad[, "x"] <- 0
## attr(.value, "gradient") <- .grad
## .value
## }
the base tool deriv
(and deriv3
) and
nlsr::codeDeriv
are intended to output an expression to
compute a derivative. deriv
generates an expression object,
while codeDeriv
will generate a language object. Note that
input to deriv
is of the form of a tilde expression with no
left hand side, while codeDeriv
is more flexible: quoted
expressions, or length-1 expression vectors may also be used.
the base tool D
and nlsr::nlsDeriv
generate expressions, but D
requires an expression, while
nlsDeriv
can handle the expression without a wrapper. ?? Do
we need to discuss more??
nlsr includes abs(x)
and
sign(x)
in the derivatives table despite conventional
wisdom that these are not differentiable. However, abs(x)
clearly has a defined derivative everywhere except at x = 0, where
assigning a value of 0 to the derivative is almost certainly acceptable
in computations. Similarly for sign(x)
.
nlsr also includes some tools for simplification of algebraic expressions, extensible by the user. Currently these involve the following functions:
## [1] "!" "&&" "(" "*" "+" "-" "/"
## [8] "^" "exp" "if" "log" "missing" "||"
#- Remove ##? to see reproducible error
#- ?? For some reason, if we leave packages attached, we get errors.
#- Here we detach all the non-base packages and then reload nlsr
##? require(nlsr)
##? sessionInfo()
##? ##? nlsSimplify(quote(+(a+b)))
##? nlsSimplify(quote(-5))
## R version 4.4.2 (2024-10-31)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.1 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: Etc/UTC
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] nlsr_2024.01.29 microbenchmark_1.5.0 minpack.lm_1.2-4
## [4] ggplot2_3.5.1 nlraa_1.9.7 optimx_2024-11.03
## [7] rmarkdown_2.29
##
## loaded via a namespace (and not attached):
## [1] sass_0.4.9 utf8_1.2.4 lattice_0.22-6
## [4] pracma_2.4.4 digest_0.6.37 magrittr_2.0.3
## [7] evaluate_1.0.1 grid_4.4.2 fastmap_1.2.0
## [10] jsonlite_1.8.9 Matrix_1.7-1 mgcv_1.9-1
## [13] fansi_1.0.6 scales_1.3.0 numDeriv_2016.8-1.1
## [16] jquerylib_0.1.4 cli_3.6.3 rlang_1.1.4
## [19] munsell_0.5.1 splines_4.4.2 withr_3.0.2
## [22] cachem_1.1.0 yaml_2.3.10 tools_4.4.2
## [25] nloptr_2.1.1 colorspace_2.1-1 boot_1.3-31
## [28] buildtools_1.0.0 vctrs_0.6.5 R6_2.5.1
## [31] lifecycle_1.0.4 MASS_7.3-61 pkgconfig_2.0.3
## [34] bslib_0.8.0 pillar_1.9.0 gtable_0.3.6
## [37] glue_1.8.0 Rcpp_1.0.13-1 xfun_0.49
## [40] tibble_3.2.1 sys_3.4.3 knitr_1.49
## [43] htmltools_0.5.8.1 nlme_3.1-166 maketools_1.3.1
## [46] compiler_4.4.2
if ("Deriv" %in% loadedNamespaces()){detach("package:Deriv", unload=TRUE)}
if ("nlsr" %in% loadedNamespaces() ){detach("package:nlsr", unload=TRUE)}
if ("Ryacas" %in% loadedNamespaces() ){detach("package:Ryacas", unload=TRUE)}
require(nlsr)
## Loading required package: nlsr
#- require(Deriv)
#- require(stats)
#- Various simplifications
#- ?? Do we need quote() to stop attempt to evaluate before applying simplification
nlsSimplify(quote(+(a+b)))
## a + b
## [1] -5
## a + b
## a + b
## [1] 2.7183
## a + b
## [1] 0
## [1] FALSE
## [1] TRUE
## a + b
## a + b
## a + b
## 2 * (a + b)
## [1] 5
## a + b
## -a - b
## [1] 0
## [1] 2
## [1] 0
## [1] 0
## a + b
## a + b
## -(a + b)
## -(a + b)
## [1] 10
## a + b
## -(a + b)
## [1] 0
## [1] 0.33333
## a + b
## [1] 1024
## a/1.09861228866811
## [1] FALSE
## a
## b
## [1] TRUE
## b
## a
## a + b
## NULL
## a + b
## a * b
## a + b
## a + b
## a + b
Deriv
?? To be added
#- For some reason, if we leave packages attached, we get errors.
#- Here we detach all the non-base packages and then reload nlsr
sessionInfo()
## R version 4.4.2 (2024-10-31)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.1 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: Etc/UTC
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] nlsr_2024.01.29 microbenchmark_1.5.0 minpack.lm_1.2-4
## [4] ggplot2_3.5.1 nlraa_1.9.7 optimx_2024-11.03
## [7] rmarkdown_2.29
##
## loaded via a namespace (and not attached):
## [1] sass_0.4.9 utf8_1.2.4 lattice_0.22-6
## [4] pracma_2.4.4 digest_0.6.37 magrittr_2.0.3
## [7] evaluate_1.0.1 grid_4.4.2 fastmap_1.2.0
## [10] jsonlite_1.8.9 Matrix_1.7-1 mgcv_1.9-1
## [13] fansi_1.0.6 scales_1.3.0 numDeriv_2016.8-1.1
## [16] jquerylib_0.1.4 cli_3.6.3 rlang_1.1.4
## [19] munsell_0.5.1 splines_4.4.2 withr_3.0.2
## [22] cachem_1.1.0 yaml_2.3.10 tools_4.4.2
## [25] nloptr_2.1.1 colorspace_2.1-1 boot_1.3-31
## [28] buildtools_1.0.0 vctrs_0.6.5 R6_2.5.1
## [31] lifecycle_1.0.4 MASS_7.3-61 pkgconfig_2.0.3
## [34] bslib_0.8.0 pillar_1.9.0 gtable_0.3.6
## [37] glue_1.8.0 Rcpp_1.0.13-1 xfun_0.49
## [40] tibble_3.2.1 sys_3.4.3 knitr_1.49
## [43] htmltools_0.5.8.1 nlme_3.1-166 maketools_1.3.1
## [46] compiler_4.4.2
if ("Deriv" %in% loadedNamespaces()){detach("package:Deriv", unload=TRUE)}
if ("Deriv" %in% loadedNamespaces() ){detach("package:nlsr", unload=TRUE)}
if ("Deriv" %in% loadedNamespaces() ){detach("package:Ryacas", unload=TRUE)}
require(Deriv)
## Loading required package: Deriv
##
## Attaching package: 'Deriv'
## The following object is masked _by_ '.GlobalEnv':
##
## drule
#- Various simplifications
#- ?? Do we need quote() to stop attempt to evaluate before applying simplification?
Simplify(quote(+(a+b)))
## a + b
## [1] -5
## a + b
## exp(log(a + b))
## [1] 2.7183
## a + b
## [1] 0
## [1] FALSE
## [1] TRUE
## a + b
## a + b
## a + b
## 2 * (a + b)
## [1] 5
## a + b
## -(a + b)
## [1] 0
## [1] 2
## [1] 0
## [1] 0
## a + b
## a + b
## -(a + b)
## -(a + b)
## [1] 10
## a + b
## -(a + b)
## [1] 0
## [1] 0.33333
## a + b
## [1] 1024
## a/1.09861228866811
## [1] FALSE
## a
## b
## [1] TRUE
## b
## a
## a + b
## if (FALSE) a + b
## a + b
## a * b
## a + b
## a + b
## a + b
?? to be added
?? to be added
?? to be added
?? to be added
?? need to explain where Deriv package comes from
One of the key tasks with tools for derivatives is that of taking objects in one or other form (that is, R class) and using it as an input for a symbolic function. The object may, of course, be an output from another such function, and this is one of the reasons we need to do such transformations.
We also note that the different tools for symbolic derivatives use slightly different inputs. For example, for the derivative of log(x), we have
## language 1/x
## 1/x
Unfortunately, there are complications when we have an expression object, and we need to specify that we do NOT execute the substitute() function. Here we show how to do this implicitly and with an explicit object.
## language 1/x
## 1/x
## / 1 x
## language 1/x
## 1/x
## language 1/x
## 1/x
## expression(1/x)
## expression(1/x)
## 1/x
## Class 'formula' language ~ddlogx
## ..- attr(*, ".Environment")=<environment: R_GlobalEnv>
?? do each example by all methods and by numDeriv and put in dataframe for later presentation in a table.
Do we want examples in columns or rows. Probably 1 fn per row and work out a name for the row that is reasonably meaningful. Probably want an index column as well that is a list of strings. Can we then act on those strings to automate the whole setup?
require(nlsr)
# require(stats)
# require(Deriv)
# require(Ryacas)
# Various derivatives
new <- codeDeriv(quote(1 + x + y), c("x", "y"))
old <- deriv(quote(1 + x + y), c("x", "y"))
print(new)
## {
## .value <- 1 + x + y
## .grad <- array(0, c(length(.value), 2L), list(NULL, c("x",
## "y")))
## .grad[, "x"] <- 1
## .grad[, "y"] <- 1
## attr(.value, "gradient") <- .grad
## .value
## }
## [1] "{"
## language { .value <- 1 + x + y; .grad <- array(0, c(length(.value), 2L), list(NULL, c("x", "y"))); .grad[, "x"] <- 1; .gr| __truncated__
## expression({
## .value <- 1 + x + y
## .grad <- array(0, c(length(.value), 2L), list(NULL, c("x",
## "y")))
## .grad[, "x"] <- 1
## .grad[, "y"] <- 1
## attr(.value, "gradient") <- .grad
## .value
## })
## [1] 9
## attr(,"gradient")
## x y
## [1,] 1 1
## expression({
## .value <- 1 + x + y
## .grad <- array(0, c(length(.value), 2L), list(NULL, c("x",
## "y")))
## .grad[, "x"] <- 1
## .grad[, "y"] <- 1
## attr(.value, "gradient") <- .grad
## .value
## })
## [1] "expression"
## expression({ .value <- 1 + x + y .grad <- array(0, c(length(.value), 2L), list(NULL, c("x", "y"))) .grad[, "x"| __truncated__
## [1] 9
## attr(,"gradient")
## x y
## [1,] 1 1
Unfortunately, the inputs and outputs are not always easily transformed so that the symbolic derivatives can be found. (?? Need to codify this and provide filters so we can get things to work nicely.)
As an example, how could we take object new and embed it in a function we can then use in R? We can certainly copy and paste the output into a function template, as follows,
fnfromnew <- function(x,y){
.value <- 1 + x + y
.grad <- array(0, c(length(.value), 2L), list(NULL, c("x",
"y")))
.grad[, "x"] <- 1
.grad[, "y"] <- 1
attr(.value, "gradient") <- .grad
.value
}
print(fnfromnew(3,5))
## [1] 9
## attr(,"gradient")
## x y
## [1,] 1 1
However, we would ideally like to be able to automate this to generate functions and gradients for nonlinear least squares and optimization calculations. The same criticism applies to the object old
If we have x and y set such that the function is not admissible, then both our old and new functions give a gradient that is seemingly reasonable. While the gradient of this simple function could be considered to be defined for ANY values of x and y, I (JN) am sure most users would wish for a warning at the very least in such cases.
## [1] NA
## attr(,"gradient")
## x y
## [1,] 1 1
## [1] NA
## attr(,"gradient")
## x y
## [1,] 1 1
####SafeD
We could define a way to avoid the issue of character vs. expression (and possibly other classes) as follows:
safeD <- function(obj, var) {
# safeguarded D() function for symbolic derivs
if (! is.character(var) ) stop("The variable var MUST be character type")
if (is.character(obj) ) {
eobj <- parse(text=obj)
result <- D(eobj, var)
} else {
result <- D(obj, var)
}
}
lxy2 <- expression(log(x+y^2))
clxy2 <- "log(x+y^2)"
try(print(D(clxy2, "y")))
## Error in D(clxy2, "y") : expression must not be type 'character'
## 2 * y/(x + y^2)
## 2 * y/(x + y^2)
## 2 * y/(x + y^2)
Erin Hodgess on R-help in January 2015 raised the issue of taking the
derivative of an expression that contains an indexed variable. We show
the example and its resolution, then give an explanation. Note that
indexed parameters in nonlinear regression models is an open issue for
package nlsr
, largely because there does not seem to be an
agreed mechanism to specify models with indexed parameters.
## Error in deriv.default(zzz, c("r1", "r2")) :
## Function '`[`' is not in the derivatives table
## Error in nlsDeriv(expr[[2]], name, derivEnv, do_substitute = FALSE, verbose = verbose, :
## no derivative known for '['
## Error in nlsDeriv(expr[[2]], name, derivEnv, do_substitute = FALSE, verbose = verbose, :
## no derivative known for '['
## y[3] * c(1, 0) + stop("no derivative when indexing") * r1 + c(0,
## 1)
## function (y, r1, r2)
## {
## .expr1 <- y[3]
## .expr2 <- stop("no derivative when indexing")
## .expr3 <- .expr2 * r1
## .value <- .expr1 * r1 + r2
## .grad <- array(0, c(length(.value), 2L), list(NULL, c("r1",
## "r2")))
## .grad[, "r1"] <- .expr1 + .expr3
## .grad[, "r2"] <- .expr3 + 1
## attr(.value, "gradient") <- .grad
## .value
## }
Richard Heiberger pointed out that internally, R stores
y[3]
as
"["(y,3)
that is, as a function. Duncan Murdoch pointed out the availability of nlsr and the use of newDeriv() to redefine the “[” function for the purposes of derivatives.
This is not an ideal resolution, especially as we would like to be able to get the gradients of functions with respect to vectors of parameters, noted also by Sergei Sokol in the manual for package Deriv. The following examples illustrate this.
## stop("no derivative when indexing") * r1
## Error : object 'y3' not found
## Error : object 'r1' not found