R math derivative   

R math derivative

R中求导与求极值

25 Sep 2016

Go back r-math-derivative

这里记录一下怎么用R来求导的过程,但是这里我侧重的是用R直接来进行求导的技术

使用的函数:在R的基本包stats中三个基本的函数可以用来进行求导

D (expr, name)
deriv(expr, ...)
deriv3(expr, ...)

D is modelled after its S namesake for taking simple symbolic derivatives.

D()是直接做符号运算的。

deriv is a generic function with a default and a formula method. It returns a call for computing the expr and its (partial) derivatives, simultaneously. It uses so-called algorithmic derivatives. If function.arg is a function, its arguments can have default values, see the fx example below.

Arguments

Arguments function
expr A expression or call or (except D) a formula with no lhs
name,namevec character vector, giving the variable names (only one for D()) with respect to which derivatives will be computed.
function.arg If specified and non-NULL, a character vector of arguments for a function return, or a function (with empty body) or TRUE, the latter indicating that a function with argument names namevec should be used.
tag character; the prefix to be used for the locally created variables in result.
hessian a logical value indicating whether the second derivatives should be calculated and incorporated in the return value.

教程1

摘自: http://blog.fens.me/r-math-derivative/

1.根据实例演示

dx <- deriv(y ~ x^3, "x") ; dx # 生成导数公式
## expression({
##     .value <- x^3
##     .grad <- array(0, c(length(.value), 1L), list(NULL, c("x")))
##     .grad[, "x"] <- 3 * x^2
##     attr(.value, "gradient") <- .grad
##     .value
## })
mode(dx) # 查看dx 变量类型
## [1] "expression"
x<-1:2 # 给自变量x 赋值
eval(dx) # 运行求导计算
## [1] 1 8
## attr(,"gradient")
##       x
## [1,]  3
## [2,] 12
# 原函数的计算结果和使用梯度下降法,导函数的计算结果
# x=1,dx=3*1^2=3
# x=2,dx=3*2^2=12

我们使用deriv(expr, name)函数时通常要传2 个参数,第一参数expr 就是原函数公式,用~号 来分隔公式的两边,第二参数name 用于指定函数的自变量。deriv()函数会返回一个表达式 expression 类型变量,再用eval()函数运行这个表达式得到就可得到计算结果,如上面的代码实 现。

如果希望以函数的形式调用计算公式,那么你还需要传第三个参数func,并让func 参数为TRUE

参考下面的代码实现。

# 计算正弦函数y=sin(x)的导数,根据导数计算公式,用于手动计算的变形结果为 y’=cos(x),当x=pi 时,y’=-1,当x=4*pi 时,y’=1,其中pi=π 表示圆周率。

dx <- deriv(y ~ sin(x), "x", func= TRUE) ; dx # 生成导数公式的调用函数
## function (x) 
## {
##     .value <- sin(x)
##     .grad <- array(0, c(length(.value), 1L), list(NULL, c("x")))
##     .grad[, "x"] <- cos(x)
##     attr(.value, "gradient") <- .grad
##     .value
## }
mode(dx) # 检查dx 的类型
## [1] "function"
dx(c(pi,4*pi)) # 以参数作为自变量,进行函数调用
## [1]  1.224606e-16 -4.898425e-16
## attr(,"gradient")
##       x
## [1,] -1
## [2,]  1
# 导函数的计算结果
# x=pi,dx=cos(pi)=-1
# x=4*pi,dx=cos(4*pi)=1

2.二阶导数计算

比如,计算 y=sin(a*x) 函数的二阶导数导数y”,其中a 为常数,根据导数计算公式,用于手动 计算的变形结果为一阶导数为\(y’=a*cos(a*x)\),对y’再求导公式变形为,\(y”=-a^2*sin(a*x)\)

用R 语言进行程序实现

a<-2 # 设置a 的值
dx<-deriv(y~sin(a*x),"x",func = TRUE) # 生成一阶导数公式
dx(pi/3) # 计算一阶导数
## [1] 0.8660254
## attr(,"gradient")
##       x
## [1,] -1
# 导函数计算结果y'= a*cos(a*x)=2*cos(2*pi/3)=-1
dx<-deriv(y~a*cos(a*x),"x",func = TRUE) # 对一阶导函数求导
dx(pi/3)
## [1] -1
## attr(,"gradient")
##              x
## [1,] -3.464102
# 导函数计算结果y'= -a^2*sin(a*x)=-2^2*sin(2*pi/3)=-3.464102

上面二阶导数的计算,我们是动手划分为两次求导进行计算的,利用deriv3()函数其实合并成一步计算。

dx<-deriv3(y~sin(a*x),"x",func = TRUE) # 生成二阶导数公式
dx(pi/3) # 计算导数
## [1] 0.8660254
## attr(,"gradient")
##       x
## [1,] -1
## attr(,"hessian")
## , , x
## 
##              x
## [1,] -3.464102
#前面是一阶导数的结果,后面是二阶导数的结果

Deriv3 是求二阶导的方法,与 Deriv 使用方法相同,返回结果为二阶导数。

在R 语言中二阶导数是可以直接求出的,想计算更高阶的导数就需要其他的数学工具包了。

3.D 方法

D 方法与 deriv 的最大不同之处在于其返回类型为“call”类型,这也意味着它更容易嵌入在其他方法中被引用。其次它能够求取二阶偏导。再者 D 方法的嵌套使用可以求取任意高阶导数。最后 D 的结果要比 deriv 的结果直观一些,deriv 常会把返回的表达式转换成复合表达式的形式,不如 D 方法显示结果直接。

4.高阶求导

## Higher derivatives:
DD <- function(expr, name, order = 1) {
   if(order < 1) stop("'order' must be >= 1")
   if(order == 1) D(expr, name)
   else DD(D(expr, name), name, order - 1)
}
DD(expression(sin(x^2)), "x", 3)
## -(sin(x^2) * (2 * x) * 2 + ((cos(x^2) * (2 * x) * (2 * x) + sin(x^2) * 
##     2) * (2 * x) + sin(x^2) * (2 * x) * 2))
#教程自带的例子
# 如果你要求sin(x^2)在x=3时的三阶导数的值
x=3
eval(DD(expression(sin(x^2)), "x", 3))
## [1] 181.9679

自带的教程

## formula argument :
dx2x <- deriv(~ x^2, "x") ; dx2x
#the same as: dx2x <- deriv(expr=~ x^2, name="x")

mode(dx2x)
x <- -1:2
eval(dx2x)#程序自动会读取x这个变量的值

## Something 'tougher':
trig.exp <- expression(sin(cos(x + y^2)))
( D.sc <- D(trig.exp, "x") )
all.equal(D(trig.exp[[1]], "x"), D.sc)

( dxy <- deriv(trig.exp, c("x", "y")) )
y <- 1
eval(dxy)
eval(D.sc)

## function returned:
deriv((y ~ sin(cos(x) * y)), c("x","y"), func = TRUE)

## function with defaulted arguments:
(fx <- deriv(y ~ b0 + b1 * 2^(-x/th), c("b0", "b1", "th"),
             function(b0, b1, th, x = 1:7){} ) )
fx(2, 3, 4)

## Higher derivatives
deriv3(y ~ b0 + b1 * 2^(-x/th), c("b0", "b1", "th"),
     c("b0", "b1", "th", "x") )

## Higher derivatives:
DD <- function(expr, name, order = 1) {
   if(order < 1) stop("'order' must be >= 1")
   if(order == 1) D(expr, name)
   else DD(D(expr, name), name, order - 1)
}
DD(expression(sin(x^2)), "x", 3)
## showing the limits of the internal "simplify()" :
## Not run: 
-sin(x^2) * (2 * x) * 2 + ((cos(x^2) * (2 * x) * (2 * x) + sin(x^2) *
    2) * (2 * x) + sin(x^2) * (2 * x) * 2)

## End(Not run)

5.最后还有一个内容就是求极值

导数求好了,当然还要求驻点了,所以现在再学一个函数。求解方程根公式uniroot(f,interval,lower=min(interval),upper=max(interval),tol=,maxiter=,)

## some platforms hit zero exactly on the first step:
## if so the estimated precision is 2/3.
f <- function (x, a) x - a
str(xmin <- uniroot(f, c(0, 1), tol = 0.0001, a = 1/3))
## List of 5
##  $ root      : num 0.333
##  $ f.root    : num 0
##  $ iter      : int 1
##  $ init.it   : int NA
##  $ estim.prec: num 0.667
#等同于str(xmin <- uniroot(f=f, interval=c(0, 1), tol = 0.0001, a = 1/3))
# 其中,f表示要求解方程,interval表示包含方程根的初始区间,lower表示interval中方程根初始区间的左端点,upper表示右端点,tol是想要达到的计算精度,maxiter是设置的最大迭代次数。

通过uniroot()函数就求得了方程\(f(x)=x-1/3=0\)的一个根

## handheld calculator example: fixed point of cos(.):
uniroot(function(x) cos(x) - x, lower = -pi, upper = pi, tol = 1e-9)$root
## [1] 0.7390851
str(uniroot(function(x) x*(x^2-1) + .5, lower = -2, upper = 2,
            tol = 0.0001))
## List of 5
##  $ root      : num -1.19
##  $ f.root    : num -2.55e-07
##  $ iter      : int 7
##  $ init.it   : int NA
##  $ estim.prec: num 5e-05
str(uniroot(function(x) x*(x^2-1) + .5, lower = -2, upper = 2,
            tol = 1e-10))
## List of 5
##  $ root      : num -1.19
##  $ f.root    : num 5.67e-11
##  $ iter      : int 8
##  $ init.it   : int NA
##  $ estim.prec: num 5e-11
## Find the smallest value x for which exp(x) > 0 (numerically):
r <- uniroot(function(x) 1e80*exp(x) - 1e-300, c(-1000, 0), tol = 1e-15)
str(r, digits.d = 15) # around -745, depending on the platform.
## List of 5
##  $ root      : num -745
##  $ f.root    : num -1e-300
##  $ iter      : int 70
##  $ init.it   : int NA
##  $ estim.prec: num 3.41060513164848e-13
exp(r$root)     # = 0, but not for r$root * 0.999...
## [1] 0
minexp <- r$root * (1 - 10*.Machine$double.eps)
exp(minexp)     # typically denormalized
## [1] 4.940656e-324

rootsolve包中的uniroot.all函数

library(rootSolve)
uniroot.all(function(x) x^4-10*x^3+35*x^2-50*x+24,c(0,5))
## [1] 1 2 3 4
uniroot.all(function(x) x^4-10*x^3+35*x^2-50*x+24,c(-1,10))
## [1] 0.9999999 2.0000189 3.0000053 3.9999991
uniroot.all(function(x) x^4-10*x^3+35*x^2-50*x+24,c(-10,10))
## [1] 1 2 3 4
uniroot.all(function(x) x^4-10*x^3+35*x^2-50*x+24,c(-100,100))
## [1] 2 4

 自带的教程

## =======================================================================
##   Mathematical examples  
## =======================================================================

# a well-behaved case...
fun <- function (x) cos(2*x)^3

curve(fun(x), 0, 10,main = "uniroot.all")

All <- uniroot.all(fun, c(0, 10))
points(All, y = rep(0, length(All)), pch = 16, cex = 2)

# a difficult case...
f <- function (x) 1/cos(1+x^2)
AA <- uniroot.all(f, c(-5, 5))
curve(f(x), -5, 5, n = 500, main = "uniroot.all")
points(AA, rep(0, length(AA)), col = "red", pch = 16)

f(AA)  # !!!
##  [1]  2208.227  1811.285  1741.080 -2735.492 -2273.341  3422.914  2635.566
##  [8] 11794.351 11794.351  2635.566  3422.914 -2273.341 -2735.492  1741.080
## [15]  1811.285  2208.227
## =======================================================================
## Ecological modelling example  
## =======================================================================

# Example from the book of Soetaert and Herman(2009)
# A practical guide to ecological modelling -
# using R as a simulation platform. Springer

r   <- 0.05
K   <- 10
bet <- 0.1
alf <- 1

# the model : density-dependent growth and sigmoid-type mortality rate
rate <- function(x, r = 0.05) r*x*(1-x/K) - bet*x^2/(x^2+alf^2)

# find all roots within the interval [0,10]
Eq   <- uniroot.all(rate, c(0, 10))

# jacobian evaluated at all roots: 
# This is just one value - and therefore jacobian = eigenvalue
# the sign of eigenvalue: stability of the root: neg=stable, 0=saddle, pos=unstable

eig <- vector()
for (i in 1:length(Eq)) 
  eig[i] <- sign (gradient(rate, Eq[i]))

curve(rate(x), ylab = "dx/dt", from = 0, to = 10,
      main = "Budworm model, roots", 
      sub = "Example from Soetaert and Herman, 2009")
abline(h = 0)
points(x = Eq, y = rep(0, length(Eq)), pch = 21, cex = 2,
       bg = c("grey", "black", "white")[eig+2] )
legend("topleft", pch = 22, pt.cex = 2,
       c("stable", "saddle", "unstable"),
       col = c("grey", "black", "white"), 
       pt.bg = c("grey", "black", "white"))

## =======================================================================
## Vectorisation:
## =======================================================================
# from R-help Digest, Vol 130, Issue 27
#https://stat.ethz.ch/pipermail/r-help/2013-December/364799.html

integrand1 <- function(x) 1/x*dnorm(x) 
integrand2 <- function(x) 1/(2*x-50)*dnorm(x) 
integrand3 <- function(x, C) 1/(x+C)

res <- function(C) {
  integrate(integrand1, lower = 1, upper = 50)$value + 
  integrate(integrand2, lower = 50, upper = 100)$value - 
  integrate(integrand3, C = C, lower = 1, upper = 100)$value
}

# uniroot passes one value at a time to the function, so res can be used as such
uniroot(res, c(1, 1000))
## $root
## [1] 837.0516
## 
## $f.root
## [1] 1.622351e-11
## 
## $iter
## [1] 9
## 
## $init.it
## [1] NA
## 
## $estim.prec
## [1] 6.103516e-05
# Need to vectorise the function to use uniroot.all:
res <- Vectorize(res)
uniroot.all(res, c(1,1000))
## [1] 837.0516