Erlo

Calculus Notes 04: Common Matrix Derivative Operations

Published 2025-03-13 17:30:14

4.1 Basic matrix derivative examples

4.1.1 Example 1: \(f(x)=A_{m\times n}\cdot x_{n \times 1}\) \(\Rightarrow f'_{x^T}(x)=A_{m\times n}\)

For example:

\[A= \begin{bmatrix} a_1&a_2&a_3\\ b_1&b_2&b_3 \end{bmatrix}, \quad x= \begin{bmatrix} x_1\\ x_2\\ x_3 \end{bmatrix} \Rightarrow f(x)= \begin{bmatrix} a_1x_1+a_2x_2+a_3x_3\\ b_1x_1+b_2x_2+b_3x_3 \end{bmatrix} \]

Differentiating each component of \(f(x)\) with respect to each \(x_j\) (the coefficients stay in their positions in the matrix) gives:

\[\tag{1} f'_{x^T}(x)= \begin{bmatrix} a_1&a_2&a_3\\ b_1&b_2&b_3 \end{bmatrix}=A \]
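Result (1) can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper `numerical_jacobian` and the concrete values of `A` and `x` are illustrative choices, not part of the original notes. It approximates each partial derivative by a central difference and compares the resulting Jacobian with \(A\).

```python
import numpy as np

def numerical_jacobian(f, x, eps=1e-6):
    """Central-difference Jacobian: entry (i, j) approximates d f_i / d x_j."""
    x = np.asarray(x, dtype=float)
    n_outputs = np.asarray(f(x), dtype=float).size
    jac = np.zeros((n_outputs, x.size))
    for j in range(x.size):
        step = np.zeros_like(x)
        step[j] = eps
        jac[:, j] = (f(x + step) - f(x - step)) / (2 * eps)
    return jac

# Illustrative 2x3 matrix and 3-vector (arbitrary values).
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
x = np.array([0.5, -1.0, 2.0])

J = numerical_jacobian(lambda v: A @ v, x)
jacobian_matches = np.allclose(J, A, atol=1e-6)
print(jacobian_matches)
```

Because \(f\) is linear, the finite-difference Jacobian should agree with \(A\) up to rounding error regardless of the point `x` chosen.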

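4.1.2 Example 2: \(f(x)= x_{1 \times m}\cdot A_{m\times m} \cdot x^T_{m \times 1} \Rightarrow f'_x(x)=x_{1 \times m}\cdot(A_{m\times m}+A_{m\times m}^T)\)

For example:

\[x= \begin{bmatrix} x_1&x_2 \end{bmatrix}, \quad A= \begin{bmatrix} a&b\\ c&d \end{bmatrix}, \quad x^T= \begin{bmatrix} x_1\\ x_2 \end{bmatrix} \]

\[\Rightarrow f(x)= \begin{bmatrix} ax_1+cx_2&bx_1+dx_2 \end{bmatrix} \cdot \begin{bmatrix} x_1\\ x_2 \end{bmatrix} \]

\[\qquad\quad = \begin{bmatrix} a{x_1}^2+bx_1x_2+cx_1x_2+dx_2^2 \end{bmatrix} \]

Differentiating with respect to \(x_1\) and \(x_2\) gives:

\[f'_x(x)= \begin{bmatrix} 2ax_1+bx_2+cx_2&2dx_2+bx_1+cx_1 \end{bmatrix} \]

\[\tag{2} = \begin{bmatrix} x_1&x_2 \end{bmatrix} \cdot \begin{bmatrix} a&b\\ c&d \end{bmatrix} + \begin{bmatrix} x_1&x_2 \end{bmatrix} \cdot \begin{bmatrix} a&c\\ b&d \end{bmatrix} =x\cdot(A+A^T) \]

Here \(x\) is a row vector, so the derivative is also a row vector. If \(x\) is written as a column vector instead, so that \(f(x)=x^T A x\), the same result reads \((A+A^T)\cdot x\); this column form is the one used in Section 4.2.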
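As a sanity check on the quadratic-form rule, the sketch below compares \(x(A+A^T)\) with a central-difference gradient. It assumes NumPy; the helper `numerical_gradient` and the numeric values are hypothetical illustrations. The matrix is deliberately non-symmetric so that \(A+A^T\) cannot be confused with \(2A\).

```python
import numpy as np

def numerical_gradient(f, x, eps=1e-6):
    """Central-difference gradient of a scalar function f at the vector x."""
    x = np.asarray(x, dtype=float)
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        grad[i] = (f(x + step) - f(x - step)) / (2 * eps)
    return grad

# Deliberately non-symmetric A, so that A + A^T differs from 2A.
A = np.array([[1.0, 2.0],
              [-3.0, 4.0]])
x = np.array([0.7, -1.3])  # 1-D array standing in for the row vector x

def quadratic_form(v):
    return v @ A @ v  # x A x^T for a 1-D array

analytic = x @ (A + A.T)  # the rule being checked: x (A + A^T)
numeric = numerical_gradient(quadratic_form, x)
gradient_matches = np.allclose(numeric, analytic, atol=1e-6)
print(gradient_matches)
```

When \(A\) happens to be symmetric, the rule reduces to the familiar \(2xA\).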

4.1.3 Example 3: \(f(x)=x_{n\times 1}^T\cdot a_{n \times 1} \Rightarrow f_x'(x)=(a_{n \times 1}^T\cdot x_{n\times 1})'_x=a\)

For example:

\[x^T= \begin{bmatrix} x_1&x_2 \end{bmatrix}, \quad a= \begin{bmatrix} a_1\\ a_2 \end{bmatrix} \]

\[\Rightarrow f(x)= x^T\cdot a= \begin{bmatrix} x_1a_1+x_2a_2 \end{bmatrix} =a^T\cdot x \]

With \(x\) written as a column vector:

\[x= \begin{bmatrix} x_1\\ x_2 \end{bmatrix} \]

Differentiating with respect to each component of \(x\) (the coefficients stay in their positions) gives:

\[\tag{3} f'_x(x)= (a^T\cdot x)_x' = \begin{bmatrix} a_1\\ a_2 \end{bmatrix} =a \]

4.1.4 Example 4: \(f(x)=x_{m\times 1}^T\cdot A_{m \times n}\cdot y_{n \times 1} \Rightarrow f_x'(x)=Ay,\ f'_A(x)=xy^T\)

For example:

\[x^T= \begin{bmatrix} x_1&x_2&x_3 \end{bmatrix}, \quad A= \begin{bmatrix} a_1&a_2\\ a_3&a_4\\ a_5&a_6 \end{bmatrix}, \quad y= \begin{bmatrix} y_1\\ y_2 \end{bmatrix} \]

\[\Rightarrow f(x) =x^T\cdot A\cdot y= \begin{bmatrix} a_1x_1+a_3x_2+a_5x_3&a_2x_1+a_4x_2+a_6x_3 \end{bmatrix} \cdot \begin{bmatrix} y_1\\ y_2 \end{bmatrix} \]

\[\qquad\qquad = \begin{bmatrix} (a_1x_1+a_3x_2+a_5x_3)\cdot y_1+(a_2x_1+a_4x_2+a_6x_3)\cdot y_2 \end{bmatrix} \]

Differentiating with respect to each \(x_i\), and then with respect to each entry of \(A\), gives:

\[f'_x(x)= \begin{bmatrix} a_1y_1+a_2y_2\\ a_3y_1+a_4y_2\\ a_5y_1+a_6y_2 \end{bmatrix} =A \cdot y \]

\[\tag{4} f'_A(x)= \begin{bmatrix} x_1y_1&x_1y_2\\ x_2y_1&x_2y_2\\ x_3y_1&x_3y_2 \end{bmatrix} =x\cdot y^T \]
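Both derivatives of the bilinear form \(x^T A y\) can be verified with finite differences. This is a minimal sketch, assuming NumPy; the function name `bilinear`, the random seed, and the shapes (3 and 2, mirroring the example) are illustrative assumptions. The derivative with respect to \(x\) should be the vector \(Ay\), and the derivative with respect to \(A\) should be the \(3\times 2\) outer product \(xy^T\).

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(3)       # plays the role of the 3x1 vector x
A = rng.standard_normal((3, 2))  # 3x2 matrix
y = rng.standard_normal(2)       # plays the role of the 2x1 vector y

def bilinear(x, A, y):
    return x @ A @ y  # scalar x^T A y

eps = 1e-6

# Finite-difference derivative with respect to each component of x.
grad_x = np.zeros_like(x)
for i in range(x.size):
    step = np.zeros_like(x)
    step[i] = eps
    grad_x[i] = (bilinear(x + step, A, y) - bilinear(x - step, A, y)) / (2 * eps)

# Finite-difference derivative with respect to each entry of A.
grad_A = np.zeros_like(A)
for i in range(A.shape[0]):
    for j in range(A.shape[1]):
        step = np.zeros_like(A)
        step[i, j] = eps
        grad_A[i, j] = (bilinear(x, A + step, y) - bilinear(x, A - step, y)) / (2 * eps)

grad_x_matches = np.allclose(grad_x, A @ y, atol=1e-6)
grad_A_matches = np.allclose(grad_A, np.outer(x, y), atol=1e-6)
print(grad_x_matches, grad_A_matches)
```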

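4.2 Matrix norm derivative example

Let \(X_{N \times n}\) be a matrix and \(a_{n \times 1}\), \(y_{N \times 1}\) be vectors.

Let \(f(a)=||X\cdot a-y||^2\). The derivative \(f'_a(a)\) is obtained as follows.

By the definition of the Euclidean norm:

\[f(a)=(X\cdot a-y)^T\cdot (X\cdot a-y) \]

\[\qquad \qquad =(a^T\cdot X^T -y^T)\cdot (X\cdot a-y) \]

\[\tag{5} \qquad \qquad =a^T\cdot X^T X \cdot a -a^T \cdot X^T\cdot y-y^T\cdot X\cdot a + y^Ty \]

In Eq. (5):

For the term \(a^T\cdot X^T X \cdot a\), Eq. (2) of the basic examples (written for a column vector) gives:

\[(a^T\cdot X^T X \cdot a)'_a=(X^TX+(X^TX)^T)\cdot a=2X^TX\cdot a \]

For the term \(y^T\cdot X\cdot a\), Eq. (3) of the basic examples gives:

\[(y^T\cdot X\cdot a )_a'=[(X^T\cdot y )^T\cdot a] _a'=X^T\cdot y \]

For the term \(a^T \cdot X^T\cdot y\):

\[(a^T\cdot X^T\cdot y)'_a=[a^T\cdot (X^T\cdot y)]'_a=X^T\cdot y \]

The term \(y^Ty\) does not depend on \(a\), so its derivative is zero.

Combining these results:

\[f'_a(a)=(||X\cdot a-y||^2)_a'=2(X^TX\cdot a-X^T\cdot y)=2X^T(X\cdot a-y) \]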
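The gradient of the squared residual, written as \(2X^T(Xa-y)\), can be compared against finite differences. The sketch below assumes NumPy; the sizes `N = 6`, `n = 3`, the random data, and the name `squared_residual` are illustrative assumptions. As a usage note, it also solves \(X^TXa=X^Ty\) (the gradient set to zero) and confirms that the gradient vanishes at that solution.

```python
import numpy as np

rng = np.random.default_rng(1)
N, n = 6, 3
X = rng.standard_normal((N, n))
a = rng.standard_normal(n)
y = rng.standard_normal(N)

def squared_residual(a):
    r = X @ a - y
    return r @ r  # ||Xa - y||^2

# Analytic gradient in the form 2 X^T (X a - y).
analytic = 2 * X.T @ (X @ a - y)

# Central-difference gradient for comparison.
eps = 1e-6
numeric = np.zeros(n)
for i in range(n):
    step = np.zeros(n)
    step[i] = eps
    numeric[i] = (squared_residual(a + step) - squared_residual(a - step)) / (2 * eps)

gradient_matches = np.allclose(numeric, analytic, atol=1e-5)

# Setting the gradient to zero gives X^T X a = X^T y; solve it directly.
a_star = np.linalg.solve(X.T @ X, X.T @ y)
gradient_at_solution = 2 * X.T @ (X @ a_star - y)
vanishes = np.allclose(gradient_at_solution, 0.0, atol=1e-8)
print(gradient_matches, vanishes)
```

Note the shapes: \(X^TX\) is \(n\times n\) and acts on \(a\), whereas \(XX^T\) would be \(N\times N\) and could not multiply \(a\) when \(N\neq n\).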

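4.3 Matrix trace derivative examples

4.3.1 Trace example 1: \(\mathrm{tr}'_A(A)=I\)

Let \(A_{m\times m}\) be a square matrix and \(\mathrm{tr}(A)\) its trace. Then:

\[\mathrm{tr}(A)=\sum_{i=1}^m a_{ii} \]

The partial derivative of \(\mathrm{tr}(A)\) with respect to \(a_{ij}\) is 1 when \(i=j\) and 0 otherwise, so:

\[\tag{6} \Rightarrow \mathrm{tr}'_A(A)=I= \begin{bmatrix} 1&&&\\ &1&&\\ &&\ddots&\\ &&&1 \end{bmatrix} \]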

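4.3.2 Trace example 2: \(\mathrm{tr}'_A(A\cdot B)=B^T\)

Let \(A_{m\times m}\) and \(B_{m\times m}\) be square matrices, and let \(\mathrm{tr}(A\cdot B)\) be the trace of \(A\cdot B\). Then:

\[\mathrm{tr}(A\cdot B)=\sum_{i=1}^m\sum_{j=1}^m a_{ij}b_{ji} \]

The partial derivative with respect to \(a_{ij}\) is \(b_{ji}\), which is the \((i,j)\) entry of \(B^T\), so:

\[\tag{7} \mathrm{tr}'_A(A\cdot B)=\Big(\sum_{i=1}^m\sum_{j=1}^m a_{ij}b_{ji}\Big)'_A=B^T \]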

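4.3.3 Trace example 3: \(\mathrm{tr}'_A(A\cdot A^T)=2\cdot A\)

Let \(A_{m\times m}\) be a square matrix, and let \(\mathrm{tr}(A\cdot A^T)\) be the trace of \(A\cdot A^T\). Since the \((i,i)\) entry of \(A\cdot A^T\) is \(\sum_{j} a_{ij}a_{ij}\):

\[\mathrm{tr}(A\cdot A^T)=\sum_{i=1}^m\sum_{j=1}^m a_{ij}a_{ij}=\sum_{i=1}^m\sum_{j=1}^m a^2_{ij} \]

The partial derivative with respect to \(a_{ij}\) is \(2a_{ij}\), so:

\[\tag{8} \mathrm{tr}'_A(A\cdot A^T)=\Big(\sum_{i=1}^m\sum_{j=1}^m a^2_{ij}\Big)'_A=2\cdot A \]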
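The three trace rules (\(\mathrm{tr}(A)\to I\), \(\mathrm{tr}(AB)\to B^T\), \(\mathrm{tr}(AA^T)\to 2A\)) can be checked together with one entrywise finite-difference helper. This is a minimal sketch, assuming NumPy; `matrix_gradient`, the seed, and the size `m = 3` are hypothetical choices made for illustration.

```python
import numpy as np

def matrix_gradient(f, A, eps=1e-6):
    """Central-difference derivative of a scalar f with respect to each entry of A."""
    grad = np.zeros_like(A)
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            step = np.zeros_like(A)
            step[i, j] = eps
            grad[i, j] = (f(A + step) - f(A - step)) / (2 * eps)
    return grad

rng = np.random.default_rng(2)
m = 3
A = rng.standard_normal((m, m))
B = rng.standard_normal((m, m))

checks = {
    "tr(A) -> I": np.allclose(
        matrix_gradient(np.trace, A), np.eye(m), atol=1e-6),
    "tr(AB) -> B^T": np.allclose(
        matrix_gradient(lambda M: np.trace(M @ B), A), B.T, atol=1e-6),
    "tr(AA^T) -> 2A": np.allclose(
        matrix_gradient(lambda M: np.trace(M @ M.T), A), 2 * A, atol=1e-6),
}
for name, ok in checks.items():
    print(name, ok)
```

Using a random, non-symmetric `B` matters here: with a symmetric `B` the check could not distinguish \(B^T\) from \(B\).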

4.4 Determinant derivative example: \(|A|'_A=|A|\cdot (A^{-1})^T\)

Let \(A_{m\times m}\) be an invertible square matrix, \(|A|\) its determinant, \(a_{ij}\) any entry of \(A\), and \(A_{ij}\) the cofactor of \(a_{ij}\).

Expanding the determinant along any row \(i\):

\[|A|=a_{i1}A_{i1}+a_{i2}A_{i2}+...+a_{im}A_{im} \]

None of the cofactors \(A_{i1},...,A_{im}\) of row \(i\) contains the entry \(a_{ij}\), so:

\[\Rightarrow \frac{\partial |A|}{\partial a_{ij}}=(a_{i1}A_{i1}+a_{i2}A_{i2}+...+a_{im}A_{im})'_{a_{ij}}=A_{ij} \]

Collecting these entries for all \(i\) and \(j\) gives the cofactor matrix:

\[\tag{9} |A|'_A = \begin{bmatrix} A_{11}&A_{12}&\cdots&A_{1m}\\ A_{21}&A_{22}&\cdots&A_{2m}\\ \vdots&\vdots&\ddots&\vdots\\ A_{m1}&A_{m2}&\cdots&A_{mm} \end{bmatrix} =(A^{*})^T \]

where \(A^*\) is the adjugate of \(A\), that is, the transpose of the cofactor matrix.

From the inverse identity \(A^{-1}=\frac{A^*}{|A|}\):

\[\tag{10} |A|'_A=|A|\cdot (A^{-1})^T \]
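The determinant rule \(|A|'_A=|A|\cdot(A^{-1})^T\) can also be checked entry by entry. The sketch below assumes NumPy; the specific \(3\times 3\) matrix is an illustrative choice (non-symmetric, with determinant 25) and is not taken from the original notes.

```python
import numpy as np

# Illustrative non-symmetric, invertible matrix (determinant 25).
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [1.0, 0.0, 4.0]])

# Central-difference derivative of det(A) with respect to each entry a_ij.
eps = 1e-6
numeric = np.zeros_like(A)
for i in range(A.shape[0]):
    for j in range(A.shape[1]):
        step = np.zeros_like(A)
        step[i, j] = eps
        numeric[i, j] = (np.linalg.det(A + step) - np.linalg.det(A - step)) / (2 * eps)

# Analytic form: |A| (A^{-1})^T, which equals the cofactor matrix.
analytic = np.linalg.det(A) * np.linalg.inv(A).T
derivative_matches = np.allclose(numeric, analytic, atol=1e-5)
print(derivative_matches)
```

As a spot check by hand, the cofactor of \(a_{11}\) in this matrix is \(3\cdot 4-1\cdot 0=12\), which is the top-left entry of `analytic`.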
