From bc797f595a389ab693b3ace2e9e0b861e64057de Mon Sep 17 00:00:00 2001
From: u <595770995@qq.com>
Date: Sat, 26 Oct 2024 00:21:39 +0800
Subject: [PATCH 1/2] Update 03GradMode.md: fix backpropagation parameter error
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 05Framework/02AutoDiff/03GradMode.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/05Framework/02AutoDiff/03GradMode.md b/05Framework/02AutoDiff/03GradMode.md
index bd80e9bb..a19b8226 100644
--- a/05Framework/02AutoDiff/03GradMode.md
+++ b/05Framework/02AutoDiff/03GradMode.md
@@ -85,7 +85,7 @@ $$
 After converting the function into the DAG (directed acyclic graph) structure above, we can easily evaluate it step by step and obtain the derivative of each step; we then use the chain rule to express the derivation of $df/dx_1$ in the following form:
 
 $$
-\dfrac{df}{dx_1}= \dfrac{dv_{-1}}{dx_1} \cdot (\dfrac{dv_{1}}{dv_{-1}} \cdot \dfrac{dv_{4}}{dv_{1}} + \dfrac{dv_{2}}{dv_{-1}} \cdot \dfrac{dv_{4}}{dx_{2}}) \cdot \dfrac{dv_{5}}{dv_{4}} \cdot \dfrac{df}{dv_{5}}
+\dfrac{df}{dx_1}= \dfrac{dv_{-1}}{dx_1} \cdot (\dfrac{dv_{1}}{dv_{-1}} \cdot \dfrac{dv_{4}}{dv_{1}} + \dfrac{dv_{2}}{dv_{-1}} \cdot \dfrac{dv_{4}}{dv_{2}}) \cdot \dfrac{dv_{5}}{dv_{4}} \cdot \dfrac{df}{dv_{5}}
 $$
 
 > The entire derivative computation can be decomposed into a composition of a series of differential operators.

From fc98eec15d35c1561c7201f436b0f5a16f7d3db7 Mon Sep 17 00:00:00 2001
From: u <595770995@qq.com>
Date: Sat, 26 Oct 2024 16:41:30 +0800
Subject: [PATCH 2/2] Update 02Fundamentals.md: fix derivative error
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 05Framework/01Foundation/02Fundamentals.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/05Framework/01Foundation/02Fundamentals.md b/05Framework/01Foundation/02Fundamentals.md
index a34dc0db..6d584d3d 100644
--- a/05Framework/01Foundation/02Fundamentals.md
+++ b/05Framework/01Foundation/02Fundamentals.md
@@ -78,7 +78,7 @@ $$ loss(w)=f(w)-g $$
 
 Following basic high-school calculus, treat the neural network as a composite (high-dimensional) function; differentiating that composite function uses the chain rule. As a simple example, consider the function $z=f(x,y)$ with $x=g(t), y=h(t)$, where $g(t), h(t)$ are differentiable functions. Differentiating $z$ with respect to $t$ then proceeds outward, layer by layer, along the chain.
 
-$$ \frac{\mathrm{d} x}{\mathrm{d} t} = \frac{\partial z}{\partial x} \frac{\mathrm{d} x}{\mathrm{d} t} + \frac{\partial z}{\partial y} \frac{\mathrm{d} y}{\mathrm{d} t} $$
+$$ \frac{\mathrm{d} z}{\mathrm{d} t} = \frac{\partial z}{\partial x} \frac{\mathrm{d} x}{\mathrm{d} t} + \frac{\partial z}{\partial y} \frac{\mathrm{d} y}{\mathrm{d} t} $$
 
 Now that we have the chain rule, and a neural network is really just one huge composite function, wouldn't differentiating it directly solve the problem? What does the "reverse" pass actually buy us? Let's look at a few sets of formulas below.
 
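
Both fixes are easy to sanity-check numerically. Below is a minimal Python sketch (not part of the patches) that verifies the two corrected formulas against finite differences. The functions `g`, `h`, `z` in the first check are illustrative stand-ins, and since the DAG definitions $v_{-1} \dots v_5$ lie outside the quoted hunk, the second check assumes the classic autodiff example $f(x_1,x_2)=\ln x_1 + x_1 x_2 - \sin x_2$, whose evaluation trace matches the $v_i$ naming in the formula.

```python
import math

def central_diff(f, x, eps=1e-6):
    """Two-sided finite difference, used as ground truth."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)

# --- Patch 2/2: dz/dt = (dz/dx)(dx/dt) + (dz/dy)(dy/dt) ---
# g, h, z are illustrative choices, not taken from the source text.
g = math.sin                      # x = g(t)
h = math.cos                      # y = h(t)
z = lambda x, y: x * x + 3 * y    # z = f(x, y)

t = 0.7
x, y = g(t), h(t)
dz_dx, dz_dy = 2 * x, 3.0                 # partial derivatives of z
dx_dt, dy_dt = math.cos(t), -math.sin(t)  # derivatives of g and h
chain = dz_dx * dx_dt + dz_dy * dy_dt     # the corrected left-hand side dz/dt
numeric = central_diff(lambda s: z(g(s), h(s)), t)
assert abs(chain - numeric) < 1e-6

# --- Patch 1/2: df/dx1 as a product over the two DAG paths ---
# ASSUMPTION: f(x1, x2) = ln(x1) + x1*x2 - sin(x2), the classic trace with
# v_{-1} = x1, v_0 = x2, v_1 = ln(v_{-1}), v_2 = v_{-1}*v_0,
# v_3 = sin(v_0), v_4 = v_1 + v_2, v_5 = v_4 - v_3 = f.
x1, x2 = 2.0, 5.0
v_m1, v0 = x1, x2

dv_m1_dx1 = 1.0
dv1_dv_m1 = 1.0 / v_m1     # d(ln v)/dv
dv2_dv_m1 = v0             # d(v_{-1} * v_0)/dv_{-1}
dv4_dv1 = 1.0
dv4_dv2 = 1.0              # the corrected factor: dv4/dv2, not dv4/dx2
dv5_dv4 = 1.0
df_dv5 = 1.0

df_dx1 = dv_m1_dx1 * (dv1_dv_m1 * dv4_dv1 + dv2_dv_m1 * dv4_dv2) \
         * dv5_dv4 * df_dv5
numeric = central_diff(lambda a: math.log(a) + a * x2 - math.sin(x2), x1)
assert abs(df_dx1 - numeric) < 1e-6
print("both corrected formulas agree with finite differences")
```

Running the script passes both `assert`s and prints the confirmation line; with the pre-patch factors ($dv_4/dx_2$ in place of $dv_4/dv_2$, or $dx/dt$ on the left-hand side) the checks would not hold.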