For the full series, see: the Deep Learning from Scratch (深度学习入门) collection
Reference: Deep Learning from Scratch: Theory and Implementation with Python (深度学习入门:基于Python的理论与实现), by 斋藤康毅 (Koki Saito)
Prerequisite: [[计算图]] (computational graph)
Implementing the Activation Function Layers
ReLU Layer
The ReLU (Rectified Linear Unit) activation function is given by:
$$y= \begin{cases} x & (x>0) \\ 0 & (x\leq 0) \end{cases}$$
Differentiating with respect to $x$:
$$\frac{\partial y}{\partial x}= \begin{cases} 1 & (x>0) \\ 0 & (x\leq 0) \end{cases}$$
In the ReLU layer, if the input $x$ in the forward pass is greater than 0, the backward pass passes the upstream value downstream unchanged. Conversely, if $x$ is less than or equal to 0 in the forward pass, the signal propagated downstream in the backward pass stops there (0 is passed on).
```python
import numpy as np

class Relu:
    def __init__(self):
        self.mask = None

    def forward(self, x):
        # Boolean mask marking where the input is <= 0
        self.mask = (x <= 0)
        out = x.copy()
        # Apply the mask: zero out the non-positive entries
        out[self.mask] = 0
        return out

    def backward(self, dout):
        # Gradient is blocked wherever the forward input was <= 0
        dout[self.mask] = 0
        dx = dout
        return dx
```
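A quick sanity check of the mask behavior (a minimal sketch; the input values are arbitrary):

```python
x = np.array([[1.0, -0.5], [-2.0, 3.0]])
relu = Relu()
print(relu.forward(x))                  # [[1. 0.] [0. 3.]]
# Upstream gradient of ones: blocked where x <= 0
print(relu.backward(np.ones_like(x)))   # [[1. 0.] [0. 1.]]
```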
Sigmoid Layer
The sigmoid function is given by:
$$y = \frac{1}{1+\exp(-x)}$$
Working backward through its computational graph, the local derivative simplifies to an expression that uses only the forward output $y$:
$$\frac{\partial L}{\partial x} = \frac{\partial L}{\partial y}\, y\,(1-y)$$
which is exactly what the backward method below computes.
```python
import numpy as np

class Sigmoid:
    def __init__(self):
        self.out = None

    def forward(self, x):
        # In the book this calls sigmoid() from common/functions.py;
        # written out here so the class is self-contained
        out = 1 / (1 + np.exp(-x))
        self.out = out
        return out

    def backward(self, dout):
        # dL/dx = dL/dy * y * (1 - y), using the cached forward output
        dx = dout * (1.0 - self.out) * self.out
        return dx
```
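A minimal usage sketch (arbitrary inputs), confirming that backward returns $y(1-y)$ times the upstream gradient:

```python
x = np.array([0.0, 2.0])
layer = Sigmoid()
y = layer.forward(x)                     # [0.5, 0.8808]
print(layer.backward(np.ones_like(x)))   # y*(1-y) = [0.25, 0.105]
```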
Implementing the Affine/Softmax Layers
Affine Layer
The Affine layer performs the affine transformation $Y = X \cdot W + b$ (a matrix product plus a bias); its backward pass is backpropagation generalized from scalars to matrices:
$$\frac{\partial L}{\partial X} = \frac{\partial L}{\partial Y} \cdot W^{\mathrm{T}}, \qquad \frac{\partial L}{\partial W} = X^{\mathrm{T}} \cdot \frac{\partial L}{\partial Y}$$
Batch Version of the Affine Layer
For a batch of $N$ inputs, the forward pass broadcasts $b$ to every row of $X \cdot W$, so the backward pass obtains $\frac{\partial L}{\partial b}$ by summing $\frac{\partial L}{\partial Y}$ over the batch axis:
```python
import numpy as np

class Affine:
    def __init__(self, W, b):
        self.W = W
        self.b = b
        self.x = None
        self.dW = None
        self.db = None

    def forward(self, x):
        self.x = x
        out = np.dot(x, self.W) + self.b
        return out

    def backward(self, dout):
        dx = np.dot(dout, self.W.T)
        self.dW = np.dot(self.x.T, dout)
        # b was broadcast across the batch, so its gradient sums over axis 0
        self.db = np.sum(dout, axis=0)
        return dx
```
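A shape-level sanity check (a sketch with arbitrary sizes: batch $N=2$, 3 inputs, 4 outputs):

```python
np.random.seed(0)
x = np.random.randn(2, 3)                # input: (N, 3)
layer = Affine(np.random.randn(3, 4), np.random.randn(4))
out = layer.forward(x)                   # output: (2, 4)
dx = layer.backward(np.ones_like(out))
print(dx.shape, layer.dW.shape, layer.db.shape)   # (2, 3) (3, 4) (4,)
```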
Softmax-with-Loss Layer
This layer combines the softmax function with the cross-entropy error as its loss. The key result: for a batch of size $N$, the backward pass reduces to the simple difference $(y - t)/N$ between the softmax output and the teacher label.
```python
import numpy as np
# softmax and cross_entropy_error are helpers from the book's
# common/functions.py
from common.functions import softmax, cross_entropy_error

class SoftmaxWithLoss:
    def __init__(self):
        self.loss = None
        self.y = None  # output of softmax
        self.t = None  # teacher labels

    def forward(self, x, t):
        self.t = t
        self.y = softmax(x)
        self.loss = cross_entropy_error(self.y, self.t)
        return self.loss

    def backward(self, dout=1):
        batch_size = self.t.shape[0]
        if self.t.size == self.y.size:  # teacher labels are one-hot vectors
            dx = (self.y - self.t) / batch_size
        else:                           # teacher labels are class indices
            dx = self.y.copy()
            dx[np.arange(batch_size), self.t] -= 1
            dx = dx / batch_size
        return dx
```
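A minimal usage sketch (assuming the book's common/functions.py is importable; the scores and label below are arbitrary):

```python
x = np.array([[0.3, 2.9, 4.0]])   # raw scores for one sample
t = np.array([[0, 0, 1]])         # one-hot teacher label
layer = SoftmaxWithLoss()
print(layer.forward(x, t))        # cross-entropy loss (scalar)
print(layer.backward())           # (y - t) / batch_size
```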
Implementing Error Backpropagation
Error backpropagation is used in step 2 of the training procedure, where the gradient of the loss with respect to each weight parameter is computed.
Unlike the previous chapter, which obtained this gradient by numerical differentiation, this chapter computes it with backpropagation. Numerical differentiation is simple to implement but computationally expensive; it remains useful as a correctness check (gradient check) on the backpropagation implementation, as sketched below.
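A minimal gradient-check sketch using the Affine class defined above; numerical_grad is a hypothetical helper written here with central differences (an assumption for illustration, not the book's API):

```python
import numpy as np

def numerical_grad(f, param, h=1e-4):
    # Central-difference numerical gradient of the scalar function f
    # with respect to `param` (perturbed in place, then restored)
    grad = np.zeros_like(param)
    it = np.nditer(param, flags=['multi_index'])
    while not it.finished:
        idx = it.multi_index
        orig = param[idx]
        param[idx] = orig + h
        fxh1 = f()
        param[idx] = orig - h
        fxh2 = f()
        grad[idx] = (fxh1 - fxh2) / (2 * h)
        param[idx] = orig
        it.iternext()
    return grad

np.random.seed(0)
x = np.random.randn(2, 3)
layer = Affine(np.random.randn(3, 4), np.random.randn(4))
loss = lambda: np.sum(layer.forward(x))   # arbitrary scalar loss
layer.forward(x)
layer.backward(np.ones((2, 4)))           # backprop gradient -> layer.dW
num_dW = numerical_grad(loss, layer.W)    # numerical gradient
print(np.max(np.abs(layer.dW - num_dW)))  # should be tiny, ~1e-9
```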