How to correctly implement momentum and decay - SGD
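I am trying to apply momentum and decay to mini-batch SGD. What is the correct way to update the weights? As soon as I set the decay, I get strange results.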
import numpy as np

def _mini_batch(X, y, batch_size):
    # Stack a bias column, the features and the targets so they are shuffled together
    # (including X here is an assumption; the posted line stacked only the ones column and y)
    rows = len(X)
    X_full = np.hstack((np.ones((rows, 1)), X, np.array(y).reshape(rows, -1)))
    np.random.shuffle(X_full)
    # Yield the full-size mini-batches
    num_batches = rows // batch_size
    for rng in range(num_batches):
        start_rng, end_rng = rng * batch_size, (rng + 1) * batch_size
        yield X_full[start_rng:end_rng, :-1], X_full[start_rng:end_rng, -1]  # X_batch, y_batch
    # Yield the leftover rows, if any
    if rows % batch_size != 0:
        end_rng = num_batches * batch_size
        yield X_full[end_rng:rows, :-1], X_full[end_rng:rows, -1]  # X_batch, y_batch

# X, y, epochs and batch_size are assumed to be defined earlier
decay_rate = 0.2
alpha = 0.1  # learning rate
weights = np.random.normal(size=X.shape[-1] + 1)  # +1 for the bias column; alternative: np.zeros(...)
for i in range(epochs):
    # update the weights once per mini-batch
    for X_batch, y_batch in _mini_batch(X, y, batch_size):
        train_predictions = np.dot(X_batch, weights)  # y_hat
        errors = np.subtract(train_predictions, y_batch)
        # update in question: shrink the weights by decay_rate, then take a gradient step
        weights = (1. - decay_rate) * weights - alpha * np.dot(X_batch.T, errors) / len(X_batch)
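For comparison, here is a minimal sketch of one common way to combine momentum and decay in mini-batch SGD for least squares. It is not necessarily what the poster intended: "decay" is ambiguous, so the sketch shows both readings. The names `iterate_minibatches`, `sgd_momentum`, `weight_decay` and `lr_decay` are hypothetical, and the sketch assumes `X` already contains any bias column.

```python
import numpy as np

def iterate_minibatches(X, y, batch_size, rng):
    """Yield shuffled (X_batch, y_batch) pairs, including a final partial batch."""
    order = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        idx = order[start:start + batch_size]
        yield X[idx], y[idx]

def sgd_momentum(X, y, epochs=200, batch_size=16, alpha=0.1,
                 momentum=0.9, weight_decay=0.0, lr_decay=0.0, seed=0):
    """Mini-batch SGD for least squares with momentum and two kinds of decay.

    weight_decay: L2 coefficient, added to the gradient as weight_decay * weights.
    lr_decay: time-based schedule, lr_t = alpha / (1 + lr_decay * t).
    """
    rng = np.random.default_rng(seed)
    weights = rng.normal(size=X.shape[1])
    velocity = np.zeros_like(weights)  # running direction accumulated by momentum
    step = 0
    for _ in range(epochs):
        for X_batch, y_batch in iterate_minibatches(X, y, batch_size, rng):
            lr = alpha / (1.0 + lr_decay * step)        # learning-rate decay
            errors = X_batch @ weights - y_batch
            grad = X_batch.T @ errors / len(X_batch)    # gradient on this batch only
            grad = grad + weight_decay * weights        # L2 weight decay
            velocity = momentum * velocity - lr * grad  # classical momentum
            weights = weights + velocity
            step += 1
    return weights
```

A note on the update in the question: `(1. - decay_rate) * weights - alpha * grad` can be rewritten as `weights - alpha * (grad + (decay_rate / alpha) * weights)`. With `decay_rate = 0.2` and `alpha = 0.1`, that is an L2 penalty with coefficient 2 applied on every step, which pulls the weights strongly toward zero and may explain the strange results. Some frameworks use "decay" for a learning-rate schedule instead, which is what `lr_decay` sketches here.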