Why is my cost exploding in my stochastic gradient descent implementation?

I ran into a problem while trying to implement stochastic gradient descent: my cost grows without bound, and I don't know why.

MSE implementation:

import numpy as np

def mse(x,y,w,b):
    # half the mean squared error of the linear model x @ w + b
    predictions = x @ w
    summed = (np.square(y - predictions - b)).mean(0)
    cost = summed / 2
    return cost

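Gradients:

def grad_w(y,x,w,b,n_samples):
    # gradient of the cost with respect to the weights w
    return -y @ x / n_samples + x.T @ x @ w / n_samples + b * x.mean(0)
def grad_b(y,x,w,b,n_samples):
    # gradient of the cost with respect to the bias b
    return -y.mean(0) + x.mean(0) @ w + b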
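
For reference, the two expressions above can be read as the gradients of the half-MSE cost. The short derivation below is a sketch under the assumption that n is the number of rows actually passed in as x (the symbols X, \mathbf{1} and \bar{x} are notation introduced here, not taken from the post):

```latex
J(w, b) = \frac{1}{2n} \sum_{i=1}^{n} \left( y_i - x_i^\top w - b \right)^2

\nabla_w J = -\frac{1}{n} X^\top \left( y - X w - b\,\mathbf{1} \right)
           = -\frac{1}{n} X^\top y + \frac{1}{n} X^\top X w + b\,\bar{x},
\qquad \bar{x} = \frac{1}{n} X^\top \mathbf{1}

\frac{\partial J}{\partial b} = -\frac{1}{n} \sum_{i=1}^{n} \left( y_i - x_i^\top w - b \right)
                              = -\bar{y} + \bar{x}^\top w + b
```

The first two terms of \nabla_w J carry an explicit 1/n, while the third term and the bias gradient get their 1/n from the `.mean(0)` calls, so all terms share the same scale only when n matches the number of rows in x.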

SGD implementation:

def stochastic_gradient_descent(X,y,w,b,learning_rate=0.01,iterations=500,batch_size =100):
    
    length = len(y)
    cost_history = np.zeros(iterations)
    n_batches = int(length/batch_size)
    
    for it in range(iterations):
        cost =0
        indices = np.random.permutation(length)
        X = X[indices]
        y = y[indices]
        for i in range(0,length,batch_size):
            X_i = X[i:i+batch_size]
            y_i = y[i:i+batch_size]

            w -= learning_rate*grad_w(y_i,X_i,w,b,length)
            b -= learning_rate*grad_b(y_i,X_i,w,b,length)
            
            cost = mse(X_i,y_i,w,b)
        cost_history[it]  = cost
        if cost_history[it] <= 0.0052: break
        
    return w,cost_history[:it]

Random variables:

w_true = np.array([0.2,0.5,-0.2])
b_true = -1
first_feature = np.random.normal(0,1,1000)
second_feature = np.random.uniform(size=1000)
third_feature = np.random.normal(1,2,1000)
arrays = [first_feature,second_feature,third_feature]
x = np.stack(arrays,axis=1) 
y = x @ w_true + b_true + np.random.normal(0,0.1,1000)
w = np.asarray([0.0,0.0,0.0],dtype='float64')
b = 1.0
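
As a reference point for what a good fit looks like on data of this shape, the closed-form least-squares solution can be computed directly. This is a minimal sketch, not part of the original post: the seeded generator and the names `x_ref`, `y_ref`, `design`, `w_ls`, `b_ls` and `best_cost` are assumptions made for illustration. Since the noise has standard deviation 0.1, the half-MSE cost cannot be expected to go much below 0.1² / 2 = 0.005.

```python
import numpy as np

# Seeded generator so the sketch is reproducible (an assumption, the post uses np.random directly).
rng = np.random.default_rng(0)
n = 1000

# Same feature distributions and true parameters as in the post.
x_ref = np.stack(
    [rng.normal(0, 1, n), rng.uniform(size=n), rng.normal(1, 2, n)], axis=1
)
y_ref = x_ref @ np.array([0.2, 0.5, -0.2]) - 1 + rng.normal(0, 0.1, n)

# Append a column of ones so the bias is fitted together with the weights.
design = np.column_stack([x_ref, np.ones(n)])
solution, *_ = np.linalg.lstsq(design, y_ref, rcond=None)
w_ls, b_ls = solution[:3], solution[3]

# Half-MSE cost at the least-squares solution: the floor any SGD run can approach.
residual = y_ref - x_ref @ w_ls - b_ls
best_cost = np.square(residual).mean() / 2
print(w_ls, b_ls, best_cost)
```

Comparing the SGD output against `w_ls`, `b_ls` and `best_cost` gives a quick way to tell a converged run from a diverging one.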

After running this:

theta,cost_history = stochastic_gradient_descent(x,y,w,b)

print('Final cost/MSE:  {:0.3f}'.format(cost_history[-1]))

I get:

Final cost/MSE:  3005958172614261248.000

Here is the plot.

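Solution

Here are a few suggestions:

  • Your learning rate is too large for training: changing it to something like 1e-3 should be fine.
  • Your update step should divide by the actual batch size rather than by the length of the whole dataset, and the cost should accumulate the batch losses, as in the final result below.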
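
The effect of the denominator can be checked numerically by comparing the analytic gradient with a central finite-difference estimate on a single batch. This is a minimal sketch, assuming gradient functions that take `w` and `b` explicitly; `numerical_grad_w` and the `*_demo` names are hypothetical helpers introduced here.

```python
import numpy as np

def mse(x, y, w, b):
    # Half the mean squared error of the linear model x @ w + b.
    return np.square(y - x @ w - b).mean(0) / 2

def grad_w(y, x, w, b, n_samples):
    # Analytic gradient with respect to w; n_samples is the denominator under test.
    return -y @ x / n_samples + x.T @ x @ w / n_samples + b * x.mean(0)

def numerical_grad_w(x, y, w, b, eps=1e-6):
    # Central finite differences, one weight coordinate at a time.
    g = np.zeros_like(w)
    for j in range(len(w)):
        step = np.zeros_like(w)
        step[j] = eps
        g[j] = (mse(x, y, w + step, b) - mse(x, y, w - step, b)) / (2 * eps)
    return g

rng = np.random.default_rng(0)
x_demo = rng.normal(size=(100, 3))  # one batch of 100 rows
y_demo = x_demo @ np.array([0.2, 0.5, -0.2]) - 1 + rng.normal(0, 0.1, 100)
w_demo = np.array([0.3, -0.1, 0.4])
b_demo = 1.0

numeric = numerical_grad_w(x_demo, y_demo, w_demo, b_demo)
analytic = grad_w(y_demo, x_demo, w_demo, b_demo, len(y_demo))  # batch size
wrong = grad_w(y_demo, x_demo, w_demo, b_demo, 1000)            # full dataset length

print(np.abs(analytic - numeric).max())  # close to zero
print(np.abs(wrong - numeric).max())     # clearly nonzero
```

With the full dataset length as denominator, the first two terms are shrunk by a factor of ten while the bias term keeps its full size, so the update no longer points along the true gradient of the batch cost.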

Final result:

def stochastic_gradient_descent(X,y,w,b,learning_rate=0.01,iterations=500,batch_size =100):
    
    length = len(y)
    cost_history = np.zeros(iterations)
    n_batches = int(length/batch_size)
    
    for it in range(iterations):
        cost =0
        indices = np.random.permutation(length)
        X = X[indices]
        y = y[indices]
        for i in range(0,length,batch_size):
            X_i = X[i:i+batch_size]
            y_i = y[i:i+batch_size]

            w -= learning_rate*grad_w(y_i,X_i,w,b,len(X_i)) # the denominator should be the actual batch size
            b -= learning_rate*grad_b(y_i,X_i,w,b,len(X_i))
            
            cost += mse(X_i,y_i,w,b)*len(X_i) # add batch loss
        cost_history[it]  = cost/length # this is a running average of your batch losses, which is statistically more stable
        if cost_history[it] <= 0.0052: break
        
    return w,cost_history[:it+1] # include the last recorded epoch
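
The reason for weighting each batch loss by its size before dividing by the dataset length can be seen with the parameters held fixed: the weighted sum then reproduces the full-dataset cost exactly, even when the last batch is smaller. This is a minimal sketch with made-up data; all `*_demo` names and the 250-row dataset are assumptions for illustration.

```python
import numpy as np

def mse(x, y, w, b):
    # Half the mean squared error of the linear model x @ w + b.
    return np.square(y - x @ w - b).mean(0) / 2

rng = np.random.default_rng(1)
x_demo = rng.normal(size=(250, 3))  # 250 rows, so the last batch has only 50
y_demo = rng.normal(size=250)
w_demo = np.array([0.1, -0.2, 0.3])
b_demo = 0.5

batch_size = 100
weighted_sum = 0.0
batch_costs = []
for i in range(0, len(y_demo), batch_size):
    xb, yb = x_demo[i:i + batch_size], y_demo[i:i + batch_size]
    batch_cost = mse(xb, yb, w_demo, b_demo)
    weighted_sum += batch_cost * len(xb)  # weight by the actual batch size
    batch_costs.append(batch_cost)

epoch_cost = weighted_sum / len(y_demo)
full_cost = mse(x_demo, y_demo, w_demo, b_demo)
print(epoch_cost, full_cost)            # the two agree
print(np.mean(batch_costs), full_cost)  # the plain mean over-weights the short batch
```

Inside the training loop the parameters change between batches, so the epoch value is a running average rather than the exact full-dataset cost, but it is far less noisy than the loss of whichever batch happened to come last.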

Hey @TQCH, thank you. I came up with another way to implement SGD without the inner loop, and the results are also quite good.

def stochastic_gradient_descent(X,y,w,b,learning_rate=0.35,iterations=3000,batch_size =100):
    
    length = len(y)
    cost_history = np.zeros(iterations)
    n_batches = int(length/batch_size)
    marker = 0
    cost = mse(X,y,w,b)
    print(cost)
    for it in range(iterations):
        cost =0
        indices = np.random.choice(length,batch_size)
        X_i = X[indices]
        y_i = y[indices]

        # the denominator is assumed to be the batch size; the argument was lost in the original listing
        w -= learning_rate*grad_w(y_i,X_i,w,b,batch_size)
        b -= learning_rate*grad_b(y_i,X_i,w,b,batch_size)
            
        cost = mse(X_i,y_i,w,b)
        cost_history[it]  = cost
        if cost_history[it] <= 0.0075 and cost_history[it] > 0.0071: marker = it
        if cost <= 0.0052: break
    print(f'{w},{b}')
    return w,cost_history,marker,cost
w = np.asarray([0.0,0.0,0.0],dtype='float64')
b = 1.0
theta,cost_history,marker,cost = stochastic_gradient_descent(x,y,w,b)

print(f'Number of iterations: {marker}')
print('Final cost/MSE:  {:0.3f}'.format(cost))

This gave me these results:

1.9443112664859845
[0.19592532 0.31735225 -0.20044424],-0.9059800816290591
Number of iterations: 68
Final cost/MSE:  0.005

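But you're right: I missed that I was dividing by the total length of the vector y instead of by the batch size, and I forgot to accumulate the batch losses!

Thanks!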
