How do I get my neural network to do linear regression properly?

I used the first neural network code from Michael Nielsen's book "Neural Networks and Deep Learning", which is written to recognize handwritten digits. It uses stochastic gradient descent with mini-batches and sigmoid activation functions. I gave it one input neuron, two hidden neurons, and one output neuron, then fed it a bunch of data representing a straight line: points between 0 and 1 where the input equals the output. No matter how I tune the learning rate and the number of epochs, the network never manages to do the linear regression. Is this because I use a sigmoid activation function? If so, which other activation functions could I use?

[Image: the prediction of the network based on new input]

The blue line is the network's prediction and the green line is the training data; the inputs for the prediction are simply the numbers from 0 to 3 in steps of 0.01.
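
A minimal sketch of the setup described above (the driver code isn't posted, so the exact calls and hyperparameters here are assumptions):

import numpy as np

net = Network([1, 2, 1])  # one input, two hidden, one output neuron
# points on the line y = x; each sample is a (1, 1) column vector,
# which is the shape feedforward() and backprop() expect
training_data = [(np.array([[x]]), np.array([[x]]))
                 for x in np.arange(0.0, 1.0, 0.01)]
net.SGD(training_data, 30, 10, 3.0)  # 30 epochs, mini-batches of 10, eta = 3.0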

Here is the code:

"""
network.py
~~~~~~~~~~
A module to implement the stochastic gradient descent learning
algorithm for a feedforward neural network.  Gradients are calculated
using backpropagation.  Note that I have focused on making the code
simple,easily readable,and easily modifiable.  It is not optimized,and omits many desirable features.
"""

#### Libraries
# Standard library
import random

# Third-party libraries
import numpy as np

from sklearn.datasets import make_regression
import matplotlib.pyplot as plt

class Network(object):

    def __init__(self,sizes):
        """The list ``sizes`` contains the number of neurons in the
        respective layers of the network.  For example,if the list
        was [2,3,1] then it would be a three-layer network,with the
        first layer containing 2 neurons,the second layer 3 neurons,and the third layer 1 neuron.  The biases and weights for the
        network are initialized randomly,using a Gaussian
        distribution with mean 0,and variance 1.  Note that the first
        layer is assumed to be an input layer,and by convention we
        won't set any biases for those neurons,since biases are only
        ever used in computing the outputs from later layers."""
        self.num_layers = len(sizes)
        self.sizes = sizes
        '''Creates a list of arrays of random numbers with mean 0 and variance 1.
        Each array represents one layer of biases, with one random number
        assigned per neuron in that layer.
        '''
        self.biases = [np.random.randn(y,1) for y in sizes[1:]]
        self.weights = [np.random.randn(y,x)
                        for x,y in zip(sizes[:-1],sizes[1:])]

    #self always refers to an instance of a class
    def feedforward(self,a):
        # ``a`` is the vector of activations, updated layer by layer
        """Return the output of the network if ``a`` is input."""
        for b,w in zip(self.biases,self.weights):
            a = sigmoid(np.dot(w,a)+b)
            
        return a

    def SGD(self,training_data,epochs,mini_batch_size,eta,test_data=None):
        """Train the neural network using mini-batch stochastic
        gradient descent.  The ``training_data`` is a list of tuples
        ``(x,y)`` representing the training inputs and the desired
        outputs.  The other non-optional parameters are
        self-explanatory.  If ``test_data`` is provided then the
        network will be evaluated against the test data after each
        epoch,and partial progress printed out.  This is useful for
        tracking progress,but slows things down substantially."""
        if test_data: n_test = len(test_data)
        n = len(training_data)
        #one full pass over the training data per epoch -> that is how often the network is trained
        for j in range(epochs):
            random.shuffle(training_data)
            mini_batches = [
                training_data[k:k+mini_batch_size]
                for k in range(0,n,mini_batch_size)]
            #data is made into appropriately sized mini-batches
            for mini_batch in mini_batches:
                self.update_mini_batch(mini_batch,eta)
                for x,y in mini_batch:
                    print("Loss: ",(self.feedforward(x) - y)**2)
            if test_data:
                print ("Epoch {0}: {1} / {2}".format(
                    j,self.evaluate(test_data),n_test))
            else:
                print ("Epoch {0} complete".format(j))

    def update_mini_batch(self,mini_batch,eta):
        """Update the network's weights and biases by applying
        gradient descent using backpropagation to a single mini batch.
        The ``mini_batch`` is a list of tuples ``(x,y)``,and ``eta``
        is the learning rate."""
        #nabla_b and nabla_w are the same lists of matrices as "biases" and 
        #"weights" but all matrices are filled with zeroes; Thus,it is reset to 0 for every mini_batch.        
        nabla_b = [np.zeros(b.shape) for b in self.biases]
        nabla_w = [np.zeros(w.shape) for w in self.weights]
        for x,y in mini_batch:
            delta_nabla_b,delta_nabla_w = self.backprop(x,y)
            nabla_b = [nb+dnb for nb,dnb in zip(nabla_b,delta_nabla_b)]
            nabla_w = [nw+dnw for nw,dnw in zip(nabla_w,delta_nabla_w)]
        #updates the weights and biases by subtracting eta times the average of the
        #derivatives of the cost function wrt the weights/biases, summed over the
        #training examples in the mini_batch
        self.weights = [w-(eta/len(mini_batch))*nw
                        for w,nw in zip(self.weights,nabla_w)]
        self.biases = [b-(eta/len(mini_batch))*nb
                       for b,nb in zip(self.biases,nabla_b)]

    def backprop(self,x,y):
        """Return a tuple ``(nabla_b,nabla_w)`` representing the
        gradient for the cost function C_x.  ``nabla_b`` and
        ``nabla_w`` are layer-by-layer lists of numpy arrays,similar
        to ``self.biases`` and ``self.weights``."""
        """Makes two lists filled with zeros in the same shape as biases and weights"""
        nabla_b = [np.zeros(b.shape) for b in self.biases]
        nabla_w = [np.zeros(w.shape) for w in self.weights]
        # feedforward
        activation = x
        activations = [x]
        zs = [] # list to store all the z vectors, layer by layer
        for b,w in zip(self.biases,self.weights):
            #multiplies w matrix for each layer by activation vector and adds bias
            z = np.dot(w,activation)+b
            zs.append(z)
            activation = sigmoid(z)
            activations.append(activation)
        # backward pass
        #this calculates the output error
        delta = self.cost_derivative(activations[-1],y) * \
            sigmoid_prime(zs[-1])
        #this is the derivative of the cost function wrt the biases in the last layer
        nabla_b[-1] = delta
        #this is the derivative of the cost function wrt the weights in the last layer
        nabla_w[-1] = np.dot(delta,activations[-2].transpose())
        #in the loop below, l = 1 means the last layer, l = 2 the second-to-last, and so on
        for l in range(2,self.num_layers):
            z = zs[-l]
            sp = sigmoid_prime(z)
            #This is the vector of errors of the layer -l
            delta = np.dot(self.weights[-l+1].transpose(),delta) * sp
            #fills the matrices nabla_b and nabla_w with the derivatives of the 
            #cost function with respect to the biases and weights in layers -l
            nabla_b[-l] = delta
            nabla_w[-l] = np.dot(delta,activations[-l-1].transpose())
        return (nabla_b,nabla_w)

    def evaluate(self,test_data):
        """Return the number of test inputs for which the neural
        network outputs the correct result. Note that the neural
        network's output is assumed to be the index of whichever
        neuron in the final layer has the highest activation."""
        test_results = [(np.argmax(self.feedforward(x)),y)
                        for (x,y) in test_data]
        #returns the number of inputs that were predicted correctly.
        return sum(int(x == y) for (x,y) in test_results)

    def cost_derivative(self,output_activations,y):
        """Return the vector of partial derivatives \partial C_x /
        \partial a for the output activations."""
        return (output_activations-y)

#### Miscellaneous functions
def sigmoid(z):
    """The sigmoid function.""" 
    return 1.0/(1.0+np.exp(-z))

def sigmoid_prime(z):
    """Derivative of the sigmoid function."""
    return sigmoid(z)*(1-sigmoid(z))

Solution

Sigmoid activation functions are meant for classification tasks, such as the handwritten-digit recognition this code was written for. Linear regression is a regression task, where the output should be continuous. If you want the output layer to act as a regressor, it should use a linear activation function, which is the default for a Keras Dense layer. Two sketches of that change follow.
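
One way to apply that to the code above (a sketch only, not part of the book's code: the hidden layers keep the sigmoid and just the output layer switches to the identity activation):

# Sketch: a linear output layer for the Network class above.
def feedforward(self, a):
    for b, w in zip(self.biases[:-1], self.weights[:-1]):
        a = sigmoid(np.dot(w, a) + b)
    # identity activation on the last layer, so the output is not
    # squashed into (0, 1)
    return np.dot(self.weights[-1], a) + self.biases[-1]

In backprop, the forward loop has to leave the last layer unsquashed in the same way (append zs[-1] itself to activations instead of sigmoid(zs[-1])), and the output error loses its sigmoid_prime factor, because the derivative of the identity is 1:

delta = self.cost_derivative(activations[-1], y)

The same idea in Keras (a sketch assuming TensorFlow 2.x is installed; the layer sizes mirror the 1-2-1 network from the question):

import numpy as np
from tensorflow import keras

x = np.arange(0.0, 1.0, 0.01).reshape(-1, 1)
y = x.copy()  # the line y = x

model = keras.Sequential([
    keras.Input(shape=(1,)),
    keras.layers.Dense(2, activation="sigmoid"),
    keras.layers.Dense(1),  # no activation argument means linear output
])
model.compile(optimizer="sgd", loss="mse")
model.fit(x, y, epochs=200, batch_size=10, verbose=0)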
