
What is wrong with this MLP backpropagation implementation?

I am writing an MLP neural network in C++, but I am struggling with the backpropagation. My implementation closely follows this article, but I have done something wrong and cannot spot the problem. My Matrix class ensures that there are no dimension mismatches in any of the matrix calculations, yet the output always ends up close to zero or to infinity. Is this the "vanishing"/"exploding" gradient problem described here, or is something else wrong?

These are my activation function and its derivative:

#include <cmath>

// Logistic sigmoid activation
double sigmoid(double d) {
    return 1/(1+exp(-d));
}

// Derivative of the sigmoid with respect to its input
double dsigmoid(double d) {
    return sigmoid(d) * (1 - sigmoid(d));
}
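A small aside on efficiency: as written, dsigmoid evaluates sigmoid twice per call. Since the derivative can be expressed through the function's own output, sigmoid'(x) = sigmoid(x) * (1 - sigmoid(x)), one evaluation suffices; and if the activated value is already cached (this->layer stores post-activation outputs), no exp() is needed at all. A minimal sketch; dsigmoid_from_output is a hypothetical helper, not part of the original code:

// One sigmoid evaluation instead of two
double dsigmoid(double d) {
    double s = sigmoid(d);
    return s * (1 - s);
}

// Derivative computed from an already-activated value y = sigmoid(x)
double dsigmoid_from_output(double y) {
    return y * (1 - y);
}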

This is my training algorithm:

void KNN::train(const Matrix& input,const Matrix& target) {
    // Forward pass: each layer is sigmoid(weights * previous layer)
    this->layer[0] = input;
    for(uint i = 1; i <= this->num_depth+1; i++) {
        this->layer[i] = Matrix::multiply(this->weights[i-1],this->layer[i-1]);
        this->layer[i] = Matrix::function(this->layer[i],sigmoid);
    }
    // Output-layer delta: (output - target), element-wise times dsigmoid(pre-activation)
    this->deltas[this->num_depth+1] = Matrix::multiply(Matrix::subtract(this->layer[this->num_depth+1],target),Matrix::function(Matrix::multiply(this->weights[this->num_depth],this->layer[this->num_depth]),dsigmoid),true);
    this->gradients[this->num_depth+1] = Matrix::multiply(this->deltas[this->num_depth+1],Matrix::transpose(this->layer[this->num_depth]));
    this->weights[this->num_depth] = Matrix::subtract(this->weights[this->num_depth],Matrix::multiply(Matrix::multiply(this->weights[this->num_depth],this->learning_rate),this->gradients[this->num_depth+1],true));
    // Backward pass through the hidden layers
    for(int i = this->num_depth; i > 0; i--) {
        this->deltas[i] = Matrix::multiply(Matrix::multiply(Matrix::transpose(this->weights[i]),this->deltas[i+1]),Matrix::function(Matrix::multiply(this->weights[i-1],this->layer[i-1]),dsigmoid),true);
        this->gradients[i] = Matrix::multiply(this->deltas[i],Matrix::transpose(this->layer[i-1]));
        this->weights[i-1] = Matrix::subtract(this->weights[i-1],Matrix::multiply(Matrix::multiply(this->weights[i-1],this->learning_rate),this->gradients[i],true));
    }
}
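For comparison: the weight updates above subtract (weights * learning_rate), element-wise multiplied by the gradients, so the step depends on the current weight values. Plain gradient descent subtracts learning_rate * gradient with no such rescaling. A minimal sketch of that conventional step, reusing the scalar overload of Matrix::multiply that the code above already applies to this->learning_rate:

// Conventional SGD step for the output layer: W = W - learning_rate * gradient.
// Note there is no Hadamard product with the weights themselves.
this->weights[this->num_depth] = Matrix::subtract(
    this->weights[this->num_depth],
    Matrix::multiply(this->gradients[this->num_depth+1],this->learning_rate));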

The third argument of Matrix::multiply indicates whether to use the Hadamard product (false by default). this->num_depth is the number of hidden layers.
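For readers unfamiliar with the term: a Hadamard product multiplies two equally sized matrices entry by entry, unlike the ordinary matrix product. A self-contained stand-in (the real Matrix class is not shown in the question, so a flat vector is used here purely for illustration):

#include <cstddef>
#include <stdexcept>
#include <vector>

// Hadamard (element-wise) product of two equally sized matrices stored
// as flat row-major vectors; this is the operation the third argument
// of Matrix::multiply is described as enabling.
std::vector<double> hadamard(const std::vector<double>& a,
                             const std::vector<double>& b) {
    if (a.size() != b.size())
        throw std::invalid_argument("Hadamard product requires equal shapes");
    std::vector<double> out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i)
        out[i] = a[i] * b[i]; // multiply matching entries
    return out;
}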

Adding biases seems to change something, but the output still almost always tends toward zero. Here is the version with biases:

void KNN::train(const Matrix& input,const Matrix& target) {
    // Forward pass, now with a bias term added before the activation
    this->layer[0] = input;
    for(uint i = 1; i <= this->num_depth+1; i++) {
        this->layer[i] = Matrix::multiply(this->weights[i-1],this->layer[i-1]);
        this->layer[i] = Matrix::add(this->layer[i],this->biases[i-1]);
        this->layer[i] = Matrix::function(this->layer[i],this->activation);
    }
    this->deltas[this->num_depth+1] = Matrix::multiply(Matrix::subtract(this->layer[this->num_depth+1],target),Matrix::function(Matrix::multiply(this->weights[this->num_depth],this->layer[this->num_depth]),this->dactivation),true);
    this->gradients[this->num_depth+1] = Matrix::multiply(this->deltas[this->num_depth+1],Matrix::transpose(this->layer[this->num_depth]));
    this->weights[this->num_depth] = Matrix::subtract(this->weights[this->num_depth],Matrix::multiply(Matrix::multiply(this->weights[this->num_depth],this->learning_rate),this->gradients[this->num_depth+1],true));
    this->biases[this->num_depth] = Matrix::subtract(this->biases[this->num_depth],Matrix::multiply(this->deltas[this->num_depth+1],this->learning_rate * .5));
    for(uint i = this->num_depth+1 -1; i > 0; i--) {
        this->deltas[i] = Matrix::multiply(Matrix::multiply(Matrix::transpose(this->weights[i+1 -1]),this->deltas[i+1]),Matrix::function(Matrix::multiply(this->weights[i-1],this->layer[i-1]),this->dactivation),true);
        this->gradients[i] = Matrix::multiply(this->deltas[i],Matrix::transpose(this->layer[i-1]));
        this->weights[i-1] = Matrix::subtract(this->weights[i-1],Matrix::multiply(Matrix::multiply(this->weights[i-1],this->learning_rate),this->gradients[i],true));
        this->biases[i-1] = Matrix::subtract(this->biases[i-1],Matrix::multiply(this->deltas[i],this->learning_rate * .5));
    }
}
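As for the vanishing/exploding question: one empirical way to tell the two apart is to log the average magnitude of each deltas[i] during the backward pass. Magnitudes that shrink toward zero layer by layer point at vanishing gradients; magnitudes that grow by orders of magnitude point at explosion. A hypothetical diagnostic sketch, assuming the matrix entries can be copied out as a flat std::vector<double> (the question's Matrix class may or may not expose such an accessor):

#include <cmath>
#include <cstdio>
#include <vector>

// Mean absolute value of a matrix's entries, passed as a flat vector.
double mean_abs(const std::vector<double>& values) {
    if (values.empty()) return 0.0;
    double sum = 0.0;
    for (double v : values) sum += std::fabs(v);
    return sum / values.size();
}

// Inside the backward loop, one could then print per layer, e.g.:
//   std::printf("layer %u: mean |delta| = %g\n", i, mean_abs(delta_entries));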
