How do I fix a NaN loss in PyTorch with this GRU-based RNN?

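I am completely new to PyTorch and have been trying out a few models. I wanted to make a simple stock price prediction and found the following code:

I load the dataset with pandas, split it into training and test data, and wrap each split in a PyTorch DataLoader for use during training. The model is defined in the GRU class. The actual problem, though, seems to be the optimization. I think it might be exploding gradients. I considered adding gradient clipping, but the GRU design should actually prevent exploding gradients, or am I wrong? What could cause the loss to become NaN immediately (already in the first epoch)?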

from sklearn.preprocessing import MinMaxScaler

import time
import pandas as pd
import numpy as np

import torch
import torch.nn as nn
from torch.utils.data import TensorDataset,DataLoader

batch_size = 200
input_dim = 1
hidden_dim = 32
num_layers = 2
output_dim = 1
num_epochs = 10

nvda = pd.read_csv('dataset/stocks/NVDA.csv')
price = nvda[['Close']]
scaler = MinMaxScaler(feature_range=(-1,1))
price['Close'] = scaler.fit_transform(price['Close'].values.reshape(-1,1))

def split_data(stock,lookback):
    data_raw = stock.to_numpy()  # convert to numpy array
    data = []

    # create all possible sequences of length seq_len
    for index in range(len(data_raw) - lookback):
        data.append(data_raw[index: index + lookback])

    data = np.array(data)
    test_set_size = int(np.round(0.2 * data.shape[0]))
    train_set_size = data.shape[0] - (test_set_size)

    x_train = data[:train_set_size,:-1,:]
    y_train = data[:train_set_size,-1,:]

    x_test = data[train_set_size:,:-1,:]
    y_test = data[train_set_size:,-1,:]

    return [x_train,y_train,x_test,y_test]


lookback = 20  # choose sequence length
x_train,y_train,x_test,y_test = split_data(price,lookback)

train_data = TensorDataset(torch.from_numpy(x_train).float(),torch.from_numpy(y_train).float())
train_data = DataLoader(train_data,shuffle=True,batch_size=batch_size,drop_last=True)

test_data = TensorDataset(torch.from_numpy(x_test).float(),torch.from_numpy(y_test).float())
test_data = DataLoader(test_data,drop_last=True)


class GRU(nn.Module):
    def __init__(self,input_dim,hidden_dim,num_layers,output_dim):
        super(GRU,self).__init__()
        self.hidden_dim = hidden_dim
        self.num_layers = num_layers

        self.gru = nn.GRU(input_dim,hidden_dim,num_layers,batch_first=True,dropout=0.2)
        self.fc = nn.Linear(hidden_dim,output_dim)
        self.relu = nn.ReLU()

    def forward(self,x,h):

        out,h = self.gru(x,h)
        out = self.fc(self.relu(out[:,-1]))
        return out,h

    def init_hidden(self,batch_size):
        weight = next(self.parameters()).data
        hidden = weight.new(self.num_layers,batch_size,self.hidden_dim).zero_()
        return hidden


model = GRU(input_dim=input_dim,hidden_dim=hidden_dim,output_dim=output_dim,num_layers=num_layers)
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(),lr=0.0000000001)
model.train()

start_time = time.time()

h = model.init_hidden(batch_size)
for epoch in range(1,num_epochs+1):
    for x,y in train_data:
        h = h.data
        model.zero_grad()
        y_train_pred,h = model(x,h)
        loss = criterion(y_train_pred,y)
        print("Epoch ",epoch,"MSE: ",loss.item())
        loss.backward()
        optimizer.step()


training_time = time.time() - start_time
print("Training time: {}".format(training_time))

This is the dataset I use.

Answer

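I'm not sure whether this is the cause, but have you preprocessed and cleaned the data? I can't say for certain, but maybe some values are missing or something odd is in there. I checked here: https://ca.finance.yahoo.com/quote/NVDA/history?p=NVDA and it seems there are some inconsistencies every couple of rows. As I said, I don't know whether that is actually the case.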
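One way to test this suspicion is to count the missing or non-finite values in the Close column before scaling. The sketch below is only an illustration: the helper name `clean_close_column` and the toy frame are made up here, and whether the real NVDA.csv contains such rows is not confirmed.

```python
import numpy as np
import pandas as pd


def clean_close_column(frame: pd.DataFrame, column: str = "Close") -> pd.DataFrame:
    """Return a copy of `frame` without rows whose `column` is missing or non-finite.

    Hypothetical helper for illustration; not part of the original script.
    """
    # Coerce anything non-numeric (e.g. the string "null") to NaN.
    values = pd.to_numeric(frame[column], errors="coerce")
    # True where the value is a real number (not NaN, not +/-inf).
    finite = np.isfinite(values.to_numpy(dtype=float))
    print(f"{(~finite).sum()} of {len(frame)} rows have a missing or non-finite {column}")
    cleaned = frame.loc[finite].copy()
    cleaned[column] = values[finite]
    return cleaned.reset_index(drop=True)


# Toy stand-in for NVDA.csv (made-up values, not real prices).
toy = pd.DataFrame({"Close": [10.0, np.nan, 10.5, None, 11.0]})
cleaned = clean_close_column(toy)
```

In the script from the question this would run right after `pd.read_csv`, for example `nvda = clean_close_column(nvda)`. It matters because `MinMaxScaler` disregards NaN when fitting and keeps it when transforming, so a single missing Close value would reach the tensors, and one NaN in a batch is enough to make the MSE loss NaN.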
