How do I fix the NaN loss in PyTorch for this RNN with a GRU?
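I am completely new to PyTorch and have been trying out a few models. I wanted to make a simple stock price prediction and found the code below.
I load the dataset with pandas, split it into training and test data, and load both into PyTorch DataLoaders for use during training. The model is defined in the GRU class. The actual problem seems to be the optimization, though. I think it could be exploding gradients. I considered adding gradient clipping, but the GRU design should actually prevent exploding gradients, or am I wrong? What could cause the loss to become NaN immediately (already in the first epoch)?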
from sklearn.preprocessing import MinMaxScaler
import time
import pandas as pd
import numpy as np
import torch
import torch.nn as nn
from torch.utils.data import TensorDataset,DataLoader
batch_size = 200
input_dim = 1
hidden_dim = 32
num_layers = 2
output_dim = 1
num_epochs = 10
nvda = pd.read_csv('dataset/stocks/NVDA.csv')
price = nvda[['Close']]
scaler = MinMaxScaler(feature_range=(-1,1))
price['Close'] = scaler.fit_transform(price['Close'].values.reshape(-1,1))
def split_data(stock, lookback):
    data_raw = stock.to_numpy()  # convert to numpy array
    data = []
    # create all possible sequences of length lookback
    for index in range(len(data_raw) - lookback):
        data.append(data_raw[index: index + lookback])
    data = np.array(data)
    test_set_size = int(np.round(0.2 * data.shape[0]))
    train_set_size = data.shape[0] - test_set_size
    # inputs are the first lookback-1 steps of each window, targets the last step
    x_train = data[:train_set_size, :-1, :]
    y_train = data[:train_set_size, -1, :]
    x_test = data[train_set_size:, :-1]
    y_test = data[train_set_size:, -1, :]
    return [x_train, y_train, x_test, y_test]

lookback = 20  # choose sequence length
x_train, y_train, x_test, y_test = split_data(price, lookback)
train_data = TensorDataset(torch.from_numpy(x_train).float(),torch.from_numpy(y_train).float())
train_data = DataLoader(train_data,shuffle=True,batch_size=batch_size,drop_last=True)
test_data = TensorDataset(torch.from_numpy(x_test).float(),torch.from_numpy(y_test).float())
test_data = DataLoader(test_data,drop_last=True)
class GRU(nn.Module):
    def __init__(self, input_dim, hidden_dim, num_layers, output_dim):
        super().__init__()
        self.hidden_dim = hidden_dim
        self.num_layers = num_layers
        self.gru = nn.GRU(input_dim, hidden_dim, num_layers, batch_first=True, dropout=0.2)
        self.fc = nn.Linear(hidden_dim, output_dim)
        self.relu = nn.ReLU()

    def forward(self, x, h):
        out, h = self.gru(x, h)
        # use only the output of the last time step
        out = self.fc(self.relu(out[:, -1]))
        return out, h

    def init_hidden(self, batch_size):
        # zero hidden state with the same dtype and device as the model weights
        weight = next(self.parameters()).data
        hidden = weight.new_zeros(self.num_layers, batch_size, self.hidden_dim)
        return hidden

model = GRU(input_dim=input_dim,hidden_dim=hidden_dim,output_dim=output_dim,num_layers=num_layers)
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(),lr=0.0000000001)
model.train()
start_time = time.time()
h = model.init_hidden(batch_size)
for epoch in range(1, num_epochs + 1):
    for x, y in train_data:
        h = h.data  # detach the hidden state from the previous batch's graph
        model.zero_grad()
        y_train_pred, h = model(x, h)
        loss = criterion(y_train_pred, y)
        print("Epoch ", epoch, "MSE: ", loss.item())
        loss.backward()
        optimizer.step()

training_time = time.time() - start_time
print("Training time: {}".format(training_time))
This is the dataset I used.
Solution
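I am not sure this is the cause, but have you preprocessed and cleaned the data? There may be missing values or something else odd in it. I checked https://ca.finance.yahoo.com/quote/NVDA/history?p=NVDA, and there seem to be some inconsistencies every couple of rows. As I said, I do not know whether this is the cause.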
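One way to test this would be to count the missing values in the Close column before scaling. The sketch below is only an illustration: it reads a small inline CSV with made-up prices as a stand-in for NVDA.csv, and it assumes that simply dropping incomplete rows is acceptable (interpolating them would be another option).

```python
import io

import numpy as np
import pandas as pd

# Hypothetical stand-in for NVDA.csv with made-up prices; the third row has
# an empty Close field, which pandas parses as NaN.
sample_csv = io.StringIO(
    "Date,Close\n"
    "2020-01-02,59.98\n"
    "2020-01-03,59.02\n"
    "2020-01-06,\n"
    "2020-01-07,59.98\n"
)
nvda = pd.read_csv(sample_csv)

# Count missing and infinite values before any scaling or windowing.
n_missing = int(nvda["Close"].isna().sum())
n_infinite = int(np.isinf(nvda["Close"]).sum())
print("missing:", n_missing, "infinite:", n_infinite)

# Drop the incomplete rows; .copy() gives an independent frame, so the
# scaled column can be assigned back later without a pandas warning.
price = nvda.dropna(subset=["Close"])[["Close"]].copy()

# Fail early if anything non-finite is left.
assert np.isfinite(price["Close"].to_numpy()).all()
```

If the count is nonzero on the real file, the NaNs would pass through MinMaxScaler (it ignores NaN when fitting and keeps it when transforming), end up in the training windows, and make the MSE NaN from the first batch, whatever the learning rate. Gradient clipping would not help in that case, because the loss is already NaN before any gradient is computed.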