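How to fix LightGBM training/validation scores not moving when using a custom loss function
I have been trying to train a LightGBM model with a custom objective (loss) function, without much success. Specifically, I get the following output: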
[1] training's jane_street: 580.501 valid_1's jane_street: 18.7899
Training until validation scores don't improve for 10 rounds
[2] training's jane_street: 580.501 valid_1's jane_street: 18.7899
[3] training's jane_street: 580.501 valid_1's jane_street: 18.7899
[4] training's jane_street: 580.501 valid_1's jane_street: 18.7899
[5] training's jane_street: 580.501 valid_1's jane_street: 18.7899
The scores do not change at all! I am pasting part of the relevant code below (I can add more if needed):
import numpy as np
import lightgbm as lgb

def jane_street_objective(preds, train_data):
    # Pick the DataFrame whose resp column matches the labels of this Dataset.
    y = train_data.get_label()
    if np.allclose(y, train_df.resp.values):
        rel_df = train_df
    elif np.allclose(y, validation_df.resp.values):
        rel_df = validation_df
    else:
        print("Error")
    yhat = preds
    grad, hess = get_gradient_hess_anal(preds, rel_df)
    return grad, hess

def jane_street_metric(preds, train_data):
    y = train_data.get_label()
    if len(y) == len(train_df):
        rel_df = train_df
    elif len(y) == len(validation_df):
        rel_df = validation_df
    else:
        print("Error")
    yhat = preds
    is_higher_better = True
    return 'jane_street', get_new_utility(preds, rel_df), is_higher_better

def get_new_utility(resp_prediction, full_df):
    # Per-row profit: weight * resp * action derived from the prediction.
    full_df['p'] = full_df['weight'] * full_df['resp'] * func_to_transform_resp_to_action(resp_prediction)
    # Sum the profit per date, then square the daily totals.
    groupby_metric_date = full_df.groupby(['date']).agg({'p': 'sum'}).reset_index()
    groupby_metric_date['pSquared'] = groupby_metric_date['p'] * groupby_metric_date['p']
    sump = groupby_metric_date.p.sum()
    sumpSquared = groupby_metric_date.pSquared.sum()
    num_dates = groupby_metric_date.date.nunique()
    if sumpSquared != 0:
        t = (sump / np.sqrt(sumpSquared)) * (np.sqrt(250 / num_dates))
    else:
        t = 0
    # Clip t to the range [0, 6].
    min_max_val = 0
    if t <= 0:
        min_max_val = 0
    elif t < 6:
        min_max_val = t
    else:
        min_max_val = 6
    u = min_max_val * sump
    return u
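
For sanity-checking the metric on a handful of rows, here is a minimal, dependency-free sketch of the same utility computation. Note that `toy_action` is a hypothetical stand-in for `func_to_transform_resp_to_action`, which is not shown above; it assumes the action is a simple threshold on the prediction. The names `toy_utility` and `toy_action` and the sample data are made up for illustration.

```python
import math
from collections import defaultdict

def toy_action(pred):
    # ASSUMPTION: hypothetical stand-in for func_to_transform_resp_to_action.
    # Trade (action 1) when the predicted resp is positive, otherwise skip (0).
    return 1 if pred > 0 else 0

def toy_utility(dates, weights, resps, preds, action=toy_action):
    # Sum weight * resp * action per date.
    p_by_date = defaultdict(float)
    for d, w, r, yhat in zip(dates, weights, resps, preds):
        p_by_date[d] += w * r * action(yhat)
    sum_p = sum(p_by_date.values())
    sum_p_sq = sum(p * p for p in p_by_date.values())
    num_dates = len(p_by_date)
    if sum_p_sq == 0:
        return 0.0
    t = (sum_p / math.sqrt(sum_p_sq)) * math.sqrt(250 / num_dates)
    # Clip t to [0, 6], then scale by the total profit.
    return min(max(t, 0.0), 6.0) * sum_p

# Made-up sample data: two dates, two rows each.
dates = [0, 0, 1, 1]
weights = [1.0, 2.0, 1.0, 1.0]
resps = [0.5, -0.25, 0.5, 0.25]
preds = [0.3, -0.1, 0.2, 0.4]

u_base = toy_utility(dates, weights, resps, preds)
# Nudge every prediction slightly, as a single small boosting step might.
u_nudged = toy_utility(dates, weights, resps, [p + 0.01 for p in preds])
print(u_base, u_nudged)
```

Under this threshold assumption, the utility only changes when a prediction crosses zero, so small updates to the predictions can leave the reported metric identical from round to round. Whether that applies here depends on the real action function.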

num_boost_round = 10000
early_stopping_rounds = 10

X = train_df[x_cols]
Y = train_df[['resp']].values.ravel()
lgb_params = {'learning_rate': 0.05, 'objective': 'regression'}
lgb_complete_data = lgb.Dataset(X, Y, init_score=Y)

valid_X = validation_df[x_cols]
valid_Y = validation_df[['resp']].values.ravel()
lgb_validation_data = lgb.Dataset(valid_X, valid_Y, init_score=valid_Y)

model_lgb = lgb.train(
    params=lgb_params,
    train_set=lgb_complete_data,
    num_boost_round=num_boost_round,
    early_stopping_rounds=early_stopping_rounds,
    valid_sets=[lgb_complete_data, lgb_validation_data],
    fobj=jane_street_objective,
    feval=jane_street_metric,
    verbose_eval=True,
)
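
One way to narrow this down is to inspect what the custom objective actually returns on each round. The helper below is a hypothetical diagnostic (the name `summarize_grad_hess` and the tolerance are my own choices, not part of LightGBM); it could be called on `grad` and `hess` inside `jane_street_objective` just before the `return`. If nearly all gradients are zero, or the Hessian has non-positive entries, that would be consistent with scores that never move. It may also be worth checking the effect of `init_score=Y`: as far as I understand that parameter, it makes the starting raw predictions equal to the labels, so the first call to the objective would see `preds` equal to `resp`.

```python
def summarize_grad_hess(grad, hess, tol=1e-12):
    # Hypothetical diagnostic helper: summarizes the arrays returned by a
    # custom objective. Works on lists or NumPy arrays of equal length.
    n = len(grad)
    return {
        "n": n,
        # Share of gradient entries that are numerically zero.
        "frac_zero_grad": sum(abs(g) <= tol for g in grad) / n,
        "max_abs_grad": max(abs(g) for g in grad),
        "min_hess": min(hess),
        # Count of Hessian entries that are zero or negative.
        "num_nonpositive_hess": sum(h <= 0 for h in hess),
    }

# Made-up values for illustration only.
stats = summarize_grad_hess([0.0, 0.0, 1e-15, 0.2], [1.0, 0.0, -0.5, 2.0])
print(stats)
```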
Am I missing something obvious here?
Thanks
巴比努