
LightGBM training/validation scores do not move when using a custom loss function

I have been trying to train a LightGBM model with a custom objective (loss) function, without much success. Specifically, I get the following output:

[1] training's jane_street: 580.501 valid_1's jane_street: 18.7899
Training until validation scores don't improve for 10 rounds
[2] training's jane_street: 580.501 valid_1's jane_street: 18.7899
[3] training's jane_street: 580.501 valid_1's jane_street: 18.7899
[4] training's jane_street: 580.501 valid_1's jane_street: 18.7899
[5] training's jane_street: 580.501 valid_1's jane_street: 18.7899

The scores do not change at all! Below is part of the relevant code (I can add more if needed):

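import numpy as np
import lightgbm as lgb

# train_df, validation_df, x_cols, get_gradient_hess_anal and
# func_to_transform_resp_to_action are defined elsewhere (not shown).

def jane_street_objective(preds, train_data):
    # Pick the frame whose 'resp' column matches the labels of this Dataset.
    y = train_data.get_label()
    if np.allclose(y, train_df.resp.values):
        rel_df = train_df
    elif np.allclose(y, validation_df.resp.values):
        rel_df = validation_df
    else:
        # Fail loudly instead of continuing with rel_df undefined.
        raise ValueError("Labels match neither train_df nor validation_df")

    # Analytical gradient and hessian of the loss with respect to preds.
    grad, hess = get_gradient_hess_anal(preds, rel_df)

    return grad, hess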

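def jane_street_metric(preds, train_data):
    # Pick the frame by length; this assumes the two frames differ in size.
    y = train_data.get_label()
    if len(y) == len(train_df):
        rel_df = train_df
    elif len(y) == len(validation_df):
        rel_df = validation_df
    else:
        # Fail loudly instead of continuing with rel_df undefined.
        raise ValueError("Label count matches neither train_df nor validation_df")

    is_higher_better = True
    return 'jane_street', get_new_utility(preds, rel_df), is_higher_better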


def get_new_utility(resp_prediction, full_df):
    # Per-row profit: weight * resp * action, where the action comes from the prediction.
    full_df['p'] = full_df['weight'] * full_df['resp'] * func_to_transform_resp_to_action(resp_prediction)
    # Total profit per date, and its square.
    groupby_metric_date = full_df.groupby(['date']).agg({'p': 'sum'}).reset_index()
    groupby_metric_date['pSquared'] = groupby_metric_date['p'] * groupby_metric_date['p']
    sump = groupby_metric_date.p.sum()
    sumpSquared = groupby_metric_date.pSquared.sum()
    num_dates = groupby_metric_date.date.nunique()
    if sumpSquared != 0:
        t = (sump / np.sqrt(sumpSquared)) * np.sqrt(250 / num_dates)
    else:
        t = 0
    # Clamp t to the range [0, 6].
    if t <= 0:
        min_max_val = 0
    elif t < 6:
        min_max_val = t
    else:
        min_max_val = 6

    u = min_max_val * sump
    return u


num_boost_round = 10000
early_stopping_rounds = 10

X = train_df[x_cols]
Y = train_df[['resp']].values.ravel()

lgb_params = {'learning_rate': 0.05, 'objective': 'regression'}

lgb_complete_data = lgb.Dataset(X, Y, init_score=Y)
valid_X = validation_df[x_cols]
valid_Y = validation_df[['resp']].values.ravel()

lgb_validation_data = lgb.Dataset(valid_X, valid_Y, init_score=valid_Y)
model_lgb = lgb.train(
    params=lgb_params,
    train_set=lgb_complete_data,
    num_boost_round=num_boost_round,
    early_stopping_rounds=early_stopping_rounds,
    valid_sets=[lgb_complete_data, lgb_validation_data],
    fobj=jane_street_objective,
    feval=jane_street_metric,
    verbose_eval=True,
)
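
A sanity check that may help narrow this down: wrap the custom objective so that each call reports summary statistics of the gradient and hessian it returns. If the gradients turn out to be all zero or non-finite, the booster has nothing to fit and a flat metric would be expected. The sketch below uses only the standard library; `summarize`, `with_gradient_report` and `zero_gradient_objective` are hypothetical names introduced for illustration and are not part of LightGBM.

```python
import math


def summarize(values):
    """Summary statistics over the finite entries of a numeric sequence."""
    vals = [float(v) for v in values]
    finite = [v for v in vals if math.isfinite(v)]
    nan = float("nan")
    return {
        "min": min(finite) if finite else nan,
        "max": max(finite) if finite else nan,
        "mean_abs": sum(abs(v) for v in finite) / len(finite) if finite else nan,
        "non_finite": len(vals) - len(finite),
    }


def with_gradient_report(objective, log=print):
    """Wrap an objective (preds, dataset) -> (grad, hess) and log stats on every call."""
    def wrapped(preds, dataset):
        grad, hess = objective(preds, dataset)
        log({"grad": summarize(grad), "hess": summarize(hess)})
        return grad, hess
    return wrapped


def zero_gradient_objective(preds, dataset):
    # Hypothetical objective that always returns zero gradients,
    # used here only to show what a degenerate report looks like.
    return [0.0 for _ in preds], [1.0 for _ in preds]


reports = []
checked = with_gradient_report(zero_gradient_objective, log=reports.append)
checked([0.1, -0.2, 0.3], None)
print(reports[0])
```

With the setup above, the wrapped function would be passed in place of the original, for example `fobj=with_gradient_report(jane_street_objective)`. A `mean_abs` of zero for the gradient on the first iteration would point at the objective rather than at the metric.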

Am I missing something obvious here?

Thanks

巴比努
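
Another possibility worth ruling out is that the metric itself is insensitive to small changes in the predictions. The sketch below re-implements the utility from `get_new_utility` in plain Python. Since `func_to_transform_resp_to_action` is not shown, it assumes the action is a simple threshold at zero (the hypothetical `to_action`). It also assumes that `init_score=Y` makes the raw scores start at the labels themselves. Under these assumptions, a small shift in the predictions flips no actions and leaves the utility unchanged; the sample values are made up.

```python
import math
from collections import defaultdict


def to_action(pred):
    # ASSUMPTION: stand-in for func_to_transform_resp_to_action (not shown in the post).
    return 1.0 if pred > 0 else 0.0


def utility(preds, weights, resps, dates):
    """Same steps as get_new_utility: daily profit, t statistic, clamp, scale."""
    p_by_date = defaultdict(float)
    for pred, w, r, d in zip(preds, weights, resps, dates):
        p_by_date[d] += w * r * to_action(pred)
    sum_p = sum(p_by_date.values())
    sum_p_sq = sum(p * p for p in p_by_date.values())
    if sum_p_sq == 0:
        return 0.0
    t = sum_p / math.sqrt(sum_p_sq) * math.sqrt(250 / len(p_by_date))
    return min(max(t, 0.0), 6.0) * sum_p


# Made-up sample data: four rows over two dates.
weights = [1.0, 2.0, 1.0, 0.5]
resps = [0.02, -0.01, 0.03, -0.04]
dates = [0, 0, 1, 1]

start = list(resps)                  # scores equal to the labels (assumed effect of init_score=Y)
nudged = [s + 0.001 for s in start]  # a small boosting step that flips no sign

u_start = utility(start, weights, resps, dates)
u_nudged = utility(nudged, weights, resps, dates)
print(u_start == u_nudged)  # True: no action changed, so the utility is identical
```

If the real action transform is continuous rather than a threshold, this argument does not apply, and checking the gradients directly is likely the more useful test.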
