Why is the n_informative parameter of make_regression() not taken into account?

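I am studying the support vector machine algorithm (SVR). Changing nothing but the random_state parameter of make_regression() produces drastically different results. With random_state = 12, the generated data matches the n_informative value I specified, which is 2.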

# Code with random_state = 12
import pandas as pd
from sklearn.datasets import make_regression

# Create the feature and target arrays with make_regression()
reg_feat, reg_target = make_regression(n_samples=30, n_features=5, n_informative=2, n_targets=1, random_state=12)

# Create an empty dictionary
data_dict = {}

# Add each feature column to the dictionary under its own key
for i in range(reg_feat.shape[1]):
    data_dict["feature " + str(i + 1)] = reg_feat[:, i]

# Add the target values to the dictionary
data_dict["target"] = reg_target

# Create a DataFrame from the dictionary
reg_feat_df = pd.DataFrame(data=data_dict)

# Check the correlations between all columns
print(reg_feat_df.corr())

Output

          feature 1  feature 2  feature 3  feature 4  feature 5    target
feature 1   1.000000   0.035793   0.413027   0.170365  -0.189310  0.917164
feature 2   0.035793   1.000000  -0.085540  -0.266076  -0.097623 -0.011071
feature 3   0.413027  -0.085540   1.000000  -0.014338   0.235447  0.741744
feature 4   0.170365  -0.266076  -0.014338   1.000000  -0.102619  0.119188
feature 5  -0.189310  -0.097623   0.235447  -0.102619   1.000000 -0.036388
target      0.917164  -0.011071   0.741744   0.119188  -0.036388  1.000000

But when I change random_state to 95, 72, or 75, it does not seem to respect n_informative: I get only one feature that is highly correlated with the target, even though I set n_informative = 2.

# Code with random_state = 72
import pandas as pd
from sklearn.datasets import make_regression

# Create the feature and target arrays with make_regression()
# (same arguments as before; only random_state differs)
reg_feat, reg_target = make_regression(n_samples=30, n_features=5, n_informative=2, n_targets=1, random_state=72)

# Create an empty dictionary
data_dict = {}

# Add each feature column to the dictionary under its own key
for i in range(reg_feat.shape[1]):
    data_dict["feature " + str(i + 1)] = reg_feat[:, i]

# Add the target values to the dictionary
data_dict["target"] = reg_target

# Create a DataFrame from the dictionary
reg_feat_df = pd.DataFrame(data=data_dict)

# Check the correlations between all columns
print(reg_feat_df.corr())

Output

           feature 1  feature 2  feature 3  feature 4  feature 5    target
feature 1   1.000000   0.145218   0.013431  -0.065064   0.231296  0.098015
feature 2   0.145218   1.000000  -0.158852  -0.162734   0.170550 -0.138638
feature 3   0.013431  -0.158852   1.000000  -0.264027   0.039458 -0.261125
feature 4  -0.065064  -0.162734  -0.264027   1.000000  -0.399130  0.986699
feature 5   0.231296   0.170550   0.039458  -0.399130   1.000000 -0.360373
target      0.098015  -0.138638  -0.261125   0.986699  -0.360373  1.000000

So why do some random_state values give the expected result while others do not? Or am I missing something?
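
One way to check whether n_informative is really being ignored is to ask make_regression() for the coefficients of the underlying linear model by passing coef=True. The sketch below is not part of the original code: the helper name generate is made up for illustration, and the call otherwise repeats the arguments used above. As far as I know, the coefficients of the informative features are drawn at random, so one of them can be much smaller than the other. With only 30 samples, that feature may then show a weak correlation with the target even though it still contributes to it.

```python
import numpy as np
from sklearn.datasets import make_regression


def generate(seed):
    """Same call as in the question, plus coef=True to return the true coefficients.

    `generate` is a hypothetical helper used only for this illustration.
    """
    X, y, coef = make_regression(n_samples=30, n_features=5, n_informative=2,
                                 n_targets=1, random_state=seed, coef=True)
    return X, y, coef


for seed in (12, 72):
    X, y, coef = generate(seed)
    # Features with a nonzero coefficient are the ones that enter the target
    n_used = int(np.count_nonzero(coef))
    print(f"random_state={seed}: coef={np.round(coef, 2)}, nonzero={n_used}")
```

If the number of nonzero coefficients is 2 for every seed, the parameter is being respected, and the difference between the two correlation tables comes from the relative size of the two coefficients rather than from the number of informative features.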
