How to resolve the discrepancy between an exponential fit and a log-linear fit
My data has a clear exponential dependence. I am trying to fit the curve with two different, very simple models.
The first is a plain exponential fit. For the second, I log-transform the y values and then use linear regression. To bring the result back to the original scale, I raise e to the power of the fitted line.
However, when I plot the two resulting regression curves, they look quite different. The r^2 values also differ noticeably.
Can someone explain why the fits are so different? Honestly, I would expect the two models to produce the same curve.
import matplotlib.pyplot as plt
import numpy as np
from scipy.optimize import curve_fit
from sklearn.metrics import r2_score
def exp(x, k):
    return np.exp(k * x)

def lin(x, m):
    return m * x
x = np.array([0.03553744809541667,0.07393361944488888,0.11713398354352941,0.1574279442442857,0.20574484316400002,0.24638269718399997,0.28022173237600007,0.33088392763600005,0.37608523866,0.4235348808,0.4698941935266667,0.5049780023645001,0.53193232248,0.59661874698,0.64686695376,0.6765964062965002,0.7195010072795001,0.7624056082625001,0.8053102092455002,0.8696671107200001])
y = np.array([1.0,0.9180040755624065,0.7580780029008654,0.662359339541471,0.556415757973503,0.4575163368602455,0.3982995279500034,0.3309496816813175,0.25142343840921577,0.21526738042912116,0.19490849614884595,0.12714651046365663,0.1015770731180174,0.0728982261567812,0.04180399979351543,0.04180399979351543])
k_exp = curve_fit(exp, x, y)[0]
m_lin = curve_fit(lin, x, np.log(y))[0]
x_ticks = np.linspace(x.min(),x.max(),100)
print("Exponential fit", r2_score(y, [exp(i, k_exp) for i in x]))  # 0.964
print("Log linear fit", r2_score(y, [np.exp(i * m_lin) for i in x]))  # 0.939
plt.scatter(x,y,c="k",s=5)
plt.plot(x_ticks,exp(x_ticks,k_exp),"r--",label="Exponential fit")
plt.plot(x_ticks,[np.exp(x * m_lin) for x in x_ticks],label="Log-linear fit")
plt.legend()
plt.show()
Solution
One model is:
exp(k * x) + err = y
The other is:
m * x + err = log(y)
or, equivalently:
exp(m * x + err) = y
As you can see, the error enters the two models differently, so the fits will differ as well. Put differently, the exponential fit and the log-linear fit solve somewhat different minimization problems. If you fit different things, you should be prepared to get different results.
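A minimal sketch of the two minimization problems, using synthetic data (the arrays in the question are truncated in this copy, and the true slope -3.5 here is a hypothetical value, not from the question):

```python
import numpy as np
from scipy.optimize import curve_fit

# Synthetic data under an assumed model y = exp(true_k * x) with small
# additive noise; true_k = -3.5 is hypothetical.
rng = np.random.default_rng(0)
true_k = -3.5
x = np.linspace(0.05, 0.9, 30)
y = np.exp(true_k * x) + rng.normal(0.0, 0.01, size=x.size)
y = np.clip(y, 1e-6, None)  # guard: keep y positive so log(y) is defined

def exp_model(x, k):
    return np.exp(k * x)

def lin_model(x, m):
    return m * x

# Fit 1: least squares on the original scale.
k_exp = curve_fit(exp_model, x, y)[0][0]
# Fit 2: least squares on the log scale.
m_lin = curve_fit(lin_model, x, np.log(y))[0][0]

print("k_exp =", k_exp)
print("m_lin =", m_lin)  # differs from k_exp: a different objective was minimized
```

Both estimates target the same slope, but because the noise is weighted differently on the two scales, the fitted values do not coincide.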
In the exponential fit, the residuals
exp(x, k_exp) - y
array([-0.11232018,-0.13754469,-0.08285192,-0.07245726,-0.05473322,-0.01973325,-0.00746918,-0.00117105,0.03198186,0.02645663,0.01201946,0.05681884,0.04092329,0.03372492,0.04142677,0.06167547,0.04781162,0.03580521,0.02540738,0.01236329])
are minimized in the least-squares sense:
sum((exp(x, k_exp) - y)**2)
0.06488526426576267
In the log-linear fit, the residuals
m_lin * x - np.log(y)
array([-0.14034862,-0.20643379,-0.18563015,-0.20978567,-0.22631195,-0.19110049,-0.18613326,-0.20097633,-0.10466277,-0.13679878,-0.22053568,0.06809742,-0.03835371,-0.06929866,0.06400883,0.5026701,0.33322626,0.16378243,-0.00566141,-0.25982716])
are minimized in the least-squares sense:
sum((m_lin * x - np.log(y))**2)
0.8549505409763158
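To see that each estimate is optimal only for its own objective, one can evaluate both sums of squares for both parameters. A sketch on synthetic stand-in data (the question's arrays are truncated here; the slope -3.5 is hypothetical):

```python
import numpy as np
from scipy.optimize import curve_fit

# Synthetic stand-in data (hypothetical true slope -3.5).
rng = np.random.default_rng(0)
x = np.linspace(0.05, 0.9, 30)
y = np.clip(np.exp(-3.5 * x) + rng.normal(0.0, 0.01, size=x.size), 1e-6, None)

k_exp = curve_fit(lambda x, k: np.exp(k * x), x, y)[0][0]
m_lin = curve_fit(lambda x, m: m * x, x, np.log(y))[0][0]

def sse_linear(k):
    # objective minimized by the exponential fit
    return np.sum((np.exp(k * x) - y) ** 2)

def sse_log(m):
    # objective minimized by the log-linear fit
    return np.sum((m * x - np.log(y)) ** 2)

# Each parameter wins on the objective it was fitted to and loses on the other.
print(sse_linear(k_exp), "<=", sse_linear(m_lin))
print(sse_log(m_lin), "<=", sse_log(k_exp))
```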
When the log-linear fit is mapped back to the exponential scale, the residuals are
exp(x, m_lin) - y
array([-0.13094479,-0.17122601,-0.12843302,-0.12534621,-0.11269128,-0.07958516,-0.06764601,-0.06025541,-0.0249844,-0.02752286,-0.03857453,0.00895996,-0.00478421,-0.00680079,0.0048187,0.02730342,0.01653194,0.00743936,-0.000236,-0.00956539])
There are two differences:
- the sum of squares of the log-linear fit on the original scale,
sum((exp(x, m_lin) - y)**2) = 0.11011945823779898
is higher than that of the exponential fit (0.06488526426576267), and
- on the original (non-log) scale, the residuals of the log-linear fit, exp(x, m_lin) - y, are farther from zero where x is small.
The values y
array([1.,0.91800408,0.758078,0.66235934,0.55641576,0.45751634,0.39829953,0.33094968,0.25142344,0.21526738,0.1949085,0.12714651,0.10157707,0.07289823,0.041804,0.041804])
are small over the whole range of x, while the values np.log(y)
array([ 0.,-0.08555345,-0.27696899,-0.41194706,-0.5862395,-0.78194269,-0.92055097,-1.10578893,-1.38061676,-1.53587439,-1.63522508,-2.06241523,-2.28693743,-2.61869097,-3.17476326,-3.17476326])
are far from zero for the higher values of x
array([0.03553745,0.07393362,0.11713398,0.15742794,0.20574484,0.2463827,0.28022173,0.33088393,0.37608524,0.42353488,0.46989419,0.504978,0.53193232,0.59661875,0.64686695,0.67659641,0.71950101,0.76240561,0.80531021,0.86966711])
that are closer to 1.
In this situation, the mean absolute values of what you fit are much smaller on the exponential scale than on the log-linear scale.
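A common way to reconcile the two fits (not part of the answer above, just a standard technique) is to weight each log-scale residual by y: a small error in log(y) corresponds to an error of roughly y * Δlog(y) in y itself, so the weights shrink the influence of the small-y points whose log residuals are inflated. A sketch under the same synthetic-data assumptions (hypothetical true slope -3.5):

```python
import numpy as np
from scipy.optimize import curve_fit

# Synthetic stand-in data (hypothetical true slope -3.5).
rng = np.random.default_rng(0)
x = np.linspace(0.05, 0.9, 30)
y = np.clip(np.exp(-3.5 * x) + rng.normal(0.0, 0.01, size=x.size), 1e-6, None)

# Direct fit on the original scale.
k_exp = curve_fit(lambda x, k: np.exp(k * x), x, y)[0][0]

# Unweighted log-linear slope (closed form for a no-intercept fit).
m_plain = np.sum(x * np.log(y)) / np.sum(x * x)

# Weighted log-linear slope with weights w = y: minimizes
# sum(w**2 * (log(y) - m*x)**2), which approximates the original-scale
# objective to first order.
w = y
m_weighted = np.sum(w**2 * x * np.log(y)) / np.sum(w**2 * x * x)

print(k_exp, m_plain, m_weighted)  # m_weighted tracks k_exp closely
```

The weighted log-linear slope is still a closed-form linear computation, but it reproduces the behavior of the nonlinear fit much more faithfully than the plain log-linear regression.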