Scipy - How to fit this beta distribution using Python Scipy Curve Fit

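I am new to curve_fit and scipy. I have many distributions, some that look like y and some that do not. Most of the distributions that look like y are beta distributions. My approach is this: if I can fit a beta function to every unique ID, each of which has a different distribution, I can read the coefficients off the beta function, check which coefficients are close in magnitude, and then efficiently filter out all the distributions that look like y.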

y looks like this (same data as in the example code below):

However, I am having some trouble getting started.

y = array([[ 0.50423378,0.50423378,0.50254455,0.50507627,0.5,0.49658699,0.49228746,0.48707792,0.48092881,0.47380354,0.46565731,0.45643546,0.44607129,0.43448304,29.38186886,29.37898909,29.45299206,29.52449116,29.74083063,29.73771398,30.12527698,30.48367189,30.8169243,30.82153203,30.81230208,30.80766536,30.80301414,30.82612528,10.51949923,10.51436497,10.22456193,9.91464422,9.36922158,9.37416663,9.37906375,9.383913,9.38871446,9.36422851,9.35918734,7.72711675,5.53121937,0.48092881]])

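Using the example from the scipy documentation, how do I get an x array, plug it in to obtain the coefficients, and then plot the curve_fit result over the distribution?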

import numpy as np
from scipy.optimize import curve_fit
from scipy.special import gamma as gamma

def betafunc(x,a,b,cst):
    return cst*gamma(a+b) * (x**(a-1)) * ((1-x)**(b-1))  / ( gamma(a)*gamma(b) )

x = np.array( [0.1,0.3,0.5,0.7,0.9,1.1])
y = np.array( [0.45112234,0.56934313,0.3996803,0.28982859,0.19682153,0.] )

popt2,pcov2 = curve_fit(betafunc,x[:-1],y[:-1],p0=(0.5,1.5,0.5))

print(popt2)
print(pcov2)

Solution

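For the first part of your question: if you have a set of observations, you can use numpy.histogram to obtain a histogram. To get the center of each bin, follow the code below. Those bin centers are the x values you can use in the fitting process. With the data you provided, betafunc cannot be used, because it simply does not fit the data.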

import numpy as np
from matplotlib import pyplot as plt
from scipy.optimize import curve_fit
from scipy.special import gamma as gamma


def betafunc(x,a,b,cst):
    return cst*gamma(a+b) * (x**(a-1)) * ((1-x)**(b-1))  / ( gamma(a)*gamma(b) )

y_data=np.array([[ 0.50423378,0.50423378,0.50254455,0.50507627,0.5,0.49658699,0.49228746,0.48707792,0.48092881,0.47380354,0.46565731,0.45643546,0.44607129,0.43448304,29.38186886,29.37898909,29.45299206,29.52449116,29.74083063,29.73771398,30.12527698,30.48367189,30.8169243,30.82153203,30.81230208,30.80766536,30.80301414,30.82612528,10.51949923,10.51436497,10.22456193,9.91464422,9.36922158,9.37416663,9.37906375,9.383913,9.38871446,9.36422851,9.35918734,7.72711675,5.53121937,0.48092881]])


hist=np.histogram(y_data[0],bins=20)
x=(hist[1][1:]+hist[1][:-1])/2
y=hist[0]

print(x,y)

plt.step(x,y,label='Manual calculation of the center of the bins')
plt.hist(y_data[0],bins=20,histtype='bar',label='Automatic plot with plt.hist')
plt.legend()
plt.show()

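# betafunc is only defined for 0 < x < 1, while these bin centers run up to about 30,
# so this call is not expected to produce a meaningful fit
popt2,pcov2 = curve_fit(betafunc,x[:-1],y[:-1],p0=(0.5,1.5,0.5))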
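
The reason betafunc cannot be fitted to these bin centers directly is that it is only defined for 0 < x < 1. As a minimal sketch (the sample below is a handful of rounded values from the question's data, and the min-max rescaling is an assumption of this sketch, not part of the original answer), the following shows the function returning NaN outside that interval and one possible way to map the bin centers into it:

```python
import numpy as np
from scipy.special import gamma


def betafunc(x, a, b, cst):
    return cst * gamma(a + b) * x**(a - 1) * (1 - x)**(b - 1) / (gamma(a) * gamma(b))


# A few rounded values from the question's data (illustrative subset only)
sample = np.array([0.504, 0.5, 0.434, 29.38, 29.74, 30.82,
                   10.52, 9.37, 7.73, 5.53, 0.481])

counts, edges = np.histogram(sample, bins=5)
centers = (edges[1:] + edges[:-1]) / 2

# For x > 1 the term (1 - x)**(b - 1) has a negative base and a
# fractional exponent, so NumPy returns NaN
with np.errstate(invalid="ignore"):
    raw_values = betafunc(centers, 0.5, 1.5, 0.5)

# Assumption: min-max rescaling of the bin centers onto the unit interval
centers_scaled = (centers - edges[0]) / (edges[-1] - edges[0])
scaled_values = betafunc(centers_scaled, 0.5, 1.5, 0.5)

print(np.isnan(raw_values).any())
print(np.isfinite(scaled_values).all())
```

Whether a beta shape describes the rescaled histogram well is a separate question; the rescaling only makes the function evaluable.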

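For the second part of your question: to plot the function with the best-fit parameters, just add the four lines of code that I appended at the end.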

import numpy as np
from scipy.optimize import curve_fit
from scipy.special import gamma as gamma


def betafunc(x,a,b,cst):
    return cst*gamma(a+b) * (x**(a-1)) * ((1-x)**(b-1))  / ( gamma(a)*gamma(b) )



x = np.array( [0.1,0.3,0.5,0.7,0.9,1.1])
y = np.array( [0.45112234,0.56934313,0.3996803,0.28982859,0.19682153,0.] )

popt2,pcov2 = curve_fit(betafunc,x[:-1],y[:-1],p0=(0.5,1.5,0.5))

print(popt2)
print(pcov2)

from matplotlib import pyplot as plt
plt.plot(x,betafunc(x,*popt2))
plt.plot(x,y)
plt.show()
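
As a hedged illustration of the same fit-then-plot workflow, the sketch below fits betafunc to synthetic points generated from known parameters and evaluates the fitted curve only inside (0, 1), where the function is defined. The values in true_params, the noise level, the bounds argument and the dense grid x_dense are all assumptions chosen for this example, not part of the original answer:

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.special import gamma


def betafunc(x, a, b, cst):
    return cst * gamma(a + b) * x**(a - 1) * (1 - x)**(b - 1) / (gamma(a) * gamma(b))


# Hypothetical "true" parameters, used only to generate test points
true_params = (2.0, 3.0, 1.5)
rng = np.random.default_rng(0)
x_obs = np.linspace(0.05, 0.95, 19)
y_obs = betafunc(x_obs, *true_params) + rng.normal(0.0, 0.01, size=x_obs.size)

# Bounds keep a, b and cst positive during the optimization
popt, pcov = curve_fit(betafunc, x_obs, y_obs, p0=(1.5, 2.5, 1.0),
                       bounds=(1e-6, np.inf))

# One-sigma parameter uncertainties from the covariance matrix
perr = np.sqrt(np.diag(pcov))

# Smooth curve for plotting, restricted to the open interval (0, 1)
x_dense = np.linspace(0.01, 0.99, 200)
y_dense = betafunc(x_dense, *popt)

print(popt)
print(perr)
```

x_dense and y_dense can then be passed to plt.plot alongside the observed points, which gives a smoother curve than evaluating the function only at the few x values used for fitting.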

If you do not have to use curve_fit, I suggest you take a look at scipy.stats.beta. One possible solution is:

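import numpy as np
from scipy.stats import beta

# Abbreviated here: only the first and last values of the array from the question are shown
y = np.array([[ 0.50423378,0.48092881]])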

params = beta.fit(y)

y2 = np.loadtxt("other_data_file.dat")   # other distribution file
params2 = beta.fit(y2)

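You can then compare params and params2 parameter by parameter. Note that scipy.stats.beta uses the standardized form when defining the probability density function, with shifting and scaling handled by the separate loc and scale parameters.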
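
To make the comparison concrete, here is a minimal sketch using synthetic samples. The IDs, sample sizes, shape parameters and the 20% tolerance are assumptions for illustration only. beta.fit returns the tuple (a, b, loc, scale); fixing floc=0 and fscale=1, as done here, is appropriate only when the data are known to lie strictly inside (0, 1).

```python
import numpy as np
from scipy.stats import beta

rng = np.random.default_rng(0)

# Synthetic data standing in for distributions keyed by ID (hypothetical)
samples = {
    "id_1": beta.rvs(2.0, 5.0, size=2000, random_state=rng),
    "id_2": beta.rvs(2.0, 5.0, size=2000, random_state=rng),
    "id_3": beta.rvs(5.0, 2.0, size=2000, random_state=rng),
}

# Fit the shape parameters a and b with location and scale held fixed
shapes = {}
for key, data in samples.items():
    a, b, loc, scale = beta.fit(data, floc=0, fscale=1)
    shapes[key] = np.array([a, b])

# Flag IDs whose shape parameters are close to those of a reference ID
reference = shapes["id_1"]
similar = [key for key, ab in shapes.items()
           if np.allclose(ab, reference, rtol=0.2)]

print(similar)
```

For data that do not lie in (0, 1), such as the array in the question, loc and scale would have to be fitted as well or the data rescaled first, and the tolerance would need tuning to the amount of data available per ID.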