如何解决使用 curve_fit 拟合 SIR 模型的问题
我正在尝试开发适合给定 SIR 模型的模型。根据各种来源(特别是 https://scipython.com/book/chapter-8-scipy/additional-examples/the-sir-epidemic-model/ 和 https://stackoverflow.com/a/34425290/3588242)的建议,我整理了一个测试代码,因为我很难将任何东西与我的模型连贯起来。
如下图所示。
基本上,我会根据数据的生成方式生成我知道适合我的模型的数据。随着时间的推移,这给了我一个传染性隔间(ydata 和 xdata)。然后我想回到我用来生成数据的参数(beta、gamma、mu),使用 scipy 函数 curve_fit。
然而,它不起作用,我很不清楚为什么它似乎没有做任何事情。
import numpy as np
from scipy import integrate,optimize
import matplotlib.pyplot as plt
# The SIR model differential equations.
def sir_model(y,t,N,beta,gamma,mu):
S,I,R = y
dSdt = -beta * S * I / N + mu * I
dIdt = beta * S * I / N - gamma * I
dRdt = gamma * I
return dSdt,dIdt,dRdt
# The fit integration.
def fit_odeint(x,mu):
return integrate.odeint(sir_model,(S0,I0,R0),x,args=(N,mu))[:,1]
if __name__ == "__main__":
########################################
# Get data to be fitted from the model #
########################################
# Total population,N.
N = 1
# Initial number of infected and recovered individuals,I0 and R0.
I0,R0 = 1e-6,0
# Everyone else,S0,is susceptible to infection initially.
S0 = N - I0 - R0
# Contact rate,and mean recovery rate,(in 1/deltaT).
beta,mu = -0.001524766068089,1.115130184090387,-0.010726414041332
# A grid of time points (in deltaT increment)
t = np.linspace(0,305,306)
# Initial conditions vector
y0 = S0,R0
# Integrate the SIR equations over the time grid,t.
ret = integrate.odeint(sir_model,y0,mu))
S,R = ret.T
# Plot the data on three separate curves for I(t) and R(t)
fig = plt.figure(facecolor='w')
ax = fig.add_subplot(111,facecolor='#dddddd',axisbelow=True)
# ax.plot(t,S/N,'b',alpha=0.5,lw=2,label='Susceptible')
ax.plot(t,I/N,'r',label='Infected')
ax.plot(t,R/N,'g',label='Recovered with immunity')
ax.set_xlabel('Time /days')
ax.set_ylabel('Number (1000s)')
# ax.set_ylim(0,1.2)
ax.yaxis.set_tick_params(length=0)
ax.xaxis.set_tick_params(length=0)
ax.grid(b=True,which='major',c='w',ls='-')
legend = ax.legend()
legend.get_frame().set_alpha(0.5)
for spine in ('top','right','bottom','left'):
ax.spines[spine].set_visible(False)
plt.show()
#####################################
# Fit the data using the same model #
#####################################
# Define the "experimental" data
ydata = I
xdata = t
# Define the initial conditions vector,note that this will be equal to y0 above.
I0,R0 = ydata[0],0
S0 = N - I0 - R0
# Try with default p0 = [1,1,1] (most of the times I will have no clue where the values lie)
print('p0 default values...')
popt,pcov = optimize.curve_fit(fit_odeint,xdata,ydata,bounds=([-0.1,0.,-np.inf],np.inf))
psig = np.sqrt(np.diag(pcov))
print(f'beta : {popt[0]} (+/- {psig[0]})\ngamma: {popt[1]} (+/- {psig[1]})\nmu : {popt[2]} (+/- {psig[2]})\n\n')
# Try with p0 close to the kNown solution (kind of defeats the purpose of curve fit if it's too close...)
print('p0 close to the kNown solution...')
popt,np.inf),p0=[-0.01,1.,-0.01])
psig = np.sqrt(np.diag(pcov))
print(f'beta : {popt[0]} (+/- {psig[0]})\ngamma: {popt[1]} (+/- {psig[1]})\nmu : {popt[2]} (+/- {psig[2]})\n\n')
# Try with p0 equal to the kNown solution (kind of defeats the purpose of curve fit if it's too close...)
print('p0 equal to the kNown solution...')
popt,p0=[-0.001524766068089,-0.010726414041332])
psig = np.sqrt(np.diag(pcov))
print(f'beta : {popt[0]} (+/- {psig[0]})\ngamma: {popt[1]} (+/- {psig[1]})\nmu : {popt[2]} (+/- {psig[2]})\n\n')
这段代码给了我正确的预期情节,然后:
p0 default values...
beta : 0.9 (+/- 13.202991356641752)
gamma: 1.0 (+/- 13.203507667858469)
mu : 1.0 (+/- 50.75556985555176)
p0 close to the kNown solution...
beta : -0.01 (+/- 2.0204502661168218)
gamma: 1.0 (+/- 2.0182998608106186)
mu : -0.01 (+/- 7.701149479142956)
p0 equal to the kNown solution...
beta : -0.001524766068089 (+/- 0.0)
gamma: 1.115130184090387 (+/- 0.0)
mu : -0.010726414041332 (+/- 0.0)
所以看起来curve_fit就像,好吧,这已经足够接近了,让我们停下来,在几乎没有做任何事情并拒绝尝试之后。这可能是由于某个地方的 epsilon 值(绝对与相对或类似的东西,我之前在 scipy 中遇到过这个问题,用于不同的求解器)。然而,curve_fit 的文档似乎没有提到关于能够改变容差的任何内容。或者这可能是因为我误解了该代码中的某些内容。
如果有人有任何建议,我很乐意。
我尝试查看 lmfit 包,但收效甚微(我遇到了一些我无法弄清楚的 ValueError a.any() blabla 错误)。
编辑:
我环顾四周,决定使用随机进化算法(差分进化)来自动获得更好的初始猜测。代码运行良好,但同样,curve_fit 没有做任何事情并返回初始猜测。
import numpy as np
import matplotlib.pyplot as plt
from scipy import optimize,integrate
# The SIR model differential equations.
def sir_model(y,1]
# function for genetic algorithm to minimize (sum of squared error)
# bounds on parameters are set in generate_Initial_Parameters() below
def sumOfSquaredError(parameterTuple):
return np.sum((ydata - fit_odeint(xdata,*parameterTuple)) ** 2)
def generate_Initial_Parameters():
parameterBounds = []
parameterBounds.append([-0.1,10.0]) # parameter bounds for beta
parameterBounds.append([-0.1,20.0]) # parameter bounds for gamma
parameterBounds.append([-0.1,0.1]) # parameter bounds for mu
# "seed" the numpy random number generator for repeatable results
result = optimize.differential_evolution(sumOfSquaredError,parameterBounds,seed=3)
return result.x
if __name__ == "__main__":
########################################
# Get data to be fitted from the model #
########################################
# Total population,label='Recovered with immunity')
ax.set_xlabel('Time /deltaT')
ax.set_ylabel('Number (normalized)')
# ax.set_ylim(0,0
S0 = N - I0 - R0
# generate initial parameter values
initialParameters = generate_Initial_Parameters()
# curve fit the test data
fittedParameters,p0=tuple(initialParameters),np.inf))
# create values for display of fitted peak function
b,g,m = fittedParameters
ret = integrate.odeint(sir_model,b,m))
S,R = ret.T
plt.plot(xdata,ydata) # plot the raw data
plt.plot(xdata,linestyle='--') # plot the equation using the fitted parameters
plt.show()
psig = np.sqrt(np.diag(pcov))
print('Initial parameters:')
print(f'beta : {initialParameters[0]}\n'
f'gamma: {initialParameters[1]}\n'
f'mu : {initialParameters[2]}\n\n')
print('Fitted parameters:')
print(f'beta : {fittedParameters[0]} (+/- {psig[0]})\n'
f'gamma: {fittedParameters[1]} (+/- {psig[1]})\n'
f'mu : {fittedParameters[2]} (+/- {psig[2]})\n\n')
这给了我,以及正确和预期的数字:
Initial parameters:
beta : -0.039959661364345145
gamma: 1.0766953272292845
mu : -0.040321969786292024
Fitted parameters:
beta : -0.039959661364345145 (+/- 5.6469679624489775e-12)
gamma: 1.0766953272292845 (+/- 5.647099056919525e-12)
mu : -0.040321969786292024 (+/- 5.720259134770649e-12)
所以 curve_fit 几乎什么也没做。在这种情况下并不令人震惊,因为最初的猜测非常好。
但是因为我关心的是参数,所以我对curve_fit在我的情况下明显的无用感到“惊讶”,我期待更多。这可能源于我对它实际作用的误解。我会注意到,正如预期的那样,如果初始猜测参数的边界很大,遗传算法就很难找到最小值,并且计算成本很高。
欢迎任何启发或建议!
版权声明:本文内容由互联网用户自发贡献,该文观点与技术仅代表作者本人。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌侵权/违法违规的内容, 请发送邮件至 dio@foxmail.com 举报,一经查实,本站将立刻删除。