How to resolve a MemoryError when using mfcc from python_speech_features
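I am using mfcc from python_speech_features to extract features from wave files that are between 5 and 120 seconds long. For shorter files (around 10 to 20 seconds) the features are extracted without problems, but for longer files I get this error: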
---------------------------------------------------------------------------
MemoryError Traceback (most recent call last)
<ipython-input-6-ea3546938d03> in <module>
14 print("\n\tFeatures\n")
15 data,sampling_rate = librosa.load(sample_data[i])
---> 16 mfcc_features = mfcc(data,sampling_rate,winlen=30,nfft=66150)
17 print(pd.DataFrame(mfcc_features))
18 print("========================================\n")
~/anaconda3/lib/python3.8/site-packages/python_speech_features/base.py in mfcc(signal,samplerate,winlen,winstep,numcep,nfilt,nfft,lowfreq,highfreq,preemph,ceplifter,appendEnergy,winfunc)
26 :returns: A numpy array of size (NUMFRAMES by numcep) containing features. Each row holds 1 feature vector.
27 """
---> 28 feat,energy = fbank(signal,winfunc)
29 feat = numpy.log(feat)
30 feat = dct(feat,type=2,axis=1,norm='ortho')[:,:numcep]
~/anaconda3/lib/python3.8/site-packages/python_speech_features/base.py in fbank(signal,winfunc)
53 highfreq= highfreq or samplerate/2
54 signal = sigproc.preemphasis(signal,preemph)
---> 55 frames = sigproc.framesig(signal,winlen*samplerate,winstep*samplerate,winfunc)
56 pspec = sigproc.powspec(frames,nfft)
57 energy = numpy.sum(pspec,1) # this stores the total energy in each frame
~/anaconda3/lib/python3.8/site-packages/python_speech_features/sigproc.py in framesig(sig,frame_len,frame_step,winfunc)
33 padsignal = numpy.concatenate((sig,zeros))
34
---> 35 indices = numpy.tile(numpy.arange(0,frame_len),(numframes,1)) + numpy.tile(numpy.arange(0,numframes*frame_step,frame_step),(frame_len,1)).T
36 indices = numpy.array(indices,dtype=numpy.int32)
37 frames = padsignal[indices]
<__array_function__ internals> in tile(*args,**kwargs)
~/anaconda3/lib/python3.8/site-packages/numpy/lib/shape_base.py in tile(A,reps)
1256 for dim_in,nrep in zip(c.shape,tup):
1257 if nrep != 1:
-> 1258 c = c.reshape(-1,n).repeat(nrep,0)
1259 n //= dim_in
1260 return c.reshape(shape_out)
MemoryError: Unable to allocate 12.8 GiB for an array with shape (2591,661500) and data type int64
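The size in the last line of the traceback can be checked by hand. The following is a minimal sketch, assuming a sampling rate of 22050 Hz (the rate librosa.load resamples to by default; the traceback does not state it) and 8 bytes per int64 element. The variable names are illustrative only.

```python
# Values taken from the traceback, except sampling_rate, which is an
# assumption: librosa.load resamples to 22050 Hz unless sr is given.
sampling_rate = 22050
winlen_seconds = 30            # winlen=30 in the failing mfcc call
num_frames = 2591              # first dimension of the reported shape
bytes_per_int64 = 8

# framesig receives winlen * samplerate as the frame length in samples.
frame_len = winlen_seconds * sampling_rate

# The index array built with numpy.tile has shape (num_frames, frame_len).
index_bytes = num_frames * frame_len * bytes_per_int64
index_gib = index_bytes / 2**30

print(f"frame length: {frame_len} samples")
print(f"index array:  {index_gib:.1f} GiB")
```

Under that assumption the frame length comes out as 661500 samples, which matches the second dimension of the reported shape, and the product matches the 12.8 GiB in the error message. In other words, each frame appears to cover 30 seconds of audio.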
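This is the code that produces the error; I am running it in a Jupyter notebook. I have tried it on a laptop with 8 GB of RAM, a PC with 32 GB of RAM, and a Google Colab compute instance with about 12 GB of RAM, but the error persists.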
import os

import IPython.display as ipd
import librosa
import pandas as pd
from python_speech_features import mfcc

print("\nSample Data:")
print("============\n")
path = 'speech-sample-data'
# Collect every .wav file under the sample directory.
sample_data = [os.path.join(dp, f) for dp, dn, filenames in os.walk(path) for f in filenames if os.path.splitext(f)[1] == '.wav']
for i in range(5):
    print("Speech: ")
    ipd.display(ipd.Audio(sample_data[i]))
    print("Type: \n\tnormal\n")
    print("\n\tFeatures\n")
    data, sampling_rate = librosa.load(sample_data[i])
    # Same call as line 16 of the traceback above.
    mfcc_features = mfcc(data, sampling_rate, winlen=30, nfft=66150)
    print(pd.DataFrame(mfcc_features))
    print("========================================\n")
    print("Speech: ")
    ipd.display(ipd.Audio(sample_data[i+5]))
    print("Type: \n\tToxic\n")
    print("\n\tFeatures\n")
    data, sampling_rate = librosa.load(sample_data[i+5])
    mfcc_features = mfcc(data, sampling_rate, winlen=30, nfft=66150)
    print(pd.DataFrame(mfcc_features))
    print("========================================\n")
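One possible reading of the error, offered as an assumption rather than a confirmed diagnosis: winlen and winstep in python_speech_features are given in seconds (the defaults are 0.025 and 0.01), so winlen=30 asks for 30-second frames. The sketch below is pure Python and does not call the library. The helpers frame_length, next_power_of_two and index_matrix_gib are hypothetical names that only approximate the framing arithmetic visible in the traceback, and the 22050 Hz rate is again assumed from the librosa.load default.

```python
import math


def frame_length(seconds, sampling_rate):
    """Number of samples covered by a window of the given duration."""
    return int(math.floor(seconds * sampling_rate + 0.5))


def next_power_of_two(n):
    """Smallest power of two that is greater than or equal to n."""
    return 1 << (n - 1).bit_length()


def index_matrix_gib(num_samples, winlen, winstep, sampling_rate):
    """Approximate size of a (num_frames, frame_len) int64 index array in GiB."""
    frame_len = frame_length(winlen, sampling_rate)
    frame_step = frame_length(winstep, sampling_rate)
    if num_samples <= frame_len:
        num_frames = 1
    else:
        num_frames = 1 + math.ceil((num_samples - frame_len) / frame_step)
    return num_frames * frame_len * 8 / 2**30


sampling_rate = 22050                 # assumed librosa.load default
num_samples = 120 * sampling_rate     # the longest file mentioned: 120 s

for winlen in (30, 0.025):
    size = index_matrix_gib(num_samples, winlen, 0.01, sampling_rate)
    print(f"winlen={winlen}: about {size:.2f} GiB for the index array")

# An FFT size that covers a 25 ms frame without truncating it.
print(next_power_of_two(frame_length(0.025, sampling_rate)))
```

If short analysis windows are what is intended, a call such as mfcc(data, sampling_rate, winlen=0.025, nfft=1024) would be one option to try. The 1024 here is simply a power of two at or above the 551-sample frame, which is a common choice and not a requirement of the library.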