Python: Splitting a sound file by its transients
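I wrote some functions that analyze a sound file and are supposed to split it at its transients (every point in the file where the sound changes abruptly). You can listen to the sound file in slow motion here to better understand what I mean.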
import librosa
import numpy as np


def transients_from_onsets(onset_samples):
    """Takes a list of onset times for an audio file and returns the list of start and stop times for that audio file

    Args:
        onset_samples ([int]): I don't really know what these are actually. I thought they were start times for each sound change but I don't know

    Returns:
        [(int,int)]: A list of start and stop times for each sound change
    """
    # Pair each onset with the one that follows it: (onset[i], onset[i + 1])
    starts = onset_samples[0:-1]
    stops = onset_samples[1:]
    transientTimes = []
    for s in range(len(starts)):
        transientTimes.append((starts[s], stops[s]))
    return transientTimes


def transients_from_sound_file(fileName, sr=44100):
    """Takes the path to an audio file
    and returns the list of start and stop times for that audio file
    as sample indices

    Args:
        fileName (string): The path to an audio file
        sr (int,optional): The sample rate of the audio file. Defaults to 44100.

    Returns:
        [(int,int)]: A list of start and stop times for each sound change
    """
    # Load the file named by the argument (the original referenced an undefined name, soundFile)
    y, sr = librosa.load(fileName, sr=sr)
    C = np.abs(librosa.cqt(y=y, sr=sr))
    o_env = librosa.onset.onset_strength(sr=sr, S=librosa.amplitude_to_db(C, ref=np.max))
    onset_frames = librosa.onset.onset_detect(onset_envelope=o_env, sr=sr)
    onset_samples = librosa.frames_to_samples(onset_frames)
    # Append the total length so the last transient runs to the end of the file;
    # np.concatenate expects a sequence of arrays as its first argument
    onset_samples = np.concatenate((onset_samples, [len(y)]))
    transientTimes = transients_from_onsets(onset_samples)
    # transientSamples was never defined, so only the list of pairs is returned
    return transientTimes
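The pairs returned above are positions in samples, not seconds. A minimal sketch of the conversion, assuming the sample rate is known; the helper name `segments_to_seconds` is hypothetical and not part of librosa:

```python
def segments_to_seconds(segments, sr=44100):
    """Convert (start, stop) pairs from sample indices to seconds."""
    return [(start / sr, stop / sr) for start, stop in segments]


# First two segments from the program output listed further down in this post
example = segments_to_seconds([(1536, 6144), (6144, 11264)], sr=44100)
print(example)
```

At 44100 Hz, 1536 samples is about 0.0348 s. The "in time" values listed below are roughly 2.4% larger than that (0.0357 s for the same position), so they appear to have been produced with a different conversion factor.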
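What I wrote works fairly well, but the results are a bit off, and some sounds are split into two when there should be only one. Below are the results my program produces and the results I expect. I would like to know how to make my results look more like the expected ones.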
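One possible post-processing step for the "split into two" problem is to merge any segment that is shorter than some minimum length into the segment before it. The sketch below is only an illustration under that assumption: the helper name `merge_short_segments` and the threshold of 2048 samples are hypothetical, and merging backwards rather than forwards is an arbitrary choice.

```python
def merge_short_segments(segments, min_length):
    """Merge segments shorter than min_length into the preceding segment.

    segments: list of contiguous (start, stop) pairs in samples.
    A short first segment is kept as-is because there is nothing before it.
    """
    merged = []
    for start, stop in segments:
        if merged and (stop - start) < min_length:
            # Extend the previous segment to cover the short one
            prev_start, _ = merged[-1]
            merged[-1] = (prev_start, stop)
        else:
            merged.append((start, stop))
    return merged


# Three consecutive segments taken from the program output in this post;
# the middle one is only 1536 samples long
segments = [(56832, 61440), (61440, 62976), (62976, 66560)]
print(merge_short_segments(segments, min_length=2048))
```

Double triggers might also be reduced at the source: `librosa.onset.onset_detect` forwards peak-picking parameters such as `wait` and `delta` to `librosa.util.peak_pick`, though suitable values would have to be found by experiment.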
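These are the start and stop times of each sound (transient) as determined by my program, followed by a link to the sound files created when the original sound file is split at these start and stop times: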
in frames: [(1536,6144),(6144,11264),(11264,15360),(15360,20992),(20992,26624),(26624,31744),(31744,36352),(36352,41984),(41984,47104),(47104,51712),(51712,56832),(56832,61440),(61440,62976),(62976,66560),(66560,71680),(71680,76800),(76800,82944),(82944,89088),(89088,92672),(92672,96768),(96768,98304),(98304,103936),(103936,107008),(107008,113664),(113664,117760),(117760,123904),(123904,128512),(128512,139264),(139264,147968),(147968,150016),(150016,153088),(153088,154624),(154624,159232),(159232,164864),(164864,169472),(169472,175616),(175616,179200),(179200,180736),(180736,189440),(189440,196096)]
in time: [(0.03566585034013605,0.1426634013605442),(0.1426634013605442,0.26154956916099775),(0.26154956916099775,0.3566585034013605),(0.3566585034013605,0.4874332879818594),(0.4874332879818594,0.6182080725623583),(0.6182080725623583,0.7370942403628118),(0.7370942403628118,0.84409179138322),(0.84409179138322,0.9748665759637188),(0.9748665759637188,1.0937527437641723),(1.0937527437641723,1.2007502947845805),(1.2007502947845805,1.319636462585034),(1.319636462585034,1.426634013605442),(1.426634013605442,1.4622998639455782),(1.4622998639455782,1.5455201814058959),(1.5455201814058959,1.6644063492063492),(1.6644063492063492,1.7832925170068026),(1.7832925170068026,1.925955918367347),(1.925955918367347,2.0686193197278913),(2.0686193197278913,2.1518396371882087),(2.1518396371882087,2.2469485714285717),(2.2469485714285717,2.282614421768707),(2.282614421768707,2.413389206349206),(2.413389206349206,2.4847209070294785),(2.4847209070294785,2.639272925170068),(2.639272925170068,2.7343818594104308),(2.7343818594104308,2.877045260770975),(2.877045260770975,2.9840428117913835),(2.9840428117913835,3.2337037641723354),(3.2337037641723354,3.4358102494331066),(3.4358102494331066,3.483364716553288),(3.483364716553288,3.55469641723356),(3.55469641723356,3.590362267573696),(3.590362267573696,3.697359818594104),(3.697359818594104,3.8281346031746035),(3.8281346031746035,3.9351321541950113),(3.9351321541950113,4.077795555555555),(4.077795555555555,4.161015873015873),(4.161015873015873,4.196681723356009),(4.196681723356009,4.39878820861678),(4.39878820861678,4.553340226757369)]
These are what the start and stop times should be:
in frames: [(2067,6431),(6431,10795),(10795,15389),(15389,25495),(25495,28940),(28940,33534),(33534,38587),(38587,43640),(43640,47085),(47085,51679),(51679,55814),(55814,60867),(60867,65231),(65231,69595),(69595,75337),(75337,79931),(79931,83606),(83606,87740),(87740,96928),(96928,101521),(101521,105885),(105885,110479),(110479,115073),(115073,124031),(124031,133218),(133218,137353),(137353,142176),(142176,147229),(147229,152282),(152282,155728),(155728,160321),(160321,169739)]
in time: [(0.0479956462585034,0.1493275283446712),(0.1493275283446712,0.250659410430839),(0.250659410430839,0.3573318820861678),(0.3573318820861678,0.5919927437641723),(0.5919927437641723,0.6719854875283447),(0.6719854875283447,0.7786579591836735),(0.7786579591836735,0.8959883900226757),(0.8959883900226757,1.013318820861678),(1.013318820861678,1.0933115646258504),(1.0933115646258504,1.1999840362811791),(1.1999840362811791,1.2959985487528345),(1.2959985487528345,1.4133289795918367),(1.4133289795918367,1.5146608616780044),(1.5146608616780044,1.6159927437641723),(1.6159927437641723,1.749321723356009),(1.749321723356009,1.8559941950113379),(1.8559941950113379,1.9413275283446711),(1.9413275283446711,2.037318820861678),(2.037318820861678,2.2506637641723355),(2.2506637641723355,2.357313015873016),(2.357313015873016,2.4586448979591835),(2.4586448979591835,2.565317369614512),(2.565317369614512,2.6719898412698413),(2.6719898412698413,2.879994195011338),(2.879994195011338,3.093315918367347),(3.093315918367347,3.1893304308390027),(3.1893304308390027,3.3013202721088435),(3.3013202721088435,3.4186507029478457),(3.4186507029478457,3.5359811337868483),(3.5359811337868483,3.615997097505669),(3.615997097505669,3.722646349206349),(3.722646349206349,3.9413318820861676)]
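To put a number on how far off the detected positions are, each expected onset can be matched to its closest detected onset. This is a rough sketch, assuming nearest-neighbour matching is good enough for comparison; the helper name `nearest_onset_offsets` is hypothetical.

```python
def nearest_onset_offsets(expected, detected):
    """For each expected onset, find the closest detected onset.

    Returns a list of (expected, nearest_detected, offset) tuples, where
    offset = nearest_detected - expected, in samples.
    """
    results = []
    for e in expected:
        nearest = min(detected, key=lambda d: abs(d - e))
        results.append((e, nearest, nearest - e))
    return results


# First four start positions from the expected and actual results above
expected_onsets = [2067, 6431, 10795, 15389]
detected_onsets = [1536, 6144, 11264, 15360]
for e, d, offset in nearest_onset_offsets(expected_onsets, detected_onsets):
    print(e, d, offset)
```

All detected positions above are multiples of 512, which is librosa's default hop length, so part of the offset may simply come from the frame grid rather than from the detector itself.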