Python: Splitting a sound file by transients

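I wrote some functions that analyze a sound file and are supposed to split it at its transients (every point in the file where the sound changes abruptly). You can listen to the sound file in slow motion here to get a better idea of what I mean.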

import librosa
import numpy as np

def transients_from_onsets(onset_samples):
    """Takes a list of onset times for an audio file and returns the list of start and stop times for that audio file

    Args:
        onset_samples ([int]): I don't really know what these are actually. I thought they were start times for each sound change but I don't know

    Returns:
        [(int,int)]: A list of start and stop times for each sound change
    """
    # Pair each onset with the one that follows it: (start, stop)
    return list(zip(onset_samples[:-1], onset_samples[1:]))

def transients_from_sound_file(fileName,sr=44100):
    """Takes the path to an audio file
    and returns the list of start and stop times for that audio file
    as a frame rate

    Args:
        fileName (string): The path to an audio file
        sr (int,optional): The sample rate of the audio file. Defaults to 44100.

    Returns:
        [(int,int)]: A list of start and stop times for each sound change
    """
    y, sr = librosa.load(fileName, sr=sr)
    C = np.abs(librosa.cqt(y=y, sr=sr))
    o_env = librosa.onset.onset_strength(sr=sr, S=librosa.amplitude_to_db(C, ref=np.max))
    onset_frames = librosa.onset.onset_detect(onset_envelope=o_env, sr=sr)

    # Convert frame indices to sample indices and use the end of the file as the final stop
    onset_samples = librosa.frames_to_samples(onset_frames)
    onset_samples = np.concatenate([onset_samples, [len(y)]])
    transientTimes = transients_from_onsets(onset_samples)
    return transientTimes
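
To make the purpose of the (start, stop) pairs concrete, here is a minimal sketch of how they might be used to cut the signal into segments. The helper name `split_by_ranges` and the toy signal are assumptions for illustration only; in practice the input would be the array returned by `librosa.load`.

```python
def split_by_ranges(samples, ranges):
    """Return one slice of `samples` per (start, stop) pair."""
    return [samples[start:stop] for start, stop in ranges]


# Toy signal standing in for the audio array (hypothetical data)
signal = list(range(10))
segments = split_by_ranges(signal, [(0, 3), (3, 7), (7, 10)])
print([len(seg) for seg in segments])
```

Because each stop is the next start, the segments cover the signal without gaps or overlap. Each segment could then be written to its own file with whatever audio I/O library is at hand.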

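The code I wrote works reasonably well, but the boundaries are slightly off, and some sounds are split into two segments when they should be a single one. Below are the results my program produces and the expected results. I would like to know how to make my results look more like the expected ones.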
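
One possible way to reduce the over-segmentation, sketched here as an assumption rather than a known fix, is to discard any onset that follows the previous kept onset too closely. The helper `merge_close_onsets` and the threshold of 3000 samples are hypothetical; the four onset values are taken from the program output listed in this post.

```python
def merge_close_onsets(onsets, min_gap):
    """Drop any onset closer than `min_gap` samples to the last kept onset."""
    kept = []
    for onset in sorted(onsets):
        if not kept or onset - kept[-1] >= min_gap:
            kept.append(onset)
    return kept


# Four consecutive onsets from the program output; 62976 follows 61440 by only 1536 samples
detected = [56832, 61440, 62976, 66560]
merged = merge_close_onsets(detected, min_gap=3000)  # min_gap is an assumed threshold
print(merged)
```

`librosa.onset.onset_detect` also accepts `backtrack=True` and peak-picking keyword arguments such as `wait` and `delta`, which may be worth trying before resorting to post-processing. Whether they bring the output closer to the expected boundaries for this file is untested.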


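These are the start and stop times of each sound (transient) as determined by my program, followed by a link to the sound files created when the original sound file is split at those start and stop times: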

in frames: [(1536,6144),(6144,11264),(11264,15360),(15360,20992),(20992,26624),(26624,31744),(31744,36352),(36352,41984),(41984,47104),(47104,51712),(51712,56832),(56832,61440),(61440,62976),(62976,66560),(66560,71680),(71680,76800),(76800,82944),(82944,89088),(89088,92672),(92672,96768),(96768,98304),(98304,103936),(103936,107008),(107008,113664),(113664,117760),(117760,123904),(123904,128512),(128512,139264),(139264,147968),(147968,150016),(150016,153088),(153088,154624),(154624,159232),(159232,164864),(164864,169472),(169472,175616),(175616,179200),(179200,180736),(180736,189440),(189440,196096)]

in time: [(0.03566585034013605,0.1426634013605442),(0.1426634013605442,0.26154956916099775),(0.26154956916099775,0.3566585034013605),(0.3566585034013605,0.4874332879818594),(0.4874332879818594,0.6182080725623583),(0.6182080725623583,0.7370942403628118),(0.7370942403628118,0.84409179138322),(0.84409179138322,0.9748665759637188),(0.9748665759637188,1.0937527437641723),(1.0937527437641723,1.2007502947845805),(1.2007502947845805,1.319636462585034),(1.319636462585034,1.426634013605442),(1.426634013605442,1.4622998639455782),(1.4622998639455782,1.5455201814058959),(1.5455201814058959,1.6644063492063492),(1.6644063492063492,1.7832925170068026),(1.7832925170068026,1.925955918367347),(1.925955918367347,2.0686193197278913),(2.0686193197278913,2.1518396371882087),(2.1518396371882087,2.2469485714285717),(2.2469485714285717,2.282614421768707),(2.282614421768707,2.413389206349206),(2.413389206349206,2.4847209070294785),(2.4847209070294785,2.639272925170068),(2.639272925170068,2.7343818594104308),(2.7343818594104308,2.877045260770975),(2.877045260770975,2.9840428117913835),(2.9840428117913835,3.2337037641723354),(3.2337037641723354,3.4358102494331066),(3.4358102494331066,3.483364716553288),(3.483364716553288,3.55469641723356),(3.55469641723356,3.590362267573696),(3.590362267573696,3.697359818594104),(3.697359818594104,3.8281346031746035),(3.8281346031746035,3.9351321541950113),(3.9351321541950113,4.077795555555555),(4.077795555555555,4.161015873015873),(4.161015873015873,4.196681723356009),(4.196681723356009,4.39878820861678),(4.39878820861678,4.553340226757369)]

sound files

These are what the start and stop times should be:

in frames: [(2067,6431),(6431,10795),(10795,15389),(15389,25495),(25495,28940),(28940,33534),(33534,38587),(38587,43640),(43640,47085),(47085,51679),(51679,55814),(55814,60867),(60867,65231),(65231,69595),(69595,75337),(75337,79931),(79931,83606),(83606,87740),(87740,96928),(96928,101521),(101521,105885),(105885,110479),(110479,115073),(115073,124031),(124031,133218),(133218,137353),(137353,142176),(142176,147229),(147229,152282),(152282,155728),(155728,160321),(160321,169739)]
in time: [(0.0479956462585034,0.1493275283446712),(0.1493275283446712,0.250659410430839),(0.250659410430839,0.3573318820861678),(0.3573318820861678,0.5919927437641723),(0.5919927437641723,0.6719854875283447),(0.6719854875283447,0.7786579591836735),(0.7786579591836735,0.8959883900226757),(0.8959883900226757,1.013318820861678),(1.013318820861678,1.0933115646258504),(1.0933115646258504,1.1999840362811791),(1.1999840362811791,1.2959985487528345),(1.2959985487528345,1.4133289795918367),(1.4133289795918367,1.5146608616780044),(1.5146608616780044,1.6159927437641723),(1.6159927437641723,1.749321723356009),(1.749321723356009,1.8559941950113379),(1.8559941950113379,1.9413275283446711),(1.9413275283446711,2.037318820861678),(2.037318820861678,2.2506637641723355),(2.2506637641723355,2.357313015873016),(2.357313015873016,2.4586448979591835),(2.4586448979591835,2.565317369614512),(2.565317369614512,2.6719898412698413),(2.6719898412698413,2.879994195011338),(2.879994195011338,3.093315918367347),(3.093315918367347,3.1893304308390027),(3.1893304308390027,3.3013202721088435),(3.3013202721088435,3.4186507029478457),(3.4186507029478457,3.5359811337868483),(3.5359811337868483,3.615997097505669),(3.615997097505669,3.722646349206349),(3.722646349206349,3.9413318820861676)]

sound files
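
To quantify how far off the detected boundaries are, each expected boundary can be matched to its nearest detected one. This is only a sketch: the helper `nearest_offsets` is a hypothetical name, and the values are the first few start positions from the two lists in this post.

```python
def nearest_offsets(expected, detected):
    """For each expected boundary, return (expected, nearest detected, detected - expected)."""
    result = []
    for e in expected:
        nearest = min(detected, key=lambda d: abs(d - e))
        result.append((e, nearest, nearest - e))
    return result


# First few start positions from the program output and from the expected output
detected_starts = [1536, 6144, 11264, 15360, 20992]
expected_starts = [2067, 6431, 10795, 15389]

offsets = nearest_offsets(expected_starts, detected_starts)
for exp, det, diff in offsets:
    print(f"expected {exp}, nearest detected {det}, offset {diff}")
```

Detected onsets that are never chosen as a nearest match (20992 in this sample) are candidates for the spurious splits described above.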
