
Error when running TensorFlow models in parallel; they work fine when run sequentially

I am trying to run multiple TensorFlow models in parallel using pathos.multiprocessing.Pool.

The error is the following (the pool creation code is shown further down):

multiprocess.pool.RemoteTraceback:

Traceback (most recent call last):
  File "c:\users\burge\appdata\local\programs\python\python37\lib\site-packages\multiprocess\pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "c:\users\burge\appdata\local\programs\python\python37\lib\site-packages\multiprocess\pool.py", line 44, in mapstar
    return list(map(*args))
  File "c:\users\burge\appdata\local\programs\python\python37\lib\site-packages\pathos\helpers\mp_helper.py", line 15, in <lambda>
    func = lambda args: f(*args)
  File "c:\Users\Burge\Desktop\SwarmMemory\sim.py", line 38, in run
    i.step()
  File "c:\Users\Burge\Desktop\SwarmMemory\agent.py", line 240, in step
    output = self.ai(np.array(self.internal_log).reshape(-1, 1, 9))
  File "c:\users\burge\appdata\local\programs\python\python37\lib\site-packages\tensorflow\python\keras\engine\base_layer.py", line 1012, in __call__
    outputs = call_fn(inputs, *args, **kwargs)
  File "c:\users\burge\appdata\local\programs\python\python37\lib\site-packages\tensorflow\python\keras\engine\sequential.py", line 375, in call
    return super(Sequential, self).call(inputs, training=training, mask=mask)
  File "c:\users\burge\appdata\local\programs\python\python37\lib\site-packages\tensorflow\python\keras\engine\functional.py", line 425, in call
    inputs, training=training, mask=mask)
  File "c:\users\burge\appdata\local\programs\python\python37\lib\site-packages\tensorflow\python\keras\engine\functional.py", line 569, in _run_internal_graph
    assert x_id in tensor_dict, 'Could not compute output ' + str(x)
AssertionError: Could not compute output KerasTensor(type_spec=TensorSpec(shape=(None, 4), dtype=tf.float32, name=None), name='dense_1/BiasAdd:0', description="created by layer 'dense_1'")

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "c:\Users\Burge\Desktop\SwarmMemory\sim.py", line 78, in <module>
    p.map(Sim.run, sims)
  File "c:\users\burge\appdata\local\programs\python\python37\lib\site-packages\pathos\multiprocessing.py", line 137, in map
    return _pool.map(star(f), zip(*args)) # chunksize
  File "c:\users\burge\appdata\local\programs\python\python37\lib\site-packages\multiprocess\pool.py", line 268, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "c:\users\burge\appdata\local\programs\python\python37\lib\site-packages\multiprocess\pool.py", line 657, in get
    raise self._value
AssertionError: Could not compute output KerasTensor(type_spec=TensorSpec(shape=(None, 4), dtype=tf.float32, name=None), name='dense_1/BiasAdd:0', description="created by layer 'dense_1'")

Basically, I am running a simulation with the model supplied to the Sim class. That way, once the sim has run, I can apply a fitness function to the results and then run a genetic algorithm over them.

See the GitHub repo for more information, under the branch python-ver: https://github.com/HarryBurge/SwarmMemory

Edit: In case anyone needs to know how to do this in the future, I used keras-pickle-wrapper to pickle the Keras model and pass it into the run method.

import tensorflow
from tensorflow.keras import Input, Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense
from multiprocessing import freeze_support
from pathos.multiprocessing import ProcessingPool as Pool
from sim import Sim  # the simulation class from this project

if __name__ == '__main__':
    freeze_support()

    # Base network: two stacked LSTMs feeding two dense layers
    model = Sequential()
    model.add(Input(shape=(1, 9)))
    model.add(LSTM(10, return_sequences=True))
    model.add(Dropout(0.1))
    model.add(LSTM(5))
    model.add(Dropout(0.1))
    model.add(Dense(4))
    model.add(Dense(4))

    models = []
    sims = []

    # One cloned model per simulation instance
    for i in range(6):
        models.append(tensorflow.keras.models.clone_model(model))
        sims.append(Sim(models[-1]))

    p = Pool()
    p.map(Sim.run, sims)

Solution

I'm the pathos author. Whenever you see self._value in an error like this, what has usually happened is that something you tried to send to another processor failed to serialize. Admittedly, the error and traceback are a bit obtuse. What you can do, however, is check the serialization with dill and determine whether you need one of the serialization variants (like dill.settings['trace'] = True), or whether you need to restructure your code slightly to be more serialization-friendly. If the class you are working with is one you can edit, an easy thing to do is add a __reduce__ method (or similar) to aid serialization.
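As a minimal sketch of that last suggestion, here is a hypothetical Agent class (not code from the project) that carries unpicklable state, a thread lock, alongside its ordinary data. A __reduce__ method tells the serializer to rebuild the object from only its picklable parts; the example uses the standard pickle module, and dill follows the same reduce protocol:

```python
import pickle
import threading

class Agent:
    """Stand-in class: picklable state plus an unpicklable resource."""
    def __init__(self, state=0):
        self.state = state
        self._lock = threading.Lock()  # thread locks cannot be pickled

    def __reduce__(self):
        # Rebuild from picklable state only; the lock is recreated
        # fresh inside __init__, so it never has to be serialized.
        return (Agent, (self.state,))

# Without __reduce__, pickling would fail on the lock; with it, this works:
clone = pickle.loads(pickle.dumps(Agent(42)))
print(clone.state)  # 42
```

To diagnose what is failing to serialize in the first place, dill also offers helpers such as dill.pickles(obj), which reports whether an object can be pickled, and the trace setting mentioned above for a verbose dump of the serialization path.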
