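How can Snakemake decide which rules to run during execution?
I am working on a bioinformatics pipeline that must run different rules, and so produce different outputs, depending on the contents of an input file: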
def foo(file):
    '''
    Read the file contents and return a boolean value based on them.
    '''
    # Code to read file here...
    return result  # True or False

rule check_input:
    input: "input.txt"
    run:
        result = foo("input.txt")

rule bool_is_True:
    input: "input.txt"
    output: "out1.txt"
    run:
        # Some code to generate out1.txt. This rule is supposed to run only if foo("input.txt") is True

rule bool_is_False:
    input: "input.txt"
    output: "out2.txt"
    run:
        # Some code to generate out2.txt. This rule is supposed to run only if foo("input.txt") is False
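How do I write my rules to handle this situation? Also, how should I write my first rule if the output files are unknown until the check_input rule has run?
Thanks!
Solution
Yes, Snakemake has to know which files to produce before it executes any rule. I therefore suggest using a function that reads your so-called "input file" and defines the workflow's targets accordingly.
For example: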
def getTargetsFromInput():
    targets = list()
    ## read file and add target files to targets
    return targets

rule all:
    input: getTargetsFromInput()
...
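One way the placeholder comment in that function could be filled in is sketched below. This is only an illustration under an assumed file format: each non-empty line of the input file is taken to be a target stem such as "out1", which maps to "out1.txt". The name get_targets_from_input and the line format are hypothetical, not something the original answer specifies.

```python
from pathlib import Path


def get_targets_from_input(path):
    """Return one target file name per usable line of `path`.

    Assumed (hypothetical) format: each line holds a target stem such as
    "out1", which is mapped to the file "out1.txt". Blank lines and lines
    starting with "#" are skipped.
    """
    targets = []
    for line in Path(path).read_text().splitlines():
        stem = line.strip()
        if stem and not stem.startswith("#"):
            targets.append(f"{stem}.txt")
    return targets
```

Because the function runs while the Snakefile is being parsed, the input file must already exist before snakemake is invoked; it cannot be produced by another rule in the same workflow.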
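You can define the path to the input file on the snakemake command line with the --config argument, or use a structured input file (YAML, JSON) directly and load it in the Snakefile with the configfile: keyword: https://snakemake.readthedocs.io/en/stable/snakefiles/configuration.html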
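As a rough sketch of the configuration route: inside a Snakefile the values passed via --config or configfile: are available in the global dictionary config. The helper below takes such a dictionary and picks the target from the first line of the configured file. The key name "input_file", the function name targets_from_config, and the rule that a first line of "out1" selects out1.txt are all assumptions made for illustration.

```python
def targets_from_config(config, default="input.txt"):
    """Choose the workflow target based on the configured input file.

    `config` is expected to behave like Snakemake's global `config` dict.
    The key "input_file" is a hypothetical name chosen for this sketch.
    """
    path = config.get("input_file", default)
    with open(path) as handle:
        first_line = handle.readline().strip()
    # Assumed convention: a first line of "out1" selects out1.txt,
    # anything else selects out2.txt.
    if first_line == "out1":
        return ["out1.txt"]
    return ["out2.txt"]
```

In a Snakefile this could be used as `input: targets_from_config(config)` in rule all, with the path supplied as `snakemake --config input_file=path/to/input.txt`.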
Thanks, Eric. I got it working with this:
def getTargetsFromInput(file):
    # Pick the target from the first line of the input file
    with open(file) as f:
        line = f.readline()
    if line.strip() == "out1":
        return "out1.txt"
    else:
        return "out2.txt"

rule all:
    input: getTargetsFromInput("input.txt")

rule out1:
    input: "input.txt"
    output: "out1.txt"
    shell: "echo 'out1' > {output}"

rule out2:
    input: "input.txt"
    output: "out2.txt"
    shell: "echo 'out2' > {output}"