Does Snakemake support non-file inputs?
I get a MissingInputException when running the following rule:
configfile: "Configs.yaml"

rule download_data_from_ZFIN:
    input:
        anatomy_item = config["ZFIN_url"]["anatomy_item"],
        xpat_stage_anatomy = config["ZFIN_url"]["xpat_stage_anatomy"],
        xpat_fish = config["ZFIN_url"]["xpat_fish"],
        anatomy_synonyms = config["ZFIN_url"]["anatomy_synonyms"]
    output:
        anatomy_item = os.path.join(os.getcwd(), config["download_data_from_ZFIN"]["dir"], "anatomy_item.tsv"),
        xpat_stage_anatomy = os.path.join(os.getcwd(), "xpat_stage_anatomy.tsv"),
        xpat_fish = os.path.join(os.getcwd(), "xpat_fish.tsv"),
        anatomy_synonyms = os.path.join(os.getcwd(), "anatomy_synonyms.tsv")
    shell:
        "wget -O {output.anatomy_item} {input.anatomy_item};" \
        "wget -O {output.anatomy_synonyms} {input.anatomy_synonyms};" \
        "wget -O {output.xpat_stage_anatomy} {input.xpat_stage_anatomy};" \
        "wget -O {output.xpat_fish} {input.xpat_fish};"
Here is the content of my Configs.yaml file:
ZFIN_url:
    # Zebrafish Anatomy Term
    anatomy_item: "https://zfin.org/downloads/file/anatomy_item.txt"
    # Zebrafish Gene Expression by Stage and Anatomy Term
    xpat_stage_anatomy: "https://zfin.org/downloads/file/xpat_stage_anatomy.txt"
    # ZFIN Genes with Expression Assay Records
    xpat_fish: "https://zfin.org/downloads/file/xpat_fish.txt"
    # Zebrafish Anatomy Term Synonyms
    anatomy_synonyms: "https://zfin.org/downloads/file/anatomy_synonyms.txt"

download_data_from_ZFIN:
    dir: ZFIN_data
The error message is:
Building DAG of jobs...
MissingInputException in line 10 of /home/zhangdong/works/NGS/coevolution/snakemake/coevolution.rule:
Missing input files for rule download_data_from_ZFIN:
https://zfin.org/downloads/file/anatomy_item.txt
I would like to confirm whether this exception is caused by the non-file (URL) entries in the rule's input section.
Solution
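The exception does come from the URLs: Snakemake treats each plain string under input as a path to a file, and a URL string does not exist as a local file, so it is reported as missing. The plain-Python check below is only a simplified illustration of that idea, not Snakemake's actual implementation, and the helper name missing_local_inputs is hypothetical.

```python
import os


def missing_local_inputs(inputs):
    """Return the entries that do not exist as local paths.

    Hypothetical helper for illustration only; Snakemake's real
    input resolution is more involved than this.
    """
    return [path for path in inputs if not os.path.exists(path)]


# The URL from Configs.yaml is just a string, not a file on disk,
# so it ends up in the list of missing inputs.
urls = ["https://zfin.org/downloads/file/anatomy_item.txt"]
print(missing_local_inputs(urls))
```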
Note that you can also use remote files as input, which lets you avoid the rule download_data_from_ZFIN altogether. For example:
from snakemake.remote.HTTP import RemoteProvider as HTTPRemoteProvider

HTTP = HTTPRemoteProvider()

rule all:
    input:
        'output.txt',

rule one:
    input:
        # Some file from the web
        x = HTTP.remote('https://plasmodb.org/common/downloads/release-49/PbergheiANKA/txt/PlasmoDB-49_PbergheiANKA_CodonUsage.txt', keep_local=True)
    output:
        'output.txt',
    shell:
        r"""
        # Do something with the remote file
        head {input.x} > {output}
        """
The remote file will be downloaded and stored locally as plasmodb.org/common/.../PlasmoDB-49_PbergheiANKA_CodonUsage.txt
Thanks a lot, @dariober. I tried the following code and it worked:
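import os
from snakemake.remote.HTTP import RemoteProvider as HTTPRemoteProvider

configfile: "Configs.yaml"

HTTP = HTTPRemoteProvider()

rule all:
    input:
        # One target per key under ZFIN_url in Configs.yaml
        expand(os.path.join(os.getcwd(), config["download_data_from_ZFIN"]["dir"], "{item}.tsv"), item=list(config["ZFIN_url"].keys()))

rule download_data_from_ZFIN:
    input:
        # Look up the URL for this item and let Snakemake fetch it
        lambda wildcards: HTTP.remote(config["ZFIN_url"][wildcards.item], keep_local=True)
    output:
        # Same directory as in rule all, so that {item} matches a bare config key
        os.path.join(os.getcwd(), config["download_data_from_ZFIN"]["dir"], "{item}.tsv")
    threads:
        1
    shell:
        # mv takes a source and a destination; no shell redirection is needed
        "mv {input} {output}"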
This code is more Snakemake-like, but I still have two questions:
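- Is there a way to specify the output file name of the download? At the moment I use the mv command to achieve this.
- Does the remote files feature support parallel downloads? I ran the code above with --cores 6, but the files were still downloaded one at a time.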
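On both points, one option is to skip the remote provider and pass each URL through params instead of input, so that Snakemake does not look for it on disk. The Snakefile below is a minimal sketch of that idea, not the approach from the answer above. It assumes the same Configs.yaml, assumes wget is installed, and uses a variable name, OUT_DIR, that is introduced here purely for illustration.

```python
# Sketch only: URLs go through params, so they are never treated as input files.
configfile: "Configs.yaml"

# Assumed helper variable: the download directory from Configs.yaml.
OUT_DIR = config["download_data_from_ZFIN"]["dir"]

rule all:
    input:
        # One target file per key under ZFIN_url.
        expand(OUT_DIR + "/{item}.tsv", item=list(config["ZFIN_url"].keys()))

rule download_data_from_ZFIN:
    output:
        # The output name is chosen here directly, so no mv step is needed.
        OUT_DIR + "/{item}.tsv"
    params:
        # Look up the URL that belongs to this wildcard value.
        url=lambda wildcards: config["ZFIN_url"][wildcards.item]
    threads: 1
    shell:
        "wget -O {output} {params.url}"
```

Since every file is then an independent single-threaded job, running with --cores 6 should allow Snakemake to schedule several of these downloads at the same time. The trade-off is that Snakemake no longer tracks the remote file as an input, so a change on the server will not trigger a re-run on its own.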