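How to resolve the Snakemake error "Unexpected keyword expand in rule definition"
As the title says, my Snakefile gives me a syntax error for the expand function in rule all. I know this is usually caused by whitespace/indentation errors, but I have confirmed that there are no tabs in the file. I have stripped out the whitespace and searched the file with grep. I would appreciate any suggestions.
Error message: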
SyntaxError in line 14 of /PATH/to/Snakefile:
Unexpected keyword expand in rule definition (Snakefile, line 14)
Code:
from glob import glob
from numpy import unique

reads = glob('{}/*'.format(config['readDir']))
samples = []
for i in reads:
    sampleName = i.replace('{}/'.format(config['readDir']), '')
    sampleName = sampleName.replace('{}'.format(config['readSuffix1']), '')
    sampleName = sampleName.replace('{}'.format(config['readSuffix2']), '')
    samples.append(sampleName)
samples = unique(samples)

rule all:
    expand('fastqc/{sample}_1_fastqc.html', sample=samples),
    expand('gene_count/{sample}.count', sample=samples)

rule fastqc:
    input:
        r1 = config['readDir'] + '/{sample}' + config['readSuffix1'],
        r2 = config['readDir'] + '/{sample}' + config['readSuffix2']
    output:
        o1 = 'fastqc/{sample}_1_fastqc.html',
        o2 = 'fastqc/{sample}_2_fastqc.html'
    params:
        'fastqc'
    shell:
        'fastqc {input.r1} {input.r2} -o {params}'

rule trim:
    input:
        r1 = config['readDir'] + '/{sample}' + config['readSuffix1'],
        r2 = config['readDir'] + '/{sample}' + config['readSuffix2']
    output:
        'trimmed_reads/{sample}_val_1.fq',
        'trimmed_reads/{sample}_val_2.fq'
    params:
        outDir = 'trimmed_reads',
        suffix = '{sample}',
        minPhred = config['minPhred'],
        minOverlap = config['minOverlap']
    shell:
        'trim_galore --paired --quality {params.minPhred} '
        '--stringency {params.minOverlap} --basename {params.suffix} '
        '--output_dir {params.outDir} {input.r1} {input.r2}'

rule align:
    input:
        r1 = 'trimmed_reads/{sample}_val_1.fq',
        r2 = 'trimmed_reads/{sample}_val_2.fq'
    output:
        sam = temp('aligned_reads/{sample}.sam'),
        bam = 'aligned_reads/{sample}.bam'
    params:
        ref = config['hisatRef']
    threads:
        config['threads']
    log:
        'logs/{sample}_hisat2.log'
    shell:
        'hisat2 --dta -p {threads} -x {params.ref} '
        '-1 {input.r1} -2 {input.r2} -S {output.sam} 2> {log}; '
        'samtools sort -@ {threads} -o {output.bam} {output.sam}; '

rule sort_name:
    input:
        'aligned_reads/{sample}.bam'
    output:
        bam = temp('aligned_reads/{sample}_name_sorted.bam'),
        index = temp('aligned_reads/{sample}_name_sorted.bam.bai')
    threads:
        config['threads']
    shell:
        'samtools sort -n -@ {threads} -o {output.bam} {input}; '

rule count:
    input:
        bam = 'aligned_reads/{sample}.bam'
    output:
        'gene_count/{sample}.count'
    params:
        annotations = config['annotations'],
        minMapq = config['minMapq'],
        stranded = config['stranded']
    shell:
        'htseq-count -s {params.stranded} -a {params.minMapq} '
        '--additional_attr=gene_name --additional_attr=gene_type '
        '{input.bam} {params.annotations} > {output}'
Solution
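The parser raises this error because the two expand calls in rule all are not attached to a directive, so expand itself is read as an unexpected keyword. With the calls placed under input:, the comma-separated form is valid. If you prefer a single list, you can replace the , with + as shown below.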
expand('fastqc/{sample}_1_fastqc.html',sample=samples) + expand('gene_count/{sample}.count',sample=samples)
You can also merge the two into a single expand call, as follows:
expand(['fastqc/{sample}_1_fastqc.html','gene_count/{sample}.count'],sample=samples)
The expand calls in rule all must sit under an input: directive. The following code resolves the problem:
rule all:
    input:
        expand('fastqc/{sample}_1_fastqc.html', sample=samples),
        expand('gene_count/{sample}.count', sample=samples)
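
To see why the comma-separated, + and single-call forms all request the same targets once they are under input:, here is a rough pure-Python sketch. It does not import Snakemake: expand_sketch and flatten are hypothetical stand-ins that only model the behaviour of expand and of nested input lists being collected, and the sample names are made up for illustration.

```python
from itertools import product


def expand_sketch(patterns, **wildcards):
    """Hypothetical stand-in for Snakemake's expand(); not the real implementation."""
    if isinstance(patterns, str):
        patterns = [patterns]
    names = list(wildcards)
    results = []
    # Fill each pattern with every combination of the wildcard values.
    for pattern in patterns:
        for combo in product(*(wildcards[name] for name in names)):
            results.append(pattern.format(**dict(zip(names, combo))))
    return results


def flatten(items):
    """Flatten nested lists, a rough model of comma-separated input entries."""
    flat = []
    for item in items:
        if isinstance(item, str):
            flat.append(item)
        else:
            flat.extend(flatten(item))
    return flat


samples = ["sampleA", "sampleB"]  # hypothetical sample names
fastqc = "fastqc/{sample}_1_fastqc.html"
counts = "gene_count/{sample}.count"

# Form 1: two calls separated by a comma (two list entries, then flattened).
comma_form = flatten([expand_sketch(fastqc, sample=samples),
                      expand_sketch(counts, sample=samples)])
# Form 2: two calls concatenated with +.
plus_form = expand_sketch(fastqc, sample=samples) + expand_sketch(counts, sample=samples)
# Form 3: one call with a list of patterns.
list_form = expand_sketch([fastqc, counts], sample=samples)

print(comma_form == plus_form == list_form)  # prints True
```

Under these assumptions the choice between the three forms is a matter of style. What actually removes the syntax error is the input: line.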