How to resolve an InputFunctionException in Snakemake when a lambda input function returns a list of files
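I am writing a Snakemake rule that takes an input value from a parsed YAML file and returns, as a list, the files associated with that group label. I am running into a strange error.
My function prints its return value just before returning, so it does appear to be returning a list: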
['/SAN/vyplab/alb_projects/data/muscle/analysis/feature_counts/Ctrl1_featureCounts_results.txt','/SAN/vyplab/alb_projects/data/muscle/analysis/feature_counts/Ctrl4_featureCounts_results.txt','/SAN/vyplab/alb_projects/data/muscle/analysis/feature_counts/Ctrl2_featureCounts_results.txt','/SAN/vyplab/alb_projects/data/muscle/analysis/feature_counts/Ctrl3_featureCounts_results.txt']
However, I get an AttributeError. This is unexpected, because I adapted the function directly from an earlier pipeline of mine where it works without problems:
InputFunctionException in line 26 of /SAN/vyplab/alb_projects/pipelines/rna_seq_snakemake/rules/deseq2_featureCounts.smk:
AttributeError: 'str' object has no attribute 'list'
Wildcards:
bse=control
contrast=ContrastvControl
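The message above only names the rule line, not the line inside the helper that raised the AttributeError. One way to get more detail, as a rough sketch, is to wrap the helper so that the full Python traceback is printed before the exception propagates. The names `show_traceback` and `broken_helper` are hypothetical and only serve the illustration; `broken_helper` is a stand-in that fails with the same message, not the actual cause of the error.

```python
import functools
import traceback


def show_traceback(func):
    """Print the full traceback of any exception raised by func, then re-raise it."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            # Written to stderr, so it shows up next to the Snakemake error message.
            traceback.print_exc()
            raise
    return wrapper


# Hypothetical stand-in that fails the same way as the error above.
@show_traceback
def broken_helper(grp):
    return grp.list  # a str has no attribute 'list'


try:
    broken_helper("control")
except AttributeError as err:
    caught = str(err)
```

Applied to this pipeline, the same decorator could be placed on `featurecounts_files_from_contrast`, so that the printed traceback points at the exact statement that accesses `.list` on a string.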
The rule looks like this. I have left out the shell and params directives because I don't think they are needed for debugging:
rule run_standard_deseq:
    input:
        base_group = lambda wildcards: featurecounts_files_from_contrast(wildcards.bse),
        contrast_group = lambda wildcards: featurecounts_files_from_contrast(wildcards.contrast)
    output:
        os.path.join(DESEQ2_DIR,"{bse}_{contrast}" + "normed_counts.csv.gz")
Implementation of the helper function:
def featurecounts_files_from_contrast(grp):
    """
    given a contrast name or list of groups return a list of the files in that group
    """
    #reading in the samples
    samples = pd.read_csv(config['sampleCSVpath'])
    #there should be a column which allows you to exclude samples
    samples2 = samples.loc[samples.exclude_sample_downstream_analysis != 1]
    #read in the comparisons and make a dictionary of comparisons, comparisons needs to be in the config file
    compare_dict = load_comparisons()
    #go through the values of the dictionary and break when we find the right groups in that contrast
    grps, comparison_column = return_sample_names_group(grp)
    #take the sample names corresponding to those groups
    if comparison_column == "":
        return [""]
    grp_samples = list(set(list(samples2[samples2[comparison_column].isin(grps)].sample_name)))
    feature_counts_outdir = get_output_dir(config["project_top_level"], config["feature_counts_output_folder"])
    fc_suffix = "_featureCounts_results.txt"
    #build a list with the full path from those sample names
    fc_files = [os.path.join(feature_counts_outdir, x + fc_suffix)
                for x in grp_samples]
    fc_files = list(set(fc_files))
    print(fc_files)
    return fc_files
The print statement shows the correct files, so I expected this to work.
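A side note that is separate from the error itself: the printed list is not in sample order (Ctrl1, Ctrl4, Ctrl2, Ctrl3), which is what `list(set(...))` tends to produce, since the iteration order of a set of strings is not guaranteed and can differ between Python runs. A minimal sketch of a deterministic alternative is below; `build_fc_paths` is a hypothetical name, and the directory and sample names are made up for the example.

```python
import os


def build_fc_paths(sample_names, outdir, suffix="_featureCounts_results.txt"):
    """Return one featureCounts path per unique sample name, in sorted order."""
    # sorted(set(...)) removes duplicates and fixes the order across runs.
    unique_names = sorted(set(sample_names))
    return [os.path.join(outdir, name + suffix) for name in unique_names]


# Example input with a duplicate and an unsorted order.
paths = build_fc_paths(["Ctrl4", "Ctrl1", "Ctrl1", "Ctrl2"], "feature_counts")
```

With this, the same sample table should always yield the same input list, which makes it easier to compare what the function prints from one run to the next.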