
InputFunction error in Snakemake when returning a list of files from a lambda function

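I am writing a Snakemake rule that takes an input value from a parsed YAML file and returns the files associated with that group label as a list, but I am running into a strange error.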

My function prints its return value just before returning, so it does appear to be returning a list:

['/SAN/vyplab/alb_projects/data/muscle/analysis/feature_counts/Ctrl1_featureCounts_results.txt','/SAN/vyplab/alb_projects/data/muscle/analysis/feature_counts/Ctrl4_featureCounts_results.txt','/SAN/vyplab/alb_projects/data/muscle/analysis/feature_counts/Ctrl2_featureCounts_results.txt','/SAN/vyplab/alb_projects/data/muscle/analysis/feature_counts/Ctrl3_featureCounts_results.txt']

However, I get an AttributeError, which is unexpected because I adapted this directly from an earlier pipeline of mine where the same function works perfectly:

InputFunctionException in line 26 of /SAN/vyplab/alb_projects/pipelines/rna_seq_snakemake/rules/deseq2_featureCounts.smk:
AttributeError: 'str' object has no attribute 'list'
Wildcards:
bse=control
contrast=ContrastvControl
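
The message itself narrows the search: Python raises exactly this AttributeError when code reads a `.list` attribute on a plain string, and Snakemake reports exceptions raised while evaluating an input function as an InputFunctionException. Below is a minimal sketch of one way to surface the full traceback. It assumes nothing about the real helpers; `broken_helper` and `with_traceback` are hypothetical names used only for illustration.

```python
import traceback


def broken_helper(group):
    # Hypothetical stand-in: treats a string as if it had a `.list` attribute,
    # which is the kind of access that produces the error shown above.
    return group.list


def with_traceback(func):
    """Wrap a function so the full traceback is printed before re-raising."""
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            # Print the complete stack so the failing line is visible.
            traceback.print_exc()
            raise
    return wrapper


if __name__ == "__main__":
    try:
        with_traceback(broken_helper)("control")
    except AttributeError as err:
        print(err)
```

In a Snakefile the lambda could be wrapped the same way, for example `lambda wildcards: with_traceback(featurecounts_files_from_contrast)(wildcards.bse)`. The printed traceback should then point at the line that performs the `.list` access, which may sit in a helper that is not shown in this question.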

The rule looks like this. I have left out the shell and params directives because I don't think they are needed for debugging:

rule run_standard_deseq:
    input:
        base_group = lambda wildcards: featurecounts_files_from_contrast(wildcards.bse),
        contrast_group = lambda wildcards: featurecounts_files_from_contrast(wildcards.contrast)
    output:
        os.path.join(DESEQ2_DIR,"{bse}_{contrast}" + "normed_counts.csv.gz")

The helper function is implemented as follows:

def featurecounts_files_from_contrast(grp):
    """
    Given a contrast name or list of groups, return a list of the files in that group.
    """
    # Read in the samples
    samples = pd.read_csv(config['sampleCSVpath'])
    # There should be a column which allows you to exclude samples
    samples2 = samples.loc[samples.exclude_sample_downstream_analysis != 1]
    # Read in the comparisons and make a dictionary of comparisons;
    # comparisons needs to be in the config file
    compare_dict = load_comparisons()
    # Go through the values of the dictionary and break when we find the right groups in that contrast
    grps, comparison_column = return_sample_names_group(grp)
    # Take the sample names corresponding to those groups
    if comparison_column == "":
        return [""]
    grp_samples = list(set(list(samples2[samples2[comparison_column].isin(grps)].sample_name)))
    feature_counts_outdir = get_output_dir(config["project_top_level"], config["feature_counts_output_folder"])
    fc_suffix = "_featureCounts_results.txt"

    # Build a list with the full path from those sample names
    fc_files = [os.path.join(feature_counts_outdir, x + fc_suffix)
                for x in grp_samples]
    fc_files = list(set(fc_files))
    print(fc_files)

    return fc_files
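
The path-building step at the end of the helper can be exercised on its own, outside Snakemake, to confirm that it returns a plain list of strings. The sketch below is an illustration only: `build_fc_paths` is a hypothetical name, the sample names and output directory are made up, and sorting the result is an assumption that a deterministic order is wanted (`list(set(...))` gives no ordering guarantee).

```python
import os


def build_fc_paths(sample_names, outdir, suffix="_featureCounts_results.txt"):
    """Return one featureCounts path per unique sample name, in sorted order."""
    # sorted(set(...)) removes duplicates and gives a stable order,
    # unlike list(set(...)), whose order can differ between runs.
    return [os.path.join(outdir, name + suffix) for name in sorted(set(sample_names))]


if __name__ == "__main__":
    # Illustrative inputs only; not taken from the real sample sheet.
    paths = build_fc_paths(["Ctrl2", "Ctrl1", "Ctrl2"], "feature_counts")
    for p in paths:
        print(p)
```

If a check like this passes, the list construction is unlikely to be the source of the AttributeError, which would leave the helpers called earlier in the function as the next place to look.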

The print statement shows the correct files, so I expected this to work.
