Segmentation fault when creating local structures across multiple processes in MPI

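I am very new to distributed parallel programming. I am trying to create local instances of a structure (a CountMin Sketch) in multiple processes, and then initialize and update these local sketches independently within each process, but I am getting a segmentation fault.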

The structure is

  struct CM_type{
      long long count;
      int depth;
      int width;
      int ** counts;
      unsigned int *hasha,*hashb;
  };

and the initialization and update function is

  std::vector<CM_type> CM_Init(int width,int depth,int seed,std::vector<KmerPairs> 
         &all_kmers_from_procs1)
  {
      int j;
      prng_type * prng;
      prng=prng_Init(-abs(seed),2); 
      #pragma omp parallel shared (width,depth,prng) //private(j)
      {
          #pragma omp for
          for(int i = 0; i < size; i++) //size is the number of processes
          {
               std::vector<CM_type> cm_loc;
               cm_loc[i].depth=depth;
               cm_loc[i].width=width;
               cm_loc[i].count=0;
               cm_loc[i].counts=(int **)calloc(sizeof(int *),cm_loc[i].width);
               cm_loc[i].counts[0]=(int *)calloc(sizeof(int),cm_loc[i].depth*cm_loc[i].width);
               cm_loc[i].hasha=(unsigned int *)calloc(sizeof(unsigned int),cm_loc[i].depth);
               cm_loc[i].hashb=(unsigned int *)calloc(sizeof(unsigned int),cm_loc[i].depth);
               if (cm_loc[i].counts && cm_loc[i].hasha && cm_loc[i].hashb && cm_loc[i].counts[i])
               {
                  for (j=0;j<depth;j++)
                  {
                      cm_loc[i].hasha[j]=prng_int(prng) & MOD;
                      cm_loc[i].hashb[j]=prng_int(prng) & MOD;
                      // pick the hash functions
                      cm_loc[i].counts[j]=(int *) cm_loc[i].counts[0]+(j*cm_loc[i].width);
                  }
               }
               for (size_t it=0; it<all_kmers_from_procs1.size(); it++) 
               {
                    KmerPairs k_entry1; //KmerPairs is another structure which has two members
                    k_entry1.seq = all_kmers_from_procs1[it].seq;
                    k_entry1.k_count = all_kmers_from_procs1[it].k_count;
                    int loc = 0;
                   __sync_add_and_fetch(&cm_loc[i].count,all_kmers_from_procs1[it].k_count);
                   for(int b = 0; b < cm_loc[i].depth; b++)
                   {
                       
                       cm_loc[i].counts[b][hash31(cm_loc[i].hasha[b],cm_loc[i].hashb[b],all_kmers_from_procs1[it].seq) %cm_loc[i].width]+= all_kmers_from_procs1[it].k_count; 
                       printf("\nthe CM count is %d %d and the count is %llu \n",cm_loc[i].counts[b][hash31(cm_loc[i].hasha[b],all_kmers_from_procs1[it].seq) %cm_loc[i].width],cm_loc[i].count);
                       
                   }
              
              }
        } 
       
  } 

}
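
A few things in the function above are worth checking. `cm_loc` is declared inside the loop as an empty `std::vector` and then indexed with `cm_loc[i]`, which is undefined behavior on an empty vector. The row-pointer table is allocated with `width` entries but filled for `j < depth`, and the allocation check tests `counts[i]` rather than `counts[0]`. The following is a minimal sketch of allocating a single `CM_type` with `depth` row pointers into one contiguous block. `cm_alloc` and `cm_free` are hypothetical helper names, not part of the original code, and this is only one possible layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

struct CM_type {
    long long count;
    int depth;
    int width;
    int **counts;
    unsigned int *hasha, *hashb;
};

// Hypothetical helper: allocate one zero-initialized sketch.
// Returns false (and frees any partial allocations) on failure.
bool cm_alloc(CM_type &cm, int width, int depth) {
    assert(width > 0 && depth > 0);
    const std::size_t d = static_cast<std::size_t>(depth);
    const std::size_t w = static_cast<std::size_t>(width);

    cm.count = 0;
    cm.depth = depth;
    cm.width = width;
    // One row pointer per hash function, so the table has depth entries.
    cm.counts = static_cast<int **>(std::calloc(d, sizeof(int *)));
    cm.hasha = static_cast<unsigned int *>(std::calloc(d, sizeof(unsigned int)));
    cm.hashb = static_cast<unsigned int *>(std::calloc(d, sizeof(unsigned int)));
    // All counters live in a single contiguous depth * width block.
    int *block = static_cast<int *>(std::calloc(d * w, sizeof(int)));

    if (!cm.counts || !cm.hasha || !cm.hashb || !block) {
        std::free(block);
        std::free(cm.counts);
        std::free(cm.hasha);
        std::free(cm.hashb);
        cm.counts = nullptr;
        cm.hasha = cm.hashb = nullptr;
        return false;
    }
    // Row j starts j * width counters into the block.
    for (std::size_t j = 0; j < d; ++j) {
        cm.counts[j] = block + j * w;
    }
    return true;
}

// Hypothetical helper: release everything cm_alloc acquired.
void cm_free(CM_type &cm) {
    if (cm.counts) {
        std::free(cm.counts[0]);  // the contiguous counter block
    }
    std::free(cm.counts);
    std::free(cm.hasha);
    std::free(cm.hashb);
    cm.counts = nullptr;
    cm.hasha = cm.hashb = nullptr;
}
```

With this layout, `cm.counts[b][col]` is valid for `0 <= b < depth` and `0 <= col < width`, which matches how the update loop indexes the table.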

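Each of my processes has a structure called KmerPairs with two members: the k-mer sequence and a count showing how many times that particular k-mer occurs in that process. I want to build a local Count-Min sketch by hashing the k-mer sequence (already converted to binary) and incrementing the sketch counters by the k-mer's count member. But I get segmentation faults like this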

  CountMin Sketch Started 
  [compute-2-9-ib:29987:0] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
  [compute-2-9-ib:29949:0] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
  [compute-2-9-ib:29950:0] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
  [compute-2-9-ib:29951:0] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
  [compute-2-9-ib:29952:0] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
  [compute-2-9-ib:29953:0] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
  [compute-2-9-ib:29954:0] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
  [compute-2-9-ib:29955:0] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
  [compute-2-9-ib:29956:0] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
  [compute-2-9-ib:29960:0] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
  [compute-2-9-ib:29963:0] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
  [compute-2-9-ib:29965:0] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
  [compute-2-9-ib:29970:0] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
  [compute-2-9-ib:29973:0] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
  [compute-2-9-ib:29976:0] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
  [compute-2-9-ib:29979:0] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
  [compute-2-9-ib:29981:0] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
  [compute-2-9-ib:29983:0] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
  [compute-2-9-ib:29985:0] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
  [compute-2-9-ib:29948:0] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))

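Those are 20 segmentation faults for my 20 processes. Any guidance would be greatly appreciated. I am happy to provide more information. Thanks.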
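
For comparison, here is a minimal sketch of the per-process Count-Min sketch described in the question, written with `std::vector` storage so that no manual allocation is involved. Several parts are assumptions made for illustration only: `KmerPair` stands in for the question's `KmerPairs` with the sequence packed into a 64-bit integer, the hash is a simple `(a*x + b) mod (2^31 - 1)` family rather than the original `hash31`, and `std::mt19937` replaces `prng_Init`/`prng_int`. Because every MPI process has its own address space, each rank would build exactly one sketch from its own k-mers, so no loop over the number of processes is needed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <random>
#include <vector>

// Mersenne prime 2^31 - 1, used as the modulus of the illustrative hash.
constexpr std::uint64_t kPrime = 2147483647ULL;

// Hypothetical stand-in for the question's KmerPairs.
struct KmerPair {
    std::uint64_t seq;  // k-mer packed into binary (assumed to fit in 64 bits)
    int k_count;        // occurrences of this k-mer in the local process
};

struct CountMinSketch {
    int depth;
    int width;
    long long count = 0;                      // sum of all added counts
    std::vector<int> counts;                  // depth * width, row-major
    std::vector<std::uint64_t> hasha, hashb;  // one (a, b) pair per row

    CountMinSketch(int width_, int depth_, unsigned seed)
        : depth(depth_), width(width_) {
        assert(width > 0 && depth > 0);
        counts.assign(static_cast<std::size_t>(depth) * width, 0);
        std::mt19937 rng(seed);
        std::uniform_int_distribution<std::uint64_t> pick(1, kPrime - 1);
        for (int j = 0; j < depth; ++j) {
            hasha.push_back(pick(rng));
            hashb.push_back(pick(rng));
        }
    }

    // Flat index of the counter that `key` maps to in row `row`.
    std::size_t cell(int row, std::uint64_t key) const {
        std::uint64_t h = (hasha[row] * (key % kPrime) + hashb[row]) % kPrime;
        return static_cast<std::size_t>(row) * width +
               static_cast<std::size_t>(h % static_cast<std::uint64_t>(width));
    }

    // Add `c` occurrences of `key` to one counter in every row.
    void add(std::uint64_t key, int c) {
        count += c;
        for (int j = 0; j < depth; ++j) counts[cell(j, key)] += c;
    }

    // Estimated count of `key`: the minimum over all rows (never an underestimate).
    int estimate(std::uint64_t key) const {
        int best = std::numeric_limits<int>::max();
        for (int j = 0; j < depth; ++j) best = std::min(best, counts[cell(j, key)]);
        return best;
    }
};

// Build the sketch for the k-mers held by the calling process.
CountMinSketch build_local_sketch(const std::vector<KmerPair> &kmers,
                                  int width, int depth, unsigned seed) {
    CountMinSketch cm(width, depth, seed);
    for (const KmerPair &k : kmers) cm.add(k.seq, k.k_count);
    return cm;
}
```

In an MPI program each rank could call `build_local_sketch` once on its own k-mer list. Using the same `seed` on every rank gives every sketch the same hash functions, which matters if the sketches are later merged by adding their counter arrays element-wise.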
