
Hash table problem in C

How do I fix a hash table problem in C?

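So I have an assignment to write a C program that reads several sentences (a 140 MB file) and, based on a second input (a number), returns the Nth most common word. My idea is to build a hash table with linear probing: whenever I get a new element, I hash it with djb2 as a function of its probe position, and if there is a collision I rehash. After that I apply quicksort on the occurrence counts and finally access the result by index. I am having trouble finishing the hash table with linear probing in C. I am pretty sure I have it done, but every time I run it I get a heap buffer overflow in lldb. I have tried to find the problem, but I still cannot solve it.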
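For reference, the hashing scheme described above could look roughly like the following. This is only a minimal sketch: the post does not show `hashString` or say how the seed from `getPrime()` is used, so `djb2` and `probe_index` are hypothetical names, and the sketch uses the textbook djb2 constants (5381 and 33) without a seed.

```c
#include <assert.h>
#include <stddef.h>

/* djb2 string hash: h = h * 33 + c, starting from 5381. */
static unsigned long djb2(const char *s)
{
    unsigned long h = 5381;
    for (; *s != '\0'; ++s)
        h = h * 33 + (unsigned char)*s;   /* unsigned char avoids negative values */
    return h;
}

/* Slot to inspect on probe attempt 0, 1, 2, ...
 * Linear probing: start at hash % size and step by one, wrapping around.
 * Both terms are reduced first so the addition cannot wrap. */
static size_t probe_index(const char *key, size_t size, size_t attempt)
{
    assert(size > 0);
    return (size_t)((djb2(key) % size + attempt % size) % size);
}
```

With this shape, the result is always in the range `0 .. size - 1`, so indexing an array of `size` entries with it stays inside the allocation as long as `size` matches the array that is actually being indexed.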

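Am I running out of heap memory? The file seems relatively small to be taking up this much memory. I used AddressSanitizer, and it reports a heap-buffer-overflow during insertion.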

I do not think I am touching memory outside the allocated region, but I am not 100% sure.

Any idea what is going wrong? Here is the implementation of table.c; the struct definitions are shown further down.

Here is the more detailed message from AddressSanitizer:

thread #1: tid = 0x148b44,0x0000000100166b20 libclang_rt.asan_osx_dynamic.dylib`__asan::AsanDie(),queue = 'com.apple.main-thread',stop reason = Heap buffer overflow

{
 "access_size": 1,"access_type": 1,"address": 105690555220216,"description": "heap-buffer-overflow","instrumentation_class": "AddressSanitizer","pc": 4294981434,"stop_type": "fatal_error"
}

table.c:

#include "table.h"
#include "entities.h"

static inline entry_t* entryInit(const char* const value){

    unsigned int len   = strlen(value);
    entry_t* entry     = malloc(sizeof(entry));
    entry->value       = malloc(sizeof(char*) * len);
    strncpy(entry->value,value,strlen(value));
    entry->exists      = 1;
    entry->occurence   = 1;

    return entry;
}

table_t* tableInit(const unsigned int size){

    table_t* table     = malloc(sizeof(table_t));
    table->entries     = malloc(size*sizeof(entry_t));
    table->seed        = getPrime();
    table->size        = size;
    table->usedEntries = 0U;

    return table;
}

//okay,there is definitely an issue here
table_t* tableResize(table_t* table,const unsigned int newSize){

    //most likely wont happen but if there is an overflow then we have a problem
    if(table->size > newSize) return NULL;

    //create a temp array of the realloced array,then do changes there
    entry_t* temp = calloc(newSize,sizeof(entry_t));

    table->size = newSize;

    //temp pointer to an entry
    entry_t *tptr = NULL;
    unsigned int pos = 0;
    unsigned int index = 0;

    while(pos != table->size){

        tptr = &table->entries[pos];

        if(tptr->exists == 1){

            index = hashString(table->seed,tptr->value,table->size,pos);

            temp[index] = *entryInit(tptr->value);

            temp[index].occurence = tptr->occurence;

            break;
        }

        else pos++;
   }

   table->entries = temp;
   //Todo: change table destroy to free the previous array from the table
   free(temp);

   return table;
}

//insert works fine,it is efficient enough to add something in the table
unsigned int tableInsert(table_t* table,const char* const value){

    //decide when to resize,might create a large enough array to bloat the memory?
    if(table->usedEntries >(unsigned int)(2*(table->size/3))) table = tableResize(table,table->size*2);

    entry_t* entry = NULL;
    unsigned int index;
    auto int position = 0;

    while(position != table->size){

        //calculate the hash of our string as a function of the current position on the table
        index = hashString(table->seed,value,table->size,position);
        entry = &table->entries[index];

        if(entry->exists == 0){

            *entry = *entryInit(value);
            table->usedEntries++;
            return index;

        } else if (entry->exists == 1 && strcmp(entry->value,value) == 0){

            entry->occurence++;
            return index;

        } else{
            position++;
    }
  }
}

//there might be an issue here
static inline void tableDestroy(const table_t* const table){

    entry_t* entry = NULL;

    for (auto int i = 0; i < table->size; ++i){

        entry =&table->entries[i];

 //printf("Value: %s  Occurence: %d  Exists: %d \n",entry->value,entry->occurence,entry->exists );

       if(&table->entries[i] !=NULL)free(&table->entries[i]);
   }
   free(table);
}
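One thing that stands out in `entryInit` is that `malloc(sizeof(entry))` allocates the size of a pointer rather than the size of an `entry_t`, so writing `entry->exists` may already land past the end of the block, which would be consistent with the 1-byte access in the AddressSanitizer report. The string buffer also has no room for the terminating `'\0'`, and `strncpy` with a count of `strlen(value)` does not write one. Below is a minimal sketch of an in-place alternative, assuming it is acceptable to fill the table slot directly instead of allocating a temporary entry. `entry_sketch_t` and `entry_set` are hypothetical names, and the struct copies the field layout from entities.h without the packed attribute.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Same fields as entry_t in entities.h (hypothetical stand-in type). */
typedef struct {
    char *value;
    unsigned int exists : 1;
    unsigned int occurence;
} entry_sketch_t;

/* Fill a slot in place, copying the word with room for its '\0'.
 * Returns 0 on success, -1 if the allocation fails. */
static int entry_set(entry_sketch_t *slot, const char *value)
{
    size_t len = strlen(value);
    char *copy = malloc(len + 1);      /* +1 for the terminator */

    assert(slot != NULL);
    if (copy == NULL)
        return -1;
    memcpy(copy, value, len + 1);      /* copies the '\0' as well */
    slot->value = copy;
    slot->exists = 1;
    slot->occurence = 1;
    return 0;
}
```

Filling the slot in place also avoids the pattern `*entry = *entryInit(value);`, where the temporary struct returned by `entryInit` is copied and then never freed.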

entities.h:

#pragma once

typedef struct __attribute__((packed)) __entry {

    char *value;
    unsigned int exists : 1;
    unsigned int occurence;

} entry_t;

typedef struct __table {

    int size;
    int usedEntries;
    entry_t *entries;
    unsigned int seed;

} table_t;

This is how I read and process the text from the file:

void readFromFile(const char* const fileName,table_t* table){

    FILE *fp = fopen(fileName,"r");

    if(!fp) fprintf(stderr,"error reading file. \n");

    char word[15];//long enough to hold the biggest word in the text?
    int position = 0;
    char ch;

    while((ch = fgetc(fp))!= EOF){

        //discard all the ascii chars that are not letters
        if(!(ch  >= 65 && ch <= 90) && !(ch >= 97 && ch <= 122)){

        word[position]= '\0';

        if(word[0] == NULL)continue;

        tableInsert(table,word);

        position = 0;

        continue;

        }
        else word[position++] = ch;
  }
}
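A few details in `readFromFile` look like they could also write out of bounds or misbehave: `word[15]` is filled without any length check, `ch` is a `char` although `fgetc` returns an `int` (so `EOF` may not compare reliably), execution continues after a failed `fopen`, and a word that ends exactly at end of file is never inserted. The following is a hedged sketch of a bounded reader. `next_word` is a hypothetical helper, and truncating over-long words is an assumption about the desired behavior, not something the post specifies.

```c
#include <assert.h>
#include <stdio.h>

/* Read the next run of ASCII letters from fp into buf (capacity cap >= 2).
 * Letters beyond cap - 1 are dropped instead of being written past the buffer.
 * Returns 1 if a word was stored, 0 once the input is exhausted. */
static int next_word(FILE *fp, char *buf, size_t cap)
{
    size_t len = 0;
    int ch;                              /* int, so EOF differs from every char */

    assert(fp != NULL && cap >= 2);
    while ((ch = fgetc(fp)) != EOF) {
        if ((ch >= 'A' && ch <= 'Z') || (ch >= 'a' && ch <= 'z')) {
            if (len < cap - 1)
                buf[len++] = (char)ch;
        } else if (len > 0) {
            break;                       /* a separator ends the current word */
        }
    }
    buf[len] = '\0';
    return len > 0;
}
```

The calling loop could then be as short as `while (next_word(fp, word, sizeof word)) tableInsert(table, word);`, after checking that `fopen` did not return `NULL`.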

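Any suggestions as to what is wrong with my code? I think the resizing may be the problem, and since there are a lot of memory-management issues, I have not implemented deletion properly yet.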

Thanks!
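On the resizing concern: in the posted `tableResize`, `table->size` is set to the new size before the loop, so the loop walks the old array past its end; the `break` stops after the first live entry; collisions in the new array are not probed; and `free(temp)` releases the array that was just installed. `tableInit` also uses `malloc`, so the `exists` flags start out uninitialized. Below is a minimal, self-contained sketch of one way to grow and rehash. All names here (`slot_t`, `wordtable_t`, `table_resize`, and so on) are hypothetical, a `NULL` value marks an empty slot in place of the `exists` bit, and plain djb2 stands in for `hashString`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct { char *value; unsigned int occurence; } slot_t; /* NULL value = empty */
typedef struct { slot_t *slots; size_t size; size_t used; } wordtable_t;

static unsigned long djb2(const char *s)
{
    unsigned long h = 5381;
    for (; *s != '\0'; ++s)
        h = h * 33 + (unsigned char)*s;
    return h;
}

/* Slot holding key, or the empty slot where it belongs.
 * Requires at least one empty slot, or the loop would not end. */
static slot_t *find_slot(slot_t *slots, size_t size, const char *key)
{
    size_t i = djb2(key) % size;
    while (slots[i].value != NULL && strcmp(slots[i].value, key) != 0)
        i = (i + 1) % size;              /* linear probing with wrap-around */
    return &slots[i];
}

/* Rehash every old entry into a zeroed array, moving the string pointers,
 * then free only the OLD array. Returns 0 on success, -1 on failure. */
static int table_resize(wordtable_t *t, size_t new_size)
{
    slot_t *fresh;
    if (new_size <= t->size)
        return -1;
    fresh = calloc(new_size, sizeof *fresh); /* zeroed, so every value reads as NULL */
    if (fresh == NULL)
        return -1;
    for (size_t i = 0; i < t->size; ++i)     /* iterate over the old size */
        if (t->slots[i].value != NULL)
            *find_slot(fresh, new_size, t->slots[i].value) = t->slots[i];
    free(t->slots);
    t->slots = fresh;
    t->size = new_size;
    return 0;
}

/* Insert key or bump its count. Returns 0 on success, -1 on allocation failure. */
static int table_insert(wordtable_t *t, const char *key)
{
    slot_t *s;
    size_t len;
    if (3 * (t->used + 1) > 2 * t->size)     /* keep the load factor at or below 2/3 */
        if (table_resize(t, t->size ? 2 * t->size : 8) != 0)
            return -1;
    s = find_slot(t->slots, t->size, key);
    if (s->value != NULL) {
        s->occurence++;
        return 0;
    }
    len = strlen(key);
    s->value = malloc(len + 1);
    if (s->value == NULL)
        return -1;
    memcpy(s->value, key, len + 1);
    s->occurence = 1;
    t->used++;
    return 0;
}

/* Free each stored string, then the slot array itself. */
static void table_destroy(wordtable_t *t)
{
    for (size_t i = 0; i < t->size; ++i)
        free(t->slots[i].value);             /* free(NULL) is a no-op */
    free(t->slots);
    t->slots = NULL;
    t->size = t->used = 0;
}
```

A table can start out as `wordtable_t t = {0};` and grows on the first insert. Note that the destroy step frees the strings and the one array, not `&slots[i]`: the individual slots live inside the array and were never allocated separately.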
