Modifying the delete function

As a hash table does not support duplicate keys, why do we keep iterating after we have found the item whose key matches the provided one? I believe the function should return as soon as the matching item has been deleted; the version below includes that change.
The problem with resizing down

First of all, I think the load percentage at which the hash table is resized down should be greater. 10% seems too small. I might be wrong here, but imagine your hash table has space for 100 key-value pairs and you wait until it holds only 9 items before shrinking it. Isn't that a huge waste of memory? I changed the threshold to 30%, so the delete function now looks like this:
void ht_delete(ht_hash_table* ht, const char* key) {
    const int load = ht->count * 100 / ht->size;
    if (load < 30)
        ht_resize_down(ht);
    int index = ht_get_hash(key, ht->size, 0);
    ht_item* item = ht->items[index];
    int i = 1;
    while (item != NULL) {
        if (item != &HT_DELETED_ITEM) {
            if (strcmp(item->key, key) == 0) {
                ht_del_item(item);
                ht->items[index] = &HT_DELETED_ITEM;
                ht->count--;
                return;
            }
        }
        index = ht_get_hash(key, ht->size, i);
        item = ht->items[index];
        i++;
    }
}
Secondly, raising the threshold to 30% helps with cleaning the dictionary. The author explained very well why we do not remove an item from the dictionary outright but free it and replace it with a pointer to the sentinel value. The problem is that the dictionary gets dirty over time: the more pointers to sentinel values there are, the less space is left for actual key-value pairs, and the more unnecessary probing is needed to find the right index when inserting, searching or deleting.
The problem with resizing down. Part 2
In the resize function, we check whether the provided base_size is less than HT_INITIAL_BASE_SIZE. If the condition is true, we just return. But coming back to part 1, I would allow the resize to proceed, so that all the space-consuming and useless sentinel values get cleaned out. I propose the following:
if (base_size < HT_INITIAL_BASE_SIZE)
    base_size = HT_INITIAL_BASE_SIZE;
Do not finish function execution there, but set base_size to the allowed minimum. In order to do that you need to change base_size to a plain int. You might argue that resizing the dictionary more often is not that performant, but while playing with this hash table I once could not insert a key-value pair for quite a while because of the number of sentinel values and collisions.
#42 Dead items

In issue #42 a situation is described where a dead item might occur. I believe there is no problem with that: the author handles it with the sentinel value. After inserting 2 items that collide and deleting the first one, the author uses the sentinel value in order not to break the collision chain. Every method we implement checks for the sentinel value: if the item we are currently inspecting is a pointer to the sentinel, we move on to the next one. So updating the second value should not be an issue. After the first hash we get an index holding a pointer to the sentinel value; we iterate once more, find a new index with a pointer to a struct holding the key we are looking for, free that struct and replace it with a new one.
It was a great repository to work through, I liked it a lot. If you have any comments regarding my comments, please post them.
What are the values for HT_PRIME_1 and HT_PRIME_2?
I used the following:
#24 Hash function computing not the way described in the pseudocode
Just take the modulus operator out of the loop and get the following:
#51 #45 #12 Infinite loop caused by the double hashing
The solution is explained in issue #35. The code should be as follows:
It handles 2 cases:
if (hash_b % num_buckets == 0) hash_b = 1;