
Hash_b explanation incorrect? #12

Open
alcides opened this issue Aug 24, 2017 · 5 comments

Comments


alcides commented Aug 24, 2017

It is possible that hash_b will return 0, reducing the second term to 0. This will cause the hash table to try to insert the item into the same bucket over and over. We can mitigate this by adding 1 to the result of the second hash, making sure it's never 0.

index = (hash_a(string) + i * (hash_b(string) + 1)) % num_buckets

hash_b(string) + 1 will never be 0, but the fix can still break down: if hash_b(string) is num_buckets - 1, then hash_b(string) + 1 is a multiple of num_buckets. No matter what i is, the final result is the same as hash_a(string) % num_buckets, effectively reproducing the same collision loop (and DoS vector) explained before.


Daoctor commented Aug 30, 2017

Yes, I face the same problem. Here is an example:

int prime_a = 151;
int prime_b = 163;
int buckets = 107;
int hash_a = 92;  // calculated
int hash_b = 106; // calculated

Then the result of (hash_a + (attempt * (hash_b + 1))) % buckets will always give me the same number, 92, no matter what the value of attempt is.
So I used bigger numbers instead of prime_a and prime_b, but I don't think this is a perfect solution. The problem is still there.

@jamesroutley
Owner

Hi all, sorry for the delay in response, I've just started a new job and have been pretty busy this past week.

You're both right - that situation would cause the same issue. However, I *think* this problem can be avoided by choosing a hash_b whose maximum value is smaller than hash_a's maximum value (which is by design the same as num_buckets). Then hash_b(string) could never be num_buckets - 1.

Does that sound correct to you? If so, I'll definitely add this info to the tutorial.

@instantaphex

I'm having this same issue but I don't quite understand your proposed fix.

> I *think* this problem can be avoided by choosing a hash_b whose maximum value is smaller than hash_a's maximum value (which is by design the same as num_buckets). Then hash_b(string) could never be num_buckets - 1.

I'm not "choosing" a hash_b. I'm using ht_hash to calculate hash_b using the string, HT_PRIME_2, and the number of buckets in the hash table. What is the proposed fix?

@howlinweed

I think @jamesroutley is suggesting to use a smaller prime for HT_PRIME_2 to ensure that hash_b remains lower than hash_a.

@akerdi

akerdi commented Apr 30, 2023

In attempt * (hash_b + 1), it can happen that (hash_b + 1) == bucket_size. After taking the result mod bucket_size, every attempt then produces the same index, looping forever!

Add a new function:

/* If check_num is a multiple of mod_num, step down by one so the
   probe step is never 0 mod mod_num. */
int floor_mod(const int mod_num, const int check_num) {
    return check_num % mod_num == 0 ? check_num - 1 : check_num;
}

Usage, which fixes the problem:

const int aIndex = hashing(key, HT_PRIME_CONST_1, bucket_size);
const int bIndex = hashing(key, HT_PRIME_CONST_2, bucket_size);
const int mod_index = floor_mod(bucket_size, (bIndex + 1));
return (aIndex + attempt_i * mod_index) % bucket_size;
