Commit
It turns out that it takes only a handful of ASCII characters before the old bloom filter becomes 01111111 and matches everything. This commit implements a better bloom filter, which drastically reduces collisions and makes the filter far more effective. Because the new hashing takes a little more computation, using a lookup table is a modest improvement over recalculating the hash for every character processed.
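The old filter's code is not part of this excerpt, so the following is only a sketch of how such saturation can happen. It assumes the old mask was built by OR-ing raw byte values together; `orMask` and `matches` are hypothetical names used for illustration, not taken from the repository.

```go
package main

import "fmt"

// orMask is a hypothetical stand-in for the old filter:
// it ORs the raw byte values of s into a single 8-bit mask.
func orMask(s string) byte {
	var m byte
	for i := 0; i < len(s); i++ {
		m |= s[i]
	}
	return m
}

// matches reports whether every bit of b is set in mask,
// i.e. whether the filter says b "may be present".
func matches(mask, b byte) bool {
	return mask&b == b
}

func main() {
	// '/' = 0x2f, '*' = 0x2a, 'P' = 0x50; together they set all 7 low bits.
	m := orMask("/*P")
	fmt.Printf("%08b\n", m) // prints 01111111

	// Every ASCII byte is below 0x80, so all of them now match.
	fmt.Println(matches(m, 'a'), matches(m, '~')) // prints true true
}
```

Under that assumption, three characters are enough to make the mask useless for ASCII input, which is consistent with the behaviour the commit message describes.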
1 parent c916c20, commit d0205e5
Showing 4 changed files with 41 additions and 16 deletions.
@@ -0,0 +1,24 @@
package processor

// Prime number less than 256
const BloomPrime = 251

var BloomTable [256]uint64

func init() {
	for i := range BloomTable {
		BloomTable[i] = BloomHash(byte(i))
	}
}

func BloomHash(b byte) uint64 {
	i := uint64(b)

	k := (i^BloomPrime) * i

	k1 := k & 0x3f
	k2 := k >> 1 & 0x3f
	k3 := k >> 2 & 0x3f

	return (1 << k1) | (1 << k2) | (1 << k3)
}
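The other three changed files are not shown here, so how the table is consumed is not visible in this excerpt. As a rough sketch of one way a 64-bit mask could be built and queried with it: `buildMask` and `mayContain` below are hypothetical helpers, while `BloomPrime`, `BloomTable` and `BloomHash` are copied from the hunk above so that the example runs on its own.

```go
package main

import "fmt"

// Copied from the diff above so the example is self-contained.
const BloomPrime = 251

var BloomTable [256]uint64

func init() {
	for i := range BloomTable {
		BloomTable[i] = BloomHash(byte(i))
	}
}

func BloomHash(b byte) uint64 {
	i := uint64(b)
	k := (i ^ BloomPrime) * i
	k1 := k & 0x3f
	k2 := k >> 1 & 0x3f
	k3 := k >> 2 & 0x3f
	return (1 << k1) | (1 << k2) | (1 << k3)
}

// buildMask (hypothetical) ORs the precomputed hash of each byte into one mask.
func buildMask(bs []byte) uint64 {
	var m uint64
	for _, b := range bs {
		m |= BloomTable[b]
	}
	return m
}

// mayContain (hypothetical) reports whether b could be in the set behind mask.
// A false result is definite; a true result may be a false positive.
func mayContain(mask uint64, b byte) bool {
	h := BloomTable[b]
	return mask&h == h
}

func main() {
	mask := buildMask([]byte{'/', '*', '#', '"'})
	fmt.Println(mayContain(mask, '/')) // prints true: members always match
	fmt.Println(mayContain(mask, 'a')) // a non-member may or may not match
}
```

Each byte sets up to three of 64 bits, so a mask built from a few bytes stays sparse and most non-member bytes can be rejected with a single AND and compare before any slower exact check runs.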