-
Notifications
You must be signed in to change notification settings - Fork 137
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[question] intersection between a python set and a BloomFilter #51
Comments
From my understanding that would not be possible. The only similar thing you could do is create a comparable bloom filter and put all of the items in your set into it. Now you could compare the two filters to see if all of the elements in the second one are also in the first one. But this won't give you the intersection, it will only tell you if the intersection of the sets exists. |
Alternatively with a bulk lookup method, I assume @yvesx is looking for this:
Which would be functionally equivalent with the example. My initial feeling is that this might not be much faster unless Cython can iterate through and setup a new set/list far more efficiently then the list comprehension. The case for But maybe the overhead of the function call and call to _assert_open is worth it to create bulk operations I probably need all three use-cases in 2015 so I will benchmark them and implement them in my fork if their worth it. |
You can intersect and union bloom filters but only if they have the same bit array lengths right?! Union is just bitwise or. If bit is set in one array, then it should also be set in the union. If both are unset then it should be 0 in the union too. Similarly, intersection is bitwise and. Iff bit was set in both arrays then it will be in the intersecting set. This should be quiet easy to implement. Again, obviously this only works for the same length filters. As long as that's OK, then it will work with the same error rate and everything. |
@alexjironkin as far as I know, this library already supports bitsize or/and for bloom filters of equal size |
Through an API method, or by accessing bit array and doing and/or manually? But they way do you use inyernal counters for deletion etc? Because they would have to change appropriately as well. |
What would be the best way to find out the intersection between a python set and a BloomFilter? I'm currently doing:
Is it possible to support a bulk lookup method?
The text was updated successfully, but these errors were encountered: