Hello and thanks for this great product!
In my system I expect k to be around 10 usually, and ef, let's say, 50. But at some point I expect to get items with very similar L2 distances (e.g. 0.1-0.3), and in that case I will need to retrieve ALL of them. So I can dynamically increase k and ef (keeping ef always greater than k) and re-query until I start to get items with an L2 distance larger than I need (e.g. > 0.3).
My concern is: what if k becomes 1000, or 10000 (with ef still > k)? Will hnswlib still perform optimally? Or maybe such algorithms are not designed to work with such a big k (and also frequent changes of ef)?
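For concreteness, here is a minimal sketch of the re-query loop I have in mind, using the hnswlib Python bindings (the index parameters, data, and the `range_query` helper are hypothetical, just to illustrate the idea):

```python
import hnswlib
import numpy as np

# Hypothetical setup: dim, index size, and build parameters are placeholders.
dim = 64
n = 100_000
index = hnswlib.Index(space='l2', dim=dim)
index.init_index(max_elements=n, ef_construction=200, M=16)
index.add_items(np.random.rand(n, dim).astype(np.float32))

def range_query(index, query, radius, k0=10, growth=4):
    """Return all labels within `radius` of `query` by growing k
    until the farthest returned neighbor lies outside the radius.
    Note: hnswlib's 'l2' space reports *squared* L2 distances."""
    max_k = index.get_current_count()
    k = min(k0, max_k)
    while True:
        index.set_ef(max(2 * k, 50))             # keep ef above k
        labels, dists = index.knn_query(query, k=k)
        labels, dists = labels[0], dists[0]      # single query vector
        if dists[-1] > radius or k == max_k:
            keep = dists <= radius
            return labels[keep], dists[keep]
        k = min(k * growth, max_k)               # grow k and re-query

query = np.random.rand(dim).astype(np.float32)
labels, dists = range_query(index, query, radius=0.3 ** 2)
```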
So, you need epsilon-Nearest Neighbor search, not K-nearest neighbors. Right?
Hnswlib can operate with a large K (i.e. it should be optimal), but the current implementation relies on knowing K beforehand.
The search algorithm can be modified for epsilon-Nearest Neighbor search. It would be a nice feature if someone implemented it :)
Also, ef is overridden to K if ef < K inside hnswlib.
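For illustration, a small sketch of that override in the Python bindings (toy index, hypothetical sizes): querying with k larger than the configured ef still returns k results, because the search internally runs with max(ef, k):

```python
import hnswlib
import numpy as np

# Toy index with hypothetical sizes, just to show the ef/k interplay.
dim = 16
index = hnswlib.Index(space='l2', dim=dim)
index.init_index(max_elements=1000)
index.add_items(np.random.rand(1000, dim).astype(np.float32))

index.set_ef(10)   # deliberately smaller than the k used below
labels, dists = index.knn_query(np.random.rand(1, dim).astype(np.float32), k=100)
print(labels.shape)  # (1, 100) -- the search effectively ran with ef = 100
```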
but the current implementation relies on knowing K beforehand
If you mean that k is not calculated based on some conditions, I get it.
Do you have metrics of search time depending on k? The same search, but with k = 1 ... x ... number of items. I feel that for my case, if the search speed is pretty much the same, I can apply the approach I described in the first post: get the items within some distance range. If not, I should look for a decent implementation of epsilon-ball NN (though after quick googling I didn't find one :) )
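In case it's useful, a minimal micro-benchmark sketch for measuring this yourself (index size, dim, and M are placeholders; absolute timings will depend on the machine and data):

```python
import time
import hnswlib
import numpy as np

# Hypothetical setup for timing knn_query as a function of k.
dim = 64
n = 100_000
index = hnswlib.Index(space='l2', dim=dim)
index.init_index(max_elements=n, ef_construction=200, M=16)
index.add_items(np.random.rand(n, dim).astype(np.float32))
query = np.random.rand(1, dim).astype(np.float32)

for k in (1, 10, 100, 1000, 10000):
    index.set_ef(max(2 * k, 50))    # ef must stay >= k
    t0 = time.perf_counter()
    index.knn_query(query, k=k)
    print(f"k={k:>6}: {(time.perf_counter() - t0) * 1e3:.2f} ms")
```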