Equivalent function to RcppAnnoy's a$getItemsVector(i) #18

Closed
d4tum opened this issue May 28, 2023 · 4 comments
Labels
enhancement New feature or request

Comments

d4tum commented May 28, 2023

Thanks for the great package!

There doesn't appear to be an equivalent function to get the vector for a given item's index number, similar to RcppAnnoy's a$getItemsVector(i), where i is the integer mapped to the vector within the index at build time. It appears the Python bindings have this feature, but it has not been exposed in the R package:

From the readme https://github.com/nmslib/hnswlib :

get_items(ids) - returns a numpy array (shape:N*dim) of vectors that have integer identifiers specified in ids numpy vector (shape:N). Note that for cosine similarity it currently returns normalized vectors.

Would you please expose this function in the R package?

@jlmelville jlmelville added the enhancement New feature or request label May 29, 2023
@jlmelville
Owner

I'll see what I can do @d4tum.

Note to self: the Python binding C++ code is at:
https://github.com/nmslib/hnswlib/blob/359b2ba87358224963986f709e593d799064ace6/python_bindings/bindings.cpp#L304

jlmelville added a commit that referenced this issue Jun 27, 2023
@jlmelville
Owner

@d4tum the master branch now contains an implementation of the getItems method. If you want to get back a matrix containing the first and tenth vectors that were added to the index, call e.g. ann$getItems(c(1, 10)).
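
A minimal sketch of that call, assuming the class-based interface (`new(HnswL2, ...)` with `addItem`) and using the built-in `iris` data purely for illustration; the `M` and `ef` values are arbitrary choices, not recommendations:

```r
library(RcppHNSW)

# Illustrative data: the four numeric columns of iris (150 rows, 4 columns)
data <- as.matrix(iris[, -5])

# Build an L2 index; 16 (M) and 200 (ef) are arbitrary example settings
dim <- ncol(data)
nitems <- nrow(data)
ann <- new(HnswL2, dim, nitems, 16, 200)
for (i in seq_len(nitems)) {
  ann$addItem(data[i, ])
}

# Retrieve the first and tenth vectors that were added to the index
items <- ann$getItems(c(1, 10))
print(dim(items))
```

The returned values may differ very slightly from the input rows, since the underlying library is assumed here to store vectors in single precision.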

I have not yet parallelized this function, so it might be slow when returning a large number of vectors.

Also, the hnsw library has recently had some stability issues with valgrind failures, other memory problems or compiler warnings, all of which prevent me from submitting a new version of the library to CRAN. I was hoping there would be a clean release of the library, because I really don't want to maintain a separate patched version of it internally to this package. I may take that step (or see what scope there is for submitting some more fixes upstream), but for now there is not likely to be an imminent CRAN submission.

@d4tum
Author

d4tum commented Jun 28, 2023

Fair enough and thank you for your efforts @jlmelville

@jlmelville
Owner

This feature is part of the CRAN 0.5.0 release. Sorry it took so long.
