Add other data storage types to Python bindings. #364
This is a bit of a big one - happy to break this into smaller PRs if that would be useful. This pull request:

- Adds new index classes: `DoubleIndex` (float64), `Int8Index`, `UInt8Index`, `Int16Index`, and `UInt16Index`.
- Adds a parameter to `SpaceInterface`, `data_t`, to specify the data storage type used.
- Changes `get_items` to return a Numpy array instead of a `List[List[data_t]]`.
- Adds `InnerProduct` and `L2Sqr` distance comparison functions that auto-unroll and auto-vectorize their inner loops (see Godbolt). This allows us to use different data types without manually having to write every comparison function. (This might make the manual SIMD functions obsolete, although I've left them in there for now out of an abundance of caution.)

I'm not 100% sure if this is the best way to architect this API; in particular, should the data type be a property of `Index`, set at creation time, rather than a separate class of `Index`? (For now, it's the latter.)

The existing tests seem to pass, and the new data types allow for smaller index files on disk (in situations where reduced precision is acceptable).
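As a rough illustration of where the size savings come from, the sketch below computes the raw bytes needed per stored vector for a given storage type. The helper name `vectorBytes` is hypothetical and not part of this PR, and the figure covers vector data only: graph links, labels, and other per-element overhead in the index file are ignored, so real files will shrink by less than this ratio suggests.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not from this PR): bytes needed to store one vector
// of `dimensions` components when each component is stored as data_t.
// Counts raw vector data only, not graph links or other index overhead.
template <typename data_t>
constexpr std::size_t vectorBytes(std::size_t dimensions) {
  return dimensions * sizeof(data_t);
}

// Ratio of raw vector storage between two storage types, e.g. how much
// smaller int8 vectors are than float vectors of the same dimensionality.
template <typename from_t, typename to_t>
constexpr double shrinkFactor() {
  return static_cast<double>(sizeof(from_t)) / static_cast<double>(sizeof(to_t));
}
```

On typical platforms where `float` is 4 bytes, a 1024-dimensional vector takes 4096 bytes as `float`, 2048 bytes as `std::int16_t`, and 1024 bytes as `std::int8_t`.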
The new `int8` and `uint8` data types even seem to perform about 60% faster in the best case (when using 1024 dimensions).
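For readers who want a picture of the generic distance functions mentioned in the list above, here is a minimal sketch of the general technique: a plain loop templated on the storage type, left for the compiler to unroll and vectorize. This is not the code from this PR. The names `l2Sqr` and `dotProduct` and the separate `dist_t` accumulator parameter are assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Sketch only (names and signatures are assumptions, not this PR's code).
// data_t is the storage type of each component; dist_t is the type used to
// accumulate and return the result. Each component is converted to dist_t
// before any arithmetic, so narrow or unsigned storage types do not
// overflow or wrap in the subtraction.
template <typename dist_t, typename data_t>
dist_t l2Sqr(const data_t *a, const data_t *b, std::size_t n) {
  dist_t sum = 0;
  for (std::size_t i = 0; i < n; ++i) {
    const dist_t diff = static_cast<dist_t>(a[i]) - static_cast<dist_t>(b[i]);
    sum += diff * diff;
  }
  return sum;
}

// Raw dot product of two vectors, written in the same generic style.
template <typename dist_t, typename data_t>
dist_t dotProduct(const data_t *a, const data_t *b, std::size_t n) {
  dist_t sum = 0;
  for (std::size_t i = 0; i < n; ++i) {
    sum += static_cast<dist_t>(a[i]) * static_cast<dist_t>(b[i]);
  }
  return sum;
}
```

Whether a given compiler actually vectorizes these loops depends on the optimization level and target flags. Floating-point reductions in particular usually need relaxed floating-point settings (for example `-ffast-math` on GCC and Clang) before the compiler will reorder the additions, while integer accumulators do not have that restriction.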