jean-pierreBoth/anndists

anndists

This crate provides the distance computations used in the related crates hnsw_rs, annembed and coreset.

All distances implement the trait Distance:

pub trait Distance<T: Send + Sync> {
    fn eval(&self, va: &[T], vb: &[T]) -> f32;
}
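As a minimal sketch of how a user-defined distance could plug into this trait, the block below redeclares the trait locally (so it compiles on its own) and implements it twice. `DistChebyshev` and `DistClosure` are hypothetical names chosen for illustration; they are not taken from the crate, which may expose its own closure wrapper under a different name and signature.

```rust
// Local copy of the trait quoted above, so that this sketch is self-contained.
pub trait Distance<T: Send + Sync> {
    fn eval(&self, va: &[T], vb: &[T]) -> f32;
}

/// Hypothetical user-defined distance: Chebyshev (L-infinity) distance.
pub struct DistChebyshev;

impl Distance<f32> for DistChebyshev {
    fn eval(&self, va: &[f32], vb: &[f32]) -> f32 {
        assert_eq!(va.len(), vb.len(), "vectors must have the same length");
        // Largest absolute coordinate-wise difference.
        va.iter()
            .zip(vb.iter())
            .map(|(a, b)| (a - b).abs())
            .fold(0.0_f32, f32::max)
    }
}

/// Hypothetical wrapper that turns any closure into a `Distance`.
pub struct DistClosure<F>(pub F);

impl<T, F> Distance<T> for DistClosure<F>
where
    T: Send + Sync,
    F: Fn(&[T], &[T]) -> f32,
{
    fn eval(&self, va: &[T], vb: &[T]) -> f32 {
        (self.0)(va, vb)
    }
}

fn main() {
    let a: &[f32] = &[1.0, 5.0, -2.0];
    let b: &[f32] = &[4.0, 4.0, -2.0];

    println!("chebyshev = {}", DistChebyshev.eval(a, b));

    // An L1 distance written as a closure and wrapped in the hypothetical adapter.
    let l1 = DistClosure(|x: &[f32], y: &[f32]| {
        x.iter().zip(y).map(|(u, v)| (u - v).abs()).sum::<f32>()
    });
    println!("l1 via closure = {}", l1.eval(a, b));
}
```

Any code that is generic over `Distance<T>` can then accept either value without knowing which concrete distance it is given.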

Functionalities

The crate provides:

  • the usual distances such as L1, L2, Cosine, Jaccard and Hamming for vectors of standard numeric types, and the Levenshtein distance on u16.

  • Hellinger distance and Jeffreys divergence between probability distributions (f32 and f64). Note that the Jeffreys divergence (a symmetrized Kullback-Leibler divergence) does not satisfy the triangle inequality, and neither does the Cosine distance.

  • Jensen-Shannon distance between probability distributions (f32 and f64). It is defined as the square root of the Jensen-Shannon divergence and is a bounded metric. See Nielsen F. in Entropy 2019, 21(5), 485.

  • A trait that lets users implement their own distances. It takes as data slices of type T satisfying T:Serialize+Clone+Send+Sync. It is also possible to use C extern functions or closures.

  • SIMD implementations are provided for the most frequently used cases.
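The remark on the triangle inequality can be checked numerically. The sketch below uses plain textbook definitions written from scratch with natural logarithms; it does not call the crate, whose normalization conventions may differ, and all function names are illustrative. It shows three points for which the Jeffreys divergence and the Cosine distance break the triangle inequality, while the Jensen-Shannon distance satisfies it on the same points and stays below sqrt(ln 2).

```rust
/// Cosine distance: 1 - <a, b> / (|a| |b|).
fn cosine_distance(a: &[f64], b: &[f64]) -> f64 {
    let dot: f64 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f64>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f64>().sqrt();
    1.0 - dot / (na * nb)
}

/// Kullback-Leibler divergence KL(p || q), with the convention 0 ln 0 = 0.
fn kl(p: &[f64], q: &[f64]) -> f64 {
    p.iter()
        .zip(q)
        .map(|(pi, qi)| if *pi > 0.0 { pi * (pi / qi).ln() } else { 0.0 })
        .sum()
}

/// Jeffreys divergence: symmetrized Kullback-Leibler divergence.
fn jeffreys(p: &[f64], q: &[f64]) -> f64 {
    kl(p, q) + kl(q, p)
}

/// Jensen-Shannon distance: square root of the Jensen-Shannon divergence.
fn jensen_shannon(p: &[f64], q: &[f64]) -> f64 {
    // Mixture distribution m = (p + q) / 2.
    let m: Vec<f64> = p.iter().zip(q).map(|(pi, qi)| 0.5 * (pi + qi)).collect();
    (0.5 * kl(p, &m) + 0.5 * kl(q, &m)).sqrt()
}

fn main() {
    // Three probability distributions on two outcomes.
    let (a, b, c) = ([0.1, 0.9], [0.5, 0.5], [0.9, 0.1]);
    println!(
        "Jeffreys: d(a,c) = {:.4}, d(a,b) + d(b,c) = {:.4}",
        jeffreys(&a, &c),
        jeffreys(&a, &b) + jeffreys(&b, &c)
    );
    println!(
        "Jensen-Shannon: d(a,c) = {:.4}, d(a,b) + d(b,c) = {:.4}",
        jensen_shannon(&a, &c),
        jensen_shannon(&a, &b) + jensen_shannon(&b, &c)
    );

    // Three plane vectors at angles 0, 45 and 90 degrees.
    let (x, y, z) = ([1.0, 0.0], [1.0, 1.0], [0.0, 1.0]);
    println!(
        "Cosine: d(x,z) = {:.4}, d(x,y) + d(y,z) = {:.4}",
        cosine_distance(&x, &z),
        cosine_distance(&x, &y) + cosine_distance(&y, &z)
    );
}
```

When an algorithm relies on the triangle inequality, for example to prune candidates, the Jensen-Shannon or Hellinger distance is the safer choice among these.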

Implementation

SIMD support is provided by the simdeez crate on Intel processors, with a partial implementation based on std::simd for the general case.

Building

Simd

  • SIMD support based on the simdeez crate is enabled with the feature "simdeez_f" on x86_64 processors. Compile with cargo build --release --features "simdeez_f" .... To compile this crate on an M1 chip, simply do not activate this feature.

  • It is nevertheless possible to experiment with std::simd. Compiling with the feature stdsimd (cargo build --release --features "stdsimd") activates the portable_simd feature of Rust nightly, so it requires a nightly compiler. Only the Hamming distance with the u32x16 and u64x8 types, and DistL1, DistL2 and DistDot on f32x16, are provided for now.
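To give an idea of what a 16-lane computation looks like, here is a sketch on stable Rust that emulates the lane structure with a plain array instead of real SIMD types. It is not the crate's implementation: `l2_chunked` and `LANES` are hypothetical names, and an actual std::simd or simdeez version would replace the inner loop with vector instructions.

```rust
/// Number of f32 lanes emulated, matching the width of a 16-lane vector.
const LANES: usize = 16;

/// L2 distance computed block-wise: 16 independent accumulators over full
/// blocks, then a scalar loop over the remaining elements.
fn l2_chunked(va: &[f32], vb: &[f32]) -> f32 {
    assert_eq!(va.len(), vb.len(), "vectors must have the same length");

    // Split each input into a part made of full 16-element blocks and a tail.
    let split = va.len() - va.len() % LANES;
    let (head_a, tail_a) = va.split_at(split);
    let (head_b, tail_b) = vb.split_at(split);

    // One accumulator per lane; a SIMD version would hold these in one register.
    let mut acc = [0.0_f32; LANES];
    for (xa, xb) in head_a.chunks_exact(LANES).zip(head_b.chunks_exact(LANES)) {
        for i in 0..LANES {
            let d = xa[i] - xb[i];
            acc[i] += d * d;
        }
    }

    // Horizontal sum of the lanes, then the scalar tail.
    let mut sum: f32 = acc.iter().sum();
    for (a, b) in tail_a.iter().zip(tail_b) {
        let d = a - b;
        sum += d * d;
    }
    sum.sqrt()
}

fn main() {
    // 19 elements: one full block of 16 plus a tail of 3.
    let va: Vec<f32> = (0..19).map(|i| i as f32).collect();
    let vb = vec![0.0_f32; 19];
    println!("l2 = {}", l2_chunked(&va, &vb));
}
```

Because the lanes are summed in a different order than a sequential loop, results can differ from a scalar implementation in the last bits for general floating-point inputs.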

Benchmarks and Examples

The speed is illustrated in the hnsw_rs and annembed crates.

Contributions

Petter Egesund added the DistLevenshtein distance.

License

Licensed under either of

at your option.
