Update master to 0.4.0 #219

yurymalkov · 2020-06-19T23:44:18Z

No description provided.

Currently, SIMD (SSE or AVX) is used only when the dimension is a multiple of 4 or 16; for any other dimension a slower non-vectorized method is used. To improve performance in these cases, new methods are added: `L2SqrSIMD(4|16)ExtResidual` relies on the existing `L2SqrSIMD(4|16)Ext` to process the largest multiple of 4 or 16 dimensions, then finishes the residual dimensions with the non-vectorized method `L2Sqr`. The improvement over the baseline is 3-4x, depending on the dimension.

Benchmark results:

Run on (4 X 3300 MHz CPU s)
CPU Caches: L1 Data 32 KiB (x2), L1 Instruction 32 KiB (x2), L2 Unified 256 KiB (x2), L3 Unified 4096 KiB (x1)
Load Average: 2.18, 2.35, 3.88

| Benchmark | Time | CPU | Iterations |
|-----------|---------|---------|---------------|
| TstDim65 | 14.7 ns | 14.7 ns | 20 * 47128209 |
| RefDim65 | 50.2 ns | 50.1 ns | 20 * 10373751 |
| TstDim101 | 24.7 ns | 24.7 ns | 20 * 28064436 |
| RefDim101 | 90.4 ns | 90.2 ns | 20 * 7592191 |
| TstDim129 | 31.4 ns | 31.3 ns | 20 * 22397921 |
| RefDim129 | 125 ns | 124 ns | 20 * 5548862 |
| TstDim257 | 59.3 ns | 59.2 ns | 20 * 10856753 |
| RefDim257 | 266 ns | 266 ns | 20 * 2630926 |
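The head/residual split described in this commit message can be sketched in portable C++ as follows. This is only an illustration, not the code from the commit: the function names `l2_sqr_scalar`, `l2_sqr_block16` and `l2_sqr_block16_residual` are hypothetical, and `l2_sqr_block16` is a plain-loop stand-in for a 16-wide SIMD kernel such as `L2SqrSIMD16Ext` (it uses no intrinsics).

```cpp
#include <cstddef>

// Scalar reference: squared L2 distance over `dim` floats.
// Plays the role of the non-vectorized fallback (hypothetical name).
float l2_sqr_scalar(const float* a, const float* b, std::size_t dim) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < dim; ++i) {
        const float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Stand-in for a 16-wide SIMD kernel. It assumes dim % 16 == 0 and
// accumulates into 16 independent lanes, as a vector register would.
float l2_sqr_block16(const float* a, const float* b, std::size_t dim) {
    float lanes[16] = {0.0f};
    for (std::size_t i = 0; i < dim; i += 16) {
        for (std::size_t j = 0; j < 16; ++j) {
            const float d = a[i + j] - b[i + j];
            lanes[j] += d * d;
        }
    }
    float sum = 0.0f;
    for (std::size_t j = 0; j < 16; ++j) sum += lanes[j];
    return sum;
}

// Residual wrapper: run the blocked kernel on the largest multiple of 16,
// then finish the remaining (dim % 16) elements with the scalar loop.
float l2_sqr_block16_residual(const float* a, const float* b, std::size_t dim) {
    const std::size_t dim16 = dim & ~static_cast<std::size_t>(15);
    const float head = l2_sqr_block16(a, b, dim16);
    const float tail = l2_sqr_scalar(a + dim16, b + dim16, dim - dim16);
    return head + tail;
}
```

For a dimension of 65, for instance, the blocked kernel handles the first 64 elements and the scalar loop handles the last one. Because squared L2 distance is a plain sum over dimensions, the two partial results can simply be added.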

…d 16

Currently, SIMD (SSE or AVX) is used only when the dimension is a multiple of 4 or 16; for any other dimension a slower non-vectorized method is used. To improve performance in these cases, new methods are added: `InnerProductSIMD(4|16)ExtResidual` relies on the existing `InnerProductSIMD(4|16)Ext` to process the largest multiple of 4 or 16 dimensions, then finishes the residual dimensions with the non-vectorized method `InnerProduct`. The improvement over the baseline is 3-4x, depending on the dimension.

Benchmark results:

Run on (4 X 3300 MHz CPU s)
CPU Caches: L1 Data 32 KiB (x2), L1 Instruction 32 KiB (x2), L2 Unified 256 KiB (x2), L3 Unified 4096 KiB (x1)
Load Average: 2.10, 2.25, 2.46

| Benchmark | Time | CPU | Iterations |
|-----------|---------|---------|---------------|
| TstDim65 | 14.0 ns | 14.0 ns | 20 * 48676012 |
| RefDim65 | 50.3 ns | 50.2 ns | 20 * 12907985 |
| TstDim101 | 23.8 ns | 23.8 ns | 20 * 27976276 |
| RefDim101 | 91.4 ns | 91.3 ns | 20 * 7364003 |
| TstDim129 | 30.0 ns | 30.0 ns | 20 * 23413955 |
| RefDim129 | 123 ns | 123 ns | 20 * 5656383 |
| TstDim257 | 57.8 ns | 57.7 ns | 20 * 11263073 |
| RefDim257 | 268 ns | 267 ns | 20 * 2617478 |
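The same split for the inner product has one extra subtlety, sketched below under an explicit assumption: each kernel here returns an inner-product *distance*, `1 - dot(a, b)`, rather than the raw dot product. The commit message does not state the return convention, so treat that as an assumption of this sketch. All names (`ip_dist_scalar`, `ip_dist_block4`, `ip_dist_block4_residual`) are hypothetical, and `ip_dist_block4` is a plain-loop stand-in for a 4-wide SIMD kernel.

```cpp
#include <cstddef>

// Scalar fallback. ASSUMPTION: returns the distance 1 - dot(a, b).
float ip_dist_scalar(const float* a, const float* b, std::size_t dim) {
    float dot = 0.0f;
    for (std::size_t i = 0; i < dim; ++i) dot += a[i] * b[i];
    return 1.0f - dot;
}

// Stand-in for a 4-wide SIMD kernel; assumes dim % 4 == 0.
// Same return convention as the scalar fallback (1 - dot).
float ip_dist_block4(const float* a, const float* b, std::size_t dim) {
    float lanes[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    for (std::size_t i = 0; i < dim; i += 4) {
        for (std::size_t j = 0; j < 4; ++j) lanes[j] += a[i + j] * b[i + j];
    }
    return 1.0f - (lanes[0] + lanes[1] + lanes[2] + lanes[3]);
}

// Residual wrapper. Each partial result already carries its own "1 -",
// so adding them counts the constant twice; subtract one of them back:
//   (1 - dot_head) + (1 - dot_tail) - 1 == 1 - (dot_head + dot_tail)
float ip_dist_block4_residual(const float* a, const float* b, std::size_t dim) {
    const std::size_t dim4 = dim & ~static_cast<std::size_t>(3);
    const float head = ip_dist_block4(a, b, dim4);
    const float tail = ip_dist_scalar(a + dim4, b + dim4, dim - dim4);
    return head + tail - 1.0f;
}
```

If the kernels returned the raw dot product instead, the wrapper would just add the two partial sums, as in the L2 case.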


2ooom and others added 12 commits April 19, 2020 09:50

Merge pull request #211 from 2ooom/master

a3ef160

Perf improvement for dimension not of factor 4 and 16

Correct typo

5f84edd

Merge pull request #215 from mohamed-ali/patch-1

ba16931

[MRG] Correct typo

Algorithm to support incremental updates of feature vectors in an eff…

8361676

…icient manner

Compile fix

524873b

Merge pull request #216 from apoorv-sharma/update_patch

c247540

Algorithm to perform dynamic/incremental updates of feature vectors

Fixed a typo in bindings.cpp

19e1286

Bump version

11a1219

Merge pull request #217 from Shujian2015/patch-1

2ccaccf

Fixed a typo in bindings.cpp

Update README.md

92e5b74

yurymalkov merged commit 3c6a84f into master Jun 22, 2020

sjwsl pushed a commit to sjwsl/hnswlib that referenced this pull request May 6, 2021

Merge pull request nmslib#219 from nmslib/develop

084a893

Update master to 0.4.0
