Commit
update README.md
js1010 committed Feb 9, 2021
1 parent a40b463 commit 3d82c5c
Showing 1 changed file with 5 additions and 5 deletions.
10 changes: 5 additions & 5 deletions README.md
@@ -100,8 +100,8 @@ nns, distances, found_cnt = ch0.search_knn(data, topk=10, ef_search=300)
### Performance

- tl;dr
-- cuhnsw achieved the same build quality by 8~9 times faster build time than hnswlib with 8 vcpus on certain data and parameter setup
-- cuhnsw achieved the same search quality by 3 times faster search time than hnswlib with 8 vcpus instance on certain data and parameter setup
+- cuhnsw achieved the same build quality by 8-9 times faster build time than hnswlib with 8 vcpus on certain data and parameter setup
+- cuhnsw achieved the same search quality by 3-4 times faster search time than hnswlib with 8 vcpus instance on certain data and parameter setup
- Note1: HNSW search algorithm can be verified by exact match since it is deterministic.
- I verified it with hnswlib, in other words, cuhnsw search and hnswlib search returns exactly same results by loading the same model file and the same queries and the same ef search.
- Note2: GPU search has the advantage over CPU search only when it comes to the `Batch` search (i.e. processing large number of queries at once.)
@@ -110,19 +110,19 @@ nns, distances, found_cnt = ch0.search_knn(data, topk=10, ef_search=300)
- build time / quality results on glove-50-angular
- used `ef_construction`=150 for hnswlib and `ef_construction=110` for cuhnsw to achieve the same build quality
- build quality is measured by the accuracy by the same search parameter (`ef_search`=300)
-
+- build time is in seconds
| attr | 1 vcpu | 2 vcpu | 4 vcpu | 8 vcpu | gpu |
|:--------------|-----------:|-----------:|----------:|----------:|----------:|
| build time | 343.909 | 179.836 | 89.7936 | 70.5476 | 8.2847 |
| build quality | 0.863193 | 0.863301 | 0.863238 | 0.863165 | 0.865471 |

- search time comparison on glove-50-angular
-- search time on 100k random queries
+- search time on 1M random queries (seconds)
- search `quality` is guaranteed to the same (exact match)

| attr | 1 vcpu | 2 vcpu | 4 vcpu | 8 vcpu | gpu |
|:------------|--------:|--------:|--------:|--------:|--------:|
-| search time | 52.3024 | 26.5086 | 13.9146 | 10.8525 | 3.07964 |
+| search time | 556.605 | 287.967 | 146.331 | 115.431 | 29.7008 |

- the reason why the parallel efficiency significantly drops from 4 vcpu to 8 vcpu might be hyper threading (there might be only 4 "physical" cores in this instance).
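
Reviewer note: the updated tl;dr ranges and the remark about parallel efficiency can be checked against the tables with a few lines of arithmetic. The sketch below is not part of the README or of cuhnsw. It copies the new timings from the two tables, and the helper names `gpu_speedup` and `parallel_efficiency` are hypothetical, introduced only for illustration. Efficiency is taken here as `t(1 vcpu) / (n * t(n vcpu))`, which is an assumption about what the README means by "parallel efficiency".

```python
# Timings in seconds, copied from the updated tables in this commit.
build_time = {"1 vcpu": 343.909, "2 vcpu": 179.836, "4 vcpu": 89.7936,
              "8 vcpu": 70.5476, "gpu": 8.2847}
search_time = {"1 vcpu": 556.605, "2 vcpu": 287.967, "4 vcpu": 146.331,
               "8 vcpu": 115.431, "gpu": 29.7008}


def gpu_speedup(times, baseline="8 vcpu"):
    """Ratio of the baseline CPU time to the GPU time (hypothetical helper)."""
    return times[baseline] / times["gpu"]


def parallel_efficiency(times):
    """Map vcpu count n to t(1 vcpu) / (n * t(n vcpu)); 1.0 is ideal scaling."""
    single = times["1 vcpu"]
    efficiency = {}
    for label, seconds in times.items():
        if label.endswith("vcpu"):
            n = int(label.split()[0])
            efficiency[n] = single / (n * seconds)
    return efficiency


if __name__ == "__main__":
    print("build speedup vs 8 vcpu :", round(gpu_speedup(build_time), 2))
    print("search speedup vs 8 vcpu:", round(gpu_speedup(search_time), 2))
    print("build efficiency :", parallel_efficiency(build_time))
    print("search efficiency:", parallel_efficiency(search_time))
```

With these numbers the GPU speedup over 8 vcpus falls inside the 8-9x (build) and 3-4x (search) ranges stated in the tl;dr, and the efficiency drops from above 0.9 at 4 vcpus to roughly 0.6 at 8 vcpus for both workloads, which matches the hyper-threading remark.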
