Benchmarking nearest neighbors

This is the version of http://github.com/erikbern/ann-benchmarks/ accompanying our SISAP 2019 paper "Benchmarking Nearest Neighbor Search: Influence of Local Intrinsic Dimensionality". See the main repository for the benchmarking tool intended for a general audience.

Install

The only prerequisites are Python (tested with 3.6) and Docker.

Clone the repo.
Run pip install -r requirements.txt.
Run python install.py to build all the libraries inside Docker containers (this can take a while, roughly 10-30 minutes).
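Before starting the build, it can help to confirm that both prerequisites are visible from your shell. The following is a minimal sketch, not part of the repository; the function name check_prerequisites is hypothetical, and it only checks that a docker executable is on PATH, not that the Docker service is running.

```python
import shutil
import sys


def check_prerequisites():
    """Report the Python version and whether a `docker` executable is on PATH."""
    return {
        # The README states the tool was tested with Python 3.6.
        "python_version": "%d.%d" % sys.version_info[:2],
        # shutil.which returns None when no matching executable is found.
        "docker_found": shutil.which("docker") is not None,
    }


if __name__ == "__main__":
    for name, value in check_prerequisites().items():
        print(name, value)
```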

Running

Run python run.py (this can take an extremely long time, potentially days).
Run python plot.py or python create_website.py to plot results.

You can customize the algorithms and datasets if you want to:

Check that algos.yaml contains the parameter settings that you want to test.
To run experiments on GloVe, invoke python run.py --dataset glove-100-angular. See python run.py --help for more information on possible settings. Note that experiments can take a long time.
To process the results, either use python plot.py --dataset glove-100-angular or python create_website.py. An example call: python create_website.py --plottype recall/time --latex --scatter --outputdir website/.
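If you benchmark several datasets, the run and plot calls above can be scripted. The sketch below is an illustration rather than part of the repository: the helper names benchmark_commands and run_all are hypothetical, and it uses only the --dataset flag shown above. It defaults to a dry run that prints the commands, since a real run can take a long time.

```python
import subprocess
import sys


def benchmark_commands(dataset):
    """Return the run and plot commands for one dataset as argv lists."""
    return [
        [sys.executable, "run.py", "--dataset", dataset],
        [sys.executable, "plot.py", "--dataset", dataset],
    ]


def run_all(dataset, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in benchmark_commands(dataset):
        print(" ".join(argv))
        if not dry_run:
            # check=True raises CalledProcessError if a step fails,
            # so plotting is skipped when the benchmark run did not finish.
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    run_all("glove-100-angular")
```

Call run_all("glove-100-angular", dry_run=False) from the repository root to actually execute both steps.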

SISAP 2019 Changes

See http://ann-benchmarks.com/sisap19/ for the evaluation including plots, preprocessed datasets, and raw results.

Generating the datasets described in the paper works as follows. (We use glove-100-angular as an example.)

Run python3 create_dataset.py --dataset glove-100-angular. (This takes a long time, since it uses the whole dataset as the query set.)
Run python3 compute-lid.py data/glove-100-angular.hdf5 > glove-100-angular-lid.txt to compute estimates for the LID of every single query based on its 100-NN stored in data/glove-100-angular.hdf5.
Run python3 choose-queryset.py glove-100-angular-lid.txt > glove-100-angular-queries.txt to pick the queries to use for the easy, middle, hard, and diverse query sets.
Run python3 pick-queries.py data/glove-100-angular.hdf5 glove-100-angular-queries.txt to prepare hdf5 versions of these four datasets.
Run python3 run.py --algorithm faiss-ivf --dataset glove-100-angular-diverse (exchange algorithm and dataset accordingly) to run the experiments.
Run python3 plot.py --dataset glove-100-angular-diverse to create a basic recall/QPS plot. (If your Docker service runs as root, you might need to execute this script as root as well, since it writes to the result files in results/glove-100-angular-diverse. Alternatively, change the owner of the files in results to your local user.)
Or use python3 to_csv.py --output results.csv --detail to generate a CSV file with all metrics that can be used for visualization through Python/pandas or R. (Again, you might need to run this as root.)
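To make the LID and query-selection steps above more concrete, here is a minimal sketch of the idea. It is not the code from compute-lid.py or choose-queryset.py. It assumes the widely used maximum-likelihood LID estimator computed from a query's k-NN distances, and it assumes that "easy" means lowest LID, "hard" means highest LID, and "middle" means ranks around the median. The rule for the "diverse" set (evenly spaced LID ranks) is purely a placeholder assumption. The function names lid_mle and choose_query_sets are hypothetical.

```python
import math


def lid_mle(distances):
    """Maximum-likelihood LID estimate from one query's k-NN distances."""
    # Drop zero distances (e.g. the query itself), whose log is undefined.
    dists = sorted(d for d in distances if d > 0)
    if not dists:
        raise ValueError("need at least one positive distance")
    w = dists[-1]  # distance to the farthest of the k neighbors
    mean_log = sum(math.log(d / w) for d in dists) / len(dists)
    if mean_log == 0.0:
        return math.inf  # all distances equal: the estimate diverges
    return -1.0 / mean_log


def choose_query_sets(lids, size):
    """Split query indices into easy/middle/hard/diverse sets by LID rank."""
    order = sorted(range(len(lids)), key=lambda i: lids[i])
    if not 0 < size <= len(order):
        raise ValueError("size must be between 1 and the number of queries")
    mid_start = (len(order) - size) // 2
    step = len(order) / size
    return {
        "easy": order[:size],                         # lowest LID
        "middle": order[mid_start:mid_start + size],  # around the median
        "hard": order[-size:],                        # highest LID
        # Assumption: evenly spaced ranks stand in for the "diverse" set.
        "diverse": [order[int(j * step)] for j in range(size)],
    }
```

In this sketch, lid_mle would be applied to the 100-NN distances of each query, and the resulting list of estimates passed to choose_query_sets together with the desired query set size.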

Related Publication

The following publication details the design principles behind the benchmarking framework:

M. Aumüller, E. Bernhardsson, A. Faithfull: ANN-Benchmarks: A Benchmarking Tool for Approximate Nearest Neighbor Algorithms. SISAP 2017: 34-49
