Dynamic graph without need of max_elements #172

ashfaq92 · 2019-11-20T13:06:28Z

Hi, thanks for great work.
In some domains, the size of graph is not know in advance. Each time a node is created and should be added into graph. In other words, graph should dynamically adjust the new nodes. One solution can be: giving graph maximum affordable size i.e. max_elements= sys.maxint. But this costs huge time and space overhead in indexing process. While in real workflow the nodes can be only 1000. Let suppose we don't know the number of nodes (max_elements) in advance, then how to use this library to solve this problem? What are limitations that require that max_elements should be specified in advance? It would be charming if we don't have to specify max_elements in advance. If there are some programing limitations, please describe so that we can collaboratively find the solution. If it is impossible, please share some work_around/hacks to deal with this problem.

The text was updated successfully, but these errors were encountered:

yurymalkov · 2019-11-20T15:31:31Z

Hi @ashfaq92,

This can be automatized by extending the index once it hits the capacity by making a copy of index's working memory. Also, realloc can be used instead of copying to avoid doubling the memory consumption (but not guaranteed).

On the downside, it seems that doing so would require to halt the index during the operation with a dedicated shared lock. I think that should be possible to implement it via a wrapper of the python API.

C++ version might be a much better solution, but 1) it would require a shared lock, which is C++14 (not sure if it is a big problem). 2) might cause some slowdown (not sure if a big problem either, as the lock can be acquired only once per search/insertion).

yurymalkov · 2019-11-20T19:13:23Z

I can guide through the process if you want to implement that.

ashfaq92 · 2019-11-21T13:22:32Z

Hi @yurymalkov,
Today I tried thershold (extending the index once it hits the capacity) idea with multiple configurations. After simulations on random data, I figured out that setting initial size of graph from 01-100000 takes less than 1 millisecond (on my machine). Therefore, it can be good idea to set initial graph size to 100000 (0.1 million). Then the problem arises, how to extend graph. The best configuration was doubling size after hitting threshold.
The simulations are performed on random numeric data.
But this can be considered as workaround not a proper solution.
Please guide me how to implement that. It shall be great to work with mutual collaboration.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Dynamic graph without need of max_elements #172

Dynamic graph without need of max_elements #172

ashfaq92 commented Nov 20, 2019

yurymalkov commented Nov 20, 2019

yurymalkov commented Nov 20, 2019

ashfaq92 commented Nov 21, 2019

Dynamic graph without need of max_elements #172

Dynamic graph without need of max_elements #172

Comments

ashfaq92 commented Nov 20, 2019

yurymalkov commented Nov 20, 2019

yurymalkov commented Nov 20, 2019

ashfaq92 commented Nov 21, 2019