Adding search only mode for index loading to save memory #191

Open: wants to merge 1 commit into base: master

jaroslavgratz

If an index is loaded from disk and used only for searching (likely a typical use case), it is not necessary to initialize the link_list_locks_ and label_lookup_ data structures. Skipping them saves approximately 350MB of memory for a 5M dataset. This PR adds a search_only flag that enables search-only mode (off by default).
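To illustrate the idea, here is a minimal, self-contained sketch. It is not the code from this PR or the library's actual loader: `ToyIndex`, `load`, `internal_id`, and `can_insert` are hypothetical names, and `stored_labels` stands in for data read from disk. It only shows how a search-only flag can skip allocating the per-element locks and the label lookup table.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <unordered_map>
#include <vector>

// Toy stand-in for an index (hypothetical, not the real class).
struct ToyIndex {
    std::vector<std::size_t> labels;          // external label per internal id
    std::vector<std::mutex> link_list_locks;  // per-element locks, used when modifying links
    std::unordered_map<std::size_t, std::size_t> label_lookup;  // label -> internal id
    bool search_only = false;

    // Hypothetical loader: `stored_labels` stands in for data read from disk.
    void load(const std::vector<std::size_t>& stored_labels,
              bool search_only_mode = false) {
        search_only = search_only_mode;
        labels = stored_labels;
        if (search_only) {
            return;  // leave the insertion-only structures empty
        }
        // std::mutex is not movable, so build a fresh vector and swap it in.
        std::vector<std::mutex>(labels.size()).swap(link_list_locks);
        for (std::size_t id = 0; id < labels.size(); ++id) {
            label_lookup[labels[id]] = id;
        }
    }

    // Label-based access needs the lookup table, so it is off in search-only mode.
    std::size_t internal_id(std::size_t label) const {
        assert(!search_only && "index was loaded in search-only mode");
        return label_lookup.at(label);
    }

    bool can_insert() const { return !search_only; }
};
```

In this sketch the members still exist in search-only mode; they are simply left empty, so any operation that relies on them has to be rejected.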

@yurymalkov
Member

Hi @jaroslavgratz,

It seems that the code is missing something. I cannot see the part that omits creating the data structures.

@jaroslavgratz
Author

Hi @yurymalkov,

These lines omit creating the data structures when loading data:

fdfb030#diff-171628eaa21dab74ca44c386d5a17f05R684

fdfb030#diff-171628eaa21dab74ca44c386d5a17f05R699

In fact, the link_list_locks_ and label_lookup_ data structures still exist, but they are not loaded.

@yurymalkov
Member

@jaroslavgratz

Thanks, I see it. Does it give you a good saving?

There might be a simpler way to save memory while still preserving insertions: lock the elements in buckets. E.g. use link_list_locks_[i >> 8] instead of link_list_locks_[i], so that 256 consecutive elements share one lock. I am not sure whether it has any measurable performance loss.
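A rough sketch of the bucketed-lock idea, assuming the bucket index is obtained by shifting the element id right by 8 bits. `BucketedLocks`, `kBucketShift`, `bucket_of`, and `lock_for` are hypothetical names for illustration, not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical helper: one mutex guards a bucket of 2^kBucketShift
// consecutive element ids instead of a single element.
class BucketedLocks {
public:
    static constexpr std::size_t kBucketShift = 8;  // 256 elements per lock

    explicit BucketedLocks(std::size_t num_elements)
        : num_elements_(num_elements),
          // Round up so the last, partially filled bucket also gets a lock.
          locks_((num_elements + (std::size_t{1} << kBucketShift) - 1) >> kBucketShift) {}

    // Index of the bucket (and therefore the lock) that covers element i.
    std::size_t bucket_of(std::size_t i) const {
        assert(i < num_elements_);
        return i >> kBucketShift;
    }

    std::mutex& lock_for(std::size_t i) { return locks_[bucket_of(i)]; }

    std::size_t num_locks() const { return locks_.size(); }

private:
    std::size_t num_elements_;
    std::vector<std::mutex> locks_;
};
```

A caller would write something like `std::lock_guard<std::mutex> guard(locks.lock_for(i));` where it previously locked `link_list_locks_[i]`. The lock array shrinks by a factor of 256, at the price that threads touching different elements of the same bucket now contend for one mutex.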

Also, memory can be saved further if the VisitedLists are replaced with hash maps. Would you like to see that in this repo?
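For reference, a minimal sketch of what a hash-based visited structure could look like. It uses `std::unordered_set` (a hash set rather than a map, since only membership is needed), and `SparseVisitedSet` and `count_reachable` are hypothetical names for illustration; this is not the library's VisitedList.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <unordered_set>
#include <vector>

// Hypothetical sparse replacement for a dense per-search visited array.
class SparseVisitedSet {
public:
    // Marks `id` as visited; returns true if it had not been visited before.
    bool visit(std::size_t id) { return visited_.insert(id).second; }
    bool is_visited(std::size_t id) const { return visited_.count(id) != 0; }
    void reset() { visited_.clear(); }  // reuse the set between queries
    std::size_t size() const { return visited_.size(); }

private:
    std::unordered_set<std::size_t> visited_;
};

// Toy breadth-first traversal over an adjacency list, using the sparse set.
// Returns the number of nodes reachable from `start` (including `start`).
std::size_t count_reachable(const std::vector<std::vector<std::size_t>>& graph,
                            std::size_t start) {
    assert(start < graph.size());
    SparseVisitedSet visited;
    std::queue<std::size_t> frontier;
    visited.visit(start);
    frontier.push(start);
    while (!frontier.empty()) {
        const std::size_t node = frontier.front();
        frontier.pop();
        for (std::size_t next : graph[node]) {
            if (visited.visit(next)) {
                frontier.push(next);
            }
        }
    }
    return visited.size();
}
```

Memory here grows with the number of nodes a query actually touches rather than with the total number of elements, but each membership check is likely slower than indexing into a dense array, so the trade-off would need measuring.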

2 participants