Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug fix: In bruteforce.h, searchKnn return less element than expected when using filter. #597

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

RaspStudio
Copy link

@RaspStudio RaspStudio commented Oct 30, 2024

Bug Description

This bug will occur when less than k elements are added into topResults, and the remaining elements all failed to satisfy dist <= topResults.top().first(Notice that topResults.size() < k now).

If there sure is k elements satisfy dist <= topResults.top().first, the function should return them regardless of their internal order.

Example

When calling searchKnn with filter, the following code may cannot fill the topResults to size k.

for (int i = 0; i < k; i++) {
    dist_t dist = fstdistfunc_(query_data, data_ + size_per_element_ * i, dist_func_param_);
    labeltype label = *((labeltype*) (data_ + size_per_element_ * i + data_size_));
    if ((!isIdAllowed) || (*isIdAllowed)(label)) { // <- not every element satisfies
        topResults.emplace(dist, label);
    }
}

If the above loop only fill one nearest element into topResults, the following loop will never satisfy dist <= lastdist, as lastdist is the nearest element. Under this circumstance, the function will only return 1 element regardless of k.

// now topResults.size() == 1, and the only element in the container is the nearest element of all elements.
dist_t lastdist = topResults.empty() ? std::numeric_limits<dist_t>::max() : topResults.top().first;
for (int i = k; i < cur_element_count; i++) {
    dist_t dist = fstdistfunc_(query_data, data_ + size_per_element_ * i, dist_func_param_);
    if (dist <= lastdist) { // <- will never satisfy
        labeltype label = *((labeltype *) (data_ + size_per_element_ * i + data_size_));
        if ((!isIdAllowed) || (*isIdAllowed)(label)) {
            topResults.emplace(dist, label);
        }
        if (topResults.size() > k)
            topResults.pop();
        if (!topResults.empty()) {
            lastdist = topResults.top().first;
        }
    }
}
return topResults;

@RaspStudio RaspStudio changed the title bug fix: knn search with filter in bruteforce.h bug fix: knn search with filter in bruteforce.h Oct 30, 2024
@RaspStudio RaspStudio changed the title bug fix: knn search with filter in bruteforce.h bug fix: In bruteforce.h, searchKnn return less element than expected when using filter. Oct 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant