Fix heap buffer overflow caused by prefetch. #459

Open · wants to merge 4 commits into master
Conversation

kishorenc (Contributor)

The prefetching goes past `size` since `datal` is initialized 1 index past `data`.

I uncovered this issue under ASAN when SSE is enabled. It was also spotted earlier here: #107

cc @dyashuni

kishorenc and others added 2 commits May 5, 2023 19:01
The prefetching goes past `size` since `datal` is initialized 1 index past `data`. I uncovered this issue on ASAN when SSE is enabled.

Also spotted earlier here: nmslib#107
```diff
- _mm_prefetch((char *) (visited_array + *(datal + j + 1)), _MM_HINT_T0);
- _mm_prefetch(getDataByInternalId(*(datal + j + 1)), _MM_HINT_T0);
+ _mm_prefetch((char *) (visited_array + *(datal + j)), _MM_HINT_T0);
+ _mm_prefetch(getDataByInternalId(*(datal + j)), _MM_HINT_T0);
```
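For context, here is a minimal self-contained sketch of the pattern ASAN flags. The names `datal`, `size`, and `visited_array` mirror the hnswlib search loop, but this is an illustration only, not the actual code:

```cpp
// Build with: g++ -O1 -fsanitize=address sketch.cpp
#include <xmmintrin.h>   // _mm_prefetch
#include <cstdio>
#include <vector>

int main() {
    // Stand-in for one node's neighbour list: `datal` points at `size`
    // internal ids (in hnswlib, datal is set one element past data, which
    // holds the count).
    std::vector<unsigned int> neighbour_ids = {3, 1, 4, 1, 5};
    unsigned int *datal = neighbour_ids.data();
    size_t size = neighbour_ids.size();

    // Stand-in for the visited list indexed by internal id.
    std::vector<unsigned short> visited_array(16, 0);

    for (size_t j = 0; j < size; j++) {
        // The overflow is the read of datal[j + 1] needed to compute the
        // prefetch address: on the last iteration it loads datal[size], one
        // element past the end of the allocation. The prefetch instruction
        // itself never faults, which is why this only shows up under ASAN.
        _mm_prefetch((char *) (visited_array.data() + *(datal + j + 1)), _MM_HINT_T0);

        std::printf("visiting neighbour %u\n", datal[j]);
    }
    return 0;
}
```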
Member

I think that fix defeats the purpose of the prefetch, as it needs to be done for the next element.
Is there any evidence this prefetch affects the program behavior?
If so, I guess we should probably come up with a different fix, e.g. min(j + 1, max_value).

kishorenc (Contributor, Author), May 13, 2023

My understanding is that a cache line is at least 64 bytes, so does the extra +1 really matter? What matters is that the loop keeps moving forward as j increments, so we keep fetching ahead. We should be able to confirm this via a benchmark; I don't think there will be any degradation in performance.

There are actually a couple of other places where the same issue happens, which I just noticed. (Edit: updated the PR to cover these as well.)

> Is there any evidence this prefetch affects the program behavior?

There is a buffer overflow here from reading past the array length. Whether that causes a segfault will depend on the hardware and memory layout.

kishorenc (Contributor, Author), May 13, 2023

Also, I think the proper way to actually use `_mm_prefetch` is to fetch a full 64 bytes (one cache line) ahead, for example as it's done here by fetching 8 doubles ahead:

https://stackoverflow.com/questions/73012072/why-such-address-is-used-with-mm-prefetch

But this is quite tricky to implement and test because modern processors have gotten really good at prefetching without any hints.
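For reference, a toy version of the "one cache line ahead" pattern from the linked answer. It operates on sequential data, unlike the scattered neighbour lookups in the graph search, and is purely illustrative:

```cpp
#include <xmmintrin.h>   // _mm_prefetch
#include <cstddef>

// Sum a contiguous array while prefetching one cache line (64 bytes,
// i.e. 8 doubles) ahead of the element currently being processed.
double sum_with_prefetch(const double *a, std::size_t n) {
    double s = 0.0;
    for (std::size_t i = 0; i < n; i++) {
        // Stop prefetching near the end of the array; the remaining tail
        // fits in the cache lines already fetched.
        if (i + 8 < n)
            _mm_prefetch((const char *) (a + i + 8), _MM_HINT_T0);
        s += a[i];
    }
    return s;
}
```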

Member

> `_mm_prefetch((char *) (visited_array + *(datal + j + 1)), _MM_HINT_T0);`

The +1 fetches the visited bit for the next item, which can be anywhere in memory compared to the previous element. It is almost certainly on a different cache line and page, and at least Skylake CPUs were struggling without the prefetches; that is why it ended up in the code despite the obvious issue.

Yeah, it would be great if you could profile the performance on a few different x64 platforms, maybe improve the prefetching performance, and hopefully fix this out-of-bounds issue that has made people uneasy.

I think the proper way to test the performance is to run search on low-dimensional data to focus on memory sparsity (I used d=4) and to keep two implementations of the search function (with and without the change), so the performance does not depend on random memory layout changes in the data. That should allow measuring the performance to within 1-2% accuracy.
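A sketch of the kind of harness described above, assuming hnswlib's C++ API (`L2Space`, `HierarchicalNSW`, `addPoint`, `setEf`, `searchKnn`) and the usual header path; this is not the author's actual benchmark code:

```cpp
#include "hnswlib/hnswlib.h"

#include <chrono>
#include <cstdio>
#include <random>
#include <vector>

int main() {
    const int dim = 4;            // low-dimensional, as suggested above
    const size_t n = 1000000;
    const size_t n_queries = 10000;
    const size_t k = 10;

    // Random data and queries.
    std::mt19937 rng(42);
    std::uniform_real_distribution<float> dist(0.0f, 1.0f);
    std::vector<float> data(n * dim), queries(n_queries * dim);
    for (float &x : data) x = dist(rng);
    for (float &x : queries) x = dist(rng);

    // Build the index once; only the search loop is timed, so memory-layout
    // differences between runs matter less.
    hnswlib::L2Space space(dim);
    hnswlib::HierarchicalNSW<float> index(&space, n, /*M=*/16, /*ef_construction=*/200);
    for (size_t i = 0; i < n; i++)
        index.addPoint(data.data() + i * dim, i);
    index.setEf(100);

    size_t found = 0;
    auto t0 = std::chrono::steady_clock::now();
    for (size_t q = 0; q < n_queries; q++) {
        auto result = index.searchKnn(queries.data() + q * dim, k);
        found += result.size();  // keep the result live
    }
    auto t1 = std::chrono::steady_clock::now();

    auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(t1 - t0).count();
    std::printf("searched %zu queries, %zu results, %lld ms\n",
                n_queries, found, (long long) ms);
    return 0;
}
```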

kishorenc (Contributor, Author)

@yurymalkov

> The +1 fetches the visited bit for the next item, which can be anywhere in memory compared to the previous element.

Ah, sorry, I missed the extra pointer dereferencing in the code and assumed a sequential lookup.

I benchmarked with/without prefetching by searching over 1M 4-dimensional points.

  • with prefetching: 6700 ms
  • without prefetching: 7800 ms

So, I've now updated the PR to use std::min() to limit the index of the next prefetch. This is comparable in performance to the current version without bounds checking.
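For readers following along, the clamped variant looks roughly like this (same toy setup as the earlier sketch; it illustrates the idea, not necessarily the exact lines in the PR):

```cpp
#include <xmmintrin.h>   // _mm_prefetch
#include <algorithm>     // std::min
#include <cstdio>
#include <vector>

int main() {
    std::vector<unsigned int> neighbour_ids = {3, 1, 4, 1, 5};
    unsigned int *datal = neighbour_ids.data();
    size_t size = neighbour_ids.size();
    std::vector<unsigned short> visited_array(16, 0);

    for (size_t j = 0; j < size; j++) {
        // Clamp the lookahead index: on the last iteration this re-prefetches
        // the current element instead of reading datal[size] out of bounds,
        // while every earlier iteration still prefetches the next element.
        size_t next = std::min(j + 1, size - 1);
        _mm_prefetch((char *) (visited_array.data() + *(datal + next)), _MM_HINT_T0);

        std::printf("visiting neighbour %u\n", datal[j]);
    }
    return 0;
}
```

The extra comparison is cheap next to the memory traffic it guards, which is consistent with the comparable timings reported above.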

yurymalkov (Member)

Hi @kishorenc,
Thanks for the change! We would need to double-check the performance.

yurymalkov (Member)

@kishorenc can you please share the benchmarking code?
Also, what do you think about the other `*(data + 1)` prefetches in the code? Those have the same out-of-bounds issue.

kishorenc (Contributor, Author)

Here's the code for the benchmark. I just ran this with/without the changes in the PR and compared.

> Also, what do you think about the other `*(data + 1)` prefetches in the code? Those have the same out-of-bounds issue.

Yes, those are out-of-bounds as well, but I didn't address them because I wasn't sure what their actual sizes were. In the places I've changed, there was a clear `size` variable available to go by.
