BUG: free(): invalid pointer #471

Open
lihachev9 opened this issue Jan 8, 2025 · 10 comments
Labels
bug (Something isn't working) · confirmed (The bug has been confirmed)

Comments

@lihachev9

lihachev9 commented Jan 8, 2025

Describe the bug
I create a Session instance and use streaming responses, but after a while I get memory-related errors, for example free(): invalid pointer.

Error message content

free(): invalid pointer
Fatal Python error: Aborted

Current thread 0x0000007f8947e1c0 (most recent call first):
  File "/home/login/Projects/find_error/.venv/lib/python3.9/site-packages/curl_cffi/curl.py", line 324 in reset
  File "/home/login/Projects/find_error/.venv/lib/python3.9/site-packages/curl_cffi/requests/session.py", line 954 in cleanup
  File "/usr/lib/python3.9/concurrent/futures/_base.py", line 329 in _invoke_callbacks
  File "/usr/lib/python3.9/concurrent/futures/_base.py", line 531 in set_result
  File "/usr/lib/python3.9/concurrent/futures/thread.py", line 58 in run
  File "/usr/lib/python3.9/concurrent/futures/thread.py", line 77 in _worker
  File "/usr/lib/python3.9/threading.py", line 892 in run
  File "/usr/lib/python3.9/threading.py", line 954 in _bootstrap_inner
  File "/usr/lib/python3.9/threading.py", line 912 in _bootstrap

Thread 0x0000007f89c7f1c0 (most recent call first):
  File "/usr/lib/python3.9/concurrent/futures/thread.py", line 75 in _worker
  File "/usr/lib/python3.9/threading.py", line 892 in run
  File "/usr/lib/python3.9/threading.py", line 954 in _bootstrap_inner
  File "/usr/lib/python3.9/threading.py", line 912 in _bootstrap

Thread 0x0000007f8b11a040 (most recent call first):
  File "/home/login/Projects/find_error/error.py", line 17 in main
  File "/home/login/Projects/find_error/error.py", line 21 in <module>

To Reproduce

import curl_cffi.requests


def read_stream(s, url):
    # Stream the response and drain it chunk by chunk.
    response = s.request('GET', url, stream=True)
    data = b''.join(chunk for chunk in response.iter_content())
    response.close()
    return data


def main():
    # Reuse a single Session for many streamed requests against a local
    # server that returns a ~200k body.
    s = curl_cffi.requests.Session()
    url = 'http://localhost:8000/200k'
    for _ in range(5000):
        read_stream(s, url)


if __name__ == '__main__':
    main()

Versions

  • OS: linux arm64
  • curl_cffi version 0.71
  • pip freeze freeze.txt
lihachev9 added the bug label Jan 8, 2025
@lexiforest
Owner

Thanks for reporting, I can confirm this is a serious bug that applies to all versions. It seems to be a race condition when resetting the curl instance:

self.curl.reset()
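
To make the suspected interleaving concrete, here is a minimal, hypothetical sketch of the general pattern (not curl_cffi's actual code). It only shows that a future's done-callback, which is where cleanup() calls reset() in the tracebacks above, runs on the worker thread rather than on the thread consuming the response:

import threading
import time
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=1)

def cleanup(fut):
    # In the tracebacks, this is where session.cleanup() calls curl.reset().
    print("cleanup runs on:", threading.current_thread().name)

def transfer():
    time.sleep(0.1)          # keep the future pending long enough
    return b"response body"

future = executor.submit(transfer)
future.add_done_callback(cleanup)   # invoked by set_result() on the worker thread

# The consumer thread may still be reading chunks that belong to the same
# native handle when cleanup() resets it -- that window is the race.
print("consumer runs on:", threading.current_thread().name, future.result())
executor.shutdown(wait=True)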

@nekohoncho

I'm glad an issue was created; I was intending to open one after confirming it was a problem with this library and creating a consistent reproduction script. We use curl-cffi in production, and have been dealing with SIGABRTs and SIGSEGVs coming off our background processes. While I wasn't initially sure if it was curl-cffi or a Python bug, I suppose this is confirmation.

segfault

pystack core --native-all core.SHSKI00786d6946.1001.2396022842c645a183eb19f2364b5665.2936670.1735898358000000 /home/naomi/.pyenv/versions/3.12.8/bin/python3

state: D zombie: True niceness: 0
pid: 2936670 ppid: 2824692 sid: 1562256
uid: 1001 gid: 1001 pgrp: 2824692
executable: SHSKI00786d6946 arguments: SHSKI00786d694642e

The process died due a segmentation fault accessing address: 0x5dcc75ebab42
Traceback for thread 3101724 [] (most recent call last):
    (C) File "../sysdeps/unix/sysv/linux/x86_64/clone3.S", line 78, in __clone3 (libc.so.6)
    (C) File "./nptl/pthread_create.c", line 447, in start_thread (libc.so.6)
    (C) File "Python/thread_pthread.h", line 237, in pythread_wrapper (libpython3.12.so.1.0)
    (C) File "./Modules/_threadmodule.c", line 1114, in thread_run (libpython3.12.so.1.0)
    (Python) File "/home/naomi/.pyenv/versions/3.12.8/lib/python3.12/threading.py", line 1032, in _bootstrap
        self._bootstrap_inner()
    (Python) File "/home/naomi/.pyenv/versions/3.12.8/lib/python3.12/threading.py", line 1075, in _bootstrap_inner
        self.run()
    (Python) File "/home/naomi/Sesshoseki/venv/lib/python3.12/site-packages/sentry_sdk/integrations/threading.py", line 101, in run
        return _run_old_run_func()
    (Python) File "/home/naomi/Sesshoseki/venv/lib/python3.12/site-packages/sentry_sdk/integrations/threading.py", line 94, in _run_old_run_func
        return old_run_func(self, *a, **kw)
    (Python) File "/home/naomi/.pyenv/versions/3.12.8/lib/python3.12/threading.py", line 1012, in run
        self._target(*self._args, **self._kwargs)
    (Python) File "/home/naomi/.pyenv/versions/3.12.8/lib/python3.12/concurrent/futures/thread.py", line 93, in _worker
        work_item.run()
    (Python) File "/home/naomi/.pyenv/versions/3.12.8/lib/python3.12/concurrent/futures/thread.py", line 65, in run
        self.future.set_result(result)
    (Python) File "/home/naomi/.pyenv/versions/3.12.8/lib/python3.12/concurrent/futures/_base.py", line 550, in set_result
        self._invoke_callbacks()
    (Python) File "/home/naomi/.pyenv/versions/3.12.8/lib/python3.12/concurrent/futures/_base.py", line 340, in _invoke_callbacks
        callback(self)
    (Python) File "/home/naomi/Sesshoseki/venv/lib/python3.12/site-packages/curl_cffi/requests/session.py", line 1072, in cleanup
        c.reset()
    (Python) File "/home/naomi/Sesshoseki/venv/lib/python3.12/site-packages/curl_cffi/curl.py", line 324, in reset
        lib.curl_easy_reset(self._curl)
    (C) File "build/temp.linux-x86_64-cpython-38/curl_cffi._wrapper.c", line 1099, in _cffi_f_curl_easy_reset (_wrapper.abi3.so)
    (C) File "???", line 0, in curl_easy_reset (_wrapper.abi3.so)
    (C) File "???", line 0, in Curl_freeset (_wrapper.abi3.so)
    (C) File "???", line 0, in curl_slist_free_all (_wrapper.abi3.so)
    (C) File "./malloc/malloc.c", line 3375, in free (libc.so.6)

abort

Core file information:
state: D zombie: True niceness: 0
pid: 3313600 ppid: 3313465 sid: 1562256
uid: 1001 gid: 1001 pgrp: 3313465
executable: SHSKIc7e7353d1e arguments: SHSKIc7e7353d1ea95

The process died due receiving signal SIGABRT
Traceback for thread 3346132 [] (most recent call last):
    [. . .]
    (Python) File "/home/naomi/Sesshoseki/venv/lib/python3.12/site-packages/curl_cffi/requests/session.py", line 1072, in cleanup
        c.reset()
    (Python) File "/home/naomi/Sesshoseki/venv/lib/python3.12/site-packages/curl_cffi/curl.py", line 324, in reset
        lib.curl_easy_reset(self._curl)
    (C) File "build/temp.linux-x86_64-cpython-38/curl_cffi._wrapper.c", line 1099, in _cffi_f_curl_easy_reset (_wrapper.abi3.so)
    (C) File "???", line 0, in curl_easy_reset (_wrapper.abi3.so)
    (C) File "???", line 0, in Curl_freeset (_wrapper.abi3.so)
    (C) File "./malloc/malloc.c", line 3398, in free (libc.so.6)
    (C) File "./malloc/malloc.c", line 4607, in _int_free (libc.so.6)
    (C) File "../sysdeps/posix/libc_fatal.c", line 132, in _IO_peekc_locked.cold (libc.so.6)
    (C) File "./stdlib/abort.c", line 79, in abort (libc.so.6)
    (C) File "../sysdeps/posix/raise.c", line 26, in raise (libc.so.6)
    (C) File "./nptl/pthread_kill.c", line 89, in pthread_kill@@GLIBC_2.34 (libc.so.6)
    (C) File "./nptl/pthread_kill.c", line 78, in __pthread_kill_internal (inlined) (libc.so.6)
    (C) File "./nptl/pthread_kill.c", line 44, in __pthread_kill_implementation (inlined) (libc.so.6)

This is on the latest curl_cffi version.

@lexiforest
Owner

Thanks for the additional information. I saw SIGABRT a few times when running the unit tests, but haven't found a reliable way to reproduce it. Hopefully, this could be the root cause.

lexiforest added the confirmed label Jan 10, 2025
@bocharov

I've been using curl_cffi through yt-dlp, and I've encountered frequent memory-related errors (listed below in descending order of frequency):

double free or corruption (fasttop)
double free or corruption (!prev)
corrupted size vs. prev_size while consolidating
malloc_consolidate(): unaligned fastbin chunk detected
corrupted size vs. prev_size
malloc(): unaligned tcache chunk detected
double free or corruption (out)
free(): invalid pointer
malloc_consolidate(): invalid chunk size

All of these issues disappeared immediately once I monkey-patched curl_cffi to make the reset method a no-op:

import curl_cffi.curl

def safe_reset(self):
    # No-op: skip curl_easy_reset() entirely.
    pass

curl_cffi.curl.Curl.reset = safe_reset

Of course, this is just a temporary, hacky fix that works for my specific deployment scenario, where a separate yt-dlp process is spun up, makes queries, and then terminates quickly. This approach likely won't be suitable for long-running processes because memory may accumulate (leading to a leak) if reset() is never actually performed.

From what I can tell, the calls to c.reset() occur in a background thread callback, while the main thread or other parts of the code might still reference the same cURL handle or its memory. Because curl_easy_reset() frees (and reinitializes) certain underlying resources, any subsequent operation on that handle (or associated data) can lead to a “double free” or other memory corruption. A more robust fix in curl_cffi is needed to avoid these issues.
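
If the no-op patch leaks too much for long-running processes, a variation on the same monkey-patch idea would be to serialize reset() behind a lock instead of skipping it. This is only a sketch of the idea, under the assumption that the competing code paths could be made to take the same lock; it is not a verified fix for curl_cffi's internals:

import threading
import curl_cffi.curl

_reset_lock = threading.Lock()
_orig_reset = curl_cffi.curl.Curl.reset

def locked_reset(self):
    # Serialize curl_easy_reset() so it cannot run while another thread
    # holding this lock is using the same handle. This only helps if the
    # competing accesses acquire the same lock, which is an assumption
    # about where the race actually is.
    with _reset_lock:
        _orig_reset(self)

curl_cffi.curl.Curl.reset = locked_reset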

@perklet
Collaborator

perklet commented Jan 20, 2025

Thanks for the report. With all the evidence posted here, I can almost conclude that Curl.reset is not being called correctly in curl_cffi. Let me do a thorough investigation after Firefox support is added.

@lexiforest
Owner

lexiforest commented Jan 25, 2025

Have you ever encountered this issue when stream is not used?

@lexiforest
Owner

The current implementation of stream is just too tricky to be safe to use. Since libcurl does not have an iterative-style API for streaming, I have to use concurrent.futures to convert from the callback style. And it looks like there is a race condition when two requests are sent in order and their futures are not well synchronized.

We have 2 options here:

  1. Add more locks on the Python side, so that the race condition can be minimized (a rough sketch of where such a lock would sit is below).
  2. Implement an iterative API in libcurl, which solves the problem fundamentally and makes the code more elegant.

However, I don't know how long either will take, especially option 2.
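
For reference, here is a rough sketch of the callback-to-iterator conversion described above, using hypothetical names (StreamBridge, write_callback, finish, iter_content) rather than curl_cffi's actual classes, with a comment marking where the per-request lock from option 1 would sit:

import queue
import threading
from concurrent.futures import ThreadPoolExecutor

class StreamBridge:
    """Hypothetical bridge from libcurl's write callback to an iterator."""

    _SENTINEL = object()

    def __init__(self):
        self._chunks = queue.Queue()
        self._lock = threading.Lock()   # option 1: shared lock for cleanup

    def write_callback(self, data: bytes) -> int:
        # Called on the transfer thread for every chunk libcurl writes.
        self._chunks.put(bytes(data))
        return len(data)

    def finish(self):
        # Called when the transfer completes. Option 1 amounts to putting the
        # handle reset/reuse under a lock shared with any code that still
        # touches the native handle; in this toy the chunks are plain bytes
        # copies, so the lock only marks where that synchronization would go.
        with self._lock:
            self._chunks.put(self._SENTINEL)
            # handle.reset() would go here, under the shared lock

    def iter_content(self):
        # Consumed on the caller's thread.
        while True:
            chunk = self._chunks.get()
            if chunk is self._SENTINEL:
                return
            yield chunk

# Tiny usage example with a fake transfer running on a worker thread.
bridge = StreamBridge()
with ThreadPoolExecutor(max_workers=1) as executor:
    def fake_transfer():
        for part in (b"hello ", b"world"):
            bridge.write_callback(part)
        bridge.finish()

    executor.submit(fake_transfer)
    print(b"".join(bridge.iter_content()))   # b'hello world'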

@wlritchi

I've also been encountering this issue with stream, but only in multithreaded code. The same usage patterns have been very stable if there's only one thread. I agree that this points to a race condition in the stream code, but I haven't been able to track it down.

@lexiforest
Owner

I'm posting my findings here, in case anyone has more interest and time in fixing this.

Basically, our problem is that curl_easy_perform won't return until the response is fully received. So we should find the main loop inside curl_easy_perform and take control back from there. The call stack looks like this:

curl_easy_perform
  easy_perform
    easy_transfer                      # <---- while loop
      curl_multi_perform
        multi_runsingle                # state machine
          state_performing
            Curl_sendrecv
              sendrecv_dl
                Curl_xfer_write_resp
                  struct Curl_handler->write_resp
                    rtsp_rtp_write_resp
                      rtsp_filter_rtp
                        rtp_client_write

And it's actually a simple function:

static CURLcode easy_transfer(struct Curl_multi *multi)
{
  bool done = FALSE;
  CURLMcode mcode = CURLM_OK;
  CURLcode result = CURLE_OK;

  while(!done && !mcode) {
    int still_running = 0;

    mcode = curl_multi_poll(multi, NULL, 0, 1000, NULL);

    if(!mcode)
      mcode = curl_multi_perform(multi, &still_running);

    /* only read 'still_running' if curl_multi_perform() return OK */
    if(!mcode && !still_running) {
      int rc;
      CURLMsg *msg = curl_multi_info_read(multi, &rc);
      if(msg) {
        result = msg->data.result;
        done = TRUE;
      }
    }
  }

  /* Make sure to return some kind of error if there was a multi problem */
  if(mcode) {
    result = (mcode == CURLM_OUT_OF_MEMORY) ? CURLE_OUT_OF_MEMORY :
      /* The other multi errors should never happen, so return
         something suitably generic */
      CURLE_BAD_FUNCTION_ARGUMENT;
  }

  return result;
}

So, what we can do is:

static CURLcode easy_transfer(struct Curl_multi *multi)
{
  bool done = FALSE;
  CURLMcode mcode = CURLM_OK;
  CURLcode result = CURLE_OK;

  while(!done && !mcode) {
    int still_running = 0;

    mcode = curl_multi_poll(multi, NULL, 0, 1000, NULL);

    if(!mcode)
      mcode = curl_multi_perform(multi, &still_running);

+ // go back to python
+ // data = read_buffer()
+ // go back to c

    /* only read 'still_running' if curl_multi_perform() return OK */
    if(!mcode && !still_running) {
      int rc;
      CURLMsg *msg = curl_multi_info_read(multi, &rc);
      if(msg) {
        result = msg->data.result;
        done = TRUE;
      }
    }
  }

  /* Make sure to return some kind of error if there was a multi problem */
  if(mcode) {
    result = (mcode == CURLM_OUT_OF_MEMORY) ? CURLE_OUT_OF_MEMORY :
      /* The other multi errors should never happen, so return
         something suitably generic */
      CURLE_BAD_FUNCTION_ARGUMENT;
  }

  return result;
}

@lihachev9
Author

lihachev9 commented Jan 25, 2025

Have you ever encountered this issue when stream is not used?

No, I didn't get any errors when I don't use stream. yt-dlp uses curl_cffi with stream for its requests. I made a request every 10 seconds, and after 3-6 hours my script crashed; the error occurred even when making a request once a minute. But when I set stream to False, there are no more errors.
