gh-109793: Allow Switching Interpreters During Finalization #109794
Include/internal/pycore_atomic.h
Outdated
@@ -46,6 +46,10 @@ typedef struct _Py_atomic_address {
    atomic_uintptr_t _value;
} _Py_atomic_address;

typedef struct _Py_atomic_ulong {
Could you use either _Py_atomic_load_uint32 or _Py_atomic_load_uint64 instead of defining your own? Since the thread ID is an unsigned long, you'll need to use a macro, but it seems better to use the pre-existing functions.
#if SIZEOF_LONG == 8
#define _Py_atomic_load_ulong _Py_atomic_load_uint64
#elif SIZEOF_LONG == 4
#define _Py_atomic_load_ulong _Py_atomic_load_uint32
#else
#error "long must be 4 or 8 bytes in size"
#endif
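The width-based selection suggested above can be tried outside CPython. The following is only a sketch under stated assumptions: `demo_load_uint32`, `demo_load_uint64`, `demo_atomic_ulong`, `demo_load_ulong`, and `demo_read_thread_id` are hypothetical names built on C11 `<stdatomic.h>`, not CPython APIs, and `ULONG_MAX` from `<limits.h>` stands in for `SIZEOF_LONG` (which comes from `pyconfig.h`) so that the snippet builds on its own.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical fixed-width loads, standing in for the pre-existing helpers. */
uint32_t demo_load_uint32(_Atomic uint32_t *p) { return atomic_load(p); }
uint64_t demo_load_uint64(_Atomic uint64_t *p) { return atomic_load(p); }

/* Pick the storage type and loader that match the width of unsigned long.
   ULONG_MAX replaces SIZEOF_LONG so the sketch builds without pyconfig.h. */
#if ULONG_MAX == 0xFFFFFFFFFFFFFFFFu
typedef _Atomic uint64_t demo_atomic_ulong;
#  define demo_load_ulong(p) ((unsigned long)demo_load_uint64(p))
#elif ULONG_MAX == 0xFFFFFFFFu
typedef _Atomic uint32_t demo_atomic_ulong;
#  define demo_load_ulong(p) ((unsigned long)demo_load_uint32(p))
#else
#  error "long must be 4 or 8 bytes in size"
#endif

/* Example reader: fetch a thread ID stored in the width-matched atomic. */
unsigned long
demo_read_thread_id(demo_atomic_ulong *slot)
{
    return demo_load_ulong(slot);
}
```

Unlike the plain alias in the review comment, this sketch also typedefs the storage to the fixed-width type and casts the result, which avoids passing an `unsigned long` pointer to a function that expects a `uint64_t` pointer on platforms where those are distinct types.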
done
Thanks @ericsnowcurrently for the PR 🌮🎉. I'm working now to backport this PR to: 3.12.
Sorry, @ericsnowcurrently, I could not cleanly backport this to
…thongh-109794) Essentially, we should check the thread ID rather than the thread state pointer.
Is it really reasonable to backport this major change to a stable branch? IMO it's too late to change that, no?
This change introduced a regression: Python started to crash randomly again at exit when there are daemon threads: see issue gh-110052. I confirmed (by manual testing) that commit 32466c9 introduced the regression:
@ericsnowcurrently: Please have a look. If you don't have the bandwidth to repair the regression, I will revert the change soon. This code is very fragile :-( I fixed multiple bugs to try to handle all cases:
Note: By the way, recently, I fixed a race condition in _thread.start_new_thread(), but it doesn't seem to be related: commit 517cd82. Before the commit, the test was fine:
A system load of 91.03 is high, knowing that my laptop CPU has 6 cores / 12 threads.
I'm not comfortable with this change, which has comments like:
// XXX This isn't completely safe from daemon thraeds,
// since tstate might be a dangling pointer.
Well, if it's not safe: don't do it.
Note: _PyRuntimeState_GetFinalizing() uses the "old" pycore_atomic.h API, whereas _PyRuntimeState_GetFinalizingID() uses the "new" pyatomic.h API.
I'm looking into the crashes right now.
I'm going to follow up on gh-110052.
That's a comment I added about existing code. 😄
In Python 3.11, _PyThreadState_MustExit() does not dereference tstate because it can be a dangling pointer; this is explained in the comment:
// Check if a Python thread must exit immediately, rather than taking the GIL
// if Py_Finalize() has been called.
//
// When this function is called by a daemon thread after Py_Finalize() has been
// called, the GIL does no longer exist.
//
// tstate can be a dangling pointer (point to freed memory): only tstate value
// is used, the pointer is not deferenced.
//
// tstate must be non-NULL.
int
_PyThreadState_MustExit(PyThreadState *tstate)
{
    /* bpo-39877: Access _PyRuntime directly rather than using
       tstate->interp->runtime to support calls from Python daemon threads.
       After Py_Finalize() has been called, tstate can be a dangling pointer:
       point to PyThreadState freed memory. */
    PyThreadState *finalizing = _PyRuntimeState_GetFinalizing(&_PyRuntime);
    return (finalizing != NULL && finalizing != tstate);
}
By the way, in the main branch, the function still has this comment:
// tstate can be a dangling pointer (point to freed memory): only tstate value
// is used, the pointer is not deferenced.
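The idea of checking the thread ID rather than the thread state pointer can be illustrated in isolation. The snippet below is a minimal sketch, not the code from this PR: `demo_runtime`, `demo_set_finalizing`, and `demo_must_exit` are hypothetical names built on C11 `<stdatomic.h>`, and the sketch assumes that 0 is never a valid thread ID, so it can serve as the "not finalizing" marker.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the runtime state; not a CPython structure. */
typedef struct {
    atomic_ulong finalizing_id;   /* 0 means "not finalizing" (assumption) */
} demo_runtime;

/* Record which thread is running finalization. */
void
demo_set_finalizing(demo_runtime *rt, unsigned long thread_id)
{
    atomic_store(&rt->finalizing_id, thread_id);
}

/* Return true if the calling thread must exit instead of continuing.
   Only the caller's numeric ID is compared, so no thread state memory
   (which might already be freed) is ever read. */
bool
demo_must_exit(demo_runtime *rt, unsigned long my_thread_id)
{
    unsigned long finalizing = atomic_load(&rt->finalizing_id);
    return finalizing != 0 && finalizing != my_thread_id;
}
```

In this sketch, a daemon thread would pass its own thread ID and stop as soon as `demo_must_exit()` returns true, while the finalizing thread itself always gets false regardless of which thread state is current for it.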
The main motivation is to address the situation where a subinterpreter gets cleaned up while the runtime is finalizing. Without this fix, the process immediately exits with an exit code of 0, even if it would have exited with 1 (and the rest of runtime finalization is skipped). That seems like it makes a backport worth it.
I added the The cases where it could be a problem are strictly where a subinterpreter has daemon threads. That is unlikely since subinterpreters have daemon threads disabled by default.
GH-110705 is a backport of this pull request to the 3.12 branch.
The warnings were introduced by pythongh-109794 (for pythongh-109793).
Essentially, we should check the thread ID rather than the thread state pointer.