Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix allow_threads segfault #2952

Merged
merged 1 commit into from
Feb 21, 2023
Merged

Conversation

OliverBalfour
Copy link
Contributor

Please see the corresponding issue #2951 for details. This PR adds the failing test from the issue and then a fix for it. The fix simply calls ReferencePool::update_counts at the end of allow_threads to ensure objects aren't accidentally deleted too soon.

@davidhewitt
Copy link
Member

davidhewitt commented Feb 14, 2023

Thank you very much for debugging this, a clear explanation and I agree with the proposed fix! Let's get this out as a patch release.

Please can you add a newsfragment as 2952.fixed.md.

The test failure looks like a race. I think what is likely happening is that the new test you have added is writing an object into the pool during the allow_threads call, which then races with test_pyobject_drop_without_gil_doesnt_decrease_refcnt.

I think the pool_not_dirty() assertion in that test on line 635 is excessively demanding. I think you can change it to instead just check that the object in question is not in the pool:

assert!(!POOL.pointer_ops.lock().0.contains(&obj.as_ptr()));
assert!(!POOL.pointer_ops.lock().1.contains(&obj.as_ptr()));

Finally, can you please update the test test_clone_without_gil. On line 803 it has this comment:

// Returning from allow_threads doesn't clear the pool

That comment is now clearly wrong, so I think the allow_threads / with_gil construct immediately after it can be removed.

@@ -478,6 +478,8 @@ impl<'py> Python<'py> {
gil::GIL_COUNT.with(|c| c.set(self.count));
unsafe {
ffi::PyEval_RestoreThread(self.tstate);
// Update counts of PyObjects / Py that were cloned or dropped during `f`.
gil::POOL.update_counts(Python::assume_gil_acquired());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder whether we should package up RestoreGuard in mod gil to centralize the usage of Save/RestoreThread, accessing GIL_COUNT and calling update_counts in that module. (I suspect that having this in separate modules, i.e. making GIL_COUNT a pub(crate) item, might have been the root cause of this slipping in.)

@OliverBalfour
Copy link
Contributor Author

Thanks for the feedback. I've added a newsfragment and made the proposed test modifications. I see there are a couple other pool_not_dirty assertions - are these safe? I don't know the codebase well.

As for packaging up RestoreGuard, this sounds like a good solution! I personally won't have time to do this though, I'm very time poor.

@davidhewitt
Copy link
Member

Thanks! I'm going to force-push here to squash. The pool_not_dirty() assertions where the GIL is never released are not a problem (won't race with other threads).

@adamreichold - I agree with the idea of adding an abstraction layer, however it's fine IMO to not do it in this fix. I've opened #2969 for one of us to remember to do this later.

@davidhewitt
Copy link
Member

bors r+

@bors
Copy link
Contributor

bors bot commented Feb 21, 2023

Build succeeded:

@bors bors bot merged commit bd07ecc into PyO3:main Feb 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants