concurrent invalidation and tear-down can trigger a bug #53

Open
drossetti opened this issue Oct 17, 2019 · 1 comment

drossetti commented Oct 17, 2019

The condition below is benign (see #15), so the peer_err() call in it is incorrect and confusing. It should be removed.

nvidia_p2p_dma_unmap_pages
{
...
#if NV_DMA_MAPPING
        if (!nv_mem_context->dma_mapping) {
                peer_err("nv_get_p2p_free_callback -- invalid dma_mapping\n");

drossetti changed the title from "concurrent invalidation and tear-down can trigger bogus print" to "concurrent invalidation and tear-down can trigger use-after-free and possibly kernel memory corruption" on Oct 11, 2023
drossetti changed the title from "concurrent invalidation and tear-down can trigger use-after-free and possibly kernel memory corruption" to "concurrent invalidation and tear-down can trigger a bug" on Oct 11, 2023

drossetti (Author) commented

Updating this issue after a long time.
It turns out that the print is actually a signature of a bug in how the MRs are cleaned up under specific conditions.
The other relevant diagnostic is the one below:

nv_mem nv_get_p2p_free_callback:144 nv_get_p2p_free_callback -- invalid page_table 

Both may be related to this issue.
The concerning case is when those checks cannot mitigate the issue because the content of the memory is not zero.
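
To illustrate why a NULL check only helps when the memory happens to be zero, here is a minimal user-space sketch. It is not the driver code: `struct mem_context`, `slot`, `setup`, `teardown_zeroed`, `teardown_stale` and `callback_bails_out` are all hypothetical names, and the "freed" object is simulated with static storage so that reading it is well defined (a real use-after-free would not be).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the per-MR context; not the real driver struct. */
struct mem_context {
    void *page_table;
    void *dma_mapping;
};

/* Simulated allocation slot. The storage outlives the logical "free",
 * so inspecting it after tear-down is safe in this sketch. */
static struct mem_context slot;
static int fake_pages;    /* dummy objects, only their addresses are used */
static int fake_mapping;

static void setup(void)
{
    slot.page_table = &fake_pages;
    slot.dma_mapping = &fake_mapping;
}

/* Tear-down after which the slot happens to contain zeros. */
static void teardown_zeroed(void)
{
    memset(&slot, 0, sizeof slot);
}

/* Tear-down that leaves stale contents behind (assumption: this models
 * freed memory that was never cleared, or was reused for other data). */
static void teardown_stale(void)
{
    /* intentionally clears nothing */
}

/* Checks in the style of the diagnostics quoted in this issue.
 * Returns true if the callback would stop with an "invalid ..." print. */
static bool callback_bails_out(const struct mem_context *ctx)
{
    assert(ctx != NULL);
    if (!ctx->page_table) {
        fprintf(stderr, "invalid page_table\n");
        return true;
    }
    if (!ctx->dma_mapping) {
        fprintf(stderr, "invalid dma_mapping\n");
        return true;
    }
    /* Both checks passed: a real callback would now use these pointers. */
    return false;
}
```

Calling `setup()`, then `teardown_zeroed()`, then `callback_bails_out(&slot)` makes the check fire, which corresponds to the benign-looking print. Calling `setup()`, then `teardown_stale()`, makes both checks pass even though the context is logically gone, which is the case the checks cannot catch.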
