
On-demand program file paging: fix initialization of BSS areas #2032

Merged (3 commits) on Jul 7, 2024

francescolavra (Member)

When on-demand paging of the program file is enabled, BSS areas in pages faulted in on demand are zeroed in place: a newly mapped page retrieved via the page cache is zeroed starting from the BSS offset that was set up when the relevant vmap was initialized. This is a problem if the page contains other data (e.g. from another loadable section of the program) at or after the BSS offset, because that data gets overwritten.
This change fixes the issue by using a separate page instead of the page from the page cache: the initialized program data located before the BSS offset is copied from the page cache page, and the rest of the page, starting at the BSS offset, is zeroed out. Closes nanovms/ops#1629.
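
The copy-then-zero step described above can be sketched as follows. This is a minimal illustration, not the code from this PR: `fill_bss_page`, its parameters, and the fixed `PAGE_SIZE` of 4096 are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE 4096  /* assumed page size for this sketch */

/*
 * Hypothetical helper (not a Nanos function): fill the private page `dst`
 * from the page cache page `cache_page`. Bytes before `bss_offset` are
 * copied; bytes from `bss_offset` to the end of the page are zeroed.
 * The page cache page itself is never written to.
 */
static void fill_bss_page(unsigned char *dst, const unsigned char *cache_page,
                          size_t bss_offset)
{
    assert(bss_offset <= PAGE_SIZE);
    memcpy(dst, cache_page, bss_offset);                  /* initialized data */
    memset(dst + bss_offset, 0, PAGE_SIZE - bss_offset);  /* BSS area */
}
```

Because only `dst` is written, any data that the page cache page holds at or after the BSS offset remains intact for other users of that page.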

The last commit fixes an assertion failure that occurs when a page fault in a kernel context cannot be resolved synchronously.

francescolavra (Member, Author)

I fixed two issues that could potentially cause the kernel to fail to release a lock when under memory pressure.

…pped

This change aligns the aarch64 implementation of
__physical_from_virtual_locked() to that of the other
architectures.

It is possible for a page fault to occur in a kernel context: for
example, when setting up a signal frame on the stack of a user
thread, if the relevant page of the stack is not resident in memory
it will cause a fault. In these cases, if the fault cannot be
resolved synchronously (e.g. because a filesystem sync to disk is
needed to free up some memory), the existing code panics at an
assert(!is_kernel_context(ctx)) in the interrupt handler triggered
by the fault.

This change removes the above assert() so that page faults in kernel
contexts are correctly processed.

When on-demand paging of the program file is enabled, BSS areas in
pages faulted-in on demand are zeroed in-place, i.e. a newly mapped
page retrieved via the page cache is zeroed starting from the BSS
offset set up when initializing the relevant vmap. This creates a
problem if the page contains other data (e.g. from another loadable
section of the program) at or after the BSS offset, in which case
this data would be overwritten.
This change fixes the above issue by using a separate page (instead
of the page from the page cache) where the initialized program data
(located before the BSS offset) is copied from the page cache page,
and the rest of the page (starting at the BSS offset) is zeroed
out. Closes nanovms/ops#1629.

The on-demand paging implementation is being reworked to address
the following shortcomings:
- A vmap struct cannot be referenced without holding the vmap lock,
  because it may be modified and/or deallocated at any time (e.g.
  if the access protection flags of contiguous memory areas are
  modified)
- Page faults from different CPUs for the same page cannot be
  safely handled in parallel via a process-global pending fault
  list, because it is possible for a faulting CPU to create a
  new pending fault for a given page and then complete it before
  another faulting CPU processes a fault for the same page: in this
  case, the second CPU would not find any pending fault for the page,
  even though the fault has been handled by the first CPU

The reworked implementation allows multiple faults to be pending
simultaneously for the same page, and relies on the page table lock
to prevent multiple mappings of the same page by different CPUs.
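
One way to picture how a lock on the page table can prevent double mappings when several faults are pending for the same page is the sketch below. It is an illustration under stated assumptions, not the Nanos implementation: `toy_page_table`, `map_if_unmapped`, and the spinlock built on `atomic_flag` are all hypothetical names introduced here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_PAGES 16  /* assumed size of the toy page table */

/* Toy page table: one slot per virtual page, NULL when unmapped. */
struct toy_page_table {
    atomic_flag lock;           /* stands in for the page table lock */
    void *mapping[NUM_PAGES];
};

static void pt_lock(struct toy_page_table *pt)
{
    while (atomic_flag_test_and_set_explicit(&pt->lock, memory_order_acquire))
        ;  /* spin until the lock is free */
}

static void pt_unlock(struct toy_page_table *pt)
{
    atomic_flag_clear_explicit(&pt->lock, memory_order_release);
}

/*
 * Called by each faulting CPU once its own pending fault has a physical
 * page ready. Returns true if this caller installed the mapping, false if
 * another CPU had already mapped the page; in that case the caller keeps
 * ownership of `phys` and can release it.
 */
static bool map_if_unmapped(struct toy_page_table *pt, size_t vpage, void *phys)
{
    bool installed = false;

    assert(vpage < NUM_PAGES);
    pt_lock(pt);
    if (pt->mapping[vpage] == NULL) {
        pt->mapping[vpage] = phys;
        installed = true;
    }
    pt_unlock(pt);
    return installed;
}
```

In this sketch no CPU needs to look up another CPU's pending fault: each fault proceeds independently, and the check made while holding the lock decides which one installs the mapping.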
@francescolavra francescolavra merged commit 362f675 into master Jul 7, 2024
5 checks passed
@francescolavra francescolavra deleted the fix/program-paging branch July 7, 2024 14:55
Successfully merging this pull request may close these issues.

Issues using ops with Dart : pthread error: 22 (Invalid argument)