Multi-core in llbooter, capmgr, sl, cos_kernel_api(locks) and other related #346
Conversation
* This effectively reverts to using the SL_THD_WOKEN state for a thread that may call block() just after being woken, which could otherwise result in a race between block/wakeup events.
* However, when the scheduler receives events from the kernel, they are for AEPs and must not race. More importantly, kernel events could be redundant blocked/unblocked events.
* NOTE: There could be some complex interleavings with AEP threads that I may be missing, so some issues could remain. Will need to debug those as they occur.
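The block/wakeup race described above can be sketched with an atomic thread-state variable: a wakeup that arrives before the corresponding block is remembered, and the later block consumes it instead of blocking forever. Everything below (state names, the CAS protocol, `thd_wakeup`/`thd_block`) is an illustrative reconstruction of the SL_THD_WOKEN idea, not the actual sl code:

```c
#include <assert.h>
#include <stdatomic.h>

/* Illustrative states mirroring the SL_THD_WOKEN idea (not the real sl enum). */
enum thd_state { THD_RUNNABLE, THD_BLOCKED, THD_WOKEN };

struct thd { _Atomic int state; };

/*
 * wakeup() can race with block(): if the target has not blocked yet,
 * record the wakeup as THD_WOKEN so that a subsequent block() becomes
 * a no-op instead of missing the wakeup and blocking forever.
 */
static void
thd_wakeup(struct thd *t)
{
	int expected = THD_BLOCKED;

	if (!atomic_compare_exchange_strong(&t->state, &expected, THD_RUNNABLE)) {
		/* not blocked yet: remember the pending wakeup */
		atomic_store(&t->state, THD_WOKEN);
	}
}

/* Returns 0 if a pending wakeup was consumed, 1 if the thread actually blocks. */
static int
thd_block(struct thd *t)
{
	int expected = THD_WOKEN;

	if (atomic_compare_exchange_strong(&t->state, &expected, THD_RUNNABLE)) return 0;
	atomic_store(&t->state, THD_BLOCKED);

	return 1; /* a real scheduler would switch away here */
}
```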
* Most changes relate to having PER_CORE variables/data structures.
…into smp
Conflicts RESOLVED:
src/components/include/cos_defkernel_api.h
src/components/include/cos_thd_init.h
src/components/lib/cos_defkernel_api.c
Fixed to use CACHE_ALIGNED. TODO: capmgr/sched implementations!
* TODO? Cannot create per-core thread ids, because stack allocation relies on unique thread ids.
* TODO: sched component per core!
* TODO: sl global data structures.
* Made sure that llbooter/capmgr data structures are set up such that we can have core-specific hierarchies. TODO: runscript design for that, and passing it in init_args??
Conflicts RESOLVED: src/platform/i386/lapic.c
* Schedulers are implemented such that we can have core-specific hierarchies. This still needs to be designed and implemented in the linker/loader.
* TODO: inter-core scheduler notifications.
* This is a workaround for issues in the cos_kernel_api data structures, which are prone to races.
* TODO: 1. Figure out where the races are! Use ticket locks for cos_kernel_api frontier modifications.
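A minimal ticket lock of the kind the TODO proposes for serializing frontier modifications might look as follows. This is a generic sketch (names invented), not the lock that ended up in cos_kernel_api; its property is FIFO ordering among contending cores:

```c
#include <assert.h>
#include <stdatomic.h>

/* Generic ticket lock: cores acquire in the order they asked. */
struct ticket_lock {
	_Atomic unsigned int next;    /* next ticket to hand out */
	_Atomic unsigned int serving; /* ticket currently being served */
};

static void
tl_lock(struct ticket_lock *l)
{
	unsigned int me = atomic_fetch_add(&l->next, 1);

	while (atomic_load(&l->serving) != me) ; /* spin until our turn */
}

static void
tl_unlock(struct ticket_lock *l)
{
	atomic_fetch_add(&l->serving, 1);
}
```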
* Fixed race bugs in cos_kernel_api that led to capability-slot reuse.
* This allows the booter/capmgr to run initialization in parallel on CPUs 1 to N-1 (no more serializing required).
* There are still some random crashes (I noticed one in micro_booter during heap alloc); I'll continue to debug those. But this is a much more stable version than before!
* Tested my unit test for hierarchical scheduling with ~9 components, the capmgr test, the raw scheduler (a scheduler that uses the raw kernel API), and the micro booter, all on HW running with NUM_CPU = 4.
* Using locks in the cos_kernel_api!
* Talked to Yuxin about using locks here. Issues:
  1. We could propagate errors to user level and not use locks. But these are not performance-intensive paths, and the idea here is only to maintain consistency of the shared frontiers (across cores), not to handle, for example, system-call errors, which we'd still need to either propagate to the user or BUG()/assert() as we do now.
  2. Reason for locks vs. retry logic for data-structure consistency: locks ensure that we don't update frontiers when we don't need to, so no holes are created in the capability slots. Plus, if we need to propagate retries, we can do that with `try_lock` instead of with internal API failures, which keeps the code simple and avoids complex retry logic around it.
  3. Whether locks are a long-term solution: I think the locks are mainly here to maintain the consistency of the cos_kernel_api library data structures, around APIs that are not performance sensitive or that we don't upper-bound (??). So unless we want a totally lock-free, user-level (user of the lib) retry version, this doesn't feel like a short-term solution to me!
  4. Could we make the lock usage more efficient? Perhaps. Ex: in bump_alloc_generic, we don't need locks for per-core bump allocs. I haven't thought this through, and I don't think it matters much, but there could be room for optimizing the usage!
* TODO: The only remaining issue, I think, is with `cos_page_bump_alloc()`, and I need to fix it. In my opinion it is probably not related to frontiers (though it very well could be) but mostly to the atomicity of bump_expand and the fact that a PTE may need additional extension. This is still not a complete thought; I will debug deeper.
  The reason I say this is that `page_bump_alloc()` doesn't crash in most of my tests, except micro_booter, which exercises it more rigorously: its page_bump_allocs() cause additional PTEs to be allocated, and if those executions interleave, they should be causing the issues!
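The "no holes in the capability slots" argument from point 2 can be made concrete with a small sketch: the frontier advances only under the lock and only on success, and contention surfaces to the caller as a retryable error via a try-lock. The struct, function, and error convention below are all hypothetical, not the cos_kernel_api internals:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct frontier {
	atomic_flag lock;
	size_t      next;  /* next free slot */
	size_t      limit; /* one past the last usable slot */
};

static struct frontier cap_frontier = { .lock = ATOMIC_FLAG_INIT, .next = 0, .limit = 4 };

/*
 * Returns the allocated slot, -1 when out of slots, or -2 when the
 * try-lock fails and the caller should retry.  Because the frontier
 * moves only on a successful allocation, failures never leave holes.
 */
static long
frontier_bump(struct frontier *f, size_t n)
{
	long ret = -1;

	if (atomic_flag_test_and_set(&f->lock)) return -2; /* contended: caller retries */
	if (f->next + n <= f->limit) {
		ret      = (long)f->next;
		f->next += n;
	}
	atomic_flag_clear(&f->lock);

	return ret;
}
```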
* There is a race bug that leads to asserts triggering in page_bump_alloc from a system call. In my understanding, PTEs are allocated by one core and used by another core that also does a valloc. I'm going to debug this further to get to the core of the problem, but this fix makes things flawless on multi-core (tested with 8 cores).
Conflicts RESOLVED:
src/components/implementation/tests/micro_booter/micro_booter.c
src/kernel/include/shared/cos_types.h
src/platform/i386/chal.c
* Added a long description in sl_thd.h of the idea behind the fix for AEP threads. Essentially, if a thread is not blocked on cos_rcv, a kernel "unblocked" event should not affect the thread's state at the user level, and a user-level block/yield should reset the kernel state in the scheduler's data structure, since the thread is running or has been running without the scheduler's knowledge! A kernel event therefore isn't really a thread state, and it is designed that way!
* sl_xcpu_xxxx() API for cross-core requests! Currently only a single API, sl_xcpu_thd_alloc(), is supported and tested!
* `sl_global_init()` is called from inside `sl_init()` on the first call to it, so global init is not exposed! TODO: pass a CPU bitmap to `sl_init` to indicate which CPUs/cores this scheduler runs on! Right now, it's hard-coded within sl_init!
* Added the ability to create an `asndcap` from one core to all other cores for cross-core notifications!
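As a rough illustration of what a cross-core request path like sl_xcpu_thd_alloc() involves, here is a toy per-core mailbox. The struct, the [dst][src] slot layout, and the negative-errno convention are all invented for this sketch; the real sl_xcpu machinery is not reproduced here:

```c
#include <assert.h>
#include <errno.h>

#define NUM_CPU 4

/* One single-slot mailbox per (dst, src) core pair; all names invented. */
struct xcpu_req { int pending; int type; unsigned long arg; };

static struct xcpu_req xcpu_mbox[NUM_CPU][NUM_CPU];

/*
 * Post a request to core dst from core curr.  Targeting the current
 * core is rejected, and errors are returned as negative errnos.
 */
static int
xcpu_req_send(int curr, int dst, int type, unsigned long arg)
{
	struct xcpu_req *r;

	if (dst < 0 || dst >= NUM_CPU || dst == curr) return -EINVAL;
	r = &xcpu_mbox[dst][curr];
	if (r->pending) return -EBUSY; /* previous request not yet consumed */
	r->type    = type;
	r->arg     = arg;
	r->pending = 1;
	/* a real implementation would now asnd to dst's scheduler */

	return 0;
}
```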
* "Channel" is the more generic term, and using the same channel keys for ASND->ARCV and shared memory would be ideal, since together they form a communication channel. This change is a step toward that goal!
* This interface allows us to create a channel based on a static namespace using cos_channelkey_t. If two components want to communicate without worrying about their ancestry or hierarchy, this API, along with the AEP creation API and shared keys, enables two arbitrary components to talk to each other! We need access control for which components can talk to which others, perhaps based on the dependencies in runscripts?? (This is not currently done, however!)
* I've also modified memmgr and the channel interface to return the existing mapping on a duplicate map call!
* I was slightly torn between channel vs. shmchannel, but I chose channel because it's shorter and perhaps also lets us, for example, combine the shared-memory creation and asnd creation APIs using a channel key!
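The create-or-lookup behavior described above (static key namespace, duplicate calls returning the existing mapping) can be sketched as follows. The `channelkey_t` typedef, table layout, and `channel_setup` function are stand-ins invented for the sketch, not the actual channel interface:

```c
#include <assert.h>

typedef unsigned int channelkey_t; /* stand-in for cos_channelkey_t */

#define MAX_CHANNELS 16

struct channel { channelkey_t key; unsigned long shm_addr; int used; };
static struct channel channels[MAX_CHANNELS];

/*
 * Create-or-lookup in a static, key-based namespace: a duplicate call
 * with an existing key returns the mapping that is already there.
 * No access control here -- exactly the open question raised above.
 */
static long
channel_setup(channelkey_t key, unsigned long shm_addr)
{
	int i, avail = -1;

	for (i = 0; i < MAX_CHANNELS; i++) {
		if (channels[i].used && channels[i].key == key) return (long)channels[i].shm_addr;
		if (!channels[i].used && avail < 0) avail = i;
	}
	if (avail < 0) return -1; /* namespace full */
	channels[avail] = (struct channel){ .key = key, .shm_addr = shm_addr, .used = 1 };

	return (long)shm_addr;
}
```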
…mposite into smp
Conflicts RESOLVED:
src/components/implementation/capmgr/naive/cap_info.c
src/components/implementation/capmgr/naive/cap_info.h
src/components/implementation/tests/unit_capmgr/unit_capmgr.c
src/components/implementation/tests/unit_capmgr_shmmap/unit_capmgr_shmmap.c
src/components/lib/sl/sl_capmgr.c
src/components/lib/sl/sl_raw.c
* TODO: TESTING!
* Added the `capmgr_asnd_rcv_create()` API to avoid requiring `introspection` on an init thread in order to create a snd endpoint.
* A ring buffer (ck_ring) is initialized in a shared memory region shared between a parent and its child (SPSC).
* The parent creates the shared memory, initializes the ring buffer, and returns the cbuf ID to the child through the schedinit_child interface!
* The child then maps that address and sets up its ring/ring-buffer pointers.
* Every scheduler then, in its `sl_sched_loop()`, checks whether there are events in its ring buffer; if there are, they're processed like other kernel events or inter-core events! Only BLOCK/WAKEUP are supported currently.
* TODO: Upon `capmgr_thd_retrieve()`, I need the init thread id of the child component that owns this thread, because the lazy thread retrieval doesn't know the mappings from child spdid to child threads!
* Tested the notification path but not the actual blocks & wakeups!
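The actual code uses ck_ring from Concurrency Kit; since that API is not reproduced here, the following hand-rolled SPSC ring conveys the same idea (power-of-two capacity, single producer = child scheduler, single consumer = parent scheduler). The event layout and all names are invented for the sketch:

```c
#include <assert.h>
#include <stdatomic.h>

#define RING_SZ 8 /* must be a power of two */

/* Only BLOCK/WAKEUP-style events, matching what the commit supports. */
enum evt_type { EVT_BLOCK, EVT_WAKEUP };
struct child_evt { enum evt_type type; unsigned int tid; };

/* This struct would live in the parent/child shared-memory region. */
struct evt_ring {
	_Atomic unsigned int head; /* consumer (parent scheduler) */
	_Atomic unsigned int tail; /* producer (child scheduler) */
	struct child_evt     buf[RING_SZ];
};

static int
evt_enqueue(struct evt_ring *r, struct child_evt e)
{
	unsigned int t = atomic_load(&r->tail);

	if (t - atomic_load(&r->head) == RING_SZ) return -1; /* full */
	r->buf[t & (RING_SZ - 1)] = e;
	atomic_store(&r->tail, t + 1); /* publish after the slot is written */

	return 0;
}

static int
evt_dequeue(struct evt_ring *r, struct child_evt *e)
{
	unsigned int h = atomic_load(&r->head);

	if (h == atomic_load(&r->tail)) return -1; /* empty */
	*e = r->buf[h & (RING_SZ - 1)];
	atomic_store(&r->head, h + 1);

	return 0;
}
```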
* This API retrieves the thread lazily, so knowing the initthd of the client component is essential at this point to set up the thd->schedthd of the thread being retrieved!
* Replaced multiple arrays of per-core information with a single cache-aligned struct that contains all of them, to save space!
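The trade-off behind this commit can be illustrated: one cache-line-aligned struct per core pads once per core, whereas several independently aligned parallel arrays would each pad per core, and unaligned arrays risk false sharing between cores. The field names and the 64-byte line size below are assumptions, not the actual sl layout:

```c
#include <assert.h>

#define NUM_CPU    4
#define CACHE_LINE 64 /* assumed cache-line size */

/* All per-core fields in one struct, aligned so cores never share a line. */
struct sl_percore_info {
	unsigned long sched_tick;
	unsigned int  curr_thd;
	int           timer_set;
} __attribute__((aligned(CACHE_LINE)));

static struct sl_percore_info percore_info[NUM_CPU];
```

With the attribute on the struct type, `sizeof` rounds up to the alignment, so adjacent array elements land on distinct cache lines.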
* The sl_xcpu_xxxx APIs are for cross-CPU requests; passing the cpuid of the current core as the argument is restricted in my design!
* Also made the return value reflect the internal error by returning the errno!
Conflicts RESOLVED: src/kernel/include/shared/cos_config.h src/kernel/include/shared/cos_types.h
* Changed `sched_blocked` to `rcv_suspended`.
* Updated comments for `rcv_suspended`.
* cos_switch to the sched thread with the current tcap on -EPERM from a cos_switch with the sched tcap.
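The -EPERM fallback in the last bullet can be sketched like so. Here `cos_switch_stub` is a stub standing in for the real cos_switch system call (hard-wired to fail on the scheduler tcap so the fallback path is exercised), and the tcap constants are invented:

```c
#include <assert.h>
#include <errno.h>

typedef int tcap_t; /* simplified stand-in for a tcap capability */
#define TCAP_SCHED 1
#define TCAP_CURR  2

/* Stub for the cos_switch syscall; simulates the failing sched-tcap case. */
static int
cos_switch_stub(int thd, tcap_t tc)
{
	(void)thd;
	return (tc == TCAP_SCHED) ? -EPERM : 0;
}

/*
 * Try to switch to the scheduler thread on the scheduler's tcap; if
 * that is not permitted (-EPERM), retry on the tcap we are currently
 * running on, as the commit describes.
 */
static int
switch_to_sched(int sched_thd)
{
	int ret = cos_switch_stub(sched_thd, TCAP_SCHED);

	if (ret == -EPERM) ret = cos_switch_stub(sched_thd, TCAP_CURR);

	return ret;
}
```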
Woken race fix for any thread
Conflicts RESOLVED: src/components/include/sl.h src/components/lib/sl/sl_sched.c
I really hope that you addressed all of the issues. I looked through the commits after feedback, and there isn't much there... but I don't remember what the feedback was at this point, so that might be OK. I'm taking you at your word ;-)
I wasn't expecting a merge of this so soon. ;-) The first few comments are for my changes. I'm not sure if I should revert the PR so it can go through the review process.
Never mind. I just realized this is to "smp" branch and not mainline ppos. We can do the review process when I PR it to the mainline. |
I misunderstood. I thought you said before that this was ready and post-changes. I thought I did give feedback. Did I only give feedback on a subset of the PR? |
I have said that for the other "ready to merge" PRs, but not this one, I think. :-)
Summary of this Pull Request (PR)
This PR has all the required pieces (at least the basic foundational facilities) for cross-CPU and parent-child communication and for any two parties to communicate (through channels), plus bug fixes in core interfaces and libraries for multi-core.
channel:
capmgr:
sched/schedinit:
llbooter:
sl:
cos_kernel_api:
ck submodule:
Intent for your PR
Choose one (Mandatory):
Reviewers (Mandatory):
@gparmer @ryuxin @yzcode
Code Quality
As part of this pull request, I've considered the following:
Style:
Code Craftsmanship:
Testing
I've tested the code using the following test programs (provide list here):