-
Notifications
You must be signed in to change notification settings - Fork 301
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
core.cdata: support for common SHM object types #1082
base: master
Are you sure you want to change the base?
Conversation
Cc @lukego |
You seem to be going in a parallel direction to where we're going in the lwaftr :) I would be interested in a bigger picture design document. |
@wingo The use case driving this is to export information from the main process that initializes a 100G NIC to the worker processes that have to attach to queues. Code: https://github.com/snabbco/snabb/blob/mellanox/src/apps/mellanox/connectx4.lua#L148-L176 Currently this is implemented via shm counter objects. However, it is a bit funny to use counters because these values are not logically counters (e.g. address of a DMA ring), the values are not always uint64_t, and counters are double-buffered (have to remember to Is there an lwAFTR mechanism that we could reuse instead already now? |
If the need is just for information, the leader/follower apps include a generic one-way main-to-worker channel: https://github.com/Igalia/snabb/blob/yang/src/apps/config/README.md. |
If the goal of this information is to simply configure the worker, then given that you want to move to YANG anyway, I would define a YANG module for the parts of the mellanox drivers that need this configuration (sendpath/receivepath), then use that YANG module to serialize that data. If the data use is more complicated (e.g. worker->main process communication) then that might not be a great option. I am a little concerned that there's been a public ongoing design for multiprocess configuration, backs and forths from everyone, then something like this going in through the back door without a big picture design that I have seen, well it makes me nervous :) Perhaps that is just me being high-strung right before a deliverable :) |
If this measure is seen to be a stopgap, then that's cool too. In that case I would suggest to put these data structures somewhere other than |
@wingo The goal is to make instances of the Mellanox IO app automatically synchronize with each other even when they are running in separate processes. The information being shared is emphemeral operational data e.g. the physical address of each DMA ring and whether it is currently online. This is currently invisible to the application & app network - it is internal state of the Mellanox apps that happens to span multiple processes. The most closely related public discussion is #1068 the "IO" mechanism i.e. the desire for drivers to be able to run distributed across multiple processes that all want to send/receive traffic on the same network interfaces. Currently we have put a simple interface on the IO apps and expected their instances to coordinate behind the scenes somehow. Could alternatively model each piece of operational state with YANG and export that with Leader/Follower. So the YANG schema would have objects for e.g. the current physical address of the DMA ring for queue 7 and the index of the last successfully transmitted packet. This could then be provided to the app instances as configuration state. Is that what you have in mind? Just now this seems like taking YANG modeling a step too far - modeling state that apps actually want to hide internally - but this may be lack of imagination on my part. |
Could be. If an app has a private channel, by all means make a private channel. If you want to model that data with YANG, then it's possible to compile a public configuration to a private configuration and ship that data over a channel ring buffer. Or if you want an app-specific mechanism, that's cool too (and that code should surely live in the app's directory and not core). But I still didn't get one part of what you are doing -- are you updating these values in place? Updating strings in place seems not like a good idea :) |
I would not use YANG (or our yang code anyway) for things like "index of the last successfully transmitted packet". |
@wingo I simply want to define some variables whose scope spans process boundaries. The existing @eugeneia I suggest that we take @wingo's feedback as a NACK and I will cherry-pick any changes that I need from this branch into the mellanox driver directly. Then we can consider the issue again if/when we need to propagate this functionality into the other drivers. |
@wingo Just reflection: we need to establish the scope of use cases for YANG/Leader+Follower/Channels/etc. Where does the YANG universe start and end? Lots of objects in Snabb - program configuration, app networks, app state, histograms and timeline logs, LuaJIT state, etc. It would be nice to have a common view on when to use these mechanisms and when to do something else. |
Apologies for being grumpy and thank you for humoring me :) FWIW the core of my objection is really against SHM-based configuration (app-internal or otherwise) in |
I understand that point of view. Shared memory variables are too low-level to be a satisfactory general solution for synchronization. This is a relatively simple use case. Most of the variables are written only once, at creation time, and then used read-only. In this case shm seems simpler to me than channels because it is passive state (named values) rather than moving parts (sending messages on ring buffers). I also need to be able to handle simple exceptional cases e.g. for the worker to be able to save a tiny amount of precious state when it shuts down so that the next instance can reattach to the rings correctly. No problem to keep this local in the driver for now. I do like this strategy of keeping code outside of core until it ceases to be controversial. |
Cc @alexandergall, who I believe might have some use for this. Initially, he complained about my “we only need 64bit counters stance.” :-) |
Fix attempt to index a nil value
We anticipate increased usage of shared memory objects by multiprocessing aware apps that use the facility to coordinate using shared memory. This PR registers a bunch of basic types with
core.shm
so that they can be used in shared memory frames (or independently).The first part is straight forward and imho generally useful, it adds the following SHM types under the
cdata
module:int8_t
int16_t
int32_t
int64_t
uint8_t
uint16_t
uint32_t
uint64_t
float
double
bool
These have well defined conversion semantics as documented here: http://luajit.org/ext_ffi_semantics.html (C Type Conversion Rules). With this PR they can be conveniently used in SHM frames:
Additionally I have added some crude support for fixed length strings (three sizes for now: 64, 512, and 8192). This part of the PR might be too clunky or good enough, you decide. I am willing to scrap that part for a better alternative. See the clunky usage in action: