Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

core.cdata: support for common SHM object types #1082

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

eugeneia
Copy link
Member

We anticipate increased usage of shared memory objects by multiprocessing aware apps that use the facility to coordinate using shared memory. This PR registers a bunch of basic types with core.shm so that they can be used in shared memory frames (or independently).

The first part is straight forward and imho generally useful, it adds the following SHM types under the cdata module:

  • int8_t int16_t int32_t int64_t
  • uint8_t uint16_t uint32_t uint64_t
  • float double
  • bool

These have well defined conversion semantics as documented here: http://luajit.org/ext_ffi_semantics.html (C Type Conversion Rules). With this PR they can be conveniently used in SHM frames:

myframe = shm.create_frame("myframe", {toggle = {cdata.bool, false})
cdata.set(myframe.toggle, not cdata.read(myframe.toggle)) -- toggle it

Additionally I have added some crude support for fixed length strings (three sizes for now: 64, 512, and 8192). This part of the PR might be too clunky or good enough, you decide. I am willing to scrap that part for a better alternative. See the clunky usage in action:

str = cdata.create("str.string64") -- Alternative to using cdata.string64 in create_frame
cdata.string64.copy(str, "foobar") -- could also use ffi.copy, with potential out of bounds write
ffi.string(str) => "foobar"

@eugeneia eugeneia added the rfc label Nov 30, 2016
@eugeneia
Copy link
Member Author

Cc @lukego

@wingo
Copy link
Contributor

wingo commented Dec 1, 2016

You seem to be going in a parallel direction to where we're going in the lwaftr :) I would be interested in a bigger picture design document.

@lukego
Copy link
Member

lukego commented Dec 1, 2016

@wingo The use case driving this is to export information from the main process that initializes a 100G NIC to the worker processes that have to attach to queues. Code: https://github.com/snabbco/snabb/blob/mellanox/src/apps/mellanox/connectx4.lua#L148-L176

Currently this is implemented via shm counter objects. However, it is a bit funny to use counters because these values are not logically counters (e.g. address of a DMA ring), the values are not always uint64_t, and counters are double-buffered (have to remember to counter.commit() to make them visible.)

Is there an lwAFTR mechanism that we could reuse instead already now?

@wingo
Copy link
Contributor

wingo commented Dec 1, 2016

If the need is just for information, the leader/follower apps include a generic one-way main-to-worker channel: https://github.com/Igalia/snabb/blob/yang/src/apps/config/README.md.

@wingo
Copy link
Contributor

wingo commented Dec 1, 2016

If the goal of this information is to simply configure the worker, then given that you want to move to YANG anyway, I would define a YANG module for the parts of the mellanox drivers that need this configuration (sendpath/receivepath), then use that YANG module to serialize that data. If the data use is more complicated (e.g. worker->main process communication) then that might not be a great option.

I am a little concerned that there's been a public ongoing design for multiprocess configuration, backs and forths from everyone, then something like this going in through the back door without a big picture design that I have seen, well it makes me nervous :) Perhaps that is just me being high-strung right before a deliverable :)

@wingo
Copy link
Contributor

wingo commented Dec 1, 2016

If this measure is seen to be a stopgap, then that's cool too. In that case I would suggest to put these data structures somewhere other than core.

@lukego
Copy link
Member

lukego commented Dec 1, 2016

@wingo The goal is to make instances of the Mellanox IO app automatically synchronize with each other even when they are running in separate processes. The information being shared is emphemeral operational data e.g. the physical address of each DMA ring and whether it is currently online. This is currently invisible to the application & app network - it is internal state of the Mellanox apps that happens to span multiple processes.

The most closely related public discussion is #1068 the "IO" mechanism i.e. the desire for drivers to be able to run distributed across multiple processes that all want to send/receive traffic on the same network interfaces. Currently we have put a simple interface on the IO apps and expected their instances to coordinate behind the scenes somehow.

Could alternatively model each piece of operational state with YANG and export that with Leader/Follower. So the YANG schema would have objects for e.g. the current physical address of the DMA ring for queue 7 and the index of the last successfully transmitted packet. This could then be provided to the app instances as configuration state. Is that what you have in mind? Just now this seems like taking YANG modeling a step too far - modeling state that apps actually want to hide internally - but this may be lack of imagination on my part.

@wingo
Copy link
Contributor

wingo commented Dec 1, 2016

Could be. If an app has a private channel, by all means make a private channel. If you want to model that data with YANG, then it's possible to compile a public configuration to a private configuration and ship that data over a channel ring buffer. Or if you want an app-specific mechanism, that's cool too (and that code should surely live in the app's directory and not core). But I still didn't get one part of what you are doing -- are you updating these values in place? Updating strings in place seems not like a good idea :)

@wingo
Copy link
Contributor

wingo commented Dec 1, 2016

I would not use YANG (or our yang code anyway) for things like "index of the last successfully transmitted packet".

@lukego
Copy link
Member

lukego commented Dec 1, 2016

@wingo I simply want to define some variables whose scope spans process boundaries. The existing shm mechanism seems like the natural choice except that its current set of user-defined types is sparse. Just now this seems orthogonal to YANG and channels: that is for synchronizing application-level configuration but what I am synchronizing is the internal state of some apps.

@eugeneia I suggest that we take @wingo's feedback as a NACK and I will cherry-pick any changes that I need from this branch into the mellanox driver directly. Then we can consider the issue again if/when we need to propagate this functionality into the other drivers.

@lukego
Copy link
Member

lukego commented Dec 1, 2016

@wingo Just reflection: we need to establish the scope of use cases for YANG/Leader+Follower/Channels/etc. Where does the YANG universe start and end? Lots of objects in Snabb - program configuration, app networks, app state, histograms and timeline logs, LuaJIT state, etc. It would be nice to have a common view on when to use these mechanisms and when to do something else.

@wingo
Copy link
Contributor

wingo commented Dec 1, 2016

Apologies for being grumpy and thank you for humoring me :) FWIW the core of my objection is really against SHM-based configuration (app-internal or otherwise) in core, at least at this time. I don't think there is a robust general way that can work; I could be wrong but we should start out app-specific instead of starting in core IMO. Counters and synchronization variables for me are fine tho, there's fewer ways to misuse those :)

@lukego
Copy link
Member

lukego commented Dec 1, 2016

I understand that point of view. Shared memory variables are too low-level to be a satisfactory general solution for synchronization.

This is a relatively simple use case. Most of the variables are written only once, at creation time, and then used read-only. In this case shm seems simpler to me than channels because it is passive state (named values) rather than moving parts (sending messages on ring buffers). I also need to be able to handle simple exceptional cases e.g. for the worker to be able to save a tiny amount of precious state when it shuts down so that the next instance can reattach to the rings correctly.

No problem to keep this local in the driver for now. I do like this strategy of keeping code outside of core until it ceases to be controversial.

@eugeneia
Copy link
Member Author

eugeneia commented Dec 1, 2016

Cc @alexandergall, who I believe might have some use for this. Initially, he complained about my “we only need 64bit counters stance.” :-)

dpino added a commit to dpino/snabb that referenced this pull request Jun 6, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants