[wip] snabb top --yang: dump RFC7223 interface stats as JSON #886
Set the shared memory path (shm.path) to a private namespace for each app with prefix "app/$name". This means that apps can create shm objects such as counters and by default these will appear in a local namespace for that app.
- Use "apps/" instead of "app/" for uniformity - Set shm path to "apps/$name" when calling `app:stop' too - Unlink "apps/$name" after `app:stop' using `shm.unlink' - Add a test case to core.app selftest
# Conflicts: # src/core/app.lua
…eric representation.
What's going on here? I thought Yang was being planned as part of #696. This seems to be much more ad-hoc with no design document. |
I agree that JSON is probably not the best representation for this data, because YANG's set of data types does not map particularly well to JSON. For me you always need a schema for a computer to interpret YANG instance data, if you are not serializing in a representation that is compatible with the XML representation. |
This is exciting to me as a big step in the direction of supporting standard RFC 7223 network interface statistics. This is the raw information that network operators need in order to monitor links in their networks. Having these counters available at all major points where packets ingress/egress from Snabb (e.g. NIC, VM, kernel, important tunnel, ...) would make it much easier to produce applications that are easy to monitor. This is really only indirectly related to YANG i.e. the counters in RFC 7223 are spec'd as YANG objects and so to implement that you implicitly need to have some kind of mapping. Figuring out the right way to connect this information with the various "northbound" interfaces - snabbtop, netsnmpd, netconf, etc - is the next discussion that I think @wingo has just started :-) and of course @alexandergall already has one such solution in tree here for intel10g and netsnmp. Have only skimmed the code but looks good to me so far! I am really curious to know whether we will see a significant performance impact from things like counting unicast vs multicast packets in software when hardware counters are not available. |
Could consider I see this branch as primarily developing support for keeping track of counter values that are needed for populating some standard YANG models. This is kind of a big deal because it can involve inspecting the packet stream in apps that would otherwise be payload-agnostic e.g. the vhost_user app needing to check payload to decide whether to bump the unicast or multicast counter. Have to see what the overhead is here - and if it is high then whether any compromise makes sense. These values will then be fed into the YANG framework that you are hacking based on #696. Meanwhile snabbtop is providing an interim interface for dumping the counters during development. This all has to be integrated together (and with existing SNMP support) once all the bits are there. |
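To make the cost being discussed concrete, here is a minimal sketch of what "checking the payload to decide whether to bump the unicast or multicast counter" involves. It is not the vhost_user code; the function and counter names are hypothetical. The only fact it relies on is that an Ethernet destination address is a group (multicast) address when the least significant bit of its first octet is set, with the all-ones address being broadcast.

```python
def classify_destination(frame: bytes) -> str:
    """Classify an Ethernet frame by its destination MAC address."""
    if len(frame) < 6:
        raise ValueError("frame too short to contain a destination MAC")
    dst = frame[:6]
    if dst == b"\xff" * 6:
        return "broadcast"
    # The least significant bit of the first octet is the group (multicast) bit.
    if dst[0] & 0x01:
        return "multicast"
    return "unicast"


def count_frame(counters: dict, frame: bytes) -> None:
    """Bump the per-class packet counter and the octet counter.

    The counter names used here are hypothetical, chosen for illustration only.
    """
    kind = classify_destination(frame)
    counters[kind + "_packets"] = counters.get(kind + "_packets", 0) + 1
    counters["octets"] = counters.get("octets", 0) + len(frame)
```

The per-packet work is one comparison and one bit test on the first six bytes, so the open question is less the arithmetic than the extra memory access in apps that would otherwise never touch the payload.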
I don't immediately see the problem with using JSON. @wingo can you elaborate? There is https://tools.ietf.org/html/draft-ietf-netmod-yang-json-09 which defines a mapping for JSON so shouldn't be a problem with the types but I haven't really looked at the details. With that said, I don't really mind XML either. As for the output itself the counter64 is hex encoded, why? counter64 is a uint64 and the mapping for that is described here - https://tools.ietf.org/html/draft-ietf-netmod-yang-json-09#section-6.1 The "type" is an identityref and current ones are described in rfc7224 - https://tools.ietf.org/html/rfc7224. I guess you can use existing values, like "ethernetCsmacd" where you currently have "hardware". For snabb specific stuff we need to create a new yang model that defines new identityrefs - like snabbLink. |
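As an illustration of the mapping referenced above: the YANG JSON draft represents 64-bit integer types as JSON strings, because many JSON parsers read numbers as IEEE doubles and lose precision above 2^53. The sketch below shows a counter emitted as a decimal string rather than hex. The helper names are hypothetical and this is not what the PR currently does.

```python
import json

UINT64_MAX = 2**64 - 1


def encode_counter64(value: int) -> str:
    """Encode a counter64 (uint64) value as a decimal JSON string."""
    if not 0 <= value <= UINT64_MAX:
        raise ValueError("counter64 value out of range: %r" % (value,))
    return str(value)


def dump_statistics(stats: dict) -> str:
    """Serialize a name -> counter mapping with every value as a decimal string."""
    encoded = {name: encode_counter64(value) for name, value in stats.items()}
    return json.dumps(encoded, sort_keys=True)
```

For example, `dump_statistics({"in-octets": 2**53 + 1})` yields `{"in-octets": "9007199254740993"}`, a value that would not survive a round trip through a double.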
@lukego: Makes sense to me, sure. Integrating all of this does have some unknowns, but sure. @plajjan I can see the value of a JSON serialization when talking to a tool that expects the standard JSON YANG mapping and which also has the schema on-hand. Just dumping JSON without a schema is likely to run into problems as you note: #696 (comment). But given that JSON is a fair bit of ceremony to parse, that it's hard to give good errors for a YANG JSON production (by the time you are validating the instance data you don't have source locations on-hand), that you have to shove numbers into strings, well at that point given that you have to build some representation of the schema in many YANG-processing places, then you might as well serialize using a nicer language (e.g. the text argument to the Backing up.... it's easy to write out data in JSON. It's not so easy to consume it, when you factor in the types and the need to validate that instance data conforms to a schema. Snabb will need to validate configurations with good messages for the user to indicate when something is wrong with the configuration. I see JSON as being in the way of that goal -- not nice to write by hand, not nice to give errors for, does not support the data types we need. But I think this is more of a discussion for #696. This PR is limited to export of state data. A YANG integration in Snabb will have to import and export configuration data too. |
@wingo Sorry for the confusing/misleading title. This PR is 90% messing with counters and 10% thinking about how they might map to YANG; the JSON part is superficial. |
@wingo I perceived this as a step in the right direction and not necessarily the final solution. I assumed one would have the schema handy but you have probably thought more about the internals of implementing this in Snabb. Right now I can't spare the brain cycles to think about this and come up with any sensible input so I'll just leave it in your capable hands :) IIRC XML (at least XML-RPC) also suffers from lack of large numbers, but maybe I'm thinking of > 64 bits. |
@wingo So again to be clear, you are the lead on all things YANG. Consider Edit: Btw, most of the confusion probably came from me confusing RFC7223 with YANG. I hope I have fixed the terminology in the PR title/text. |
@eugeneia all good by me, and the functionality of this patch is really excellent and quite welcome :) great stuff! |
@@ -116,6 +148,11 @@ function Intel82599:push () | |||
end | |||
end | |||
self.dev:sync_transmit() | |||
if self.dev.txstats then |
How frequently do these counters need to be synchronized with hardware?
This code is synchronizing them every breath but that adds up to a significant number of PCIe accesses even e.g. for NICs that are completely idle. This may cause performance degradation in scenarios that we don't have CI performance coverage on at the moment e.g. app network with very many NICs where most are idle but some are active.
One alternative would be to use a timer to update every e.g. one millisecond.
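A minimal sketch of that timer-based alternative, assuming a hypothetical `sync` callback that performs the expensive register reads. The class name, the injected clock, and the one-millisecond default are illustrative and not part of the Snabb code.

```python
import time


class ThrottledSync:
    """Run an expensive sync callback at most once per interval."""

    def __init__(self, sync, interval=0.001, clock=time.monotonic):
        self.sync = sync          # hypothetical callback that reads the hardware counters
        self.interval = interval  # seconds between syncs (one millisecond by default)
        self.clock = clock        # injectable clock, so the behavior can be tested
        self.next_due = clock()   # the first call syncs immediately

    def maybe_sync(self) -> bool:
        """Call once per breath; returns True only when a sync actually ran."""
        now = self.clock()
        if now < self.next_due:
            return False
        self.sync()
        self.next_due = now + self.interval
        return True
```

Called every breath, `maybe_sync()` costs a clock read and a comparison on most iterations, so an idle NIC no longer generates PCIe accesses on every pass through the loop.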
Makes sense. Honestly, I had (and still have) trouble getting a good understanding of the Intel10G stat registers code. It could definitely use a thorough review with SHM stats in mind (e.g. get rid of Intel10G.dev.get_rx/tx_stats() altogether). In this PR I attempted to make mostly non-disruptive (light bolt-on) changes.
Related to your second point @eugeneia, how close to implementing this are you? If you can afford to wait a month, then I think we will have made some YANG progress in some way and it will be more clear how to represent YANG instance data coming from Snabb. I think that in our output language, we will need to be able to import modules and declare certain properties as being extensions, i.e.
Big-picture-wise we are defining a language whether we want to or not, whether it is built on JSON or otherwise, and it will have a grammar, and we should specify that grammar formally, and it should be extensible, and it should be pleasant to deal with. Could be based on json; I'd go for something more ad-hoc but ymmv. WDYT? |
@wingo I am just brainstorming from the statistics perspective, and write things down that I think might be relevant. I am mostly trying to find out which tasks are based on the Snabb-YANG model and which are independent. E.g. even without having the model defined, we can tell that
I absolutely want to wait for your YANG spec for any mapping to that. At the same time I would like to get other issues out of the way in parallel. My next steps would be to:
Does that make sense to figure out these kinds of things in parallel? Regarding JSON: disregard the mockup in this PR, I have no preference. |
Yeah totes, working in parallel is great. I suspect that given that we need to end up with a coherent whole but that we have different focuses in the near term (you, AFAIU: exposing the RFC 7223 set of counters, me: exposing the YANG counters for the lwAFTR based on an internet draft, in addition to implementing some kind of minimal schema for consuming configuration and state data) that we will develop and figure out in parallel, and then reconcile in a few weeks. As long as we keep in close enough communication we could end up with a nice result. Let me know if I have misunderstood :) Perhaps we should have a tracker bug for conversations, given that you are not on Slack or IRC AFAIU. |
@eugeneia wrt I suggest you do the same. |
@alexandergall But not every interface has a PCI address? E.g. virtual interfaces, tunnel interfaces, ... I think this kind of indexing will have to be addressed in |
Ok, to be more precise then: in SNMP, the object |
I realize now that I had overestimated the statistics provided by the 82599 NIC. There are actually just 16 register “groups” (see Where does that leave us? I am leaning towards exposing the per-PCI-address global statistics in |
option and support injecting a function to determine the current time.
This reverts commit 8bb3215.
# Conflicts: # src/core/counter.lua
@alexandergall I have added per-PCI address statistics SHM counters for Intel10G in f0ed10b, two observations:
|
@eugeneia I can't comment much on the specific issue wrt VMDq stats. From the perspective of monitoring via NC/YANG or SNMP, it seems clear that every interface (irrespective of whether it's hardware or virtual) should be exposed by name/index (the primary handle for monitoring being the name). I'm not sure how useful it is to be able to link a virtual interface to a physical one. If you do have dedicated counters, you expose them through the stats of the virtual interface and if you don't have them, the counters of the physical interface are more or less useless. At least with SNMP, there is a mechanism for this kind of mapping with the I must confess that I didn't follow this PR or #696 closely, so I apologize for not knowing what exactly has been discussed. I would prefer if we had a more generic framework for exposing data for the purpose of monitoring which is not biased towards any particular northbound interface. I'm sure that you can't assume any kind of mapping between YANG models and SNMP MIBs in general. That would make it necessary to have another layer that would transform the raw data to something specific for each protocol. For example, the Also, just getting the fields from RFC7223 wouldn't make me happy either. Not all of Also, I use |
@alexandergall My gripe is not with
This is my point of view as well. I propose that we expose a super-set of RFC7223 counters with some simplifications. Different “readers” of our statistics interface will have to process the data to a specific format such as RFC7223 or ifMIB.
If I implemented the schema described above, and refactored the code in |
@eugeneia I basically agree but I'm not sure about the restriction to I guess the question is related to how we want to organize the |
I don't want to restrict the available shmem data types to
Right now, just like higherLayerIf I suppose.
I am absolutely not set on the layout of |
I agree with that. What I'm saying is that we need something other than
Which is done... how?
Again: we already live in that future if we stop calling everything that fits into a |
Just a related background point: Currently we are allocating each shm object (e.g. counter) on a 4096-byte aligned address. This is not a valid design for x86. The problem is that the low 12 bits of the address will always be zero and this leads to "conflict misses" for Intel's set-associative cache. The CPU cache simply cannot contain many objects that have the same values in their low bits. See e.g. http://danluu.com/3c-conflict/. This has already bitten us in practice. The workaround we made for now is to double-buffer counters so that we access a local cache of the value and then periodically commit that to the 4096-byte aligned shared memory address. This is complex and lacks generality. So we need to come up with a new allocation scheme and ideally make the objects cheap to access by having more entropy in the low bits of their addresses. Simplest to implement might be to assign random padding to the beginning of the objects. However we may alternatively prefer a scheme that allocates many objects on the same page instead. |
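To illustrate the conflict-miss problem described above, the sketch below computes which cache set an address maps to. The geometry is an assumption for illustration (64-byte lines and 64 sets, as in a 32 KiB 8-way L1 data cache), and the staggered layout is a deterministic stand-in for the "random padding" idea, not a proposed allocator.

```python
LINE_SIZE = 64  # bytes per cache line (assumed)
NUM_SETS = 64   # assumed geometry: 32 KiB / (64-byte lines * 8 ways)


def cache_set(address: int) -> int:
    """Return the cache set an address maps to under the assumed geometry."""
    return (address // LINE_SIZE) % NUM_SETS


def distinct_sets(addresses) -> int:
    """Count how many different cache sets a group of objects occupies."""
    return len({cache_set(a) for a in addresses})


# 32 objects, each at the start of its own 4096-byte page.
page_aligned = [n * 4096 for n in range(32)]

# Hypothetical fix: shift each object by a different number of cache lines.
staggered = [n * 4096 + (n % NUM_SETS) * LINE_SIZE for n in range(32)]
```

Under these assumptions `distinct_sets(page_aligned)` is 1, so all 32 objects compete for the 8 ways of a single set, while `distinct_sets(staggered)` is 32.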
Here I attempted to find a common ground between RFC7223 and ifTable MIB: eugeneia/snabb@f4834a5...eugeneia:statistics-superset This takes the SNMP code from
The reason I am reluctant to do this now is because I think requirements will be more clear when @wingo comes up with a model. Can't hurt to get started with what we know though! Are there any data types we need to support besides Right now the directory structure is
E.g. an app The upside of the current layout is that readers of the shm directory can determine the type of a memory mapped object based on the layout. (That was my original motivation for this layout: to avoid having to encode types in the objects or map object names to types.) The downside is that the directory structure is scattered. E.g. not a single directory containing all resources managed by an app. Alternative ideas welcome! |
Something like this would certainly make sense. I don't worry about starting a separate process for this outside the Snabb framework (for my appliance, I would simply add another systemd service).
An array of uint64s might make sense. For example, I'm thinking about exposing the MAC table of the bridge app via the BRIDGE-MIB. Using separate "counters" for this could be excessive for large tables. The number of entries in the table would be determined by the size of the file. Concerning the name "counter": why don't we simply call it uint64 (or just int or something), because that's what it actually is?
A simple alternative would be to add the type as an extension to the object's name. e.g. If we want different views of the same data (per app, per object etc.) we could consider creating a hierarchy in |
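A rough sketch of how a reader of the shm directory could use the two ideas above: recovering the type from an extension on the object's name, and deriving the number of entries in a uint64 array from the file size. The file names and type tags in the usage note are hypothetical, since the thread does not settle on a naming scheme.

```python
def split_typed_name(filename: str):
    """Split 'name.type' into (name, type), using the last dot as separator."""
    name, dot, type_tag = filename.rpartition(".")
    if not dot or not name or not type_tag:
        raise ValueError("no type extension in %r" % (filename,))
    return name, type_tag


def array_length(file_size: int, element_size: int = 8) -> int:
    """Number of entries in a file holding fixed-size elements (uint64 by default)."""
    if file_size % element_size != 0:
        raise ValueError("file size is not a multiple of the element size")
    return file_size // element_size
```

For instance, a hypothetical object named `rxpackets.counter` would split into `("rxpackets", "counter")`, and a 4096-byte file of uint64 values would hold 512 entries.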
The current code (merged in #931) is actually written to run in the same process. I aimed for a drop-in replacement.
Hmm that seems awful specific. I will look into making
From my point of view uint64_t is the data type while
I like that idea! |
Nice. I suppose I would then do something like
before running the engine?
Another approach would be to implement an API for storing a simple number (e.g. the interface administrative status) which doesn't need any of the fancy stuff and specialize for the semantics of a counter where you need it. Probably violates the KISS principle :) Anyway, it's not terribly important and I'll stop arguing about it. |
Exactly. :-) I haven't tested the code but I have been careful about maintaining semantics. |
Unify lwAFTR and other state query around YANG
This picks up on #766 to provide RFC7223 interface statistics as shared memory and updates snabb top to be able to access and format the information. In brief:
- core.app to support app-local counters under counters/$app
- snabb top --counters <app> to list app counters
- snabb top --yang to dump RFC7223 model as JSON, including three interface types: hardware (NICs), virtual (VhostUser), link (links)
Open questions / to-do:
Cc @lukego
Appendix:
YANG/JSON output looks like this: