-
Notifications
You must be signed in to change notification settings - Fork 301
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement shiftleft/shiftright by moving packet around #1032
Conversation
Instead of memmoving a packet's data around this patch instead moves around the packet pointer itself. The only performance testing I did was "snabb lwaftr bench", where it appears to be neutral-to-slight-win. I am hoping that the Hydra/R scientists can help me out on this one. This change to the packet structure does have a knock-on effect, in two significant ways. One is that the shiftleft, shiftright, and prepend functions now return a new packet pointer and silently corrupt the old one (because probably they shifted around the data but still had to update the new "length" pointer). That's somewhat OK though. The problem comes in the datagram library, which sometimes wants to work on a user-supplied packet. The sequence was: local dgram = datagram:new(pkt) dgram:push(....) dgram:free() ... foo(pkt) This pattern is no longer valid :( The reason is that the datagram methods that mutate a packet often do so by packet.prepend(). The datagram library updates its internal packet pointer, but the external one that was initially passed in is no longer valid. So much of this patch is visiting (hopefully) all of the uses/users and updating them.
Exciting! |
@wingo How about if I setup a Hydra job that benchmarks branch |
(btw @domenkozar and others are working on extending Hydra to be able to automatically test Github PRs. This will be nice.) |
@lukego done :) i assume you want it named "optimize" just so that I can test other things by force-pushing to that branch? Sounds good to me :) We definitely need to throw some science at this one! |
Just for the record, Hydra+Github PRs integration prototype is done, but depends on NixOS/nixpkgs#19396 |
Groovy. I created jobset snabb-new-tests/optimize-murren that tests many configurations (all dpdk/qemu/etc) but only a small number of iterations per scenario (3). This should provide a solid picture of the overall impact of the change but we have to be cautious about slicing-and-dicing too much (the jobset testing next does 10x more iterations). Have to revise the R reports soon. I would particularly like to slice-and-decide datasets into groups using tools that avoid false-positives like xkcd 882 by correcting the confidence level based on how much the data is being "tortured." Visualizations are weak in this respect - easy to see patterns in randomness - and we probably need to include e.g. properly adjusted confidence intervals. |
@domenkozar any idea on this evaluation error? https://hydra.snabb.co/jobset/snabb-new-tests/optimize-murren#tabs-errors |
These are harmless, they are placeholder values for nixpkgs arhitecture to know what packages to build. I'll remove them in master :) |
The first report is up. I know that reading these things is a bit of an art, but it looks like the change is basically neutral and does not cause any performance regression there. This was tested on the murren servers without a hardware NIC. I suppose it makes sense to also test on the lugano servers that do have hardware NICs in case, for example, the alignment of memory has implications for DMA. (On reflection I believe that the 82599 requires DMA to be aligned to even-numbered addresses. If that is right then we might need to hold that invariant for |
@lukego Addressed in snabblab/snabblab-nixos@9b05d04 |
Added minimum packet alignment and updated the wingo/optimize head too. |
Report with minimum alignment will be here: https://hydra.snabb.co/build/658140 |
My conclusion is that perf-wise this doesn't regress any benchmark. The open question is its effect on lib.protocol.datagram et al (cc @alexandergall). Are we OK with these changes? If so let's merge. If not let's talk :) The context is that I'd like to get the lwaftr merged back to master early this week, and this is one of the few changes we made to core code. |
This is looking nice! Minor nit: your ' -- No headroom for the shift; re-set the headroom to the default.' comment is slightly stale with the latest change. It's now no headroom or a shift by an odd number of bytes. |
Any chance of a Like:
... and checking that free is putting the packets back into a canonical form i.e. resetting the headroom? |
Good idea @lukego and nice catch @kbara, can do. What do you think of the datagram changes tho Luke? Any out-of-tree uses of the datagram library will be silently broken (like @alexandergall's vpn). MHO: breaking datagram users is too bad but it is the right thing to do as this ultimately makes encap and decap operations like vlan tagging much cheaper, we should just own that decision. |
For my part I am fine with the datagram library changes. Causing breakage downstream is indeed regrettable. I would like to solve this by bringing more code (and tests and benchmarks) upstream so that we can take care of it while making core changes like this one. (I think this is the normal "selfish" reason that people upstream their code i.e. to share the responsibility for keeping it working.) |
Hydra run here: https://hydra.snabb.co/build/679729 |
Well goll darnit it looks like that last change tanked performance. Will give it another go. Sadness. |
Actually the commit seemed to tank correctness, not performance :) I was thinking that since |
perf to be here: https://hydra.snabb.co/build/682109 |
OK all good now, ptal @lukego |
LGTM! |
thanks :) I will merge to wingo-next. |
Looks worse on DPDK + configuration test 2.1.0 noind, which looks bimodal. Probably noise. |
@kbara yeah hard to read that one indeed. the lower mode does seems to have lower probability tho |
relatedly: I recently came across the term researcher degrees of freedom for situations where the person interpreting the data can inadvertently make uncontrolled decisions that lead to seeing things that seem more significant than they are. I believe that I have created exactly such a situation with this version of the R report that serves up a menu of sliced-and-diced graphs to choose from. Just by picking the most interesting graph from the report we are already on the way to seeing patterns in randomness. Have to think about a better presentation... |
Obligatory: http://www.xkcd.com/1725/ |
I think the current presentation is ok - we just need to bear in mind that if a couple of results look like outliers, they might well be noise, especially if they don't persist between runs. I see a lot of value in visualizing fine-grained data, seeing where bimodal results occur, etc. No matter how we display it, we have to think about it, test whether it's reproducible at times, etc. Even boiling all of the output down to one number and no graphs wouldn't shield us from misinterpreting significance. |
Fix lj_auditlog double-inclusion and accidental globals
Instead of memmoving a packet's data around this PR instead moves
around the packet pointer itself.
The only performance testing I did was "snabb lwaftr bench", where it
appears to be neutral-to-slight-win (relative to the previous strategy we had, in which we manually managed a headroom and data pointer per packet). I am hoping that the Hydra/R
scientists can help me out on this one.
This change to the packet structure does have a knock-on effect, in
two significant ways. One is that the shiftleft, shiftright, and
prepend functions now return a new packet pointer and silently corrupt
the old one (because probably they shifted around the data but still
had to update the new "length" pointer). That's somewhat OK though.
The problem comes in the datagram library, which sometimes wants to
work on a user-supplied packet. The sequence was:
local dgram = datagram:new(pkt)
dgram:push(....)
dgram:free()
...
foo(pkt)
This pattern is no longer valid :( The reason is that the datagram
methods that mutate a packet often do so by packet.prepend(). The
datagram library updates its internal packet pointer, but the external
one that was initially passed in is no longer valid. So much of this
patch is visiting (hopefully) all of the uses/users and updating them.
See also Igalia#407 and Igalia#481.
Note that this PR does not update the virtio code to use the new interfaces. (In the lwaftr branch, we use direct descriptors and push/pop the virtq tags using this mechanism.)