-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
zebra: giant zapi cleanup #1828
Conversation
Need to leave |
💚 Basic BGPD CI results: SUCCESS, 0 tests failedResults table
For details, please contact louberger |
💚 Basic BGPD CI results: SUCCESS, 0 tests failedResults table
For details, please contact louberger |
Continuous Integration Result: SUCCESSFULCongratulations, this patch passed basic tests Tested-by: NetDEF / OpenSourceRouting.org CI System CI System Testrun URL: https://ci1.netdef.org/browse/FRR-FRRPULLREQ-2747/ This is a comment from an EXPERIMENTAL automated CI system. Warnings Generated during build:Checkout code: Successful with additional warnings:
CLANG Static Analyzer Summary
No Changes in Static Analysis warnings compared to base19 Static Analyzer issues remaining.See details at |
Deleted spurious comment generated by a broken rebase. |
Continuous Integration Result: SUCCESSFULCongratulations, this patch passed basic tests Tested-by: NetDEF / OpenSourceRouting.org CI System CI System Testrun URL: https://ci1.netdef.org/browse/FRR-FRRPULLREQ-2749/ This is a comment from an EXPERIMENTAL automated CI system. Warnings Generated during build:Checkout code: Successful with additional warnings:
CLANG Static Analyzer Summary
No Changes in Static Analysis warnings compared to base19 Static Analyzer issues remaining.See details at |
Continuous Integration Result: SUCCESSFULCongratulations, this patch passed basic tests Tested-by: NetDEF / OpenSourceRouting.org CI System CI System Testrun URL: https://ci1.netdef.org/browse/FRR-FRRPULLREQ-2748/ This is a comment from an EXPERIMENTAL automated CI system. Warnings Generated during build:Checkout code: Successful with additional warnings:
CLANG Static Analyzer Summary
No Changes in Static Analysis warnings compared to base19 Static Analyzer issues remaining.See details at |
💚 Basic BGPD CI results: SUCCESS, 0 tests failedResults table
For details, please contact louberger |
zebra/zserv.c
Outdated
@@ -2569,11 +2569,32 @@ static void zread_vrf_label(struct zserv *client, | |||
return; | |||
} | |||
|
|||
/* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is zclient_read_header?
To clarify I would prefer this is a library function that provides the struct zmsghdr *
instead of this in zserv.c because I think all protocols can and should take advanatage of this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense
zebra/zserv.c
Outdated
return; | ||
} | ||
|
||
while (l < length) { | ||
while (l < hdr->length) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as a question -> the decode function ( that I commented on in the previous commit ) does not take into account subtraction of the zapi header. Is this going to be a problem here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, the handler functions expect the stream pointer to be at the first byte after the header.
zserv_process_messages
resets the stream pointer and then extracts the header before calling down to the handlers, so they get the offset they expect. The length field of the header doesn't include the header itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did you subtract it out because it typically does include the length of the header (see the lib/zclient.c code )
Rebase |
💚 Basic BGPD CI results: SUCCESS, 0 tests failedResults table
For details, please contact louberger |
💚 Basic BGPD CI results: SUCCESS, 0 tests failedResults table
For details, please contact louberger |
Great work. While reviewing this, I found out that the NetDEF CI is stuck in an infinite loop inside one topotest (
|
The problem seems to be in the @@ -1089,7 +1089,7 @@ void zlog_hexdump(const void *mem, unsigned int len)
unsigned long i = 0;
unsigned int j = 0;
unsigned int columns = 8;
- char buf[(len * 4) + ((len / 4) * 20) + 30];
+ char buf[BUFSIZ];
char *s = buf;
for (i = 0; i < len + ((len % columns) ? (columns - len % columns) : 0); As odd as it looks, variable length arrays are valid in C99. Changing the size of the array to However my LDP test topology is still not working because zebra is somehow droping the routes received from ospfd and ospf6d. I'll keep investigating. |
lib/zclient.c
Outdated
STREAM_GETW(zmsg, hdr->length); | ||
STREAM_GETC(zmsg, hdr->marker); | ||
STREAM_GETC(zmsg, hdr->version); | ||
STREAM_GETC(zmsg, hdr->vrf_id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to be STREAM_GETL()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fantastic work, I'll merge it once CI passes (please see my inline comments as well).
One thing I noticed is that we now need to allocate a new stream
structure every time we send/receive a packet to/from a client. But I believe this overhead shouldn't be noticeable even under big loads. I think it's a price we pay to allow the implementation of a ZAPI I/O thread in the future, so IMO it's a good trade-off.
zebra/zserv.c
Outdated
struct zmsghdr hdr; | ||
ssize_t nb; | ||
bool hdrvalid; | ||
char errmsg[256]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I think you could declare these four variables inside the while
loop below to reduce their scope.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fair enough
const char *emsg = "Message has corrupt header"; | ||
zserv_log_message(emsg, msg, NULL); | ||
} | ||
if (!hdrvalid) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that here and in several other places you're using this idiom to reduce one level of code indentation:
if (a && debug) {
zlog(...);
}
if (a) {
// process error
}
Instead of doing this:
if (a) {
if (debug) {
zlog(...);
}
// process error
}
I'm not sure if I like this. I think the second option reads better and is easier to follow (performance-wise they should be the same when using modern compilers). Reducing indentation levels usually improves code readability, but I don't see this happening here (my personal opinion).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I tried it the second way and it really crams it in there to fit under 80 characters. I would like to simplify this code some more so there's less indentation but I thought it was more readable the way I have it. I agree it's less than ideal and our ideas of 'readable' differ, I don't care too much either way and will change it back.
@@ -87,7 +89,7 @@ static int relay_response_back(struct zserv *zserv) | |||
|
|||
/* send response back */ | |||
stream_copy(dst, src); | |||
ret = writen(zserv->sock, dst->data, stream_get_endp(dst)); | |||
ret = writen(zserv->sock, src->data, stream_get_endp(src)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
was this change intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was at a previous version of this PR, not anymore. As I read it there's no difference either way. I can back it out if you like.
/* process commands */ | ||
zserv_handle_commands(client, &hdr, msg, zvrf); | ||
|
||
} while (msg); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In zebra_client_read()
we read up to zebrad.packets_to_process
packets at time. Wouldn't it be a good idea to have a limit like that here as well? I think it will be necessary especially when the ZAPI I/O is segregated into another thread reading packets non-stop. And instead of zebrad.packets_to_process
we could try using thread_should_yield()
but I'm not sure if it would work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are correct, I have code for this on my multithreaded ZAPI branch. I can pull it into this PR if you think it belongs here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, adding a limit later together with the MT ZAPI code sounds reasonable to me.
💚 Basic BGPD CI results: SUCCESS, 0 tests failedResults table
For details, please contact louberger |
Continuous Integration Result: FAILEDSee below for issues. This is a comment from an EXPERIMENTAL automated CI system. Get source and apply patch from patchwork: SuccessfulBuilding Stage: SuccessfulBasic Tests: FailedUbuntu 14.04 deb pkg check: Successful IPv4 ldp protocol on Ubuntu 16.04: FailedRFC Compliance Test ANVL-LDP-1.24 failing: RFC Compliance Test ANVL-LDP-26.8 failing: Topology tests on Ubuntu 16.04 amd64: FailedTopology Test Results are at https://ci1.netdef.org/browse/FRR-FRRPULLREQ-TOPOU1604-2826/test Topology Tests failed for Topology tests on Ubuntu 16.04 amd64:
see full log at https://ci1.netdef.org/browse/FRR-FRRPULLREQ-2826/artifact/TOPOU1604/ErrorLog/log_topotests.txt IPv4 protocols on Ubuntu 14.04: FailedRFC Compliance Test ANVL-OSPF-3.1 failing: RFC Compliance Test ANVL-OSPF-31.2 failing: RFC Compliance Test ANVL-BGP4-1.3 failing: RFC Compliance Test ANVL-BGP4-1.4 failing: RFC Compliance Test ANVL-BGP4-4.1 failing: Topotest tests on Ubuntu 16.04 i386: FailedTopology Test Results are at https://ci1.netdef.org/browse/FRR-FRRPULLREQ-TOPOI386-2826/test Topology Tests failed for Topotest tests on Ubuntu 16.04 i386:
see full log at https://ci1.netdef.org/browse/FRR-FRRPULLREQ-2826/artifact/TOPOI386/ErrorLog/log_topotests.txt IPv6 protocols on Ubuntu 14.04: FailedRFC Compliance Test ANVL-RIPNG-1.4 failing: RFC Compliance Test ANVL-RIPNG-3.2 failing: RFC Compliance Test ANVL-OSPFV3-16.6 failing: RFC Compliance Test ANVL-BGPPLUS-AS4-4.1 failing: Topology Tests memory analysis: https://ci1.netdef.org/browse/FRR-FRRPULLREQ-2826/artifact/TOPOU1604/MemoryLeaks/
|
@rwestphal Fixed a couple of the nits you mentioned. In |
💚 Basic BGPD CI results: SUCCESS, 0 tests failedResults table
For details, please contact louberger |
Continuous Integration Result: FAILEDSee below for issues. This is a comment from an EXPERIMENTAL automated CI system. Get source and apply patch from patchwork: SuccessfulBuilding Stage: SuccessfulBasic Tests: FailedFedora 24 rpm pkg check: Successful Topology tests on Ubuntu 16.04 amd64: FailedTopology Test Results are at https://ci1.netdef.org/browse/FRR-FRRPULLREQ-TOPOU1604-2829/test Topology Tests failed for Topology tests on Ubuntu 16.04 amd64:
see full log at https://ci1.netdef.org/browse/FRR-FRRPULLREQ-2829/artifact/TOPOU1604/ErrorLog/log_topotests.txt IPv4 protocols on Ubuntu 14.04: FailedRFC Compliance Test ANVL-OSPF-3.1 failing: RFC Compliance Test ANVL-OSPF-31.2 failing: RFC Compliance Test ANVL-BGP4-1.3 failing: RFC Compliance Test ANVL-BGP4-1.4 failing: RFC Compliance Test ANVL-BGP4-4.1 failing: IPv4 ldp protocol on Ubuntu 16.04: FailedRFC Compliance Test ANVL-LDP-1.24 failing: RFC Compliance Test ANVL-LDP-26.8 failing: Topotest tests on Ubuntu 16.04 i386: FailedTopology Test Results are at https://ci1.netdef.org/browse/FRR-FRRPULLREQ-TOPOI386-2829/test Topology Tests failed for Topotest tests on Ubuntu 16.04 i386:
see full log at https://ci1.netdef.org/browse/FRR-FRRPULLREQ-2829/artifact/TOPOI386/ErrorLog/log_topotests.txt IPv6 protocols on Ubuntu 14.04: FailedRFC Compliance Test ANVL-RIPNG-1.4 failing: RFC Compliance Test ANVL-RIPNG-3.2 failing: RFC Compliance Test ANVL-OSPFV3-16.6 failing: RFC Compliance Test ANVL-BGPPLUS-AS4-4.1 failing: Topology Tests memory analysis: https://ci1.netdef.org/browse/FRR-FRRPULLREQ-2829/artifact/TOPOU1604/MemoryLeaks/
|
* Allocate correct amount of memory * Use snprintf() instead of sprintf() Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Just tests zlog_hexdump right now Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
All of the ZAPI message handlers return an integer that means different things to each of them, but nobody ever reads these integers, so this is technical debt that we can just eliminate outright. Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Formalize the ZAPI header by documenting it in code and providing it to message handlers free of charge to reduce complexity. Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
A lot of the handler functions that are called directly from the ZAPI input processing code take different argument sets where they don't need to. These functions are called from only one place and all have the same fundamental information available to them to do their work. There is no need to specialize what information is passed to them; it is cleaner and easier to understand when they all accept the same base set of information and extract what they need inline. Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Group send and receive functions together, change handlers to take a message instead of looking at ->ibuf and ->obuf, allow zebra to read multiple packets off the wire at a time. Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Nobody uses it, but it's got the same definition. Move the parser function into zclient.c and use it. Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
* Get correct data size when parsing VRF ids * Move some vars into smaller scope Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Rebased zapi-cleanup, needs a bit of poking. Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Continuous Integration Result: FAILEDSee below for issues. This is a comment from an EXPERIMENTAL automated CI system. Get source and apply patch from patchwork: SuccessfulBuilding Stage: SuccessfulBasic Tests: FailedAddresssanitizer topotest: Successful Topotest tests on Ubuntu 16.04 i386: FailedTopology Test Results are at https://ci1.netdef.org/browse/FRR-FRRPULLREQ-TOPOI386-2853/test Topology Tests failed for Topotest tests on Ubuntu 16.04 i386:
see full log at https://ci1.netdef.org/browse/FRR-FRRPULLREQ-2853/artifact/TOPOI386/ErrorLog/log_topotests.txt Topology Tests memory analysis: https://ci1.netdef.org/browse/FRR-FRRPULLREQ-2853/artifact/TOPOI386/MemoryLeaks/Warnings Generated during build:Checkout code: Successful with additional warnings:Topotest tests on Ubuntu 16.04 i386: FailedTopology Test Results are at https://ci1.netdef.org/browse/FRR-FRRPULLREQ-TOPOI386-2853/test Topology Tests failed for Topotest tests on Ubuntu 16.04 i386:
see full log at https://ci1.netdef.org/browse/FRR-FRRPULLREQ-2853/artifact/TOPOI386/ErrorLog/log_topotests.txt
CLANG Static Analyzer Summary
No Changes in Static Analysis warnings compared to base19 Static Analyzer issues remaining.See details at |
Continuous Integration Result: SUCCESSFULCongratulations, this patch passed basic tests Tested-by: NetDEF / OpenSourceRouting.org CI System CI System Testrun URL: https://ci1.netdef.org/browse/FRR-FRRPULLREQ-2853/ This is a comment from an EXPERIMENTAL automated CI system. Warnings Generated during build:Checkout code: Successful with additional warnings:
CLANG Static Analyzer Summary
No Changes in Static Analysis warnings compared to base19 Static Analyzer issues remaining.See details at |
Standardize arguments for ZAPI message handlers
A lot of the handler functions that are called directly from the ZAPI
input processing code take different argument sets where they don't need
to. These functions are called from only one place and all have the same
fundamental information available to them to do their work. There is no
need to specialize what information is passed to them; it is cleaner and
easier to understand when they all accept the same base set of
information and extract what they need inline.
Eliminate the humongous switch-case in the dispatcher
After doing the above work it becomes possible to eliminate the big
switch-case statement and turn it into a lookup table keyed by ZAPI
message code. Granted the compiler usually optimizes switch-case
statements into jump tables anyway but it is better to make this a
strong guarantee by doing it in the source.
Allow batching ZAPI I/O
ZAPI's original I/O loop looks like:
zserv_read
:zserv_read
This has been changed to:
zserv_handle_messages
:zserv_read
:zserv_handle_messages
zserv_read
This is more flexible as it decouples I/O from actual state changes inside
Zebra. Theoretically it also helps cache churn by grouping the same tasks
together instead of alternating between them.
Modify all handlers to stop accessing global client buffers
There is no reason for message handlers to look at a global data
structure that holds the last message received. Instead we can reduce
the amount of global state they need to know about by making them accept
a message stream, and not care about where it came from. This is also
necessary for the input queue changes above. It also moves us closer to
being able to unit test ZAPI message handlers.
Modify all handlers to not return an status code
All of the ZAPI message handlers return an integer that means different
things to each of them, but nobody ever reads these integers, so this is
technical debt that we can just eliminate outright.
Rename some functions to
zread_*
orzsend_*
to clarify actionzserv.c functions are roughly organized into three kinds of functions:
equivalents
These are variously named with prefixes like "zserv", "zebra", "zread",
"zsend", etc. Organize them by using consistent naming prefixes;
zserv_encode_*
,zsend_*
,zread_*
respectively for the three types givenabove.
Organize
zserv.c
into logical sectionsGroup the above functions together by type, and group the plumbing for
the I/O code together as well.
Add
struct zmsghdr
Formalize the ZAPI header by documenting it in code and providing it to
message handlers free of charge to reduce complexity.
Clean up zserv_read
Just some simplifications of the existing code intended to make it
easier to read, understand and debug. Also improved debugging output for
ZAPI messages by leveraging zmsghdr and making log messages consistent:
Signed-off-by: Quentin Young qlyoung@cumulusnetworks.com