Question on undefined behavior as branch condition #891
My understanding of the spec is that either

control ingress(inout Headers hdr, ...) {
    apply {
        hdr.eth_hdr.src_addr = 1;
    }
}

or

control ingress(inout Headers hdr, ...) {
    apply {
        hdr.eth_hdr.src_addr = 2;
    }
}

would be something a compiler could produce, but not the version you posted. |
I agree with @jnfoster |
The P4_16 language specification goes to some lengths to avoid undefined behavior that can occur in C/C++. For example, it defines bit-for-bit what the result of signed integer addition/subtraction is, even in conditions of overflow, as Java does. For reading variables that have not been initialized, or fields of headers that are currently invalid, I believe most (or perhaps all?) of what the specification has to say on the behavior is in Section 8.22 "Reading uninitialized values and writing fields of invalid headers": https://p4.org/p4-spec/docs/P4-16-v1.2.1.html#sec-uninitialized-values-and-writing-invalid-headers As an example illustrating one of the things it says there, which is that reading an uninitialized variable multiple times might result in getting back a different value, note that this program:
would be legal to implement as the following program, because the first read of the uninitialized variable
|
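The point about repeated reads can be made concrete with a small executable model. This is a Python sketch, not P4, and it is not the program from the comment above. The names `read_uninitialized` and `two_reads` are invented here, the 2-bit width is an assumption chosen to keep the enumeration small, and each read of an uninitialized variable is modeled as an independent arbitrary choice, which is how Section 8.22 is read in this thread.

```python
from itertools import product

WIDTH = 2  # assumed tiny bit width so every value can be enumerated


def read_uninitialized(width=WIDTH):
    """All values a single read of an uninitialized bit<width> may return."""
    return range(2 ** width)


def two_reads(width=WIDTH):
    """All (first, second) pairs for two reads of the same uninitialized variable.

    Each read is modeled as an independent choice, so the pairs form the full
    cross product rather than only the diagonal.
    """
    return set(product(read_uninitialized(width), repeat=2))


pairs = two_reads()
differing = {(a, b) for a, b in pairs if a != b}
print(len(pairs), len(differing))
```

Under this model most of the possible pairs have two different values, so a compiler that serves the two reads from different sources stays within the allowed outcomes.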
Today we don't treat undefined values as constants. |
@mbudiu-vmw It seems to me that it would be incorrect for a compiler to transform This makes me wonder whether this aspect of the specification undercuts assumptions one often uses to reason about code transformations. One usually assumes that duplicating a side-effect-free expression and evaluating it repeatedly returns the same result each time, but the P4_16 spec explicitly allows implementations to evaluate an uninitialized variable differently on different occurrences in the source code. For example, it seems that the localCopyPropagation pass, by duplicating occurrences of expressions, might produce programs that, according to the spec, have more possible final states than the original program did, if some of the variables read in those expressions are uninitialized? |
This is getting into esoteric details, I know, and I will try to refrain from introducing more problems than I solve, but this example seems scary to me. Consider programs 1 and 2 below, which to just about anyone look like they should have the same behavior:
According to my understanding of the spec, program 1 can execute the bodies of both if statements, because we read Program 2 cannot execute the bodies of both if statements, because |
I agree with @jafingerhut's analysis of the last example... it's weird, but it's what we got. |
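The difference between the two readings can be enumerated directly. The following Python sketch is a hypothetical stand-in for the two programs, not a reconstruction of them: it assumes two consecutive if statements that test `x == 1` and `x != 1` on a 2-bit variable, and it compares a semantics where each read of `x` is independent with one where `x` holds a single fixed but unknown value. All function names are invented for illustration.

```python
from itertools import product

VALUES = range(4)  # assumed 2-bit variable, small enough to enumerate


def bodies_run(first_read, second_read):
    """Which if-bodies execute for `if (x == 1) {A}  if (x != 1) {B}`."""
    ran = set()
    if first_read == 1:
        ran.add("A")
    if second_read != 1:
        ran.add("B")
    return frozenset(ran)


def per_read_semantics():
    """x is uninitialized: each condition reads an independent arbitrary value."""
    return {bodies_run(a, b) for a, b in product(VALUES, repeat=2)}


def single_value_semantics():
    """x holds one fixed (if unknown) value that both conditions observe."""
    return {bodies_run(v, v) for v in VALUES}


both = frozenset({"A", "B"})
print(both in per_read_semantics())      # True
print(both in single_value_semantics())  # False
```

With independent reads, any combination of the two bodies can run, including both or neither. With a single fixed value, exactly one body runs.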
Interesting discussion, thank you for all the comments. @jafingerhut

control ingress(inout Headers h, inout Meta m, inout standard_metadata_t sm) {
    bit<8> undef_0;
    apply {
        if (undef_0 == 8w1) {
            h.eth_hdr.src_addr = 48w1;
        }
        if (undef_0 != 8w1) {
            h.eth_hdr.eth_type = 16w2;
        }
    }
}

in the mid end. Is this a problem? I am (obviously) wondering about this from the angle of equality checks. It is becoming quite unclear to me at what point a program with undefined behavior is equal to its transformed version and when it is not. Or rather, what the expectation of a programmer may be. Based on the discussion I can see that

bit<8> undef;
if (undef == 1) {
    hdr.eth_hdr.src_addr = 1;
} else {
    hdr.eth_hdr.src_addr = 2;
}

can be transformed into just one assignment of |
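For the if/else on `undef`, the set of outcomes the source program allows can be listed exhaustively. The Python sketch below is a model under stated assumptions, not compiler code: `final_src_addr`, `ALLOWED`, `UNCHANGED`, and `is_allowed` are invented names, and a "compiled program" is reduced to the single value it leaves in `hdr.eth_hdr.src_addr`.

```python
def final_src_addr(read_value):
    """Value of hdr.eth_hdr.src_addr after the if/else for one read of undef."""
    return 1 if read_value == 1 else 2


# Every outcome the source program allows, over all 8-bit values of undef.
ALLOWED = {final_src_addr(v) for v in range(2 ** 8)}

UNCHANGED = "unchanged"  # marker for a compilation that drops the write


def is_allowed(candidate_outcome):
    """True if a compiled program that always yields this outcome stays in ALLOWED."""
    return candidate_outcome in ALLOWED


print(sorted(ALLOWED))
print(is_allowed(1), is_allowed(2), is_allowed(UNCHANGED))
```

In this model, a compiler that always assigns 1 or always assigns 2 picks one of the allowed outcomes, while a no-op that leaves the field untouched does not, which matches the earlier answer that either constant assignment is acceptable.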
I am curious myself to hear from other compiler experts whether: Note that these transformations seem correct to me if no variables are uninitialized, or at least if none of the uninitialized ones would increase the number of possible behaviors of the transformed code. I see at least the following two alternatives:
(1) Change p4c so that it never makes such transformations, or at least never when they increase the number of possible behaviors of the program.
(2) Explicitly allow in the spec that a compiler can make such transformations, so that the behavior of code reading uninitialized variables is truly "anything goes".
At least, I do not see how to limit the behaviors to less than "anything goes" without making pretty tight restrictions like those described in (1). Thoughts? |
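One way to phrase "increases the number of possible behaviors" is as a comparison of sets of final states. The Python sketch below is illustrative only: `compare` and its three labels are invented here, the 2-bit width is an assumption, and so is the premise that copying an uninitialized value into `tmp` pins a single value for later reads of `tmp`.

```python
from itertools import product

VALUES = range(4)  # assumed 2-bit variable


def compare(original, transformed):
    """Classify a transformation by comparing sets of possible final states."""
    if transformed == original:
        return "equivalent"
    if transformed < original:
        return "refines"   # strictly fewer behaviors, all of them allowed
    return "expands"       # introduces at least one behavior the original lacked


# Original: tmp = undef; a = tmp; b = tmp;  (assumption: one read, so a == b)
original = {(v, v) for v in VALUES}
# After a copy-propagation-like rewrite: a = undef; b = undef;  (two reads)
duplicated = set(product(VALUES, repeat=2))
# A rewrite that pins undef to 0: a = 0; b = 0;
pinned = {(0, 0)}

print(compare(original, duplicated))
print(compare(original, pinned))
```

In these terms, alternative (1) would reject any rewrite classified as "expands", and alternative (2) would accept it whenever an uninitialized read is involved.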
(a) No (1) This is the "right" answer
|
There are two kinds of "undefined":
|
@fruffy I do not know what others think, but Mihai's comment about relative priority of these kinds of issues, versus compiler bugs that are unrelated to uninitialized variables made me think of it. Perhaps one suggestion for your work is that if p4c gives a warning about the use of a variable that might be uninitialized, and if p4c then transforms the input program into something that looks incorrect for the reasons related to that variable being used uninitialized, consider setting those programs aside for a while, or just skip them completely and look for others that have problems when no such warnings are given?

Rationale: If p4c currently detects that a variable might be used uninitialized, then it seems likely that one could enhance p4c passes some day to take that information into account. If p4c bugs are found where no such warnings are issued, or the bugs seem unrelated to the use of uninitialized values, then they are higher priority.

FYI: I have added some examples and discussion about issues around uninitialized variables here: https://github.com/jafingerhut/p4-guide/blob/master/formal-verification/README.md |
@jafingerhut To handle cases such as issue 2470 in p4c I recheck whether a given violation may have been caused by a change to undefined behavior (e.g., a refinement). If that is the case, the violation is classified as "undefined violation". However, that recheck requires a precise model of what constitutes a refinement and what does not in order to avoid false negatives. For now I will go with a more lenient model for undefined behavior that still reliably catches well-defined violations. However, it may miss some of the cases discussed in this thread or in your examples. |
I have a question about undefined behavior in branch conditions in P4. Assuming I have a block like this.
To what extent can the compiler transform this expression? Is turning the entire statement into a no-op possible?
The C++ standard says that if undefined behavior is on the path, anything goes. Is the same true for P4?