Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PFPO is never setting the CC (and is never setting GR1 either) #527

Open
Fish-Git opened this issue Nov 29, 2022 · 7 comments
Open

PFPO is never setting the CC (and is never setting GR1 either) #527

Fish-Git opened this issue Nov 29, 2022 · 7 comments
Assignees
Labels
BUG The issue describes likely incorrect product functionality that likely needs corrected. HELP! Help is needed from someone more experienced or I'm simply overloaded with too much work right now! Ongoing Issue is long-term. Variant of IN PROGRESS: it's being worked on but maybe not at this exact moment.

Comments

@Fish-Git
Copy link
Member

The title says it all:

The PFPO instruction appears to never be setting the Condition Code, as well as appears to never be placing any type of Return Code value into General Register 1 either.

Bob's code is very hard to understand and follow (at least for me anyway) due not only to his coding style, but also due to the inherent complexity of Floating-Point concepts in general.

My mind just has a hard time wrapping itself around floating-point in general, but has an especially hard time following Bob's code too.

I could use some help on this one folks!  :-(

@Fish-Git Fish-Git added BUG The issue describes likely incorrect product functionality that likely needs corrected. HELP! Help is needed from someone more experienced or I'm simply overloaded with too much work right now! labels Nov 29, 2022
@srorso srorso self-assigned this Nov 29, 2022
@srorso
Copy link
Contributor

srorso commented Nov 29, 2022

Let me take a look at it. Give me a couple of days.

@Fish-Git
Copy link
Member Author

Fish-Git commented Nov 29, 2022

Let me take a look at it. Give me a couple of days.

Thanks, Steve! (@srorso)

MUCH appreciated.

You might want to do another git pull before you start. I just made some cosmetic tweaks to Bob's code (just comments) to make it a little easier(?) to understand his code. (Trust me, they're very minor and 100% cosmetic)

I'm going to continue looking at it myself in the hope that it will eventually sink in. If I have any "Ah-HA!" moments I'll be sure to run them by you for confirmation before doing anything.

(The endianness issue was easy compared to this one! His code is weird!)

@srorso
Copy link
Contributor

srorso commented Dec 2, 2022

I swore on my wife's ashes that I would not drop acid anymore. And here I am doing floating point and reaching for Windowpane. Sorry dear.

OK.

Fish, your comment additions are helpful and on point. The code in pfpo.c is OK I guess. It does not help that the subject matter, floating point, is by itself complex and, as a bonus, cryptic. It gets worse when you code it (see reference to mind-altering substances above). Floating point representations are finite approximations of values that have no limit on precision. Contrary to most things in computers, operations on floating point variables introduce error into the result; that error is rooted in the unavoidable approximation. All of the rounding controls and exception processing are indented to make the errors visible and controllable. In some fortunate cases, enough control is available to cause unavoidable errors to cancel each other out.

More comments in the code would greatly improve its maintainability, especially by programmers more resolute in their avoidance of certain substances.

I see where condition code is set in regs->psw.cc when the test mode bit is set or when either the source or target number formats are invalid.

And I see that the cc is not set after the operation, nor is the return code set in GR1 bits 32-63 (bits 0-31 are not changed by PFPO, so pfpo.c should not change them either).

Here is what should be done right now:

  • After line 2469, add code to set psw.cc to the value of variable cc, and set bits 32-63 of GR1 to zero. All better.

But wait, that can't be it, you think...or say. You'd be right. But the present coding allows no more than the above.

Here are things that PFPO requires that need to be added to pfpo.c:

  1. Conversions between precisions with the same source and target format, for example BFP Short to BFP Long, are not supported by PFPO. In test mode, condition code 3 should be returned. In "live" mode, a specification exception should be raised. File pfpo.c issues the specification exception for same-format source and target regardless of whether test mode has been requested.

This one is pretty easy. To correct, the code at lines 2304-2311 should be replace the code at 2431-2433.

  1. Nontrappable IEEE overflow and underflow should return condition code 1. Condition code 1 is never set in the program.

Adding this involves changes to a modest number of functions. Non-trivial, more research would be needed to provide details.

  1. Trappable IEEE overflow and underflow should generate a scaled result, with the scaling factor returned in GR1. If the Alternate Exception Action has been specified in GR0, condition code 2 should be set, otherwise a Program Interruption Data Exception should be taken. Even if the interruption occurs, the operation completes, GR0 is updated with a scale factor, and FPR4, and FPR6 if needed, are updated.

Scaled results are a bear. Scale factors are specific to target format and precision. A formula can be found on p.9-39 of PoO -13, but that's useless. Better to get the target- and precision-specific factors from PoO Figure 19-9 part 2 on p. 19-9, and Figure 20-6 on pp.20-12 to 20-14, for BFP and DFP, respectively.

BFP generates scaled results when trappable IEEE overflow or underflow occur. Adding this to pfpo.c involves changes to several functions and requires communication of the chosen scaling factor back to the function perform_floating_point_operation() function so that GR1 can carry that back to the program that executed the PFPO instruction.

  1. Hex floating point needs to be reviewed for appropriate handling of overflow and underflow. I haven't done that. Given that things need to be addressed in BFP and DFP, I suspect the same things need to be addressed in HFP. But because HFP predates the IEEE standards for floating point, it is its own thing. I cannot put on my IEEE-854 hat and make intelligent conversation about HFP. Heck, I have a tough enough time with BFP.

  2. A more comprehensive test suite is needed. The test case should generate the following verified outputs:

  • The converted result, compared in binary to the result obtained from a real system. Conversion of test sources from decimal representation or test results to decimal representation makes them easy to read but in many cases introduces error that invalidates the comparison.

  • The contents of FPR4 and FPR6. Both registers need to be checked for all operations; FPR6 should not be disturbed by short or long target precision.

  • The contents of the Floating Point Control Register

  • The contents of GR1, the return code. Usually zero, it contains the scale factor applied to a scaled result.

  • The condition code

  • The Data Exception Code (DXC) if the instruction trapped or should have trapped but used Alternate Exception Action.

  • The fact of a program interruption data exception when the Alternate Exception Action bit is zero and the result should have trapped.

Testing pfpo.c is a non-trivial task. A given input value will generate:

  • 6 different results (three precisions in each of two target formats),
  • 8 rounding modes specified in GR0
  • 5 rounding modes when GR0 references the FPC rounding mode and target is BFP
  • 8 rounding modes when GR0 references the FPC rounding mode and target is DFP
  • 3 tests for non-trap, trap, and trap with alternate exception handling.

Test results generated for each target

  • 3 x (8 + 5) x 3 results for BFP targets, or 117 results
  • 3 x (8 + 8) x 3 results for DFP targets, or 144 results
  • 3 x 8 x 3 results for HFP targets, or 72 results

First term: number of target precisions. Second term: number and source of rounding modes, GR0 or FPC. Third term: non-trap, trap, trap with alternate action. (PFPO generates very different results for non-trap versus either form of trap when IEEE-Overflow or IEEE-Underflow occurs.)

So one HFP source in one precision can, for thorough testing, generate 261 test cases. A DFP source in one precision can create 189 results, and a BFP source in one precision can generate 216 results. Just converting the value "+1.0" from nine source formats and precisions to all available target precisions will generate:

  • 3 DFP inputs: 567 result values
  • 3 BFP inputs: 648 result values
  • 3 HFP inputs: 783 result values

For a grand total of 1,998 results. Call it 2K. Not bad for one number, and not a very interesting one. But runtest is not up to the task.

I've been working on this but nowhere near something that is better than runtest for large-scale result tests.

@Fish-Git
Copy link
Member Author

Fish-Git commented Dec 2, 2022

Here is what should be done right now:

  • After line 2469, add code to set psw.cc to the value of variable cc,

That's what I was thinking of doing too. But of course looking at his code, 1 doesn't appear to ever be returned, which is wrong. Under certain conditions, PFPO is supposed to set CC1. That's one of the things I was having trouble with figuring out: where in Bob's code should he be returning 1 so that CC1 is properly returned by the instruction?

and set bits 32-63 of GR1 to zero.

Another thing I was also thinking of doing. But once again, always setting it to zero seems wrong. According to the manual, it's, I think, supposed to be the "Signed Scaling Exponent" value (whatever the heck that is), which I'm presuming just so happens, in most circumstances, to usually be zero.

Nevertheless, since it seems you're just as baffled by Bob's code as I am, I will make the suggested changes. It's probably the best we can hope for at the moment. We'll just have to document in the Release Notes that the PFPO instruction is known to not operate correctly in all circumstances. I don't like having to do that, but I think it's the best we can hope for given the circumstances.  :(

(I sure wish Bob (@rwoodpd) was more active as a developer and would jump in and make the needed changes to his own code for us! Bob? Where are you buddy?)

Here are things that PFPO requires that need to be added to pfpo.c:

  1. Conversions between precisions with the same source and target format, for example BFP Short to BFP Long, are not supported by PFPO. In test mode, condition code 3 should be returned. In "live" mode, a specification exception should be raised. File pfpo.c issues the specification exception for same-format source and target regardless of whether test mode has been requested.

This one is pretty easy. To correct, the code at lines 2304-2311 should be replace the code at 2431-2433.

Well.... close, but not quite.  :)

I understand what you're getting at though, so you're right that this fix is pretty easy. But it's not quite as easy as you make it out to be. The existing code at 2313-2318 that's returning immediately when the test bit is set, needs to have it's own mini-switch statement that simply sets numout = 0 for all the valid cases or numout = -1 for the default case, and then set the condition code to either 0 or 3 accordingly and return.

Otherwise we would have to add an if to each of our switch cases to check if test mode was active or not in order to skip actual processing. (Test mode isn't suppose to do any actual processing. Yes I know that none of the actual processing functions actually modify anything, but why waste the time executing all that code if test mode is set and you don't actually need to?)

In any case another easy fix I will certainly take care of. Thanks.

Scaled results are a bear. Scale factors are specific to target format and precision. A formula can be found on p.9-39 of PoO -13, but that's useless. Better to get the target- and precision-specific factors from PoO Figure 19-9 part 2 on p. 19-9, and Figure 20-6 on pp.20-12 to 20-14, for BFP and DFP, respectively.

I saw those too, but unfortunately they're mostly complete gibberish to me.  :(

  1. ... I cannot put on my IEEE-854 hat and make intelligent conversation about HFP. Heck, I have a tough enough time with BFP.

At least you have a IEEE-854 hat!  ;-)

I don't. And it wouldn't do any good giving one to me either: it wouldn't fit. My head's too small.

  1. A more comprehensive test suite is needed.

Duh! An understatement if I ever heard one.  :)

Testing pfpo.c is a non-trivial task.

Another understatement.  :)

I've been working on this but nowhere near something that is better than runtest for large-scale result tests.

Anything that you (or anyone else for that matter!) can do for us, Steve, would be greatly appreciated!

ANYTHING!

In the mean time I want you to know how very much I (indeed all of us!) appreciate your feedback on this issue, Steve. You're a good man.

I'll go ahead and make the few quick changes you suggested for now, and keep scratching my head over Bob's code. Maybe given enough time I actually might start to understand some of it. Or at least hopefully enough to maybe figure out where cc=1 needs to be returned and/or where the GR1 Return Code value needs to get set.

Take care, Steve, and thanks for the help.

Stay away from the acid.  ;-)

@Fish-Git Fish-Git added the Ongoing Issue is long-term. Variant of IN PROGRESS: it's being worked on but maybe not at this exact moment. label Dec 3, 2022
@srorso
Copy link
Contributor

srorso commented Dec 3, 2022

Well.... close, but not quite. :)

I understand what you're getting at though, so you're right that this fix is pretty easy. But it's not quite as easy as you make it out to be. The existing code at 2313-2318 that's returning immediately when the test bit is set, needs to have it's own mini-switch statement that simply sets numout = 0 for all the valid cases or numout = -1 for the default case, and then set the condition code to either 0 or 3 accordingly and return.

The switch numout is init'd to zero at the beginning of the function and is set positive when a valid combination of source and target formats are encountered. The default branch (invalid code or combination) for each of the three case statements sets it to negative one. I suspect one could just move the if that sets cc = 3 to follow the third case statement and things would improve.

...I've been wrong before, will be again..

@Fish-Git
Copy link
Member Author

Fish-Git commented Dec 3, 2022

The switch numout is init'd to zero at the beginning of the function and is set positive when a valid combination of source and target formats are encountered.

Again, close but no cigar. numout is indeed init'd to zero at the beginning of the function, but it isn't set to a positive value until after the actual conversion function is called in the giant switch statement that's switching based on opcode. It is that giant switch (opcode) switch that is currently the only place where the combination of source and target formats are validated. The preceding switches are only checking the source or target formats individually (switch (ofc1) and switch (ofc2)).

The default branch (invalid code or combination) for each of the three case statements sets it to negative one.

Where are you seeing this?

The default branch in the previous two switches only sets numout to negative one if that individual source or target format is invalid (ofc1 and ofc2), but neither of them validate the combination of the two. The combination of ofc1 and ofc2 is held in opcode, and the only switch that switches based on opcode (and thus the only switch that validates the combination of source and target formats) is the giant "Process their request" switch that I'm trying to avoid executing when in test mode.

And what do you mean by "each of the three case statements"? I'm seeing 9 cases (plus a default case) in each of the first two switches, and 54 cases (plus the default) in the third giant switch.

Are you tripping again? I thought you promised to stay away from that acid shit.  :)

I suspect one could just move the if that sets cc = 3 to follow the third case statement and things would improve.

Nope. The only way to do it that I can see is by adding another switch (opcode) between the first two switches (that are switching based on ofc1 and ofc2) and the giant switch (opcode) switch that processes the actual conversion request. This intermediate switch (opcode) switch would be executed only if test mode was set, and would validate the combination of source and target format without actually calling any of the conversion functions, and then return cc=0 or cc=3 as appropriate.

There are a few other tweaks I want to make related to the handling of the FPC flags defined in esa390.h that aren't specifically related to the PFPO instruction (but which do impact it) as well as one extremely minor fix to the SRNM instruction, but I need to be careful. I don't want to break things. But they're important tweaks that should be made IMHO.

I'll post my final patch here for peer review before committing it just in case my own mind is misbehaving as badly as yours apparently is.  ;-)

(Which wouldn't be the first time for ME either!)

@Fish-Git
Copy link
Member Author

Fish-Git commented Dec 13, 2022

Committed 2e984f9:

Improve PFPO instruction (partial almost fix):

NOTE: it's still not right (i.e. it still contains known bugs), but this is the best we can do for now. The condition code is at least now being set (probably not correctly in all situations though) as well as the Return Code value in GR1 too (same thing: probably not correctly in all situations), but it's an improvement over what we had before which wasn't setting either one!

It's going to take a non-insignificant amount of time and effort to get it completely fixed, but for now this should HOPEFULLY suffice.

I will continue studying Bob's code as time permits to hopefully ONE DAY figure out where CC=1 and GR1 Return Code value needs to be properly set. But don't hold your breath. I consider this to be low priority right now, as there are other more important things I'm working on.

I encourage ALL DEVELOPERS  (as well as any users out there skilled at Floating-Point!) to PLEASE take a crack at trying to help us out!   (Please?)

Thanks.

Will leave GitHub Issue status as "OnGoing"...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
BUG The issue describes likely incorrect product functionality that likely needs corrected. HELP! Help is needed from someone more experienced or I'm simply overloaded with too much work right now! Ongoing Issue is long-term. Variant of IN PROGRESS: it's being worked on but maybe not at this exact moment.
Projects
None yet
Development

No branches or pull requests

2 participants