
Add new AutoScaler and CustomScalerBase classes #1429

Draft · wants to merge 63 commits into base: main
Conversation

@andrewlee94 (Member) commented Jun 4, 2024

Depends on #1436

Summary/Motivation:

This PR adds a draft of the new Scaler classes along with some general utility functions for the new scaling interface. I have generally created new functions even if older ones existed to make a clean break from the older API and assist with backward compatibility.

Draft Documentation: Initial Outline of Documentation for Scaling Tools.docx

Changes proposed in this PR:

  • New ScalerBase class
  • New AutoScaler class
  • New CustomScalerBase class
  • Utility methods for manipulating scaling suffixes
  • Demonstration tests of new methods on Gibbs reactor

Legal Acknowledgement

By contributing to this software project, I agree to the following terms and conditions for my contribution:

  1. I agree my contributions are submitted under the license terms described in the LICENSE.txt file at the top level of this directory.
  2. I represent I am authorized to make the contributions and grant the license. If my employer has rights to intellectual property that includes these contributions, I represent that I have received permission to make contributions and grant the required license on behalf of that employer.

@andrewlee94 andrewlee94 self-assigned this Jun 4, 2024
idaes/core/scaling/autoscaling.py (outdated, resolved)
Comment on lines +242 to +249
jac, nlp = get_jacobian(blk, scaled=True)

if con_list is None:
con_list = nlp.get_pyomo_equality_constraints()
Contributor

In my experience, get_jacobian is slow because forming the Pynumero NLP is slow. If we have flowsheet-level variables, we end up getting a Jacobian for the flowsheet and every subcomponent. I'm not sure how slow forming the NLP is going to be for PropertyBlock 43 in the MEA column, but you're still at least doubling the work you're doing.

I would suggest getting the Jacobian and NLP once at the beginning of constraints_by_jacobian_norm, then just referring to rows and columns of the overall object as you want to scale constraints.
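A rough sketch of that pattern, using a scipy sparse matrix as a stand-in for the full-model Jacobian (the constraint names and row mapping below are made up for illustration, not the PR's API):

```python
import numpy as np
from scipy.sparse import csr_matrix

# Stand-in for the Jacobian returned by a single get_jacobian() call
# on the top-level block
jac = csr_matrix(np.array([[4.0, 0.0, 1.0],
                           [0.0, 3.0, 0.0],
                           [1.0, 1.0, 1.0]]))

# Hypothetical mapping of each constraint to its row index in the full Jacobian
row_of = {"c1": 0, "c2": 1, "c3": 2}

def scaling_factor(con_name):
    # Slice one row of the precomputed Jacobian rather than
    # rebuilding the NLP for each constraint or sub-block
    row = jac.getrow(row_of[con_name])
    nrm = np.sqrt(row.multiply(row).sum())
    return 1.0 / nrm if nrm > 0 else 1.0
```

The key point is that the expensive step (forming the NLP and Jacobian) happens exactly once; everything afterwards is cheap sparse-matrix slicing.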

Member Author

I have been thinking about the best way to do this, and realized I was probably getting the Jacobian more times than necessary. I was not aware that having higher-level components in the problem causes the Jacobian to be built for the entire problem (and given how properties work, that will be an issue). I am also thinking there may need to be some checks to deal with deactivated components (and it might be possible to deactivate unnecessary components before getting the Jacobian to reduce the problem size).

Member Author

I think I've fixed this so this method will only be called once for any call to the scaling method.

constraints_by_jacobian_norm (line 156) collects all the constraints we want to scale and the top level block, and then only calls the autoscaler once.

# Use scipy to get all the norms
# Should be more efficient than iterating in Python
axis = 1  # Could make this an argument to also support variable-based norm scaling
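To illustrate the idea (a sketch, not the PR's actual implementation): with scipy, axis=1 yields one norm per Jacobian row (constraint), and the reciprocal of each norm is a natural scaling factor:

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import norm as spnorm

# Toy Jacobian: 3 constraints (rows) x 2 variables (columns)
jac = csr_matrix(np.array([[10.0, 0.00],
                           [0.0, 0.01],
                           [2.0, 2.00]]))

row_norms = spnorm(jac, axis=1)  # 2-norm of each row, computed inside scipy
factors = 1.0 / row_norms        # one scaling factor per constraint

# Applying the factors makes every row of the scaled Jacobian unit-norm
scaled = csr_matrix(np.diag(factors)) @ jac
```

With axis=0 instead, the same call would produce per-column (variable) norms, which is what the comment in the snippet alludes to.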
Contributor

Based on your comment, should we consider adding another method for variable-based norm scaling?

Member Author

I do not know if this makes sense or not - comments would be welcome (@dallan-keylogic ).

Contributor

If we scale variables first based on order-of-magnitude, we can then scale constraints based on row/column norm. However, norm-based variable scaling probably won't be useful for a fully unscaled model, because the most common Jacobian entries will be O(1).

For a partially-scaled model, though, filling in variable scaling factors based on column norms might be useful.
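A sketch of that idea (hypothetical, not code from the PR): column norms of the Jacobian give one value per variable, which could fill in scaling factors only where none is already set:

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import norm as spnorm

# Toy Jacobian for a partially-scaled model: 2 constraints x 3 variables
jac = csr_matrix(np.array([[100.0, 1.0, 0.0],
                           [50.0, 2.0, 4.0]]))

col_norms = spnorm(jac, axis=0)  # one 2-norm per column (variable)

existing = {0: 0.5}  # hypothetical: variable 0 already has a user-set factor
# Fill in column-norm-based factors only for variables without one
factors = {j: existing.get(j, 1.0 / col_norms[j]) for j in range(jac.shape[1])}
```

This mirrors the suggestion: existing factors win, and norm-based values only fill the gaps.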



@document_kwargs_from_configdict(CONFIG)
class AutoScaler:
Contributor

It'd be nice to have a method like report_scaling that would give you vars and constraints in one column, scaling factor values in the 2nd column, and the scaled value in the third column.

Member Author

The first two you can get by doing suffix.display(). The third column is the tricky part: a scaled value is easy enough for Vars, but what is the "scaled value" for a Constraint? The scaled residual is easy to get, but not always that meaningful (the scaled Jacobian is often more informative).
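For illustration, a minimal version of such a report, with plain dicts standing in for the scaling Suffix and current variable values (the names and layout here are assumptions, not the PR's API):

```python
# Hypothetical three-column report: component, scaling factor, scaled value.
# factors stands in for the scaling_factor Suffix; values for current var values.
def report_scaling(factors, values):
    rows = ["{:<20}{:>12}{:>12}".format("component", "factor", "scaled")]
    for name, sf in factors.items():
        rows.append("{:<20}{:>12.4g}{:>12.4g}".format(name, sf, values[name] * sf))
    return "\n".join(rows)

factors = {"flow_mol": 1e-2, "temperature": 1e-3}
values = {"flow_mol": 250.0, "temperature": 350.0}
```

For constraints, the "scaled" column would need a convention (e.g. scaled residual), which is exactly the open question raised above.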

Comment on lines 478 to 481
assert jacobian_cond(methane, scaled=True) == pytest.approx(9191, abs=1)

# Check for optimal solution
assert check_optimal_termination(results)
assert len(extreme_jacobian_rows(methane, scaled=True)) == 0
assert len(extreme_jacobian_columns(methane, scaled=True)) == 0
Contributor

Actually, adding to my other suggestion on thinking about a reporting method for Autoscaler, these would also be good to show in report output (e.g., condition number, extreme jac rows/columns or at least the number of each).

I realize that the DiagnosticsToolbox already has the capability to report on some of these things, but just thinking it'd be nice to be able to apply scaling via AutoScaler and subsequently check what you really did to model scaling with a report method.

Member Author

That is what the Diagnostics tools are for - putting those methods here would just be duplication of code (and the diagnostics also tell you a lot more). The eventual documentation will highlight that you really need to use diagnostics, scaling and initialization tools together to get the best results.

Contributor

Will the diagnostics toolbox reference these new tools?

Member Author

@agarciadiego I am not sure there - we probably need to think about how and where we draw the line between the two. Up until now, the Diagnostics Toolbox has been about finding the issues, and hasn't said much about how to fix them (as that is not so easy to explain concisely). We can definitely mention it in the docs, and maybe we could have the display methods mention tools to help fix issues (where appropriate).

@ksbeattie ksbeattie added the Priority:Normal Normal Priority Issue or PR label Jun 6, 2024
@MarcusHolly (Contributor) left a comment

I spent some time trying to apply the autoscalers to the BSM2 flowsheet in WaterTAP, but I think that flowsheet is too much of a monstrosity to naively expect the autoscalers to resolve all the problems. That being said, I didn't have much time to test out the other tools extensively and I'll be out for the next two weeks.

idaes/core/scaling/custom_scaler_base.py (outdated, resolved)
idaes/core/scaling/custom_scaler_base.py (outdated, resolved)
idaes/core/scaling/scaler_profiling.py (outdated, resolved)
Co-authored-by: MarcusHolly <96305519+MarcusHolly@users.noreply.github.com>
@ksbeattie ksbeattie added Priority:High High Priority Issue or PR and removed Priority:Normal Normal Priority Issue or PR labels Aug 8, 2024
@ksbeattie (Member)

Making this high priority so it gets the attention from others that is needed.

@andrewlee94 (Member Author)

Some initial documentation for the code:

Initial Scaling Routine Profiling.docx

model.fs.unit.outlet.temperature[0].fix(2844.38)
model.fs.unit.deltaP.fix(0)

from_json(model, fname=fname, wts=StoreSpec.value())
Contributor

What is the purpose of this json file? If I wanted to use the scaling profiler on a flowsheet, would I need to generate a similar json?

Member Author

That is for test purposes, so that we can load a solution into the model without having to spend time initializing it first. You could replace that with a standard initialization and solve, but it would take longer to run.
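The same pattern in miniature, with a plain dict standing in for the model state (this is only an analogue of what to_json/from_json with StoreSpec.value() do for Pyomo variable values, not the IDAES API itself):

```python
import json
import os
import tempfile

# Persist only variable values (as StoreSpec.value() does) so a solved
# state can be reloaded later without re-running initialization and solve
state = {"fs.unit.outlet.temperature[0]": 2844.38, "fs.unit.deltaP": 0.0}

fname = os.path.join(tempfile.mkdtemp(), "solved_state.json")
with open(fname, "w") as f:
    json.dump(state, f)

with open(fname) as f:
    reloaded = json.load(f)
```

For your own flowsheet you would generate the file once (solve, then dump the state) and reuse it in subsequent profiling runs.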

Labels: Priority:High High Priority Issue or PR
7 participants