Alfoa/hybrid model for batching and ensemble model #2322

Open
wants to merge 32 commits into
base: devel

Conversation

alfoa
Collaborator

@alfoa alfoa commented May 18, 2024


Pull Request Description

What issue does this change request address? (Use "#" before the issue to link it, i.e., #42.)

Closes #2321

What are the significant changes in functionality due to this change request?

This development addresses two issues:

  • LogicalModel/HybridModel can now be used within an EnsembleModel
  • LogicalModel/HybridModel can now be used with Samplers/Optimizers that work using the batching strategy (e.g. GeneticAlgorithm).

In addition, a utility factory has been added to the JobHandler to collect and record the job IDs used during the simulation. For now it serves as an error checker, but in the future it could serve as a base class for identifier creation.
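
As an illustration of the job ID bookkeeping described above, here is a minimal sketch of a registry that records identifiers and flags reuse. The class name `JobIdRegistry` and its methods are hypothetical and are not taken from the JobHandler code in this PR; the sketch only shows the error-checking idea.

```python
class JobIdRegistry:
  """Hypothetical registry: records job identifiers and rejects duplicates."""

  def __init__(self):
    # identifiers already used during the simulation
    self._used = set()

  def register(self, jobId):
    """Record jobId, raising an error if it was already used."""
    if jobId in self._used:
      raise ValueError(f"Job identifier '{jobId}' has already been used")
    self._used.add(jobId)
    return jobId

  def __contains__(self, jobId):
    return jobId in self._used

  def __len__(self):
    return len(self._used)


registry = JobIdRegistry()
registry.register("sampler_1")
registry.register("sampler_2")
try:
  registry.register("sampler_1")  # reused identifier
  duplicateRejected = False
except ValueError:
  duplicateRejected = True
```

Keeping the used identifiers in a set makes the duplicate check independent of how many jobs have been submitted.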


For Change Control Board: Change Request Review

The following review must be completed by an authorized member of the Change Control Board.

  • 1. Review all computer code.
  • 2. If any changes occur to the input syntax, there must be an accompanying change to the user manual and xsd schema. If the input syntax change deprecates existing input files, a conversion script needs to be added (see Conversion Scripts).
  • 3. Make sure the Python code and commenting standards are respected (camelBack, etc.) - See on the wiki for details.
  • 4. Automated Tests should pass, including run_tests, pylint, manual building and xsd tests. If there are changes to Simulation.py or JobHandler.py the qsub tests must pass.
  • 5. If significant functionality is added, there must be tests added to check this. Tests should cover all possible options. Multiple short tests are preferred over one large test. If new development on the internal JobHandler parallel system is performed, a cluster test must be added that sets the <internalParallel> node to True in the XML block.
  • 6. If the change modifies or adds a requirement or a requirement based test case, the Change Control Board's Chair or designee also needs to approve the change. The requirements and the requirements test shall be in sync.
  • 7. The merge request must reference an issue. If the issue is closed, the issue close checklist shall be done.
  • 8. If an analytic test is changed/added, is the analytic documentation updated/added?
  • 9. If any test used as a basis for documentation examples (currently found in raven/tests/framework/user_guide and raven/docs/workshop) has been changed, the associated documentation must be reviewed to ensure the text matches the example.

ravenframework/JobHandler.py (outdated, resolved)
alfoa added 3 commits May 17, 2024 22:02
…github.com:idaholab/raven into alfoa/hybrid_model_for_batching_and_ensemble_model
@@ -136,7 +136,7 @@ def localInputAndChecks(self,xmlNode):
self.raiseAnError(IOError, "Input XML node for Model" + modelName +" has not been inputted!")
if len(self.modelsInputDictionary[modelName].values()) > allowedEntriesLen:
self.raiseAnError(IOError, "TargetEvaluation, Input and metadataToTransfer XML blocks are the only XML sub-blocks allowed!")
- if child.attrib['type'].strip() == "Code":
+ if child.attrib['type'].strip() in ["Code", 'HybridModel', 'LogicalModel']:
Collaborator Author

very ugly :(

Contributor

It would be slightly less ugly if you used a set:

       if child.attrib['type'].strip() in {"Code", 'HybridModel', 'LogicalModel'}:

Collaborator Author

done
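
A minimal sketch of the set-based check suggested in this thread. The constant `CODE_LIKE_TYPES` and the helper `needsWorkingDir` are hypothetical names used for illustration, not the code merged in the PR.

```python
# Model types that, per the diff above, trigger creation of a working directory.
# A frozenset makes the intent explicit (fixed collection, membership test only).
CODE_LIKE_TYPES = frozenset({"Code", "HybridModel", "LogicalModel"})

def needsWorkingDir(typeAttribute):
  """Return True if the XML 'type' attribute names a code-like model."""
  return typeAttribute.strip() in CODE_LIKE_TYPES
```

Naming the collection once also avoids repeating the literal list if the same check is needed elsewhere.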

@@ -267,6 +268,12 @@ def initialize(self,runInfo,inputs,initDict=None):

# initialize model
self.modelsDictionary[modelName]['Instance'].initialize(runInfo,inputInstancesForModel,initDict)
if modelType in ['HybridModel', 'LogicalModel']:
Collaborator Author

maybe should be replaced by issubclass(self.modelsDictionary[modelName]['Instance'], HybridModelBase)?

Collaborator

I think your suggestion will be better here.

Collaborator Author

done
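
A minimal sketch of the type-based check discussed in this thread. Since the `'Instance'` entry presumably holds an object rather than a class, `isinstance` is the applicable built-in here; `issubclass` raises a `TypeError` when its first argument is not a class. The classes below are stand-ins defined only for this example and do not reproduce the RAVEN class hierarchy.

```python
class HybridModelBase:
  """Stand-in base class for this example only."""

class LogicalModel(HybridModelBase):
  """Stand-in derived class."""

class ExternalModel:
  """Stand-in for a model unrelated to HybridModelBase."""

def isHybridLike(instance):
  """Check the object's class hierarchy instead of comparing type-name strings."""
  return isinstance(instance, HybridModelBase)

logical = LogicalModel()
external = ExternalModel()

try:
  issubclass(logical, HybridModelBase)  # first argument is an instance, not a class
  issubclassAcceptedInstance = True
except TypeError:
  issubclassAcceptedInstance = False
```

With this form, any future class derived from the base is picked up without extending a list of type names.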

ravenframework/Models/Model.py (outdated, resolved)
alfoa added 3 commits May 20, 2024 08:59
Merge branch 'alfoa/hybrid_model_for_batching_and_ensemble_model' of github.com:idaholab/raven into alfoa/hybrid_model_for_batching_and_ensemble_model
ravenframework/JobHandler.py (outdated, resolved)
ravenframework/JobHandler.py (outdated, resolved)
@@ -136,7 +136,7 @@ def localInputAndChecks(self,xmlNode):
self.raiseAnError(IOError, "Input XML node for Model" + modelName +" has not been inputted!")
if len(self.modelsInputDictionary[modelName].values()) > allowedEntriesLen:
self.raiseAnError(IOError, "TargetEvaluation, Input and metadataToTransfer XML blocks are the only XML sub-blocks allowed!")
- if child.attrib['type'].strip() == "Code":
+ if child.attrib['type'].strip() in ["Code", 'HybridModel', 'LogicalModel']:
self.createWorkingDir = True
Collaborator Author

It is ugly and in addition this creates a sub-working directory even if, for example, the Logical/hybrid models do not use a Code. In case of HybridModel/LogicalModel using only ExternalModels/ROMs, the subdirectory is created but will stay empty. Not very elegant. @joshua-cogliati-inl @wangcj05 any ideas on how to improve this?

Collaborator Author

@wangcj05 @joshua-cogliati-inl any ideas for this? I cannot find a better solution at this stage

Collaborator

Can we have a class attribute to indicate whether there is a Code associated with the Model? For example, in the Hybrid/Logical/Ensemble Model, we define a self._isCodeAvail and set it to True when we detect a Code in the Model. @alfoa

Contributor

@joshua-cogliati-inl joshua-cogliati-inl Jul 29, 2024

Hm, I am trying to fully understand why you need to create the directory? Just to check, it is needed if the Logical/hybrid models use a Code? (Congjian's idea sounds reasonable)

Collaborator Author

yes that's why. If there is a Code in the underlying Logical/Hybrid model (contained in the ensemble model) the subfolder is required.

Collaborator Author

@wangcj05 @joshua-cogliati-inl can you tell me exactly how you would like that flag to be coded? I prefer not to make a code design decision on my own that might need to be changed later.

Collaborator

@wangcj05 @joshua-cogliati-inl FYI: if you can send feedback by tomorrow, I can try to address it before leaving on Friday. Otherwise it will need to wait until September.
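
One possible shape for the flag proposed in this thread, as a minimal sketch. Only the attribute name `_isCodeAvail` comes from the suggestion above; the `Model` class, its constructor, and `needsWorkingDir` are hypothetical and do not reflect the implementation that was eventually committed.

```python
class Model:
  """Hypothetical stand-in for a model that may wrap other models."""

  def __init__(self, modelType, subModels=()):
    self.type = modelType
    self.subModels = list(subModels)
    # True if this model is a Code, or if any wrapped model (at any depth) is one
    self._isCodeAvail = modelType == "Code" or any(m._isCodeAvail for m in self.subModels)

def needsWorkingDir(model):
  """Create a sub-working directory only when a Code is actually involved."""
  return model._isCodeAvail

# HybridModel using only ROMs/ExternalModels: no directory needed
hybridNoCode = Model("HybridModel", [Model("ROM"), Model("ExternalModel")])
# LogicalModel that selects among models including a Code: directory needed
logicalWithCode = Model("LogicalModel", [Model("Code"), Model("ROM")])
# EnsembleModel containing both of the above
ensemble = Model("EnsembleModel", [hybridNoCode, logicalWithCode])
```

Because the flag is computed from the wrapped models, an ensemble inherits it from a nested Logical/Hybrid model without listing type names explicitly.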

@moosebuild

Job Mingw Test on ab91508 : invalidated by @joshua-cogliati-inl

restarted civet

@moosebuild

Job Precheck on 5230636 : invalidated by @alfoa

…rashes if there are variables that are not called to be optimized but are strings
@moosebuild

Job Mingw Test on b25a176 : invalidated by @joshua-cogliati-inl

failed scripts/establish_conda_env.sh: line 153: 21634 Segmentation fault

Comment on lines 574 to 579
# start the job handler
self.localPollingThread = InterruptibleThread(target=self.localJobHandler.startLoop)
self.localPollingThread.daemon = True
self.localPollingThread.start()
# register function to kill the thread at the end of the execution
atexit.register(self.localPollingThread.kill)
Contributor

Hm, pollingThread (both here and in Simulation.py) possibly should be inside JobHandler. That said, this is just mimicking how Simulation does this, so I am not sure if this needs to be fixed now.

Contributor

Also, can the localPollingThread be killed sooner, rather than waiting for the program to exit?

Collaborator

@alfoa Could you address Josh's comment here.

Collaborator Author

added in a dedicated function that is executed at the end of the step
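
To illustrate stopping a polling thread at the end of a step rather than at interpreter exit, here is a minimal sketch. It uses the standard library's `threading.Event` as a cooperative stop signal, which is a swapped-in technique: it does not reproduce RAVEN's `InterruptibleThread.kill` or the dedicated function mentioned above. `StoppablePoller` and `runStep` are hypothetical names.

```python
import threading
import time

class StoppablePoller:
  """Hypothetical polling loop that can be stopped on request."""

  def __init__(self, pollInterval=0.01):
    self._stopEvent = threading.Event()
    self._interval = pollInterval
    self._thread = threading.Thread(target=self._loop, daemon=True)

  def start(self):
    self._thread.start()

  def _loop(self):
    # poll until asked to stop; wait() returns early once the event is set
    while not self._stopEvent.is_set():
      self._stopEvent.wait(self._interval)

  def stop(self, timeout=1.0):
    """Signal the loop to end and wait for the thread to finish."""
    self._stopEvent.set()
    self._thread.join(timeout)
    return not self._thread.is_alive()

def runStep(poller):
  """Run a step and always stop the poller when the step ends."""
  poller.start()
  try:
    time.sleep(0.05)  # stand-in for the actual work of the step
  finally:
    stopped = poller.stop()
  return stopped

poller = StoppablePoller()
stoppedAtEndOfStep = runStep(poller)
```

Stopping in a `finally` block releases the thread even if the step raises, so cleanup does not depend on `atexit`.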

Collaborator

@wangcj05 wangcj05 left a comment

@alfoa I have some general comments for you to consider.

nPoints = preVolumeCalc.attrib.get("nPoints")
if nPoints is not None:
self.nVolumePoints = utils.intConversion(utils.floatConversion(nPoints))
print(nPoints)
Collaborator

remove?

Collaborator Author

removed

ravenframework/JobHandler.py (resolved)
ravenframework/Models/Code.py (resolved)
Comment on lines 251 to 253
if not isThereACode:
isThereACode = modelType == 'Code'

Collaborator

I would suggest we add these lines in Logical model and Hybrid model.

Collaborator Author

I addressed this as well.

@alfoa
Collaborator Author

alfoa commented Aug 27, 2024

@wangcj05 @joshua-cogliati-inl I addressed all the comments.

Collaborator

@wangcj05 wangcj05 left a comment

changes look good.

@joshua-cogliati-inl Do you have other comments?

Development

Successfully merging this pull request may close these issues.

[DEFECT] LogicalModel/HybridModel cannot be used in an EnsembleModel
5 participants