Infrastructure to import completed calculation jobs #12
Conversation
003_calcjob_immigrant/readme.md
Outdated
* The created calculation job node will have an attribute `immigrated=True`.

### Open questions
* Should an immigrated calculation be considered as a valid node by the caching mechanism?
That's a difficult one... I can see both cases having their use. Of course caching can help to reduce duplicate calculations here - but it can lead to problems if the "input reconstruction" is not perfectly accurate.
Note that the `immigrated=True` attribute would (unless explicitly ignored) prevent caching.
I would therefore definitely be tempted to at least initially not include immigrated calculations in the caching. Once the immigration functionality matures and we become convinced that it would be safe, we can always enable it.
And how about calculations that have been launched based on some immigrated result? Should the `immigrated` tag follow? There might be good reasons to do this in order to disentangle the results after a while to get a clearer view of the provenance. Or should we just rely on utility functions to track down the source and look for these flags, etc.?
I don't think the attribute should propagate - basing a calculation on an immigrated result is similar to e.g. manually creating a `FolderData` with some input. The calculation itself has actually run within AiiDA, with the inputs that were specified.
If one needs to know whether a calculation might have an immigrated ancestor, this should be a separate flag, or solved through a utility function as you mentioned.
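As a rough sketch of the utility-function route mentioned above, assuming the proposed `immigrated` attribute and the standard AiiDA 1.x node API (the helper name is made up):

```python
# Hypothetical helper: walk the provenance graph upwards looking for the proposed
# `immigrated` attribute on any ancestor calculation.
from aiida import orm
from aiida.common.links import LinkType

def has_immigrated_ancestor(node: orm.CalculationNode) -> bool:
    """Return True if ``node`` or any calculation in its ancestry was immigrated."""
    if node.get_attribute('immigrated', False):
        return True
    for data in node.get_incoming(link_type=LinkType.INPUT_CALC).all_nodes():
        creator = data.creator  # the calculation that created this input node, if any
        if creator is not None and has_immigrated_ancestor(creator):
            return True
    return False
```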
I am also not sure about this. As @greschd says, we'll need to adapt the caching code anyway to deal properly with the new attributes.
I'm almost tempted to leave caching active based on the following considerations:
- one could not create the hash by default, effectively disabling caching, and then provide an easy interface to say "please hash"
- most probably the inputs will not be identical to those of a real calculation submitted via AiiDA, because in many cases the function to reconstruct the inputs will have to make small assumptions

On the other hand there is the risk that the reconstruction creates wrong inputs, which are then caught by caching, so I'm not 100% sure.
Maybe there is an easy way to define the defaults of the caching mechanism so as to avoid using caching, and this could be made an option? This should probably rely on the existence of the attribute, so it requires a small adaptation to the caching options.
Anyway, this can be done in a second phase; I don't think it's crucial.
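A possible shape for such a default, sketched only under the assumption that the proposed `immigrated` attribute exists and that an `is_valid_cache`-style hook is available on the process class:

```python
# Sketch only: by default, never use an immigrated node as a cache source.
from aiida.engine import CalcJob

class ImmigrationAwareCalcJob(CalcJob):  # stand-in for wherever this default would live
    @classmethod
    def is_valid_cache(cls, node):
        if node.get_attribute('immigrated', False):
            return False
        return super().is_valid_cache(node)
```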
003_calcjob_immigrant/readme.md
Outdated
* The similarity in launching mechanism for native and immigration calculation jobs may actually be confusing to users. The fact that the only difference is the additional `immigrate_remote_folder` input node might go overlooked.
I actually think this will not be a problem, but I am including it here as it has been brought up as a con of the current design by others in private conversations.
Since one actually has to construct a `RemoteData` and include it in the inputs, I do not think that this happens by accident or to an unsuspecting user.
* By purposefully not providing a basic interface for the conversion of output files to input nodes in `aiida-core`, we run the risk that the various solutions and their interfaces, which will be designed and provided by the plugins, will be wildly disparate.
Do I understand it correctly that we would leave converting the existing folder into inputs up to the user, and they should just pass them as inputs to the calculation? I have a few concerns about this approach:
- How do we deal with inputs that are usually handled not at the plugin level, but at the AiiDA level? For example, should each plugin implement a parser that gets back the command-line arguments and scheduler options?
- It will be difficult to provide unified instructions for immigrating calculations, or even turn it into a simple verdi command.
- What happens to calculations whose reconstructed inputs (for whatever reason) don't pass the input validation? I think in the case of an immigrated calculation, there should be a way to override the validation.
An alternative approach would be having an (optional) method (e.g. `_reconstruct_inputs`) on the `CalcJob`. AiiDA could implement the basic (cmdline, scheduler input) parsing. The plugins can then take this information and turn it into the correct inputs. Of course the disadvantage there is that you can only immigrate when it's implemented by the plugin. A significant advantage is that it's guaranteed that the inputs are actually obtained from the same directory as the outputs.
I'm not sure which of these approaches is better, maybe the best would even be a mix of the two.
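A bare-bones sketch of the optional-method idea described above; the method name is taken from the comment, while everything else (the example class, the classmethod form, the keyword arguments) is an assumption:

```python
# Sketch of the optional hook on a plugin's CalcJob; the parsing body is plugin specific.
from aiida import orm
from aiida.engine import CalcJob

class SomePluginCalculation(CalcJob):  # hypothetical plugin class

    @classmethod
    def _reconstruct_inputs(cls, remote_data: orm.RemoteData, **kwargs) -> dict:
        """Rebuild the input nodes from the files found in ``remote_data``.

        AiiDA could pre-parse the generic pieces (command line, scheduler options)
        and pass them via ``kwargs``; the plugin turns everything into input nodes.
        """
        raise NotImplementedError('plugin-specific parsing goes here')
```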
The problem with leaving the parsing up to AiiDA itself is that I think it is impossible for AiiDA to parse the input nodes themselves, as that is code specific, but even the code-independent stuff like scheduler options and command line parameters, which is written by AiiDA and the scheduler plugin, will be very difficult to reliably parse. This would have been possible if the input files had been written by AiiDA, but that is exactly the point of the immigration process: we are dealing with files written by some other program or even by hand by a human. There is no telling what information will be there and in what form. I don't see how we could possibly implement a parser in `aiida-core` that could retrieve this information in a reliable way. That is why I think we are forced to leave that up to the user. Of course the plugin may simplify this work a lot and provide an easy API to do this. We have already implemented this for `aiida-quantumespresso`. This is what one of my design questions pertains to: how much of this higher-level interface should we formalize in AiiDA?
Aah you're right -- that had completely escaped me.
What is your opinion on being able to import calculations with inputs that wouldn't pass validation? The user could well have run a calculation that is not supported by the plugin - and it could still have value to get the outputs into AiiDA.
A related question is if we want to allow incomplete / missing inputs - if an input parser is not implemented, the end user might decide it's enough to just parse outputs.
> What is your opinion on being able to import calculations with inputs that wouldn't pass validation? The user could well have run a calculation that is not supported by the plugin - and it could still have value to get the outputs into AiiDA.

This is a good question and to be honest I am not sure. Since the plugin will have to provide the input parser, it knows of course what inputs will be valid. It is a bit difficult to imagine what kind of calculations may not be parsed into a valid set of inputs, but of course it is unlikely that this will never occur. However, also here, until we know the frequency of this scenario and the potential effects of allowing the validation to be disabled, we might want to err on the safe side. Once the functionality matures and there are more use cases we could always relax the restrictions.

> A related question is if we want to allow incomplete / missing inputs - if an input parser is not implemented, the end user might decide it's enough to just parse outputs.

This, however, I definitely think is not a good idea. The reason is that we already put strict rules on what can be exported and deleted in the provenance graph. We do not allow exporting/deleting arbitrary subgraphs, as we want the resulting graphs to always obey certain rules. For example, we do not allow deleting only the inputs of a process, because the process and its outputs would be meaningless in terms of provenance. Allowing calculations to be imported without their inputs would of course nullify these constraints.
Yeah, I agree it's a good idea to start with strict validation, and only relax if needed. I think it would make sense to write that explicitly in the AEP.
Yeah, I think @espenfl has a point that this will be quite common.. maybe we could make validation optional in this case? If there is a flag to explicitly disable the validation on immigration it should be enough of a warning, while still allowing the immigration.
Here are my 2 cents:
I would be tempted to say we can (and should) define a common API for the function to reconstruct the inputs.
Standardising it is extremely easy, and you essentially already did it; here is the API:
```python
inputs = get_inputs_from_folder(remote_data)
```
The only thing to do is probably to allow arbitrary kwargs to tune the behaviour, and decide how this should be exposed: as a method of the `CalcJob` as described by @greschd, or e.g. as another class exposed via entry points, so it can be provided by a different plugin package, but sharing the same entry point name (but we should be careful if two people start implementing it in parallel), possibly with a method to get the `ReconstructorClass` from the `CalcJob` class.
Then my suggestion would be to define a "standard" wrapper function in AiiDA that essentially does what you did before (I'm using as an example a factory to get from an entry point, just to make a concrete example, but could be done differently):
```python
from aiida.engine import run
from aiida.plugins import CalculationFactory

def immigrate_calculation(calc_entry_point, remote_data, options=None):
    if options is None:
        options = {}
    CalcJobClass = CalculationFactory(calc_entry_point)
    reconstructor = ReconstructorFactory(calc_entry_point)  # hypothetical factory, see above
    inputs = reconstructor.get_inputs_from_folder(remote_data, **options)
    inputs['immigrate_remote_data'] = remote_data
    results, node = run.get_node(CalcJobClass, **inputs)
    return results, node
```
In this way we can easily e.g. make this into a `verdi` command as @greschd suggested, with very limited effort, and working for any plugin (or at least any that provides a reconstructor).
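For illustration, a hypothetical call to the wrapper sketched above (the option key is made up, although a plugin like `aiida-quantumespresso` would need something along these lines):

```python
# Hypothetical usage of the wrapper above; the option key is an example only.
results, node = immigrate_calculation(
    'quantumespresso.pw',
    remote_data,
    options={'input_file_name': 'pw.in'},
)
```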
I would also be tempted to suggest that people use this function by default, and not the explicit way.
The reason is that I could just do this:
```python
results, node = run.get_node(CalcJobClass, immigrate_remote_data=remote_data)
```
i.e. not pass any other input, and this would work (unless there is validation), but I think we should discourage this by default.
Or, even worse, you could attach completely random inputs (maybe by mistake, let's say you are looping over many calculations to immigrate but you mess up and swap inputs).
Points to discuss:
- Is it ok that this should be provided by the plugin? I think so, and it encourages people to contribute to the main repo. How to recreate the inputs is very much connected to how that specific plugin defined them.
- Should we allow people to go via the `run.get_node` path? Again, I'm a bit hesitant and I would be tempted to say that I don't see a big benefit in using `run_get_node`, while I see the potential problem of inconsistent/incoherent inputs. I would remove this responsibility from the users' hands, and rely on a Reconstructor class released in a plugin package.
- Should we validate? Good question: I would be tempted to say yes, considering the original point of making things as close as possible to as if they were run by AiiDA (and therefore this also includes the code(s)). I see however @espenfl's point. Maybe we should then have a compromise: we add an input flag "dont_validate" or something like this, that allows passing anything as input, but then this is also recorded as a second attribute. If we instead go the `immigrate_calculation` way, this would do the validation (at least by default).
As a note: we should also try to be crystal clear on what 'validation' means here. I think what you are referring to is the validation of the input ports; but there will normally be no check on the actual content of the nodes (e.g. whether the input parameters dict makes sense - I don't even say whether the values are correct, but whether it follows the correct "schema", i.e. if I were to reuse it in a real calculation, at least the calculation would run).
So even 'validated' inputs might be wrong (maybe we should find a more specific name, like `ports_validated=True`, or something even better).
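To make that distinction concrete, a minimal sketch: port validation only checks the node type, so a `Dict` with a nonsensical schema passes just as well as a sensible one (the keys below are examples):

```python
# Port validation checks node types, not content: both Dicts below equally satisfy a
# port declared as `spec.input('parameters', valid_type=orm.Dict)`.
from aiida import orm

plausible = orm.Dict(dict={'CONTROL': {'calculation': 'scf'}})  # sensible-looking schema
nonsense = orm.Dict(dict={'not_a_real_keyword': 42})            # still a perfectly valid Dict node
```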
Thanks for all the comments.
> Yeah, I agree it's a good idea to start with strict validation, and only relax if needed. I think it would make sense to write that explicitly in the AEP.

@greschd I have included this explicitly in the AEP.

> So in many cases users would not even have their input data, or at least not the complete set of it. Any reconstruction would be as pointless as nothing, more or less. It is rather likely that the user would not remember what versions of code (or what steps they did to arrive at the final results) they executed their calculation with, and then, the allowed inputs, defaults etc. might have changed, so without knowing more than just the inputs, also parsing them in a strict manner during immigration might not make sense and give a false sense of provenance.

@espenfl This is a point well taken. I am really torn about what to do here. On the one hand, the entire point of this functionality is to provide value to users, but if they cannot use it in many cases, that would make it rather pointless. That being said, the integrity of the provenance graph is very important in AiiDA. We have spent a lot of effort making sure this is and remains the case, and there are already a lot of limitations in place that forbid arbitrary manipulations of or additions to the provenance graph. Therefore, at least for the initial design and implementation, I would try to err on the safe side and require strict validation. If we see that this really makes things unusable, we can see how we can progressively add options to relax parts of the validation. I have included this sentiment more explicitly in the AEP text.
Thanks @giovannipizzi for your extensive comments. If I can summarize, I think your main objection to the current proposal is that you think we should provide a standardized interface for the class/function that reconstructs inputs from an existing `RemoteFolder`, and that you don't like the users performing the migration by launching the process with the reconstructed inputs themselves directly. I can definitely see the first point and would even agree, but I merely see this as something that could be added later, while we work on the underlying mechanism. That being said, I don't fully understand the second point. It seemed like you didn't like the mechanism of using the same launch system, with the addition of an `immigrate_remote_data` input. But then in your suggestion, you merely say that we should have a wrapper function. If you are really concerned about people directly using the launch method, just providing a "wrapper" function is not going to stop them. That being said, I don't think there is a risk whatsoever. If people use it and accidentally make a mistake, they can simply delete the nodes, no harm done.
But really, I think we are not really disagreeing. Like I said, I think it is good to provide a standardized interface if we can. Nothing in the proposed mechanism prohibits that; it is simply a convenience layer on top. Drawing from suggestions made by Dominik and yourself, I would propose the following. On the `CalcJob` class, we implement the classmethod:
```python
class CalcJob:

    @classmethod
    def immigrator(cls, entry_point_name: str = None) -> CalcJobImmigrator:
        """Load the `CalcJobImmigrator` associated with this `CalcJob` if it exists.

        By default an immigrator with the same entry point as the ``CalcJob`` will be loaded, however, this can be
        overridden using the ``entry_point_name`` argument.

        :param entry_point_name: optional entry point name of a ``CalcJobImmigrator`` to override the default.
        :return: the loaded ``CalcJobImmigrator``.
        :raises: if no immigrator class could be loaded.
        """
        if entry_point_name is None:
            from aiida.plugins.entry_point import get_entry_point_from_class
            _, entry_point_name = get_entry_point_from_class(cls.__module__, cls.__name__)

        return CalcJobImmigratorFactory(entry_point_name)
```
The immigrators would be registered in a new entry point group `aiida.calculations.immigrators`, whose entry points can be loaded through the new factory:
```python
from inspect import isclass

from aiida.common import InvalidEntryPointTypeError
from aiida.plugins import BaseFactory


def CalcJobImmigratorFactory(entry_point_name: str) -> CalcJobImmigrator:
    """Return the ``CalcJobImmigrator`` sub class registered under the given entry point.

    :param entry_point_name: the entry point name.
    :return: the loaded ``CalcJobImmigrator`` class.
    :raises ``aiida.common.InvalidEntryPointTypeError``: if the type of the loaded entry point is invalid.
    """
    immigrator = BaseFactory('aiida.calculations.immigrators', entry_point_name)

    if not isclass(immigrator) or not issubclass(immigrator, CalcJobImmigrator):
        raise InvalidEntryPointTypeError('invalid entry point name.')

    return immigrator
```
Then the final remaining piece of the puzzle is the base class for the `CalcJobImmigrator`. We can include something like the following:
```python
import abc
import typing

from aiida.orm import Dict, Node, RemoteData


class CalcJobImmigrator:

    @abc.abstractmethod
    def parse_remote_data(self, remote_data: RemoteData, **kwargs) -> typing.Dict[str, typing.Union[Node, Dict]]:
        """Parse the input nodes from the files in the provided ``RemoteData``.

        :param remote_data: the remote data node containing the raw input files.
        :param kwargs: additional keyword arguments to control the parsing process.
        :returns: a dictionary with the parsed input nodes that match the input spec of the associated ``CalcJob``.
        """
```
Plugins can implement this for any `CalcJob` class and register it with any entry point name they desire under the `aiida.calculations.immigrators` entry point group. The repository that hosts the actual `CalcJob` can provide an immigrator using the same entry point name, making it easy to obtain the immigrator, as the `CalcJob.immigrator` classmethod will automatically fetch it. However, with this functionality, users can also implement their own immigrator and override the default.
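Purely as an illustration of the plugin side, a hypothetical immigrator subclass (the class name, the parsing logic and the entry point path are all made up):

```python
# Hypothetical plugin-side immigrator, subclassing the base class sketched above.
from aiida import orm

class PwCalculationImmigrator(CalcJobImmigrator):

    def parse_remote_data(self, remote_data: orm.RemoteData, **kwargs):
        """Reconstruct the input nodes of a completed pw.x run found in ``remote_data``."""
        input_file_name = kwargs['input_file_name']  # cannot be guessed reliably, see below
        # ... fetch and parse the raw input file through the computer transport (omitted) ...
        parameters = orm.Dict(dict={'CONTROL': {'calculation': 'scf'}})  # placeholder result
        return {'parameters': parameters}
```

The plugin package would then register it under the new `aiida.calculations.immigrators` group in its packaging metadata, e.g. `"quantumespresso.pw = aiida_quantumespresso.immigrators:PwCalculationImmigrator"` (hypothetical module path).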
Note that this would also allow us to write a generic endpoint in `verdi` as you suggested. Something like:
```
verdi calcjob immigrate REMOTE_DATA ENTRY_POINT_NAME [IMMIGRATOR_ENTRY_POINT_NAME] [OPTIONS]
```
The implementation would be something like:
```python
import click

from aiida.engine import run
from aiida.plugins import CalculationFactory


@click.command()
def calcjob_immigrate(remote_data, entry_point_name, immigrator_entry_point_name, **kwargs):
    """Immigrate a calculation job from a remote data node."""
    cls = CalculationFactory(entry_point_name)
    immigrator = cls.immigrator(immigrator_entry_point_name)
    inputs = immigrator.parse_remote_data(remote_data, **kwargs)
    inputs['immigrate_remote_data'] = remote_data
    results, node = run.get_node(cls, **inputs)
```
@giovannipizzi Thinking about this some more, I don't think it will be (trivially) possible to provide a built-in entry point in `verdi` that can be used to immigrate a calculation of any plugin, even if an immigrator is available. The reason is that each plugin may require additional arguments, in addition to the `remote_data`, that are necessary to reconstruct the inputs. A typical example: for `aiida-quantumespresso` one needs to define the name of the input file. The reason being that the input file can be passed as a command line argument, or can be passed through `stdin`, and so the filename can be anything. As a result, the immigrator cannot guess what the input file is in the `remote_data`. Of course one can think of more complex logic that scans all input files and tries to determine automatically which file is the input file, but this may not be trivial. Even so, this illustrates the point that there might be plugins that require other information that is impossible to determine automatically.
This means that the CLI command would need to accept options that are immigrator dependent. This is possible, but would require building a more complex system to make the command pluggable. Anyway, I think this shows that it is never going to be super easy and the user will always have to read the plugin-specific documentation to understand what required arguments should be passed. So in the end, I don't think the interface can become much more standardized or easy than I originally sketched. I will include this analysis in the AEP as argumentation for the design of the interface. Feel free to comment if you see problems with the reasoning or have solutions for it that I missed.
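One conceivable way around the plugin-dependent options, sketched purely as an illustration (the flag name and forwarding scheme are made up), would be a repeatable generic `KEY=VALUE` option that is forwarded to the immigrator as keyword arguments:

```python
# Illustrative only: forward arbitrary immigrator options as repeated KEY=VALUE pairs.
import click

@click.command()
@click.argument('entry_point_name')
@click.option('--immigrator-option', 'immigrator_options', multiple=True,
              help='Plugin-specific option as KEY=VALUE; may be repeated.')
def calcjob_immigrate(entry_point_name, immigrator_options):
    """Sketch: collect plugin-specific options without making the command itself pluggable."""
    kwargs = dict(option.split('=', 1) for option in immigrator_options)
    click.echo(f'would call the immigrator for {entry_point_name} with {kwargs}')
```

Even then, the user would still need the plugin documentation to know which keys are required, which is exactly the limitation described above.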
003_calcjob_immigrant/readme.md
Outdated
## Background
When new users come to AiiDA, they will often already have completed many simulations without the use of AiiDA.
Seamlessly integrating those into AiiDA, as if they *had* been run with it, such that they become directly interoperable and almost indistinguishable from any calculations that will be run with AiiDA in the future, is a feature that is often asked for.
But also in many cases not really possible...
Not sure what you mean here. Can you indicate exactly what part of which sentence you would change and how?
003_calcjob_immigrant/readme.md
Outdated
The most natural choice of communicating the location of the output folder to the engine would therefore be to use a `RemoteData` node and simply pass it in as one of the inputs.
The base `CalcJob` class would simply define an optional input port `immigrate_remote_folder` that can take a `RemoteData` node.
The benefit of using the `RemoteData` node is that it combines the two pieces of required information, the absolute path of the folder and the computer on which it resides, into one.
Another option would be to use the `metadata.options` namespace of the `CalcJob` inputs namespace, but there one would have to add at least two fields to host both pieces of information.
`RemoteData` seems cleaner.
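For concreteness, a minimal sketch of how the optional port described in the quoted text could be declared (the class name and help string are assumptions; `spec.input` is the standard port-definition API):

```python
# Sketch of the proposed optional input port on the base CalcJob.
from aiida import orm
from aiida.engine import CalcJob

class ImmigrantCalcJob(CalcJob):  # stand-in for the base CalcJob class

    @classmethod
    def define(cls, spec):
        super().define(spec)
        spec.input('immigrate_remote_folder', valid_type=orm.RemoteData, required=False,
                   help='Output folder of an already completed calculation to immigrate.')
```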
003_calcjob_immigrant/readme.md
Outdated
Then again, since there is no way for the engine to check if the presented `Code` actually corresponds to whatever it was that was run to produce the output files that are to be immigrated, the user can pass *any* `Code` instance.
The question then becomes whether it is better to have no information at all than to potentially have incorrect information.
Ultimately though, there are a lot of ways to lose provenance in AiiDA and it is always up to the user to try and minimize this.
Also in this situation one should probably just instruct the user to be aware of this fact and suggest they construct a `Code` and `Computer` instance that represents the code that was actually run as closely as possible, as it is in their own best interest.
To me it seems these should be empty by default, but can be filled by the user in case they would like to build a more consistent provenance into the immigration process. But this is not so easy I think, and there are arguments that go both ways. Having said that, I think most users would be able to give some kind of `Code` that at least contains the name of the code. This can of course be useful in a query later, regardless of its version and lack of proper provenance. The `Computer` is tied to the `RemoteData`, or can be overridden. Maybe requiring that these are set and then leaving it up to the plugin developers is the best approach?
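For example, a user could register a minimal `Code` that at least records the executable, sketched here with the standard AiiDA 1.x API (the computer label and executable path are made up):

```python
# Minimal Code setup so an immigrated calculation at least records which executable was used.
from aiida import orm

computer = orm.load_computer('my-cluster')  # assumed to be configured already
code = orm.Code(remote_computer_exec=(computer, '/usr/local/bin/pw.x'))  # example path
code.label = 'pw-legacy'
code.store()
```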
In general it looks extremely good! I left a couple of comments
@greschd @espenfl @giovannipizzi thanks a lot for all your comments. I think I have addressed the majority of them. Given that I think this functionality is important to users and it would be good to have a built-in solution, I would really like to try and get this concept on the road, maybe even get a beta version into v2.0. To this end, I will simply update the implementation that I made a year and a half ago, and add the functionality for the standardization of the immigrator interface. If we keep this only as an AEP, experience teaches that nothing much will happen as things remain too theoretical. I think we really need to start testing concretely to see what is working, what is not, and what might be missing. Once we have a PR of the implementation ready to merge, I can update this AEP to match any developments that took place there. That would be my suggestion on how to move forward on this. Let me know if you have comments.
Also the two final open questions have been addressed and the main text is updated to reflect the decisions.
OK, as agreed in the meeting of this week, I am now merging this since there haven't been any further comments or objections.