
📑 Feature Request: Playground documentation and usage #1491

Open
GemmaTuron opened this issue Jan 7, 2025 · 40 comments
Labels: enhancement (New feature or request)

@GemmaTuron
Member

GemmaTuron commented Jan 7, 2025

Describe your feature request.

Hi @Abellegese

I have tried to use the Playground for model testing, but I find it extremely difficult to use with the documentation currently provided in GitBook. User-friendly steps should be detailed, as well as instructions on how to interact with the nox command, similar to the issues I found in the Model Tester documentation. Below are a few questions to clarify before I can rewrite the docs, as well as some bugs I am encountering:

  1. The playground consistently fails on macOS with the following error. Is it only set up to work on Linux?
ERROR commands.py::test_command[fetch-eos3b5e] - FileNotFoundError: [Errno 2] No such file or directory: 'systemctl'
ERROR commands.py::test_command[serve-eos3b5e] - FileNotFoundError: [Errno 2] No such file or directory: 'systemctl'
ERROR commands.py::test_command[run-eos3b5e] - FileNotFoundError: [Errno 2] No such file or directory: 'systemctl'
ERROR commands.py::test_command[close-eos3b5e] - FileNotFoundError: [Errno 2] No such file or directory: 'systemctl'
  2. What is the sequence of commands that one should use for the playground? I am guessing from the last section, but this really needs to be the first thing in the documentation. It should be something like:
pip install ersilia[test]
nox -f test/playground/noxfile.py -s setup
nox -f test/playground/noxfile.py -s test...
  3. How many tests are currently available? Are these four the only ones?
    - test_from_github
    - test_from_dockerhub
    - test_auto_fetcher_decider
    - test_conventional_run

  4. Inside each test, aside from how the model is fetched, are the tests that are run the same? I struggle to understand the difference between the playground and the test command, except in the way the model is fetched. All the tests we want to do on the model outputs (i.e. no nulls, no wildly different values between runs) only happen in the test command? Then isn't the playground incomplete?

  5. The playground is not testing H5 files, but I guess this is part of the refactoring being done?

  6. The models are not deleted, only closed? Meaning they will remain on the users' computers after being tested? I see the delete command first in the Command Execution Summary; I do not know if this is a bug or if it is actually run after close:

                                                                     Command Execution Summary                                                                      
┌────────────────────────────────────────────────────┬─────────────────┬─────────────────┬─────────────────┬──────────────────────┬────────────────────────────────┐
│ Command                                            │ Description     │ Time Taken      │ Max Memory      │ Status               │ Checkups                       │
├────────────────────────────────────────────────────┼─────────────────┼─────────────────┼─────────────────┼──────────────────────┼────────────────────────────────┤
│ ersilia -v delete eos3b5e                          │ delete          │ 0.04 min        │ 142.82 MB       │ PASSED               │                                │
│ ersilia -v fetch eos3b5e --from_dockerhub          │ fetch           │ 0.79 min        │ 143.92 MB       │ PASSED               │ ✔ Folder exists at             │
│                                                    │                 │                 │                 │                      │ /home/gturon/eos/dest/eos3b5e  │
│                                                    │                 │                 │                 │                      │ ✔ DockerHub status is True     │
│ ersilia -v serve eos3b5e                           │ serve           │ 0.04 min        │ 142.90 MB       │ PASSED               │ ✔ DockerHub status is True     │
│ ersilia run -i files/input.csv -o files/result.csv │ run             │ 0.02 min        │ 175.70 MB       │ PASSED               │ ✔ File exists at               │
│                                                    │                 │                 │                 │                      │ files/result.csv               │
│                                                    │                 │                 │                 │                      │ ✔ File content check at        │
│                                                    │                 │                 │                 │                      │ files/result.csv               │
│ ersilia close                                      │ close           │ 0.18 min        │ 141.09 MB       │ PASSED               │                                │
└────────────────────────────────────────────────────┴─────────────────┴─────────────────┴─────────────────┴──────────────────────┴────────────────────────────────┘
  7. The example is generated with a simple function instead of the example command? This can lead to failures with models that have more complex inputs. Why is the example command not implemented instead?
  8. How do I specify the model I want to test? Do I need to modify the config.yml manually? And if so, when does it use the model_id field vs the model_ids field? Shouldn't the model(s) to test be passed as a parameter of the nox command instead? Currently I see both the model_id and model_ids lines populated with models, but the nox command is only running the single model, not the list.
  9. Same for the Python version: is it specified in the config.yml file only?
  10. When I run the command nox -f test/playground/noxfile.py -s test_from_github, if the config.yml file is not manually edited on the fetch_flags line, it will still pull the model from DockerHub, according to what I see in the command execution summary: ersilia -v fetch eos5guo --from_dockerhub. It is not clear to me how this should happen; I believe it is a bug?
  11. cc @DhanshreeA: testing from_github with model eos5guo fails, same as with the test command, so there seems to be an issue with the model? It does work from DockerHub though. I do not understand, @Abellegese, why serve and run appear as "PASSED" if the fetch has failed:
                                                                     Command Execution Summary                                                                      
┌────────────────────────────────────────────────────┬─────────────────┬─────────────────┬─────────────────┬──────────────────────┬────────────────────────────────┐
│ Command                                            │ Description     │ Time Taken      │ Max Memory      │ Status               │ Checkups                       │
├────────────────────────────────────────────────────┼─────────────────┼─────────────────┼─────────────────┼──────────────────────┼────────────────────────────────┤
│ ersilia -v delete eos5guo                          │ delete          │ 0.03 min        │ 142.53 MB       │ PASSED               │                                │
│ ersilia -v fetch eos5guo --from_github             │ fetch           │ 0.32 min        │ 146.68 MB       │ FAILED               │ ✔ Folder exists at             │
│                                                    │                 │                 │                 │                      │ /home/gturon/eos/repository/e… │
│ ersilia -v serve eos5guo                           │ serve           │ 1.67 min        │ 145.94 MB       │ PASSED               │                                │
│ ersilia run -i files/input.csv -o files/result.csv │ run             │ 0.02 min        │ 179.25 MB       │ PASSED               │ ✔ File exists at               │
│                                                    │                 │                 │                 │                      │ files/result.csv               │
│                                                    │                 │                 │                 │                      │ ✔ File content check at        │
│                                                    │                 │                 │                 │                      │ files/result.csv               │
│ ersilia close                                      │ close           │ 0.18 min        │ 140.75 MB       │ PASSED               │                                │
└────────────────────────────────────────────────────┴─────────────────┴─────────────────┴─────────────────┴──────────────────────┴────────────────────────────────┘
========================================================================= short test summary info ==========================================================================
FAILED commands.py::test_command[fetch-eos5guo] - AssertionError: Command 'fetch' failed for model ID eos5guo
================================================================= 1 failed, 3 passed in 133.17s (0:02:13) ==================================================================
nox > Command pytest commands.py -v failed with exit code 1
nox > Session test_from_github failed.

Same for model eos3b5e --from_github: it does fail at fetch time. I do not know whether this is due to the example command, as mentioned in the model test issue, or to a different reason.

@GemmaTuron GemmaTuron added the enhancement New feature or request label Jan 7, 2025
@DhanshreeA DhanshreeA changed the title 📑 Feature Request: Playground documentatio and usage 📑 Feature Request: Playground documentation and usage Jan 7, 2025
@Abellegese
Contributor

Hi @GemmaTuron, thanks for the comments. Your comments are highly valuable for improving this testing system.

For Q1) I created a PR that adds macOS support, but only to check the Docker status; unfortunately we cannot manipulate Docker from Python and subprocess on a macOS machine.
Q2) I will update this in the docs; that sequence is correct. But you can also pick any session and run it on its own; it won't necessarily be sequential. However, running test_fetch_multiple_models and test_serve_multiple_models in the GitHub workflow requires grouping them for parallelization, since they are dependent.
Q3) There are six, including test_fetch_multiple_models and test_serve_multiple_models. They were originally decided in issue #1368.
Q4) The playground was mainly inspired by the question of what we can check after performing some command (fetch, serve, run...). We do these checks using rules defined in rules.py. For instance, let's take this rule:

@register_rule("folder_exists")
class FolderExistsRule(CommandRule):
    def __init__(self):
        pass

    def check(self, folder_path, expected_status):
        actual_status = Path(folder_path).exists() and any(Path(folder_path).iterdir())
        if actual_status != expected_status:
            raise AssertionError(
                f"Expectation failed for FolderExistsRule: "
                f"Expected folder to {'exist' if expected_status else 'not exist'}, "
                f"but it {'exists' if actual_status else 'does not exist'}."
            )
        return {
            "name": f"Folder exists at {folder_path}",
            "status": actual_status,
        }

This rule is executed after we run the fetch command. Using it I can check whether the model folder exists in the required location after the fetch. The expected status is true, since in this case I want the folder to exist in the eos folder.

Other rule example:

@register_rule("file_exists")
class FileExistsRule(CommandRule):
    def __init__(self):
        pass

    def check(self, file_path, expected_status):
        actual_status = Path(file_path).exists()
        if actual_status != expected_status:
            raise AssertionError(
                f"Expectation failed for FileExistsRule: "
                f"Expected file to {'exist' if expected_status else 'not exist'}, "
                f"but it {'exists' if actual_status else 'does not exist'}."
            )
        return {
            "name": f"File exists at {file_path}",
            "status": actual_status,
        }

The above rule can be executed after we run the run command. If I specify the output file in the run command, I can check whether the CLI created that file or not. After this we have FileContentCheckRule: if the file exists, it checks whether its content is valid (previously it only supported JSON and CSV; now it also supports H5, as well as richer data structures).

So, at a quick glance, you can see that we exercise those ersilia commands using any rule we define in order to check the health of the model or the CLI.
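To complement the two rules above, the content check described for FileContentCheckRule could look roughly like this. This is a minimal sketch reusing the CommandRule/register_rule interfaces shown above and covering only CSV and JSON; the real rule also handles H5 and other data structures:

import csv
import json
from pathlib import Path


@register_rule("file_content_check_sketch")
class FileContentCheckSketch(CommandRule):
    def __init__(self):
        pass

    def check(self, file_path, expected_status):
        path = Path(file_path)
        if path.suffix == ".json":
            rows = json.loads(path.read_text())
        else:  # assume CSV for this sketch
            with path.open() as fh:
                rows = list(csv.DictReader(fh))
        # Content is valid if there is at least one row and no cell is empty or null-like.
        actual_status = bool(rows) and all(
            value not in (None, "", "null")
            for row in rows
            for value in (row.values() if isinstance(row, dict) else [row])
        )
        if actual_status != expected_status:
            raise AssertionError(
                f"Expectation failed for FileContentCheckSketch at {file_path}"
            )
        return {
            "name": f"File content check at {file_path}",
            "status": actual_status,
        }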

Q5) It now supports H5.
Q6) The delete command is put first to prevent the fetch command from failing. The first PR for this playground did not have this delete step, and in the workflow the fetch command started to fail: if the model was fetched in a previous job of the workflow, we cannot fetch it again here without removing it first. With the removal it works, so it is a feature.
Q7) This is also included in the new PR, but the example command could not produce more than one example SMILES; it is a bug I was running into before.
Q8) This is also included in the new PR: you can specify everything (all options in the config.yml) from the nox session, like this:

nox -f noxfile.py test_from_github model_id=eos2db3 python_version=3.8 delete_mode=false
nox -f noxfile.py test_serve_multiple_models model_ids=[eos2db3, eos3b5e] runner=multiple 

Q9) You can now also pass the Python version from the command line, as above.
Q10) You are right, it was a bug and it is fixed in the new PR.
Q11) The fetch failed because you already had the model fetched, and serve and run worked because of that: they were serving and running the existing model. That is why you can specify the delete_model option.

@Abellegese
Contributor

@GemmaTuron those comments of yours are very useful. I will update the pipelines to make them easier to use.

@GemmaTuron
Member Author

Thanks @Abellegese, we will discuss all of this in more detail tomorrow. I think there is too much redundancy between the playground and the test module, and it does not really make sense to duplicate all these efforts. Please do not modify the documentation at this point; I will take care of it.

@GemmaTuron
Member Author

Hi @Abellegese I am noting this down in preparation for tomorrow's meeting. Please do not modify any more code or open more PRs before we can discuss everything.

Q1: Where was it documented that the playground only works on Linux? We were considering adding the Playground as the Model Test Workflow; if it cannot be used on macOS it is not useful to that end.
Q2: I will modify the docs this time to make sure they are comprehensive; I hope this serves as a good example for future documentation. I need to understand what you mean about the multiple-models command, for example, as this is not documented anywhere. What / how do you "group for parallelization"?
Q3: Where are these six explained? I cannot see them anywhere.
Q4: The question still stands. What is the difference between the checks used by the playground and the checks used by the Test Module? Shouldn't those be consolidated into a single set? Otherwise you are defining those rules for the Playground and the same ones as checks in the test. It feels like a duplication of effort.
Q5: ok thanks
Q6: If the model is not fetched, the delete command still shows PASSED, which is weird.
Q7: What is the problem with the example command? Where is this bug explained? @DhanshreeA please can you confirm that the example command is working as it should?
Q8: How to pass all this information is neither clear nor documented. Please, before making any new PRs, let's discuss.
Q9: Same as above.
Q10: So how does it work now? If I want to test from GitHub, do I need to edit the config.yml or not?
Q11: I don't understand. The model is first deleted, as you mention in Q6, so if it was deleted why is it still in the system? I think the delete option should always be on by default.

@GemmaTuron
Member Author

And as I play with it, some more questions:
Q1: wouldn't it be better to specify a place outside the Ersilia repo to save the files? Right now it saves the files in the /test folder, so if I make changes to ersilia I need to revert the ones in the /test folder lest I push those test files as well.

Q2: about the different commands: can you explain why we have separate fetch and serve sessions for multiple models, whereas for single models they are fetched, served and run inside the same session (or that is my understanding)?

Q3: why / when is the auto-fetcher used?

Q4: In the conventional run, why are all those values hardcoded, the output file name for example?

@nox.session(venv_backend="conda", python=get_python_version())
def test_conventional_run(session):
    """Run pytest for standard and conventional run."""
    install_dependencies(session)
    update_yaml_values(
        {
            "runner": "single",
            "cli_type": "all",
            "fetch_flags": "--from_dockerhub",
            "output_file": "files/output_eos9gg2_0.json",
            "output_redirection": "true",
            "delete_model": True,
        }
    )
    logger.info("Standard and Conventional Run: Conventional")
    session.run("pytest", "commands.py", "-v", silent=False)
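For reference, a helper like update_yaml_values presumably just merges these keys into the playground config.yml before pytest reads it; a rough sketch of that idea (not the actual implementation):

from pathlib import Path

import yaml


def update_yaml_values(updates, config_path="config.yml"):
    # Load the playground config, overwrite the given keys, and write it back
    # so that commands.py picks up the new values (sketch only).
    path = Path(config_path)
    config = yaml.safe_load(path.read_text()) or {}
    config.update(updates)
    path.write_text(yaml.safe_dump(config))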

Q5: minor point: if Docker is not active, the error you get is the following. I don't know if we want a more informative message, or whether, since this is aimed at developers, it is enough:

Command: ersilia -v fetch eos3b5e --from_dockerhub
Description: fetch
Error: Expectation failed for FolderExistsRule: Expected folder to exist, but it does not exist.

@Abellegese
Contributor

Hi @GemmaTuron we will discuss them in detail in the meeting.

@DhanshreeA
Member

@Abellegese I don't understand, where is the example command not producing more than one input?

@Abellegese
Contributor

Hi @DhanshreeA I attached it below. Same problem from the python API.
example_cmd

@DhanshreeA
Member

@Abellegese Right, I see. The example command is fetching from the predefined example input file. Could you try ersilia example -n 3 -f input.csv --random? I will update the GitBook docs to reflect that by default, predefined is set, and if an example file exists, then the input will always be fetched from there.

@Abellegese
Contributor

Thanks @DhanshreeA

@GemmaTuron
Member Author

More info on the usage of the playground. @Abellegese I see these weird conda environments created by Nox, how do I delete them?

(ersilia) GemmaErsilia:ersilia gemmaturon$ conda env list
# conda environments:
#
                         /Users/gemmaturon/github/ersilia-os/ersilia/test/playground/.nox/setup
                         /Users/gemmaturon/github/ersilia-os/ersilia/test/playground/.nox/test_auto_fetcher_decider
                         /Users/gemmaturon/github/ersilia-os/ersilia/test/playground/.nox/test_fetch_multiple_models
                         /Users/gemmaturon/github/ersilia-os/ersilia/test/playground/.nox/test_from_dockerhub
                         /Users/gemmaturon/github/ersilia-os/ersilia/test/playground/.nox/test_from_github
base                     /Users/gemmaturon/miniconda3
chem                     /Users/gemmaturon/miniconda3/envs/chem
eos3b5e                  /Users/gemmaturon/miniconda3/envs/eos3b5e
eosbase-bentoml-0.11.0-py310     /Users/gemmaturon/miniconda3/envs/eosbase-bentoml-0.11.0-py310
eosbase-bentoml-0.11.0-py311     /Users/gemmaturon/miniconda3/envs/eosbase-bentoml-0.11.0-py311
ersilia               *  /Users/gemmaturon/miniconda3/envs/ersilia

@GemmaTuron
Member Author

Update and please @Abellegese confirm if this is correct:

There were legacy files inside a .nox folder and an entire /ersilia copy inside the playground after running it. I understand I need to delete them manually?

@Abellegese
Contributor

Hi @GemmaTuron, for the first question: nox creates an isolated venv for each session, which is why you see them. For the second question: you can specify overwrite_ersilia_repo, which deletes the ersilia folder if it already exists. Deleting the files inside the .nox folder will not change things, I think; inside it there are sessions with isolated venvs, not ersilia. The ersilia repository cloned from GitHub is saved inside the playground folder. I hope that answers your questions.

@GemmaTuron
Member Author

Hi @Abellegese

Sorry I do not understand.

  1. The venv(s) that .nox creates, how do I delete them?
  2. The ersilia folder that is cloned from GitHub inside the playground folder: this is not good practice; no user-specific files (tests, etc.) should live in a git folder from which they could inadvertently be pushed back to the main GitHub repository. We should create a playground folder somewhere else, if anything. Let's discuss this at Thursday's meeting.

Please let me know how I can delete the .nox-created venvs.

@GemmaTuron
Member Author

@Abellegese I found that there is a hidden folder at ersilia/test/playground/.nox which, if deleted, eliminates these environments. Is that right?
And I still think we should move the testing folders outside the cloned GitHub repo folder.

@Abellegese
Contributor

Yes @GemmaTuron, those are the per-session isolated environments. Indeed, there has to be some way to clean them up after the sessions. Also note that when you run nox it removes them and creates new ones. In the PR I have created, I added a feature to reuse the venv if it was already created for a session.

As for the GitHub repo, yes, we need to move it, maybe to ~/eos/tmp/playground. There we can store things that are created dynamically. I was also thinking of moving the entire nox session folder to that directory.

@DhanshreeA DhanshreeA self-assigned this Jan 22, 2025
@DhanshreeA
Member

My observations from the current playground implementation:

  1. The Ersilia CLI repo gets cloned wherever the playground is run; it should ideally not be cloned afresh if Ersilia is already installed and being used to run the playground, which is what we will end up doing in the CI pipelines.
  2. The playground does not clean up after itself in terms of environments, the ersilia installation, extra files created, etc. This is a non-issue on CI because those are ephemeral machines; however, it will definitely be an issue for users running the playground on their own machines.
  3. The playground tests are quite flaky, with only about 1 in every 3 runs completing end to end successfully.
  4. In test_cli_single particularly, fetch fails over and over again but the serve, run, and close commands seem to work. This does not make sense, because how can a model that didn't fetch be served or run at all?
  5. The logs for the playground tests are limited and do not give a clear picture of what's going on and why a test failed. I see that logs are generated for each nox session; however, those don't show up in the pipeline run and currently aren't being uploaded as artifacts, which should happen so we can figure out what went wrong.

@Abellegese
Contributor

Thanks @DhanshreeA .

@Abellegese
Contributor

Abellegese commented Jan 28, 2025

Hey @DhanshreeA below are the rules I came up with for the commands

1. Fetch

Flags:

  • auto_fetcher, from_github, from_dockerhub, from_s3, version

Checks:

  • Verify the destination and repository folders in eos for the fetched model.
  • Verify that the dest folder contains the necessary files and content, as below:
    • Check the model_source.text file and verify that it has the correct source for the fetched model.
    • Check the api_schema.json file's existence and content.
    • Check the from_dockerhub.json file and verify that it contains "docker_hub": true if fetched from DockerHub, else "docker_hub": false.
    • Check that status.json contains done: true.
  • Ensure the Docker image exists if fetched from DockerHub.
  • Ensure the conda env exists if fetched with from_github or from_s3.
  • Exit with status 0 if no runtime error is encountered. (A sketch of a couple of these checks is given after the notes below.)

Notes:

  • The rules selected for fetch are believed to check the most important functionality of the fetch command, for instance:
    • Copying information
    • Copying schema files
    • Downloading and saving models (including from DockerHub)
    • and more
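A rough sketch of how a couple of these fetch checks could be expressed in Python (the file names and keys follow the list above; the actual rule implementation may differ):

import json
from pathlib import Path


def check_fetch_artifacts(dest_folder, fetched_from_dockerhub):
    # Sketch of the status.json and from_dockerhub.json checks listed above.
    dest = Path(dest_folder)
    status = json.loads((dest / "status.json").read_text())
    assert status.get("done") is True, "status.json should contain done: true after fetch"
    dockerhub_file = dest / "from_dockerhub.json"
    if dockerhub_file.exists():
        flag = json.loads(dockerhub_file.read_text()).get("docker_hub")
        assert flag == fetched_from_dockerhub, "from_dockerhub.json should match the fetch source"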

2. Serve

Flags:

  • No flag used

Checks:

  • Check that a session folder is created for the served model.
  • Check the existence of session.json and the eosxxxx.pi file and validate their content.
  • Check that session.json has "service_class": "pulled_docker" and that it matches the correct service class of the fetched model.
    • Apply the necessary Docker-related operations.
  • Verify that the API responds with a status_code of 200, to check that the model is served correctly.
  • Exit with status 0 from the click CLI runner's .invoke function if no runtime error is encountered.

3. Run

Flags:

  • inp_types (str, list, csv)
  • output_types (csv, json, h5)

Checks:

  • Verify the existence of the output files generated by the run command.
  • Ensure the generated file content is valid (not None, null, or empty).
  • Exit with status 0 from the click CLI runner's .invoke function if no runtime error is encountered.

4. Catalog

Flags:

  • --more, --as-json, --f, --local, --hub

Checks:

  • If local, display the fetched models with their sources.
  • Validate the correctness of the JSON structure (specifically that all keys are present and no values are empty or missing).
  • Ensure the file generated using the -f flag contains all required entries.

5. Example

Flags:

  • --sample, -n, --random, --predefined, -c, -f

Additional Checks:

  • Verify that the input key has compound entries for the --simple and --predefined flags when displaying in the terminal.
  • Same for the generated file.
  • Matching length between generated and requested sample sizes.
  • Verify valid compound entries (optional).

6. Delete

Flags:

  • --all

Checks:

  • Verify that all containers and images are removed if the model was fetched from Docker Hub.
  • Verify that the destination and repository folders for the model no longer exist after cleanup.
  • The conda environment should be removed.

7. Close

Flags:

  • No flag used

Checks:

  • Verify that session files are removed.

8. Test

Flags:

  • --deep, from_dockerhub, as_json

Checks:

  • No checks are used here because most of its operation depends on other commands, whose exceptions will be caught in their own sessions.
  • A simple exit status of 0 from the click CLI runner's .invoke function, if no runtime error is encountered, is enough for the test command (see the sketch below).
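Since several of the checks above boil down to "exit status 0 from the click CLI runner's .invoke function", here is a minimal sketch of that pattern using click.testing.CliRunner (the import path of the ersilia CLI object is an assumption for illustration; the actual playground harness wires this differently):

from click.testing import CliRunner

from ersilia.cli import cli  # assumed entry point, for illustration only


def assert_command_succeeds(args):
    # Invoke an ersilia command in-process and assert it exited cleanly.
    result = CliRunner().invoke(cli, args)
    assert result.exit_code == 0, result.output


# Example: assert_command_succeeds(["test", "eos3b5e", "--shallow", "--from_dockerhub"])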

@Abellegese
Contributor

I will write here how to use the commands in detail:

@Abellegese
Contributor

Abellegese commented Feb 1, 2025

Playground CLI Usage Guidelines

Installation

To use the Playground CLI, first install ersilia using the instructions given here. Then install the package with the testing extras, as shown below, inside the activated ersilia venv:

pip install -e ".[test]"

nox installs ersilia into its isolated virtual environment from the local source every time we run a nox session such as execute.

The playground test folder is found in test/playground. Either you go into this folder, in which case you do not need to specify the nox file, or you specify the nox file from the ersilia root directory. For instance, if you go into the playground folder, you can then run a simple command like clean as given below:

nox -s clean -- --cli <command> [options]

Or, from ersilia root directory, simply:

nox -s clean -f test/playground/noxfile.py -- --cli <command> [options]

Command mutual dependency

The commands in Ersilia are interdependent, meaning that running a single command often requires executing a series of prerequisite commands. For example, to run a model, you must first execute fetch, followed by serve, and finally run. If we want to test, for instance, the healthiness of the run command, we need to execute the prerequisite commands first. To simplify this process in the testing playground, we have introduced a CLI Dependency Map. This map outlines the commands required before a given command can be executed. The details are provided below.

  • serve: fetch
  • run: fetch, serve
  • close: serve
  • example: serve

The delete command requires a prior fetch, but we can run it after all the commands we specified have finished executing; we can also specify it before fetch. Now, if we specify a command alone, for instance:

nox -s execute -- --cli run

the other required commands will be executed first, which in the above case are fetch and serve. This simplifies the commands a bit.
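Conceptually, the dependency expansion works along these lines (a rough sketch of the idea, not the playground's actual code):

# Hypothetical sketch of the CLI dependency map described above: given the
# commands requested with --cli, prepend any missing prerequisites in order.
DEPENDENCY_MAP = {
    "serve": ["fetch"],
    "run": ["fetch", "serve"],
    "close": ["serve"],
    "example": ["serve"],
}


def expand_commands(requested):
    ordered = []
    for cmd in requested:
        for step in DEPENDENCY_MAP.get(cmd, []) + [cmd]:
            if step not in ordered:
                ordered.append(step)
    return ordered


# expand_commands(["run"]) -> ["fetch", "serve", "run"]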

Handling Python virtual environments and files

  • nox venv files will be stored at ~/eos/playground/.nox
  • Other files, such as inputs and outputs, will be stored at ~/eos/playground/files, and error logs will be stored at ~/eos/playground/logs.
  • These files get cleared out with the nox session called clean.


Options & Flags

🔹 Nox built-in flags

Nox provides built-in flags that control how sessions are run. These include:

  • -p: specifies the Python version. If you don't specify one, the sessions will by default be executed on Python 3.8, 3.9, 3.10, 3.11, and 3.12.
  • -fb: stands for force backend and is used to change the backend that runs the nox sessions. By default the sessions are executed on conda, but this can be changed to virtualenv with this flag. A more detailed example is given in the table below.

Note that both of these flags should be specified before the -- separator that divides nox flags from custom flags (e.g. nox -s execute [nox flags] -- [custom flags]).

| Flag | Description | Example |
| --- | --- | --- |
| -p | Used to specify the Python version | nox -s execute -p 3.8 -- [other flags after this] or nox -s execute -p 3.8 3.9 -- |
| -fb | Used to change the Python backend (e.g. from conda to virtualenv) | nox -s execute -fb virtualenv -- [other flags after this] |

🔹 General Settings

| Flag | Description | Default | Example |
| --- | --- | --- | --- |
| --activate_docker | Activates or deactivates Docker. Use: to test whether the auto-fetcher decides not to fetch from DockerHub when Docker is inactive, and vice versa. | true | nox -s execute -- --activate_docker true |
| --log_error | Enables or disables logging of errors to a file stored in ~/eos/playground/logs/. Each command failure creates a standalone file with a datetime string in its name, for instance catalog_20250129_145802.txt. | true | nox -s execute -- --log_error false |
| --silent | Enables or disables logs from ersilia command execution. | true | nox -s execute -- --silent false |
| --show_remark | Displays a remark column in the final execution summary table. The remark is the output shown in the terminal when the ersilia command executes successfully. | false | nox -s execute -- --show_remark true |
| --max_runtime_minutes | Sets the maximum execution time for a run command. Use: to test model speed if it appears to be slow. | 10 | nox -s execute -- --max_runtime_minutes 5 |
| --num_samples | Sets the sample size used to create input for the run command. | 10 | nox -s execute -- --num_samples 5 |

🔹 Command Flags

Note that any values you pass for the flags given below will overwrite the default values.

Command Selection (--cli)

| Flag | Description | Default | Example |
| --- | --- | --- | --- |
| --cli | Specifies the ersilia commands to run, in order (fetch, serve, run, catalog, example, test, close, delete). The default is all, which executes commands in this order: fetch, serve, run, close, catalog, example, delete, test. | all | nox -s execute -- --cli fetch serve run or nox -s execute -- --cli run |

Ersilia Command flags

Note: every ersilia flag for these commands, such as fetch: --from_github, delete: --all, etc., should be passed without the leading --. For example nox -s execute -- --fetch from_dockerhub version [img-tag].

| Flag | Description | Default | Example |
| --- | --- | --- | --- |
| --fetch | Fetches models from sources (from_github, from_dockerhub, from_s3, version) | --from_github | nox -s execute -- --fetch from_dockerhub version dev or nox -s execute -- --fetch from_s3 |
| --run | We don't specifically use this flag; instead we use --input_types and --outputs, which specify the input types (str, list, csv) and the output file types (result.csv, result.json, result.h5). The run flags are then generated automatically in the format ["-i", "input", "-o", "output"]. | None | None |
| --example | Generates example input for a model (-n, --random, -f). If we specify a file name, e.g. example.csv, it will be saved in the path we specify, meaning we should pass the path to the file. | ["-n", 10, "--random"] | nox -s execute -- --example -n 10 random/predefined -c -f example.csv |
| --catalog | Retrieves the model catalog from local or hub | ["--more", "--local", "--as-json"] | nox -s execute -- --catalog hub |
| --test | Tests models at different levels (shallow, deep, from_github, from_dockerhub, from_s3) | ["--shallow", "--from_github"] | nox -s execute -- --test deep from_dockerhub/from_s3/from_github |
| --delete | Used to delete models; has one flag, all | None | nox -s execute -- --delete all |

Note that any values you pass for the flags given below will overwrite the default values.

🔹 Other flags

| Flag | Description | Default | Example |
| --- | --- | --- | --- |
| --outputs | Used with the run command to specify output files (result.csv, result.h5). Note that only the file name is specified; the path is automatically set to ~/eos/playground/files/{file_name}.{csv, json, h5}. | [results.{csv, json, h5}] | nox -s execute -- --outputs result.csv result.h5 |
| --input_types | Also used with the run command, to define input formats (str, list, csv). | List of (str, list, csv) | nox -s execute -- --input_types str list csv |
| --runner | Specifies the execution mode (single, multiple). The single mode executes commands using one model (the default model ID for this mode is eos3b5e), whereas the multiple mode uses multiple models to execute the given commands (by default eos5axz, eos4e40, eos2r5a, eos4zfy, eos8fma). | single | nox -s execute -- --runner multiple |
| --single | Used to specify or override the default model ID used in single running mode. | eos3b5e | nox -s execute -- --single eosxxxx |
| --multiple | Used to specify or override the default model IDs used in multiple running mode. | [eos5axz, eos4e40, eos2r5a, eos4zfy, eos8fma] | nox -s execute -- --multiple eosxxxx eosxxxx eosxxxx |

Example Usage

All examples given below assume you are in the test/playground directory, which does not require specifying the noxfile; nox by default uses the noxfile.py found in test/playground.

Run all commands with their default values

nox -s execute -p 3.11

Fetch a model from DockerHub, serve it and run it, with Python 3.10

In this example the run input types and output files are the defaults.

nox -s execute -p 3.10 -- --cli fetch serve run --fetch from_dockerhub

Fetch a model from DockerHub, serve it and run it just by specifying the single command "run"

In this example we specify the input types and output files.

nox -s execute -p 3.10 -- --cli run --fetch from_dockerhub --input_types str list --outputs result.csv result.h5

Other examples

nox -s execute -p 3.10 -- --cli serve run catalog example

This will clean all nox related resources (venv, files, logs...)

nox -s clean

To test the close command, i.e. whether it successfully cleared out the sessions created during serve:

nox -s execute -p 3.10 -- --cli close

Test a model in shallow mode. We run delete first because there is a fetching step during test and we want to clear out previously existing models in the system:

nox -s execute -p 3.10 -- --cli delete test

Running with multiple mode

nox -s execute -p 3.10 -- --cli fetch serve run --fetch from_dockerhub --runner multiple --outputs result.csv

Environment Variables

| Variable | Description | Example Value |
| --- | --- | --- |
| TEST_ENV | Used to pass the yes_or_no prompt for the test command | TEST_ENV=true |
| CONFIG_DATA | Used to pass config data to the pytest file | CONFIG_DATA=config_json_data |

@GemmaTuron
Member Author

GemmaTuron commented Feb 3, 2025

Okay, a lot of work here thanks @Abellegese !

Help me clarify a bit so we can write solid documentation for end users. I don't think there is any need for additions to the playground at the moment, just to better understand what goes on under the hood, plus maybe some small bugfixes. We can then decide if any final edits are required.

Basic steps I am running:

  1. Check out the corresponding Playground branch, as it was not merged at the time of testing
  2. Pip install ersilia in editable mode with the [test] extra, in a conda env with Python 3.12
  3. cd into ersilia/test/playground
  4. run nox -s execute -p 3.11

Comments:

  • It seems to require the sudo password the first time you run it but not after? At this step: commands.py [sudo] password for gturon:. If that is the behaviour it needs to be added to the documentation. Is that correct?
  • About flags: if I understood the documentation correctly, you need to pass the -- twice, for example nox -s execute -- --activate_docker true? The only flags that do not use this are -p and -fb? I think -v as well.
  • Error logs: there is a sample_error_log.txt file in ersilia/test/playground/files. Maybe this is no longer needed if the logs go into eos? As for the logs, I don't get anything really informative even when models are failing. For example, see attached: the model failed because Docker was not active, but the only thing that appeared in the logs was this:
    delete_20250203_113011.txt
  • Input/output files: while some files appear in eos/files, I also see files being created in ersilia/test/playground. In particular, when I simply run nox -s execute -p 3.11, which I believe uses eos3b5e by default, I see the following files: file.csv, file.h5, file.json, input.csv, output1.csv, output2.csv, result.csv in my playground folder in ersilia/test, not in eos, and I need to delete them manually.
  • activate_docker: perhaps I am not understanding this correctly. It is true by default, so does it mean it will try to activate Docker if it is not active? At the moment the tests fail unless I activate Docker manually on my end.
  • I tried to use the command with activate_docker false to see what it does, but it fails. I understand that with activate_docker false it will simply try to run all the commands fetching from GitHub? This is the output; can you help me understand where it fails? I think this is because the test command is failing, but it should not, right? Also, all the tests in the Playground table printed at the end ("Command Execution Summary") appear as passed, but those do not include the test command: test_20250203_120017.txt and nox_test.txt
  • Seeing the flags listed under nox:
usage: nox [-h] [--version] [-l] [--json] [-s [SESSIONS ...]] [-p [PYTHONS ...]] [-k KEYWORDS] [-t [TAGS ...]] [-v] [-ts] [-db {conda,mamba,micromamba,virtualenv,venv,uv,none}]
           [-fb {conda,mamba,micromamba,virtualenv,venv,uv,none}] [--no-venv] [--reuse-venv {yes,no,always,never}] [-r] [-N] [-R] [-f NOXFILE] [--envdir ENVDIR]
           [--extra-pythons [EXTRA_PYTHONS ...]] [-P [FORCE_PYTHONS ...]] [-x] [--no-stop-on-first-error] [--error-on-missing-interpreters] [--no-error-on-missing-interpreters]
           [--error-on-external-run] [--no-error-on-external-run] [--install-only] [--no-install] [--report REPORT] [--non-interactive] [--nocolor] [--forcecolor] 

are the more technical ones explained somewhere? And is -db the same as -fb?

  • Is there a way to get the results of the playground in a table?

@Abellegese
Contributor

Abellegese commented Feb 3, 2025

Q1) Yes, as you know we have a Docker manipulation system, so it requires privileges.
Q2) Yes, basically -- is the nox way of recognizing or separating out flags that are not built-in.
Q3) Correct, we don't need those logs inside the playground test; we now have traceback-supported error logs. But as I see in your logs they did not show up, which confuses me, and I could not reproduce the error. It seems there is a "permission denied" error, which had been blocking me from testing the delete command; @DhanshreeA solved that issue but it seems to be coming back.
Q4) These files (file.csv, file.h5, file.json, input.csv, output1.csv, output2.csv, result.csv) come from the test command, and we agreed to put them there (cwd), so you might need to delete them manually.

@Abellegese
Contributor

Abellegese commented Feb 3, 2025

Q5) On macOS you need to do it manually, and if your OS is macOS it will raise a runtime error telling the user to do it manually:

 elif system_platform == "Darwin":  # macOS
      print("Stopping Docker programmatically is not supported on macOS.")
      raise RuntimeError("Cannot stop Docker programmatically on macOS.")

This is also true for starting Docker, but at least I implemented opening the Docker Desktop application if it is installed on the user's system.

@GemmaTuron
Member Author

GemmaTuron commented Feb 3, 2025

Q5) On macOS you need to do it manually and if your OS is mac it will raise runtime error telling user to do it manually

elif system_platform == "Darwin": # macOS
print("Stopping Docker programmatically is not supported on macOS.")
raise RuntimeError("Cannot stop Docker programmatically on macOS.")
This is also true for starting docker but at least I implemented opening desktop version of docker if it is installed in user system.

Thanks @Abellegese, this is what I understood, but I am working on Ubuntu, so shouldn't it be automatic? Maybe it is a user setting somewhere in Docker Desktop. At the moment it does not activate Docker Desktop if I pass the flag as true.

@Abellegese
Contributor

Q6) Exactly: if Docker is not activated, since the fetch flag is None by default, it will decide to fetch from GitHub. But I saw the log and it is confusing to work out what happened. The traceback locates the error in ersilia.utils.docker, where I use simple docker calls to get the running containers. This might be related to the Docker-inactive issue.

  • I think @GemmaTuron I need to add a small exception handler to raise an error specific to this Docker-inactive issue on the user's computer.

@Abellegese
Contributor

Q5) On macOS you need to do it manually and if your OS is mac it will raise runtime error telling user to do it manually
elif system_platform == "Darwin": # macOS
print("Stopping Docker programmatically is not supported on macOS.")
raise RuntimeError("Cannot stop Docker programmatically on macOS.")
This is also true for starting docker but at least I implemented opening desktop version of docker if it is installed in user system.

Thanks @Abellegese this is what I understood but I am working on Ubuntu, so shouldn't it be automatic? Maybe it is a user setting somewhere on Docker Desktop

Yes, if this is Ubuntu it should be automatic. I am using Ubuntu but I could not reproduce this on my computer.

@Abellegese
Contributor

Q7) -db stands for default backend and does not always guarantee a backend change, whereas -fb (force backend) always changes the backend to whatever we choose.

@GemmaTuron
Member Author

Q7) -db stands for default backend and does not always guarantee a backend change, whereas -fb (force backend) always changes the backend to whatever we choose.

That helps. I will only reference -fb in the docs.

@GemmaTuron
Member Author

Ok @Abellegese

Before I ask more questions, here is a summary of what we have discussed and agreed so far:

Q1) The Docker manipulation system requires privileges - OK, this is now added in the documentation.
Q2) -- is the nox way of recognizing or separating out flags that are not built-in - OK, this is also clear in the documentation now.
Q3) The logs inside the playground test are not needed, since we now have traceback-supported error logs - Can we remove this folder from ersilia then? Will you modify your PR? About the Permission Denied error, I'll let Dhanshree share more about it.
Q4) The files file.csv, file.h5, file.json, input.csv, output1.csv, output2.csv, result.csv come from the test command and are stored in the cwd by agreement, so they need to be deleted manually - I see, I did not know those came from the test command. I believe it would be best if they were stored somewhere else that is easy to delete, what do you think? @DhanshreeA had you thought of that already?
Q5) Docker: I have tried with a model for which I know Docker works (eos4e40) and indeed I think it worked! Starting Docker... Maybe it was a problem with the model eos3b5e, as I report in the test command issue.
Q6) I believe this is a problem with the model itself and the test command, and I have followed up on that in the Model Tester issue.
Q7) OK, I will only add -fb in the docs.

@Abellegese
Contributor

Yep, that's it @GemmaTuron, nicely summarized.

@GemmaTuron
Member Author

And then a few questions to better understand:

  1. Is running nox -s execute -p 3.10 -- --cli fetch equivalent to running nox -s execute -p -- --fetch?
  2. If I do not specify a source in nox -s execute -p 3.10 -- --cli fetch, will it fetch from_github only, or, like the baseline nox command, will it fetch from all sources?
  3. If the --single or --multiple flag is not passed, does everything run by default on eos3b5e?
  4. I don't truly understand the explanation of the --run flag. If I run nox -s execute -p -- --run, what is the default behaviour? Where is the model fetched from?
  5. Delete flag: does it delete the model specified at fetch time, or does it only work with all by default?
  6. Close flag: it is not in the table but I guess it exists? Does that make sense?
  7. I need help understanding the output of the playground. I have run nox -s execute -p 3.10 --activate_docker true --silent false --single eos4e40 and I get 2 failed, 6 passed, 5 warnings in 1196.95s (0:19:56), but the only real failure I see is in the catalog command. I do not see where the other failure is, and I cannot find the warnings. Could the catalog error be due to the .json file not being updated appropriately? catalog_20250203_162302.txt and eos4e40_playground.txt

I have updated the documentation based on the information you provided! Hope you like it, it's here.

@Abellegese
Contributor

Abellegese commented Feb 3, 2025

Q1) No, it's not equivalent: --cli is used to define the ersilia commands that we want to run, while --fetch is used to pass ersilia's fetch flags to nox. The same goes for other nox flags such as --example and --catalog.
Q2) If you don't specify anything, it fetches by deciding automatically; if Docker is inactive, it fetches from GitHub.
Q3) Yep, but you can pass any model you are interested in, for example --single eos2db3 or --multiple eos2db3 eos3b5e ....
Q4) I explained it there, but it is better to remove it from the documentation; I used it internally to build the run flags. Note that if you don't specify a fetching source such as --fetch from_github, it decides automatically. The run command by default runs the model for input types str, list, csv and output types result.{csv, json, h5}. We can pass those using --input_types and --outputs.
Q5) By default it deletes eos3b5e, but you can pass --cli delete --delete all to delete everything.
Q6) Yep, it does not need any flag; that is why I didn't put serve and close in the table. But we can show them in the usage examples.
Q7) According to the error log:

  • the first error is that it found None in the generated catalog command JSON results ('Input Shape': None and likewise 'Output Shape': None), which is why it raises the error:
1) Check 'Catalog json content is valid': False and Details: 'Validation failed for key 'Input Shape' in object: {'Index': 2, 'Identifier': 'eos3mk2', 'Slug': 'bbbp-marine-kinase-inhibitors', 'Title': 'BBBP model tested on marine-derived kinase inhibitors', 'Task': ['Classification'], 'Input Shape': None, 'Output': ['Probability'], 'Output Shape': None, 'Model Source': 'Local Repository'}'
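The check behind that failure is essentially key/value validation over each catalog entry, roughly like this (a sketch of the idea, not the actual rule):

def validate_catalog_entries(entries):
    # Sketch: every catalog entry must have all keys present with non-empty, non-None values.
    for entry in entries:
        for key, value in entry.items():
            if value in (None, "", [], {}):
                raise AssertionError(
                    f"Validation failed for key '{key}' in object: {entry}"
                )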

@GemmaTuron
Member Author

Hi @Abellegese

Thanks, but then for Q1: it is the same to run --cli fetch as to run --fetch directly, correct? Sorry if that was not clear.
How come the catalog is not correct? I think this has to do with issues in the Model test. Does the model test modify the local catalogue by any chance?

@Abellegese
Contributor

Hi @GemmaTuron, this is an important question: whether the test command is modifying the catalog result. Can you give me more details on how you used the test command for this model (eos3mk2)? Or could you go to eos/temp/eos3mk2/ and see whether the metadata.yml was recently updated?

@Abellegese
Contributor

On the first question: it is not the same. --fetch is a way to pass your fetch flags so that you can fetch from your desired source, whereas --cli fetch uses the auto-decider by default to fetch models.

@GemmaTuron
Member Author

On the first question: it is not the same. --fetch is a way to pass your fetch flags so that you can fetch from your desired source, whereas --cli fetch uses the auto-decider by default to fetch models.

ahh makes sense thanks!

@GemmaTuron
Member Author

Okay @Abellegese,
To summarise and wrap up the Playground features, here is a summary of everything I think needs to be updated:

  1. There is a blocking "permission denied" error on the delete command. Does this need to be fixed system-wide?
  2. The logs in the playground test are not required and can be eliminated.
  3. The test command (not the playground) produces some files that are stored locally wherever you run the test from. Would it make sense to store them somewhere specific, like a temporary folder?
  4. The issue with the catalog is difficult to debug. I do not think it is related to the playground itself, so we can consider it in the test command if anything.

For the rest, I think the documentation is basic but sufficient, also because this feature is aimed at more advanced developers and they can play with the different command combinations as they see fit.

@Abellegese
Contributor

Thanks @GemmaTuron, I will make a change for point 2.

@Abellegese
Contributor

The rest of the problems will be addressed in the test command fix.

Projects: Status: On Hold
Development: No branches or pull requests

3 participants