🐈 Task: Refactor models with run.sh script #1222

Closed
kurysauce opened this issue Aug 5, 2024 · 12 comments

@kurysauce
Contributor

kurysauce commented Aug 5, 2024

Summary

The updated test script reveals models that do not have a run.sh script. @GemmaTuron and @miquelduranfrigola agree that the solution is to refactor these models to include one. Running the updated test script on a model without a run.sh file yields the error "Check halted. Either run.sh file does not exist, or model was not fetched via --from_github or --from_s3."
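
A quick local check along these lines could list which model checkouts lack the script before running the full test. This is only a sketch, not Ersilia's own check: it assumes each model repository keeps the script at model/framework/run.sh (the layout visible in the eos4e40 log later in this thread), and both the function name and the `models_root` argument are hypothetical.

```python
from pathlib import Path


def find_models_missing_run_sh(models_root):
    """Return names of model directories that lack model/framework/run.sh.

    Assumption: every subdirectory of models_root is one model checkout.
    """
    missing = []
    for model_dir in sorted(Path(models_root).iterdir()):
        if not model_dir.is_dir():
            continue
        # Assumed location of the script inside a model repository.
        if not (model_dir / "model" / "framework" / "run.sh").is_file():
            missing.append(model_dir.name)
    return missing
```

Calling `find_models_missing_run_sh("/path/to/checkouts")` returns a sorted list of directory names, so the result can be pasted straight into an issue like this one.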

Objective(s)

Refactor models to include a run.sh file. Models eos4e40 and eos3b5e currently do not have one.

Documentation

No response

@miquelduranfrigola
Member

Thank you @kurysauce !

For now, I am tagging here @HarmonySosa, @LauraGomezjurado and @DhanshreeA and we'll see who has capacity to jump into this issue.

Two comments:

  1. eos3b5e should be easy to refactor and perhaps @HarmonySosa can help here, since she is dealing with it in the context of ersilia-pack?
  2. eos4e40 should not be too difficult to refactor either, and perhaps we can gain inspiration from a related model eos3804?

@DhanshreeA
Member

Also adding @Malikbadmus and @dzumii here.

@dzumii
Contributor

dzumii commented Aug 6, 2024

Ok, checking this out

@kurysauce
Contributor Author

Currently working on eos4e40 with @miquelduranfrigola

@kurysauce
Contributor Author

The log from a local bash run of eos4e40 shows that outputs are being captured:

(eos4e40) @kurysauce ➜ /workspaces/eos4e40/model/framework (main) $ bash run.sh . /workspaces/eos4e40/model/framework/input.csv /workspaces/eos4e40/model/framework/pred.csv
Reading file /workspaces/eos4e40/model/framework/input.csv
1 molecules read!
Running model!
content after reading the file: ['CC(C)(C)c1cc(O)c(cc1O)C(C)(C)C']
['python /workspaces/eos4e40/model/framework/code/save_features.py --data_path /workspaces/eos4e40/model/framework/data.csv --save_path /workspaces/eos4e40/model/framework/features.npz --features_generator rdkit_2d_normalized', 'python /workspaces/eos4e40/model/framework/code/predict.py --test_path /workspaces/eos4e40/model/framework/data.csv --checkpoint_dir /workspaces/eos4e40/model/checkpoints/final_model --preds_path /workspaces/eos4e40/model/framework/pred.csv --features_path /workspaces/eos4e40/model/framework/features.npz --no_features_scaling']
Calculations done
this is the result: [{'50uM_Inhibition': 0.0025936177141886673}]
each item is:{'50uM_Inhibition': 0.0025936177141886673}
header is none: so it is dict_keys(['50uM_Inhibition'])
appending header: dict_values([0.0025936177141886673])
Writing results to /workspaces/eos4e40/model/framework/pred.csv
pred.csv:
['50uM_Inhibition'],[0.0025936177141886673]
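
For repeated local testing, the logged call can be wrapped in a small helper. This is a sketch under assumptions: it reads the logged command as `bash run.sh . <input> <output>` executed from inside the framework directory, with `.` standing for that directory. The function names are hypothetical and not part of Ersilia.

```python
import subprocess
from pathlib import Path


def build_run_command(input_csv, output_csv):
    """Build the argument list matching the logged call: bash run.sh . <input> <output>."""
    return ["bash", "run.sh", ".", str(input_csv), str(output_csv)]


def run_model(framework_dir, input_csv, output_csv):
    """Run run.sh from inside framework_dir; raises CalledProcessError on a non-zero exit."""
    # Resolve to absolute paths because the command runs with cwd=framework_dir.
    cmd = build_run_command(Path(input_csv).resolve(), Path(output_csv).resolve())
    subprocess.run(cmd, cwd=framework_dir, check=True)
    return Path(output_csv)
```

Using `check=True` means a failing model surfaces as an exception instead of silently leaving a stale or empty output file.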

I also modified the code that processes the output and reads the SMILES input. I am not sure what data structure the Ersilia command expects:

values = []
header = None
print(f"this is the result: {result}")
for item in result: 
    print(f"each item is:{item}")
    if header is None:
        print(f"header is none: so it is {item.keys()}")
        header = list(item.keys())
        print(f"appending header: {item.values()}")
        values.append(header) 
    values.append(list(item.values()))  

print("Writing results to", output_file)
with open(output_file, "w") as f:
    writer = csv.writer(f)
    writer.writerow(values)
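
The pred.csv line in the log above, `['50uM_Inhibition'],[0.0025936177141886673]`, is what `csv.writer.writerow` produces when it receives a list of lists: each inner list becomes one stringified cell on a single row. If the intended file is a header row followed by one row per molecule (an assumption; this thread does not spell out the format Ersilia expects), `writerows` gives that layout. A minimal sketch, with `write_predictions` as a hypothetical helper name:

```python
import csv


def write_predictions(result, output_file):
    """Write a list of dicts as CSV: one header row, then one row per item.

    Assumption: every dict in result has the same keys in the same order.
    """
    if not result:
        raise ValueError("result is empty, nothing to write")
    header = list(result[0].keys())
    rows = [list(item.values()) for item in result]
    with open(output_file, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)  # a single row of column names
        writer.writerows(rows)   # one row per prediction
```

With the result from the log, `[{'50uM_Inhibition': 0.0025936177141886673}]`, the file would hold the column name on the first line and the value on the second, with no list brackets.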

@kurysauce
Contributor Author

resolved

@miquelduranfrigola
Member

Thanks @kurysauce, this is awesome! Anything specific you need from me at this stage?

@kurysauce
Contributor Author

kurysauce commented Aug 19, 2024

Thanks @kurysauce, this is awesome! Anything specific you need from me at this stage?

Nope! The PR has been merged. I am just waiting for @HarmonySosa's implementation on model eos3b5e before closing the issue.

kurysauce reopened this Aug 19, 2024

@kurysauce
Contributor Author

kurysauce commented Aug 24, 2024

CLOSING COMMENTS:
Existing incorporated models need to be tested for structural issues in their repositories or hidden errors in the main.py script. Open new issues as needed.

@miquelduranfrigola
Member

Many thanks @kurysauce! Very useful summary. We will take it from here. CC'ing @DhanshreeA and @GemmaTuron

@DhanshreeA
Member

DhanshreeA commented Sep 4, 2024

Update from @kurysauce 's work:

Many thanks again for the fantastic work Kurt!

@GemmaTuron
Member

I will close this issue, as only eos4tcc remains open and it has its own issue.
