staging: run recipes from firebase #179

Merged
rugeli merged 45 commits into main from feature/run-recipes-from-firebase on Dec 18, 2023

@rugeli rugeli commented Jul 13, 2023

Problem

Closes #167.

Solution

This is a staging branch for running recipes from Firebase.

Features in this branch that need to be reviewed:

  • a new DBUploader class for data conversion and uploading recipes into modules

already reviewed:

  • firebase credential input
  • AWSHandler for storing simularium results to S3
  • save metadata to firebase
  • open a new tab for simularium result
  • handlers are initiated using the DATABASE_IDS enum

Type of change

Please delete options that are not relevant.

  • New feature (non-breaking change which adds functionality)

Steps to Verify:

  1. Have a Firebase project set up and your personal credentials ready.
  2. Upload a recipe to your Firebase project: upload -r examples/recipes/v2/gradients.json
  3. Pack the remote recipe you uploaded: pack -r firebase:recipes/gradients_v_default -c examples/packing-configs/run.json
  4. Check the result in Simularium, AWS S3, and Firebase.
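
For reviewers unfamiliar with the remote path used in step 3, here is a minimal sketch of how a string such as firebase:recipes/gradients_v_default could be split into a database name, a collection, and a document id. The helper name parse_remote_path and its return shape are hypothetical and not taken from this branch, and the sketch assumes that any path containing a colon is remote (which would misread Windows drive paths).

```python
def parse_remote_path(path):
    """Split 'db:collection/doc_id' into its parts (hypothetical helper).

    Returns None when the path has no database prefix, in which case it
    would be treated as a local file path.
    """
    if ":" not in path:
        return None
    database, _, remainder = path.partition(":")
    collection, _, doc_id = remainder.partition("/")
    if not collection or not doc_id:
        raise ValueError(f"expected 'db:collection/id', got {path!r}")
    return database, collection, doc_id


print(parse_remote_path("firebase:recipes/gradients_v_default"))
```

A local path such as examples/recipes/v2/gradients.json contains no colon, so the sketch returns None for it.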

Keyfiles (delete if not relevant):

  1. DBRecipeHandler.py
  2. autopack/__init__.py

@rugeli rugeli requested review from mogres and meganrm July 13, 2023 19:14
@rugeli rugeli marked this pull request as draft July 13, 2023 22:58
@rugeli rugeli force-pushed the feature/run-recipes-from-firebase branch from af55458 to a928c1d Compare July 13, 2023 22:59
@github-actions
Copy link

github-actions bot commented Jul 13, 2023

Packing analysis report

Analysis for packing results located at cellpack/tests/outputs/test_spheres/spheresSST

Ingredient name | Encapsulating radius | Average number packed
ext_A | 25 | 236.0

Packing image

Distance analysis

Expected minimum distance: 50.00
Actual minimum distance: 50.01

Ingredient key | Pairwise distance distribution
ext_A | Distance distribution ext_A (image)

@rugeli rugeli changed the base branch from feature/resovle-grads-in-comp to main July 13, 2023 23:08
@codecov-commenter
Copy link

codecov-commenter commented Aug 2, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (37b661a) 98.52% compared to head (93787fc) 98.63%.
Report is 16 commits behind head on main.

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #179      +/-   ##
==========================================
+ Coverage   98.52%   98.63%   +0.10%     
==========================================
  Files          16       18       +2     
  Lines         476      511      +35     
==========================================
+ Hits          469      504      +35     
  Misses          7        7              

@rugeli rugeli force-pushed the feature/run-recipes-from-firebase branch from 3c0bbf1 to a928c1d Compare August 2, 2023 23:51
rugeli and others added 6 commits September 27, 2023 12:32
* turn off resolving inheritance while uploading

* able to upload recipes having "inherit" key

* get download and pack to work, refactors needed

* refactors
@rugeli rugeli changed the title Feature/run recipes from firebase staging: run recipes from firebase Oct 23, 2023
* refactor AWS and firebase handler

* databases initiation handling
@rugeli rugeli marked this pull request as ready for review November 9, 2023 23:19
@rugeli rugeli force-pushed the feature/run-recipes-from-firebase branch from 8dcc36c to 3e691a2 Compare November 13, 2023 22:08
@rugeli
Copy link
Collaborator Author

rugeli commented Nov 29, 2023

In this PR, I’d appreciate your focus on reviewing three main parts:

  1. autopack/__init__.py/load_file - where the firebase recipe data is detected and accessed
  2. recipe_loader.py/_read - where the firebase data is read and loaded
  3. DBRecipeHandler.py/DBRecipeLoader - where the conversion from db data to the local data format happens

Also, sorry for mixing the already-reviewed features with the awaiting-review features in one PR. I’ll manage my PR flow better moving forward to make it easier for everyone to review.

Copy link
Collaborator

@mogres mogres left a comment

Can you add instructions to create firebase credentials somewhere? These could be a page in the wiki that can be linked elsewhere.

The updated README is in the "upload local recipes to S3" PR (#189); you can view the instructions here: https://github.com/mesoscope/cellpack/pull/189/files

After some discussion and having addressed the changes, the recommended PR merge order is: #191, then #189, then (WIP) run recipes with "inherit" key, then #179. Feel free to review the first two when you have time. I'll convert this PR to a draft for now. Sorry for the confusion!

Maybe we could also add the link to create firebase credentials in the prompt to provide the credentials path?

return json.load(file_name)


def write_json_file(path, data):
Copy link
Collaborator

This will overwrite existing files at the path. Would it make sense to add an option for checking this?

Copy link
Collaborator Author

I just found out that we already check for existing keys and contents (like user and firebase creds) before calling this function, so it should be safe to leave this method in its original form.
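
If an explicit guard is ever wanted, one option is sketched below. The overwrite parameter is hypothetical and not part of this branch; the sketch only relies on the standard "x" file mode, which raises FileExistsError when the target already exists.

```python
import json
import os
import tempfile


def write_json_file(path, data, overwrite=True):
    # "overwrite" is a hypothetical parameter added for illustration.
    # Mode "x" (exclusive creation) refuses to open an existing file.
    mode = "w" if overwrite else "x"
    with open(path, mode) as f:
        json.dump(data, f, indent=4)


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "example.json")
    write_json_file(target, {"a": 1})
    try:
        write_json_file(target, {"a": 2}, overwrite=False)
    except FileExistsError:
        print("refused to overwrite", os.path.basename(target))
```

Keeping overwrite=True as the default would preserve the current behavior for existing callers.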

and "all_partners" in obj["partners"]
and not obj["partners"]["all_partners"]
)
else obj.get("partners", [])
Copy link
Collaborator

For recipes downloaded from a remote db, do we expect this to work if obj["partners"]["all_partners"] is not empty?

Copy link
Collaborator Author

Thank you for asking this! The if statement here determines whether the recipes retrieved from firebase have partners. During the upload process, the key "all_partners" is added to the "partners" of each obj, whether or not it has any partners.

The structure of the partners in Firebase is as follows:

  • for recipes without partners: obj["partners"] == {'all_partners': []}
  • for recipes with partners: obj["partners"] == {'all_partners': [<cellpack.autopack.interface_objects.partners.Partner object at xyz>]}
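
To make the two cases concrete, here is a small sketch of the check described above. The function name resolve_partners is hypothetical, and returning an empty list in the "no partners" case is an assumption, since the quoted snippet does not show that branch of the expression.

```python
def resolve_partners(obj):
    """Illustrative only: mirrors the condition quoted in this thread."""
    partners = obj.get("partners", [])
    # Uploaded recipes always carry "all_partners", even when it is empty.
    has_empty_marker = (
        isinstance(partners, dict)
        and "all_partners" in partners
        and not partners["all_partners"]
    )
    # Assumption: an empty marker is treated as "no partners".
    return [] if has_empty_marker else partners


print(resolve_partners({"partners": {"all_partners": []}}))
print(resolve_partners({"partners": {"all_partners": ["partner_a"]}}))
```

With a non-empty "all_partners" list the value is passed through unchanged, which is the case the question above asks about.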

if "gradients" in recipe_data:
# gradients in firebase recipes are already stored as a list of dicts
if "gradients" in recipe_data and not isinstance(
recipe_data["gradients"], list
Copy link
Collaborator

Do we need to initialize GradientData instances from the existing list of dictionaries on firebase? i.e. loop over the list (instead of a dict) and instantiate?

Copy link
Collaborator Author

Just like with Partners, the cellpack system organizes gradients into a list during the upload process. Therefore, we can skip processing gradients in firebase recipes.
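
As a rough illustration of that skip, the sketch below converts gradients only when they are not already a list. The function name normalize_gradients and the assumed local shape (a dict keyed by gradient name) are hypothetical; the real loader may build GradientData instances instead of plain dicts.

```python
def normalize_gradients(recipe_data):
    """Return recipe data whose "gradients" entry is a list of dicts."""
    gradients = recipe_data.get("gradients")
    if gradients is None or isinstance(gradients, list):
        # Firebase recipes already store a list, so nothing to convert.
        return recipe_data
    # Assumed local shape: {"gradient_name": {...options...}}
    as_list = [{"name": name, **options} for name, options in gradients.items()]
    return {**recipe_data, "gradients": as_list}


local_recipe = {"gradients": {"X_gradient": {"mode": "X"}}}
print(normalize_gradients(local_recipe)["gradients"])
```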

def test_is_nested_list():
    assert DataDoc.is_nested_list([]) is False
    assert DataDoc.is_nested_list([[], []]) is True
    assert DataDoc.is_nested_list([[1, 2], [3, 4]]) is True
Copy link
Collaborator

Should DataDoc.is_nested_list([1, [1, 2]]) be True or False?

Copy link
Collaborator Author

Good thought! Currently it returns False, though it should return True. We don't typically encounter nested lists like [1, [1, 2]] in our recipe data, but you are right that our method should be robust enough to handle all types of nested lists. I just modified the function to loop over the elements, so it should be good this time.
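
For reference, a loop-based check along these lines might look like the sketch below. It is a standalone function written for illustration, not the actual DataDoc.is_nested_list implementation, and it assumes that only list elements (not tuples) count as nesting.

```python
def is_nested_list(item):
    """Return True if item is a list containing at least one list."""
    if not isinstance(item, list):
        return False
    for element in item:
        # A single nested element is enough, wherever it appears.
        if isinstance(element, list):
            return True
    return False


print(is_nested_list([1, [1, 2]]))
```

Checking every element, rather than only the first, is what makes the mixed case [1, [1, 2]] return True.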

.gitignore Outdated Show resolved Hide resolved
@mogres
Copy link
Collaborator

mogres commented Dec 12, 2023

Looks good overall! I was able to upload and run the gradient recipe remotely. Thanks for adding detailed tests and documentation :)

@rugeli
Copy link
Collaborator Author

rugeli commented Dec 12, 2023

Can you add instructions to create firebase credentials somewhere? These could be a page in the wiki that can be linked elsewhere.

The updated README is in the "upload local recipes to S3" PR (#189); you can view the instructions here: https://github.com/mesoscope/cellpack/pull/189/files

After some discussion and having addressed the changes, the recommended PR merge order is: #191, then #189, then (WIP) run recipes with "inherit" key, then #179. Feel free to review the first two when you have time. I'll convert this PR to a draft for now. Sorry for the confusion!

Maybe we could also add the link to create firebase credentials in the prompt to provide the credentials path?

Good idea! We are going to refactor the authentication methods to integrate firebase credentials in issue #214.

rugeli and others added 3 commits December 12, 2023 13:53
Co-authored-by: Saurabh Mogre <saurabh.mogre@alleninstitute.org>
Copy link
Collaborator Author

@rugeli rugeli left a comment

@mogres Thank you for your thorough review! I've just addressed several issues. Please feel free to point out any additional edge cases or concerns that I may have overlooked.

@rugeli rugeli merged commit eb4c0fa into main Dec 18, 2023
7 checks passed
@rugeli rugeli deleted the feature/run-recipes-from-firebase branch December 18, 2023 20:01
@rugeli rugeli restored the feature/run-recipes-from-firebase branch January 10, 2024 19:19
@rugeli rugeli deleted the feature/run-recipes-from-firebase branch June 5, 2024 20:12
Development

Successfully merging this pull request may close these issues.

Run a recipe from firebase
4 participants