Nnunet ensemble #198
Conversation
The static code check failure was due to a new vulnerability found in pytorch versions below 2.2.0. I tried updating the dependencies, but poetry still resolves torch to version 2.1.2 despite the pyproject.toml allowing any version above 2.1.1. I'm assuming some other dependency is preventing us from upgrading to a newer torch version, so I just ignored the vulnerability for now.
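For reference, the constraint in question would look something like this in pyproject.toml (a hypothetical excerpt; the actual file may differ):

```toml
[tool.poetry.dependencies]
# Allows any torch release above 2.1.1, but a transitive dependency's
# upper bound can still force the resolver to settle on 2.1.2.
torch = ">2.1.1"
```

Inspecting poetry.lock (or the output of `poetry show --tree`) should reveal which package is pinning the upper bound.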
As for the smoke tests, the failure has something to do with the metrics, but I didn't change anything in fl4health; all the changes were restricted to the research folder, so I don't know why the smoke tests are failing. Also, I'm not sure if there's a way to see which smoke tests are failing, but here is the error I get when I run them myself.
Ok, strange that they all passed now, but I'll take it.
Also, I was able to remove the ignores for the mlflow vulnerabilities, so I guess that issue was solved.
All in all, some great work! I have comments mostly around small details, clarifications and documentation requests.
I also echo Masoom's point about supporting evaluation without requiring the entire test set in RAM. As you mention, it seems like a lot of this would be transferable. It's up to you whether to tackle that in this PR or to start another branch off this one for a future PR adding that functionality. If there is a good reason to still support the RAM-based evaluation, we can keep it in; otherwise it may be simpler, and less code, to support solely the disk-based evaluation.
Ok, modifying things to use disk ended up being a bit more work than I thought. What we have works and is a nice quick-and-dirty method. We can discuss a big refactor in a future PR. I will look into the comments John made to see if any of them still apply.
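For context, here is a minimal sketch of the disk-based pattern discussed above, using picai_eval's folder-based entry point. The directory paths are hypothetical, and this is not the implementation in this PR:

```python
from picai_eval import evaluate_folder

# Evaluate detection maps stored on disk against ground-truth annotations,
# reading cases from files rather than holding the whole test set in RAM.
metrics = evaluate_folder(
    y_det_dir="path/to/detection_maps",  # hypothetical directory of predictions
    y_true_dir="path/to/annotations",    # hypothetical directory of labels
)
print(metrics)  # aggregate picai_eval metrics (e.g. AUROC, AP)
```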
LGTM!
PR Type
[Feature | Fix | Documentation | Other]
Short Description
This PR introduces major updates to the prediction and evaluation pipelines for nnunet models, namely a script that can be called to run inference, lesion extraction, and picai_eval evaluation end to end.
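As a rough sketch of how those three stages chain together (an illustrative, in-memory simplification using picai_eval's documented API, not the actual script in this PR; all paths and the function name are hypothetical):

```python
from pathlib import Path

import numpy as np
from picai_eval import evaluate
from report_guided_annotation import extract_lesion_candidates

def run_pipeline(softmax_dir: Path, label_dir: Path):
    """Hypothetical sketch: load nnunet softmax predictions, extract lesion
    candidates, then compute picai_eval metrics. Assumes predictions and
    labels are saved as .npy files with matching filenames."""
    y_det, y_true = [], []
    for pred_path in sorted(softmax_dir.glob("*.npy")):
        softmax = np.load(pred_path)
        # extract_lesion_candidates returns the detection map as its first element
        y_det.append(extract_lesion_candidates(softmax)[0])
        y_true.append(np.load(label_dir / pred_path.name))
    return evaluate(y_det=y_det, y_true=y_true)
```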
Notes
I would like to mention that in the second half of working on this, the code had to become a lot more involved than I would have liked. I think this is a good solution for now, as it is robust, generalizable to multiple use cases (i.e. different numbers of models, multiclass segmentation, etc.), and it just works. However, in the future I would love to refactor this somehow to make it shorter and cleaner. I think there will be an opportunity at some point to refactor across the entire picai research folder and delete code. I didn't want to expand the scope of this PR, though, as refactoring would potentially be much more involved under the hood of the various APIs being used and take a lot of time. Additionally, I'm waiting on some of these packages to fix issues or make improvements to their own APIs (I create an issue and they normally just ask me to do the PR for them lol).
That being said, if you see some easy simplifications, I'd love to know. I'm really in deep on this one, so my head hurts when trying to reimagine it. This was a frustrating PR because I don't love the compromises I had to make.