Fix: Inference script logging #194
Conversation
The function now accepts a single argument: log_file_path. If an empty string is provided, logging messages are only printed on stdout and not saved to a file.
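The described behaviour could be sketched roughly as follows. This is a minimal illustration, not FastSurfer's actual implementation: the handler setup, log format, and use of `force=True` are assumptions.

```python
import logging
import sys


def setup_logging(log_file_path: str) -> None:
    """Configure logging with a single argument.

    If log_file_path is an empty string, messages go to stdout only;
    otherwise they are additionally written to the file (mode "w",
    i.e. an existing file is overwritten).
    """
    handlers = [logging.StreamHandler(sys.stdout)]
    if log_file_path:
        handlers.append(logging.FileHandler(log_file_path, mode="w"))
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s %(levelname)s: %(message)s",
        handlers=handlers,
        force=True,  # replace any previously configured handlers
    )
```

With this shape, `setup_logging("")` prints to stdout only, while `setup_logging("/path/to/deep-seg.log")` also writes (and overwrites) the file.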
This does not make sense to me. Let's say I have a GPU and a CPU cluster. I create all my subjects' segmentations on the GPU cluster first in FreeSurfer/FastSurfer conform output (i.e. storing the output in $sid/mri/aparc.DKTatlas+aseg.mgz). In this case I want the log files for the segmentation in $sid/scripts/deep-seg.log and not all appended to one file. Depending on how many subjects I have, this might get massive, and it is really hard to find a single subject which might have failed. What is the point of appending all outputs to one file?
One issue may be that, since we now have to specify
Yes, the output directory is known (passed via --outdir or --pred_name), same with the subject ID, so why not just use this for the logging? I.e. create a separate scripts directory under the subject ID as before. We could just have a log flag to indicate whether you want to generate the log file or not.
Why is the path
In any case, with these replacements, the user can have either per-subject or general logging; it is just a question of what the default should be.
I understand your reasoning; I would just additionally make it an environment variable. Also, maybe you even want to allow "$SUBJECT_DIR" to be replaced by the current subject directory within the script....
I don't quite understand; tqdm has options to work with logging... that is why you do it this way in FastSurfer/FastSurferCNN/eval.py (line 158 in 3bc21af).
What you are doing here is not logging, but just "printing". Use tqdm.write() (https://tqdm.github.io/docs/tqdm/#write) instead of print, e.g. from tqdm import write as print.
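For context, `tqdm.write()` temporarily clears active progress bars, prints the message, and redraws the bars, so log lines and the bar do not collide. It can also be hooked into the `logging` module through a small custom handler. The sketch below illustrates that pattern; it is not code from this repository, and the logger name is hypothetical.

```python
import logging

from tqdm import tqdm


class TqdmLoggingHandler(logging.Handler):
    """A logging handler that emits records via tqdm.write().

    This keeps log messages from corrupting an active tqdm progress bar.
    """

    def emit(self, record: logging.LogRecord) -> None:
        try:
            tqdm.write(self.format(record))
        except Exception:
            self.handleError(record)


# Hypothetical logger name, used only for this demonstration.
logger = logging.getLogger("inference_demo")
logger.addHandler(TqdmLoggingHandler())
logger.setLevel(logging.INFO)

for i in tqdm(range(3), desc="subjects"):
    logger.info("processed subject %d", i)
```

The bar itself goes to stderr by default, while `tqdm.write()` prints to stdout, so the two streams stay cleanly separated.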
see comments
Right, but to keep with the FreeSurfer directory structure this would have to be handled differently for the two cases: the original idea was a boolean
To be consistent with the Freesurfer directory structure (
I see your points.
Maybe it would be less convenient to have to set the log file path (if the user wants a custom path) with an environment variable? I think a script argument may be better, especially if run_prediction.py is used directly. About tqdm: the idea is that the progress bar is probably more useful on the GUI, and maybe not so much in the log file, and so it can be excluded (if we log only from run_prediction.py, it already is, since LOGGER does not register its prints).
I agree with @AhmedFaisal95. Command line arguments are probably easier for most people than ENV variables. For the logging, we could also do the same as for the --pred_name/orig_name routine:
I thought tqdm was superior on both command line and logfile level. @AhmedFaisal95 Is there any question left on why the stdout with tqdm is "broken" or how to fix it, or is this resolved?
No technical question; if its output should still be included in the log file, I can figure out how to fix the formatting in the written file (e.g. for what you mentioned above). (Note: the stdout output in the terminal is fine.) I think a question from an earlier discussion is whether its output should be included in the log file at all.
Maybe we should have a meeting about these points. I don't really like run_prediction to process multiple files at once. For the use case described above (GPU cluster for segmentation, followed by CPU cluster for recon-surf) one might as well loop through subjects and call run_fastsurfer --segonly etc. That will guarantee that outputs go into FreeSurfer directory structures and log files will be there. Batch processing inside run_prediction is more targeted at users who want only the segmentation on lots of files. Since we do not provide any statistics (volumes of structures etc.), they need their own downstream processing anyway. They know what they are doing and they can get a single log file. Or they can loop over their subjects and call run_prediction individually. In fact, I see no real reason to further support batch processing inside run_prediction.
I think there was a confusion about log files and logging in shell and Python scripts. It should be resolved by your changes, which move logging into the Python script instead of doing it in the shell script with tee. Further, the progress bar should not be part of the logfile at all, in my opinion. With respect to processing, this really is a scheduling problem. It would probably be easier if we keep support for multiple subjects in segmentation-only mode, just because they are individually so fast that it would be extra hassle to involve a cluster scheduler... (i.e. better usability)
Yes, that is what we decided yesterday in our meeting also. @LeHenschel will work on logging for batch mode. |
Just for reference, this is a sample
We have decided to merge these changes in for now, especially to fix the current logging behaviour (logging inside the program directory and overwriting the same file). However, |
Description

In the current `feature/vinn` implementation, `run_prediction.py` (inference) logs are saved in `./FastSurferVINN/logs/expr_fastsurfer.log`. The main directory (`FastSurferVINN`) can only be changed by modifying the `LOG_DIR` variable in the model config files (such as this one) or setting the `out_dir` argument. If FastSurfer is run multiple times with the same config, the same file is appended to.

In addition, a separate log file is already saved within `run_fastsurfer.sh` under `$SUBJECT_DIR/scripts/deep-seg.log`.

Following some discussions with @m-reuter, we decided it would be better to log to one file within `run_prediction.py`, whose path should be specified through an argument: `log_name`. If not provided, the logs will not be written to a file; this will be the default behaviour. In the call to `run_prediction.py` in `run_fastsurfer.sh`, this argument will be set to `$SUBJECT_DIR/scripts/deep-seg.log`.

One advantage of writing the logs within the Python script instead of dumping the output of its command within the shell script (`run_fastsurfer.sh`) is that the tqdm output is skipped. In the latter case, the output is included but is not formatted properly (see example in the screenshot below).

Changes

This PR applies these changes, which involve the following:

- `logging.py`: modifying the `setup_logging` function to accept a single argument, `log_file_path`. If an empty string is provided, logging messages are only printed on screen and not saved to a file. The `LOG_DIR` variable is not used in this function anymore.
- `run_prediction.py`: implementing the optional arg `log_name` and adapting the `setup_logging` call.
- `run_fastsurfer`: adding the `log_name` arg to the `run_prediction.py` call and removing the piping of its output to the log file.

Note:

- If multiple subjects are processed by `run_prediction.py` in a single call, all logs are saved to the same file (if specified).
- If the `log_name` file exists, it is overwritten.

Extra

- `tqdm` output in `deep-seg.log` when written from `run_fastsurfer.sh`
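The CLI wiring for the new argument could look roughly like this. The argument name comes from this PR; the default value, help text, and example path are assumptions, and the real `run_prediction.py` parser has many more options.

```python
import argparse

# Sketch of exposing an optional log_name argument on the inference CLI.
parser = argparse.ArgumentParser(description="FastSurfer inference (sketch)")
parser.add_argument(
    "--log_name",
    type=str,
    default="",  # empty string: log to stdout only, write no file
    help="Path of the log file; if omitted, no log file is written.",
)

# Example: run_fastsurfer.sh would pass a path under the subject directory
# (the path below is illustrative only).
args = parser.parse_args(["--log_name", "/data/subject1/scripts/deep-seg.log"])
# setup_logging(args.log_name) would then be called with this value.
```

When the flag is omitted, `args.log_name` is the empty string, which is the signal to skip file logging entirely.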