Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GENFI] Enable use of tsv to extract more clinical data #1005

Merged
merged 14 commits into from
Oct 26, 2023

Conversation

MatthieuJoulot
Copy link
Contributor

In its current state, this converter extracts only the most essential data from the spreadsheets.
This PR proposes to use a tsv given by the user specifying the name of the fields he wants, regardless of which spreadsheet the data is in.
The data must be given in the with underscores instead of spaces, and no capital letters
The data will then be written in the right place (participants.tsv/sessions.tsv/scans.tsv) using a mapping of the data.

@pep8speaks
Copy link

pep8speaks commented Oct 23, 2023

Hello @MatthieuJoulot! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2023-10-25 12:12:06 UTC

@MatthieuJoulot MatthieuJoulot marked this pull request as ready for review October 24, 2023 09:19
Copy link
Member

@NicolasGensollen NicolasGensollen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @MatthieuJoulot !
I made a first pass with some minor suggestions along the way.
Are you planning to add some clinical data to the CI dataset as well ?

clinica/iotools/converters/genfi_to_bids/genfi_to_bids.py Outdated Show resolved Hide resolved
clinica/iotools/converters/genfi_to_bids/genfi_to_bids.py Outdated Show resolved Hide resolved
docs/Converters/GENFItoBIDS.md Outdated Show resolved Hide resolved
MatthieuJoulot and others added 2 commits October 25, 2023 08:41
Co-authored-by: Gensollen <nicolas.gensollen@gmail.com>
Copy link
Member

@NicolasGensollen NicolasGensollen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @MatthieuJoulot !

The PR LGTM. I have a few additional suggestions but should be good to go once implemented.

if path_to_clinical_tsv:
additional_data_df = pd.read_csv(path_to_clinical_tsv, sep="\t")

path_to_mapping_tsv = os.path.join(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should use Path objects consistently across the code base rather than strings. Especially when it makes the code much easier to read. I would modify both path_to_mapping_tsv and path_to_ref_csv.

To be verified, but this would become:

from clinica.utils.filemanip import get_parent
genfi_data_folder = get_parent(__file__, 3) / "data"
path_to_ref_csv = genfi_data_folder / "genfi_ref.csv"
...
path_to_mapping_tsv = genfi_data_folder / "genfi_data_mapping.tsv"

@@ -45,9 +50,19 @@ def convert_images(
write_bids,
)

# check that if a clinical tsv is given, a path to the clinical data is given as well
if path_to_clinical_tsv and not path_to_clinical:
raise ValueError("Missing a clinical_data_path.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can imagine users getting stuck here when providing only path_to_clinical_tsv but not path_to_clinical since these concepts are quite close for the distracted reader of the documentation.

I would be very explicit to ease debugging. Maybe something like:

raise ValueError(
    "The Genfi2BIDS converter is unable to convert the clinical data because "
    "the path to these data was not provided while a TSV file with additional "
    f"data was given ({path_to_clinical_tsv}). You can either use the appropriate "
    "option from the clinica command line interface to provide the missing path, "
    "or chose to not convert clinical data at all."
)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good

@@ -9,6 +9,7 @@ def convert_images(
bids_dir: PathLike,
path_to_clinical: Optional[PathLike],
gif: bool,
path_to_clinical_tsv: Optional[PathLike] = None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you also add the = None to the path_to_clinical argument as well ?

Also, is gif optional ? If not, it should be before the optional arguments, if yes, it should be declared as optional with a default value.

Copy link
Member

@NicolasGensollen NicolasGensollen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @MatthieuJoulot !

@NicolasGensollen NicolasGensollen merged commit bf8e61d into aramis-lab:dev Oct 26, 2023
16 of 18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants