Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modify resclust outputs #330

Open
wants to merge 15 commits into
base: main
Choose a base branch
from
Open

Conversation

VGPReys
Copy link
Contributor

@VGPReys VGPReys commented Aug 30, 2023

To solve the two improvments addressed in issue #327:

  1. Dict keys have been casted to string.
    Tests data have been modified too.
    While printed in log, dict is converted to string, and default ' character are replaced to ", making sure the json library will be able to loads(string) the data.

  2. While providing the --output=path/to/results argument, the user can now add a path where to generate the Clutering results.
    An additional test function was added to make sure the output is indeeed created while providing the argument.

mgiulini
mgiulini previously approved these changes Aug 30, 2023
Copy link
Collaborator

@mgiulini mgiulini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice, it works! maybe you can put also a non-json table in the output folder?
something like the standard arctic3d clustered_residues.out file

Cluster 1 -> 85 87 88 161 162
Cluster 5 -> 55 57 72 74

@mgiulini mgiulini linked an issue Aug 30, 2023 that may be closed by this pull request
@rvhonorato
Copy link
Member

rvhonorato commented Aug 31, 2023

@VGPReys please don't write titles like: "Solving issue #327" - instead describe what this pull request will add to the codebase 😉 and please don't make one pull request that adds more than one thing

@rvhonorato rvhonorato added the enhancement New feature or request label Aug 31, 2023
Comment on lines 130 to 131
if (strcl := str(clusters[cl])) not in cl_dict.keys():
cl_dict[strcl] = [ligands[cl]]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will change the type of the key from integer to string and can have breaking consequences, why not use integers?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the point of the PR, integer as keys cannot be loaded using the json library.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes this is the default behaviour of json.load, casting the type into it needs you to reclass a Decoder

@VGPReys VGPReys changed the title Solving issue #327 Modify resclust outputs Aug 31, 2023
@rvhonorato rvhonorato self-requested a review August 31, 2023 12:21
Copy link
Member

@rvhonorato rvhonorato left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't agree with this and the code looks very messy.

json.loads cannot read keys as integers because this is the default json format.

This PR adds an anti-pattern solution as convenience and I don't think this is a good idea, what needs to be done is to re-think the output data structure and adopt a standard that is compliant with the default format.

Arguably changing 1 to "1" does achieve that but it relies heavy on python dynamic type casting and less on the output data formalism, which can be very prone to errors in the future.

up to you @mgiulini

@mgiulini
Copy link
Collaborator

mgiulini commented Sep 4, 2023

I don't agree with this and the code looks very messy.

json.loads cannot read keys as integers because this is the default json format.

This PR adds an anti-pattern solution as convenience and I don't think this is a good idea, what needs to be done is to re-think the output data structure and adopt a standard that is compliant with the default format.

Arguably changing 1 to "1" does achieve that but it relies heavy on python dynamic type casting and less on the output data formalism, which can be very prone to errors in the future.

up to you @mgiulini

Hi there, sorry for the late reply.

For what I see, adding the output option that generates a clustered_residues.out file like the following

Cluster 1 -> 85 87 88 161 162
Cluster 5 -> 55 57 72 74

can be useful and is in line with the other output files. As for the json, I agree that that modification is not in the purpose of the project and of this CLI in particular. Maybe @VGPReys you can load the cluster data from the clustered_residues.out file via pandas, what do you think?

@VGPReys
Copy link
Contributor Author

VGPReys commented Sep 4, 2023

I don't agree with this and the code looks very messy.
json.loads cannot read keys as integers because this is the default json format.
This PR adds an anti-pattern solution as convenience and I don't think this is a good idea, what needs to be done is to re-think the output data structure and adopt a standard that is compliant with the default format.
Arguably changing 1 to "1" does achieve that but it relies heavy on python dynamic type casting and less on the output data formalism, which can be very prone to errors in the future.
up to you @mgiulini

Hi there, sorry for the late reply.

For what I see, adding the output option that generates a clustered_residues.out file like the following

Cluster 1 -> 85 87 88 161 162
Cluster 5 -> 55 57 72 74

can be useful and is in line with the other output files. As for the json, I agree that that modification is not in the purpose of the project and of this CLI in particular. Maybe @VGPReys you can load the cluster data from the clustered_residues.out file via pandas, what do you think?

Ok then let's just stick to the textual version in clustered_residues.out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Retrieving results from resclust output
3 participants