ag_residues always 0 #5

Open
igortru opened this issue Oct 24, 2024 · 7 comments
Comments

@igortru

igortru commented Oct 24, 2024

Hi, I am asking only about the provided notebooks.
I don't see how the antigen structure is included in the affinity calculation:
ag_residues is always 0.

ag_residues=0,
self.ag_residues = ag_residues
antigen_max_pixels = self.ag_residues
idx_list += [i+max_res_h+max_res_l for i in range(min(antigen_max_pixels, img.shape[-1]-(h+l)))]
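To make the observation concrete, here is a minimal sketch of the index computation quoted above, wrapped in a standalone function. The function name `antigen_pixel_indices` and all numeric values are illustrative assumptions; only the list comprehension itself is taken from the quoted lines.

```python
def antigen_pixel_indices(ag_residues, max_res_h, max_res_l, img_width, h, l):
    """Reproduce the quoted index computation for the antigen block.

    Hypothetical wrapper: only the list comprehension comes from the
    quoted snippet; the function name and arguments are for illustration.
    """
    antigen_max_pixels = ag_residues
    n_antigen = min(antigen_max_pixels, img_width - (h + l))
    return [i + max_res_h + max_res_l for i in range(n_antigen)]


# Made-up sizes, chosen only to show the behaviour.
no_antigen = antigen_pixel_indices(ag_residues=0, max_res_h=5, max_res_l=4,
                                   img_width=20, h=5, l=4)
some_antigen = antigen_pixel_indices(ag_residues=3, max_res_h=5, max_res_l=4,
                                     img_width=20, h=5, l=4)
print(no_antigen)    # [] because range(0) is empty
print(some_antigen)  # [9, 10, 11]
```

With ag_residues=0 the comprehension contributes nothing to idx_list, so no explicit antigen rows or columns are selected.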

My actual goal is to check the affinity of a complex I folded myself, and it is not clear how this can be done.
For example: how do I create an npy file that contains Chothia numbering for the antibody (ANARCI?) and the antigen at the same time?

@kevinmicha
Owner

kevinmicha commented Oct 24, 2024

Hi igortru,

ag_residues is a legacy variable, so it is not a bug that it is always zero.

ANTIPASTI expects Chothia numbering for the antibody regions; you might find this function helpful. Although I have not tested it myself, it appears straightforward and promising. It is part of the IgFold repository and AbNumber-based.

Alternatively, if you already have a PDB file with any numbering scheme and a list of residues using Chothia numbering, you can save them in the appropriate folders under notebook/test_data/ (i.e., structure/ and list_of_residues/), and then adapt the second cell of the affinity prediction notebook.
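As a rough illustration of the manual route, the sketch below writes a list of residue labels to an .npy file in a list_of_residues/ folder. It makes several assumptions: a temporary directory stands in for notebook/test_data/, the identifier `my_complex` is hypothetical, and the labels are placeholders, since the exact entries ANTIPASTI expects in the array are not spelled out in this thread.

```python
import os
import tempfile

import numpy as np

# Assumption: a temporary directory stands in for notebook/test_data/.
test_data = tempfile.mkdtemp()
structure_dir = os.path.join(test_data, "structure")
residues_dir = os.path.join(test_data, "list_of_residues")
os.makedirs(structure_dir, exist_ok=True)
os.makedirs(residues_dir, exist_ok=True)

pdb_code = "my_complex"  # hypothetical identifier

# Placeholder Chothia labels (insertion codes such as 52A are kept as text).
# The exact entries ANTIPASTI expects are not specified in this thread.
chothia_labels = ["1", "2", "3", "52", "52A", "53"]

# The PDB file itself would be copied into structure_dir separately.
residues_file = os.path.join(residues_dir, pdb_code + ".npy")
np.save(residues_file, np.array(chothia_labels))

# Reading the file back confirms the labels round-trip as strings.
restored = np.load(residues_file).tolist()
print(restored)
```

Storing the labels as strings keeps insertion codes intact, which plain integers would lose.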

Hope this helps!

@igortru
Author

igortru commented Oct 24, 2024

Do I understand correctly that the masked DCCM matrix (input_shape) always has a constant size and somehow includes information about the antigen?
How exactly do you create the list_of_residues files from PDBs? They have different from anarci format.

@kevinmicha
Owner

Hi @igortru,

Sorry for my late reply. I had missed this message.

Yes, the DCCM matrices always have the same shape:

  • The antibody chains are aligned following the Chothia numbering convention
  • As the Normal Modes are computed for the antibody-antigen complex, the antigen information is contained in the antibody pixels through a mechanism that we call 'antigen imprinting' (see our paper for more details)
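One way to picture how alignment to a shared numbering yields a constant shape is sketched below. This is not ANTIPASTI's code: the function `embed_in_fixed_grid`, the five-position reference numbering, and the all-ones matrices are assumptions made purely for illustration.

```python
import numpy as np


def embed_in_fixed_grid(corr, present_labels, reference_labels):
    """Place a correlation matrix into a fixed-size grid keyed by residue label.

    corr[i, j] is the correlation between present_labels[i] and
    present_labels[j]; positions absent from the structure stay zero.
    """
    slot = {label: k for k, label in enumerate(reference_labels)}
    idx = np.array([slot[label] for label in present_labels])
    grid = np.zeros((len(reference_labels), len(reference_labels)))
    grid[np.ix_(idx, idx)] = corr
    return grid


# Hypothetical reference numbering with five positions.
reference = ["1", "2", "3", "3A", "4"]

# Two made-up structures that cover different subsets of the reference.
small = embed_in_fixed_grid(np.ones((2, 2)), ["1", "4"], reference)
large = embed_in_fixed_grid(np.ones((4, 4)), ["1", "2", "3", "4"], reference)

print(small.shape, large.shape)  # both (5, 5)
```

Both structures end up in a grid of the same shape even though they contain different numbers of residues; missing positions are simply left at zero.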

The list of residues is obtained from a PDB file in the generate_fv_pdb function of the Preprocessing class (click here for the exact location). These residues need to follow, as mentioned, the Chothia numbering convention; I am unsure what you mean by 'they have different from anarci format'.
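For readers who want to see what reading residue labels from a PDB file involves, here is a minimal standalone sketch. It is a stand-in and not the generate_fv_pdb implementation: the names `residue_labels_by_chain` and `atom_line` are hypothetical, and the sample records are made up. It relies only on the fixed-column layout of PDB ATOM records.

```python
def residue_labels_by_chain(pdb_text):
    """Collect residue labels (number plus insertion code) per chain.

    Uses the fixed-column PDB layout: chain ID in column 22, residue
    sequence number in columns 23-26, insertion code in column 27.
    """
    labels = {}
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        chain = line[21]
        label = (line[22:26] + line[26]).strip()
        chain_labels = labels.setdefault(chain, [])
        # Each residue spans several ATOM records; keep one label per residue.
        if not chain_labels or chain_labels[-1] != label:
            chain_labels.append(label)
    return labels


def atom_line(serial, name, res_name, chain, res_seq, icode=" "):
    """Build a column-aligned ATOM record with dummy coordinates (demo helper)."""
    return "ATOM  {:>5d} {:<4s} {:>3s} {}{:>4d}{}   {:8.3f}{:8.3f}{:8.3f}".format(
        serial, name, res_name, chain, res_seq, icode, 0.0, 0.0, 0.0
    )


# Made-up records: heavy chain H with an insertion (52A), light chain L.
sample = "\n".join([
    atom_line(1, "N", "SER", "H", 52),
    atom_line(2, "CA", "SER", "H", 52),
    atom_line(3, "N", "GLY", "H", 52, "A"),
    atom_line(4, "N", "TYR", "H", 53),
    atom_line(5, "N", "ASP", "L", 1),
])

labels = residue_labels_by_chain(sample)
print(labels)  # {'H': ['52', '52A', '53'], 'L': ['1']}
```

Whether the labels come out in Chothia numbering depends entirely on how the input PDB was numbered; this sketch only reads what is already in the file.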

Best,
Kevin

@igortru
Author

igortru commented Nov 7, 2024

Hi, Kevin!
Thank you for the explanations.

About the npy format:
if I use AlphaFold predictions, I need to provide an npy file for the
get_lists_of_lengths function before the generate_fv_pdb call.


def load_test_image(self):
    r"""Returns a test normal mode correlation map which is masked according to the existing residues in the training set.

    """
    pdb_id = self.test_pdb_id

    if self.alphafold is True:
        h, l, _ = self.get_lists_of_lengths(selected_entries=str(pdb_id[:-3]).split())
 ////
               self.generate_fv_pdb(self.test_structure_path+pdb_id+self.file_type_input, lresidues=lresidues, hupsymchain=hupsymchain, lupsymchain=lupsymchain)
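The selected_entries expression in the excerpt can be tried in isolation. The identifier below, including its three-character suffix, is a hypothetical example and not a file from the repository.

```python
# Hypothetical test identifier with a three-character suffix.
pdb_id = "1abc_af"

# Same expression as in the excerpt: drop the last three characters,
# then split on whitespace to obtain a list.
selected_entries = str(pdb_id[:-3]).split()
print(selected_entries)  # ['1abc']
```

So get_lists_of_lengths receives a one-element list holding the identifier without its last three characters.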

@kevinmicha
Owner

Hi,

Specify the desired PDB code as the selected entry. This will point to the directory where the residue lists are stored and fetch the corresponding numpy array. For this to work, the residue list needs to already be saved in that directory (either automatically via the main pipeline or manually, if you don't have a Chothia-annotated PDB).
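The lookup described here can be sketched as follows. This is an illustration under assumptions, not the library's own loader: the helper `load_residue_list`, the `<code>.npy` file naming, the temporary directory, and the sample entries are all hypothetical.

```python
import os
import tempfile

import numpy as np


def load_residue_list(residues_dir, pdb_code):
    """Load the saved residue list for one entry, failing with a clear message."""
    path = os.path.join(residues_dir, pdb_code + ".npy")
    if not os.path.isfile(path):
        raise FileNotFoundError(
            "No residue list for '{}' in {}; save it there first.".format(
                pdb_code, residues_dir
            )
        )
    return np.load(path).tolist()


# Demonstration in a temporary directory (stand-in for the real folder).
residues_dir = tempfile.mkdtemp()
np.save(os.path.join(residues_dir, "1abc.npy"), np.array(["1", "2", "3"]))

found = load_residue_list(residues_dir, "1abc")

try:
    load_residue_list(residues_dir, "9xyz")
    missing_reported = False
except FileNotFoundError:
    missing_reported = True

print(found, missing_reported)
```

Checking for the file up front makes the prerequisite explicit: the array has to exist in the directory before the entry is selected.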

Hope this is helpful!

@igortru
Author

igortru commented Nov 12, 2024

Hi Kevin!
You are the author, so you decide what gets implemented, but in my opinion
your tool is just a template that requires understanding what is going on inside.
It is not user-friendly.
Also, you do not provide a command-line version.

From my perspective, all numbering issues should be hidden inside the tool:
run a numbering tool such as ANARCI, generate the correspondingly numbered PDB, etc.

I have tried your tool on my predicted heavy-chain-only antibody-antigen pairs
(AF2 multimer). It was not easy to modify your tool for this task.
In the end, I obtained results that are very different from the experimental ones.

@igortru
Author

igortru commented Nov 12, 2024

[Attached image: Screenshot 2024-10-23 at 5 44 16 PM]
