Flexible segments definition #919

Open
AnnaKravchenko opened this issue Jul 2, 2024 · 4 comments · May be fixed by #969
Labels: CNS (Improvements in the CNS engine), enhancement (Enhancing an existing feature of adding a new one), m|emref (emref module), m|flexref (flexref module), m|mdref (mdref module), m|mdscoring (mdscoring module)

Comments

@AnnaKravchenko
Contributor

AnnaKravchenko commented Jul 2, 2024

The definition of flexible/semi-flexible segments in the refinement modules is somewhat enigmatic.
One has to use the parameter nfleX, where X is the position (counting from 1) of the molecule in the molecules list, i.e. if
molecules = ['molecule1', 'molecule2']
and one wants 1 segment of molecule2 to be flexible, one needs to define nfle2 = 1 and not, for example, nfle1 or nfle3.
This way of defining a segment feels much less intuitive than the definition of the symmetry segments.
Plus, it is not explained at all at www.bonvinlab.org/haddock3/

Would it be possible to define flexible and semi-flexible segments similarly to the symmetry segments, i.e. using the chain/segment ID? Let’s say mol2.pdb contains chain B; the current definition of two flexible segments then looks like:

molecules = ["mol1.pdb", "mol2.pdb"]
# molecule 2 has 2 flexible segments
nfle2 = 2 
# 1st flexible segment of molecule 2 starts with residue 1
fle_sta_2_1 = 1 
# 1st flexible segment of molecule 2 ends with residue 4
fle_end_2_1 = 4 
# 2nd flexible segment of molecule 2 starts with residue 6
fle_sta_2_2 = 6 
# 2nd flexible segment of molecule 2 ends with residue 18
fle_end_2_2 = 18
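
To make the naming scheme above concrete, here is a minimal Python sketch that builds the current-style parameter names (nfleX, fle_sta_X_Y, fle_end_X_Y) for one molecule. It is only an illustration, not haddock3 code: the helper name `legacy_flex_params` and its arguments are hypothetical.

```python
def legacy_flex_params(mol_index, segments):
    """Build nfleX / fle_sta_X_Y / fle_end_X_Y entries for one molecule.

    mol_index: 1-based position of the molecule in the `molecules` list.
    segments:  list of (start_residue, end_residue) tuples.
    Hypothetical helper for illustration only.
    """
    if mol_index < 1:
        raise ValueError("mol_index is 1-based")
    # X in nfleX is the molecule position, not a chain ID.
    params = {f"nfle{mol_index}": len(segments)}
    for seg_nr, (start, end) in enumerate(segments, start=1):
        if start > end:
            raise ValueError(f"segment {seg_nr}: start {start} > end {end}")
        params[f"fle_sta_{mol_index}_{seg_nr}"] = start
        params[f"fle_end_{mol_index}_{seg_nr}"] = end
    return params


# Two flexible segments on the 2nd molecule (mol2.pdb in the example above).
params = legacy_flex_params(2, [(1, 4), (6, 18)])
```

Every name encodes the molecule position, which is why a segment on the second molecule needs nfle2 rather than nfle1.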

A possible simplified definition of the flexible segments could look like:

# 1st flexible segment belongs to chain B
flex_seg_1 = 'B'
# 1st flexible segment starts with residue 1 (within chain B)
flex_1_sta = 1 
# 1st flexible segment ends with residue 4 (within chain B)
flex_1_end = 4 
# 2nd flexible segment belongs to chain B
flex_seg_2 = 'B'
# 2nd flexible segment starts with residue 6 (within chain B)
flex_2_sta = 6 
# 2nd flexible segment ends with residue 18 (within chain B)
flex_2_end = 18 
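
As a rough illustration of how the two formats relate, the following Python sketch translates the proposed chain-based parameters into the current nfleX-style ones. It assumes a chain-to-molecule mapping is available; the function `proposed_to_legacy` and the `chain_to_mol` argument are hypothetical and do not exist in haddock3.

```python
def proposed_to_legacy(proposed, chain_to_mol):
    """Translate flex_seg_N / flex_N_sta / flex_N_end into the current format.

    proposed:     dict holding the proposed parameters.
    chain_to_mol: dict mapping a chain ID to the 1-based molecule position
                  (an assumption made for this sketch).
    """
    legacy = {}
    per_mol_count = {}
    n = 1
    # Walk flex_seg_1, flex_seg_2, ... until the first undefined index.
    while f"flex_seg_{n}" in proposed:
        mol = chain_to_mol[proposed[f"flex_seg_{n}"]]
        per_mol_count[mol] = per_mol_count.get(mol, 0) + 1
        k = per_mol_count[mol]
        legacy[f"fle_sta_{mol}_{k}"] = proposed[f"flex_{n}_sta"]
        legacy[f"fle_end_{mol}_{k}"] = proposed[f"flex_{n}_end"]
        n += 1
    for mol, count in per_mol_count.items():
        legacy[f"nfle{mol}"] = count
    return legacy


proposed = {
    "flex_seg_1": "B", "flex_1_sta": 1, "flex_1_end": 4,
    "flex_seg_2": "B", "flex_2_sta": 6, "flex_2_end": 18,
}
legacy = proposed_to_legacy(proposed, {"A": 1, "B": 2})
```

With chain B mapped to the second molecule, the result matches the current-style definition given earlier in this comment.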

Alternatively, a better description of the current (semi-)flexibility definition should be provided.

Current definition:

nfle1
default: 0
type: integer
title: Number of fully flexible segments
min: 0
max: 1000
short description: This defines the number of fully flexible segments.
long description: This parameter defines the number of fully flexible 
segments for the specified molecule. If >=1 then those must be defined 
manually with starting and end residue numbers in the fle_sta_* and 
fle_end_* variables.
group: flexibility
explevel: expert

Possible enhanced definition:

nfle1
default: 0
type: integer
title: Number of fully flexible segments in the 1st molecule
min: 0
max: 1000
short description: This defines the number of fully flexible segments 
of the 1st molecule.
long description: This parameter defines the number of fully flexible 
segments for the molecule that is the 1st entry in the 'molecules'
parameter. If >=1 then those must be defined manually with starting 
and end residue numbers in the fle_sta_* and fle_end_* variables.
group: flexibility
explevel: expert
@AnnaKravchenko AnnaKravchenko added enhancement Enhancing an existing feature of adding a new one documentation Improve docs labels Jul 2, 2024
@amjjbonvin
Member

amjjbonvin commented Jul 2, 2024 via email

@rvhonorato
Member

Looking into the code, you need these two numbers because of a loop:

while ($nchain1 < $data.ncomponents) loop nloop1
evaluate($nchain1 = $nchain1 + 1)

whose counter is later used in the expression:

do (store5 = $nchain1) ( resid $fle_sta_$nchain1_$nf : $fle_end_$nchain1_$nf

It looks to me like this change can be achieved just by changing this whole nloop1:

evaluate($nchain1 = 0)
while ($nchain1 < $data.ncomponents) loop nloop1
    evaluate($nchain1 = $nchain1 + 1)
    if ($nfle$nchain1 = 0) then
        display NO FULLY FLEXIBLE SEGMENTS for molecule $nchain1
    else
        display FULLY FLEXIBLE SEGMENTS for molecule $nchain1
        evaluate($nf=0)
        while ($nf < $nfle$nchain1) loop Xfflex
            evaluate($nf=$nf + 1)
            do (store5 = $nchain1) ( resid $fle_sta_$nchain1_$nf : $fle_end_$nchain1_$nf
                                     and segid $prot_segid_$nchain1 )
            display FULLY FLEXIBLE SEGMENT NR $nf FROM $fle_sta_$nchain1_$nf TO $fle_end_$nchain1_$nf
        end loop Xfflex
    end if
end loop nloop1

to:

evaluate($nchain1 = 0)
while ($nchain1 < $data.ncomponents) loop nloop1
    evaluate($nchain1 = $nchain1 + 1)
    evaluate($nflex = 0)
    evaluate($continue = true)
    while ($continue) loop count_flex
        evaluate($nflex = $nflex + 1)
        if (defined(flex_seg_$nflex) = FALSE) then
            evaluate($continue = false)
            evaluate($nflex = $nflex - 1)
        end if
    end loop count_flex
    
    if ($nflex = 0) then
        display NO FULLY FLEXIBLE SEGMENTS for molecule $nchain1
    else
        display FULLY FLEXIBLE SEGMENTS for molecule $nchain1
        evaluate($nf = 0)
        while ($nf < $nflex) loop Xfflex
            evaluate($nf = $nf + 1)
            do (store5 = $nchain1) ( resid $flex_$nf_sta : $flex_$nf_end
                                     and segid $flex_seg_$nf )
            display FULLY FLEXIBLE SEGMENT NR $nf FROM $flex_$nf_sta TO $flex_$nf_end IN CHAIN $flex_seg_$nf
        end loop Xfflex
    end if
end loop nloop1

So you first count how many flexible segments are defined using the new format, then use these new variables in the main loop. The segid part becomes $flex_seg_$nf instead of $prot_segid_$nchain1, and the residue range is given by $flex_$nf_sta and $flex_$nf_end.
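
The count-then-select logic described above can be mimicked in a few lines of Python. This is only a sketch of the idea, not the CNS implementation: `count_flex_segments` and `flex_selections` are hypothetical names, and the selection strings simply mirror the resid/segid expression quoted above.

```python
def count_flex_segments(params):
    """Count flex_seg_1, flex_seg_2, ... and stop at the first missing index."""
    nflex = 0
    while f"flex_seg_{nflex + 1}" in params:
        nflex += 1
    return nflex


def flex_selections(params):
    """Build one selection string per counted segment (illustrative only)."""
    selections = []
    for nf in range(1, count_flex_segments(params) + 1):
        start = params[f"flex_{nf}_sta"]
        end = params[f"flex_{nf}_end"]
        segid = params[f"flex_seg_{nf}"]
        selections.append(f"resid {start} : {end} and segid {segid}")
    return selections


params = {
    "flex_seg_1": "B", "flex_1_sta": 1, "flex_1_end": 4,
    "flex_seg_2": "B", "flex_2_sta": 6, "flex_2_end": 18,
}
for selection in flex_selections(params):
    print(selection)
```

One consequence of counting until the first undefined index is that a gap in the numbering (for example flex_seg_1 and flex_seg_3 without flex_seg_2) would silently drop the later segments, so some validation of the numbering may be worth considering.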

I do think this is a very valid (and not so complicated) change.

@rvhonorato
Member

Also, this syntax is used for a rather undocumented option to limit the random AIRs definition to selected segments per molecule.

And let's please not have this kind of thing. If this change also helps us tackle this, then it's two for one.

@amjjbonvin
Member

Actually, it should be doable to refactor this as suggested, as it is not used to limit the random AIR sampling to specific segments. We now have other parameters controlling that. I will look into it.

@amjjbonvin amjjbonvin reopened this Aug 6, 2024
amjjbonvin added a commit that referenced this issue Aug 7, 2024
@rvhonorato rvhonorato added m|emref emref module m|flexref flexref module m|mdref mdref module m|mdscoring mdscoring module CNS Improvements in the CNS engine and removed documentation Improve docs labels Aug 7, 2024
@VGPReys VGPReys linked a pull request Aug 9, 2024 that will close this issue
12 tasks
3 participants