Flexible segments definition #919

AnnaKravchenko · 2024-07-02T09:32:43Z

The definition of flexible/semi-flexible segments in the refinement modules is somewhat enigmatic:
One has to use the parameter nfleX, with X matching the sequential number of the molecule to be made flexible, i.e. if
molecules = ['molecule1', 'molecule2']
and one wants 1 segment of molecule2 to be flexible, one needs to define nfle2 = 1 and not, for example, nfle1 or nfle3.
This way of defining a segment feels much less intuitive than the definition of symmetry segments.
Plus, it is not explained at all at www.bonvinlab.org/haddock3/

Would it be possible to define flexible and semi-flexible segments similarly to the symmetry segments, i.e. using the chain/segment id? Let's say mol2.pdb contains chain B; the current definition of two flexible segments then looks like:

molecules = ["mol1.pdb", "mol2.pdb"]
# molecule 2 has 2 flexible segments
nfle2 = 2 
# 1st flexible segment of molecule 2 starts with residue 1
fle_sta_2_1 = 1 
# 1st flexible segment of molecule 2 ends with residue 4
fle_end_2_1 = 4 
# 2nd flexible segment of molecule 2 starts with residue 6
fle_sta_2_2 = 6 
# 2nd flexible segment of molecule 2 ends with residue 18
fle_end_2_2 = 18
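
To make the naming rule explicit, here is a minimal Python sketch that derives the current-style parameter names from the position of a molecule in the molecules list. The helper name current_style_params is hypothetical and not part of haddock3; only the nfleX, fle_sta_X_Y and fle_end_X_Y patterns come from the example above.

```python
def current_style_params(mol_index, segments):
    """Build nfleX / fle_sta_X_Y / fle_end_X_Y entries for one molecule.

    mol_index: 1-based position of the molecule in the `molecules` list.
    segments:  list of (start_residue, end_residue) tuples.
    Hypothetical helper for illustration only; not a haddock3 function.
    """
    params = {f"nfle{mol_index}": len(segments)}
    for seg_index, (start, end) in enumerate(segments, start=1):
        params[f"fle_sta_{mol_index}_{seg_index}"] = start
        params[f"fle_end_{mol_index}_{seg_index}"] = end
    return params


molecules = ["mol1.pdb", "mol2.pdb"]
# X in nfleX is the 1-based index of the molecule, not its chain ID
mol_index = molecules.index("mol2.pdb") + 1
params = current_style_params(mol_index, [(1, 4), (6, 18)])
for key, value in params.items():
    print(f"{key} = {value}")
```

The point of the sketch is that the molecule number X is purely positional: reordering the molecules list changes which nfleX applies.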

Possible simplified definition of the flexible segment could look like:

# 1st flexible segment belongs to chain B
flex_seg_1 = 'B'
# 1st flexible segment starts with residue 1 (within chain B)
flex_1_sta = 1 
# 1st flexible segment ends with residue 4 (within chain B)
flex_1_end = 4 
# 2nd flexible segment belongs to chain B
flex_seg_2 = 'B'
# 2nd flexible segment starts with residue 6 (within chain B)
flex_2_sta = 6 
# 2nd flexible segment ends with residue 18 (within chain B)
flex_2_end = 18
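
As a rough illustration of how the two formats relate, the following Python sketch translates the proposed chain-based keys into the current molecule-indexed ones. It assumes a chain-to-molecule mapping (chain_to_mol) is available; that mapping and the function name proposed_to_current are hypothetical and only serve the example.

```python
import re


def proposed_to_current(proposed, chain_to_mol):
    """Translate flex_seg_N / flex_N_sta / flex_N_end into nfleX-style keys.

    proposed:     dict holding the proposed chain-based parameters.
    chain_to_mol: assumed mapping from chain ID to 1-based molecule index.
    Hypothetical helper for illustration only; not a haddock3 function.
    """
    # Collect the segment numbers N from the flex_seg_N keys
    seg_numbers = []
    for key in proposed:
        match = re.fullmatch(r"flex_seg_(\d+)", key)
        if match:
            seg_numbers.append(int(match.group(1)))

    # Group the residue ranges by the molecule their chain belongs to
    per_molecule = {}
    for n in sorted(seg_numbers):
        mol_index = chain_to_mol[proposed[f"flex_seg_{n}"]]
        per_molecule.setdefault(mol_index, []).append(
            (proposed[f"flex_{n}_sta"], proposed[f"flex_{n}_end"])
        )

    # Emit the current-style, molecule-indexed parameters
    current = {}
    for mol_index in sorted(per_molecule):
        segments = per_molecule[mol_index]
        current[f"nfle{mol_index}"] = len(segments)
        for seg_index, (start, end) in enumerate(segments, start=1):
            current[f"fle_sta_{mol_index}_{seg_index}"] = start
            current[f"fle_end_{mol_index}_{seg_index}"] = end
    return current


proposed = {
    "flex_seg_1": "B", "flex_1_sta": 1, "flex_1_end": 4,
    "flex_seg_2": "B", "flex_2_sta": 6, "flex_2_end": 18,
}
converted = proposed_to_current(proposed, chain_to_mol={"A": 1, "B": 2})
print(converted)
```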

Alternatively, a better description of the current flexibility/semi-flexibility definition should be provided.

Current definition:

nfle1
default: 0
type: integer
title: Number of fully flexible segments
min: 0
max: 1000
short description: This defines the number of fully flexible segments.
long description: This parameter defines the number of fully flexible 
segments for the specified molecule. If >=1 then those must be defined 
manually with starting and end residue numbers in the fle_sta_* and 
fle_end_* variables.
group: flexibility
explevel: expert

Possible enhanced definition:

nfle1
default: 0
type: integer
title: Number of fully flexible segments in the 1st molecule
min: 0
max: 1000
short description: This defines the number of fully flexible segments 
of the 1st molecule.
long description: This parameter defines the number of fully flexible 
segments for the molecule that is the 1st entry in the 'molecules' 
parameter. If >=1 then those must be defined manually with starting 
and end residue numbers in the fle_sta_* and fle_end_* variables.
group: flexibility
explevel: expert


amjjbonvin · 2024-07-02T10:00:21Z

It would be possible but requires a lot of refactoring of the CNS code… Also, this syntax is used for some rather undocumented option to limit the random AIRs definition to selected segments per molecule. I.e. I would go for a better description of the parameters.

rvhonorato · 2024-08-06T08:20:47Z

Looking into the code, you need these two numbers because of a loop:

haddock3/src/haddock/modules/refinement/flexref/cns/flex_segment.cns

Lines 17 to 18 in 01eb9ae

    
while ($nchain1 < $data.ncomponents) loop nloop1
    evaluate($nchain1 = $nchain1 + 1)

which is later used in the expression:

haddock3/src/haddock/modules/refinement/flexref/cns/flex_segment.cns

Line 26 in 01eb9ae

do (store5 = $nchain1) ( resid $fle_sta_$nchain1_$nf : $fle_end_$nchain1_$nf

It looks to me like this change can be achieved just by changing this whole nloop1:

haddock3/src/haddock/modules/refinement/flexref/cns/flex_segment.cns

Lines 16 to 31 in 01eb9ae

    
evaluate($nchain1 = 0)
while ($nchain1 < $data.ncomponents) loop nloop1
    evaluate($nchain1 = $nchain1 + 1)
    if ($nfle$nchain1 = 0) then
        display NO FULLY FLEXIBLE SEGMENTS for molecule $nchain1
    else
        display FULLY FLEXIBLE SEGMENTS for molecule $nchain1
        evaluate($nf=0)
        while ($nf < $nfle$nchain1) loop Xfflex
            evaluate($nf=$nf + 1)
            do (store5 = $nchain1) ( resid $fle_sta_$nchain1_$nf : $fle_end_$nchain1_$nf
                                     and segid $prot_segid_$nchain1 )
            display FULLY FLEXIBLE SEGMENT NR $nf FROM $fle_sta_$nchain1_$nf TO $fle_end_$nchain1_$nf
        end loop Xfflex
    end if
end loop nloop1

to:

evaluate($nchain1 = 0)
while ($nchain1 < $data.ncomponents) loop nloop1
    evaluate($nchain1 = $nchain1 + 1)
    evaluate($nflex = 0)
    evaluate($continue = true)
    while ($continue) loop count_flex
        evaluate($nflex = $nflex + 1)
        if (defined(flex_seg_$nflex) = FALSE) then
            evaluate($continue = false)
            evaluate($nflex = $nflex - 1)
        end if
    end loop count_flex
    
    if ($nflex = 0) then
        display NO FULLY FLEXIBLE SEGMENTS for molecule $nchain1
    else
        display FULLY FLEXIBLE SEGMENTS for molecule $nchain1
        evaluate($nf = 0)
        while ($nf < $nflex) loop Xfflex
            evaluate($nf = $nf + 1)
            do (store5 = $nchain1) ( resid $flex_$nf_sta : $flex_$nf_end
                                     and segid $flex_seg_$nf )
            display FULLY FLEXIBLE SEGMENT NR $nf FROM $flex_$nf_sta TO $flex_$nf_end IN CHAIN $flex_seg_$nf
        end loop Xfflex
    end if
end loop nloop1

So you first count how many flexible segments are defined using your new format, then use these new variables in the main loop. The segid part becomes $flex_seg_$nf instead of $prot_segid_$nchain1, and the residue range is given by $flex_$nf_sta and $flex_$nf_end.
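
The counting step can be mirrored in Python to show how it behaves. This is only a sketch of the logic described above, not haddock3 or CNS code; the function names and the plain dict of parameters are assumptions made for the example.

```python
def count_flex_segments(params):
    """Count consecutive flex_seg_1, flex_seg_2, ... keys, stopping at the first gap."""
    nflex = 0
    while f"flex_seg_{nflex + 1}" in params:
        nflex += 1
    return nflex


def flagged_segments(params):
    """Return (chain, start, end) for each counted segment, in order."""
    return [
        (params[f"flex_seg_{nf}"], params[f"flex_{nf}_sta"], params[f"flex_{nf}_end"])
        for nf in range(1, count_flex_segments(params) + 1)
    ]


params = {
    "flex_seg_1": "B", "flex_1_sta": 1, "flex_1_end": 4,
    "flex_seg_2": "B", "flex_2_sta": 6, "flex_2_end": 18,
}
segments = flagged_segments(params)
print(segments)

# A gap in the numbering ends the count early: segment 3 is never reached
gapped = {
    "flex_seg_1": "B", "flex_1_sta": 1, "flex_1_end": 4,
    "flex_seg_3": "B", "flex_3_sta": 6, "flex_3_end": 18,
}
gapped_count = count_flex_segments(gapped)
print(gapped_count)
```

One consequence of counting this way is that segment numbers have to be contiguous and start at 1; a skipped number silently drops every later segment, so a validation step on the configuration side would probably be worthwhile.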

I do think this is a very valid (and not so complicated) change

rvhonorato · 2024-08-06T08:22:45Z

Also, this syntax is used for some rather undocumented option to limit the random AIRs definition to selected segments per molecule.

And let's please not have this kind of thing; if this change also helps us tackle this, then it's two for one.

amjjbonvin · 2024-08-06T15:43:18Z

Actually it should be doable to refactor this as suggested as it is not used to limit the random air sampling to specific segments. We have now other parameters controlling that. I will look into it.

AnnaKravchenko added the enhancement and documentation labels Jul 2, 2024

AnnaKravchenko closed this as completed Aug 6, 2024

amjjbonvin reopened this Aug 6, 2024

amjjbonvin added a commit that referenced this issue Aug 7, 2024

Addressing issue #919

df5e61e

rvhonorato assigned amjjbonvin Aug 7, 2024

rvhonorato added the m|emref, m|flexref, m|mdref, m|mdscoring and CNS labels and removed the documentation label Aug 7, 2024

VGPReys linked a pull request Aug 9, 2024 that will close this issue

Enhance flexible definition in config files / defaults.yaml #969

Open

12 tasks
