Patterns for axiomatisation of transcription factor activities #23
From @cmungall on August 28, 2015 16:16 I think 'directly activates' is correct here, even if we have a more general direct regulation parent that is not restricted to activities |
From @ukemi on August 28, 2015 17:5 It fits the definition of directly activates. |
The ontology currently uses this pattern:
molecular_function that ('has part' some 'nucleic acid binding') and ('part of' some 'regulation of transcription, DNA-templated')
But in Noctua/LEGO curators are using directly_activates rather than part of: http://noctua.berkeleybop.org/editor/graph/gomodel:583f430000000041
directly_positively_regulates may be justified here (based on direct interaction between the TF and other parts of the transcriptional apparatus). We need to decide on one of these two patterns (or reconcile the two with reasoning, if that is possible). |
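As a minimal sketch of the two candidate patterns, the Python below renders each as a Manchester-style class expression string. The helper names `some` and `class_expression` are hypothetical, and the filler used for the alternative pattern ('transcription, DNA-templated') is an assumption, since the comment above names only the relation.

```python
def some(relation: str, filler: str) -> str:
    """Render an existential restriction in Manchester-like syntax."""
    return f"('{relation}' some '{filler}')"


def class_expression(genus: str, *restrictions: str) -> str:
    """Render a genus plus its differentia as a single class expression."""
    return f"{genus} that " + " and ".join(restrictions)


# Pattern currently used in the ontology (taken from the comment above).
current_pattern = class_expression(
    "molecular_function",
    some("has part", "nucleic acid binding"),
    some("part of", "regulation of transcription, DNA-templated"),
)

# Alternative in line with Noctua/LEGO usage. The filler is an assumption.
alternative_pattern = class_expression(
    "molecular_function",
    some("has part", "nucleic acid binding"),
    some("directly_positively_regulates", "transcription, DNA-templated"),
)

print(current_pattern)
print(alternative_pattern)
```

The only difference between the two strings is the final restriction, which is the choice this issue needs to settle.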
How to formalise this?
Differentia:
1. Regulatory effect on transcription - record via link to BP. Two possible patterns:
(a) part_of some 'regulation of transcription, DNA templated' # (use +ve/-ve R terms for transcriptional activator/repressor terms)
OR
(b) directly_regulates some 'transcription, DNA templated' # (use directly_(positively/negatively)_regulates edges for transcriptional activator/repressor terms)
One of the aims of MF design patterns for compound functions such as this one is to maximise useful causal inference chains in LEGO. In this respect, pattern (b) is better: it doesn't obscure regulation of the whole process via a part_of link. However, ideally we'd still get inferred annotation to the relevant regulation of transcription BP term. Solutions to this:
This is a general issue - covered in #49
2. RNA polymerase type: formalise via transcription type.
3. DNA target bound: has_necessary_component* {transcription regulation region DNA binding}
* see #25 (comment)
Some cleanup or target terms needed: these can be defined using logical defs that use SO terms as differentia. The promoter_element hierarchy might cover what's needed. See also geneontology/go-ontology#13002
4. Direct regulation, e.g. by binding ligand or metal ion binding
* see #25 (comment) |
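The causal-chain argument for pattern (b) can be illustrated with a toy graph. This is only a sketch: the set of relations treated as causal, the node labels, and the function name `causal_targets` are assumptions made for illustration, not part of any GO or LEGO tooling.

```python
from collections import deque

# Relations treated as causal in this toy example (an assumption).
CAUSAL = {"regulates", "directly_regulates"}


def causal_targets(edges, start):
    """Return the nodes reachable from start by following only causal edges."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for subject, relation, obj in edges:
            if subject == node and relation in CAUSAL and obj not in seen:
                seen.add(obj)
                queue.append(obj)
    return seen


TF = "transcription factor activity"
REG = "regulation of transcription, DNA-templated"
TX = "transcription, DNA-templated"

# Pattern (a): the activity is only part_of the regulatory process.
pattern_a = [(TF, "part_of", REG), (REG, "regulates", TX)]
# Pattern (b): the activity directly regulates transcription.
pattern_b = [(TF, "directly_regulates", TX)]

print(causal_targets(pattern_a, TF))  # empty: the part_of link hides the chain
print(causal_targets(pattern_b, TF))  # contains the transcription process
```

Under pattern (a) a causal-only traversal from the activity finds nothing, because the first hop is a part_of link. Under pattern (b) the regulated process is reached directly.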
Draft pattern here: Still need to sort out naming. Some notes:
|
CC @thomaspd |
Paul: If we have a general 'necessary part of': If transcription of gene X requires the activity of TFs A, B and C, we could say each activity is a 'necessary part of' transcription of gene X. By analogy to necessary_component_of, this would be a subproperty of a causal relation, so it would not entail loss of the causal chain. |
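A rough sketch of why a subproperty of a causal relation keeps the causal chain intact: a check for causality only has to walk up the subproperty hierarchy. The hierarchy below is hypothetical. 'necessary part of' is the relation proposed above, and "causal relation" is a placeholder for whichever causal parent would be chosen.

```python
# Hypothetical subproperty hierarchy: child relation -> parent relation.
SUBPROPERTY_OF = {
    "necessary part of": "causal relation",  # proposed; parent is a placeholder
    "part of": None,                         # no causal parent in this sketch
}


def is_causal(relation, root="causal relation"):
    """Return True if relation is the causal root or a descendant of it."""
    while relation is not None:
        if relation == root:
            return True
        relation = SUBPROPERTY_OF.get(relation)
    return False


print(is_causal("necessary part of"))
print(is_causal("part of"))
```

With this arrangement an edge asserted with 'necessary part of' would still be followed by any traversal that accepts causal relations, whereas a plain 'part of' edge would not.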
Notes on name and definition changes.
The original refactoring/implementation of detailed TF terms in the GO relied on assumptions about curation that may no longer apply in the new era of GO-CAM curation. Classic GO curation is very granular, with small amounts of evidence - often single experiments (?) - being used for annotation. It is rare for a single experiment to show that a transcription factor acts to regulate transcription via DNA binding and via protein binding. To cope with this, two branches were added to the GO - one covering activities that (directly) regulate transcription via protein binding and another covering activities that directly regulate transcription via DNA binding. A broad interpretation of the phrase 'transcription factor' - covering cofactors and DNA binding transcription factors - allowed both branches to include this phrase in their labels. As DNA binding transcription factors also bind proteins as part of their regulation of transcription*, annotation of these activities relied on co-annotation with appropriate terms from each branch (although a small number of terms appear under both branches).
With GO-CAM modeling, we can much more easily combine different pieces of evidence to build a model of gene product activity, so these considerations no longer apply. In GO-CAM, and as part of ongoing work on refactoring molecular function, we aim to model the compound nature of molecular functions as far as possible. We therefore need new design patterns for (DNA binding) transcription factor activity that allow us to capture its compound nature (DNA binding and protein binding components and their relationship to regulation of transcription).
The original refactoring deliberately omitted a general term for transcription regulator activity. There were two reasons for this:
This refactoring:
* I would be very interested to hear of any known exceptions to this.
New proposed labels & textual definitions
(Proposed name changes are also discussed in #5 and in Paul's comment upthread.)
Template-based textual definitions for TFs using the names of component activities are proving hard to specify, so a freer approach is taken here.
transcription factor activity, protein binding: "Interacting selectively and non-covalently with any protein or protein complex (a complex of two or more proteins that may include other nonprotein molecules), in order to modulate transcription. A protein binding transcription factor may or may not also interact with the template nucleic acid (either DNA or RNA) as well."
-->
transcription regulator activity: "Direct regulation of DNA-templated transcription via selective, non-covalent interaction with elements of the transcription initiation complex or associated proteins. Associated proteins include any protein capable of interacting, directly or indirectly with the transcription initiation complex."
Questions:
transcription cofactor activity: "Interacting selectively and non-covalently with a regulatory transcription factor and also with the basal transcription machinery in order to modulate transcription. Cofactors generally do not bind the template nucleic acid, but rather mediate protein-protein interactions between regulatory transcription factors and the basal transcription machinery."
-->
transcription cofactor activity: "Interacting selectively and non-covalently with a regulatory transcription factor and also with the basal transcription machinery in order to modulate transcription. Cofactor activity does not involve nucleic acid binding, but rather mediates protein-protein interactions between regulatory transcription factors and the basal transcription machinery."
transcription factor activity, sequence-specific DNA binding: "Interacting selectively and non-covalently with a specific DNA sequence in order to modulate transcription. The transcription
-->
transcription factor activity: "Direct regulation of DNA-templated transcription via sequence-specific DNA binding and selective, non-covalent interaction with elements of the transcription initiation complex or associated proteins. Associated proteins include any protein capable of interacting, directly or indirectly with the transcription initiation complex."
Notes:
Questions:
RNA polymerase II transcription factor activity, sequence-specific DNA binding: "Interacting selectively and non-covalently with a specific DNA sequence in order to modulate transcription by RNA polymerase II. The transcription factor may or may not also interact selectively with a protein or macromolecular complex."
-->
RNA polymerase II transcription factor activity: "Direct regulation of transcription from an RNA polymerase II promoter via sequence-specific DNA binding and selective, non-covalent interaction with elements of the transcription initiation complex or associated proteins. Associated proteins include any protein capable of interacting, directly or indirectly with the transcription initiation complex."
transcription factor activity, sequence-specific DNA binding transcription factor recruiting:
-->
transcription factor activity, transcription regulator recruiting: "Direct regulation of DNA-templated transcription via sequence-specific DNA binding and recruitment of a transcription regulator (transcription factor or cofactor) via direct, non-covalent interaction with the regulator. Recruitment here means that the activity in question is required to bring the transcription regulator to the transcription initiation complex or associated proteins."
Questions:
transcription factor activity, sequence-specific DNA binding transcription factor recruiting: "The function of binding to a specific DNA sequence and recruiting another transcription factor to the DNA in order to modulate transcription. The recruited factor may bind DNA directly, or may be colocalized via protein-protein interactions."
-->
transcription factor activity, transcription factor recruiting: "Direct regulation of DNA-templated transcription via sequence-specific DNA binding and binding of another transcription factor leading to its recruitment to an binding of a DNA regulatory region."
CC @astridla, @RLovering, @thomaspd - Comments please. |
[Quoted text edited down by DOS]
The original refactoring/implementation of detailed TF terms in the GO relied on assumptions about curation that may no longer apply in the new era of GO-CAM curation. Classic GO curation is very granular, with small amounts of evidence - often single experiments (?) - being used for annotation. It is rare for a single experiment to show that a transcription factor acts to regulate transcription via DNA binding and via protein binding. To cope with this, two branches were added to the GO - one covering activities that (directly) regulate transcription via protein binding and another covering activities that directly regulate transcription via DNA binding. A broad interpretation of the phrase 'transcription factor' - covering cofactors and DNA binding transcription factors - allowed both branches to include this phrase in their labels. As DNA binding transcription factors also bind proteins as part of their regulation of transcription*, annotation of these activities relied on co-annotation with appropriate terms from each branch (although a small number of terms appear under both branches).
* I would be very interested to hear of any known exceptions to this.
[Astrid Lægreid] : I don’t know of any exception to this
...
transcription factor activity, protein binding: "Interacting selectively and non-covalently with any protein or protein complex (a complex of two or more proteins that may include other nonprotein molecules), in order to modulate transcription. A protein binding transcription factor may or may not also interact with the template nucleic acid (either DNA or RNA) as well."
-->
transcription regulator activity: "Direct regulation of DNA-templated transcription via selective, non-covalent interaction with elements of the transcription initiation complex or associated proteins. Associated proteins include any protein capable of interacting, directly or indirectly with the transcription initiation complex."
comment: This term is a general class that encompasses (DNA binding) transcription factors as well as cofactors.
Questions:
* Should we add 'direct' to the name to more clearly distinguish from upstream regulators?
[Astrid Lægreid]: If we want to allow any signalling molecule upstream of transcription to be annotated with “transcription regulator activity”, then, yes, “direct” would be helpful
It may however not be easy to define what is “direct”, because we, in most cases, do not know whether the transcription regulator interacts directly with one of the components of the “polymerase initiation complex”, or whether one or more proteins are “between” the transcription regulator and one of the proteins in the initiation complex. (See: http://dx.doi.org/10.1016/j.sbi.2017.03.013, http://dx.doi.org/10.1016/j.bbagrm.2016.10.010, http://dx.doi.org/10.1002/anie.201608066)
Maybe there is a way to say that all proteins interacting directly or indirectly with the “initiator complex” are “direct transcription regulators”? (as opposed to regulators acting more upstream)
* Anticipated objection: we don’t actually have TI complex in GO. Perhaps 'basal transcription machinery' would be a better term?
[Astrid Lægreid]: Even though I am not an expert in the complex molecular events involved in transcription initiation, elongation and termination, my ‘high level’ understanding is something in this direction
1. Transcription factor binding specifically to regulatory regions in the gene enables formation of the “RNA polymerase II initiation complex” (this formation proceeds through a number of cascade-like ‘protein recruiting events’, where a crucial stage is the existence of a complex in which RNA polymerase II becomes phosphorylated, activating its enzymatic capabilities within a stable “basal transcription complex”)
2. the stable “basal transcription complex” initiates transcription, thereby starting the “transcription elongation” phase, which is to some extent also biologically regulated (I don’t know the mechanisms); I’m not sure, but think that maybe the general view is that the “basal transcription complex” is a kind of ‘core DNA-templated RNA transcription’ machinery that catalyses the RNA polymerization throughout initiation-elongation-termination
3. when the “basal transcription complex” reaches “termination signal(s)” (certain gene sequences), termination can occur; again, I think that this process is regulated (not sure whether there are several possible termination signals, also not sure which/how many protein factors are involved).
To my knowledge, regulation of step 1, initiation, is regarded to be most decisive for the ‘time’ and ‘quantity’ aspects of transcription. I think that what is regulated is when and how frequently a new ‘transcription cycle’ is started. These, I think, are the assumptions underlying our picture of the transcription factors (which in reality are a kind of “transcription initiation enabling factors”) as the most interesting/decisive regulatory factors of transcription
________________________________
transcription cofactor activity: "Interacting selectively and non-covalently with a regulatory transcription factor and also with the basal transcription machinery in order to modulate transcription. Cofactors generally do not bind the template nucleic acid, but rather mediate protein-protein interactions between regulatory transcription factors and the basal transcription machinery."
-->
transcription cofactor activity: "Interacting selectively and non-covalently with a regulatory transcription factor and also with the basal transcription machinery in order to modulate transcription. Cofactor activity does not involve nucleic acid binding, but rather mediates protein-protein interactions between regulatory transcription factors and the basal transcription machinery."
is_a: transcription regulator activity
[Astrid Lægreid]: Again, maybe we indeed need to introduce the concept “RNA polymerase initiation complex” (or “machinery”), since I think it is very likely that the transcription factors mainly interact with proteins that only help form the “basal transcription complex/machinery” (I feel a bit uncomfortable with the term “machinery”, but maybe it is well established in GO), but in many/most cases don’t interact with the “basal transcription complex” itself, once it is starting on the ‘initiation-elongation-termination’ –“road”
|
RNA polymerase II transcription factor activity, sequence-specific transcription regulatory region DNA binding: "Interacting selectively and non-covalently with a specific sequence of DNA that is part of a regulatory region that controls transcription of that section of the DNA by RNA polymerase II and recruiting another transcription factor to the DNA in order to modulate transcription by RNAP II."
1. Needs name to more clearly distinguish from:
RNA polymerase II transcription factor activity, sequence-specific DNA binding: "Interacting selectively and non-covalently with a specific DNA sequence in order to modulate transcription by RNA polymerase II. The transcription factor may or may not also interact selectively with a protein or macromolecular complex."
2. Why does it live here:
![image](https://user-images.githubusercontent.com/112839/28217200-3b8f7d00-68ac-11e7-86b2-488de06ebcf2.png)
but not under 'RNA polymerase II transcription factor activity, sequence-specific DNA binding'
In my understanding, these two terms pertain to the same functionality
Since I cannot think of any transcription factor that does not enable its function via interacting with other proteins/complexes, I think that there should be no “may or may not” in the descriptive part “The transcription factor may or may not also interact selectively with a protein or macromolecular complex."
|
Notes on discussion with Astrid: Tie this branch down to: "direct regulation of transcription initiation"
|
This issue was moved to geneontology/go-ontology#16970 |
From @dosumis on October 28, 2015 13:46
From @dosumis on August 28, 2015 9:56
We currently have patterns like this:
But these patterns are not safe. It is not necessarily the case that being part of a regulatory process entails being a regulator of the regulated process. This pattern probably arose from implementation of the general MF part_of BP pattern. In this case, it would be better to directly assert MF regulation of BP. But which relation to use?
Perhaps directly activates:
Current def: "p directly activates q if and only if p is immediately upstream of q and p is the realization of a function to increase the rate or activity of q."
But see notes from 2015-07-23 eds meeting on defining directly positively regulates.
CC @cmungall @ukemi
Copied from original issue: geneontology/go-ontology#12033
Copied from original issue: geneontology/design_patterns#2