Make parent term for transcription factor activity #13588
Comments
I'm not sure how this is handled in the MF refactoring project, but the reason for not including it in the ontology traditionally is that it would be defined as a generic function that regulates transcription. An annotation would be no more meaningful than an annotation to the process. |
But what is the activity that is described by these terms? Just protein binding, just nucleic acid binding? These terms are not linked to these MFs. There is a vague activity, "...to modulate transcription", which seems processy but would seem to be at the core of the definition. From a user perspective, I think that transcription factor activity is useful, and it would seem helpful/expected to have a grouping term for: nucleic acid binding transcription factor activity; transcription factor activity, protein binding |
Part of the background for deleting the original "transcription regulator activity" term from the MF branch is that when I went to a transcription-specific meeting in 2010 with a poster specifically about the transcription overhaul, I talked to some researchers who found it confusing to remember whether they needed to do their enrichment using MF terms or BP terms, because there didn't seem to be any difference in meaning between the MF term "transcription regulator activity" and the BP term for "regulation of transcription" (now named "regulation of nucleic acid-templated transcription"). Looking into it, David H and I agreed that there did not seem to be any way to define the MF term "transcription regulator activity" such that it could be distinguished from the BP term. The initial setup of GO indicates that MF, BP, and CC are supposed to be orthogonal, i.e. non-overlapping, so it's a problem to have a term in both MF and BP that means exactly the same thing. I don't know how the MF refactoring is planning to handle this either, but I still don't see how to define the grouping term in MF that you are requesting in a way that makes it distinct from "regulation of transcription" in BP. |
Was it the name "transcription regulator activity" that was confusing, rather than distinguishing between regulating transcription and the (not-very-definable) quality of possessing transcription factor activity? After all, not all things that regulate transcription are "transcription factors". The people I've discussed it with find it more confusing to have two terms with near-identical definitions unlinked in the ontology. |
sequence specific RNA polymerase II transcription factor activity instead of: transcriptional activator activity, RNA polymerase II core promoter proximal region sequence-specific binding, which confuses me every day (promoter or proximal region can be a SO extension on the concurrent DNA binding term). In fact I still don't see the need for 2 terms in different branches (the transcription factor activity and the binding term): a single binding term, MF DNA binding transcription factor activity with extension "SO term describing region". Job done. The problem is the word "factor". It's a term used by biologists historically when they didn't know what the actual molecular function was (splicing factor, translation factor), so they are really often "processes of function unknown". But I agree that different types of TF in the MF ontology would benefit from a common grouping term. It would make the correct term more findable by drill-down during curation. |
For the refactoring: Here's a (quite old) ticket on renaming - which further simplifies the naming scheme based on a suggestion from Paul T: Here's the design pattern/template ticket: geneontology/molecular_function_refactoring#23 (comment) In LEGO, if the factor is also known to bind to another TF or to RNA polII as part of regulating transcription, this can also be represented. It would be straightforward to add classes that capture this. |
(See below for a newer version of this term) How about this as a parent: +[Term] Thanks in advance for your input. Pascale |
Needs something about directness of regulation. For BP, 'regulation of transcription' covers the activity of signal transduction pathways that regulate transcription. But we need something more direct for MF. Also, isn't all of this regulation of transcription initiation? |
Not necessarily: a transcription regulator can regulate elongation or termination. This bit probably needs to be clear that the gene product needs to bind RNA polymerase (directly or indirectly through another transcription regulator)? Maybe this does not apply to termination factors, so an additional clause might be required for this? |
Seems kind of odd that you guys are trying to put "transcription regulator activity" back into the ontology at the same time that the term "transmembrane transporter activity involved in import into cell" is being proposed for obsoletion because: From: go-discuss go-discuss-bounces@lists.stanford.edu Dear all, The proposal has been made to obsolete 'GO:0098663 transmembrane transporter activity involved in import into cell'. The reason for obsoletion is that this term contains information that belongs in the Process ontology ("import into cell"). [snip] @ukemi and I spent a long time thinking about "transcription regulator activity" as a MF term. The fact that you cannot define it in terms of function really highlights that this does not represent A function, but rather involvement in a process. @ValWood is correct that a constraint to try to say that a "transcription factor" must bind the RNA polymerase either directly or indirectly is not sufficient, as bacterial antiterminators such as NusA bind the RNA of the nascent transcript. It seems really inconsistent with our founding principles that the three aspects, MF, BP, and CC, should be orthogonal, but we are putting back a term that cannot be defined in a way that distinguishes it from a biological process term. |
Hi Karen, I think the "transporter involved in process" is a slightly different issue. For transcription regulators, a broad grouping term will be helpful for curators to locate the "transcription factor activity" functions. There is clearly a problem in locating the terms, evident from annotation inconsistencies. The terms for sequence specific transcription factors are buried deep in the graph, in such a way that even experienced curators cannot locate the correct terms, and our users cannot find them. However, I would be very happy to go the same route with TFs: MF "DNA binding transcription factor" involved in BP "transcription from RNA polymerase II promoter". I don't believe the separate branches for "DNA or protein binding" and "transcription factor activity" help the curators, or users of GO. In this scenario we would only have a small number of transcription factor MF terms which represent the major classes they are naturally grouped into by biologists. I think that is what Pascale and Paul are trying to achieve, in which case I absolutely support this change too. |
fwiw, I agree with Karen. |
But the whole of the branch under: The grouping is only a way to make it easier for curators to locate them. Really most of these "transcription factor activity" terms should not be in the MF ontology at all. What we have is deeply unsatisfactory. Personally, I would be happy for most of these to go and only use the ones under The alternative is to obsolete the "transcription factor which is only related to 'regulation of transcription' branch. I'd be happy for this to happen too. However, there is clearly a large need in the community to identify DNA-associated transcription factors. |
I really don't understand this at all. I think GO should aim to represent the 'functions' of gene products and complexes as biologists understand them, using terms that they recognise. In some cases this requires defining MFs at least partially in terms of some biological process context. dbTF and signalling receptor activity are examples of this. What do we gain by taking such a narrow view of molecular function that we don't allow for this? For the record though, I don't think that defining molecular function as a process that a single protein or complex can carry out precludes that some MFs require a particular process context. Pascale's proposed TF activity grouping term is not redundant with BP 'regulation of transcription'. Here's her proposed definition: "Controls the rate of transcription of genetic information from DNA to messenger RNA, by binding to a specific DNA sequence or to other regulatory protein factors." This is much narrower than the BP term, which covers any upstream process that regulates transcription, including signal transduction pathways. This is not something you can make a curation rule against, as it comes from the definition of regulates as a transitive relation. Whether you like it or not, it will creep back in with inference from perfectly legal and reasonable GO-CAM models. We could define a BP for direct regulation of transcription that just covers this. But I think it's just easier to use a directly_regulates relationship to a BP term to define the general MF. This then provides a simple general pattern for the whole branch. A more general discussion and details of proposed TF refactoring is here: https://docs.google.com/document/d/11BH6PsdH6u0hgkS_KYhYlAHEhdufg_FPduiFwsGNA04/edit# A partial implementation is in this branch #14318, but may be at least partially superseded by Paul and Pascale's attempt to simplify the branch (I agree that some of the more complex terms should probably go, or at least have simpler names). 
Note that the proposed name changes are separable from the formalisation (and can be practically separated because implementation is via TSV and script). |
I am referring specifically to this term, which is only about the regulation of transcription part of the function: Currently we need to provide users with specific instructions to find DNA binding TFs: "All sequence-specific DNA-binding transcription factors should be annotated to at least two GO Molecular Function terms, either directly or by transitivity (i.e. annotated to a more specific "descendant" term linked to one of these terms): GO:0000976 transcription regulatory region sequence-specific DNA binding" and periodically check that both branches are annotated appropriately (it isn't simple because you need to make different checks to cover pol I, II and III). I still think a single MF term for each flavour of TF + a connection to the appropriate BP for transcription would be sufficient. I think it would also be less confusing if the grouping term in the MF ontology was "transcription factor activity" rather than "transcription regulator activity", with Pascale's proposed definition. |
I can see that the comment I made yesterday might be confusing. I think that the redundancy will become clearer once the terms are collected together under a "transcription factor" term. |
I understand the redundancy issue. Please see the Google doc on TF refactoring proposal linked above. If I understand correctly, the current structure is an attempt to bake into the ontology the strict GO rules about evidence and the limited means we have of recording it (one evidence code and paper per statement). The move to GO-CAM makes hybrid terms (DNA binding + some specific protein binding target) more viable, if you can come up with good names for the terms. Making it easier to add multiple evidence to classical GO annotations would help too. |
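The two-branch rule quoted above ("directly or by transitivity") can be checked mechanically. The following is only a minimal Python sketch, assuming a toy is_a graph stored as a dict. The `TOY:` identifiers, the `PARENTS` mapping, and the function names are all hypothetical; they are not real GO terms and not part of any GO tooling.

```python
from typing import Dict, Iterable, List, Set


def ancestors(term: str, parents: Dict[str, List[str]]) -> Set[str]:
    """Return the term itself plus every term reachable via parent links."""
    seen: Set[str] = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, []))
    return seen


def missing_branches(
    annotations: Iterable[str],
    required_roots: Iterable[str],
    parents: Dict[str, List[str]],
) -> Set[str]:
    """Return the required branch roots not covered by any annotation."""
    covered: Set[str] = set()
    for term in annotations:
        covered |= ancestors(term, parents)
    return set(required_roots) - covered


# Hypothetical toy ontology: child term -> list of parent terms.
PARENTS = {
    "TOY:pol2_seq_specific_binding": ["TOY:dna_binding_branch"],
    "TOY:pol2_tf_activity": ["TOY:tf_activity_branch"],
}
REQUIRED = ["TOY:dna_binding_branch", "TOY:tf_activity_branch"]

# A gene product annotated only in the binding branch fails the rule.
print(missing_branches(["TOY:pol2_seq_specific_binding"], REQUIRED, PARENTS))
# prints {'TOY:tf_activity_branch'}
```

A single grouping term per flavour of TF, as suggested in this comment, would reduce `REQUIRED` to one root and make this kind of periodic check unnecessary.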
Hi all, I understand the concerns - however the annotation inconsistencies are so high in the area that we really need to do something to address it. We'll send a proposal shortly. Pascale |
First part of the proposal: the transcription regulator terms will all be grouped under a general term, 'transcription regulator activity' +id: GO:0140110 |
This sounds too broad. Strictly this could apply to MAPK or a TGF-beta receptor (they are both molecular functions and both (indirectly) regulate transcription). The earlier def seems better as it specified a mechanism: "by binding to a specific DNA sequence or to other regulatory protein factors", although to be sufficiently tight this should mention what 'regulatory protein factors' count. The other way to do this is by distinguishing direct regulation of transcription from indirect. |
Hi @dosumis How about: "by binding to a specific regulatory sequence or the transcriptional machinery."? (In any case this is a 'do not annotate' term, so what it represents should also be captured by its children). |
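To illustrate the direct versus indirect distinction, here is a small Python sketch, assuming a toy chain of "regulates" edges. The node names, the `REGULATES` mapping, and both functions are hypothetical illustrations, not GO relations or GO-CAM data. It shows how treating regulates as transitive pulls upstream signalling components into the set of transcription regulators, while a one-hop (direct) reading does not.

```python
from typing import Dict, List, Set


def direct_regulators(target: str, regulates: Dict[str, List[str]]) -> Set[str]:
    """Entities with a single 'regulates' edge pointing at the target."""
    return {src for src, targets in regulates.items() if target in targets}


def all_regulators(target: str, regulates: Dict[str, List[str]]) -> Set[str]:
    """Entities that reach the target through any chain of 'regulates' edges."""
    found: Set[str] = set()
    frontier = [target]
    while frontier:
        current = frontier.pop()
        for src in direct_regulators(current, regulates):
            if src not in found:
                found.add(src)
                frontier.append(src)
    return found


# Hypothetical toy chain: receptor -> kinase -> DNA-binding TF -> transcription.
REGULATES = {
    "upstream_receptor": ["upstream_kinase"],
    "upstream_kinase": ["dna_binding_tf"],
    "dna_binding_tf": ["transcription"],
}

print(sorted(direct_regulators("transcription", REGULATES)))
# prints ['dna_binding_tf']
print(sorted(all_regulators("transcription", REGULATES)))
# prints ['dna_binding_tf', 'upstream_kinase', 'upstream_receptor']
```

Under these assumptions, a definition keyed to the one-hop set excludes the receptor and kinase, which is the tightening asked for here.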
added GOC:txnOH-2018 cross-reference for #13588
Could you make a parent term "transcription factor activity"?
children:
GO:0001071 nucleic acid binding transcription factor activity
GO:0000989 transcription factor activity, transcription factor binding
(I know that there is probably some historical baggage here, but it would save us a lot of pain as our join solution for the FB ribbon cell is flakey).