Duplicate groups in an input file are not detected #7554
Comments
We disallowed having two groups with the same name (even at different hierarchical levels / subtrees), because they would contain exactly the same entries (as you have remarked). This led to a lot of confusion for users, so we prevented it. See #1495 for details. I'm not sure if we really need a warning message. The only way to get duplicate names is by manually editing the bib file (or upgrading from an older version). I would say that's pretty rare. |
I think the bibliography tool should allow users to do what they want, and not "disallow" something. JabRef supports a lot of very different bibliography management styles, which is its big advantage, but the block in the GUI that inhibits creation of groups with the same name, even in different parts of the hierarchy, is a serious usability issue, and in this particular respect JabRef clearly loses to Zotero :( (even though the general policy of JabRef is so much superior to what other bibliography managers offer!)
If a group with the same name in different places of the hierarchy (or even in the same group) is being created, it would be nice if the GUI issued a warning here, not a blocking error. The warning would explain that a group with the same name already exists (and mention its places in the hierarchy), but let the user do what he/she wants. Explicit and complete information would remove most of the confusion, IMHO. The GUI could even suggest a unique group name based on its current place in the group hierarchy. |
In my case, it is not rare but ubiquitous, and judging from other responses in #1495 and other related posts many users miss the group hierarchy with arbitrary group names. I tried to use a workaround in #1495, like "Diabetes > Treatment", and with the current implementation it is very, very inconvenient. |
So let's sum up the current situation:
A possible (easy?) way to reconcile these two requirements could be:
Example:
When creating a subgroup called
I deliberately suggest using 'Treatment (Asthma)' instead of 'Asthma/Treatment' to avoid impression that the group name is strictly bound to group's position in the hierarchy. It is not; so moving group 'Treatment (Asthma)', say, one level up to 'Medicine' would not make its name "incorrect", in the sense that the group still contains papers about asthma treatment no matter where in the hierarchy it resides. This also means that JabRef does not need to bother about renaming groups when their position in the hierarchy changes (as is implemented now). The suggested change would:
|
PS. If the developers agree on such a (or a similar) design, I would take the issue and offer a PR :) |
Another use case for groups with the same name – have a look at the screenshot example with computer/human languages: In this setting there is no need to name the 'Language' groups differently, that would be very awkward. Also, the group's semantics will not change if I move them to different location in the tree (with all their subgroups), for example into 'Computers/Software' as opposed to 'Computers/Hardware' at some stage. |
A logic (classification) case for non-unique group names:
|
If you mean a warning when opening a file with duplicated group names, then I agree, there is no need for the warning there. Which just means that it is a legit situation :) |
I agree with you that it sometimes could be advantageous to have the same group in two different places. However,
is a highly unusual concept from a user-interface perspective. For example, files with the same name but in different folders don't automatically sync their content. I think we should really stick to the tree nature of the group interface, meaning that groups are independent. As a way forward, I would propose the following:
What do you think? Help on this of course very much appreciated. |
ACK.
I was thinking about this. It could be that we see the concept as unusual only because we are so much used to the file system tree structure. But file system tree is a bad tool for bibliography management (tried it, switched to JabRef :) ), precisely because we can not place the same file in several different directories (or rather we can, but this is not very convenient). Groups in bibliography managers have the advantage that we can place the same file (or the same subgroup) in different hierarchies. In this case groups serve as a sort of "canned queries" on the database. When one gets used to the current implementation of JabRef groups, one starts thinking that the group tree and the group assignments are actually disjoint features. A tree is described in
Well, this is not always true. On Unix (and Unix-like systems like Linux and probably on Mac), a hard-linked file will be "synchronised" across two directories. Viz.:
So the concept of writing things into one place and finding the change in another place is not that foreign, even if sometimes difficult to trace. The concept of "same group in different tree branches" is quite similar to hard links in Unix (not soft links).
Here more input from users would be very welcome. You and I can see the situation one way, but it is hard to see how other users who did not see the code internals (or even the BibTeX database internals) will perceive the situation. I have started a discussion on https://discourse.jabref.org/t/hierarchical-groups-with-duplicated-names-are-actually-working-in-jabref-5-x/2619, as was advised in JabRef instructions, but so far it has not attracted many comments or votes (well, it's the weekend...). Maybe some feedback can be collected there? I am, personally, currently very enthusiastic about just allowing duplicate group names, with a warning, and starting to use them. I'm right now playing with the JabRef tip, and all functions with duplicate group names seem to work as expected. I must admit that I experienced several mysterious "index out of bound" exceptions, but they are very difficult to reproduce, and they seem to happen also with unique group names, so I do not know if this is connected with the groups. But all in all, having same-name groups does not seem to cause any bugs and works as expected.
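The hard-link behaviour described above can be reproduced with a small sketch. This is only an illustration of the analogy, not anything taken from JabRef; the directory and file names (`a`, `b`, `notes.txt`) are made up for the example, and it assumes a file system that supports hard links.

```python
import os
import tempfile


def hard_link_demo() -> str:
    """Write through one directory entry and read the change through another."""
    with tempfile.TemporaryDirectory() as root:
        dir_a = os.path.join(root, "a")
        dir_b = os.path.join(root, "b")
        os.mkdir(dir_a)
        os.mkdir(dir_b)

        original = os.path.join(dir_a, "notes.txt")
        linked = os.path.join(dir_b, "notes.txt")

        with open(original, "w") as handle:
            handle.write("first version\n")

        # Create a second name for the same file (a hard link, not a copy).
        os.link(original, linked)

        # Append through the first name ...
        with open(original, "a") as handle:
            handle.write("edited via a/\n")

        # ... and read the result through the second name.
        with open(linked) as handle:
            return handle.read()
```

Both names refer to the same content, which is the same effect as two same-named groups showing the same entries in different branches of the tree.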
I would be very happy with such a solution. However, as you say, the fix might be very complicated, so it makes sense to see what can be done with simple means, even if functionality is not ideal. An astonishingly simple fix is in my PR #7558. Of course, we (I) can try to implement the true group hierarchy, naming groups 'Computers/Programming/Languages' and 'Human/Languages', as suggested as a workaround in #1495, and showing just the leaf names in the group tree. I do not think that renaming a group or moving it in the hierarchy will cause big performance issues; groups tend to be small on average, I trust they are in dictionaries (so finding a group is a constant-time operation), and renaming the group has exactly the same performance overhead even with the current implementation (since all entries belonging to the group must be updated when the group is renamed). So this sounds doable. But, as said, maybe the "lazy" solution proposed in the above PR is just good enough? If so, we can have it as a semi-permanent solution and move on. At my site, many other things preclude migration to JabRef 5.4 (from 5.2), for example the timestamp issue (#7527) and the file parsing issue (#7553).
Currently, entry assignments to groups work like hard links, and this seems to be enough. If the full power of the tree hierarchy is implemented (i.e. we can assign whatever names we want to groups in different tree branches), then symlinks no longer seem necessary. They would probably be overkill, IMHO.
I'll be more than happy to give any help I can, in implementation or otherwise; just my time resources are not always abundant (but I understand that everyone's time is precious :), so I'll do my best). |
One more thought about a possible fix:
E.g.:
In this way, we will have a unique group
Since the ID is always added as a fixed format numeric suffix, no ambiguity ever arises. For instance, if a user names her group "slaughterhouse(5)", and JabRef assigns ID 1024 to this group, then the JabRef entries will identify this group as When groups are moved around the tree, or even renamed, no changes need to be done to the BibTeX entries, since groups IDs do not change. Group names and/or IDs no longer need to reflect group's position in the group hierarchy. There will be hardly more than 10000 groups in a typical database, so the IDs will be comfortable 1 to 4 digit numbers, manageable also by humans. What is your opinion about such design? |
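A minimal sketch of the numeric-suffix idea, assuming the 'name(ID)' notation used elsewhere in this thread, i.e. the ID is written as `(<digits>)` directly after the group name. The function names `format_reference` and `parse_reference` are hypothetical and not part of JabRef.

```python
import re
from typing import Tuple

# Assumed syntax: the group name followed directly by "(<digits>)".
# The greedy ".*" makes the LAST parenthesised number the ID, so names that
# themselves end in "(5)" stay unambiguous.
_REFERENCE = re.compile(r"^(?P<name>.*)\((?P<id>\d+)\)$", re.DOTALL)


def format_reference(name: str, group_id: int) -> str:
    """Build the text stored in an entry for a group with the given ID."""
    return f"{name}({group_id})"


def parse_reference(reference: str) -> Tuple[str, int]:
    """Split a stored reference back into (group name, group ID)."""
    match = _REFERENCE.match(reference)
    if match is None:
        raise ValueError(f"no numeric ID suffix in {reference!r}")
    return match.group("name"), int(match.group("id"))
```

Because only the trailing suffix is treated as the ID, the rest of the name can contain any characters, including parentheses and digits.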
The id solution would be the best from an implementation point of view (since then we indeed have a unique identifier). However, the idea was that users can also edit the groups directly by changing the The only way I see for this is to use the complete path from the root to the group and save this in the groups field. But then you have to worry about migration... |
Thought about that. Editing a BibTeX file is in fact possible, at least as good as it is now:
In short, when editing, the ID number will be a natural extension of the group's name, and can be handled together with the name. Internally, in JabRef, the ID will be the sole unique index identifying the group. There is a topic to think about – whether just the ID needs to be unique or the combination 'name(ID)', and if just the ID is unique, then what should JabRef do when it encounters the same ID on another group... But this is solvable, I believe. |
Indeed, if you have the full path in the group name, then you need to update it each time the group is moved in the tree. You also need to cope with errors when incorrect path is edited in manually. I think this is a complex solution, and complexity arises from severe denormalisation (in database sense). In case you have just an ID, you only need to update entries when you rename groups – which you also have to do in the current implementation, and I believe it is working (need to check ;). Moving groups in a tree around does not affect the entries. I would thus favour the 'name(ID)' solution. |
But finding the name is easy: I just look at the left side bar for the group, and put exactly what I see in the Yes, the "complete path" solution also has disadvantages... |
True, I somehow overlooked this. But then you can not simply copy that text with a mouse :) (or can you?). But I agree that having no IDs is simpler. This brings us again to a simple implementation with the current group code that just adds possibility to create identically named groups. I have hacked also the delete group possibility, without removing entries from other groups that have the same name (see the updated PR #7558), and it works OK for "Delete group and subgroups" command. For just "Delete subgroups" command I have troubles (entries are left with the deleted group name), but this behaviour may be related to #7556. |
Just as a comment from my side: As mentioned by @tobiasdiez you have to be very careful if you allow users to have duplicate groups. Problems I remember from the past (I am afraid, cannot find the respective bug reports anymore, except for #1495):
Now, I am not saying, that this will happen with the current JabRef version as well. But I just want to raise awareness, that these decisions can have major implications downstream. |
Dear @AEgit, thank you very much for the comment.
I have checked with my group tree, with all disambiguation prefixes removed: the problem is no longer there in the current master tip and in the suggested PR #7558. The groups, also with duplicated names, can be scrolled and used without problems.
Sure, the user will not keep in mind all possible configurations and duplications of all groups...
I agree. Therefore I suggest in PR #7558 to make a very simple check – the group is only removed from the group list in the entries if there are no more groups with that name left in the group tree. In this way, as long as the deleted group exists somewhere in the tree (possibly in multiple places), the entries stay assigned to these groups. I have just noticed that when I rename one of the duplicates, its entries are reassigned to the new group and vanish from all old group duplicates. This is not what I (and probably most other users) would expect; the fix should be easy – if the group(s) with the original name exist somewhere in the tree, the old group name should be retained in the entries. The entries will thus belong to two groups if one of the duplicates is renamed. Only if the group name vanishes from the tree should it be removed from the entries as well. I'll amend the PR to fix this.
My intuition is that as long as group objects inside JabRef are identified by, say, their index in an array or by their position in an internal tree structure, and not by their (external!) names, no big problems should arise. I agree that the behaviour that emerges with duplicate group names is somewhat unusual, but it is now quite clear to me, and as long as it is consistent and no group information loss arises when manipulating groups, this might actually be an advantage of JabRef :). The most important issue is to preserve group information without information loss when importing old group files. As @swekia wrote in the recent issue #7442, creating a good grouping for yourself is "many, many hours of manual work" (which I can confirm from my side). To import groups created in the pre-JabRef 3.x version, when the group structure changed to unique (and so duplicates were very likely present in the input databases), the following import strategy might help: a) if a group name is unique leave its name as it is; If at some point it is decided that the groups in distinct parts of the tree should be distinct (either by appending group ID or by some other mechanism), importing from the current JabRef group tree (with possible duplicates, which, recall, can be made by manual editing of the file!) should assign the same entries to all new groups, make groups "synchronised" if synchronisation is supported, and if not – issue a warning that this function is lost. IMHO this would very much ease the pain of upgrading and make work experience with JabRef much smoother :). Note that import of previous databases is a separate issue from the current behaviour of the JabRef group system at any implementation point, and it should be addressed separately. Seems like the road ahead is pretty safe with these considerations... Or are there any dragons ahead which I have overlooked? |
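The delete and rename rules described above can be sketched on a toy model. This is not the code from PR #7558; the data model (a flat list of group names for the tree, and a dict from entry key to a set of group names) and the function names are assumptions made only for illustration.

```python
from typing import Dict, List, Set


def delete_group(tree_names: List[str], entries: Dict[str, Set[str]], name: str) -> None:
    """Remove one group called `name`; strip it from entries only if it was the last one."""
    tree_names.remove(name)
    if name not in tree_names:
        for groups in entries.values():
            groups.discard(name)


def rename_group(tree_names: List[str], entries: Dict[str, Set[str]],
                 old: str, new: str) -> None:
    """Rename one group; entries keep `old` while another group of that name remains."""
    tree_names[tree_names.index(old)] = new
    old_still_used = old in tree_names
    for groups in entries.values():
        if old in groups:
            groups.add(new)
            if not old_still_used:
                groups.discard(old)
```

With two groups named "Languages" in the tree, deleting or renaming one of them leaves the entries assigned to "Languages"; only when the last group of that name disappears is the name removed from the entries.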
@sauliusg : Thank you for this extensive explanation.
|
Agreed; a possible implementation is suggested in PR #7558 (and a screenshot is present demonstrating the look-n-feel). Please have a look at the branch/screenshot and tell me how you find it.
I'm very glad that you confirm the importance of grouping for the users of JabRef, and your info volumes are impressive :). Several ideas how JabRef could highlight duplicated groups in a convenient way:
Which of the display features, or which combination you think are most usable? I think, however, that it would be very inconvenient if JabRef would pop up a warning box each time a database with duplicated groups is read in. The duplicates should be accepted and dealt with as the user wishes.
Indeed, deep trees like yours will have problems. A further refinement of the "tree prefix" method would be to use only as many levels of the tree as are needed for disambiguation. For instance, if two of your non-synonymous groups were at level 26, but they differed at the 24th level, only two extra group prefixes would be needed ('Leaf Group (branch24/branch25)'). If even this is too long (one would set some limit, as you suggest), one could resort to middle character contraction (as in 'i18n') and/or to plain numbering ('i18n, 2nd group'). Of course this is only the import behaviour; the user can then rename the groups as one wishes. Note that in my suggestion this would only be relevant for importing old group trees that had synonymous but different groups. If your groups have no synonyms, or if your synonyms follow the suggested behaviour (i.e. they are intended to always keep the same entries), no transformation of your tree would be necessary. So, are we going for it? :) NB. PR #7558 does not contain import functionality; it only contains changes in new-format group handling. |
So far this sounds good to me. The import should not matter too much in my personal case (hopefully) since I tried to make sure that all my group names were unique (I am still forced to use 3.8.2 since I cannot use JabRef before the following feature request is implemented: #4237). So here is my take on the import feature (which might be added later?):
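The "use only as many levels as needed" refinement can be sketched as follows. This is a rough illustration, not an import routine from JabRef; groups are modelled as root-to-leaf tuples of names, and the function name `disambiguated_labels` is hypothetical. The length limit and the 'i18n'-style contraction mentioned above are left out.

```python
from typing import Dict, Sequence, Tuple

Path = Tuple[str, ...]


def disambiguated_labels(paths: Sequence[Path]) -> Dict[Path, str]:
    """Map each root-to-leaf path to 'Leaf' or 'Leaf (ancestor/.../parent)'."""
    labels: Dict[Path, str] = {}
    for path in paths:
        leaf = path[-1]
        # Other groups that share the same leaf name.
        rivals = [p for p in paths if p != path and p[-1] == leaf]
        if not rivals:
            labels[path] = leaf
            continue
        ancestors = path[:-1]
        depth = 1
        # Take more ancestor levels until no rival has the same trailing ancestors.
        while depth < len(ancestors) and any(
            p[:-1][-depth:] == ancestors[-depth:] for p in rivals
        ):
            depth += 1
        prefix = "/".join(ancestors[-depth:])
        labels[path] = f"{leaf} ({prefix})" if prefix else leaf
    return labels
```

For two same-named leaves whose parents are identical but whose grandparents differ, this yields a two-level prefix, matching the 'Leaf Group (branch24/branch25)' example; groups with a unique name keep their plain name.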
Thanks for your help! |
In the PR that I suggest, groups with repeated names should become, once again, a normal situation in JabRef databases. Thus making a popup window appear each time you start JabRef, even if you can "configure it away", looks like inconsistent behaviour (why should a program stop your workflow to warn about a feature which it created itself and which it fully supports?) See below for a possible idea.
A quick indication of whether the current database contains groups with repeated names would be a dashboard-style indicator that says:
This announcement could be added right below the "Filter groups" input field, and above the "All entries" group. "Optionally" in this context means that, IMHO, the JabRef group interface both with and without these features would be very usable, so maintainers/community should consider whether the benefits of adding these extra features justify the extra work and code increase. If the group count and duplicated groups are not displayed in JabRef itself, they are still possible to find quickly with some command-line magic :):
These From a perspective of Unix principles as I understand them, a function that can be conveniently factored out into a separate, independent program should be factored out and not clutter the main tool. But this might be a different philosophy from what Windows/Mac users are used to... :) |
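As a stand-in for the kind of external check mentioned above, here is a small sketch that counts repeated group names. It is not the original command line from this comment. It assumes that each group in the saved group tree occupies one line of the form `<level> <GroupType>:<name>\;...`, which is an assumption about the file layout and should be checked against a real file before relying on it; the function name `duplicated_group_names` is hypothetical.

```python
import re
from collections import Counter
from typing import Dict, Iterable

# Assumed line layout: "<level> <GroupType>:<name>\;<more fields>;"
# The name is taken as everything up to the first "\;" separator.
_GROUP_LINE = re.compile(r"^\s*\d+\s+\w+:(?P<name>.*?)\\;")


def duplicated_group_names(lines: Iterable[str]) -> Dict[str, int]:
    """Return {group name: occurrences} for names that occur more than once."""
    matches = (_GROUP_LINE.match(line) for line in lines)
    counts = Counter(m.group("name") for m in matches if m)
    return {name: n for name, n in counts.items() if n > 1}
```

Run over the lines of a database file, this would give both the number of duplicated names (the length of the result) and which names they are.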
If the "repeated" names are mentioned in the announcement, then this would indeed already improve the situation. Note that with the current JabRef version not having the full path is less of a problem, since you can easily filter the groups using the groups filter if you know which groups are affected. Maybe I should explain further why this can be a problem: If you have a database with thousands of groups, it might not be obvious when creating a new group that it already exists somewhere in the database (you will not notice this immediately just by looking at the groups panel). If a user wants to avoid having non-unique groups (there are domain-specific reasons for this, which I won't go into), not knowing about the existence and the location of these non-unique group names can be an issue. As for the command-line magic: Yes, that would be a workaround and I appreciate you sharing the respective scripts. But I would argue that this is detrimental to the user experience. People will want to have a simple does-it-all-for-you reference tool. |
Thanks for the ongoing discussion. The Twitter poll clearly showed that many users would like to have groups with the same name. However, it also became clear that it might be confusing if the entries are automatically shared. In the devcall we also discussed that there are scenarios where the groups with the same name actually don't show the same entries (e.g. if one uses the hierarchical modifiers), potentially leading to even more confusion. For this reason I would strongly propose to go with the id-based solution:
This unique identifier will also be very helpful for other scenarios. For example, we are planning a new feature in connection with JabRef Online, where users can share a group. For this a unique identifier is essential, since one needs a way to identify the group even for different users. @sauliusg your work here in this PR is very much appreciated. But I think the id-based system is the more universal way forward. Would you be interested in implementing it? We core developers will of course help where we can!
Let me think about whether the PR can be extended, with reasonable effort, to handle uniquely identified groups. I'm a bit worried that the full unique group ID solution, while I agree that it is cleaner, is at the same time more complicated, will take more time to implement, and may introduce new bugs; whereas we badly need to continue working with our bibliographies, and the limitation on group names makes this cumbersome... |
I agree that this is a more universal and more usual solution. But then, more questions arise:
Technically, (4) will probably be the easiest to handle both manually and in JabRef code, but will bloat the files. Your ideas?
True.
Or create a duplicate manually, using a text editor :). The bottom line is that JabRef, IMHO, will have to deal with such situations anyway. |
In that case UUID or UUID+Group name seems the best suitable solution. |
This is actually handled in the proposed PR: when you try to create a group with a name that already exists in your tree, a yellow exclamation mark appears, and a tool-tip explains what would happen if you go for it. This is proposed instead of bluntly forbidding such groups. You can decide whether you want a duplicated group name, or whether you add more characters for disambiguation. |
As a side note: as far as group/file sharing goes, we solved this problem at my site once and for all, by simply using an SVN repo for storing bibliographies. This solution has worked perfectly since 2008, and Subversion gives numerous advantages: a) unlimited undo b) full history of changes c) synchronisation (with automatic merges) on different computers, for different users, and even in different directories of the same computer d) easy and informative diffs... and many more bonuses. What makes it difficult to share data is precisely the decision to forbid (in the GUI) groups with identical names. It is only natural that I (as a user sg) will want to have a group "Languages", and my colleague also wants to have a group "Languages". Why not? If the same papers land in that group, this is on the one hand a benefit for us, and on the other hand easy to sort out afterwards if we decide to split the group. |
Well, I can not promise that it is "rock-solid" ("no warranty" disclaimer, as always :); but I have now tested my branch for quite a while and things work as expected; I also do not see any big dangers ahead since, as I have mentioned, the proposed PR #7558 is a conservative extension, it does not change the current design of the groups, just allows to use more features in it.
People who read old databases with unique groups will not be affected at all, since the extension leaves all old functions intact. People who wish to maintain unique groups in their databases are in a position to do so easily; essentially they get the same message as before if they try to create a group with a duplicated name. The only difference is that in the proposed PR #7558, you are no longer left with the disabled "OK" button, but you have an explanation and a choice. In addition, several small fixes take care that group renaming and deleting do not lose information when you have duplicated groups. Since you can have duplicated groups anyway (disabling the "OK" button in the GUI does not ensure that the program invariant – unique group names – is maintained in a database), the fixes are needed regardless of how the GUI behaves. People (like me) who want to reuse group names are in a position to do so conveniently. People who later want to make groups unique can do so by renaming one or several groups; the code fixes in PR #7558 take care that the entries stay assigned to the old group when a duplicated group is renamed, and to the new group if you select "assign all entries to the new group". So the system is very flexible in what you do. Seems like an everyone-wins situation, doesn't it? |
@tobiasdiez These are hard choices. May I suggest the following roadmap for group interface:
With (1) and (2) in place, people (including me) can go on migrating to the new JabRef (I would do this in my branch and start using it on the spot, and then switch to version >= 5.4 when this comes out; I can not use 5.2 since it still does not have PR #7548 merged, and I can not use 5.3 since it corrupts my database (issue #7010)). Those who use unique group names within their databases, as @AEgit described in detail, would continue to have full functionality available to them. Their databases would continue to work as before. Unique shared groups and collaboration scenarios are nice to have; however, these features are trickier to implement and need careful design and more time; they could come somewhat later. Their introduction would be compatible with the changes in (1) if it is compatible with the current JabRef group implementation. Would you (developers) agree on such a process? Update: actually, after reading the discussion in #7010 I found a workaround for my database: setting "Resolve strings for all fields except: url;abstract;not" saves and reads the database correctly again, without touching the abstract field. |
On a third thought, it seems that there is a way to introduce UUIDs without losing editability:
This would ensure smooth migration from the current group implementation and to the future eventual sharing of groups on a server. |
#7554 (comment) These are all valid concerns. But most of these disadvantages can be taken care of by a more advanced implementation. For example, search for #7554 (comment) The problem I see with this approach is that there is a time where people may create groups with duplicate names. This makes the migration a lot harder. Thus I would prefer to sort this out before. |
If people create groups with duplicate names, even after being warned, this means that they need such groups. Why not support this workflow? For eventual migration to groups with IDs, I have outlined one of the possible migration strategies above; it seems to be simple and to maintain a smooth transition. Sure, one needs to be cautious, but there seem to be no insurmountable difficulties on this path. |
A more advanced implementation inevitably means a more complicated one, which has more chances for bugs and takes more time to implement. As you surely know: the cheapest, fastest and most reliable software components are those that are absent :) |
I really would like people to have groups with duplicate names. It's just that I feel this should be properly implemented instead of using a temporary workaround that makes the proper implementation harder. |
My argument would be that, on one hand, the proper implementation will not be made (much) harder by the suggested change; on the other hand, a quick fix now would enable users to continue working on their databases, while at the same time developers could design and implement group update without hectic, in parallel, and have enough time to consider multiple possibilities and multiple scenarios. |
From a user perspective this roadmap sounds like a good plan. |
I reckon one problem here could be that users initially (mistakenly) ignore the warning and only later on realize that they actually wanted unique groups. Yes, that would be a user mistake - but it is something to think about. Fortunately, you already outlined an approach to deal with this issue: If the non-unique groups are flagged and the groups can be filtered to show the non-unique ones, then the user can easily solve this problem within JabRef. |
Correct, the non-unique groups should be indicated, and I would implement this in the PR #7558 if we get a go from developers :). When discovered, duplicated groups can be renamed without loss of information – this already works in the current PR (tested in several different situations). Entries assigned to the original group stay in that group, and you have a choice whether to include the same entries into the newly renamed group or not. Here the existing JabRef functionality (a popup that asks whether the modified group should receive the entries of the original group) turned out very useful. |
I would agree with that - people who already need this functionality would immediately have access to it. Given that, as @sauliusg points out, this is technically already part of JabRef (it is just the GUI that forbids creating non-unique groups), the resulting changes to JabRef appear minor. Finally, if that is not convincing, maybe a compromise could be a solution. Implement the fix as described by @sauliusg but add an additional setting to the preferences, where the user can actively control whether non-unique groups should be allowed in the GUI or not. Set the default to not allowing the creation of non-unique groups in the GUI (the current default). Add a description/tooltip indicating that the user has to know what he is doing if he wishes to allow non-unique groups, and maybe call this a BETA feature (given that at a later point in time this implementation might be replaced by a more sophisticated one). I reckon this should address the possible concerns and hopefully not be too much more work to implement? |
@tobiasdiez Regarding migration plan to truly unique groups with names in the entries (i.e. user-editable):
The UUID would always be a suffix with distinct syntax. Therefore, it is always possible to include a name that looks like it has UUID at the end – just add the true UUID as a suffix. E.g.: Here ``Uuid like group name (68a0a878-950d-11eb-b17b-ef8af80ecb58) Thus even in the very unlikely event of the user generating such strange UUID-like names, JabRef would be able to handle them, and still stay compatible with the current |
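A minimal sketch of the UUID-suffix idea, assuming the suffix is written as a space followed by the UUID in parentheses at the very end of the group reference, as the example name above suggests. The function name `split_group_reference` is hypothetical, and the second UUID used in the usage example below is made up.

```python
import re
import uuid
from typing import Optional, Tuple

# Assumed syntax: "<name> (<uuid>)". The greedy ".*" ensures that only the
# LAST parenthesised UUID is treated as the identifier.
_UUID_SUFFIX = re.compile(
    r"^(?P<name>.*) \((?P<uuid>[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})\)$",
    re.DOTALL,
)


def split_group_reference(reference: str) -> Tuple[str, Optional[uuid.UUID]]:
    """Split 'name (uuid)' into (name, UUID); plain names yield (name, None)."""
    match = _UUID_SUFFIX.match(reference)
    if match is None:
        return reference, None
    return match.group("name"), uuid.UUID(match.group("uuid"))
```

A name that itself looks like it ends in a UUID is still handled once the true UUID is appended: `split_group_reference("Uuid like group name (68a0a878-950d-11eb-b17b-ef8af80ecb58) (123e4567-e89b-12d3-a456-426614174000)")` keeps the first parenthesised part in the name and returns the second as the identifier. References without a suffix come back unchanged with `None`, which would leave room for files written in the current name-only style.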
This only came to my mind later on and it is not directly related to the problem at hand (so, if this goes too much off-topic, maybe we should bring this discussion elsewhere): but, does a tag-based interface reflect the hierarchy that groups provide? |
Sorry, I might not have explained this well. I had the impression that @tobiasdiez was planning to add a tag-based feature which should replace the current groups hierarchy. That is where my questions are coming from. What I have been using so far, was the global union/intersection settings (they were available in 3.8.2 so - I hope - they are still available in the current version 5). That is exactly, what I want - if I select "Physics" I only want to see articles, that have directly been assigned to "Physics". I do not want to see articles, that are part of subgroups of "Physics" UNLESS I explicitly select those subgroups as well. |
Oh I see! I misunderstood you, sorry. No, the group tree is implemented in 5.x, and, AFAIU, there are no plans to remove it... (are we right, @tobiasdiez ?)
Works in 5.x and in PR #7558.
Works as above, except for a "union/intersection" toggle bug (you need to restart JabRef for your change to take effect).
Works, depending on the global settings and on the "Physics" "Hierarchical context" setting, I guess.
Currently, in my tests, 5.x supports all the functionality that you described. In the suggested roadmap above, all this functionality could (hopefully) be retained, plus a group disambiguation UUID could be added. If I understand correctly, unique unambiguous group IDs and the current group hierarchy are in the plans, aren't they, @tobiasdiez ? |
The groups tree stays of course! My idea was to replace the Concerning the future steps, what would the migration to an id-based system look like after we allowed groups with duplicate names? Right now we can assume that the information in the groups field uniquely identifies a group (the only way to not have this unique identifier is by either migrating from a very old JabRef or manually editing the groups tree - in both cases we can simply issue a warning and take the first group matched by the name). Thus the migration is very easy, one simply needs to replace the group name by its id. But once it is officially allowed to have groups with the same name, this identification breaks down. Moreover, I don't really see a reason for urgency right now. The serialization of groups in the groups field was done 5 years ago, and shortly thereafter the restriction in the UI was added. So this has been in place for quite some time. A few months more doesn't change much in the grand scheme of things, and I actually don't think that a proper solution takes so long to implement. The changes should be comparable to the changes in the linked PR, plus some additional tests. |
Migration after the proposed PR will be just as easy as in the case of "unique" group names. The point is that the groups with identical names are still unique, in the sense that they contain, by definition, the same entries. So you have two options when migrating:
In both cases the users' group assignments will not change, and their work will be preserved. The different behaviour will have to be made explicit, but, once understood, either option can be used. Also, nothing prevents me from adding the same group with the same UUID to multiple branches in the tree, so the functionality of PR #7558 is still needed, in one way or another, even when you transition to a new group implementation. Moreover, as you say yourself, "the /only/ way to not have this unique identifier is by either migrating from a very old JabRef or manually editing the groups tree - in both cases we can simply issue a warning and take the first group matched by the name", which means that JabRef will have to deal with duplicated group names, no matter what. Thus one cannot "officially allow" or "disallow" duplicated groups: the bibtex file is an external input for JabRef, managed at the users' discretion, and it can be processed by many external tools, many of which we don't even know. So asking whether "one should allow duplicates" is the wrong question, based on unsatisfiable assumptions; what could be asked instead is "how do we deal with duplicated groups in the way that is most convenient for the users, and reasonably easy to implement/support for developers"? @AEgit has a very good point: "experience can be gained regarding the pros and cons of allowing non-unique group names. Maybe there are some hidden drawbacks that are not obvious at the moment." Or maybe users will like the feature and start using it. I definitely see a use for this feature myself. So the bottom line would be: migrating to group names based on UUIDs will be as easy with PR #7558 as it is now, since the groups essentially continue to be unique, in the sense that there is only one set of entries assigned to each named group. |
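A small sketch of the claim that groups with identical names hold, by definition, the same entries. It assumes a simplified model in which each entry stores its group names in a comma-separated `groups` field; the entry keys, the `tree_nodes` list and the `members` function are made up for illustration.

```python
# Hypothetical entries: each one lists its group NAMES in a "groups" field.
entries = {
    "smith2020": {"groups": "Treatment, Diabetes"},
    "jones2019": {"groups": "Treatment"},
    "lee2021": {"groups": "Physics"},
}

# Two tree nodes in different branches that share the name "Treatment".
tree_nodes = [
    ("Diabetes", "Treatment"),
    ("Cancer", "Treatment"),
]


def members(group_name, entries):
    """Return the keys of all entries whose groups field names group_name."""
    result = set()
    for key, fields in entries.items():
        names = {n.strip() for n in fields.get("groups", "").split(",")}
        if group_name in names:
            result.add(key)
    return result


# Membership depends only on the name (the last path segment), so both
# "Treatment" nodes resolve to the same set of entries.
by_node = {path: members(path[-1], entries) for path in tree_nodes}
```

Because the lookup never consults the position in the tree, both nodes necessarily show the same entries, which is the sense in which same-named groups stay "unique" here.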
That's very reassuring! :)
This can be done in any group representation, even after the PR #7558 – for groups that are present several times in the group tree you could just show all their locations, one location per line :) |
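As a rough illustration of listing every location of a group name, one per line: the nested-dict tree and the `find_locations` helper below are hypothetical, and a nested dict cannot model two same-named siblings under one parent, so this covers only duplicates in different branches.

```python
def find_locations(tree, name, prefix=()):
    """Yield the full path of every node called `name` in a nested-dict tree."""
    for child, subtree in tree.items():
        path = prefix + (child,)
        if child == name:
            yield path
        # Keep descending: the same name may occur again further down.
        yield from find_locations(subtree, name, path)


# Hypothetical group tree with "Treatment" in two branches.
tree = {
    "Diabetes": {"Treatment": {}, "Diagnosis": {}},
    "Cancer": {"Treatment": {}},
}

locations = [" > ".join(p) for p in find_locations(tree, "Treatment")]
for line in locations:
    print(line)
```

The same list could feed a warning in the group creation dialogue that names the existing locations instead of blocking the user.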
That's not going to work. The point of having IDs was that they uniquely identify a group... otherwise what's the point?
That's my point: we allow duplicate group names now, but they will behave differently (and more correctly in my opinion) after the migration to the id approach. Sorry, but I don't see the advantages. To be honest, in the time we have now spent discussing this, one could have easily implemented the solution for id-based groups. |
To me, this was a very unfortunate decision. It rendered the group mechanism unusable for my purposes (lost all group assignments that I had), and I am very upset each time when I see the deactivated OK button in the group creation dialogue, even more so since I know that this restriction is completely unnecessary, and even checked this statement in practice on my JabRef clone :).
May I quote another related issue (#7442): "To be honest, I'm really angry at the moment as I'm not using JabRef for fun but really need it for work and I cannot waste that much time on recreating tagging, grouping etc. over and over again". It is actually high praise for JabRef that it is used in a professional environment, but this also adds certain constraints on what changes and timings the users can tolerate: we need to get the work done, and hope that the tool will help us and save us time, provided we can give due support to you guys who develop JabRef :). As a matter of fact, for me the issue is rather urgent. For 5 years, we could not use the group interface, so my colleagues abandoned JabRef grouping in favour of CLI tools; some neighbouring labs switched to completely different solutions; in case you are interested, I currently have my grouping tree in Zotero (which does not put any restrictions on my group names) and do some command-line fiddling to sync the two repos, my bibtex file which I manage in JabRef and the Zotero DB. But this year I have started a new course at our university and have new students who need to manage bibliographies for their theses. I need to recommend them a good tool for managing bibliographies. I cannot offer them a pure command-line or dual-tool solution; they still have many things to learn. JabRef would be candidate no. 1 if it supported arbitrary (including duplicated) group names and did not crash too often. I need to make the recommendation in the next few weeks; two or three months are way too long to wait, since by that time the students need to have their theses well advanced, and definitely the whole literature overview done.
So if we find a solution for handling multiple duplicated group names in JabRef, for example my suggested PR or some other way, I would pull or write fixes/workarounds for #7010, #7442, #7351 and #7606 into my branch; if I know that these fixes will be in the next official JabRef release, we (I and my students) can then start working next week with my branch and switch back to mainstream when the JabRef release is out (no hurry there, then). I would sync my Zotero tree with JabRef, finally have my groups assigned and continue working on my main subjects :). A quick fix for grouping now is much, much more important for us than a "correct" solution in three months... |
It will work the same way as it works now. In essence, the current JabRef code handles multiple groups with the same name :) I understand that this was not planned, but it can be viewed as an emergent phenomenon :).
The point in having UUIDs would be to help users have truly different groups in different parts of the tree that have identical names. This feature can be added later, as you say, in two or three months if timing permits. |
Cheers, that is reassuring. I am still not sure how this is going to look and/or what the benefits are (the good thing at the moment is that groups can easily be modified in any text editor), but probably this is something the user needs to see in action to understand the advantages. |
The advantage is outlined in my timeline description above.
Could be, but I do not see how. When I claimed three weeks ago that one can allow identical group names safely and conveniently in just a few lines of code, I then wrote code that demonstrates this (#7558). I do not see how I can make a UUID-based solution in a comparable time with a comparable effort. You know the JabRef code much better, so maybe for you it is easier to implement, but for me it would definitely take more time and effort; on the order of months, actually, consistent with your estimate. Also, as you said, the full UUID solution needs to be thought through. Leaving just UUIDs in bibtex entries and eliminating names is not a good option, as I have mentioned before... |
I reckon @sauliusg's point is that his solution will just enable something that is already implicitly possible with the current JabRef version (either by using old databases or by editing the database manually in a text editor). As such, it does not pose a major change in the implementation of JabRef. The ID-based approach, on the other hand, requires major thinking about its potential downstream effects and thus will not be that easy to implement. If this more sophisticated ID approach is implemented without much thought, it could lead to much more problematic situations than the available simple approach (which only enables something that is already possible without using the GUI). I am only a user and I will probably also not use non-unique group names, so I am in no position to demand anything; take my input here just as thoughts on the topic. Having said that, and given these different opinions on the topic, would it be worthwhile to have a sort of BETA version of JabRef (similar to a separate branch on GitHub, but I have something more stable in mind), which allows users to use features that the main developers of JabRef do not think should currently appear in the main/stable version? It could either be a separate version (though I think that would just make things more complicated) or the currently used developer version, but with an additional preference window, which allows the user to enable certain BETA features, e.g. the creation of non-unique groups. |
I think our different branches should cover that request. We aim to reduce the number of preferences and keep the number of testing candidates small. We cannot even catch up with the current PRs; thus there is IMHO no chance to maintain a beta version. Maybe, if some developer jumps in ^^ |
JabRef version 5.2, 5.3, master commit 049acb9 on Linuxmint-20.1.
JabRef 100.0.0
Linux 5.8.0-45-generic amd64
Java 14.0.2
JavaFX 16+8
Duplicated groups in different branches of the group tree are not detected when a database that contains them is read in.
But on second thought, and after playing a bit with the JabRef 5.x master branch, I am starting to think that this is not a bug but actually a very useful feature! I've opened a detailed discussion in https://discourse.jabref.org/t/hierarchical-groups-with-duplicated-names-are-actually-working-in-jabref-5-x/2619. So don't fix it!
Steps to reproduce the behavior:
The resulting BibTeX files are attached in 'biblio.zip':
biblio.zip
The manually created duplicate is used only to reproduce a minimal example of this behaviour; in real life, a large number of duplicates emerged when porting group trees from previous revisions of my database.
This behaviour actually seems to be very useful, since it allows me to create (once again!) groups with identical names in different places of the group hierarchy, and assignment to one such group adds the same entry to all such groups, which might be very useful (detailed discussion in https://discourse.jabref.org/t/hierarchical-groups-with-duplicated-names-are-actually-working-in-jabref-5-x/2619 ).
This is how it looks after some editing of the database:
The resulting bibtex file is added in 'experimets-with-group-hierarchy.zip'.
experimets-with-group-hierarchy.zip
Log File