[Brainstorming] Groups / keywords architecture #12133

Open
ThiloteE opened this issue Oct 30, 2024 · 8 comments

Comments

@ThiloteE
Member

ThiloteE commented Oct 30, 2024

There has been a lot of discussion and confusion about folders, groups, tags, keywords and labels and what differentiates them. See, for example, #11026 (comment) and #8739 (comment).

What we have right now in JabRef 5.15:

  1. A Frankenstein groups feature that sits somewhere between groups and keywords (more keyword than group, to be honest). We call it groups, and the information is stored in the "groups" field, but also partially as binary information in JabRef-internal syntax at the bottom of the library file. Entries can be added to and removed from groups via the entry editor and via the groups side pane on the left. See https://docs.jabref.org/finding-sorting-and-cleaning-entries/groups.
  2. The field "keywords", whose definition follows the BibTeX standard. Those keywords are accessible via the entry editor.

What primary characteristics differentiate group/keyword systems?

  • Can groups/keywords be nested? (Is there a hierarchy?)
    • Currently, yes, but groups and sub-groups cannot have the same name.
  • Can entries be part of multiple groups?
    • Currently, yes. That's why it might be more accurate to call it a keyword system.
  • Are groups/keywords associated with the entry itself?
    • Currently, yes. That's why it might be more accurate to call it a keyword system.
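
To make the three questions above concrete, here is a minimal Python sketch of a keyword-like model: membership is stored on the entry itself, an entry can belong to several groups, and nesting lives in a separate tree. All names (`Entry`, `GroupNode`, `members`) are hypothetical and are not JabRef's actual classes.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    citation_key: str
    # Membership lives on the entry itself, like a keyword list (assumption).
    groups: set[str] = field(default_factory=set)


@dataclass
class GroupNode:
    name: str
    # Nesting is modelled as a plain tree of child groups.
    children: list["GroupNode"] = field(default_factory=list)

    def members(self, entries: list[Entry]) -> list[str]:
        """Citation keys of entries whose own `groups` field names this group."""
        return [e.citation_key for e in entries if self.name in e.groups]


# Hierarchy: "Food" contains "Chocolate".
chocolate = GroupNode("Chocolate")
food = GroupNode("Food", children=[chocolate])

entries = [
    Entry("smith2020", {"Food", "Chocolate"}),  # one entry, two groups
    Entry("lee2021", {"Food"}),
]
```

In a model like this, lookup is by name alone, so a group and its sub-group could not share a name without becoming indistinguishable.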

What secondary characteristics exist? Those are qualities that depend on how the primary qualities are implemented.

  • If you change the entry's keyword, will it change the name of the group(s)?
  • If you change the entry's keyword, will it create a new group, if it doesn't already exist?
  • Can entries be automatically assigned to groups/keywords?
  • Where is the data stored? In a (.bib) library file or a database?
  • What happens if entries / groups / keywords are shared remotely? Will the remote (server) copy take precedence, or the local file? Which preferences have to be shared?

Database structure of JabRef

  • As far as I am aware, JabRef 5.15 stores everything in the library file.

Database structure of external Apps

@koppor
Member

koppor commented Oct 30, 2024

This refs #11026 (comment).

@ThiloteE
Member Author

Koppor, you think like me. Funny. I referenced the same comment in my second sentence of this issue here.

@koppor
Member

koppor commented Oct 30, 2024

For a structured approach, one needs to write down what existing tools are doing. For instance: BibDesk.

They are good at distinguishing automatic and non-automatic groups:

Image

I think users need both: automatic (e.g., based on citations, keywords, ...) and non-automatic (manual categorization).


This also refs https://github.com/JabRef/jabref/blob/main/docs/decisions/0019-implement-special-fields-as-separate-fields.md. Thus, the dimension is not only automatic versus manual, but also how to render groups in the entry table.
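
As a rough illustration of the automatic versus manual distinction, here is a small Python sketch. It assumes a manual group stores a hand-picked set of citation keys, while an automatic group stores only a rule and computes membership on demand. The class names, the `has_keyword` helper, and the comma-separated `keywords` format are assumptions for illustration, not JabRef's or BibDesk's API.

```python
from dataclasses import dataclass, field
from typing import Callable

Entry = dict[str, str]  # field name -> value, e.g. {"keywords": "cocoa, health"}


@dataclass
class ManualGroup:
    name: str
    keys: set[str] = field(default_factory=set)  # citation keys picked by hand

    def contains(self, key: str, entry: Entry) -> bool:
        return key in self.keys


@dataclass
class AutomaticGroup:
    name: str
    rule: Callable[[Entry], bool]  # membership is computed, never stored

    def contains(self, key: str, entry: Entry) -> bool:
        return self.rule(entry)


def has_keyword(word: str) -> Callable[[Entry], bool]:
    """Build a rule that matches a comma-separated `keywords` field."""
    def rule(entry: Entry) -> bool:
        keywords = [k.strip().lower() for k in entry.get("keywords", "").split(",")]
        return word.lower() in keywords
    return rule


def members(group, library: dict[str, Entry]) -> list[str]:
    """Sorted citation keys of all entries the group contains."""
    return sorted(k for k, e in library.items() if group.contains(k, e))


library = {
    "smith2020": {"title": "Cocoa and health", "keywords": "cocoa, health"},
    "lee2021": {"title": "Sugar markets", "keywords": "economics"},
}

to_read = ManualGroup("To read", {"lee2021"})
cocoa = AutomaticGroup("Cocoa", has_keyword("cocoa"))
```

Because both kinds expose the same `contains` method, a side pane could list them together and still mark the automatic ones differently.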

@koppor
Member

koppor commented Oct 30, 2024

To really come up with a solution, one needs to have a minimal example showing the different options. One can start with Chocolate.bib. In other words: requirements analysis 😅

@ThiloteE
Member Author

ThiloteE commented Oct 30, 2024

This issue here is not yet about a solution. Just brainstorming for now. I wanted you to look at the Thunderbird link I posted.

@koppor koppor changed the title Groups / keywords architecture [Brainstorming] Groups / keywords architecture Oct 30, 2024
@koppor
Member

koppor commented Oct 30, 2024

We need "Draft issues" 🤣🤣

I personally use OneNote for such things, but in the web browser, it's awful 🙈.

@ryan-carpenter

For instance: BibDesk ... are good at distinguishing automatic and non-automatic groups:

Other reference managers also separate or distinguish between these, and I agree that JabRef could benefit from an easier way to do this. I often use colours or icons to indicate when a group is search-based.

I think users need both: automatic (e.g., based on citations, keywords, ...) and non-automatic (manual categorization)

Absolutely essential.

@ryan-carpenter

  • If you change the entry's keyword, will it create a new group, if it doesn't already exist?

Changing an entry does not automatically create "graphical" groups, so for some time my workflow included adding (text) groups to entries, and then creating graphical search-groups to appear in the panel. One day, I finally discovered that explicit groups also located the entries of interest automatically and that renaming the graphical group had the same effect on the text entry. This makes sense to me, though I did make sure to test carefully to avoid unexpected "corruption" of my grouping.

Automatic creation of graphical groups from the text groups seems like a more predictable/discoverable approach. If users don't want the panel to show every group contained in the entries, then the settings for each group could include a "hide" option. Creating/showing the groups by default also has the advantage of revealing errors, such as typos and accidental variations, in the text groups.
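
A rough sketch of that idea in Python: collect every group name found in the entries' text field, then flag rarely used names that look very similar to a more common one. The comma separator, the sample data, and the function names are assumptions for illustration. The similarity check uses `difflib.get_close_matches` from the standard library.

```python
from collections import Counter
from difflib import get_close_matches

# Citation key -> raw text of the entry's groups field (assumed comma-separated).
entries = {
    "smith2020": "Chocolate, Health",
    "lee2021": "Chocolate",
    "kim2022": "Chocolat",  # likely a typo of "Chocolate"
}


def groups_in_use(entries: dict[str, str]) -> Counter:
    """Count how many entries mention each group name in their text field."""
    counts = Counter()
    for raw in entries.values():
        for name in raw.split(","):
            name = name.strip()
            if name:
                counts[name] += 1
    return counts


def suspicious_names(counts: Counter, cutoff: float = 0.85) -> dict[str, str]:
    """Map each rarely used name to a more common, similar-looking name."""
    suspects = {}
    for name, uses in counts.items():
        more_common = [other for other, m in counts.items() if m > uses]
        match = get_close_matches(name, more_common, n=1, cutoff=cutoff)
        if match:
            suspects[name] = match[0]
    return suspects


counts = groups_in_use(entries)
```

Every key of `counts` would become a visible group by default, and `suspicious_names(counts)` gives a list of candidates to review rather than anything to merge automatically.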

If people think having keywords and groups is too complex, consider, for example, that PubMed records usually include at least two types of keywords (Other terms and MeSH terms) that have already lost resolution by the time they land in JabRef. Having a means of organising entries that does not contaminate or get contaminated by keywords is very important. Import batches are another kind of grouping that is conceptually separate from keywords.

I am not sure about the architectural implications of "folders", "groups", "keywords", and "tags". However, it is clear from the discussions about nested groups (already linked above) that users need at least one layer of personal organisation. Consider too that commercial reference managers allow "piggy back" and "daisy-chain" groups (created from combinations or series of existing groups). Perhaps a "pivot table" is a better metaphor for user need than folders, groups, keywords, or tags (storing, clustering, indexing, and classifying). All of these could have the same underlying architecture and still be useful as separate inputs to my "data model". The important part is having a dynamic view of the entries in the collection.
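
One way to picture groups built from combinations or series of existing groups is as set operations over group membership. The Python sketch below assumes each group is simply a set of citation keys; the group names, data, and helper functions are made up for illustration.

```python
from functools import reduce

# Group name -> citation keys of its members (hypothetical data).
groups = {
    "Chocolate": {"smith2020", "lee2021", "kim2022"},
    "Health": {"smith2020", "kim2022"},
    "Read": {"smith2020"},
}


def all_of(*names: str) -> set[str]:
    """Entries that are in every named group (intersection)."""
    return reduce(set.intersection, (groups[n] for n in names))


def any_of(*names: str) -> set[str]:
    """Entries that are in at least one named group (union)."""
    return set().union(*(groups[n] for n in names))


# A derived group built from two existing groups ...
chocolate_and_health = all_of("Chocolate", "Health")
# ... and a second one chained onto the first result.
unread_chocolate_health = chocolate_and_health - groups["Read"]
```

Since the derived sets are computed from the current membership, they act as a dynamic view and do not need to be stored as separate groups.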
