Resolve redundant methods of creating tag implications #1371

Open
discomrade opened this issue Dec 26, 2024 · 4 comments


@discomrade
Contributor

A tag implication (a.k.a. tag parent) is when one tag logically implies another: a fox is always a canine and an animal. It is similar to a tag alias (a.k.a. tag sibling), where two tags have the same meaning: smiling is another way of tagging smile.

There are currently two ways to create tag implications:

  1. Setting a tag alias to the tag itself plus the implied tags: fox -> fox canine animal [1], a usage that has recently been officially recognised on the tag alias settings page [2].
  2. Enabling the Auto-Tagger extension [3] and adding the implications there.

Both are being used by Shimmie admins, and having two ways to create tag implications seems like a maintenance and administrative problem. The first is hacky (even if valid) and the second is a misleadingly named optional extension. I propose they be consolidated into a core extension (like Alias Editor already is), which adds an Implications tab next to Aliases.


I think the best option is to duplicate the aliases table and page, then rename and polish the copy. Next, write a SQL migration script that converts any existing Auto-Tagger entries, and any implication-aliases, into rows in the new implications table. Finally, add an input validation check to prevent new aliases from being abused as implications. This allows the Auto-Tagger extension to be removed.
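To make the validation and conversion idea concrete, here is a minimal sketch. It treats an alias as a disguised implication when its old tag reappears in its own replacement. Python is used only for brevity; the function names and the (old, new) row shape are assumptions for illustration, not Shimmie's actual schema or code.

```python
def is_implication_alias(old_tag: str, new_tags: str) -> bool:
    """True if the alias keeps its own tag and adds others,
    e.g. fox -> fox canine animal."""
    replacement = new_tags.lower().split()
    return old_tag.lower() in replacement and len(replacement) > 1


def split_alias_rows(rows):
    """Partition hypothetical (old, new) alias rows into real aliases
    and (tag, implied_tag) implication pairs."""
    aliases, implications = [], []
    for old_tag, new_tags in rows:
        if is_implication_alias(old_tag, new_tags):
            # One implication row per implied tag, skipping the tag itself.
            for implied in new_tags.lower().split():
                if implied != old_tag.lower():
                    implications.append((old_tag.lower(), implied))
        else:
            aliases.append((old_tag, new_tags))
    return aliases, implications
```

The same predicate could back both the one-off migration and the input check on new aliases, so the two cannot drift apart.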

@discomrade
Contributor Author

@shish Requesting your thoughts on this. I'm happy to try and get the code side done but I need to know which direction to go in :)

@shish
Owner

shish commented Jan 25, 2025

(I am afk at the moment and won't have laptop access until Monday, but I've added a calendar entry to remind me to get to this then ^^ But yes it is definitely an unintentional oversight that we ended up with two ways to do the same thing 👀)

@shish
Owner

shish commented Jan 27, 2025

Random brain-dumping and thinking out loud as I'm not certain yet:

  • there's a good chance I'm being biased towards code I'm more familiar with, but my gut wants to keep all the data in a single table, even if it's exposed in different ways in the frontend
    • if aliases and implications are handled in different ways in the code, then maybe it makes sense to have them in different database tables, but I think they both generalise to A B -> X Y? (ie, doing a search-and-replace where one set of one-or-more tags is replaced with another set of one-or-more tags, and the replacement set may or may not have some overlap with the searched set)
  • what do we do about aliases with multiple input tags? eg cid ff7 -> cid_highwind ff7 / cid ff8 -> cid_kramer ff8
  • avoiding infinite loops feels important (eg if there are somehow aliases of foo -> bar and bar -> foo, then attempting to set the tag foo shouldn't pin the CPU to 100% until the request is force-killed by timeout. IIRC there is already some input validation to avoid obvious loops, but I'm not sure if it covers all cases)
    • my gut says that the most foolproof way would be to run the alias resolver code several times and make sure that it settles on a single end result within ~10 iterations, or throw an exception if it doesn't
  • either way I agree we want to have one enabled-by-default extension to handle these, and write an onDatabaseUpgrade handler to migrate data away from the auto-tagger tables, then remove auto-tagger
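A rough sketch of the two ideas above: rules of the general form A B -> X Y, plus a resolver that must settle within about 10 passes or throw. This is written in Python for brevity and is not Shimmie's resolver; the names `AliasLoopError`, `apply_rules_once` and `resolve` are made up, and rules are assumed to be (input set, output set) pairs.

```python
class AliasLoopError(Exception):
    """Raised when the rules never settle on a stable tag set."""


def apply_rules_once(tags: frozenset, rules) -> frozenset:
    """One pass: every rule whose input tags are all present in `tags`
    has its inputs removed and its outputs added."""
    removed, added = set(), set()
    for inputs, outputs in rules:
        if inputs <= tags:  # match against the tags as they were before this pass
            removed |= inputs
            added |= outputs
    return frozenset((tags - removed) | added)


def resolve(tags, rules, max_iterations: int = 10) -> frozenset:
    """Repeat passes until the tag set stops changing, or give up."""
    current = frozenset(tags)
    for _ in range(max_iterations):
        updated = apply_rules_once(current, rules)
        if updated == current:
            return current
        current = updated
    raise AliasLoopError(f"tags did not settle within {max_iterations} passes")
```

With the rules cid ff7 -> cid_highwind ff7 and fox -> fox canine, the set {cid, ff7, fox} settles after two passes. With foo -> bar and bar -> foo, the set {foo} flips back and forth and the resolver raises instead of spinning.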

@discomrade
Contributor Author

discomrade commented Feb 4, 2025

I'm going to need to think more about whether the single table idea is appropriate. You're right that they're both mapping A -> X Y, which is proven by implications functioning in the Alias Editor by adding A -> A X. On the other hand, I believe they would ideally need to be handled separately in code anyway, because their logical constraints differ and the order of operations needs to stay intuitive.

Hydrus's alias and implication documentation says:

[Aliases:] Note that if you say A->B, you cannot also say A->C. This is an n->1 relationship.

Unlike [aliases], which as we previously saw are n->1, some tags have more than one implication (n->n)

e621's implication help page mentions:

This implication process occurs AFTER the alias process.
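To illustrate how those constraints could look in code, here is a minimal sketch. Aliases are stored as an n->1 mapping and applied first; implications are stored as an n->n mapping and expanded afterwards. The function name and data shapes are assumptions, aliases are assumed to be already flattened (no chains), and the vixen -> fox alias is a made-up example.

```python
def normalise_tags(tags, aliases, implications):
    """Resolve aliases first, then add implied tags transitively.

    aliases:      dict mapping one tag to exactly one replacement (n->1)
    implications: dict mapping one tag to a set of implied tags (n->n)
    """
    # Step 1: aliases. Each tag has at most one replacement.
    resolved = {aliases.get(tag, tag) for tag in tags}

    # Step 2: implications, run after aliasing. Each tag is added at most
    # once, so this terminates even if the implications contain a cycle.
    pending = list(resolved)
    while pending:
        tag = pending.pop()
        for implied in implications.get(tag, ()):
            if implied not in resolved:
                resolved.add(implied)
                pending.append(implied)
    return resolved
```

For example, with aliases {"smiling": "smile", "vixen": "fox"} and implications {"fox": {"canine"}, "canine": {"animal"}}, tagging vixen smiling ends up as fox, smile, canine and animal.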

Danbooru's alias help page also brings up their alias chaining rules:

Given that A -> B alias exists
If B -> C gets added
Then the alias is updated to A -> C

Given that B -> C alias exists
A -> B aliases are rejected
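The two chaining rules quoted above could be sketched roughly as follows. This is only an illustration of the quoted rules, not Danbooru's or Shimmie's code; `add_alias`, `AliasRejected` and the dict-based store are assumptions.

```python
class AliasRejected(Exception):
    """Raised when a new alias would break the chaining rules."""


def add_alias(aliases: dict, old: str, new: str) -> None:
    """Add old -> new to `aliases` (a dict of old tag -> new tag)."""
    if old == new:
        raise AliasRejected(f"{old} cannot be aliased to itself")
    if new in aliases:
        # Rule 2: `new` is already aliased away (B -> C exists),
        # so A -> B is rejected.
        raise AliasRejected(f"{new} is already aliased to {aliases[new]}")
    # Rule 1: existing aliases that point at `old` (A -> B) are updated
    # to point at `new` (A -> C).
    for source, target in list(aliases.items()):
        if target == old:
            aliases[source] = new
    # In this sketch, re-adding an existing `old` tag overwrites its target.
    aliases[old] = new
```

Because every stored alias then points at a tag that is not itself aliased, a single lookup resolves any tag, and a direct loop such as foo -> bar plus bar -> foo is rejected by the second rule.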


what do we do about aliases with multiple input tags?

I'm leaning towards blocking them; they seem more dangerous than useful. For that example, Danbooru surprisingly has 12 results for final_fantasy_viii cid_highwind. I think inferring a tag from a combination of other tags isn't worth automating and should be left to 'related tag' suggestion features.
