Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: avoid duplicate tag inserts by checking if the mapping exists already in db #45580

Conversation

yemkareems
Copy link
Contributor

fix: do a select in systemtag_object_mapping to see if tag exists already in db. if it does not exist alone insert the same or else do nothing

  • Resolves: #

Summary

TODO

  • ...

Checklist

@yemkareems yemkareems added the 3. to review Waiting for reviews label May 30, 2024
@yemkareems yemkareems added this to the Nextcloud 30 milestone May 30, 2024
@yemkareems yemkareems self-assigned this May 30, 2024
@szaimen
Copy link
Contributor

szaimen commented May 30, 2024

Sorry, cannot judge the code

@szaimen szaimen removed their request for review May 30, 2024 09:51
@kesselb
Copy link
Contributor

kesselb commented May 31, 2024

SystemTagObjectMapper / oc_systemtag_object_mapping state if a tag belongs to a given object (e.g. a file). The code does a select to see if a given mapping already exists, not a tag. I suggest updating the commit message accordingly.

Is there a reason for this change and the additional select? The situation that a mapping already exists is handled properly by the catching the UniqueConstraintViolationException.

@artonge
Copy link
Contributor

artonge commented Jun 1, 2024

Is there a reason for this change and the additional select? The situation that a mapping already exists is handled properly by the catching the UniqueConstraintViolationException.

The catch is enough to prevent Nextcloud logs, but not enough to prevent logs from the DB which apparently can grow very large in short time spans.

@kesselb
Copy link
Contributor

kesselb commented Jun 2, 2024

The catch is enough to prevent Nextcloud logs, but not enough to prevent logs from the DB which apparently can grow very large in short time spans.

Thanks.

I'm aware about this problem with filecache and authtokens.
This is indeed an annoying situation, especially with PostgreSQL.

It surprised me a bit to see this problem with the systemtag_object_mapping table as well.
Is that in combination with files_automatedtagging or recognize?

@kesselb
Copy link
Contributor

kesselb commented Jun 2, 2024

For some cases, it's possible to use our "insertIgnoreConflict" method:

/**
*
* Insert a row if the row does not exist. Eventual conflicts during insert will be ignored.
*
* Implementation is not fully finished and should not be used!
*
* @param string $table The table name (will replace *PREFIX* with the actual prefix)
* @param array $values data that should be inserted into the table (column name => value)
* @return int number of inserted rows
* @since 16.0.0
*/
public function insertIgnoreConflict(string $table, array $values) : int;

For example:
4361019

@artonge
Copy link
Contributor

artonge commented Jun 3, 2024

For some cases, it's possible to use our "insertIgnoreConflict" method:

Apparently:

  • Implementation is not fully finished and should not be used!

The implementation only covers PostgreSQL, is there any equivalent for MySQL?

@kesselb
Copy link
Contributor

kesselb commented Jun 3, 2024

The implementation only covers PostgreSQL, is there any equivalent for MySQL?

INSERT IGNORE (also non-standard unfortunately)

@Altahrim
Copy link
Collaborator

Altahrim commented Jun 4, 2024

INSERT IGNORE is a bit dangerous because it will also ignore other errors

My proposal: #45655

@bigcat88
Copy link
Member

bigcat88 commented Jun 7, 2024

can I ask if there a backport planned for this(or a linked PR) for NC28 or NC29 or it is not a option for this PR and these really cool changes will come only with NC30?

@kesselb
Copy link
Contributor

kesselb commented Jun 7, 2024

I read Alexander's support ticket and #44933 and now the change makes much more sense to me.

To my understanding, the "duplicate key value violates unique constraint" errors for filecache and auththokes are caused by operations running in parallel.

The error for systemtag_object_mapping because our implementation is incomplete or at least strange ;)

Take the following flow:

image

  1. Upload an image
  2. Rename the image
  3. Move the image somewhere else

The flow is triggered for all operations and will trigger the warning for 2. and 3.

So it's not an edge case or a parallel operation. The api consumer like files_automatedtagging or recognize expect that assignTags takes care if a tag is already assigned.

It would be still nice to improve the commit message (it's checking if the mapping does exist and not the tag).
I think we could backport it as well, as it's more a bug fix than a feature.

@yemkareems yemkareems changed the title fix: avoid duplicate tag inserts by checking if tag exists already in db fix: avoid duplicate tag inserts by checking if the mapping exists already in db Jun 19, 2024
@yemkareems
Copy link
Contributor Author

can I ask if there a backport planned for this(or a linked PR) for NC28 or NC29 or it is not a option for this PR and these really cool changes will come only with NC30?

@bigcat88 we will backport this to NC28 and NC29

@yemkareems
Copy link
Contributor Author

/backport to stable29

@yemkareems
Copy link
Contributor Author

/backport to stable28

@kesselb
Copy link
Contributor

kesselb commented Jun 19, 2024

cc @marcelklehr because I recall some reports for recognize about it. Not sure though if about that exact problem, but related to the tagging.

Copy link
Member

@marcelklehr marcelklehr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

…eady in db. if it does not exist alone insert the same or else do nothing

Signed-off-by: yemkareems <yemkareems@gmail.com>
…ound the select and insert

Signed-off-by: yemkareems <yemkareems@gmail.com>
…arting the inserts

Signed-off-by: yemkareems <yemkareems@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3. to review Waiting for reviews bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants