[Bug] Duplicated styles saved on cloud #1906

Open
YisusChrist opened this issue Feb 1, 2025 · 21 comments

Comments

@YisusChrist

Description

  1. Sync local styles to cloud
  2. Reinstall Stylus without syncing anything
  3. Import/Create the same styles that were synced previously
  4. Sync styles again to pull cloud styles

When pulling styles from the cloud, Stylus doesn't detect whether they are already present in local storage, so all of them are downloaded again and get duplicated. If we then sync again, all the duplicates are pushed to the cloud.

Because of this, every time I do a clean installation of Stylus without any styles and then try to pull my styles, it pulls a bunch of duplicated styles.

P.S. I only sync my styles to Google Drive, but I guess other storage providers behave the same way.

System Information

  • OS:
  • Browser:
  • Stylus Version:

Screenshots, links, CSS

Image

I only have a total of 118 unique styles, but there are like 2 or 3 copies of the same style for each of them, resulting in having a large collection of dupes:

Image

All my styles are imported from the Catppuccin userstyles collection.

Here is an export of my current collection of duplicated styles:

stylus-2025-02-01.json

I don't know how to remove the dupes without manually going through them one by one. I already tried the method described here, with no success.

@YisusChrist YisusChrist added the bug label Feb 1, 2025
@tophf tophf added the sync label Feb 1, 2025
@eight04
Collaborator

eight04 commented Feb 1, 2025

Create the same styles that were synced previously

When you create a style, it is not the same style in your cloud.

  • Import

When importing styles, I think the importer will check if there are duplicated names and ask you whether to keep both of them. So you can sync first before importing styles.

However, you won't want to import those styles if they are synced and in your database.

@eight04
Collaborator

eight04 commented Feb 1, 2025

I don't know how to remove the dupes without manually going through them one by one.

I still suggest deleting them manually because clearing the database won't clear your cloud.

You have three options:

  1. Delete them one by one. Note that it is safer to operate on a single machine. Changes will be synced to other instances.
  2. Just don't care and wait until Stylus has bulk deletion.
  3. Disconnect from the cloud, clear the database from all machines, clear Stylus cloud storage, then import your collection to one machine, connect all machines to the cloud and sync.

@YisusChrist
Author

I checked what you mentioned and can confirm that Stylus handles importing the same userstyles properly and does not duplicate them. So, as you explained, the problem only happens with these steps:

  1. Have a clean installation
  2. Import the user styles
  3. Sync from the cloud pulling the same saved styles

Is there a way to let Stylus detect the local styles before pulling to check if they are already present? If not I am ok with the workaround you proposed, but I think it could be a nice feature to have to avoid this problem.

In the end, I decided to use a simple auto clicker script to remove all the styles one by one and then sync the changes to remove all of them from the cloud.

@eight04
Collaborator

eight04 commented Feb 2, 2025

detect the local styles before pulling to check if they are already present?

It does but the sync manager uses an ID to identify a style. This allows you to rename a style and it will still sync.

I think you can try exporting your backup again after syncing, so that the backup stores the IDs.

However, if you import a brand-new collection of styles, they will be installed one by one, resulting in different IDs.
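
To illustrate why ID-based matching leads to duplicates in this scenario, here is a minimal Python sketch. It is not Stylus's actual (JavaScript) sync code; the `Style` class and the `pull` and `install` helpers are hypothetical names chosen for illustration, and the only assumption taken from this thread is that a style is identified by an ID rather than by its name or code.

```python
from dataclasses import dataclass
import uuid


@dataclass
class Style:
    id: str    # sync ID (hypothetical field name)
    name: str
    code: str


def install(name: str, code: str) -> Style:
    """Installing or creating a style assigns a fresh ID."""
    return Style(id=str(uuid.uuid4()), name=name, code=code)


def pull(local: dict, cloud: dict) -> dict:
    """Merge cloud styles into local ones, matching by ID only."""
    merged = dict(local)
    for style_id, style in cloud.items():
        merged[style_id] = style  # known ID: update in place; new ID: add
    return merged


# The cloud copy and a freshly installed copy share name and code, not ID.
cloud_style = install("FOOBAR", "body { color: red }")
local_style = install("FOOBAR", "body { color: red }")
both = pull({local_style.id: local_style}, {cloud_style.id: cloud_style})
print(len(both))  # 2: both copies survive because their IDs differ

# A rename keeps the ID, so it updates the existing entry instead.
renamed = Style(id=cloud_style.id, name="Renamed", code=cloud_style.code)
merged = pull({cloud_style.id: cloud_style}, {renamed.id: renamed})
print(len(merged))  # 1
```

The same property that makes renames work (identity is the ID) is what makes a re-imported style look like a brand-new one.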

@tophf
Member

tophf commented Feb 2, 2025

If styles have the same sourceCode or the same sections code, we should either automatically assume it's the same style and forcefully change the local style's sync id or display a confirmation dialog with all such styles so the user can confirm the assumptions.
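
As a rough sketch of how such candidates could be collected for a confirmation dialog, the Python snippet below groups styles whose source code is byte-for-byte identical. The `sourceCode` field name follows the comment above; the `duplicate_groups` function, the dict layout, and the use of a SHA-256 digest as the grouping key are assumptions made for illustration, not Stylus's implementation.

```python
from collections import defaultdict
import hashlib


def duplicate_groups(styles):
    """Return lists of style IDs whose sourceCode is exactly identical."""
    groups = defaultdict(list)
    for style in styles:
        digest = hashlib.sha256(style["sourceCode"].encode("utf-8")).hexdigest()
        groups[digest].append(style["id"])
    return [ids for ids in groups.values() if len(ids) > 1]


styles = [
    {"id": "a", "name": "FOOBAR", "sourceCode": "body { color: red }"},
    {"id": "b", "name": "FOOBAR", "sourceCode": "body { color: red }"},
    {"id": "c", "name": "Other", "sourceCode": "a { color: blue }"},
]
print(duplicate_groups(styles))  # [['a', 'b']]
```

Each returned group could be shown to the user as one row to confirm or reject.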

@eight04
Collaborator

eight04 commented Feb 2, 2025

same sourceCode or the same sections code, we should either automatically assume it's the same

This will prevent us from duplicating styles.

change the local style's sync id

The sync won't work if two documents use the same ID. If we merge two documents, it will become a new version and trigger the sync again which may result in an infinite loop.

display a confirmation dialog with all such styles so the user can confirm the assumptions.

I think this is the most feasible. We can't use this on sync since sync happens in the background, but it shouldn't be hard to display a message/button in the manager which launches the de-dupe tool from the importer. To prevent data loss, we may want to implement a "trash" first.

@tophf
Member

tophf commented Feb 2, 2025

This will prevent us from duplicating styles.

That would arguably affect one user out of a million. 99.999% of users don't duplicate styles or at least they would change the name, so we can add a check for the same name to ensure it's indeed a duplicate.

The sync won't work if two documents use the same ID

Yeah, I should have said that the sync'd style would overwrite the local style in case the code+name is the same.

@eight04
Collaborator

eight04 commented Feb 2, 2025

I suggest the simpler, the better.

For users, just don't import the same source code multiple times, or they will have multiple styles. Note that importing to each of three synced machines == importing three times to a single machine.

For Stylus, avoid developing an algorithm to guess which styles are 99.999% similar.

@tophf
Member

tophf commented Feb 2, 2025

The way I see it the similarity is exactly 100%, not 99.999%.

@tophf
Member

tophf commented Feb 2, 2025

Also, the simple solution is always something that just works out of the box automatically without additional user involvement. My suggestion is to ignore the theoretical inconvenience to one user in a million and just clobber the local style if its code and name are equal to the remote style.

@eight04
Collaborator

eight04 commented Feb 2, 2025

How would it handle duplicated styles on the cloud or styles both on the cloud and local?

@tophf
Member

tophf commented Feb 2, 2025

Normally there are no duplicates in the cloud and no duplicates locally, so the automatic resolution is rather trivial: overwrite the local unsync'd style if its code and the name are exactly the same as the remote style.

@eight04
Collaborator

eight04 commented Feb 2, 2025

Normally there are no duplicates in the cloud and no duplicates locally

In this issue, the duplicates are both in the cloud and on the local machines.

If it is for common cases, we shouldn't add this duplicate check to each sync pull, since normally there are no duplicates in the cloud and no duplicates locally.

unsync'd style

How to identify an unsync'd style?

@eight04
Collaborator

eight04 commented Feb 3, 2025

Here is a small demonstration of how this kind of automatic deletion may lead to data loss:

Suppose A has File(a) and B has File(b). The boss told them that when they see duplicated files, they should keep only the file from the cloud.

  1. A: File(a) | cloud: (empty) | B: File(b)
  2. A syncs first
  3. A: File(a) | cloud: File(a) | B: File(b)
  4. B syncs
  5. A: File(a) | cloud: File(a),File(b) | B: File(a),File(b)
  6. B notices there is already a file on the cloud, so he should delete File(b)
  7. A syncs
  8. A: File(a),File(b) | cloud: File(a),File(b) | B: File(a),File(b)
  9. A notices there is a new version of the file on the cloud, so he should delete File(a)
  10. B deletes File(b) and syncs
  11. A: File(a),File(b) | cloud: File(a) | B: File(a)
  12. A deletes File(a) and syncs
  13. A: (empty) | cloud: (empty) | B: File(a)
  14. B syncs
  15. A: (empty) | cloud: (empty) | B: (empty)

In reality, a company may use a never-delete-anything policy and let the boss/manager handle version conflicts or backups.
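
The sequence above can be replayed with a small Python simulation. This is only a toy model written for this thread: the `Cloud` and `Client` classes, the tombstone set used to propagate deletions, and the "mirror the cloud after pushing" sync are all simplifying assumptions, not how Stylus or db-to-cloud actually work.

```python
class Cloud:
    def __init__(self):
        self.files = set()
        self.tombstones = set()  # deletions already pushed by some client


class Client:
    def __init__(self, files):
        self.files = set(files)
        self.pending_deletes = set()  # local deletions not yet pushed

    def delete(self, name):
        self.files.discard(name)
        self.pending_deletes.add(name)

    def sync(self, cloud):
        # Push local deletions and files, then mirror the cloud state locally.
        cloud.tombstones |= self.pending_deletes
        self.pending_deletes.clear()
        cloud.files = (cloud.files | self.files) - cloud.tombstones
        self.files = set(cloud.files)


def simulate():
    cloud, a, b = Cloud(), Client({"File(a)"}), Client({"File(b)"})
    a.sync(cloud)          # steps 2-3
    b.sync(cloud)          # steps 4-5: B now sees both copies
    a.sync(cloud)          # steps 7-8: A now sees both copies
    b.delete("File(b)")    # step 10: B drops its own copy
    b.sync(cloud)
    a.delete("File(a)")    # step 12: A drops its own copy
    a.sync(cloud)
    b.sync(cloud)          # step 14
    return a.files, cloud.files, b.files


print(simulate())  # (set(), set(), set()): every copy is gone
```

Each client followed the rule correctly from its own point of view, yet both copies end up deleted.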

@tophf
Member

tophf commented Feb 3, 2025

we shouldn't add this duplicate check to each sync pull

Indeed.

How to identify an unsync'd style?

Its _rev is not present in the cloud.

automatic deletion may lead to data loss:

There won't be any since deduplication applies only to unsync'd styles. We can also start by limiting it to external (updatable) styles. Either way, the current behavior is super inconvenient and must change.

@eight04
Collaborator

eight04 commented Feb 3, 2025

Its _rev is not present in the cloud.

Can you elaborate?

_rev is a tag locally set by the client. Two documents will have different _rev. Two versions of one document will have different _rev.


One practical method is to only run the dedupe tool on the first sync. During the first sync, we can assume all styles are unsync'd:

  1. Push all styles to an "unsync'd" array.
  2. Sync.
  3. Now we can safely dedupe the "unsync'd" styles without causing trouble for the cloud (or for styles from other clients).

But it won't work with already-synced clients, e.g.:

  1. setup two synced clients A and B
  2. create a FOOBAR style on client A
  3. create a FOOBAR style on client B
  4. sync A
  5. sync B

Now B will have two FOOBAR styles.
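
Both the first-sync idea and its limitation can be sketched in a few lines of Python. Everything here is hypothetical: the `sync` function, the `first_sync` flag, the dict layout, and the rule "same name and code means duplicate" are assumptions for illustration, and the merge-by-ID step is only a stand-in for the real sync.

```python
def key(style):
    return (style["name"], style["code"])


def sync(local, cloud, first_sync):
    """Return the local styles after a pull; styles are dicts with id/name/code."""
    # Step 1: on the first sync, treat every local style as unsync'd.
    unsynced_ids = {s["id"] for s in local} if first_sync else set()
    # Step 2: merge by ID, as a stand-in for the real sync.
    by_id = {s["id"]: s for s in local}
    by_id.update({s["id"]: s for s in cloud})
    # Step 3: drop unsync'd styles that duplicate a style from the cloud.
    cloud_ids = {s["id"] for s in cloud}
    cloud_keys = {key(s) for s in cloud}
    return [
        s for s in by_id.values()
        if not (s["id"] in unsynced_ids
                and s["id"] not in cloud_ids
                and key(s) in cloud_keys)
    ]


on_a = {"id": "a-1", "name": "FOOBAR", "code": "body { color: red }"}
on_b = {"id": "b-1", "name": "FOOBAR", "code": "body { color: red }"}

# Fresh client B doing its first sync: the local copy is deduplicated.
print(len(sync([on_b], [on_a], first_sync=True)))   # 1

# B was already synced before FOOBAR was created: no dedupe, two copies.
print(len(sync([on_b], [on_a], first_sync=False)))  # 2
```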

@tophf
Member

tophf commented Feb 3, 2025

Sorry, I meant _id, which is the UUID.

@eight04
Collaborator

eight04 commented Feb 3, 2025

If the cloud supports HEAD requests, it should be possible to ask the cloud whether a style exists. However, if there was an incomplete sync, stale files might be left on the cloud, resulting in false positives. There should be no false negatives, since db-to-cloud was designed to prevent data loss.

The cost is very large, though. Usually a single pull takes 5 requests: LOCK -> meta -> changes -> file -> UNLOCK. If we ask the cloud about each style, the number of requests will grow with the number of local styles.
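
A back-of-the-envelope Python sketch of that cost, assuming one extra existence check per local style (that per-style count is an assumption; the five base requests are the ones listed above, and `requests_per_pull` is a hypothetical helper):

```python
BASE_PULL = ["LOCK", "meta", "changes", "file", "UNLOCK"]


def requests_per_pull(local_styles: int, check_existence: bool) -> int:
    """Estimate the request count for one pull."""
    # Assumption: one extra request (e.g. HEAD) per local style.
    extra = local_styles if check_existence else 0
    return len(BASE_PULL) + extra


# 118 is the number of unique styles mentioned in the issue description.
print(requests_per_pull(118, check_existence=False))  # 5
print(requests_per_pull(118, check_existence=True))   # 123
```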

@tophf
Member

tophf commented Feb 3, 2025

I think we can limit it to the first sync after import.

@eight04
Collaborator

eight04 commented Feb 3, 2025

If that is the only case, we probably don't need to ask the cloud and can reuse the dedupe logic from the importer:

  1. When importing styles, separate them from others (e.g. use a special ID prefix imported-*)
  2. Run the dedupe tool. This will check against local styles.
  3. Sync
  4. Run the dedupe tool again, this time it will also check styles from the cloud.
  5. Move those imported styles back to normal.
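
The five steps above could look roughly like this in Python. This is a sketch under stated assumptions, not the importer's real logic: the database is modeled as a plain dict keyed by ID, "duplicate" means identical name and code, `pull` is a stand-in for the real sync, and all function names are hypothetical.

```python
PREFIX = "imported-"  # hypothetical marker, after the "imported-*" idea above


def key(style):
    return (style["name"], style["code"])


def import_styles(db, styles):
    """Step 1: store imported styles under prefixed IDs."""
    for style_id, style in styles.items():
        db[PREFIX + style_id] = style


def dedupe(db):
    """Steps 2 and 4: drop imported styles that match a regular style."""
    known = {key(s) for i, s in db.items() if not i.startswith(PREFIX)}
    for i in [i for i in db if i.startswith(PREFIX) and key(db[i]) in known]:
        del db[i]


def pull(db, cloud):
    """Step 3: a stand-in for sync that merges cloud styles by ID."""
    db.update(cloud)


def finalize(db):
    """Step 5: turn surviving imported styles into regular ones."""
    for i in [i for i in db if i.startswith(PREFIX)]:
        db[i[len(PREFIX):]] = db.pop(i)


db = {}
cloud = {"c1": {"name": "FOOBAR", "code": "body { color: red }"}}
import_styles(db, {
    "n1": {"name": "FOOBAR", "code": "body { color: red }"},
    "n2": {"name": "Other", "code": "a { color: blue }"},
})
dedupe(db)       # nothing to drop yet: the database was empty
pull(db, cloud)
dedupe(db)       # imported-n1 matches c1 and is dropped
finalize(db)
print(sorted(db))  # ['c1', 'n2']
```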

@tophf
Member

tophf commented Feb 3, 2025

Looks complicated to the user. I want it to be automatic. Given the possibility of a mistake, the ideal solution would be two-fold: 1) automatic deduplication on first sync + 2) moving the local duplicates into a trash can.
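
A minimal Python sketch of that two-fold idea, with the trash modeled as a second dict so a wrong guess can be undone. The function names, the dict layout, and the "same name and code, ID unknown to the remote" rule are assumptions for illustration only.

```python
def key(style):
    return (style["name"], style["code"])


def dedupe_first_sync(local, remote, trash):
    """Move local styles that duplicate a remote style into the trash."""
    remote_keys = {key(s) for s in remote.values()}
    for style_id in list(local):
        if style_id not in remote and key(local[style_id]) in remote_keys:
            trash[style_id] = local.pop(style_id)


def restore(local, trash, style_id):
    """Undo a wrong guess by moving a style back out of the trash."""
    local[style_id] = trash.pop(style_id)


local = {"l1": {"name": "FOOBAR", "code": "body { color: red }"}}
remote = {"r1": {"name": "FOOBAR", "code": "body { color: red }"}}
trash = {}

dedupe_first_sync(local, remote, trash)
print(sorted(local), sorted(trash))  # [] ['l1']

restore(local, trash, "l1")
print(sorted(local), sorted(trash))  # ['l1'] []
```

Because nothing is deleted outright, the automatic step stays reversible even when the duplicate detection is wrong.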
