Double scrobbles occur when backlog is processed #121
Comments
Do you know if the songs that are double scrobbled are from spotify or any other source specifically? |
I only use Spotify at the moment. Is the backlog entirely scrobbled and it's left up to maloja to deny duplicates, or does multiscrobbler remove backlog that it knows has already been scrobbled? What other sources support backlog? |
Every Source (in MS) has "discovery" detection where it determines if some play is "new" to MS, i.e. MS has not seen it before. At startup the discovered tracks list for all Sources is empty (because MS has just started). If a Source can fetch backlog it does so and triggers Discovered events for all plays found. After startup, discovery detection is done by the in-memory state player that tracks real-time plays. Each Scrobbler (in MS) listens for Discovered events. When it gets one it refreshes its own list of scrobbled plays from the scrobbler API (e.g. the Maloja Scrobbler in MS gets the last 20 plays from the Maloja API) and then compares the play info/date for the Discovered play against the scrobbles from the API. If it does not find a close enough match it considers the play unique and scrobbles it. The upstream scrobblers (Maloja API, Lastfm, etc...) do not do any duplicate detection on submissions AFAIK.
Lastfm, Listenbrainz, Deezer, and Spotify support backlogging. |
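The flow described above can be sketched roughly as follows. This is only an illustration, not MS's actual code: the `Play`, `Source`, and `Scrobbler` names, the `fetch_recent`/`submit` callbacks, and the simple "same artist/title within a few seconds" check are all assumptions made for the example (MS itself uses a weighted comparison).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Play:
    artist: str
    title: str
    played_at: datetime


class Source:
    """Hypothetical source: remembers plays seen since this process started."""

    def __init__(self):
        self._seen = set()  # empty at startup, so all backlog looks "new"

    def discover(self, plays):
        """Return only the plays not seen before and remember them."""
        new = [p for p in plays if p not in self._seen]
        self._seen.update(new)
        return new


class Scrobbler:
    """Hypothetical scrobbler client that checks the upstream API before submitting."""

    def __init__(self, fetch_recent, submit, tolerance=timedelta(seconds=10)):
        self.fetch_recent = fetch_recent  # returns recent scrobbles from the API
        self.submit = submit              # sends one play to the API
        self.tolerance = tolerance        # assumed value, for illustration only

    def _close_enough(self, play, existing):
        same_track = (play.artist.casefold() == existing.artist.casefold()
                      and play.title.casefold() == existing.title.casefold())
        return same_track and abs(play.played_at - existing.played_at) <= self.tolerance

    def on_discovered(self, play):
        """Refresh the existing scrobbles, then submit only if nothing matches."""
        recent = self.fetch_recent()
        if any(self._close_enough(play, s) for s in recent):
            return False
        self.submit(play)
        return True
```

Because the upstream service does no duplicate detection of its own, everything depends on the comparison in `_close_enough` being tolerant enough for the same play reported with slightly different metadata or timestamps.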
* Refactor using 'close' boolean to 'match' granularity
* Makes using granularity for future logic easier
* Easier logging for granularity in summary
* Remove intermediate temporal functions in classes for DRY and so we can use comparison results
* Add Time Detail to match breakdown for more visibility during logging
@wynkenstein @duckfromdiscord I've added more logging visibility into duplicate scrobble checks related to time logic. Available in the The next time you start MS please modify your maloja client config (
[
  {
    "name": "default",
    "enable": true,
    "data": {
      "url": "MyMalojaUrl",
      "apiKey": "MyKey",
      "options": {
        "verbose": {
          "match": {
            "onNoMatch": true,
            "confidenceBreakdown": true
          }
        }
      }
    }
  }
]
This will make MS log additional info every time it scrobbles a new play to Maloja that looks like this...
Please post logs including this data when you see an erroneous/duplicate scrobble, thanks |
Alright, I've modified my config and will try to force it by playing a few songs via Spotify without the scrobbler running. After that I will start it again to see how it processes my backlog. |
Ok, here we go. After some random plays on Spotify and shutting down Multiscrobbler and turning it back on twice, there was another dupe from my backlog. In Maloja it says:
Now here is the part from my log where it happened: (I just removed my API key for Maloja)
Edit: oh, I guess it doesn't have the information you needed? I have to check out the develop branch for this, right? |
Ok, next try! After installing the develop version I did the following:
The same track (and this time 2 more!) got duped again, here is the relevant information on Maloja:
now for the logging part:
|
Thank you for the logs! This is super helpful.
This one isn't time related, funnily enough. It's two separate issues:
The combination of those differences is enough for MS to consider it not the same track even though the time is similar. There is some improvement I can do on the time comparison so I'll see if that puts it over the fence on considering it a dupe...
With these duplicates, if you can remember, did you listen to most of the track but skip to the next song early? For example EDIT: I ask because I think the main issue, temporally, is that Spotify records the play timestamp as when the player stopped playing the track, whereas MS (and I think most other scrobblers) record it as when the player started the track. |
Thank you for looking into it! In this specific test run, all tracks ran completely on my iPad, no skipping. And for the special track with multiple names, it MAY be possible, yes, but the duplicates I had a few weeks and months ago were completely random. Even with short names like "Spaces laces" etc. It always happens in some way after I had to restart the scrobbler (mostly after component updates on the machine it runs on) and everything it finds in that Spotify backlog list is a potential candidate for dupes. EDIT: ah, just saw your edit. What I can test is what happens with songs I pause for a moment (because this happens here and there while listening). But it will take a while to build up some more new listens. EDIT 2: and just to be clear, when the scrobbler runs for weeks, there are absolutely no dupes in Maloja. It always happens after initializing when there is a backlog from Spotify |
Pausing shouldn't affect things as MS doesn't scrobble until the track actually changes or Spotify's player is completely stopped but who knows.
Dang. Well, to give you some insight into the logging you're seeing:
MS takes a new scrobble and compares it to all the existing scrobbles it got from the scrobbler's (Maloja) API. For each existing scrobble it compares Artist, Title, and Time, which are scored with weights (0.3, 0.4, 0.5 respectively). If the scores sum up to at least Artist and Title use a string comparison mechanism that normalizes a bunch of things and then compares how different the strings are. That's the first number for each. Time uses a fixed score based on how close, temporally, the timestamps of the two scrobbles are.
fuzzy is used to account for things like Spotify, where the Maloja scrobble timestamp is at the start of the track but the Spotify timestamp is at the end. MS will take the Maloja timestamp and add the track duration to it, effectively getting a timestamp at the "end" of the track, then compare that against the Spotify timestamp. If the two are within 10 seconds of each other then it's "fuzzy". You can see these diffs in the logs as I'm still going to try changing scrobbles from Spotify to use the timestamp when the track changes and see if that makes a difference. |
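To make the scoring concrete, here is a minimal sketch of a weighted comparison like the one described above. The weights (0.3, 0.4, 0.5) and the 10 second fuzzy window come from the comment; everything else is an assumption for illustration: the `Scrobble` type, the use of `difflib.SequenceMatcher` as the string comparison, the fixed values in `TIME_SCORES`, and the `threshold` argument (the thread does not state the real cutoff, so the caller has to supply one).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from difflib import SequenceMatcher

WEIGHTS = {"artist": 0.3, "title": 0.4, "time": 0.5}  # weights quoted in the thread
FUZZY_WINDOW = timedelta(seconds=10)                   # window quoted in the thread
# Hypothetical fixed time scores; the real values are not given in the thread.
TIME_SCORES = {"exact": 1.0, "fuzzy": 0.6, "none": 0.0}


@dataclass
class Scrobble:
    artist: str
    title: str
    played_at: datetime
    duration: timedelta


def normalize(text):
    """Very rough normalization: case-fold and collapse whitespace."""
    return " ".join(text.casefold().split())


def string_score(a, b):
    """Similarity in [0, 1]; a stand-in for MS's own string comparison."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def time_match(existing, candidate):
    """Classify how close the two timestamps are."""
    if abs(candidate.played_at - existing.played_at) <= timedelta(seconds=1):
        return "exact"
    # Existing scrobble is stamped at track start, candidate at track end:
    # shift the existing timestamp by the track duration before comparing.
    existing_end = existing.played_at + existing.duration
    if abs(candidate.played_at - existing_end) <= FUZZY_WINDOW:
        return "fuzzy"
    return "none"


def match_score(existing, candidate):
    """Weighted sum of artist, title, and time scores."""
    return (WEIGHTS["artist"] * string_score(existing.artist, candidate.artist)
            + WEIGHTS["title"] * string_score(existing.title, candidate.title)
            + WEIGHTS["time"] * TIME_SCORES[time_match(existing, candidate)])


def is_duplicate(existing_scrobbles, candidate, threshold):
    """True if any existing scrobble scores at or above the caller's threshold."""
    return any(match_score(e, candidate) >= threshold for e in existing_scrobbles)
```

With weights like these, a perfect artist and title match alone only reaches 0.7, so whether a play counts as a duplicate can hinge on the time component, which is why a start-versus-end timestamp mismatch matters so much for backlog plays.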
* Track datetime player "completes" play at
* Add scrobble datetime SOC metadata
* Log SOC used when printing play datetime
* Use hinted SOC scrobble datetime when comparing existing scrobbles
* Use hinted SOC scrobble datetime when building scrobbler client scrobble payload
* Hint Spotify backlog plays use END play date for scrobble SOC
@wynkenstein I've deployed a version of MS that uses timestamps from when Spotify changes tracks like mentioned above. Seems to be working fine for me so far. You can test it using the |
Thank you. I don't use docker, but I guess I can checkout the "playCompletedContext" branch for this to test? |
ah yes that works! |
Unfortunately, same behaviour as before with this version. I just listened to a few tracks, then started MS to process the backlog again and this time a lot of dupes happened. Maloja:
here is my log:
|
I want to confirm how you are testing this. With the changes in the
The only scrobbles that will be relevant for checking for duplicate changes will be those that were scrobbled after step 3 because the branch changes how/when it scrobbles from Spotify compared to the develop/master branch. I ask because the maloja scrobble timestamps in your log look like they are when the track started rather than at the end -- compare the I've been testing Do you know if you have Automix enabled? Are you playing these on your iPad only, or are you using Spotify Connect for any listening? |
I'd like to help a bit. What I'll do is, I'll screenshot maloja, update multi-scrobbler (how do I do that again?) and restart it, then check the logs. The screenshot is just so I know what was scrobbled again, I'm not sending that |
To answer your questions:
This is exactly what I did (plus npm install before npm build) but with one difference at step 4. I didn't listen while MS was running. I listened to a handful of songs THEN started MS to catch up with the backlog, and that's how I got my results and dupes.
I don't use Automix. |
How are the original scrobbles getting into Maloja? If MS isn't recording them where are they coming from? |
Maybe we have a misunderstanding here. As mentioned in my post a few days ago
So, the problem with dupes only appears if I have to restart the scrobbler for some reason. As long as it "just runs" it works without any problems |
Ok think I got it... Then you need to do step 4 before testing backlog dupes because
So please listen to some music while using MS from the new branch to scrobble. Then try backlog and check for dupes only on the tracks that were scrobbled using the new branch. |
Alright, will do that. Thanks for trying to figure this out |
Ok, it looks very promising :) I listened to about 6 tracks while MS was running, and those tracks were saved in Maloja just fine (as always). After that I terminated MS and listened to 7 more tracks, waited a few minutes and started MS again for backlog scrobbling. So far no dupes, everything in Maloja seems to be saved correctly. |
Hi. Just did a few tests again and didn't encounter any dupes or problems so far. I guess it's safe to say the problem is fixed. |
@duckfromdiscord these improvements are in the flatpak package now as well |
Just updated (to 0.6.3?) and enabled verbose logging, but still getting double scrobbles; here's an example of a logged one
I started with one scrobble of this song at Here are a couple more:
|
@duckfromdiscord were these tracks scrobbled after the update or before? Duplicates will only be fixed for backlog scrobbling of tracks that were tracked after updating |
My apologies, this was with the version from flatpak. |
Do I clone |
No you're good 0.6.3 flatpak is what you need. I'm trying to make sure that the dups/testing you are seeing is for scrobbles made after you started using 0.6.3. As in:
|
Good point. All messages from now on will be with the updated version.
After I updated, multi-scrobbler randomly stopped ("Terminated", perhaps out of memory or CPU) and when I restarted it, everything re-scrobbled but most of them with the exact same time. Since I'm running in a Stuff got pretty complicated after this update:
A couple of these started happening too:
On the maloja side we'll see It's strange! It seems that a lot of the duplicates are at the same time as the originals. There are still some that are a couple of minutes off, but more of them show up at the exact times than in the previous updates. In fact, we'll see entirely different songs showing up during the "Closest Scrobble" comparison. Where will I find log files? Copying stuff from the |
Here's a backlog that definitely happened after the update, that I discovered while writing #130
I notice the two songs here are entirely different, but if you look at maloja, the double scrobble is right next to the old one! |
You can enable file logging for warning/errors by modifying
"logging": {
  "level": "debug",
  "file": "warn"
}
MS must have file permissions to write to
This is a new error thrown by maloja that is not yet included in an official release. Previously it silently ignored the error and did not scrobble the duplicate. 0.6.4 handles this as a dead letter scrobble now. |
No, it's (potentially) addressing the double scrobble from your comment above. P.S. I know you sometimes redact artist/track info in the logs... please leave all information in as it can be critical to fixing the bug, like in the example above. Thanks |
Is my issue different from @.wynkenstein's then? It seems like their issue is not caused by Unicode/non-English characters. |
If you feel #130 has been solved by switching to docker please feel free to close it. I don't know if all of your issues were caused by non-english characters. This issue is a "catch-all" for double scrobbling on backlog. Until you are satisfied it has been resolved the issue stays open. |
non-english characters bug fix released in 0.6.5 |
@duckfromdiscord how has behavior been since this fix? |
I haven't had any crashes since I moved to Docker so I haven't had any chances to test backlog scrobbling! I'll have to start manually restarting the container every so often to check behavior |
It looks like after a restart I'm getting new scrobbles of songs I didn't finish/started and paused within the first minute or so.
I got three NEW scrobbles (i.e. scrobbles that did not have any of the same song right near them), but I did not see any new "double" scrobbles where the same track has two different entries within 3 minutes of each other. That was the issue I had previously before your update and my move to Docker. This container runs It's worth noting that these were the only three tracks that were displayed in the log, though I listened to many more tracks yesterday. I did have to truncate this log just a bit to remove the API key. |
If Spotify says you listened to them who are we to disagree? ¯\_(ツ)_/¯ Interesting that it didn't discover any more tracks but it sounds like it's working as intended. |
I only got to around 30 seconds in each of these tracks. If I had listened to them they would have been scrobbled already since multi-scrobbler had already been up and running for over half a week at the times of these plays. |
Spotify's listening history is a black box, they don't document how it works. If you'd prefer to use MS only you can disable backlog scrobbling in
[
  {
    "name": "mySpotify",
    "enable": true,
    "clients": [],
    "data": {
      "clientId": "...",
      "clientSecret": "...",
      "redirectUri": "http://localhost:9078/callback"
    },
    "options": {
      "scrobbleBacklog": false
    }
  }
] |
I know about Is the listen length saved in the history? And if so, could that tell us whether the plays were too short to be considered backlog? If the listen lengths stored by Spotify in the history are the same ones shown in the payloads there, then they're either just stored by Spotify incorrectly entirely or also count pause time. I would also like to add that these were not the only discovered tracks, these were the only scrobbled ones. It always finds 40 recent scrobbles, but these three were the only ones that ended up being scrobbled. |
I do agree that it sounds like multi scrobbler is working as intended; it is skipping scrobbles that have already happened, and scrobbling ones that have not. So it sounds like the main issue is fixed (I will continue to test to make sure of this). The new issue is that either Spotify or multi scrobbler are qualifying short scrobbles/"skips" to be tested in the first place. |
Spotify does not store the duration the track was listened to. Though in my past experience it has not added the track to history unless it was fully played. Who knows... they may have changed that or it may apply differently depending on the device listened to. |
It was my phone so perhaps. I believe the duration is stored in the extended listening history (maloja is scripted to drop <30 second scrobbles from the extended history), so I am a bit surprised it is not stored in the history provided by the API. But since that is the case, there is no issue with multi scrobbler as of now. I will try to do a same-day container restart with some music played and see if there's any double scrobbles at that time. EDIT: link had wrong line + I thought this was an issue I had made |
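For reference, dropping short plays from an extended-history export could look something like the sketch below. It assumes each entry is a dict with an `ms_played` field, as in Spotify's extended streaming history export; the function name and the 30 second default are illustrative and not taken from maloja's code.

```python
MIN_PLAYED_MS = 30_000  # 30 seconds, the cutoff mentioned in the comment above


def drop_short_plays(entries, min_ms=MIN_PLAYED_MS):
    """Keep only entries played for at least min_ms milliseconds.

    Entries missing the assumed `ms_played` field are treated as 0 ms and dropped.
    """
    return [e for e in entries if e.get("ms_played", 0) >= min_ms]
```

Nothing like this is possible for backlog plays, since, as noted above, the recently played history from the API carries no listened duration to filter on.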
I think everything is fixed. The only issue I've been having is half-listened songs being backlog scrobbled, and that is both potentially not fixable and also not within the scope of double scrobbles. I just scrobbled some backlog from songs that didn't get scrobbled this morning because my computer decided to update or crash without asking. Everything worked perfectly. |
Closed with 0.7.0 that includes these additional fixes:
(these were already in |
Based on comments from here
Investigate if this is a bug in MS existing scrobble duration check or maybe an issue with timestamp reported from player vs. source