Universal Listener/Audience ID (ULID/UAID) for modern stats #472
Replies: 10 comments
-
I would suggest using UUIDs for the ULID.
-
The ULID is fully documented over here. I'm a little unclear why it isn't a Podcasting 2.0 piece of work, so it's good to see you mention it here. If fully implemented, the ULID would allow an accurate download figure per episode. It won't be fully implemented - so it will only ever be a sample: but even a sample would enable us to understand better the relationship between "a download" and "a person". I'm pleased to see OP3 capture it as part of its data.
That would (by design) allow an app to track a listener. The per-episode ULID is produced expressly to avoid this kind of listener tracking, which already happens (to an extent) using IP addresses. A per-person ULID would be more invasive because it wouldn't depend on an IP address: it would follow the user on whatever connection they are using, even a privacy-enhancing VPN.
That would be a significant privacy issue. It would not be GDPR compliant. It would allow a podcast publisher to tie together the various IP addresses that you use to build a much more invasive dataset about you: not just your household and its buying habits, but also what time you leave the house, where you buy groceries, where you work, who works with you, and where they live. It could see if you visited a church, a strip club, or an abortion clinic.
Tracking listeners is currently not possible with standard podcast delivery (unless you're forced to sign in to an app). When you sign in to an app, at least you can choose who you are signing in with (Pocket Casts, Apple, Spotify). A per-person ULID could potentially share that information with every podcast you ever listen to, and with the prefix companies they use. Worse, listening to an illegal podcast (one covering subjects that are illegal in some parts of the world, like marijuana in some places, abortion clinics in others, or even homosexuality) would be traceable back to the person who listened to it.
I wrote more on podcasting's privacy issues. This is significantly worse than IP addresses, which by themselves are fine, but when combined with other information most certainly are not. You may guess that I am not very keen on this proposal.
That's how ULIDs are currently generated.
-
Boom! Thank you so much for laying it out so well, @jamescridland! I get it now and I'm onboard. (I'll strikethrough some parts of my original post.)
Maybe you meant it like this, or maybe I've misunderstood how it's been discussed before, but one of my concerns is that ULIDs would be generated every time a listener redownloads an episode or requests a different range of bytes. If that is what @daveajones was proposing, then I suggest the app have a way to only ever generate a single ULID for each episode. That way, multiple (full or partial) downloads from the same person would not be tracked as separate listeners.
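To make the "single ULID for each episode" idea concrete, here is a minimal sketch of what an app might do on one device. The class name `EpisodeUlidStore` and the in-memory dictionary are hypothetical; a real app would persist the mapping, and nothing here is taken from an existing implementation.

```python
import uuid


class EpisodeUlidStore:
    """Hypothetical per-device store mapping episode GUIDs to ULIDs."""

    def __init__(self):
        # Assumption: kept in memory here; a real app would persist it.
        self._ids = {}

    def ulid_for(self, episode_guid: str) -> str:
        # Create a random ID the first time an episode is requested,
        # then reuse it for every later full or partial download.
        if episode_guid not in self._ids:
            self._ids[episode_guid] = uuid.uuid4().hex
        return self._ids[episode_guid]
```

Because the ID is random and stored per episode, a redownload or a byte-range request reuses the same value, while two different episodes get unrelated values.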
-
Sidenote, maybe we should call it "UAID" where "A" stands for "audience" instead of the media-restricted label of "listener."
-
This is interesting. One issue comes to my mind from the point of view of an app developer: consider the very common situation where the user has the app installed on more than one device. All devices will download the attachment files in the background as soon as they see a new episode appear in the RSS feed. Should these count as multiple downloads, with each device generating a ULID per episode? Or do we want the ULID system to let us count downloads per user and not per device? In that case the mechanics would be much more complicated, as the app would have to sync ULIDs and wait to download an episode until it is sure of its ULID sync status with all other devices from the same user.
-
I think a way to handle that would be to incorporate user information in how the audience ID is generated. Maybe it's something like user email address + episode GUID then encrypted (so it can't be decoded into an email address). That way, the app would generate the same UAID for the same episode downloaded from different devices, as long as the user is logged in.
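One way to picture this is the sketch below. It swaps "encrypted" for a keyed hash (HMAC-SHA256), which is one-way, so the email address cannot be recovered from the ID. The function name `derive_uaid` and the `account_secret` parameter are assumptions for illustration: the secret stands in for some per-account value that every logged-in device would share, which the comment above does not specify.

```python
import hashlib
import hmac


def derive_uaid(account_secret: bytes, user_email: str, episode_guid: str) -> str:
    """Derive a stable per-user, per-episode ID (illustrative sketch only)."""
    # Normalise the email so every device computes the same input.
    message = f"{user_email.strip().lower()}\n{episode_guid}".encode("utf-8")
    # Assumption: account_secret is shared by the user's logged-in devices
    # and never sent to the podcast host.
    digest = hmac.new(account_secret, message, hashlib.sha256).hexdigest()
    # Truncate to 32 hex characters to keep the URL parameter short.
    return digest[:32]
```

Two devices with the same account would produce the same ID for the same episode without syncing anything per episode, and the ID for one episode reveals nothing about the ID for another as long as the secret stays private.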
-
A big loophole in this is that someone could create a script to automatically download an episode thousands of times with a different UUID each time.
-
It's true that if ULID were actually implemented and relied on, it would be much easier to generate a fake audience. You wouldn't even need to clear the already low bar of using multiple IP addresses; you could simply generate new UUIDs. I like the idea of renaming ULID, since the current name might confuse the concepts of downloading vs listening, downloaders vs listeners. Perhaps UDID (Universal Download ID) is better, although podcastlistening.com would need to be retired too. It might be a good time to get our story straight on receiving high-quality and privacy-preserving listen/play data from some of the larger good-faith actors in the system who are interested in helping out. I've been thinking about this recently, and started a proposal over in #396
-
Just a thought... perhaps the same _ulid=... parameter should also be appended to any requests for resources of the same item, specifically chapters and transcripts. I think this would mean the timestamps of the chapters and transcripts could match the audio when the audio is dynamically generated. This would require some thought, though, as it promotes serving different audio for each download.
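As a rough sketch of what that could look like in an app, the helper below adds the same `_ulid` value to any URL belonging to one item. The function name `with_ulid` and the example.com URLs are made up for illustration; only the `_ulid` parameter name comes from the discussion.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_ulid(url: str, ulid: str, param: str = "_ulid") -> str:
    """Return url with the ULID query parameter set, keeping other parameters."""
    parts = urlsplit(url)
    # Drop any existing ULID parameter so it is never duplicated.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != param
    ]
    query.append((param, ulid))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical URLs for the enclosure, chapters, and transcript of one item.
item_urls = [
    "https://example.com/ep1.mp3",
    "https://example.com/ep1-chapters.json",
    "https://example.com/ep1-transcript.vtt?lang=en",
]
tagged = [with_ulid(url, "abc123") for url in item_urls]
```

A host that builds the audio dynamically could then use the shared value to look up which version it served and return chapters and transcripts with matching timestamps.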
-
I wonder if any kind of benefit a universal audience ID brings could actually be handled better by @johnspurlock's webhook proposal (#396).
-
(Or does the U stand for "Unique"?)
This is not my original idea. I think it was @daveajones who first shared this idea; I only want to start an official discussion about it here.
The idea is for podcast apps to pass a unique ID as a URL parameter each time a media file is downloaded. Such an ID could be hashed by whatever means the app developer sees best to prevent duplicates. But I suggest that the ID be alphanumeric and at least 16 characters long.
Then, podcast-measurement tools could easily count the number of ULIDs for an episode to count the number of listeners. This would get around the issue of multiple downloads from the same IP address (like a school, office, or public wifi).
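On the measurement side, the counting could be as simple as the sketch below. It assumes the tool has already parsed its request logs into `(episode_guid, ulid)` pairs; that record shape and the function name `count_unique_ulids` are hypothetical.

```python
from collections import defaultdict


def count_unique_ulids(requests):
    """Count distinct ULIDs per episode from (episode_guid, ulid) pairs."""
    seen = defaultdict(set)
    for episode_guid, ulid in requests:
        # Requests from apps that send no ULID are skipped here; a real
        # tool would still need IP-based counting for those.
        if ulid:
            seen[episode_guid].add(ulid)
    return {episode: len(ulids) for episode, ulids in seen.items()}


# Repeated downloads with the same ULID are counted once per episode.
sample = [
    ("ep1", "a1"),
    ("ep1", "a1"),
    ("ep1", "b2"),
    ("ep2", "a1"),
    ("ep2", None),
]
counts = count_unique_ulids(sample)
```

Two people behind the same office or school IP address would show up as two ULIDs here, which is the case the paragraph above is concerned with.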
If I understood him correctly, Dave has suggested that the app generates a new ULID with every download, but I think that not only becomes complicated for the app, but could also lead to improper tracking.
~~I propose that the app generates a single ULID for the user's account. That same ULID would be used if the user logs into a different device (like Overcast on iPhone, iPad, or browser). And using the same ULID would allow podcasters to see how engaged that listener is from what the analytics tool does with the data. For example, the analytics could show average episodes per listener, when new listeners appear (great for seeing marketing results), and when listeners leave. None of that would be possible if the ULID changed with each download.
Then, apps could offer an easy way to regenerate the ULID for the listener's privacy, just like Apple Podcasts already does with its proprietary analytics.~~
Update: instead of a new ULID for each time the episode is downloaded (in full or in part), it should be a ULID that doesn't change for that episode. So redownloading/restreaming the episode later would not be tracked as a different listener.