RFC/PROPOSAL: add user_id to event.schema #15
The client_id field had some lingering documentation about where user_ids could go. This commit cleans that up.
Makes sense - I'd consider making this nullable explicitly i.e.,
|
I think this makes sense as well. Users would probably outlive several sessions, and I imagine correlating data across disparate sessions would be troublesome otherwise.
@AnkitSiva would you be willing to update the PR for the null concept? I did a bit of googling and yeah, you need to explicitly include the data type "null": https://turbo360.com/blog/specifying-json-schema-elements-null-in-logic-apps
For a moment I thought, hey, if this isn't in the |
@epugh I'm not sure if I understand what the rationale behind making the |
@AnkitSiva I'm thinking that the common pattern for folks using this would be to say "I am using user_id", and then, for all the places where user_id is null, they would set it explicitly to null, versus skipping the attribute... How about, for now, we just make the change as you have it, and look to dig in more on @miike's suggestion...
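To make the nullable idea discussed above concrete, here is a minimal sketch in Python. The schema fragment is hypothetical (the actual contents of event.schema are not shown in this thread), and it assumes user_id is not a required field. The helper names are invented for illustration, and the check only covers the "string" and "null" types.

```python
# Hypothetical fragment of an event schema; not copied from the real event.schema.
USER_ID_PROPERTY = {
    "user_id": {
        # JSON Schema expresses "nullable" by listing "null" as an allowed type.
        "type": ["string", "null"],
        "description": "ID of the authenticated user; null when unauthenticated.",
    }
}


def matches_type(value, allowed):
    """Check a value against a 'type' keyword limited to string and null."""
    if isinstance(allowed, str):
        allowed = [allowed]
    checks = {
        "string": lambda v: isinstance(v, str),
        "null": lambda v: v is None,
    }
    return any(checks[name](value) for name in allowed)


def user_id_is_valid(event):
    """Accept an omitted, null, or string user_id (assumes the field is optional)."""
    if "user_id" not in event:
        return True
    return matches_type(event["user_id"], USER_ID_PROPERTY["user_id"]["type"])


print(user_id_is_valid({"client_id": "c1", "user_id": None}))
print(user_id_is_valid({"client_id": "c1", "user_id": 7}))
```

Under these assumptions, both the "explicitly null" and the "skip the attribute" patterns pass, so the choice between them can stay a convention for integrators rather than a schema constraint.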
What/Why
What are you proposing?
We propose that the event schema contain a dedicated user ID field so that consumers of UBI data can disambiguate between client_id and user_id, and so that we can better standardize what we recommend integrators track.
What users have asked for this feature?
We have spoken to data analysts who work on analyzing user behavior, and they mentioned that the terms client_id, session_id and user_id have distinct meanings that cannot be merged.
What problems are you trying to solve?
The client_id in analytics parlance usually refers to a hash of the browser and its version. This means that if multiple unauthenticated users use the same browser version, their activity falls under the same ID. The current approach has another caveat: if a user ID is logged in the client_id field, then the behaviors of unauthenticated users won't be logged. With this proposal, that confusion goes away. For an unauthenticated user, the user_id can remain empty.
Are there any security considerations?
No additional security impact, as the existing recommendation was already to track the user ID under the client_id field.
Are there any breaking changes to the API?
No
What is the user experience going to be?
Customers can configure and analyze user behaviors along an additional axis that is well separated from client_id.
Are there breaking changes to the User Experience?
No
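As a rough illustration of the extra analysis axis, the sketch below groups sessions first by client_id and then by user_id. The events, their values, and the `sessions_by` helper are all invented for this example and do not come from any real UBI data. The client_id values stand in for a hash of the browser and its version, as described in the problem statement.

```python
from collections import defaultdict

# Hypothetical events; every value here is made up for illustration.
events = [
    {"client_id": "chrome-126", "session_id": "s1", "user_id": "alice"},
    {"client_id": "chrome-126", "session_id": "s2", "user_id": None},
    {"client_id": "firefox-127", "session_id": "s3", "user_id": "alice"},
    {"client_id": "chrome-126", "session_id": "s4", "user_id": None},
]


def sessions_by(field, events):
    """Group session IDs by the given field; a missing field is treated as None."""
    groups = defaultdict(set)
    for event in events:
        groups[event.get(field)].add(event["session_id"])
    return {key: sorted(ids) for key, ids in groups.items()}


by_client = sessions_by("client_id", events)
by_user = sessions_by("user_id", events)

# Grouping by client_id merges alice with unauthenticated users on the same browser.
print(by_client)
# Grouping by user_id ties alice's sessions together across browsers.
print(by_user)
```

In this made-up data, the client_id grouping puts s1, s2 and s4 together even though only s1 belongs to an authenticated user, while the user_id grouping links s1 and s3 and keeps the unauthenticated sessions under None.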
Why should it be built? Any reason not to?
This will allow the separate customer personas (front-end developer and behavior analyst) to perform their tasks better with less coordination, as it reduces ambiguity around which attributes are tracked in which fields.
What will it take to execute?
Any remaining open questions?
No.