RFC/PROPOSAL: add user_id to event.schema #15

Merged
merged 2 commits into from
Aug 31, 2024

Conversation

AnkitSiva
Contributor

@AnkitSiva AnkitSiva commented Jul 30, 2024

What/Why

What are you proposing?

We propose that the event schema contain a dedicated user ID field so that consumers of UBI data can disambiguate between client_id and user_id and better standardize what we recommend the integrators track.

What users have asked for this feature?

We have spoken to data analysts who work on analyzing user behavior and they mentioned that the terms client_id, session_id and user_id have distinct meanings that cannot be merged.

What problems are you trying to solve?

The client_id in analytics parlance usually refers to a hash of the browser and its version. This means that if multiple unauthenticated users use the same browser version, their activity falls under the same ID. The current approach has another caveat: if a user ID is logged in the client_id field, then the behavior of unauthenticated users won't be logged. With this proposal, that confusion goes away. For an unauthenticated user, the user_id can remain empty.
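
To illustrate the distinction, here is a minimal Python sketch of how an integrator might populate the two fields. The `make_event` helper, the SHA-256 hashing scheme, and the `action_name` key are assumptions made for illustration only; just `client_id` and `user_id` come from this proposal, and "remain empty" is interpreted here as leaving the key out.

```python
import hashlib
from typing import Optional


def make_event(browser: str, version: str, action: str,
               user_id: Optional[str] = None) -> dict:
    """Build a hypothetical event dict (illustrative layout, not the real schema)."""
    # client_id: a hash of the browser and its version, as described above.
    digest = hashlib.sha256(f"{browser}/{version}".encode("utf-8")).hexdigest()
    event = {"action_name": action, "client_id": digest[:16]}
    # user_id is only set for authenticated users; otherwise it is left out.
    if user_id is not None:
        event["user_id"] = user_id
    return event


# Two anonymous visitors on the same browser version share a client_id ...
anonymous_a = make_event("Firefox", "128.0", "search")
anonymous_b = make_event("Firefox", "128.0", "click")
# ... while an authenticated visitor is additionally identified by user_id.
logged_in = make_event("Firefox", "128.0", "search", user_id="user-42")
```

With separate fields, the anonymous events are still logged under their shared `client_id`, and the authenticated event carries both identifiers.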

Are there any security considerations?

No additional security impact, as the existing recommendation was already to track the user ID under the client_id field.

Are there any breaking changes to the API?

No

What is the user experience going to be?

Customers can configure and analyze user behaviors along an additional axis that is well-separated from client_id.

Are there breaking changes to the User Experience?

No

Why should it be built? Any reason not to?

This will let the separate customer personas (front-end developer and behavior analyst) perform their tasks better with less coordination, as it reduces ambiguity around which attributes are tracked in which fields.

What will it take to execute?

  1. Merging this pull request
  2. Documentation and samples updates.

Any remaining open questions?

No.

@epugh epugh added the RFC Request for Comment label Jul 31, 2024
@epugh epugh changed the title feat: add user_id to event.schema RFC/PROPOSAL: add user_id to event.schema Jul 31, 2024
The client_id field had some lingering documentation about where user_ids could go. This commit cleans that up.
@miike
Contributor

miike commented Aug 16, 2024

Makes sense - I'd consider making this nullable explicitly i.e.,

"type": ["string", "null"]

@dtaivpp

dtaivpp commented Aug 20, 2024

I think this makes sense as well. Users would probably outlive several sessions and I imagine correlating data across disparate sessions would be troublesome otherwise.
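
As a rough sketch of the cross-session correlation mentioned above, the following Python groups session IDs by `user_id`. The sample events and the `session_id` and `action_name` keys are made up for illustration and are not taken from the schema in this PR.

```python
from collections import defaultdict

# Hypothetical events; all values here are invented for the example.
events = [
    {"session_id": "s1", "user_id": "user-42", "action_name": "search"},
    {"session_id": "s2", "user_id": "user-42", "action_name": "click"},
    {"session_id": "s3", "user_id": None, "action_name": "search"},
    {"session_id": "s4", "action_name": "search"},  # user_id absent
]

# Map each known user to the set of sessions they appeared in.
sessions_by_user = defaultdict(set)
for event in events:
    user_id = event.get("user_id")  # None when absent or explicitly null
    if user_id is not None:
        sessions_by_user[user_id].add(event["session_id"])
```

Events without a usable `user_id` are skipped here, so they would still need `client_id` or `session_id` for any grouping.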

@epugh
Member

epugh commented Aug 20, 2024

@AnkitSiva would you be willing to update the PR for the null concept? I did a bit of googling and yeah, you need to explicitly have the data type "null": https://turbo360.com/blog/specifying-json-schema-elements-null-in-logic-apps

@epugh
Member

epugh commented Aug 20, 2024

For a moment I thought, hey, if this isn't in the required list, then that would mean it is null. However, I can imagine that you might have user_id explicitly set to null, versus an absence of the attribute meaning null. So yeah, let's add the null data type, and not add it to the required list.
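
The difference between an absent attribute and an explicit null can be sketched in plain Python. The `check_user_id` helper below is a hypothetical, hand-rolled check that only mimics how JSON Schema treats `type` and `required` for a single property; it is not a real validator and not code from this repository.

```python
from typing import List

# Map the two JSON Schema type names used in this thread to Python types.
JSON_TYPES = {"string": str, "null": type(None)}


def check_user_id(event: dict, allowed_types: List[str],
                  required: bool = False) -> bool:
    """Return True if the event's user_id would pass the sketched rules."""
    if "user_id" not in event:
        # An absent property only fails when it is listed as required.
        return not required
    # A present property must match one of the allowed types.
    value = event["user_id"]
    return any(isinstance(value, JSON_TYPES[name]) for name in allowed_types)


absent = {"client_id": "abc"}
explicit_null = {"client_id": "abc", "user_id": None}
present = {"client_id": "abc", "user_id": "user-42"}

string_only = ["string"]
nullable = ["string", "null"]

absent_ok = check_user_id(absent, string_only)
null_rejected = not check_user_id(explicit_null, string_only)
null_accepted = check_user_id(explicit_null, nullable)
```

Under these rules, leaving `user_id` out passes either way because it is not required, while an explicit null only passes once "null" is among the allowed types.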

@AnkitSiva
Contributor Author

@epugh I'm not sure if I understand what the rationale behind making the user_id nullable is if it's not required? Is it common for said value to be explicitly null?

@epugh
Member

epugh commented Aug 31, 2024

@AnkitSiva I'm thinking that the common pattern for folks using this would be to say "I am using user_id", and then, for all the places where user_id is null, they would set it explicitly to null, versus skipping the attribute...

How about, for now, we just make the change as you have it, and look to dig in more on @miike's suggestion...

@epugh
Member

epugh commented Aug 31, 2024

@dtaivpp @miike if the nullable thing is something we want to pursue, let's create a fresh PR for that. I am somewhat under the gun to get the 1.1 release out the door in August... and it's the 31st of August....

@epugh epugh merged commit 778b4f4 into o19s:main Aug 31, 2024
@epugh epugh added this to the 1.1 milestone Aug 31, 2024
5 participants