RFC/PROPOSAL: add session_id to event.schema #17

Merged 1 commit on Aug 31, 2024

AnkitSiva
Contributor

What/Why

What are you proposing?

We propose that the event schema contain a dedicated session ID field so that consumers of UBI data can track what users (authenticated and anonymous) do during individual visits. client_id does not uniquely identify a user, since two users with the same browser version would map to the same client_id, and the user_id proposed in this PR will not always be available.

What users have asked for this feature?

We have spoken to data analysts who work on analyzing user behavior.

What problems are you trying to solve?

The analysts we spoke to mentioned that session_id is an important field, in addition to client_id and user_id, for understanding shifts in user behavior over time. They also mentioned that they often need to correlate the search interactions tied to a session against the other interactions tied to the same session.
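
As a rough illustration of the correlation the analysts describe, the sketch below groups events by session_id and separates search interactions from everything else. Only session_id comes from this proposal; the `action_name` field, its values, and the helper name `correlate_by_session` are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical events. Only session_id is taken from the proposal;
# "action_name" and its values are illustrative assumptions.
events = [
    {"session_id": "s1", "action_name": "search"},
    {"session_id": "s1", "action_name": "add_to_cart"},
    {"session_id": "s2", "action_name": "search"},
    {"session_id": "s1", "action_name": "search"},
    {"session_id": "s2", "action_name": "page_view"},
]


def correlate_by_session(events):
    """Split each session's events into search and non-search interactions."""
    sessions = defaultdict(lambda: {"search": [], "other": []})
    for event in events:
        kind = "search" if event["action_name"] == "search" else "other"
        sessions[event["session_id"]][kind].append(event["action_name"])
    return dict(sessions)


by_session = correlate_by_session(events)
```

With both groups keyed by the same session_id, an analyst can compare what was searched for with what else happened during that visit.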

Are there any security considerations?

No additional security impact. The existing recommendation was already to track the user ID under the client_id field, and session_id has a similar impact.

Are there any breaking changes to the API?

Yes, the session_id is proposed as a required field.
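
To make the breaking change concrete, here is a minimal sketch of a required-field check. It is not the schema's actual validation logic; `REQUIRED_FIELDS` and `missing_required` are hypothetical names, and only session_id being required is taken from this proposal.

```python
# Assumption: only session_id is listed here because it is the field
# this proposal makes required; the real schema may require others.
REQUIRED_FIELDS = ("session_id",)


def missing_required(event):
    """Return the required fields that are absent from an event."""
    return [field for field in REQUIRED_FIELDS if field not in event]


# An event written before this change has no session_id, so it would
# no longer pass once the field is required.
old_event = {"client_id": "c1"}
new_event = {"client_id": "c1", "session_id": "s1"}

old_missing = missing_required(old_event)
new_missing = missing_required(new_event)
```

Clients that already emit events would need to start sending session_id for their events to remain valid.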

What is the user experience going to be?

Customers can configure and analyze user behavior along an additional axis that is well separated from client_id and user_id.

Are there breaking changes to the User Experience?

No

Why should it be built? Any reason not to?

This will let the separate customer personas (front-end developer and behavior analyst) perform their tasks with less coordination, because it reduces ambiguity about which attributes are tracked in which fields.

What will it take to execute?

  1. Merging this pull request
  2. Documentation and samples updates.

Any remaining open questions?

No.

AnkitSiva changed the title from "feat[event.schema.json]: add session_id" to "RFC/PROPOSAL add session_id" on Aug 7, 2024
AnkitSiva changed the title from "RFC/PROPOSAL add session_id" to "RFC/PROPOSAL: add session_id to event.schema" on Aug 7, 2024
epugh added the RFC Request for Comment label on Aug 15, 2024
@miike
Contributor

miike commented Aug 16, 2024

Looks good. I don't know if you want to consider adding a session index (the number of sessions a user has had) in there as well, or if that's overloading it, as customisation for new users (index=1) is often quite different from customisation for returning users.

@epugh
Member

epugh commented Aug 20, 2024

I suspect the richness of having multiple sessions is worth its own PR, and I think we need to see what the appetite is for that. One related thought: do we need to think about future extensions to the spec for those who are doing more advanced things in certain areas...

@AnkitSiva
Contributor Author

For now, the presence of multiple sessions can be determined post hoc with a query over user ID, session ID and timestamps. I also think that session_index is probably a server-side metric as opposed to a client-side event metric.
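
A minimal sketch of such a post-hoc computation, assuming events carry user_id, session_id and a numeric timestamp (epoch seconds here, which is an assumption). The function name `session_indexes` is hypothetical, and the index is derived by ordering each user's sessions by their earliest event.

```python
def session_indexes(events):
    """Map (user_id, session_id) to a 1-based session index per user."""
    # Earliest timestamp seen for each (user, session) pair.
    first_seen = {}
    for event in events:
        key = (event["user_id"], event["session_id"])
        ts = event["timestamp"]
        if key not in first_seen or ts < first_seen[key]:
            first_seen[key] = ts

    # Number each user's sessions in order of their first event.
    indexes = {}
    per_user_count = {}
    ordered = sorted(first_seen.items(), key=lambda item: (item[0][0], item[1]))
    for (user, session), _ in ordered:
        per_user_count[user] = per_user_count.get(user, 0) + 1
        indexes[(user, session)] = per_user_count[user]
    return indexes


# Hypothetical events; timestamps are epoch seconds (an assumption).
events = [
    {"user_id": "u1", "session_id": "s_a", "timestamp": 100},
    {"user_id": "u1", "session_id": "s_b", "timestamp": 500},
    {"user_id": "u1", "session_id": "s_a", "timestamp": 120},
    {"user_id": "u2", "session_id": "s_c", "timestamp": 50},
]

indexes = session_indexes(events)
```

Because the index is derived from stored events, the client does not need to know how many sessions a user has had.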

@ydrozd

ydrozd commented Aug 26, 2024

In our practical implementation of sessionization, we found that the definition of a session is not necessarily stable across applications. Maintaining certain types of session in a stream of events, on the other hand, is problematic because delayed events can prevent session identification from being resolved correctly. One solution is to postpone session identification to downstream processing stages.
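
One way to picture deferred session identification is a batch step that orders events by event time before assigning sessions, so a late-arriving event still lands in the right place. This is only a sketch of one common approach (an inactivity gap), not the commenter's implementation; the 30-minute gap, the `derived_session` field, and the function name are all assumptions.

```python
SESSION_GAP_SECONDS = 30 * 60  # Assumption: 30 minutes of inactivity ends a session.


def assign_sessions(events, gap=SESSION_GAP_SECONDS):
    """Assign a derived session label to each event, per client, by event time."""
    # Sorting by event time (not arrival order) lets delayed events
    # fall into the session they actually belong to.
    ordered = sorted(events, key=lambda e: (e["client_id"], e["timestamp"]))
    result = []
    last_client, last_ts, counter = None, None, 0
    for event in ordered:
        if event["client_id"] != last_client:
            counter = 1  # first session for a new client
        elif event["timestamp"] - last_ts > gap:
            counter += 1  # inactivity gap exceeded: start a new session
        label = f'{event["client_id"]}-{counter}'
        result.append({**event, "derived_session": label})
        last_client, last_ts = event["client_id"], event["timestamp"]
    return result


# Hypothetical events in arrival order; the third one arrived late.
arrived = [
    {"client_id": "c1", "timestamp": 0},
    {"client_id": "c1", "timestamp": 4000},
    {"client_id": "c1", "timestamp": 600},
]

labels = [e["derived_session"] for e in assign_sessions(arrived)]
```

A streaming assignment would have closed the first session before the delayed event arrived; running this downstream avoids that, at the cost of not having the session label at ingestion time.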

epugh added this to the 1.1 milestone on Aug 31, 2024
@epugh
Member

epugh commented Aug 31, 2024

Session has been a common request... It's a widely used concept in many tracking solutions.

I think that for "simple" use cases it's a powerful tool, and then the richer and more complex you get, the more session becomes a knotty problem to deal with, as suggested by @ydrozd in his comment!

I'm going to merge it for 1.1; however, I look forward to digging into this topic more. It may end up being one of those things where you caveat the heck out of its use?

epugh merged commit d6d27a7 into o19s:main on Aug 31, 2024