-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New Paths Query #5646
New Paths Query #5646
Conversation
def _determine_should_join_distinct_ids(self) -> None: | ||
self._should_join_distinct_ids = True | ||
|
||
def _determine_should_join_persons(self) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Q: Why override this? The implementation is identical (with entity props not accounted for but these would be blank) to main events query.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
entities don't exist on the PathFilter
class at all. This inheritance / deciding how filters should be created needs a bit more thought - feels too messy right now (and mypy screams a lot too lol)
@@ -14,15 +15,15 @@ class ClickhouseEventQuery(metaclass=ABCMeta): | |||
EVENT_TABLE_ALIAS = "e" | |||
|
|||
_PERSON_PROPERTIES_ALIAS = "person_props" | |||
_filter: Filter | |||
_filter: Union[Filter, PathFilter] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pretty much the wrong way to go about this, judging by how lots of things outside (like ColumnOptimizer, EventQuery, breakdown props, etc.) depend on Filter
existing.
Maybe some makes sense to have PathFilter
inherit from Filter
instead of BaseFilter
. This has problems as well, since mixins specific to PathFilter
can't do anything in ColumnOptimizer / EventQuery
A better way out can be to only have this gigantic Filter
class that has Funnel and Path specific mixins. This way, everything else just checks if the property that matters to itself exists, and does what needs to be done.
Downside is that possible parameters for Filter object for Paths become a bit unclear. Too many, which one to use?
But, I guess that's okay for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need to think on this more. The typing is really weird right now. I actually might need to check again if there's some conflicts/unintended bugs
Another open question: I don't yet see multiple |
Ready for another look. @neilkakkar feel free to change the paths class/file naming |
self.assertNotIn("json", query.lower()) | ||
|
||
self.test_current_url_paths_and_logic() | ||
|
||
def test_paths_start(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is pretty suspect, trying to figure out why we have the off by one error in the new query. Judging from how the data is setup,
{"source": "1_/pricing", "target": "2_/about", "value": 1}
is incorrect, the value should be 2, like in the postgres test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@paolodamico @kpthatsme @marcushyett-ph When we talk about setting a start point for paths, should the start point define the beginning of the session that a person is in or should it include paths after the start point even if it's in the middle of the session? For example, if one user has path A -> B -> C and another has B -> C and out start point is B. Would we count both of them or just the second person because the second person starts at B
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I'm reading your question right–
I think the path we show should start from whichever event the user is saying is their starting point.
For example, I may want to know everything that happens after a user ingests data:
So the path should only show the journey starting with the "user ingested data" event.
In your example case, I think we'd include both of these users - both starting from B onwards
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah makes sense. We wanted to make sure there's no a significant need/possible confusion for the other method where start points are defined as exact starts for a person session. I think the use case you described is much more useful though and it's how our current paths functionality works so we'll stick with that
I updated how start point filtering works: ec3a8bb It's... not the prettiest, and I didn't spend much time thinking about how to extend this, but I feel that can wait for a separate refactor. Right now, aiming to get this merged in, where we've exactly replaced old functionality. (all same old tests pass as expected) Waiting for you to review this bit^, but the rest looks good to me 👍 |
self.assertTrue(response[2].items() >= {"source": "2_/", "target": "3_/about", "value": 1}.items()) | ||
self.assertTrue(response[3].items() >= {"source": "2_/about", "target": "3_/pricing", "value": 1}.items()) | ||
self.assertTrue(response[4].items() >= {"source": "3_/pricing", "target": "4_/help", "value": 1}.items()) | ||
self.assertTrue(response[0].items() == {"source": "1_/pricing", "target": "2_/", "value": 1}.items()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assertEquals? :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, that reminds me, I have no idea why we had the >=
instead of ==
. Will fix!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Postgres version has extra elements, hence >=
and assertTrue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assertDictContainsSubset?
This reverts commit 5eb229f.
Changes
Please describe.
If this affects the frontend, include screenshots.
Checklist