-
Notifications
You must be signed in to change notification settings - Fork 29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
🔍 Starting contacts into flow via ElasticSearch #144
Conversation
Codecov Report
@@ Coverage Diff @@
## master #144 +/- ##
==========================================
+ Coverage 42.98% 44.09% +1.11%
==========================================
Files 68 70 +2
Lines 6449 6715 +266
==========================================
+ Hits 2772 2961 +189
- Misses 3181 3248 +67
- Partials 496 506 +10
Continue to review full report at Codecov.
|
@dodobas would appreciate a close review of this! |
elastic/elastic.go
Outdated
name += "_keyword" | ||
|
||
if c.Comparator() == "=" { | ||
query = q.NewTermQuery(name, value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think we are missing fieldQuery
for =
and !=
operators, json test cases should also be updated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I am including the field but more importantly I wasn't building a nested query, updating.
@dodobas updated, can you take another pass? |
ivr/ivr_test.go
Outdated
@@ -32,10 +32,10 @@ func TestIVR(t *testing.T) { | |||
db.MustExec(`UPDATE channels_channel SET channel_type = 'ZZ', config = '{"max_concurrent_events": 1}' WHERE id = $1`, models.TwilioChannelID) | |||
|
|||
// create a flow start for cathy | |||
start := models.NewFlowStart(models.Org1, models.IVRFlow, models.IVRFlowID, nil, []models.ContactID{models.CathyID}, nil, false, true, true, nil, nil) | |||
start := models.NewFlowStart(models.Org1, models.IVRFlow, models.IVRFlowID, nil, []models.ContactID{models.CathyID}, nil, "", false, true, true, nil, nil) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This constructor is getting a bit out of hand, going to tweak it to do With()
for recipients.
WithContactIDs(contactIDs). | ||
WithURNs(event.URNs). | ||
WithCreateContact(event.CreateContact). | ||
WithParentSummary(event.RunSummary) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
golang doesn't support using ()
around a block like this the way Python does. Having a trailing .
is the only way to get this not reformatted that I could figure out. Slightly weird but I bet we get used to it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
been doing this in goflow for a while with its builder classes, except with NewFlowStart(..)
on the same line as models.
and only the With
lines indented... which seems marginally less weird, maybe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ya, I started that way too but then that first list of required args is kinda long and seemed more readable this way to me. 🤷♂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO looks a bit weird to split a package and symbol. What if DoRestartParticipants
was also WithRestartParticipants(bool)
and then you also don't have to add new bool types?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Went back and updated those.
type IncludeActive bool | ||
|
||
const DoIncludeActive = IncludeActive(true) | ||
const DontIncludeActive = IncludeActive(false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Death to bool parameters!
So random thoughts on allowing starts via query.. One thing we have heard is that it would be nice to be able to start a flow for a group of contacts and exclude some contacts, specifically contacts which have a WhatsApp URN. (or maybe more precisely those that would be sent to using a WhatsApp URN, but lets ignore that specificity for this discussion)
I think that's right, but it does not allow for the scenario above. Two ways we could accomplish that:
Option 2 above is appealing from a number of code-paths perspective (though honestly the code path is not very complicated on the mailroom side either way). But honestly we would probably want to continue to include Option 2 also doubly makes us depend on our indexes being not too stale and actually correct. This PR is already making that more the case and honestly Elastic seems to have treated us pretty well in that respect, but adding it as a dependency for every single start feels kinda scary. Thoughts? @rowanseymour @ericnewcomer @norkans7 ? |
I don't think I hate a query being the canonical form for flow starts. In a way it might actually be clearer to the user when we ultimately surface flow starts. ie, that the selection of contacts was a point-in-time that matched a query instead of it just being a group list. I mean, it is the same thing, but in a way I can see that feeling like a more honest visualization of what happened. So, in the whatsapp case, is the thinking they would create a query that is |
Ya, we'd have to come up what our syntax is like for group queries and that could be using a group name: Seems mostly clear.. But ya, the case above for whatsapp would then be something like:
|
Then of course they change their group name, doh. Maybe that's okay for that link to be tenuous in the flow start case. Assuming we wouldn't allow in-group querying for general contact search (or at least disable dynamic group creation from those queries). |
Ya, maybe we rewrite to UUID and store that version in our own queries? |
👿 devils advocate: you don't need to query on groups.. every group should be dynamic anyway.. backed by a query on fields.. everything is fields |
I don't necessarily hate that, believe Vumi did that. It doesn't feel super friendly for users though, so feel it may require us hiding this from them in the simplest case. But ya, then suddenly you get to do boolean operations on all your groups which is powerful. |
models/assets.go
Outdated
} | ||
|
||
// is this an implicit field? | ||
if key == contactql.ImplicitKey { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't actually allow implicit conditions (Bob
instead of name = Bob
) in dynamic group queries - you can search like that but you can't save that raw query as a group. Thus goflow's evaluation of groups errors if there's an implicit condition. Reason for that is that the RP UI should rewrite any implicit conditions anyway - so searching for Bob
gets rewritten as the explicit condition name ~ "Bob"
. Seems like it would be reasonable to do same with these kinds of queries and then you don't have to worry about implementing implicit queries.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, that's true for dynamic queries but didn't see any reason why this part of the library shouldn't support it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why wouldn't we rewrite these queries same as we do for contact search?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also to be clear, it's not just implicit conditions we rewrite away, it's also implicit ANDs, e.g. Bob 078245353
becomes name ~ "Bob" AND tel ~ 078245353
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I think we will totally pass in preprocessed ones. But this module only concerns itself with taking a contactql query and turning that into something elastic supports. Implicit queries are a thing in contactql, so seemed correct to support them.
Yes it is a superset of what we will absolutely need, but it is implemented and tested so don't see the plus in ripping that out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tho actually looking at the Python contactql parser, it always switches out an implicit condition with an explicit id, name or tel ones in the visitor, so if we want to be consistent, goflow's visitor should also do that, and then there is no contactql.ImplicitKey
.. this code can't happen.
The only reason goflow's implementation doesn't already do that is we've been assuming it would never encounter an implicit condition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the question is what is ContactQL? Does ContactQL support implicit queries? If so, then I would posit everything that takes contactql should support implicit queries. If not, then that should be an error when parsing contactql and it is the job of RapidPro to turn implicit looking things into valid ContactQL.
I'm fine with either, though to me it seems useful to have implicit queries be part of ContactQL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implicit conditions are certainly a part of contactql - but our Python implementation swaps them out with explicit ones at the visitor level so the evaluator has no concept of an implicit condition. No reason for goflow to not do likwise.
Also noticed that goflow's parser is wrong for implicit keys - it sets the operator to =
instead of ~
.. it's never been used 🤷♂
Have made nyaruka/goflow#742 which I think replicates the behavior of the Python parser
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent, so then every place that takes ContactQL can take any ContactQL, which I think is the right thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok just made goflow v0.44.3 which should in theory match the python parser!
search/search.go
Outdated
|
||
if field.Key == NameTel { | ||
if number != 0 { | ||
return elastic.NewNestedQuery("urns", elastic.NewMatchPhraseQuery("urns.path", number)), nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see comment above re implicit conditions, but doesn't this need to be restricted to tel urns?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, yes indeed, will update.
search/search.go
Outdated
var query elastic.Query | ||
|
||
if field.Category == Implicit { | ||
number, _ := strconv.Atoi(c.Value()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this match the logic of https://github.com/nyaruka/rapidpro/blob/5814c169e031656c1f65e97348e382d90c496c54/temba/contacts/search/parser.py#L1053 ?
i.e. match a phonenumber regex, then do a bit of normalization?
// FieldRegistry provides an interface for looking up queryable fields | ||
type FieldRegistry interface { | ||
LookupSearchField(key string) Field | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this and the interface above are only needed because assets.Field
doesn't have a UUID parameter. (and we don't have a FieldUUID
in goflow either)
How do you feel about adding that? Would make this all fall away to just one field resolver func.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Took me a moment to remember what field UUID is used for.. but yeah we could add it to assets.Field
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
models/fields.go
Outdated
@@ -15,7 +15,7 @@ import ( | |||
type FieldID int | |||
|
|||
// FieldUUID is our type for the UUID of a field | |||
type FieldUUID uuids.UUID | |||
type FieldUUID = uuids.UUID |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bit of a hack to get around the circular dependency between models
and search
. If we had a FieldUUID
in assets then wouldn't need this.
(spent a few hours trying to refactor all ids and uuids in models
to an ids
package but it just got out of control, when do we get decent golang refactor tools again, fuck modules)
} | ||
|
||
fieldQuery := elastic.NewTermQuery("fields.field", field.UUID()) | ||
fieldType := field.Type() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so no need to look up anything to do with fields if we add UUID right? FieldType is also on contactql.Condition
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, ya, then we'd have everything we need to build these queries in assets.Field
@rowanseymour updated, oh god please no more changes. :) |
key := c.PropertyKey() | ||
|
||
if c.PropertyType() == contactql.PropertyTypeField { | ||
field := resolver(key) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this needed? haven't we already verified that the fields exist during parsing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh sorry I get it now - there's still no field UUID on the condition
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't say I'm familiar with the ES API to know if that's all right, but rest looks good.
Any way to verify if this works the same as the ES stuff in RP ?
I've run these queries against Elastic and it is pretty much a translation of the Rapid queries so I think they should be good, won't know for sure until we try! |
Optin message fixes
Allows starting a set of contacts in a flow, our start job just needs a
query
parameter.Will require ansible updates to update mailroom env to know about Elastic.