Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Propose new search attribute APIs #67

Closed
wants to merge 2 commits into from

Conversation

bergundy
Copy link
Member

@bergundy bergundy commented Sep 29, 2022

@bergundy bergundy self-assigned this Sep 29, 2022
In the current version of the SDK, search attributes are always arrays. In order for the server to support visibility
on top of standard persistence, we need to redo the search attributes API to enforce single values for all search
attributes except for keywords. The server will eventually drop support for multiple search attribute values for
non-keywords.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could maintain the array & support for multiple values when backed by ES, while supporting single-value arrays for standard persistence. There are use cases for it. Was it decided that there are not enough use cases to warrant the complexity?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't want different experience based on the backend.
Research shows that multiple values are almost never used.

Comment on lines 20 to 21
export const customerAttribute = searchAttributes.defineAttribute(searchAttributes.Text, 'customer');
export const dateOfPurchaseAttribute = searchAttributes.defineAttribute(Date, 'dateOfPurchase');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might drop define here. You're not really defining it in the verb sense, especially if it already exists, rather you are referring to some existing definition. Perhaps attributeDefinition then?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be fine with just attribute TBH.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kind of like the verb define for consistency with defineQuery and defineSignal.

Comment on lines 51 to 56
query: Q.filter(
// WorkflowType is a pre-defined search attribute
Q.and(searchAttributes.WorkflowType.eq(myWorkflow.name), customerAttribute.eq('customer-id-1234')).orderBy(
dateOfPurchaseAttribute,
'ASC' // Optionally specify DESC | ASC
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks nice but I wouldn't underestimate the complexity of a query builder. This isn't a small amount of work to implement properly, especially if it's type-safe.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I already built most of it while designing this and it's not my first time building one of these.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me it's not about difficulty of building, it's about maintenance and compatibility. How well defined are list filters? If you can give me a grammar that you guarantee won't change in an incompatible way, then I can support a programmatic builder. Even then though, though there's some type safety, you'll end up where SQL query builders always end up where people will escape back into string literals for extreme flexibility.

Also, this API needs to be fledged out some more. Why Q when it's a "list filter" in our docs? Why not a builder pattern? How can I embed a custom query string? Why is "orderBy" on a "query" part of something called a "filter"? (granted that's a Temporal naming problem).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked with the server team and they've confirmed that the entire list of capabilities is documented here: https://docs.temporal.io/concepts/what-is-a-list-filter.

I'm not proposing this as the only way, The list API will accept a structured query and a plain string, we can of course make the query builder accept strings too but I don't think we need that at this point.

The doc page mentions list filters but it actually talks about queries, while the projection part of the query is omitted in Temporal, the rest is a query, it's composed of filters and ordering operators.
filter is not a what you get when you use Q.filter, it's just a factory for creating a new query with a single filter.
Alternatively, we could have a Q() method or constructor instead, I'm open to discussing that.

Essentially Q is a builder.

You can compose multiple operations:

Q.filter(...).filter(...).orderBy(...).orderBy(...)

NOTE that queries are immutable and filter and orderBy return new instances.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this builder solving a need people have?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoiding typos

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, to clarify I mean as in a struggle people have today. They can typo lots of strings, e.g. task queues, I am just wondering if anyone is clamoring for this.

Also, can we get an exact query grammar? (if not in this issue, just in docs somewhere)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The doc page I linked above.

I don't think people are clamoring for this, I just this it's a general improvement and is pretty straightforward to do.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe not "clamoring", but I would have appreciated a query builder for the one spot in the server where I had to construct a query for internal use. Mostly just so I don't have to think about quoting or escaping.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'll clamor to avoid thinking about quotes in quotes 😊

const info = await handle.describe();
const customer = info.getAttribute(customerAttribute); // includes runtime validation - returns string | undefined
// -- OR --
const customer = info.getAttribute<string>(customer); // no runtime validation - returns string | undefined
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this argument supposed to be 'customer' ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, thanks for catching it

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Works for me

Copy link
Member

@Sushisource Sushisource left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall I like this. The query builder is perhaps a bit much, but it's independent of the rest of this. I would just call that out as maybe an additional improvement for later.

// ... random workflow logic ...

// Set the value of a search attribute, all set and unset calls are buffered and flushed out as a single command at
// the end of an activation
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and unset calls are buffered and flushed out as a single command at the end of an activation

I assume last write wins when altering the same attribute before next command?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I thought that was obvious but I can add a comment.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was mostly obvious, no comment needed

`workflows.ts`

```ts
import { searchAttributes } from '@temporalio/common';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does searchAttributes look like? Can I iterate all of them inside the workflow? Can I do things like unset all of the ones with a certain prefix? (i.e. a replace-all)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just a set of functions and constant, it includes things like:

defineAttribute
WorkflowType // built-in attribute

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can I iterate all of them inside the workflow? Can I do things like unset all of the ones with a certain prefix? (i.e. a replace-all)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'd have to define your own set of search attributes, it's up to you how to access those.
We can expose an API to list the set of search attributes, do you think it's valuable?
Since search attributes have to be registered, I don't know if there's a real need for it.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TypeScript doesn't allow object destructuring on imports statements. That means that one would always have to write searchAttributes.defineAttribute(...), instead of simply defineAttribute(...).

I think defineAttribute should be outside of this object, and searchAttributes should contain only build-in search attributes...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like 2

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SearchAttributes is essentially an enum, from user's POV

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like 2 as well.
Not sure why SearchAttributes are an enum..

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think "built in search attributes" should be grouped together. Not formally an enum, but an object on which every field is a builtin search attribute.

Having an export named simply WorkflowType gives the impression that it is a type. Writing it with a lower case makes it only marginally better.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, maybe have a builtins constant with all of the builtin search attributes.

Comment on lines 51 to 56
query: Q.filter(
// WorkflowType is a pre-defined search attribute
Q.and(searchAttributes.WorkflowType.eq(myWorkflow.name), customerAttribute.eq('customer-id-1234')).orderBy(
dateOfPurchaseAttribute,
'ASC' // Optionally specify DESC | ASC
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me it's not about difficulty of building, it's about maintenance and compatibility. How well defined are list filters? If you can give me a grammar that you guarantee won't change in an incompatible way, then I can support a programmatic builder. Even then though, though there's some type safety, you'll end up where SQL query builders always end up where people will escape back into string literals for extreme flexibility.

Also, this API needs to be fledged out some more. Why Q when it's a "list filter" in our docs? Why not a builder pattern? How can I embed a custom query string? Why is "orderBy" on a "query" part of something called a "filter"? (granted that's a Temporal naming problem).

const info = await handle.describe();
const customer = info.getAttribute(customerAttribute); // includes runtime validation - returns string | undefined
// -- OR --
const customer = info.getAttribute<string>(customer); // no runtime validation - returns string | undefined
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should standardize on the word "search attribute" or "attribute".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we should probably just stick to search attribute.

const info = await handle.describe();
const customer = info.getAttribute(customerAttribute); // includes runtime validation - returns string | undefined
// -- OR --
const customer = info.getAttribute<string>(customer); // no runtime validation - returns string | undefined
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume you'll do the same for memo here and for the in-workflow part? (though memo can be any type and is not part of your filter work)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no need to do this for memo, but I do think that the "upsert memo" API in the workflow module will do batching as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Getting a single memo value by key for a specific type makes sense though, right? info.getMemo(myStringMemo) and info.getMemo<string>("my-string-memo")

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm.. right now memos can be any user object so we can't do runtime validation like we do for search attributes.
ATM users can access their memo as: info.memo['my-string-memo'] as string.

Adding the getter doesn't provide that much value IMHO but I'll consider adding it for API consistency with search attributes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We added workflow.memo and workflow.memo_value (and the client handle describe response equivalents) after discussion because there is no simple way to deserialize a memo value without manually invoking a converter.

No need to add it now, but I think making memo and search attributes APIs match as much as possible makes sense (though obviously the latter is much more limited on types).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we had type hints in TS I'd do it, but we currently don't support that.

const info = await handle.describe();
const customer = info.getAttribute(customerAttribute); // includes runtime validation - returns string | undefined
// -- OR --
const customer = info.getAttribute<string>(customer); // no runtime validation - returns string | undefined
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we mark this new API experimental until implemented in another SDK?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we should mark any new APIs experimental for a certain period IMHO

// Other methods will be exposed too

const iterator = client.workflow.list({
query: Q.filter(

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does Q.filter produce a filter query as a string or an object?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

object

Q.and(searchAttributes.WorkflowType.eq(myWorkflow.name), customerAttribute.eq('customer-id-1234')).orderBy(
dateOfPurchaseAttribute,
'ASC' // Optionally specify DESC | ASC
)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can constructing a filter always succeed? or should we have a users call a .build() method at the end to return any errors in the construction?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They always succeed or throw on construction.

Comment on lines 51 to 56
query: Q.filter(
// WorkflowType is a pre-defined search attribute
Q.and(searchAttributes.WorkflowType.eq(myWorkflow.name), customerAttribute.eq('customer-id-1234')).orderBy(
dateOfPurchaseAttribute,
'ASC' // Optionally specify DESC | ASC
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'll clamor to avoid thinking about quotes in quotes 😊

const info = await handle.describe();
const customer = info.getAttribute(customerAttribute); // includes runtime validation - returns string | undefined
// -- OR --
const customer = info.getAttribute<string>(customer); // no runtime validation - returns string | undefined
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const customer = info.getAttribute<string>(customer); // no runtime validation - returns string | undefined
const customer = info.getAttribute<string>(customerAttribute); // no runtime validation - returns string | undefined
const customer = info.getAttribute('customer'); // no runtime validation - returns unknown

@bergundy bergundy force-pushed the ts-search-attributes branch from 1a50952 to 8061d0c Compare September 30, 2022 22:48
Copy link

@mjameswh mjameswh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

await client.operator.registerSearchAttributes(customerAttribute, dateOfPurchaseAttribute);
// Other methods will be exposed too

const iterator = client.workflow.list({
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick question as I start to work on this in Python. So this solves temporalio/features#75 too? I assume this is interceptable? Also, can you show a quick example of status filtering? Is ExecutionStatus a search attribute?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will solve the high level SDK list API, assuming you want to add the query builder (you should IMHO).
I don't see much value in making it intercept-able if there are no headers to pass to the server.
ExecutionStatus along with the rest of the default search attributes will be defined in the SDK.

For search attributes that we know are enums I'd definitely restrict the type (I think it's only ExecutionStatus).

Something like:

Q.filter(builtinSearchAttributes.ExecutionStatus.eq('RUNNING'))

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see much value in making it intercept-able if there are no headers to pass to the server.

While I thought similar, I was overridden on workflow handle "describe". I think we are now forced to be consistent here.

@bergundy
Copy link
Member Author

Closed in favor of #75

@bergundy bergundy closed this Feb 28, 2023
@bergundy bergundy deleted the ts-search-attributes branch February 28, 2023 16:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants