[Docs] Document how caching can be controlled in the "Writing Custom Flyte Types" page #5838

Open
2 tasks done
fg91 opened this issue Oct 10, 2024 · 1 comment
Labels
backlogged For internal use. Reserved for contributor team workflow. documentation Improvements or additions to documentation good first issue Good for newcomers

Comments

fg91 (Member) commented Oct 10, 2024

Description

The page Writing Custom Flyte Types explains how users can implement a type transformer for custom types.

This page should include a section that explains the available ways to control how instances of custom types are cached.

Currently, users need a deep understanding of the code base to figure this out themselves.

In particular, the following options should be documented and explained, potentially with examples:

  • Literal(hash=)
  • LiteralType(schema=..., metadata=..., structure=...)
  • Others?
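
As an illustration of the first option, here is a minimal sketch of why a hash carried on a literal matters for caching. It is a toy model written with the standard library only, not flytekit code: ToyLiteral, toy_cache_key, to_toy_literal, and the created_at field are hypothetical names, and the rule "a hash, when present, replaces the value in the cache key" is an assumption made for the example.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ToyLiteral:
    """Hypothetical stand-in for a literal: a serialized value plus an optional hash."""

    value: bytes
    hash: Optional[str] = None


def toy_cache_key(literal: ToyLiteral) -> str:
    # Assumption for this sketch: an explicit hash takes precedence over the value.
    if literal.hash is not None:
        return literal.hash
    return hashlib.sha256(literal.value).hexdigest()


def to_toy_literal(obj: dict, ignore=("created_at",)) -> ToyLiteral:
    """Hypothetical transformer step: hash only the fields that should affect caching."""
    stable = {k: v for k, v in obj.items() if k not in ignore}
    digest = hashlib.sha256(json.dumps(stable, sort_keys=True).encode()).hexdigest()
    # The full object is still serialized; only the cache key ignores volatile fields.
    return ToyLiteral(value=json.dumps(obj, sort_keys=True).encode(), hash=digest)
```

In this model, two objects that differ only in created_at serialize to different values but share a cache key, which is the kind of control a transformer author would want to have documented.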

Are you sure this issue hasn't been raised already?

  • Yes

Have you read the Code of Conduct?

  • Yes
@fg91 fg91 added documentation Improvements or additions to documentation untriaged This issues has not yet been looked at by the Maintainers labels Oct 10, 2024
fg91 (Member, Author) commented Oct 11, 2024

class DataclassTransformer(TypeTransformer[object]):
    ...
    def get_literal_type(self, t: Type[T]) -> LiteralType:
        ...
        ts = TypeStructure(tag="", dataclass_type=literal_type)
        return _type_models.LiteralType(simple=_type_models.SimpleType.STRUCT, metadata=schema, structure=ts)

This snippet from the dataclass transformer, for example, suggests that adding metadata=schema to the LiteralType should cause cache misses when the schema changes. However, it turns out that the metadata does not appear to be used to calculate the hash key here:

func generateTaskSignatureHash(ctx context.Context, taskInterface core.TypedInterface) (string, error) {

	…
	if taskInterface.Outputs != nil && len(taskInterface.Outputs.Variables) != 0 {
		taskOutputs = taskInterface.Outputs
	}
	outputHash, err := pbhash.ComputeHash(ctx, taskOutputs)

Instead, it is the structure that is taken into account. Details like this should be documented.
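
To make the observation concrete, here is a toy model of a signature hash that includes structure but ignores metadata. It uses only the Python standard library and does not reproduce generateTaskSignatureHash or pbhash.ComputeHash; ToyLiteralType and toy_signature_hash are hypothetical names, and the choice of hashed fields is an assumption that simply mirrors the behavior described above.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ToyLiteralType:
    """Hypothetical stand-in for a literal type with metadata and structure."""

    simple: str
    metadata: Optional[dict] = None
    structure: Optional[str] = None


def toy_signature_hash(t: ToyLiteralType) -> str:
    # Assumption for this sketch: metadata is left out of the hashed payload,
    # while structure is included.
    payload = {"simple": t.simple, "structure": t.structure}
    return hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()


base = ToyLiteralType("STRUCT", metadata={"fields": ["a"]}, structure="v1")
new_metadata = ToyLiteralType("STRUCT", metadata={"fields": ["a", "b"]}, structure="v1")
new_structure = ToyLiteralType("STRUCT", metadata={"fields": ["a"]}, structure="v2")
```

Under these assumptions, base and new_metadata hash identically (a schema change hidden in metadata would still hit the cache), while new_structure produces a different hash.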

@eapolinario eapolinario added backlogged For internal use. Reserved for contributor team workflow. and removed untriaged This issues has not yet been looked at by the Maintainers labels Oct 17, 2024
@davidmirror-ops davidmirror-ops added the good first issue Good for newcomers label Oct 31, 2024
3 participants