Add a function that discovers and looks up schemas without using the list endpoint of an Iglu server #256

Closed
benjben opened this issue Aug 9, 2024 · 0 comments

Comments

@benjben
Contributor

benjben commented Aug 9, 2024

Currently, our loaders (RDB / Lake / BigQuery v2) call resolver.listSchemasLike (e.g. this line in RDB Loader). Underneath, this calls the "list" API endpoint of the Iglu Server.

Some users host their own Iglu repo without support for this "list" API endpoint, which can be a problem when running the new loaders.

However, it doesn't seem necessary to call "list". If a batch of events uses schema version 1-0-2 then we're going to need to fetch 1-0-0, 1-0-1 and 1-0-2 -- we don't need to list schemas to tell us that.

We can imagine a new function in the Resolver class with a signature something like:

def lookupSchemas[F[_]: SomeTypeClasses](maxSchemaKey: SchemaKey): F[Either[ResolutionError, List[Json]]]

The cases below should give an overview of what the algorithm should look like.

Case 1: schemaKey is for 1-0-2

Use the resolver (lookupSchema) to look up 1-0-0, 1-0-1 and 1-0-2. If any lookup returns an error then return the first Left(resolutionError). If all lookups return success then return them all in a list.

It should never look up 1-0-3 (or greater) even if greater schemas exist. The loader is only interested in schemas up to the max schema key it knows about.

Case 2: schemaKey is for 2-0-2

Same as case 1, but look up 2-0-0, 2-0-1 and 2-0-2. Do not look up 1-*-* schemas.
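
Cases 1 and 2 could be sketched roughly as below. This is only an illustration, not the proposed Resolver code: `Version`, `LookupError`, `NotFound`, `OtherError` and `lookupAdditions` are hypothetical stand-ins for the real Iglu types, and the lookup is passed in as a plain synchronous function instead of running in `F[_]`.

```scala
// Hypothetical stand-ins for the Iglu types (SchemaKey, ResolutionError, Json).
final case class Version(model: Int, revision: Int, addition: Int)
sealed trait LookupError
case object NotFound extends LookupError
final case class OtherError(message: String) extends LookupError

// Looks up every addition from 0 up to max.addition within max's model and
// revision, in order. The first Left stops the walk and is returned as-is.
def lookupAdditions[A](
  max: Version,
  lookup: Version => Either[LookupError, A]
): Either[LookupError, List[A]] = {
  val versions =
    (0 to max.addition).toList.map(a => Version(max.model, max.revision, a))
  versions.foldLeft[Either[LookupError, List[A]]](Right(Nil)) { (acc, v) =>
    // flatMap on a Left skips the function, so no further lookups are made
    acc.flatMap(found => lookup(v).map(found :+ _))
  }
}
```

Because the list of versions is built from the max schema key alone, nothing above 1-0-2 (or 2-0-2) is ever requested, and no other model is touched.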

Case 3: schemaKey is for 1-1-0

This is the most difficult case, because we don't know how many 1-0-* schemas to look up. So, we do this:

  • Look up 1-0-0.
  • Previous step was successful, so next look up 1-0-1.
  • Previous step was successful, so next look up 1-0-2.
  • Previous step returned a NotFound. That's OK. Look up 1-1-0.

The final result should contain a list of schemas 1-0-0, 1-0-1 and 1-1-0. It does not concern us that 1-0-2 did not exist.
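
The "keep going until NotFound" step for a single revision might look like the sketch below. The names (`Version`, `LookupError`, `probeRevision`) are hypothetical, the lookup is a plain function instead of an effect in `F[_]`, and returning any error other than NotFound as a Left is an assumption, since the cases here only describe the NotFound outcome.

```scala
// Hypothetical stand-ins for the Iglu types (SchemaKey, ResolutionError, Json).
final case class Version(model: Int, revision: Int, addition: Int)
sealed trait LookupError
case object NotFound extends LookupError
final case class OtherError(message: String) extends LookupError

// Collects model-revision-0, model-revision-1, ... until a lookup says NotFound.
def probeRevision[A](
  model: Int,
  revision: Int,
  lookup: Version => Either[LookupError, A]
): Either[LookupError, List[A]] = {
  @annotation.tailrec
  def go(addition: Int, acc: List[A]): Either[LookupError, List[A]] =
    lookup(Version(model, revision, addition)) match {
      case Right(schema)  => go(addition + 1, schema :: acc)
      case Left(NotFound) => Right(acc.reverse) // end of this revision, not an error
      case Left(other)    => Left(other)        // assumption: anything else is fatal
    }
  go(0, Nil)
}
```

For case 3, probing revision 0 of model 1 would yield 1-0-0 and 1-0-1, after which 1-1-0 is looked up directly.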

Case 4: schemaKey is for 1-2-3

There is nothing new here; this is just an extension of case 3 above.

  • Look up 1-0-0.
  • Previous step was successful, so next look up 1-0-1.
  • Previous step was successful, so next look up 1-0-2.
  • Previous step returned a NotFound. That's OK. Look up 1-1-0.
  • Previous step was successful, so next look up 1-1-1.
  • Previous step returned a NotFound. That's OK. Look up 1-2-0.
  • Previous step was successful, so next look up 1-2-1.
  • Previous step was successful, so next look up 1-2-2.
  • Previous step was successful, so next look up 1-2-3.
  • Stop, because we reached the target schema key.
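
Putting the four cases together, the whole walk could be sketched as one loop. This is a minimal illustration under stated assumptions, not the implementation: all names (`Version`, `LookupError`, `lookupSchemasUntil`) are hypothetical, the lookup is synchronous instead of running in `F[_]`, and two behaviours the cases do not spell out are assumed: a NotFound inside the target revision is returned as an error, and any non-NotFound error aborts the walk.

```scala
// Hypothetical stand-ins for the Iglu types (SchemaKey, ResolutionError, Json).
final case class Version(model: Int, revision: Int, addition: Int)
sealed trait LookupError
case object NotFound extends LookupError
final case class OtherError(message: String) extends LookupError

def lookupSchemasUntil[A](
  max: Version,
  lookup: Version => Either[LookupError, A]
): Either[LookupError, List[A]] = {
  @annotation.tailrec
  def go(revision: Int, addition: Int, acc: List[A]): Either[LookupError, List[A]] = {
    val current = Version(max.model, revision, addition)
    lookup(current) match {
      // Reached the target schema key: stop, never look beyond it.
      case Right(schema) if current == max => Right((schema :: acc).reverse)
      // Found a schema: try the next addition in the same revision.
      case Right(schema) => go(revision, addition + 1, schema :: acc)
      // NotFound before the target revision: move on to the next revision.
      case Left(NotFound) if revision < max.revision => go(revision + 1, 0, acc)
      // Assumption: NotFound in the target revision, or any other error, fails.
      case Left(error) => Left(error)
    }
  }
  go(0, 0, Nil)
}
```

One design note: the loop only ever uses `max.model`, so schemas of other models are never requested, and the guard on `current == max` is what keeps it from probing 1-2-4 even when that schema exists.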
benjben added a commit that referenced this issue Aug 9, 2024
benjben added a commit that referenced this issue Aug 20, 2024
@spenes spenes closed this as completed in e1de41f Aug 20, 2024