
Switch to catalogd's CatalogMetadata APIs #343

Merged: 1 commit, Aug 25, 2023

Conversation

m1kola
Member

@m1kola m1kola commented Aug 21, 2023

Description

Closes #342

This is a minimal set of changes required to switch from the BundleMetadata API to CatalogMetadata and remove code duplication between the resolution CLI and operator-controller itself.

I expect more refactoring to be done as part of #347, where we will be removing entities/entity sources from operator-controller and switching to VariableSources. E.g. we might want to create a wrapper for the declcfg structs which makes it easier to traverse the catalog (e.g. a package might contain refs to channels, and channels might contain refs to bundles). But this is to be defined separately and is out of scope for this PR.

The main goal of this PR is to make the removal of entities and entity sources easier.

Reviewer Checklist

  • API Go Documentation
  • Tests: Unit Tests (and E2E Tests, if appropriate)
  • Comprehensive Commit Messages
  • Links to related GitHub Issue(s)

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 21, 2023
@codecov

codecov bot commented Aug 21, 2023

Codecov Report

Merging #343 (5c72831) into main (b99cc20) will decrease coverage by 0.48%.
The diff coverage is 74.57%.

❗ Current head 5c72831 differs from pull request most recent head 2d61258. Consider uploading reports for the commit 2d61258 to get more accurate results

@@            Coverage Diff             @@
##             main     #343      +/-   ##
==========================================
- Coverage   82.35%   81.88%   -0.48%     
==========================================
  Files          21       21              
  Lines         907      911       +4     
==========================================
- Hits          747      746       -1     
  Misses        113      113              
- Partials       47       52       +5     
Flag Coverage Δ
e2e 62.12% <74.57%> (-0.39%) ⬇️
unit 78.59% <ø> (+1.37%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Changed Coverage Δ
internal/controllers/operator_controller.go 81.58% <ø> (+0.88%) ⬆️
...nternal/resolution/entitysources/catalogdsource.go 58.42% <74.57%> (-3.01%) ⬇️

@m1kola m1kola force-pushed the catalog_metadata_api branch 4 times, most recently from bbb4f35 to 4a43091 Compare August 22, 2023 12:11
@m1kola m1kola force-pushed the catalog_metadata_api branch 6 times, most recently from 50c9237 to b568602 Compare August 24, 2023 10:33
Member Author

@m1kola m1kola left a comment


Codecov is reporting a decrease in coverage because there are a bunch of new if err != nil { return err } cases. Our existing tests do not cover failure scenarios at the moment.

I'm not adding any new tests for these code branches because all of this will be changed as part of #347.

}
}

func validateCatalogMetadataCreation(g Gomega, operatorCatalog *catalogd.Catalog, pkgName string, versions []string) {
	cmList := catalogd.CatalogMetadataList{}
Member Author


Tests were passing without changes to this function, but since we switched to using CatalogMetadata instead of BundleMetadata in the controller, I think it makes sense to update this piece too.

Also, I switched to using g.Expect here since it makes it a bit easier to track down a specific failure vs this being logged in the calling case.

@m1kola m1kola marked this pull request as ready for review August 24, 2023 12:26
@m1kola m1kola requested a review from a team as a code owner August 24, 2023 12:27
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 24, 2023
internal/resolution/entitysources/catalogdsource.go (outdated review thread, resolved)
	return allEntitiesList, nil
}

func MetadataToEntities(catalogName string, channels []declcfg.Channel, bundles []declcfg.Bundle) (input.EntityList, error) {
Member


As I'm reviewing this function I have some concerns about its performance. How frequently are we retrieving and transforming this data?

If this is an issue, we can address it in a follow-up.

Member Author

@m1kola m1kola Aug 24, 2023


Could you please explain your concerns?

I'm concerned about CatalogdEntitySource since it seems to call getEntities on each Filter call. I decided not to touch it since we will be removing entity sources in #347.

I think as part of #347 we should think about some cache which will be valid for the duration of one resolution.

In the CLI we cache entities in the entity source property:

func (es *indexRefEntitySource) entities(ctx context.Context) (input.EntityList, error) {
	if es.entitiesCache == nil {
		cfg, err := es.renderer.Run(ctx)
		if err != nil {
			return nil, err
		}
		// TODO: update empty catalog name string to be catalog name once we support multiple catalogs in CLI
		entities, err := entitysources.MetadataToEntities("", cfg.Channels, cfg.Bundles)
		if err != nil {
			return nil, err
		}
		es.entitiesCache = entities
	}
	return es.entitiesCache, nil
}

In operator-controller we cannot do the same since the entity source is reused between reconciliations.

Member Author


Had a chat with Alex on this. The concern is not about MetadataToEntities, where we just convert data from declcfg to entities, but about fetching the data from the API server on each reconcile and multiple times during one resolution.

We came to an agreement that this is something which needs to be addressed, but not as part of this PR, since this is a problem with the entity source in general, not with the way we consume catalog metadata. Moreover, entity sources will be removed soon as part of #347.

@awgreene please correct me if I captured it incorrectly.

Member


fetching the data from API server on each reconcile & multiple times during one resolution.

Is that true? I thought we had informer caches set up for the catalogd data, such that cl.Get and cl.List would only ever hit the cache (and never the real API server)?

Member Author


Ah, you might be right! I totally forgot about informers.

Comment on lines +82 to +83
for _, catalog := range catalogList.Items {
	channels, bundles, err := fetchCatalogMetadata(ctx, cl, catalog.Name)
Contributor


Do we need to fetch and loop through all the catalogs here? If we want to ensure we fetch all CatalogMetadata for all Catalog resources, I think we can simplify this to listing all CatalogMetadata resources and then processing each one.

Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My idea is that we process one catalog at a time and let garbage collection do its thing while we move on to the next catalog.

But yes, we definitely can fetch all the data from all catalogs and loop through it, but the data won't be garbage-collectable until after we process the whole slice. So we need to find a balance between 1) how much we want to hold in memory and 2) how many network calls we make.

Notice: I haven't tested/measured memory consumption with multiple catalogs. So it is based on gut feeling, and maybe it is not very beneficial from a memory consumption point of view.

But another benefit of doing one catalog at a time is that it avoids doing more mapping between catalogs, channels and bundles.

@m1kola m1kola requested a review from dtfranz August 24, 2023 15:11
@m1kola m1kola force-pushed the catalog_metadata_api branch 2 times, most recently from ab7400c to 21d3ac5 Compare August 24, 2023 16:02
Signed-off-by: Mikalai Radchuk <mradchuk@redhat.com>
Contributor

@dtfranz dtfranz left a comment


This /lgtm. I've taken note of the concerns and hopefully I can help take care of them during the next efforts on the refactor. Thanks Mikalai!

@m1kola m1kola added this pull request to the merge queue Aug 25, 2023
Merged via the queue into operator-framework:main with commit 2114da6 Aug 25, 2023
9 of 11 checks passed
@m1kola m1kola deleted the catalog_metadata_api branch August 25, 2023 09:36