[BUG] Some classifications not returned from findEntityXXX requests #6311
Comments
This fix requires the findEntityXXX executors to operate in two phases: the first phase retrieves the entities and the second phase retrieves any stranded home classifications. The first phase visits each repository and retrieves either nothing or a list of entity details. These are assembled in the accumulator, where old or duplicated versions are discarded and the classifications are merged into a common list. To avoid calling a repository that has already returned an entity, the accumulator needs to remember which repositories have returned a specific entity so that they are skipped in the second phase.
In this fix I have also changed the single entity queries to use the same pattern. Their current implementation manages the two phases in the executor, which means the request has to be executed in sequence using a single executor. Moving the accumulation of the entity, classifications and the list of visited repositories to the accumulator means these requests can be executed in parallel using a different executor in each thread.
Note: at this time, the federated query operates in a single thread, but the design/code allows for queries to be issued in parallel threads. The code is waiting for thread pool support to be added to ParallelFederationControl. This parallel operation is needed for large cohorts.
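To make the two-phase idea concrete, here is a minimal sketch of an accumulator that merges results and remembers which repositories returned each entity. It is not the Egeria implementation: `EntitySketch`, `EntityAccumulatorSketch` and all method names are hypothetical, and entities and classifications are reduced to plain strings. The methods are synchronized on the assumption that each repository may eventually be called from its own thread.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical, simplified stand-in for one entity instance returned by a repository. */
final class EntitySketch {
    final String guid;
    final long version;
    final List<String> classifications;

    EntitySketch(String guid, long version, List<String> classifications) {
        this.guid = guid;
        this.version = version;
        this.classifications = classifications;
    }
}

/** Hypothetical accumulator shared by the executors of one federated request. */
final class EntityAccumulatorSketch {
    private final Map<String, Long> latestVersion = new LinkedHashMap<>();
    private final Map<String, Set<String>> classifications = new LinkedHashMap<>();
    private final Map<String, Set<String>> contributors = new LinkedHashMap<>();

    /** Phase one: record what a repository returned (null means it returned nothing). */
    synchronized void addEntities(String repositoryId, List<EntitySketch> entities) {
        if (entities == null) {
            return;
        }
        for (EntitySketch entity : entities) {
            // Keep only the newest version number seen for this entity.
            latestVersion.merge(entity.guid, entity.version, Long::max);
            // Merge classifications into one de-duplicated list per entity.
            classifications.computeIfAbsent(entity.guid, k -> new LinkedHashSet<>())
                           .addAll(entity.classifications);
            // Remember that this repository has already answered for this entity.
            contributors.computeIfAbsent(entity.guid, k -> new HashSet<>())
                        .add(repositoryId);
        }
    }

    /** Phase two planning: entities that this repository did not return in phase one. */
    synchronized List<String> entitiesNotReturnedBy(String repositoryId) {
        List<String> guids = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : contributors.entrySet()) {
            if (!entry.getValue().contains(repositoryId)) {
                guids.add(entry.getKey());
            }
        }
        return guids;
    }

    /** Phase two: add classifications homed in a repository that only holds a proxy. */
    synchronized void addHomeClassifications(String guid, List<String> homeClassifications) {
        if (homeClassifications != null && classifications.containsKey(guid)) {
            classifications.get(guid).addAll(homeClassifications);
        }
    }

    synchronized List<String> getClassifications(String guid) {
        Set<String> merged = classifications.get(guid);
        return merged == null ? new ArrayList<>() : new ArrayList<>(merged);
    }

    synchronized Long getLatestVersion(String guid) {
        return latestVersion.get(guid);
    }
}
```

In this sketch, the second phase would call `entitiesNotReturnedBy(repositoryId)` for each cohort member and ask only for the home classifications of those entities, which keeps the number of repository calls down.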
Signed-off-by: Mandy Chessell <mandy.e.chessell@gmail.com>
@mandy-chessell, I managed to test the changes with the IGC proxy and there is something missing in the first phase. When the FindEntitiesByPropertyExecutor's issueRequestToRepository method is called, the method always returns true. In the case of the glossary term from IGC, the first connector (the local one) retrieves no results. This means that the federation control logic in executeCommand hits the break and stops searching for entities in the next connector (no entities are searched for in IGC, so no entities are returned at all). The same happens when I try to publish the glossary term context through Asset Lineage. This time the executor is GetRelationshipsForEntityExecutor and no relationship is found. I tried to add I am not sure if some other logic should be added or if other executors should have a similar change, but I saw we have a lot more than these 2. Could you please take a look at this issue so I don't mess with the flow? :) Thank you!
issueRequestToRepository should always return false to ensure all repositories are visited. This is required for all find executors.
Signed-off-by: Mandy Chessell <mandy.e.chessell@gmail.com>
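The early exit described above can be illustrated with a small sketch of a sequential federation loop. It assumes, based only on the discussion in this thread, that a `true` return from `issueRequestToRepository` means "the request is complete, stop visiting repositories". `ExecutorSketch` and `SequentialFederationSketch` are hypothetical names, not the real Egeria classes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical executor contract, reduced to the one method discussed in this thread. */
interface ExecutorSketch {
    /** Assumed meaning: true = request complete, no further repositories need to be visited. */
    boolean issueRequestToRepository(String repositoryId);
}

/** Hypothetical sequential federation loop. */
final class SequentialFederationSketch {
    /** Returns the repositories that were actually visited, in order. */
    static List<String> executeCommand(List<String> repositoryIds, ExecutorSketch executor) {
        List<String> visited = new ArrayList<>();
        for (String repositoryId : repositoryIds) {
            visited.add(repositoryId);
            if (executor.issueRequestToRepository(repositoryId)) {
                break; // the executor reported that the request is complete
            }
        }
        return visited;
    }
}
```

With an executor that always returns `true`, only the first (local) repository is visited, even if it found nothing. With one that always returns `false`, every repository in the list is visited, which is what a find request needs.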
Ensure federated query (#6311)
I think this is finished now.
Is there an existing issue for this?
Current Behavior
The federated findEntityXXX repository requests supported by the enterprise repository services do not return an entity's classifications if those classifications are stored in a repository where they are attached to an entity proxy. These methods do return all classifications attached to home entities and reference copies.
The reason for the difference is that the entity proxy cannot be returned by the findEntityXXX methods because they return EntityDetail objects.
The methods that retrieve a single entity (isEntityKnown, getEntitySummary, getEntityDetail) successfully return all classifications because the executors for these requests ask for a specific entity and so receive EntityProxyOnlyException if the entity is known but only a proxy is available. This then allows getHomeClassifications() to be called on that repository to retrieve any locally homed classifications.
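As a rough illustration of the single-entity behaviour, the sketch below treats a proxy-only signal as the trigger to fetch locally homed classifications. It is not the Egeria code: `RepositorySketch`, `ProxyOnlySketchException` and `SingleEntityLookupSketch` are hypothetical stand-ins, the method signatures are simplified, and classifications are plain strings.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the "entity known, but only as a proxy" signal. */
class ProxyOnlySketchException extends Exception {
}

/** Hypothetical, simplified view of one cohort member repository. */
interface RepositorySketch {
    /** Returns the entity's classifications, or null if the entity is unknown here. */
    List<String> getEntityClassifications(String guid) throws ProxyOnlySketchException;

    /** Returns classifications homed in this repository for the given entity. */
    List<String> getHomeClassifications(String guid);
}

final class SingleEntityLookupSketch {
    /** Visits every repository and merges all classifications found for one entity. */
    static List<String> collectClassifications(String guid, List<RepositorySketch> cohort) {
        List<String> merged = new ArrayList<>();
        for (RepositorySketch repository : cohort) {
            try {
                addNew(merged, repository.getEntityClassifications(guid));
            } catch (ProxyOnlySketchException proxyOnly) {
                // Only a proxy is stored here, so ask for locally homed classifications.
                addNew(merged, repository.getHomeClassifications(guid));
            }
        }
        return merged;
    }

    /** Appends classifications that are not already in the merged list. */
    private static void addNew(List<String> merged, List<String> found) {
        if (found == null) {
            return;
        }
        for (String classification : found) {
            if (!merged.contains(classification)) {
                merged.add(classification);
            }
        }
    }
}
```

A find request has no equivalent signal in this picture, because a repository holding only a proxy simply leaves the entity out of its result list.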
Expected Behavior
All classifications for an entity should be returned from the federated query no matter how they are stored.
The federating query mechanism needs to minimise calls to the cohort member repositories and allow the query to execute in parallel.
Steps To Reproduce
The entities need to be stored in a repository that uses the adapter pattern (i.e. runs in a repository proxy), has no event mapper and does not support the Anchors classification. The lack of an event mapper means that events relating to its metadata will not be sent over the cohort topic(s), so no reference copies from this repository will be created in the other cohort members. The lack of support for the Anchors classification means that this classification needs to be stored in another repository and, because there are no reference copies, it will be attached to an entity proxy.
This repository needs to be part of a cohort with a native repository so there is somewhere to store the Anchors classifications.
Then issue a query to find these entities through an OMAS to engage the federated query. Before this fix, the Anchors classification is missing; after this fix, it is returned.
Environment
- Egeria: 3.7 SNAPSHOT or earlier
Any Further Information?
No response