[Enhancement] allow Kafka Producers and configs to be stored in Egeria #5804
Comments
@mandy-chessell @cmgrote @lpalashevski any feedback? |
As discussed on our call, some initial thoughts:
|
There are a number of misunderstandings described above, so this post needs to contradict some of those statements. Firstly, Kafka producers can be represented in the open metadata types. They are some form of SoftwareServerCapability within a software server. For example, it may be an Application or a SoftwareService. Here are some examples: If the Kafka consumer is a specific type of software server capability that is not currently represented then we should add a new subtype. The Topic is an Asset. The relationship to link the Topic to the Kafka producer's SoftwareServerCapability is ServerAssetUse: For a Kafka producer the … The SubscriberList is a DataSet that lists the subscribers as a data/properties file. This is for a system that pushes events to its list of subscribers, and it shows where this information is located. This is not the way that Kafka works, but I would like to keep it along with TopicSubscribers for systems that operate in this way. |
@mandy-chessell thank you for the clarifications - that makes sense. |
I think that a SubscriberList, as a group of subscribers, is not assigned to a Topic. Only a Subscriber subscribes to 0..n Topics, and a Topic is subscribed to by 0..m Subscribers. |
@xgadjhe thank you for your comments on this issue. It looks like I was not correct in talking about SubscriberList, as this is not used by Egeria to represent Kafka subscription. I did not realise that when we talked earlier. Do you think that populating the SoftwareServerCapability as above would be sufficient for you, or do you think it needs augmenting in some way? |
Sorry for the confusion @xgadjhe @davidradl @cmgrote. I did not think deeply enough about the context of David's question when I led him to the 0223 model. Here is some more information. Topics, Processes, DeployedAPIs and DataSets are all types of Assets. We can link them together to form the lineage flow. However this does not describe the agent that is actively working with these assets. The SoftwareServerCapability is the definition of this agent when it is software. (Other agents could be people of course.) The SoftwareServerCapability is linked to the asset using the ServerAssetUse relationship to describe its role in working with the asset. This is expressed in the useType property. The SoftwareServerCapability has an important role in linking the assets to the infrastructure that is supporting them through the SoftwareServerSupportedCapability relationship (0042). |
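To make the shape described above concrete, here is a minimal sketch in plain Python of a software agent linked to several assets through a ServerAssetUse-style relationship that carries a useType. It does not use any Egeria API. The class layout, the `UseType` values, the `assets_used_by` helper and all qualified names are assumptions made for illustration, and the role assigned to each link is arbitrary.

```python
from dataclasses import dataclass
from enum import Enum


class UseType(Enum):
    # Hypothetical useType values, for illustration only.
    OWNS = "owns"
    MAINTAINS = "maintains"
    USES = "uses"


@dataclass(frozen=True)
class Asset:
    type_name: str        # e.g. "Topic", "DeployedAPI", "DataSet"
    qualified_name: str


@dataclass(frozen=True)
class SoftwareServerCapability:
    type_name: str        # e.g. "SoftwareService", "Application"
    qualified_name: str


@dataclass(frozen=True)
class ServerAssetUse:
    # The relationship between the agent and the asset; the role is a property.
    capability: SoftwareServerCapability
    asset: Asset
    use_type: UseType


def assets_used_by(capability, links):
    """Return (asset, use_type) pairs for every link that starts at this capability."""
    return [(link.asset, link.use_type) for link in links if link.capability == capability]


# Hypothetical example data: one service working with two topics and an API.
service = SoftwareServerCapability("SoftwareService", "example-service")
in_topic = Asset("Topic", "example.in-topic")
out_topic = Asset("Topic", "example.out-topic")
api = Asset("DeployedAPI", "example-api")

links = [
    ServerAssetUse(service, in_topic, UseType.USES),
    ServerAssetUse(service, out_topic, UseType.MAINTAINS),
    ServerAssetUse(service, api, UseType.OWNS),
]

used = assets_used_by(service, links)
```

In this sketch a single capability works with topics and an API at the same time, and its role is recorded on each link rather than in the capability's type.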
@mandy-chessell thanks for the explanation. I am wondering whether we need a subclass of SoftwareServerCapability to be more explicit about its purpose, maybe a subscriptionService and a publishedService, or do we think the fact that it is related to the topic with the useType is enough? |
A subclass does not seem to work because software server capabilities are not just Kafka producers or consumers. For example, many of the Egeria registered services are both Kafka producers and consumers, similar to OMRS. We model them as subtypes of SoftwareService. They also support APIs and access data sources. What type should they be? Would we need 2 subtypes for every current subtype of SoftwareServerCapability in case they access an event service? And would we make similar changes in case they have an API? Then what about the combinations? It could be possible to add new relationship types between asset and software server capability to show producers and consumers, but I would not recommend that unless there are specific properties that need to be stored. Even if these relationships were added, I would recommend that the ServerAssetUse relationship is still established, to prevent the need for special-case code in the governance/lineage modules just for event systems. |
@mandy-chessell I agree that SoftwareServerCapability could be the correct place to represent consumers, producers and owners of topics, but I think the model should be more precise when it comes to the relationships. A topic can have many producers and many consumers but only one owner. So we need to have 3 relationships between SoftwareServerCapability and Topic: 2 with a many-to-many cardinality and one (the owning relationship) with a one-to-many cardinality. I think the relationships should not be too generic, or you end up with a model consisting of things having relationships to things. |
@mandy-chessell I see that the Asset Manager OMAS has ServerAssetUseType. I cannot see it being used. I cannot see an OMAS API that would allow the consumers and producers to be added and queried in Egeria. I assume this would be the Asset Manager OMAS, rather than the Data Manager OMAS, which is handling topics and schemas. The AssetManagerElement in the Asset Manager OMAS contains SoftwareServerCapability properties. It looks like we might need to extend this OMAS so that AssetManagerElement contains this relationship content, either by … The 2nd option looks more consumable, as the caller could then query the asset manager element and see the software server capability properties and the consumer and producer information. Would it be reasonable for the data manager to be able to see the consumers and producers associated with the topic? I am not sure whether the persona would ever need this information. Can you confirm this makes sense please @mandy-chessell. |
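The cardinalities proposed above can be sketched as a small check in plain Python: any number of producers and consumers per topic, but at most one owner, while one capability may own many topics. The three role names, the tuple layout, the `topics_with_multiple_owners` helper and the sample data are hypothetical and only illustrate the proposal; they are not existing Egeria relationship types.

```python
from collections import defaultdict

# Hypothetical role names taken from the proposal above; not existing Egeria types.
PRODUCER, CONSUMER, OWNER = "producer", "consumer", "owner"


def topics_with_multiple_owners(links):
    """Return the sorted names of topics that break the single-owner rule.

    Each link is a (capability, role, topic) tuple. Producer and consumer
    links are many-to-many and are never reported.
    """
    owners = defaultdict(set)
    for capability, role, topic in links:
        if role == OWNER:
            owners[topic].add(capability)
    return sorted(topic for topic, caps in owners.items() if len(caps) > 1)


# Hypothetical example data.
links = [
    ("svc-a", PRODUCER, "orders"),
    ("svc-b", PRODUCER, "orders"),    # many producers per topic are allowed
    ("svc-c", CONSUMER, "orders"),
    ("svc-a", CONSUMER, "payments"),  # a producer of one topic may consume another
    ("svc-a", OWNER, "orders"),
    ("svc-a", OWNER, "payments"),     # one capability may own many topics
]

violations = topics_with_multiple_owners(links)
violations_after_second_owner = topics_with_multiple_owners(
    links + [("svc-b", OWNER, "orders")]
)
```

With the sample data no topic is reported; adding a second owner for the same topic is what gets flagged.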
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 20 days if no further activity occurs. Thank you for your contributions. |
Is there an existing issue for this?
Current Behavior
Kafka Producers cannot be stored in Egeria
Expected Behavior
We currently have https://egeria-project.org/open-metadata-publication/website/open-metadata-types/0223-events-and-logs
The SubscriberList would be a list of consumers.
I propose:
Alternatives
no
Any Further Information?
no
Would you be prepared to be assigned this issue to work on?