As the usage of enterprise message producers, brokers, and consumers grows, ensuring that data or payloads are compliant with a known schema becomes crucial for several reasons:
- Interoperability: Schemas provide a common structure for data, so that different components and systems can understand and work with it. This is especially important in a heterogeneous environment where various technologies and platforms are in use.
- Data Consistency: Schemas maintain consistency by defining the format and data types of each field. This prevents data errors or misinterpretations that could lead to system failures or incorrect business decisions.
- Validation: Schemas enable data validation. When data is produced, it can be validated against the schema to confirm it meets the required standards, which reduces the likelihood of erroneous data entering the system.
- Version Control: Enterprises evolve and update their data structures over time. Schemas provide a way to version those structures, so that consumers can correctly process different versions of the data.
- Security: Schemas can carry constraints or annotations on sensitive fields, which helps ensure that such data is handled appropriately and that access controls can be applied.
- Documentation: Schemas serve as documentation for the data structures used within the enterprise. This aids in onboarding new team members, understanding data flows, and troubleshooting issues.
To achieve schema compliance, organizations often use technologies and practices such as:
- Schema Definition Languages: Using languages like JSON Schema, Avro Schema, Protocol Buffers, or XML Schema to formally define the structure of data.
- Schema Registry: Implementing a central repository (schema registry) where schema definitions are stored and managed. This allows producers and consumers to reference and validate data against the latest schema versions.
- Data Validation: Implementing data validation processes at the producer and consumer ends to ensure data adheres to the defined schema before it is transmitted or processed.
- Schema Evolution: Establishing procedures for handling schema changes, including versioning and backward compatibility, to ensure smooth transitions when schemas are updated.
- Monitoring and Alerting: Implementing monitoring tools and alerting mechanisms to detect and notify stakeholders of any schema compliance violations.
- Education and Training: Ensuring that teams are well-trained on schema usage and compliance practices to minimize errors and maintain data quality.
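To make the data validation practice concrete, here is a minimal sketch in plain Java. It is not how Avro or the Schema Registry validates data; `SchemaCheckSketch`, its `validate` method, and the field-name-to-type map standing in for a schema are all hypothetical names introduced for illustration. With Avro, the serializer would perform the equivalent check against the real schema.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of any library: checks a record against a
// field-name -> expected-type map before it would be handed to a producer.
class SchemaCheckSketch {

    /** Returns human-readable violations; an empty list means the record conforms. */
    static List<String> validate(Map<String, Class<?>> schema, Map<String, Object> record) {
        List<String> violations = new ArrayList<>();
        // Every field declared in the schema must be present with the right type.
        for (Map.Entry<String, Class<?>> field : schema.entrySet()) {
            Object value = record.get(field.getKey());
            if (value == null) {
                violations.add("missing field: " + field.getKey());
            } else if (!field.getValue().isInstance(value)) {
                violations.add("wrong type for " + field.getKey() + ": expected "
                        + field.getValue().getSimpleName() + ", got "
                        + value.getClass().getSimpleName());
            }
        }
        // Fields the schema does not declare are rejected as well.
        for (String name : record.keySet()) {
            if (!schema.containsKey(name)) {
                violations.add("unknown field: " + name);
            }
        }
        return violations;
    }

    public static void main(String[] args) {
        Map<String, Class<?>> schema = new LinkedHashMap<>();
        schema.put("id", Integer.class);
        schema.put("firstName", String.class);

        Map<String, Object> good = new LinkedHashMap<>();
        good.put("id", 1);
        good.put("firstName", "Ada");

        Map<String, Object> bad = new LinkedHashMap<>();
        bad.put("id", "1");          // wrong type: String instead of Integer
        bad.put("nickname", "ada");  // not declared in the schema

        System.out.println(validate(schema, good));
        System.out.println(validate(schema, bad));
    }
}
```

Rejecting the record at the producer, before it is sent, keeps malformed payloads out of the topic so that consumers do not each need their own defensive handling.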
By focusing on schema compliance, enterprises can maintain data quality, reduce errors, improve interoperability, and make their systems more robust and reliable as they scale and evolve.
🔹 In this Application we have used
- Download and install Docker Desktop
- You can check the version of Docker you have installed:
- Starting Confluent Platform on Docker:
Download the docker-compose.yml file and run the docker-compose command with the -d option to run in detached mode:
docker-compose up -d
You should see all the containers come up as shown below:
- Create Kafka topics
Navigate to Control Center at http://localhost:9021. It may take a minute or two for Control Center to start and load. Click on the cluster.
In the navigation menu, click Topics to open the topics list, then click the Add topic button.
In the Topic name field, enter the topic name and click Create with defaults. Topic names are case-sensitive.
Create the retry topic
Create the DLT (dead-letter topic)
You should see all the new topics in the Topics list.
- Verify registered schema types:
http://localhost:8081/schemas/types
Response:
[ "JSON", "PROTOBUF", "AVRO" ]
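The same check can be scripted instead of opened in a browser. The sketch below uses only the JDK's `java.net.http` client against the URL shown above; the class name `SchemaTypesCheck` and the `parseTypes` helper are hypothetical, and the regex-based parsing is a shortcut that assumes the response is a flat array of plain type names, as in the response shown.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SchemaTypesCheck {

    // Pulls the quoted strings out of a flat JSON array such as ["JSON","AVRO"].
    // A regex is enough here only because the values are plain names with no escapes.
    static List<String> parseTypes(String body) {
        List<String> types = new ArrayList<>();
        Matcher m = Pattern.compile("\"([^\"]*)\"").matcher(body);
        while (m.find()) {
            types.add(m.group(1));
        }
        return types;
    }

    public static void main(String[] args) throws Exception {
        // Assumes Schema Registry is reachable on localhost:8081, as in this setup.
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest
                .newBuilder(URI.create("http://localhost:8081/schemas/types"))
                .GET()
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());

        List<String> types = parseTypes(response.body());
        System.out.println("Registered schema types: " + types);
        if (!types.contains("AVRO")) {
            throw new IllegalStateException("Schema Registry does not report AVRO support");
        }
    }
}
```

In a real project a JSON library would be the safer way to parse the response; the regex keeps this sketch dependency-free.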
- Created Avro schema: student.avsc
The davidmc24 Gradle Avro plugin generates the Student POJO in the org.poc.kafka.avro.model package, which is defined in the schema. This POJO has id, firstName, lastName, and contact properties.
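To give a rough picture of the properties the generated class exposes, here is a hand-written stand-in. It is not the generated code: the real Student class is produced from student.avsc and extends Avro's `SpecificRecordBase`. The name `StudentShape` and all four field types are assumptions, since only the property names are stated here.

```java
// Hand-written stand-in for illustration only. The real Student class is
// generated from student.avsc by the Gradle Avro plugin.
// Field types are assumptions: only the four property names are given.
final class StudentShape {
    private final int id;            // assumed int
    private final String firstName;  // assumed string
    private final String lastName;   // assumed string
    private final String contact;    // assumed string; may differ in the real schema

    StudentShape(int id, String firstName, String lastName, String contact) {
        this.id = id;
        this.firstName = firstName;
        this.lastName = lastName;
        this.contact = contact;
    }

    int getId() { return id; }
    String getFirstName() { return firstName; }
    String getLastName() { return lastName; }
    String getContact() { return contact; }

    @Override
    public String toString() {
        return "Student{id=" + id + ", firstName=" + firstName
                + ", lastName=" + lastName + ", contact=" + contact + "}";
    }

    public static void main(String[] args) {
        // Sample values are made up for the demonstration.
        System.out.println(new StudentShape(1, "Ada", "Lovelace", "ada@example.com"));
    }
}
```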
- Run the springboot-kafka-avro-producer service
- Open Swagger UI
- Run the springboot-kafka-avro-consumer service
- Execute the Students API
- springboot-kafka-avro-producer console log
Students API response:
- springboot-kafka-avro-consumer console log
- You can check the message in Control Center by selecting Date Time and Partition
Example message in the retry topic:
- You can verify the created Avro schema (student.avsc) in the cluster
These additional references should also help you: