Improve README for Kafka example (#1885)
* Improve Readme.md for Kafka

* Address review comments

---------

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
akankshakumari393 and mergify[bot] committed Feb 10, 2023
1 parent 0fe21c8 commit c084a0a
examples/kafka/adobe-s3-connector/README.md (57 additions, 19 deletions)
```bash
$ helm install kafka-release strimzi/strimzi-kafka-operator --namespace kafka-test

```
## Setup Kafka

```bash
# Provision Apache Kafka and ZooKeeper.
$ kubectl create -f ./kafka-cluster.yaml -n kafka-test
# Port-forward the Kafdrop UI (assumes the Kafdrop pod from this example is running)
$ kubectl port-forward kafdrop 7000:9000 -n kafka-test
```
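Before moving on, it can help to confirm the cluster pods are ready; pod names follow the Strimzi cluster name `my-cluster` used throughout this example:

```bash
# The Strimzi operator creates the broker and ZooKeeper pods for `my-cluster`
$ kubectl get pods -n kafka-test
```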

## Validate producer and consumer

Create a producer and a consumer using the Kafka image provided by Strimzi.

```bash
# create a producer and push data to it
$ kubectl -n kafka-test run kafka-producer -ti --image=strimzi/kafka:0.20.0-kafka-2.6.0 --rm=true --restart=Never -- bin/kafka-console-producer.sh --broker-list my-cluster-kafka-bootstrap:9092 --topic blogpost
These additional configs apply to kafka-connect:
| Property | Description |
| :--- | :--- |
| plugin.path | Connector jar location |

## Setup Blueprint, ConfigMap and S3 Location profile

Before setting up the Blueprint, create a Kanister Profile with the S3 details and a ConfigMap with the configuration details. `timeinSeconds` denotes the time after which the sink connector stops running.
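For orientation, the sink properties file wires the Adobe S3 connector class to a topic and a bucket. The heredoc below is only a rough sketch of what such a file can look like; the class name, keys, and values are assumptions for illustration, not the contents of the example's real `adobe-s3-sink.properties`:

```bash
# Illustrative sketch only -- this example ships its own properties files
$ cat <<'EOF' > adobe-s3-sink.properties
name=s3-sink
# Connector class from the Adobe (formerly Spredfast) kafka-connect-s3 project (assumed)
connector.class=com.spredfast.kafka.connect.s3.sink.S3SinkConnector
tasks.max=1
topics=blogs
# Placeholder bucket; the real value comes from your S3 setup
s3.bucket=<aws-bucket-name>
local.buffer.dir=/tmp/connect-s3
EOF
```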

```bash
# Create ConfigMap with the properties file, S3 properties and kafkaConfiguration.properties
$ kubectl create configmap s3config --from-file=adobe-s3-sink.properties=./adobe-s3-sink.properties --from-file=adobe-kafkaConfiguration.properties=./adobe-kafkaConfiguration.properties --from-file=adobe-s3-source.properties=./adobe-s3-source.properties --from-literal=timeinSeconds=1800 -n kafka-test
$ kanctl create profile s3compliant --access-key <aws-access-key> \
        --secret-key <aws-secret-key> \
        --bucket <aws-bucket-name> --region <aws-region-name> \
        --namespace kafka-test
$ kubectl create -f ./kafka-blueprint.yaml -n kanister
```
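At this point the ConfigMap, Profile, and Blueprint should all exist; a quick check (the Profile name is generated and will differ in your cluster):

```bash
# Confirm the resources created above
$ kubectl get configmap s3config -n kafka-test
$ kubectl get profiles.cr.kanister.io -n kafka-test
$ kubectl get blueprints.cr.kanister.io -n kanister
```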

## Insert Data in Topic

* Create a topic `blogs` on the Kafka server. The `blogs` topic is configured as both the source and the sink topic in the `s3config` ConfigMap.

* List all topics on the Kafka server
```bash
$ kubectl -n kafka-test run kafka-producer -ti --image=strimzi/kafka:0.20.0-kafka-2.6.0 --rm=true --restart=Never -- bin/kafka-topics.sh --bootstrap-server=my-cluster-kafka-bootstrap:9092 --list
```
* Create the topic on the Kafka server
```bash
$ kubectl -n kafka-test run kafka-producer -ti --image=strimzi/kafka:0.20.0-kafka-2.6.0 --rm=true --restart=Never -- bin/kafka-topics.sh --create --topic blogs --bootstrap-server my-cluster-kafka-bootstrap:9092
```

* Create a producer to push data to `blogs` topic

```bash
$ kubectl -n kafka-test run kafka-producer -ti --image=strimzi/kafka:0.20.0-kafka-2.6.0 --rm=true --restart=Never -- bin/kafka-console-producer.sh --broker-list my-cluster-kafka-bootstrap:9092 --topic blogs

>{"title":"The Matrix","year":1999,"cast":["Keanu Reeves","Laurence Fishburne","Carrie-Anne Moss","Hugo Weaving","Joe Pantoliano"],"genres":["Science Fiction"]}
>{"title":"ABCD3","year":2000,"cast":["Keanu Reeves","Laurence Fishburne","Carrie-Anne Moss","Hugo Weaving","Joe Pantoliano"],"genres":["Science Fiction"]}
>{"title":"Student of the year","year":2001,"cast":["Keanu Reeves","Laurence Fishburne","Carrie-Anne Moss","Hugo Weaving","Joe Pantoliano"],"genres":["Science Fiction"]}
>{"title":"ABCD","year":2002,"cast":["Keanu Reeves","Laurence Fishburne","Carrie-Anne Moss","Hugo Weaving","Joe Pantoliano"],"genres":["Science Fiction"]}
```
* Check the S3 bucket for the topic data once the backup below has run (see the sketch that follows)
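Once the backup ActionSet in the next section has completed, the bucket can be inspected with the AWS CLI; the bucket name and prefix below are placeholders for the values configured in the Profile and properties files:

```bash
# List the objects the sink connector wrote for the topic
$ aws s3 ls s3://<aws-bucket-name>/<path-prefix>/ --recursive
```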

## Perform Backup

To perform a backup to S3, create an ActionSet that runs `kafka-connect`.

```bash
# Create an actionset
$ kanctl create actionset --action backup --namespace kanister --blueprint kafka-blueprint --profile kafka-test/s3-profile-fn64h --objects v1/configmaps/kafka-test/s3config
```
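The ActionSet gets a generated name; the later steps in this README use `backup-rslmb` as that name. Its progress can be followed with standard kubectl commands:

```bash
# Watch the backup ActionSet until its STATE reaches complete
$ kubectl --namespace kanister get actionsets.cr.kanister.io
$ kubectl --namespace kanister describe actionset backup-rslmb
```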

### Disaster strikes!

Let's say someone accidentally removed the events from the `blogs` topic in the Kafka cluster:

```bash
# No events from `blogs` topic.
$ kubectl -n kafka-test run kafka-consumer -ti --image=strimzi/kafka:0.20.0-kafka-2.6.0 --rm=true --restart=Never -- bin/kafka-console-consumer.sh --bootstrap-server my-cluster-kafka-bootstrap:9092 --topic blogs --from-beginning

```
## Perform Restore

To restore, create a restore ActionSet. A pre-hook restore operation first purges all events from the topics in the Kafka cluster that were previously backed up.

```bash
$ kanctl create actionset --action restore --from "backup-rslmb" --namespace kanister --blueprint kafka-blueprint --profile kafka-test/s3-profile-fn64h --objects v1/configmaps/kafka-test/s3config

```
* Before running the pre-hook operation, confirm that no other consumer is consuming data from that topic.
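The `--from` value is the name of the backup ActionSet created earlier. If it was not noted down, it can be recovered, and the restore's own progress followed, with the commands below (the restore ActionSet name is a placeholder):

```bash
# Find the backup ActionSet name to pass via --from
$ kubectl --namespace kanister get actionsets.cr.kanister.io
# Follow the restore ActionSet's phases
$ kubectl --namespace kanister describe actionset <restore-actionset-name>
```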

## Verify restore

Create a consumer for the `blogs` topic:

```bash
# Create a consumer in a different terminal
$ kubectl -n kafka-test run kafka-consumer -ti --image=strimzi/kafka:0.20.0-kafka-2.6.0 --rm=true --restart=Never -- bin/kafka-console-consumer.sh --bootstrap-server my-cluster-kafka-bootstrap:9092 --topic blogs --from-beginning

>{"title":"The Matrix","year":1999,"cast":["Keanu Reeves","Laurence Fishburne","Carrie-Anne Moss","Hugo Weaving","Joe Pantoliano"],"genres":["Science Fiction"]}
>{"title":"ABCD3","year":2000,"cast":["Keanu Reeves","Laurence Fishburne","Carrie-Anne Moss","Hugo Weaving","Joe Pantoliano"],"genres":["Science Fiction"]}
>{"title":"Student of the year","year":2001,"cast":["Keanu Reeves","Laurence Fishburne","Carrie-Anne Moss","Hugo Weaving","Joe Pantoliano"],"genres":["Science Fiction"]}
>{"title":"ABCD","year":2002,"cast":["Keanu Reeves","Laurence Fishburne","Carrie-Anne Moss","Hugo Weaving","Joe Pantoliano"],"genres":["Science Fiction"]}
```
All the restored messages can be viewed.
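As a further check that avoids streaming every record, the topic's end offsets can be printed with Kafka's GetOffsetShell, reusing the Strimzi image as in the commands above; with the four test events, the partition offsets should sum to 4 (output format `topic:partition:offset`):

```bash
# Print per-partition end offsets for the restored topic
$ kubectl -n kafka-test run kafka-offsets -ti --image=strimzi/kafka:0.20.0-kafka-2.6.0 --rm=true --restart=Never -- bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list my-cluster-kafka-bootstrap:9092 --topic blogs --time -1
```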

## Delete the Artifacts

The artifacts created by the backup action can be cleaned up using the following command:

```bash
$ kanctl --namespace kanister create actionset --action delete --from backup-rslmb --namespacetargets kanister
actionset delete-backup-rslmb-cq6bw created

# View the status of the ActionSet
$ kubectl --namespace kanister get actionsets.cr.kanister.io delete-backup-rslmb-cq6bw
NAME PROGRESS LAST TRANSITION TIME STATE
delete-backup-rslmb-cq6bw 100.00 2022-12-15T10:05:38Z complete
```
## Delete Blueprint and Profile CR

```bash
# Delete the blueprint
$ kubectl delete blueprints.cr.kanister.io kafka-blueprint -n kanister

# Get the profile name
$ kubectl get profiles.cr.kanister.io -n kafka-test
NAME               AGE
s3-profile-fn64h   2h

# Delete the profile
$ kubectl delete profiles.cr.kanister.io s3-profile-fn64h -n kafka-test
```

## Troubleshooting

The following debug commands can be used to troubleshoot issues during the backup and restore processes:

Check Kanister controller logs:

```bash
$ kubectl --namespace kanister logs -l run=kanister-svc -f
```
Check events of the ActionSet:

```bash
$ kubectl describe actionset <actionset-name> -n kanister
```
Check the logs of the Kanister job:

```bash
# Get the Kanister job pod name
$ kubectl get pod -n kafka-test
```
