docs: fix readme for anonymization (#559)
* docs: fixed markdown for Anonymization

* docs: added details for events which are not being masked
* docs: removed timeframe, added issue link for event anonymization
* docs: changed title to Further details
* docs: fixed broken markdown for config management and remote caching section
* docs: added "note" for events before further details section

---------

Signed-off-by: Jatin Mehrotra <jatin.mehrotra@classmethod.jp>
Co-authored-by: Jatin Mehrotra <jatin.mehrotra@classmethod.jp>
Co-authored-by: Alex Jones <alexsimonjones@gmail.com>
3 people committed Jul 19, 2023
1 parent 781ecb7 commit 70bec05
Showing 1 changed file, README.md, with 51 additions and 0 deletions.
With this option, the data is anonymized before being sent to the AI Backend.


<details>
<summary> Anonymization </summary>

1. Error reported during analysis:
```bash
Error: HorizontalPodAutoscaler uses StatefulSet/fake-deployment as ScaleTargetRef which does not exist.
```

2. Analysis response from the AI backend, with the resource name masked:
```bash
The Kubernetes system is trying to scale a StatefulSet named tGLcCRcHa1Ce5Rs using the HorizontalPodAutoscaler, but it cannot find the StatefulSet. The solution is to verify that the StatefulSet name is spelled correctly and exists in the same namespace as the HorizontalPodAutoscaler.
```

3. Response shown to the user, with the original name restored:
```bash
The Kubernetes system is trying to scale a StatefulSet named fake-deployment using the HorizontalPodAutoscaler, but it cannot find the StatefulSet. The solution is to verify that the StatefulSet name is spelled correctly and exists in the same namespace as the HorizontalPodAutoscaler.
```

Note: **Anonymization does not currently apply to events.**

### Further Details

*In a few analyzers, such as Pod, the event messages fed to the AI backend are not known beforehand, so they are not masked for the **time being**.*

- The following analyzers currently have their data **masked**:

  - StatefulSet
  - Service
  - PodDisruptionBudget
  - Node
  - NetworkPolicy
  - Ingress
  - HPA
  - Deployment
  - CronJob

- The following analyzers do **not** currently have their data masked:

  - ReplicaSet
  - PersistentVolumeClaim
  - Pod
  - **Events**

**Note:**

- k8sgpt does not mask the analyzers above because, with the exception of the **Events** analyzer, they do not send any identifying information.
- Masking for the **Events** analyzer is scheduled for the near future, as tracked in this [issue](https://github.com/k8sgpt-ai/k8sgpt/issues/560). _Further research is needed to understand event patterns so that sensitive parts, such as the pod name and namespace, can be masked._

- The following fields are **not** masked:

  - Describe
  - ObjectStatus
  - Replicas
  - ContainerStatus
  - **Event Message**
  - ReplicaStatus
  - Count (Pod)

**Note:**

- It is quite possible that the payload of an event message contains something like "super-secret-project-pod-X crashed", which is not currently redacted _(scheduled for the near future, as tracked in this [issue](https://github.com/k8sgpt-ai/k8sgpt/issues/560))_.
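To make the difficulty concrete, here is a hypothetical pattern-based redaction sketch in Go. It is not something k8sgpt does today; it assumes pod names end in a hyphenated hash-like suffix, and that assumption is exactly what the linked issue says needs further research (a pod named simply `db` would slip through).

```go
package main

import (
	"fmt"
	"regexp"
)

// podNamePattern guesses that a pod name looks like <name>-<5-char suffix>,
// e.g. "my-app-7d9f4". Real cluster naming is far less uniform, which is
// why event masking needs more research before it can be done reliably.
var podNamePattern = regexp.MustCompile(`\b[a-z0-9][a-z0-9-]*-[a-z0-9]{5}\b`)

// redactPodNames replaces anything matching the guessed pod-name shape
// with a fixed placeholder before the event text would leave the cluster.
func redactPodNames(event string) string {
	return podNamePattern.ReplaceAllString(event, "<redacted-pod>")
}

func main() {
	event := "Back-off restarting failed container in pod super-secret-project-pod-7d9f4"
	fmt.Println(redactPodNames(event))
}
```

A pattern this broad also risks false positives on ordinary hyphenated words, which is the other half of the research problem.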

### Proceed with care

- The k8sgpt team recommends using an entirely different backend **(a local model) in critical production environments**. By using a local model, you can rest assured that everything stays within your DMZ and nothing is leaked.
- If there is any uncertainty about sending data to a public LLM (OpenAI, Azure AI) and doing so poses a risk to business-critical operations, avoid using a public LLM; base that decision on your own assessment and the risks that apply in your jurisdiction.


</details>

<details>
<summary> Configuration management</summary>

`k8sgpt` stores config data in the `$XDG_CONFIG_HOME/k8sgpt/k8sgpt.yaml` file. The data is stored in plain text, including your OpenAI key.

Config file locations:
In these scenarios K8sGPT supports AWS S3 Integration.
_As a prerequisite `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` are required as environmental variables._

_Adding a remote cache_

Note: this will create the bucket if it does not exist.
```
k8sgpt cache add --region <aws region> --bucket <name>
```
