- Project Ideas
- Argo
- Buildpacks
- Kubernetes
- KubeVela
- Chaos Mesh
- Vitess
- KubeArmor
- TiKV
- Karmada
- Pixie
- CoreDNS
- in-toto
- KubeEdge
- WasmEdge
- Kyverno
- Brigade
- cert-manager
- The Update Framework (TUF)
- Keylime
- Thanos
- LitmusChaos
- Meshery
- Service Mesh Performance
- Knative
If you are a project maintainer and are considering mentoring during the GSoC 2022 cycle, please submit your ideas below using the template.
- Description: Memoization is a feature that allows users to run workflows faster by avoiding repeating work that has already been done. Currently, memoization uses a Kubernetes ConfigMap for storage, which does not scale to a large number of entries and requires elevated RBAC. We'd like to make this extensible so that additional storage systems can be supported for the step memoization cache in Argo Workflows (see the interface sketch below).
- Expected Outcome: The memoization feature is extended to allow additional storage systems.
- Recommended Skills: Golang, Kubernetes, Databases
- Mentor(s): Yuan Tang (@terrytangyuan) and Alex Collins (@alexec)
- Expected Project Size: 175 hours
- Difficulty Rating: Medium
- Upstream Issue (URL): argoproj/argo-workflows#3587
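A minimal sketch of what a pluggable cache-storage abstraction could look like; the interface and method names are illustrative assumptions, not the actual Argo Workflows API:

```go
// Package memoize sketches a storage abstraction for workflow memoization.
package memoize

import (
	"context"
	"time"
)

// Entry is a single memoized step result.
type Entry struct {
	Key       string
	Outputs   []byte // serialized step outputs
	CreatedAt time.Time
}

// CacheStore abstracts where memoization entries live, so the current
// ConfigMap-backed implementation could be swapped for Redis, a SQL
// database, etc.
type CacheStore interface {
	Load(ctx context.Context, key string) (*Entry, bool, error)
	Save(ctx context.Context, e *Entry) error
	GC(ctx context.Context, olderThan time.Duration) error
}
```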
- Description: Currently, any user who wants to add additional resources to the UI needs to implement both backend and frontend changes. Instead, we'd like to implement a UI extension mechanism to load and embed UI elements in the Argo Workflows UI. Argo CD already has a UI extension mechanism to load Argo Rollouts into its UI, which has proven successful. We can take inspiration from this and implement something similar in Argo Workflows.
- Expected Outcome: An extension mechanism in Argo Workflows UI to embed custom UI elements.
- Recommended Skills: Typescript, React, Golang
- Mentor(s): Alex Collins (@alexec)
- Expected Project Size: 175 hours
- Difficulty Rating: Medium
- Upstream Issue (URL): argoproj/argo-workflows#6945
- Description: Currently, Argo Workflows has a feature called enhanced depends logic that allows users to specify dependent tasks based on their statuses via complex boolean logic. However, this requires users to specify task names explicitly, which may be difficult or unnecessary to obtain. For example, there are situations where the tasks are dynamically generated and task names cannot be easily obtained, or users might not care about the statuses of specific tasks and instead focus on the number of tasks in a particular status. We'd like to implement count-based enhanced depends logic in Argo Workflows in addition to the existing depends logic based on dependent tasks and their statuses (see the sketch below).
- Expected Outcome: Support count-based enhanced depends logic in Argo Workflows in addition to the existing depends logic based on dependent tasks and their statuses.
- Recommended Skills: Golang, Kubernetes
- Mentor(s): Yuan Tang (@terrytangyuan)
- Expected Project Size: 175 hours
- Difficulty Rating: Medium
- Upstream Issue (URL): argoproj/argo-workflows#3171
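An illustrative sketch of how a count-based depends predicate could be evaluated; the types and semantics below are assumptions for illustration, not the real Argo Workflows implementation:

```go
// Package depends sketches a count-based depends check.
package depends

// TaskStatus mirrors the phases a DAG task can end in.
type TaskStatus string

const (
	Succeeded TaskStatus = "Succeeded"
	Failed    TaskStatus = "Failed"
)

// CountSatisfied returns true when at least min tasks among a (possibly
// dynamically generated) group reached the wanted status, e.g. a rule like
// "run when >= 3 fan-out tasks succeeded" without naming tasks explicitly.
func CountSatisfied(statuses []TaskStatus, want TaskStatus, min int) bool {
	n := 0
	for _, s := range statuses {
		if s == want {
			n++
		}
	}
	return n >= min
}
```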
- Description: The `pack` CLI has many layers of cache with various methods to configure them. In this project, the goal would be to implement a proposed solution for how the end user provides caching options. The caching mechanisms are already implemented, but they are not currently exposed to the end user (a hypothetical flag-parsing sketch follows below).
- Expected outcome: A new argument to the Pack CLI that allows users to configure various types of cache options.
- Recommended Skills: Golang, Docker
- Mentor(s): Javier Romero (@jromero)
- Expected project size: 350 Hours
- Difficulty: Medium
- Upstream Issue (URL): buildpacks/pack#1077
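A minimal sketch of parsing a hypothetical `--cache "type=build;format=volume;name=my-cache"` flag value; the flag name and fields are assumptions for illustration, not the agreed pack CLI design:

```go
package main

import (
	"fmt"
	"strings"
)

// CacheOpts holds one cache configuration parsed from the flag value.
type CacheOpts struct {
	Type   string // e.g. "build" or "launch"
	Format string // e.g. "volume" or "image"
	Name   string
}

// parseCacheFlag parses a semicolon-separated key=value list.
func parseCacheFlag(v string) (CacheOpts, error) {
	opts := CacheOpts{}
	for _, kv := range strings.Split(v, ";") {
		parts := strings.SplitN(kv, "=", 2)
		if len(parts) != 2 {
			return opts, fmt.Errorf("malformed cache option %q", kv)
		}
		switch parts[0] {
		case "type":
			opts.Type = parts[1]
		case "format":
			opts.Format = parts[1]
		case "name":
			opts.Name = parts[1]
		default:
			return opts, fmt.Errorf("unknown cache option key %q", parts[0])
		}
	}
	return opts, nil
}

func main() {
	fmt.Println(parseCacheFlag("type=build;format=volume;name=my-cache"))
}
```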
- Description: A proof of concept implementing an OCI registry facade to stand in the middle between the lifecycle and the Daemon is required to evaluate the deprecation of Daemon support. The idea is to create a component that is capable of translating OCI format to V1 format and vice versa, and to simplify the Lifecycle codebase to only interact with an OCI registry without losing the current capability of dealing with the Daemon. There are some concerns about how this will affect performance, but the PoC will help us understand all the side effects of the approach. More information about removing Daemon support is discussed in the following draft RFC (a rough facade sketch follows this entry).
- Expected outcome: Implementation of a basic OCI registry wrapper capable of translating inbound/outbound OCI requests to Daemon requests
- Recommended Skills: Golang, Docker
- Mentor(s): Javier Romero (@jromero), Juan Bustamante (@jjbustamante)
- Expected project size: 350 Hours
- Difficulty: Hard
- Upstream Issue (URL): buildpacks/pack#1372
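A rough sketch of the facade's shape: an HTTP server speaking a small subset of the OCI Distribution API and delegating lookups to a daemon-backed store. The `DaemonStore` interface is hypothetical; a real PoC would back it with the Docker daemon API:

```go
package main

import (
	"io"
	"log"
	"net/http"
	"strings"
)

// DaemonStore is the assumed adapter over the Docker daemon.
type DaemonStore interface {
	Manifest(name, ref string) ([]byte, string, error) // body, media type
	Blob(name, digest string) (io.ReadCloser, error)
}

type facade struct{ store DaemonStore }

func (f *facade) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	switch {
	case r.URL.Path == "/v2/":
		w.WriteHeader(http.StatusOK) // OCI API version check endpoint
	case strings.Contains(r.URL.Path, "/manifests/"):
		// Handles /v2/<name>/manifests/<reference> by asking the daemon.
		parts := strings.SplitN(strings.TrimPrefix(r.URL.Path, "/v2/"), "/manifests/", 2)
		body, mediaType, err := f.store.Manifest(parts[0], parts[1])
		if err != nil {
			http.Error(w, err.Error(), http.StatusNotFound)
			return
		}
		w.Header().Set("Content-Type", mediaType)
		w.Write(body)
	default:
		http.NotFound(w, r)
	}
}

func main() {
	var store DaemonStore // TODO: wire up a daemon-backed implementation here
	if store == nil {
		log.Fatal("no DaemonStore implementation wired up")
	}
	log.Fatal(http.ListenAndServe(":5000", &facade{store: store}))
}
```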
Kubebuilder (https://github.com/kubernetes-sigs/kubebuilder)
- Description: Your goal is to develop a Kubebuilder plugin which generates files with all of the common desired source code an Operator author needs to deploy an Operand (image/Pod) following the Operator pattern and common best practices and recommendations, such as using status conditions, tests, etc. Note that we can begin with a basic implementation of this plugin and then grow it incrementally with many follow-ups. This plugin can save Operator authors a lot of time and give them good direction and a starting point. You can begin by following, for example, the quick Golang Operator tutorial and the document Common recommendations and suggestions to get an idea of how Operators work and what code this plugin would generate by default (a sample of such scaffolded code follows below). It is ideal for those who are looking to learn more about the Operator pattern and its good practices, and about how to develop tests and ensure the quality and maintainability of solutions.
- Expected outcome: Implementation of a new Golang plugin in Kubebuilder to generate the scaffolds required to deploy and manage an image on the cluster. Other possible outcomes are to provide examples of the code implementation and to recommend best practices in documentation.
- Recommended Skills: Golang, Kubernetes, Operators
- Mentor(s): Camila Macedo (@camilamacedo86) and Rashmi Gottipati(@rashmigottipati)
- Expected project size: 175 Hours
- Difficulty: Medium
- Upstream Issue (URL): https://github.com/kubernetes-sigs/kubebuilder/blob/master/designs/code-generate-image-plugin.md
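A sketch of the kind of code such a plugin might scaffold for an Operator author: a reconcile step that records a status condition on the managed resource, using the real controller-runtime/apimachinery helpers. What the plugin actually generates is up to its design; this is only illustrative:

```go
package controllers

import (
	"k8s.io/apimachinery/pkg/api/meta"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// setDeployedCondition records that the Operand Deployment was created,
// following the StatusConditions best practice the plugin would encode.
func setDeployedCondition(conditions *[]metav1.Condition) {
	meta.SetStatusCondition(conditions, metav1.Condition{
		Type:    "Available",
		Status:  metav1.ConditionTrue,
		Reason:  "DeploymentCreated",
		Message: "Operand deployment created successfully",
	})
}
```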
- Description: Your goal is to develop a Kubebuilder plugin which will generate the manifests required to provide Grafana dashboards for visualizing the default exported metrics. You can begin by first looking into the requirements for exporting the controller-runtime metrics and the steps to create a custom dashboard. After figuring out the configuration files which are needed, we can start by creating a plugin that, when invoked, scaffolds them out. The plugins here can give a good idea of how they work. An extended goal would be to figure out how we can make it available on https://grafana.com/grafana/dashboards for users to be able to easily export it.
- Expected outcome: Create a plugin that, when invoked, scaffolds out the required manifests for integrating a Grafana dashboard. Make the Grafana dashboard available on https://grafana.com/grafana/dashboards based on the inputs from the community.
- Recommended Skills: Golang, Kubernetes
- Mentor(s): Varsha Prasad (@varshaprasad96)
- Expected project size: 175 Hours
- Difficulty: Medium
- Upstream Issue (URL): kubernetes-sigs/kubebuilder#2183
- Description: CAPG is a subproject under SIG Cluster Lifecycle and is the GCP implementation of Cluster API, a Kubernetes sub-project focused on providing declarative APIs and tooling to simplify provisioning, upgrading, and operating multiple Kubernetes clusters.
- Expected outcome: This work will bring CAPG in line with CAPA and CAPZ, both of which support creating unmanaged and managed clusters.
- Recommended Skills: Golang, Kubernetes
- Mentor(s): Carlos Panato (@cpanato), Davanum Srinivas (@dims), Richard Case (@richardcase), Winnie Kwon (@pydctw)
- Expected project size: 175 Hours
- Difficulty: Medium
- Upstream Issue (URL): kubernetes-sigs/cluster-api-provider-gcp#512 / kubernetes-sigs/cluster-api-provider-gcp#478 / kubernetes-sigs/cluster-api-provider-gcp#289
- Description: Cluster API Provider AWS (CAPA) is a subproject of SIG Cluster Lifecycle that extends Cluster API to simplify lifecycle management of Kubernetes clusters on AWS. The focus of this project is to improve the observability of CAPA by integrating it with observability tools such as OpenTelemetry/Jaeger/Prometheus. With the help of the metrics/traces emitted by these observability integrations, users will be able to easily analyze CAPA's behaviour and performance (a minimal tracing sketch follows below). This project will also focus on improving the developer experience with OpenTelemetry collectors and by documenting the integration steps.
- Expected outcome: The main outcome is to instrument CAPA to export OpenTelemetry traces and to improve the metrics exposed. Another outcome is to improve the development environment by deploying OpenTelemetry for collecting traces, and Jaeger and Prometheus for viewing traces and visualizing metrics.
- Recommended Skills: Golang, Kubernetes, observability tools (such as OpenTelemetry/Jaeger/Prometheus)
- Mentor(s): Sedef Savas (@sedefsavas) Richard Case (@richcase)
- Expected project size: 350 Hours
- Difficulty: Medium
- Upstream Issue (URL): kubernetes-sigs/cluster-api-provider-aws#2178
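A minimal sketch of instrumenting a reconcile step with OpenTelemetry tracing, using the real `go.opentelemetry.io/otel` API; span names and attributes are illustrative, not CAPA's actual instrumentation:

```go
package controllers

import (
	"context"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
)

func reconcileCluster(ctx context.Context, clusterName string) error {
	// Spans emitted here can be exported to Jaeger via the OTel SDK/collector.
	ctx, span := otel.Tracer("cluster-api-provider-aws").Start(ctx, "ReconcileAWSCluster")
	defer span.End()
	span.SetAttributes(attribute.String("cluster.name", clusterName))

	// ... actual reconciliation work, passing ctx to downstream AWS calls
	// so they can create child spans ...
	_ = ctx
	return nil
}
```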
- Description: To enable KubeVela end users to better manage cloud resources, more cloud providers and cloud resources need to be supported in KubeVela. Some cloud providers have already been supported as Terraform addons, but there are others which are supported by Terraform Controller and not yet by KubeVela; these are expected to be added as Terraform cloud providers in KubeVela. KubeVela already supports 110 cloud resources across cloud providers, but we need more cloud resources to be extended as KubeVela Terraform ComponentDefinitions.
- Expected outcome: 15+ cloud providers are supported by KubeVela as Terraform addons; 300+ cloud resources are supported; Tools and docs to support extending Terraform provider addons and cloud resources in an easy and productive way
- Recommended Skills: Golang, Terraform
- Mentor(s): ZhengXi Zhou (@zzxwill)
- Expected project size: 350 Hours
- Difficulty: Hard
- Upstream Issue (URL): kubevela/kubevela#2442
- Description: Improve the capabilities of KubeVela GitOps to make it a standalone Addon.
- Expected outcome: Create a KubeVela GitOps addon to simplify use and debugging.
- Recommended Skills: Golang, Kubernetes, Cue
- Mentor(s): FogDong (@FogDong)
- Expected project size: 175 Hours
- Difficulty: Medium
- Upstream Issue (URL): kubevela/kubevela#3205
- Description: Extend the observability of multi-cluster information on the KubeVela control plane (an illustrative status type is sketched below).
- Expected outcome: Provide extensible cluster information, starting with more information such as: health status, available CPU cores, whether it is a GPU cluster, the stability of connections with the KubeVela control plane, etc.
- Recommended Skills: Golang, Kubernetes
- Mentor(s): Da Yin (@Somefive)
- Expected project size: 175 Hours
- Difficulty: Medium
- Upstream Issue (URL): kubevela/kubevela#3177
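An illustrative sketch of the kind of extensible cluster-info status the control plane could expose; the field names are assumptions, not KubeVela's actual API:

```go
package types

import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

// ClusterInfo aggregates health and capacity data collected from a managed
// cluster so it can be surfaced on the KubeVela control plane.
type ClusterInfo struct {
	Name              string `json:"name"`
	Healthy           bool   `json:"healthy"`
	AvailableCPUCores int64  `json:"availableCPUCores"`
	HasGPU            bool   `json:"hasGPU"`
	// LastProbeTime records when the control plane last reached the cluster,
	// as a rough signal of connection stability.
	LastProbeTime metav1.Time `json:"lastProbeTime"`
}
```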
- Description: Chaos Mesh supports injecting errors into a bare metal server through chaosd and `PhysicalMachineChaos`. It's convenient to manage physical machine injection through Chaos Mesh, given its dashboard and its `Workflow` and `Schedule` functions. However, a bare metal user without any Kubernetes environment has to deploy a Kubernetes cluster just to manage and install Chaos Mesh. This project is to simplify this routine by packaging the necessary parts of Kubernetes and the controller of Chaos Mesh together (much like a more simplified k3s). You will need to investigate the source code of Chaos Mesh and Kubernetes, combine their controllers and API servers, and extract the useful parts of them. You will also need to provide an out-of-the-box configuration for the certificates and leader election. What's more, this project has the potential to empower Kubernetes to become a framework for developing highly available declarative APIs for any cloud-native project (a process-supervision sketch follows below).
- Expected outcome: Provide an executable file that starts Chaos Mesh and the related Kubernetes environment, with high-availability support, in one process.
- Recommended Skills: Golang, Kubernetes
- Mentor(s): Yang Keao (@YangKeao)
- Expected project size: 350 Hours
- Difficulty: Hard
- Upstream Issue (URL): chaos-mesh/chaos-mesh#2848
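A rough sketch of one shape a single-binary distribution could take: several components (an embedded API server and the Chaos Mesh controller, say) supervised inside one process. The component wiring here is hypothetical; only the supervision pattern is shown:

```go
package main

import (
	"context"
	"log"

	"golang.org/x/sync/errgroup"
)

// Component is anything that runs until its context is cancelled.
type Component func(ctx context.Context) error

func main() {
	g, ctx := errgroup.WithContext(context.Background())

	// Placeholder components; a real build would start an embedded
	// apiserver and the Chaos Mesh controller manager here.
	components := map[string]Component{
		"embedded-apiserver":    func(ctx context.Context) error { <-ctx.Done(); return ctx.Err() },
		"chaos-mesh-controller": func(ctx context.Context) error { <-ctx.Done(); return ctx.Err() },
	}
	for name, run := range components {
		name, run := name, run
		g.Go(func() error {
			log.Printf("starting %s", name)
			return run(ctx) // if any component fails, ctx cancels the others
		})
	}
	if err := g.Wait(); err != nil {
		log.Fatal(err)
	}
}
```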
- Description: Communicate with subprocesses in another namespace through unix sockets.
- Detailed description: The `chaos-daemon` component of Chaos Mesh manages many subprocesses to inject faults into other containers (Linux namespaces), and we need to communicate with those subprocesses (to modify configurations or monitor statuses). However, some of the subprocesses run in another network namespace (hard to access by network), and others may run in another mount namespace (hard to access by named unix socket). Currently, `chaos-daemon` communicates with subprocesses through stdin and stdout, which works across different namespaces but lacks a session layer, and we have to be cautious about printing logs to stdout. So we need a better way to communicate with subprocesses, and the abstract unix socket is a reasonable choice (a stdlib sketch of the mechanism follows below).
- Expected outcome:
  - The `chaos-daemon` can pass file descriptors of unix socket listeners to subprocesses and dial them.
  - The subprocesses can receive the file descriptors and re-construct unix socket listeners.
- Recommended Skills: Golang, Rust, Docker, Linux
- Mentor(s): Hexilee (@Hexilee), Yang Keao (@YangKeao)
- Expected project size: 350 Hours
- Difficulty: Hard
- Upstream Issue (URL): chaos-mesh/rfcs#34
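A sketch of the proposed mechanism using real Go stdlib APIs: the parent creates an abstract unix socket listener (a name starting with `@`, which lives in the network namespace rather than the filesystem), passes its file descriptor to a subprocess via `ExtraFiles`, and the child rebuilds the listener from fd 3. Names and the toy payload are illustrative:

```go
package main

import (
	"log"
	"net"
	"os"
	"os/exec"
)

func parent() {
	// Abstract sockets avoid the mount-namespace problem of named sockets.
	ln, err := net.Listen("unix", "@chaos-daemon-demo")
	if err != nil {
		log.Fatal(err)
	}
	f, err := ln.(*net.UnixListener).File()
	if err != nil {
		log.Fatal(err)
	}
	cmd := exec.Command("/proc/self/exe", "child")
	cmd.ExtraFiles = []*os.File{f} // becomes fd 3 in the child
	if err := cmd.Start(); err != nil {
		log.Fatal(err)
	}
	// Dial the abstract address to open a session with the subprocess.
	conn, err := net.Dial("unix", "@chaos-daemon-demo")
	if err != nil {
		log.Fatal(err)
	}
	conn.Write([]byte("ping"))
	conn.Close()
	cmd.Wait()
}

func child() {
	// Re-construct the listener from the inherited file descriptor.
	ln, err := net.FileListener(os.NewFile(3, "listener"))
	if err != nil {
		log.Fatal(err)
	}
	conn, err := ln.Accept()
	if err != nil {
		log.Fatal(err)
	}
	buf := make([]byte, 4)
	conn.Read(buf)
	log.Printf("child got %q over the abstract socket", buf)
}

func main() {
	if len(os.Args) > 1 && os.Args[1] == "child" {
		child()
		return
	}
	parent()
}
```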
- Description: Improve the compatibility of Vitess' evaluation engine against MySQL by adding support for more built-in SQL functions (an illustrative registration sketch follows below).
- Detailed description: The evaluation engine in Vitess is one of the most critical parts of our query serving infrastructure. This engine is capable of evaluating arbitrary SQL expressions directly inside Vitess' process, without reaching out to a live MySQL instance, and this allows us to plan and execute complex user queries (e.g. queries that contain WHERE and similar filter clauses) between Vitess shards much more efficiently. If you're interested in this GSoC project, your task for the summer will involve continuing the work on this evaluation engine by implementing support for as many built-in SQL functions as possible, using the behavior of MySQL as a reference.
- Expected outcomes: We expect the Evaluation Engine in Vitess to be close to 100% compatible with MySQL after all the leftover SQL built-ins have been implemented.
- Recommended Skills: Golang, MySQL
- Mentor(s): Vicent Marti (@vmg)
- Expected size of the project: 350h
- Difficulty rating: Medium
- Upstream Issue (URL): vitessio/vitess#9647
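An illustrative sketch of how a built-in SQL function might be registered in an expression evaluation engine; the interfaces are hypothetical and do not mirror Vitess' actual evalengine types:

```go
// Package evalengine sketches a registry of SQL built-in functions.
package evalengine

import (
	"fmt"
	"strings"
)

// builtin evaluates already-evaluated argument values.
type builtin func(args []interface{}) (interface{}, error)

var builtins = map[string]builtin{}

// register wires a SQL function name to its Go implementation.
func register(name string, fn builtin) { builtins[strings.ToUpper(name)] = fn }

func init() {
	// Implementing MySQL's UPPER() as an example, using MySQL behavior as
	// the reference: UPPER(NULL) is NULL.
	register("UPPER", func(args []interface{}) (interface{}, error) {
		if len(args) != 1 {
			return nil, fmt.Errorf("UPPER: want 1 argument, got %d", len(args))
		}
		if args[0] == nil {
			return nil, nil
		}
		s, ok := args[0].(string)
		if !ok {
			return nil, fmt.Errorf("UPPER: unsupported argument type %T", args[0])
		}
		return strings.ToUpper(s), nil
	})
}
```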
- Description: KubeArmor provides visibility telemetry events showing pod/container observability data such as process executions, file system accesses, and network accesses. This information is to be used to tie together more comprehensive analysis data showing the security posture of the pod/container. This security posture/visibility information would in turn help users discover optimal policy settings. One aim of this work is to ensure that the system shows only useful/aggregated data and does not simply throw a bunch of events/logs at the user. The overall design involves developing and deploying a k8s service that waits on the KubeArmor events and aggregates those events at the container/pod level (a rough aggregation sketch follows below). The CLI tool (already present, but it has to be extended) will pull the information from the service to show it to the user. An extended goal could be to show a simple TUI to the user by querying the KubeArmor service. Detailed use cases and requirements are mentioned in this slide deck.
- Expected outcome: Develop a k8s service/deployment that could aggregate events from kubearmor and show observability data to the user. An extended goal is to show a TUI based on this observability data.
- Recommended Skills: golang, tui, mysql, grpc, k8s
- Mentor(s): Barun Acharya (@daemon1024), Rahul Jadhav (@nyrahul)
- Expected project size: 350 Hours
- Difficulty: Medium
- Upstream Issue (URL): kubearmor/KubeArmor#613
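A rough sketch of the aggregation idea: consume a stream of KubeArmor telemetry events and keep per-pod counters that a CLI/TUI can query. The `Event` type and stream source are hypothetical stand-ins for the real KubeArmor gRPC feed:

```go
package aggregator

import "sync"

// Event is a simplified telemetry record (process exec, file access, ...).
type Event struct {
	PodName   string
	Operation string // "Process", "File", "Network"
}

// Aggregator rolls events up at the pod level instead of forwarding raw logs.
type Aggregator struct {
	mu     sync.Mutex
	counts map[string]map[string]int // pod -> operation -> count
}

func New() *Aggregator {
	return &Aggregator{counts: map[string]map[string]int{}}
}

// Consume drains the event stream and updates the per-pod counters.
func (a *Aggregator) Consume(events <-chan Event) {
	for ev := range events {
		a.mu.Lock()
		if a.counts[ev.PodName] == nil {
			a.counts[ev.PodName] = map[string]int{}
		}
		a.counts[ev.PodName][ev.Operation]++
		a.mu.Unlock()
	}
}

// Snapshot returns the aggregated posture data for one pod.
func (a *Aggregator) Snapshot(pod string) map[string]int {
	a.mu.Lock()
	defer a.mu.Unlock()
	out := map[string]int{}
	for op, n := range a.counts[pod] {
		out[op] = n
	}
	return out
}
```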
- Description: KubeArmor has garnered interest from edge computing platforms (such as LF Edge OpenHorizon) that leverage the k8s control plane for workload orchestration. The primary requirement is to support ARM platforms that are prevalent on edge devices (especially the Raspberry Pi). KubeArmor leverages eBPF for observability and Linux Security Modules (such as AppArmor) for policy enforcement. One of the challenges is to check whether the eBPF primitives, such as observing kprobes, kretprobes, and tracepoints, that are typically available on the x86 platform are also available on the ARM platform, and whether the parameter lists fulfill the requirements (a small availability-check sketch follows below). After this analysis, the KubeArmor code might have to be changed to accommodate any differences in eBPF behaviour.
- Expected outcome: KubeArmor observability features should work on the ARM platform. An extended goal would be to ensure that KubeArmor's policy enforcement features (based on AppArmor) are also supported on ARM platforms.
- Recommended Skills: golang, raspberry-pi, ebpf, k8s
- Mentor(s): Rahul Jadhav (@nyrahul), Barun Acharya (@daemon1024)
- Expected project size: 350 Hours
- Difficulty: Hard
- Upstream Issue (URL): kubearmor/KubeArmor#614
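A small sketch of the kind of platform check this analysis needs: verifying that a kernel symbol we want to kprobe is attachable on the running (e.g. ARM) kernel by scanning the tracefs function list (requires debugfs to be mounted and root privileges). The symbol below is just an example:

```go
package main

import (
	"bufio"
	"fmt"
	"log"
	"os"
)

// kprobeSymbolAvailable reports whether the kernel allows probing symbol.
func kprobeSymbolAvailable(symbol string) (bool, error) {
	// available_filter_functions lists symbols the kernel allows probing.
	f, err := os.Open("/sys/kernel/debug/tracing/available_filter_functions")
	if err != nil {
		return false, err
	}
	defer f.Close()
	s := bufio.NewScanner(f)
	for s.Scan() {
		if s.Text() == symbol {
			return true, nil
		}
	}
	return false, s.Err()
}

func main() {
	ok, err := kprobeSymbolAvailable("security_bprm_check")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("security_bprm_check attachable:", ok)
}
```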
- Description: TiSpark maintains a fork of the TiKV Java client that provides little benefit; it could use the upstream TiKV Java client instead.
- Expected outcome: TiSpark abandons the self-maintained TiKV Java client and uses the upstream TiKV Java client. Everything should work as before.
- Recommended Skills: Java, Spark, TiKV
- Mentor(s): Xiang Zhang (@zhangyangyu), Yuhang Shi (@shiyuhang0)
- Expected project size: 175 Hours
- Difficulty: Medium
- Upstream Issue (URL): tikv/client-java#514
- Description: TiKV is an open-source, distributed, and transactional key-value database. TiKV is widely used in many mission-critical scenarios that require request latency to be below the single-millisecond level, so knowing where the latency is spent is important. This project is going to give TiKV the ability to observe latency composition and diagnose slow requests. CPU-bound requests in TiKV, such as coprocessor requests, are executed at request granularity, which is relatively convenient to trace. However, for IO-bound requests in TiKV, such as prewrite requests, batch processing has been introduced to improve IO throughput, which brings some challenges for tracing, since it's a scenario not covered by most tracing frameworks. Also, we need to fetch statistics from RocksDB, the storage engine powering TiKV, to provide further tracing details for IO-bound requests.
- Expected outcome: Improve observability for IO-bound requests in TiKV. Concretely, we expect to learn about latency details of RaftStore and RocksDB from the improved tracing results.
- Recommended Skills: Rust, C++, OpenTracing, RocksDB
- Mentor(s): Zhenchi (@zhongzc), breeswish (@breeswish)
- Expected project size: 350 Hours
- Difficulty: Medium
- Upstream Issue (URL): tikv/tikv#11872
- Description: Karmada (Kubernetes Armada) is a Kubernetes management system that enables you to run your cloud-native applications across multiple Kubernetes clusters and clouds. This project is to develop the community's official website to hold the necessary documents.
- Expected outcome: A brand-new website with enhancements to hold documents.
- Recommended Skills: Frontend HTML/CSS/JavaScript, Backend Node/Python/Go/etc
- Mentor(s): Hongcai Ren (@RainbowMango)
- Expected project size: 175 Hours
- Difficulty: Medium
- Upstream Issue (URL): karmada-io/website#14
- Description: Pixie helps contextualize the data it collects by joining it with the relevant K8s metadata. This helps us answer questions like “Pod A has an HTTP request latency of 30ms” and allows us to build resource-level visualizations. Currently, Pixie pulls in metadata about namespaces, pods, and services, but is missing other useful K8s resources. You can help add resources to Pixie’s metadata context and experiment with what views and dashboards can leverage this new metadata.
- Expected outcome: More queryable metadata context for Pixie data (for example, the equivalent of the `px.pod_*` functions), and a dashboard (PxL + vis spec) for the newly added resource (similar to `px/pod`).
- Recommended Skills: Go, C++, understanding of Kubernetes resources
- Mentor(s): Michelle Nguyen (@aimichelle)
- Expected project size: 175 Hours
- Difficulty: Medium
- Upstream Issue (URL): pixie-io/pixie#34
- Description: Pixie’s protocol tracer captures traffic for various protocols, including HTTP, gRPC, postgres and many more. The protocol tracer relies on protocol inference rules to classify traffic based on the contents of the messages being sent. To validate the accuracy of our protocol inference rules, we need a large database of different traffic patterns. We call this dataset TrafficNet. For this project, you will try to (1) expand the TrafficNet data set to include more samples of existing and new protocols, and (2) experiment with different inference models to improve the accuracy of the protocol inference.
- Expected outcome:
- An augmented data set of traffic patterns with a variety of protocols, based on new workloads.
- A report on protocol inference rules applied to the TrafficNet data sets, identifying the best-performing models and evaluating the trade-offs of the models.
- Recommended Skills: C++, Python
- Mentor(s): Omid Azizi (@oazizi000)
- Expected project size: 350 Hours
- Difficulty: Medium
- Upstream Issue (URL): pixie-io/pixie#404
- Description: PxL is a Pandas-based query language which is used to query Pixie’s data. Help add to PxL’s function library to make it easier for users to transform their data and build interesting views. This may include adding math and string operations, or functions of your own choice.
- Expected outcome: New functions in PxL, with the number of functions/complexity of the function up to your choice
- Recommended Skills: C++
- Mentor(s): Natalie Serrino (@nserrino)
- Expected project size: 175 Hours
- Difficulty: Easy
- Upstream Issue (URL): pixie-io/pixie#405 pixie-io/pixie#356
- Description: CoreDNS is a cloud-native DNS server with a focus on service discovery. While best known as the default DNS server for Kubernetes, CoreDNS is capable of handling many other scenarios, such as serving DNS over HTTPS/TLS. As HTTPS/TLS requires a valid certificate and periodic certificate renewal, automation (e.g., through the ACME protocol) will minimize the manual maintenance needed (an ACME automation sketch follows below). This project is to provide certificate management automation in the TLS plugin of CoreDNS.
- Expected outcome: An option will be added to the TLS plugin for automatically renewing certificates through the ACME protocol and exposing the updated certificate.
- Recommended Skills: Golang, DNS, TLS, Certificate Management
- Mentor(s): Yong Tang (@yongtang) Paul Greenberg (@greenpau)
- Expected project size: 175 Hours
- Difficulty: Medium
- Upstream Issue (URL): coredns/coredns#3460
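A minimal sketch of ACME-based certificate automation in Go using the real `golang.org/x/crypto/acme/autocert` package; the host name and cache path are placeholders, and the CoreDNS TLS plugin would integrate something along these lines rather than this exact server:

```go
package main

import (
	"crypto/tls"
	"log"
	"net/http"

	"golang.org/x/crypto/acme/autocert"
)

func main() {
	m := &autocert.Manager{
		Prompt:     autocert.AcceptTOS,
		HostPolicy: autocert.HostWhitelist("dns.example.org"),
		Cache:      autocert.DirCache("/var/lib/coredns/acme"), // persists renewed certs
	}
	srv := &http.Server{
		Addr: ":443",
		// GetCertificate transparently obtains and renews certificates
		// from the ACME CA as TLS handshakes come in.
		TLSConfig: &tls.Config{GetCertificate: m.GetCertificate},
	}
	log.Fatal(srv.ListenAndServeTLS("", ""))
}
```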
- Description: Dead Simple Signing Envelope (DSSE) is a new signature wrapper, and a proposal to make it the default for in-toto metadata has been accepted. However, there is currently no Python implementation that can be used by in-toto and other projects. The aim of this project is to fix that by writing a fully featured and well-tested DSSE implementation, and using it to allow users to generate in-toto metadata using DSSE rather than the legacy signature wrapper (the envelope format is sketched below).
- Expected Outcome: The Python reference implementation can generate in-toto metadata using DSSE.
- Expected Project Size: 175
- Difficulty: Medium
- Recommended Skills: Python
- Mentor(s): Aditya Sirish (@adityasaky), Lukas Pühringer (@lukpueh)
- Upstream Issue (URL): in-toto/in-toto#445
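An illustrative sketch (in Go, though this project targets Python) of DSSE's envelope shape and pre-authentication encoding (PAE) per the DSSE v1 spec; the payload below is a placeholder:

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// Envelope mirrors the DSSE JSON structure.
type Envelope struct {
	PayloadType string      `json:"payloadType"`
	Payload     string      `json:"payload"` // base64(body)
	Signatures  []Signature `json:"signatures"`
}

type Signature struct {
	KeyID string `json:"keyid"`
	Sig   string `json:"sig"` // base64 signature over PAE(type, body)
}

// pae computes the byte string that is actually signed:
// "DSSEv1" SP LEN(type) SP type SP LEN(body) SP body
func pae(payloadType string, body []byte) []byte {
	return []byte(fmt.Sprintf("DSSEv1 %d %s %d %s",
		len(payloadType), payloadType, len(body), body))
}

func main() {
	body := []byte(`{"_type": "link"}`)
	env := Envelope{
		PayloadType: "application/vnd.in-toto+json",
		Payload:     base64.StdEncoding.EncodeToString(body),
		// A real implementation signs pae(...) with the signer's key.
	}
	out, _ := json.MarshalIndent(env, "", "  ")
	fmt.Println(string(out))
	fmt.Printf("signed bytes: %s\n", pae(env.PayloadType, body))
}
```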
- Description: The in-toto Jenkins plugin allows users to generate in-toto link metadata in their build pipelines. However, with ITE-6 and the in-toto Attestations project, the plugin must also be capable of generating other in-toto attestations such as the provenance specification. The aim of this project is to refactor the Jenkins plugin to allow for new attestation types, as well as to implement the provenance attestation. The in-toto-java implementation can be leveraged to create the new attestations in the Jenkins plugin.
- Expected Outcome: The Jenkins in-toto plugin can generate SLSA provenance metadata for tasks performed in pipelines.
- Expected Project Size: 175
- Difficulty: Easy
- Recommended Skills: Java, Jenkins
- Mentor(s): Aditya Sirish (@adityasaky), Santiago Torres-Arias (@SantiagoTorres)
- Upstream Issue (URL): in-toto/in-toto-jenkins-plugin#1
- Description: rebuilderd is a verification system for binary packages. It repeats the build process of a package in an identical environment and verifies that the package is identical. It currently generates in-toto link attestations when a package is successfully rebuilt. As part of this task, rebuilderd must be updated to generate in-toto SLSA provenance. To enable this feature, in-toto-rs must be extended to support the provenance specification as well.
- Expected Outcome: in-toto-rs gains ITE-6 semantics and the ability to generate SLSA provenance metadata, which is then used by rebuilderd to generate provenance for successful package rebuilds.
- Expected Project Size: 350
- Difficulty: Medium
- Recommended Skills: Rust
- Mentor(s): Aditya Sirish (@adityasaky), Santiago Torres-Arias (@SantiagoTorres)
- Upstream Issue (URL): in-toto/in-toto-rs#17, in-toto/rebuilderd#5
- Description: Build an initial UI dashboard for KubeEdge so that users can operate KubeEdge objects in the dashboard.
- Expected outcome: Create a KubeEdge objects dashboard.
- Recommended Skills: Kubernetes, KubeEdge, HTML/CSS/JavaScript
- Mentor(s): Yue Bao(@Shelley-BaoYue), Fisher Xu (@fisherxu)
- Expected project size: 175 Hours
- Difficulty: Medium
- Upstream Issue (URL): kubeedge/kubeedge#3608
- Description: The KubeEdge SIG AI is chartered to facilitate Edge AI applications with KubeEdge. An overview of SIG AI activities can be found in this charter. This project focuses on measuring and validating the desired behaviors for an epoch-making Edge AI scheme, i.e., Edge-cloud Joint Inference. The scheme of Edge-cloud Joint Inference has been released in KubeEdge-Sedna together with a hands-on example and a free playground. Part of the effort is to develop test cases on the existing scheme of Edge-cloud Joint Inference on KubeEdge-Sedna, including interfaces for benchmark datasets, metrics, and even baselines. These test cases will help all Edge AI application developers validate and select the best-matched algorithm for joint inference. Several breathtaking Edge-AI scenarios have been prepared for benchmarking: we look forward to seeing your code involved in real-world robots, outer-space satellites, and industrial production lines!
- Expected outcome: Develop tests around the existing scheme of Edge-cloud Joint Inference on KubeEdge-Sedna, including interfaces for benchmark datasets, metrics, and baselines.
- Recommended Skills: TensorFlow/Pytorch, Python
- Mentor(s): Jie Pu(@jaypume)
- Expected project size: 175
- Difficulty: Medium
- Upstream Issue(URL): kubeedge/sedna#274
- Description: The KubeEdge SIG AI is chartered to facilitate Edge AI applications with KubeEdge. An overview of SIG AI activities can be found in this charter. This project focuses on measuring and validating the desired behaviors for an epoch-making Edge AI scheme, i.e., Edge-cloud Collaborative Lifelong Learning. The scheme of Edge-cloud Collaborative Lifelong Learning has been released in KubeEdge-Sedna together with a hands-on example and a free playground. Part of the effort is to develop test cases for this existing scheme on KubeEdge-Sedna, including interfaces for benchmark datasets, metrics, and even baselines. These test cases will help all Edge AI application developers validate and select the best-matched algorithm for lifelong learning. Several breathtaking Edge-AI scenarios have been prepared for benchmarking: we look forward to seeing your code involved in real-world robots, outer-space satellites, and industrial production lines!
- Expected outcome: Develop test cases around the existing scheme of Edge-cloud Collaborative Lifelong Learning on KubeEdge-Sedna, including interfaces for benchmark datasets, metrics, and even baselines.
- Recommended Skills: TensorFlow/Pytorch, Python
- Mentor(s): Zimu Zheng (@MooreZheng)
- Expected project size: 175
- Difficulty: Medium
- Upstream Issue (URL): kubeedge/sedna#275
- Description: Sedna should support centralized visualization of job status and pod status, as visualized O&M of intelligent collaboration is a basic requirement.
- Expected outcome: Sedna supports centralized visualization of job status (such as job status and job stage status), pod status, and worker status (such as worker status and worker service output).
- Recommended Skills: golang, Prometheus, Grafana
- Mentor(s): Jin Yang(@JimmyYang20)
- Expected project size: 175 Hours
- Difficulty: Medium
- Upstream Issue(URL): kubeedge/sedna#273
- Description: WasmEdge is a lightweight, high-performance, and extensible WebAssembly runtime for cloud native, edge, and decentralized applications. WasmEdge is designed to support multiple operating systems. However, the WASI and wasmedge process components are only implemented on the macOS and Linux platforms; most of the host functions are not supported on the Windows platform. Since we have lots of Windows users, it is necessary to finish these implementations.
- Expected outcome: The WASI and process host functions are implemented on Windows platform.
- Recommended Skills: C++, Windows API
- Mentor(s): Hung-Ying Tai (@hydai), Shen-Ta Hsieh (@ibmibmibm)
- Expected project size: 350 Hours
- Difficulty: Medium
- Upstream Issue (URL): WasmEdge/WasmEdge#1227
- Description: Kyverno, the Kubernetes native policy engine, is designed to easily secure and automate Kubernetes configurations. Kyverno uses an extended JMESPath query language for complex JSON data processing. Kyverno extends JMESPath by allowing nested expressions and an extended set of custom functions. As Kyverno's declarative policy language has evolved, it has become important to provide proper tooling around embedding, syntax checking, and validation of JMESPath expressions, including the extensions (a baseline evaluation sketch follows below).
- Expected outcome: This project will define a formal specification of Kyverno's JMESPath extensions using ABNF notation and create a parser-validator for this grammar for use in Kyverno.
- Recommended Skills: Golang, Kubernetes, Language Design, DSLs
- Mentor(s): Jim Bugwadia (@JimBugwadia)
- Expected project size: 350 hours
- Difficulty: Medium
- Upstream Issue (URL): kyverno/kyverno#3217
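A small sketch of compiling and evaluating a JMESPath expression from Go using the real `github.com/jmespath/go-jmespath` package, to show the baseline grammar a formal ABNF specification of Kyverno's extensions would build on (Kyverno itself uses an extended fork with custom functions):

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"

	"github.com/jmespath/go-jmespath"
)

func main() {
	var data interface{}
	raw := []byte(`{"metadata": {"labels": {"app": "nginx", "tier": "web"}}}`)
	if err := json.Unmarshal(raw, &data); err != nil {
		log.Fatal(err)
	}
	// Compile catches syntax errors up front; a Kyverno parser-validator
	// would do the same for the extended grammar (nested expressions and
	// custom functions).
	expr, err := jmespath.Compile("metadata.labels.app")
	if err != nil {
		log.Fatal(err)
	}
	result, err := expr.Search(data)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(result) // nginx
}
```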
- Description: Kyverno policies can validate, mutate, and generate any Kubernetes resource. A common use case is for organizations to allow access to Kubernetes cluster resources based on schedules, such as a pager rotation schedule for SREs. This feature enables time-based policies for Kyverno, allowing select operations for authorized users based on time ranges and schedule constraints (a minimal time-window check is sketched below).
- Expected outcome: The feature will extend Kyverno's policy rule definition to support time ranges and block requests which do not match the configured values.
- Recommended Skills: Golang, Kubernetes
- Mentor(s): Chip Zoller (@chipzoller), Jim Bugwadia (@JimBugwadia)
- Expected project size: 175 hours
- Difficulty: Medium
- Upstream Issue (URL): kyverno/kyverno#2233
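A minimal sketch of the time-range check such a rule could perform; the rule fields are hypothetical, not Kyverno's actual policy schema:

```go
package main

import (
	"fmt"
	"time"
)

// TimeWindow is a daily window during which a request is allowed.
type TimeWindow struct {
	Start string // "09:00"
	End   string // "17:00"
}

// allowed reports whether now falls inside the window (same-day windows only).
func (w TimeWindow) allowed(now time.Time) (bool, error) {
	layout := "15:04"
	start, err := time.Parse(layout, w.Start)
	if err != nil {
		return false, err
	}
	end, err := time.Parse(layout, w.End)
	if err != nil {
		return false, err
	}
	minutes := now.Hour()*60 + now.Minute()
	s := start.Hour()*60 + start.Minute()
	e := end.Hour()*60 + end.Minute()
	return minutes >= s && minutes <= e, nil
}

func main() {
	w := TimeWindow{Start: "09:00", End: "17:00"}
	ok, _ := w.allowed(time.Now())
	fmt.Println("request allowed:", ok)
}
```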
- Description: Kyverno is a Kubernetes native policy engine that makes it easy to validate, mutate, and generate resources. While Kyverno policies allow configuration of exceptions, excluding policy enforcement on select resources requires manual configuration by cluster administrators. This feature extends Kyverno to allow users to request policy exceptions, and for administrative approvals to be granted asynchronously. The goal of this feature is to allow easier collaboration across developers and operations teams, and to make it easier to manage policies at scale.
- Expected outcome:
- Recommended Skills: Golang, Kubernetes
- Mentor(s): Shuting Zhao (@realshuting), Jim Bugwadia (@JimBugwadia)
- Expected project size: 350 hours
- Difficulty: Medium
- Upstream Issue (URL): kyverno/kyverno#2627
- Description: Brigade's data access packages are currently unit-tested against mock implementations of MongoDB client interfaces. These tests have been adequate for asserting that queries and statements are constructed properly (i.e., look like we think they should) and that mock query and statement results can be unmarshaled without error into domain types, but this approach cannot assert that DB queries and statements are logically correct and actually achieve the desired results, since a live database would be required to accomplish that. Given the importance of the data access code, we would like to develop a new suite of integration tests directly targeted at that code. These new tests should run against a live (and disposable) MongoDB database and assert that all queries and statements achieve the desired results (one possible harness is sketched below). If the new suite of tests exposes bugs in the existing data access code, correcting those bugs is within the scope of this project as well.
- Expected outcome: New test suite merged into main branch and fully integrated into CI/CD pipelines
- Recommended Skills: Golang, MongoDB, Docker
- Mentor(s): Kent Rancourt (@krancour)
- Expected project size: 175 hours
- Difficulty: Easy
- Upstream Issue (URL): brigadecore/brigade#1811
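A sketch of the live-database test setup, using the real `github.com/ory/dockertest/v3` and `go.mongodb.org/mongo-driver` packages to spin up a disposable MongoDB in Docker. Whether Brigade adopts dockertest or another harness is an open design choice:

```go
package storetest

import (
	"context"
	"fmt"
	"testing"
	"time"

	"github.com/ory/dockertest/v3"
	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
)

func TestEventsStoreAgainstLiveMongo(t *testing.T) {
	pool, err := dockertest.NewPool("")
	if err != nil {
		t.Fatal(err)
	}
	resource, err := pool.Run("mongo", "5.0", nil)
	if err != nil {
		t.Fatal(err)
	}
	defer pool.Purge(resource) // dispose of the container after the test

	uri := fmt.Sprintf("mongodb://localhost:%s", resource.GetPort("27017/tcp"))
	var client *mongo.Client
	// Retry until mongod inside the container is ready to accept connections.
	if err := pool.Retry(func() error {
		ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
		defer cancel()
		client, err = mongo.Connect(ctx, options.Client().ApplyURI(uri))
		if err != nil {
			return err
		}
		return client.Ping(ctx, nil)
	}); err != nil {
		t.Fatal(err)
	}

	// ... exercise the real data access code against client and assert that
	// the queries return logically correct results ...
	_ = client
}
```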
- Description: Brigade event gateways are peripheral components that are installed alongside Brigade to receive events from external systems, transform those events into Brigade events, and enqueue them on Brigade's event bus. Most gateways are small, simple programs (the basic shape is sketched below). On the more sophisticated end of the spectrum, gateways may go as far as utilizing the Brigade API to monitor the status of events they have created so they can report those statuses "upstream" to the original source of the event. This project invites candidates to propose and implement one or more new gateways.
- Expected outcome: The GA release of one or more new gateways that will be donated to the Brigade project
- Recommended Skills: Golang or TypeScript, Kubernetes, Helm
- Mentor(s): Kent Rancourt (@krancour)
- Expected project size: 175 hours
- Difficulty: Medium
- Upstream Issue (URL): brigadecore/brigade#1817
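A bare-bones sketch of a gateway's shape: receive a webhook from an external system and enqueue a corresponding Brigade event. The `EventsClient` interface is a hypothetical stand-in for the Brigade v2 SDK; the names and signatures here are assumptions:

```go
package main

import (
	"context"
	"io"
	"log"
	"net/http"
)

// EventsClient abstracts the Brigade API surface this gateway needs.
type EventsClient interface {
	Create(ctx context.Context, source, eventType string, payload []byte) error
}

func gatewayHandler(events EventsClient) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		payload, err := io.ReadAll(r.Body)
		if err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		// Transform the external webhook into a Brigade event and enqueue it.
		if err := events.Create(r.Context(), "example.org/my-gateway", "push", payload); err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		w.WriteHeader(http.StatusOK)
	}
}

func main() {
	var events EventsClient // TODO: construct this from the Brigade v2 SDK
	if events == nil {
		log.Fatal("no Brigade events client wired up")
	}
	http.HandleFunc("/webhook", gatewayHandler(events))
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```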
- Description: An early prototype exists for a web-based, Brigade v2-compatible dashboard application. It does not yet meet the high bar for quality that the Brigade project, as a whole, aspires to. With our core team's front-end bench strength somewhat lacking at the moment, this project invites a suitable candidate to take ownership of this application's principal development. Specific areas of focus will include improving error-handling, developing test suites, ensuring compatibility across popular browsers, improving the overall look, feel, and responsiveness of UI elements, improving the overall UX, and improving accessibility.
- Expected outcome: The GA release of the new dashboard
- Recommended Skills: HTML, CSS, TypeScript, React
- Mentor(s): Kent Rancourt (@krancour)
- Expected project size: 350 hours
- Difficulty: Medium
- Upstream Issue (URL): brigadecore/brigade-dashboard#4
- Description: Create a means to easily install and configure various combinations of cert-manager, external dependencies (e.g., ingress controllers), and cert-manager custom resources for local development and testing purposes.
- Detailed description: cert-manager is a Kubernetes addon that helps with management and issuance of TLS certificates in a Kubernetes cluster. In practice, such a TLS setup can involve a deployment of cert-manager, cert-manager custom resources, and some external tools, such as Vault or an ingress controller implementation, and all of these tools also need to be configured to work together. As cert-manager developers working on new cert-manager integrations, trying to reproduce bugs, etc., we often spend a significant amount of time setting up and cross-configuring cert-manager and these external tools. It would be great if we had a way to easily deploy cert-manager, cert-manager resources, and any required external tools, all configured to work for a particular TLS setup scenario.
- Expected outcomes:
  - Create a new installation mechanism that can install and configure resources for a few common Kubernetes TLS setup scenarios (an example scenario would be cert-manager + a cert-manager ACME `Issuer` + ingress-nginx, like in our nginx-ingress tutorial)
  - The new mechanism:
    - should be easy to update, so that it is straightforward for a developer who knows how to set up a particular scenario to share their knowledge with the team by adding new functionality to the installation mechanism
    - should allow for easy parameterization/modification of any of the deployed resources
    - if possible, should not involve developers having to learn a complex new language/framework
  - The implementation could be a CLI tool, a collection of scripts, a bunch of Terraform modules, or something else; we would like to involve the GSoC student in the design process and welcome students' ideas.
- Expected size of the project: 350h
- Difficulty rating: medium
- Recommended Skills: Kubernetes, Bash, Terraform, Go
- Mentor(s): @irbekrm
- Upstream Issue (URL): cert-manager/cert-manager#4855
- Description: Write an implementation of TAP 13 in the Go implementation of TUF. TUF metadata provides key management for developers who want to sign packages that they upload to a repository. However, this means that users are trusting the repository administrators to accurately portray the correct signing key for each package. TAP 13 reduces trust in repository administrators by adding support for user-managed keys to TUF, allowing users to override the key management done by the repository to trust only a subset of images on that repository. The implementation will be built on the new python-tuf client.
- Expected outcome: An extension is added to the python-tuf client that allows users to specify TAP 13 metadata.
- Recommended Skills: Go
- Mentors: Marina Moore (@mnm678)
- Expected project size: 350 Hours
- Difficulty: Medium
- Upstream Issue: theupdateframework/taps#137
- Description: Implement version management for TUF’s python reference implementation. The implementation does not currently have a way to migrate TUF repositories or clients to a new TUF version that has breaking changes. This project will be the implementation of a proposal for coordinating specification versions between a repository and a client to prevent interruptions in access to updates after a major version change to the specification.
- Expected outcome: Version management for python-tuf is implemented.
- Recommended Skills: Python
- Mentors: Marina Moore (@mnm678), Zack Newman (@znewman01)
- Expected project size: 350 Hours
- Difficulty: Medium
- Upstream Issue: theupdateframework/taps#136
- Description: Improve hashed bin delegations in TUF by adding succinct hashed bin delegations to the python-tuf reference implementation. TUF delegations allow the repository to specify which developer (or signing key) is associated with which packages downloaded using TUF. Hashed bin delegations allow TUF to delegate many projects to the same signing keys more efficiently (the bin-assignment idea is sketched below). However, the current implementation has a lot of duplicated metadata that can be simplified without affecting the functionality of this feature. This project would implement these simplifications to hashed bin delegations.
- Expected outcome: Succinct hashed bin delegations are implemented in python-tuf
- Recommended Skills: Python
- Mentors: Marina Moore (@mnm678), Lukas Pühringer (@lukpueh), Zack Newman (@znewman01)
- Expected project size: 350 Hours
- Difficulty: Medium
- Upstream Issue: theupdateframework/taps#132
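An illustrative sketch (in Go, though the reference implementation is Python) of the core idea behind hashed bin delegations: a target path is assigned to one of 2^k bins by a prefix of its hash, so one small rule can describe every bin succinctly instead of duplicating per-bin metadata:

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// binIndex maps a target path to its bin among 1<<bitLen bins.
func binIndex(targetPath string, bitLen uint) uint32 {
	sum := sha256.Sum256([]byte(targetPath))
	prefix := binary.BigEndian.Uint32(sum[:4])
	return prefix >> (32 - bitLen)
}

func main() {
	// With bitLen=8 there are 256 bins named e.g. "bin-00" .. "bin-ff".
	idx := binIndex("packages/example-1.0.tar.gz", 8)
	fmt.Printf("delegated role: bin-%02x\n", idx)
}
```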
- Description: The Metadata API of the python-tuf reference implementation provides a modern API for accessing individual pieces of TUF metadata. It does not, however, provide any wider context help to someone looking to implement a TUF repository. The goal of this project is to implement a minimal repository abstraction that includes carefully selected core functionality, without implementing all repository actions itself. Instead it should become easy for application code on top of such an abstraction to perform those actions autonomously, while maintaining compliance with the TUF specification.
- Expected outcome: A minimal repository abstraction is implemented.
- Recommended Skills: Python
- Mentors: Lukas Pühringer (@lukpueh)
- Expected project size: 350 Hours
- Difficulty: Hard
- Upstream Issue: theupdateframework/python-tuf#1136
- Description: Keylime enables users to monitor remote nodes (file integrity and measured boot) using a hardware-based cryptographic root of trust. Keylime currently operates on a pull basis, which means that the tenant or verifier connects to the agent to collect attestation data. This works fine in most virtualized environments where all the devices are in the same network, but not for edge devices or in BYOD contexts. This work would allow remote nodes to work in a "push" model instead of the normal "pull" model.
- Expected outcome: An implementation, corresponding tests and documentation for an agent-push model.
- Recommended Skills: Python, Security
- Mentor(s): Thore Sommer (@THS-on), Michael Peters (@mpeters), Marcio Silva (@maugustosilva)
- Expected project size: 350 Hours
- Difficulty: Medium
- Upstream Issue (URL): keylime/enhancements#60
- Description: Keylime enables users to monitor remote nodes (file integrity and measured boot) using a hardware-based cryptographic root of trust. Keylime currently uses "atomic quotes" of PCRs from TPM security modules, which can cause some extra churn in attestation and extra work for the TPM itself. These atomic quotes are not strictly necessary, and removing them would help the performance and scalability of the verification and also mean less work for the target agents.
- Expected outcome: A new configuration option for the Keylime verifier that would tell agents that generating an atomic quote is not necessary, with the verifier completing the attestation without one, along with corresponding tests.
- Recommended Skills: Python, Security, Trusted Platform Modules (TPM)
- Mentor(s): Thore Sommer (@THS-on), Michael Peters (@mpeters)
- Expected project size: 175
- Difficulty: Medium
- Upstream Issue (URL): keylime/enhancements#59
- Description: Keylime enables users to monitor remote nodes (file integrity and measured boot) using a hardware based cryptographic root of trust. Most of the interactions with Keylime are via the CLI or REST APIs. There exists a bare bones web UI but it is limited in usability and usefulness. This task would be to overhaul and improve the UI to make it more usable and attractive.
- Expected outcome: A design and implementation for a new web UI to make it easier to interact with the various keylime services in a single place.
- Recommended Skills: Python, Javascript, Web UIs
- Mentor(s): Michael Peters (@mpeters)
- Expected project size: 175
- Difficulty: Easy
- Upstream Issue (URL): keylime/enhancements#61
- Description: Keylime enables users to monitor remote nodes (file integrity and measured boot) using a hardware based cryptographic root of trust. Various keylime components currently log events and information in a text log on the machine where the process is running. Not only does this make it challenging in a distributed environment, but it is also difficult to parse through the unstructured data looking for specific historical events. We would like to create structured events for every state change in keylime (new agent registered, agent passes attestation, agent fails attestation, etc) and send those to a 3rd party system like ElasticSearch. This will allow creating more detailed dashboards as well as historical event logs for forensic analysis.
- Expected outcome: New integrations and corresponding tests for integration with an ElasticSearch like backend to record all attestation actions and phases for each target managed by Keylime.
- Recommended Skills: Python, ElasticSearch
- Mentor(s): Michael Peters (@mpeters)
- Expected project size: 175
- Difficulty: Medium
- Upstream Issue (URL): keylime/enhancements#62
- Description: The current implementation of Thanos Receive uses non-consistent hashing to distribute metrics across distributed ingesting replicas, and the scalability this enables can be improved. Because the hash is not consistent, every change in the number of replicas considered for metric distribution can cause larger memory utilization for short periods. In the past, we tried to mitigate this problem by flushing the replica content to object storage, which caused delays in scale-out and scale-down. Switching to a consistent hash implementation like a hash ring would allow us to mitigate this issue without delaying the scaling process (a compact hash-ring sketch follows below). This should improve the life of Thanos Receive users by enabling easier auto-scaling capabilities for production systems that have to react to different metric consumption characteristics. Join us in the effort of moving the hashing method to a consistent one!
- Expected outcome: Thanos Receive uses a consistent hashing method instead of a non-consistent one.
- Recommended Skills: Golang, Distributed Systems
- Mentor(s): Lucas Servén Marín (@squat), Prem Saraswat (@onprem), Matej Gera (@matej-g)
- Expected project size: 175
- Difficulty: Medium
- Upstream Issue (URL): thanos-io/thanos#4972
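A compact sketch of the consistent-hashing technique proposed here: replicas are hashed onto a ring (with virtual nodes), and a series is assigned to the first replica clockwise from its hash, so adding or removing one replica only remaps a small fraction of series. This is a generic illustration, not Thanos' actual implementation:

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

type ring struct {
	hashes []uint32          // sorted virtual-node hashes
	owner  map[uint32]string // virtual-node hash -> replica
}

func hashOf(s string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(s))
	return h.Sum32()
}

// newRing places vnodes virtual nodes per replica on the ring.
func newRing(replicas []string, vnodes int) *ring {
	r := &ring{owner: map[uint32]string{}}
	for _, rep := range replicas {
		for i := 0; i < vnodes; i++ {
			h := hashOf(fmt.Sprintf("%s-%d", rep, i))
			r.hashes = append(r.hashes, h)
			r.owner[h] = rep
		}
	}
	sort.Slice(r.hashes, func(i, j int) bool { return r.hashes[i] < r.hashes[j] })
	return r
}

// get returns the replica owning the given series key.
func (r *ring) get(seriesKey string) string {
	h := hashOf(seriesKey)
	i := sort.Search(len(r.hashes), func(i int) bool { return r.hashes[i] >= h })
	if i == len(r.hashes) {
		i = 0 // wrap around the ring
	}
	return r.owner[r.hashes[i]]
}

func main() {
	r := newRing([]string{"receive-0", "receive-1", "receive-2"}, 64)
	fmt.Println(r.get(`up{instance="a"}`))
}
```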
- Description: Right now, during the compaction/downsampling stage, the Thanos Compactor always downloads TSDB blocks to local disk first and compacts/downsamples them later. This download process takes a long time if the data is large. It is in fact feasible to read the data we need directly from object storage and perform the action on the fly. This improvement will help save a lot of disk space in larger deployments.
- Expected outcome: On-the-fly compaction/downsampling feature is implemented and production ready.
- Recommended Skills: Golang, Distributed Systems
- Mentor(s): Ben Ye (@yeya24), Matej Gera (@matej-g)
- Expected project size: 175 Hours
- Difficulty: Medium
- Upstream Issue (URL): thanos-io/thanos#3406
- Description: LitmusChaos is an open-source Chaos Engineering platform that enables teams to identify weaknesses & potential outages in infrastructures by inducing chaos tests in a controlled way. This project aims to develop a Terraform provider or scripts to provision LitmusChaos functionality (a provider skeleton is sketched after this entry).
- Expected outcomes: Develop a Terraform provider (along with the documentation) on top of the Helm provider to provision Litmus with the following operations:
- Install chaoscenter on k8s
- Install chaos agent via helm
- Add chaoshub to ChaosCenter via API provider
- Run chaos workflows via API provider
- Recommended Skills: Terraform, Kubernetes, Golang
- Mentor(s): Vedant Shrotria (@Jonsy13), Raj Babu Das (@rajdas98), Adarsh Kumar (@Adarshkumar14)
- Expected project size: 350 Hours
- Difficulty: Medium
- Upstream Issue (URL): litmuschaos/litmus#3456
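A skeletal sketch of what such a provider could look like, using the real `github.com/hashicorp/terraform-plugin-sdk/v2` helper/schema package; the resource name and fields are hypothetical:

```go
package main

import (
	"github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema"
	"github.com/hashicorp/terraform-plugin-sdk/v2/plugin"
)

func provider() *schema.Provider {
	return &schema.Provider{
		ResourcesMap: map[string]*schema.Resource{
			// e.g. a Terraform resource "litmuschaos_chaos_workflow"
			"litmuschaos_chaos_workflow": {
				Schema: map[string]*schema.Schema{
					"name":     {Type: schema.TypeString, Required: true, ForceNew: true},
					"manifest": {Type: schema.TypeString, Required: true, ForceNew: true},
				},
				Create: func(d *schema.ResourceData, m interface{}) error {
					// Call the ChaosCenter API here to run the workflow.
					d.SetId(d.Get("name").(string))
					return nil
				},
				Read:   func(d *schema.ResourceData, m interface{}) error { return nil },
				Delete: func(d *schema.ResourceData, m interface{}) error { return nil },
			},
		},
	}
}

func main() {
	plugin.Serve(&plugin.ServeOpts{ProviderFunc: provider})
}
```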
- Description: Meshery is the open source service mesh management plane that enables the adoption, operation, and management of any service mesh and its workloads. The Service Mesh Catalog project provides a place for users to consume, and publishers to share, WebAssembly filters, Service Mesh Patterns, and eBPF programs.
- Expected outcome: Create a centralized catalog of Patterns, WebAssembly filters, and eBPF programs which lets the user import, edit, and deploy patterns.
- Recommended Skills: Reactjs, TypeScript, Golang
- Mentor(s): Lee Calcote, Aditya Chatterjee
- Expected project size: 350 Hours
- Difficulty: Medium
- Upstream Issue (URL): meshery/meshery.io#677
- Description: Service Mesh Performance is a standard for capturing and characterizing the details of infrastructure capacity, service mesh configuration, and workload metadata. Service Mesh Performance test results capture rich performance profiles and real-time statistical analysis of microservices. A dashboard of these results is needed, in which all service mesh projects are represented, analyzed, and charted visually in order to give better insight into the performance characteristics and value provided by cloud native infrastructure.
- Expected outcome: A dashboard facilitating Service Mesh Performance profiles and results which are analyzed and charted visually to give better insight into performance results.
  - Design a backend for facilitating the dashboard data
  - Design a frontend for a visually appealing dashboard
- Recommended Skills: Reactjs, Chartjs, DevOps, GitHub Actions
- Mentor(s): Lee Calcote, Xin Huang
- Expected project size: 350 Hours
- Difficulty: Medium
- Upstream Issue (URL): service-mesh-performance/service-mesh-performance#272
- Description: Today, Knative Eventing has some support for observability [1][2][3], but it is piecewise, and user needs for end-to-end observability are not fully addressed [4][5][6]. The idea is to find out what the biggest gaps in Knative Eventing are by asking the community, and then to find a few "low-hanging fruit" issues that can be solved over the summer, with the overall goal to simplify end-to-end observability by improving support for OpenTelemetry in Knative Eventing and/or creating new plugin(s) for the kn CLI. We see that work being accomplished in a few stages. An initial stage (a few weeks) of getting used to Knative Eventing with a few sample applications, and reaching out to the Knative Eventing community via the mailing list and Knative Slack to ask for interviews or feedback on the biggest problems with observability in Knative Eventing. Then a first stage (a few weeks) to identify the highest-priority problem that can be solved in 1-2 weeks (small size), share the solution, and gather feedback. A possible second stage (if time allows) to improve the features from the previous stage and/or address a larger problem in 3-4 weeks. And a final stage (the last weeks of summer) to write Knative docs and blog posts and advertise the results.
- Expected outcome: Improved e2e Knative Eventing observability documented and described in one or more blog posts
- Expected size of the project: 350h
- Difficulty rating: Medium
- Recommended Skills: Golang skill level: Intermediate to Advanced, Kubernetes: Intermediate to Advanced, familiarity with projects: Knative Eventing, Knative Client, Prometheus, Jaeger, OpenTelemetry
- Mentor(s): Aleksander Slominski @aslom, Ansu Varghese @aavarghese, and Lionel Villard @lionelvillard
- Upstream Issue (URL): knative/eventing#6247
- Description: More and more workload is moving towards running on the edge. We have seen experiments running Kubernetes on vehicles, fighter jets, 5G antennas, and various far edge, near edge, and fat edge environments. We would like to see what the challenges are when Knative is run in a resource-limited environment. While there are multiple edge-friendly Kubernetes distributions, we would like to see k0s used as the base platform. Fixes should also go into the mink project, which is a minimal Knative+Tekton distribution. Knative consists of Serving and Eventing modules, but focusing on Serving as a first step is a better idea. Stages:
- Run Knative on k0s with minimal resources: Find out problems here, solve them.
- Run mink on k0s: Get the fixes from the previous stage into mink to make it run on k0s too.
- Merge k0s and mink into a single binary
- Stretch goal: Find out what happens with architectures other than x86_64.
- Expected outcome: Finding issues blocking a minimal Knative distribution, and possibly fixing them. If there are none, actually preparing that distribution and running experiments with it. As a stretch goal, improve the Knative CI to produce images that can run on architectures other than x86_64.
- Expected size of the project: 350h
- Difficulty rating: Hard
- Recommended Skills: Golang, Kubernetes, Knative, Kubernetes Controllers
- Mentor(s): Ali Ok @aliok, Carlos Santana @csantanapr
- Upstream Issue (URL): knative/serving#12718