
refactor!(apis): add WALProvider and FileStorage #180

Merged
merged 3 commits on Sep 20, 2024

Conversation

zyy17 (Collaborator) commented on Sep 18, 2024

What's changed

APIs Breaking

  1. Add WALProvider to support RaftEngineWAL and KafkaWAL;
  2. Add FileStorage to support creating a standalone WAL or cache on different file system storages (see the sketch below);
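
For orientation, a minimal Go sketch of the new API surface, based on the field and type names discussed in the review below; the JSON tags and exact shapes are assumptions, and the authoritative definitions live in apis/v1alpha1:

// Sketch only; see apis/v1alpha1/common.go for the real definitions.
// WALProviderSpec selects exactly one WAL implementation.
type WALProviderSpec struct {
    // RaftEngine configures a local (raft-engine) WAL.
    RaftEngine *RaftEngineWAL `json:"raftEngine,omitempty"`
    // Kafka configures a remote (Kafka) WAL.
    Kafka *KafkaWAL `json:"kafka,omitempty"`
}

// FileStorage describes a file-system-backed volume used for data, WAL, or cache.
type FileStorage struct {
    Name                string                  `json:"name,omitempty"`
    StorageClassName    *string                 `json:"storageClassName,omitempty"`
    StorageSize         string                  `json:"storageSize,omitempty"`
    MountPath           string                  `json:"mountPath,omitempty"`
    StorageRetainPolicy StorageRetainPolicyType `json:"storageRetainPolicy,omitempty"`
}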

Enhancement

  1. Update the code comments of the APIs;
  2. Add multiple field accessors for GreptimeDBCluster and GreptimeDBStandalone;

Summary by CodeRabbit

  • New Features

    • Enhanced API definitions for GreptimeDB with expanded phase and component specifications.
    • Introduced new storage options, including detailed configurations for file and object storage.
    • Added support for Write-Ahead Logging (WAL) with both Raft and Kafka options.
    • Introduced a new configuration for standalone GreptimeDB instances with comprehensive specifications.
    • Added a new end-to-end test for GreptimeDB cluster with standalone WAL configuration.
  • Bug Fixes

    • Improved validation logic for configuration parameters, ensuring required fields are enforced.
  • Documentation

    • Updated API references with detailed descriptions of new and modified structures.
    • Enhanced documentation for configuration schemas and requirements.
    • Added examples for standalone configurations to enhance user guidance.
  • Chores

    • Refactored code for better clarity and maintainability, including the consolidation of constants and utility functions.

coderabbitai bot (Contributor) commented on Sep 18, 2024

Walkthrough

The changes encompass extensive updates across multiple files in the apis/v1alpha1 package, introducing new structures for storage and logging configurations, enhancing API definitions, and enforcing stricter requirements for various fields. Documentation has been significantly improved for clarity, with new sections added to describe the updated configurations for both GreptimeDB clusters and standalone instances. Additionally, configuration files have been restructured to better organize related properties and enforce required fields.

Changes

Files Change Summary
apis/v1alpha1/common.go, apis/v1alpha1/greptimedbcluster_types.go, apis/v1alpha1/greptimedbstandalone_types.go Introduced new structures for FileStorage, WALProviderSpec, and others; updated existing structs to enforce required fields and improve clarity.
config/crd/resources/greptime.io_greptimedbclusters.yaml, config/crd/resources/greptime.io_greptimedbstandalones.yaml, manifests/bundle.yaml, manifests/crds.yaml Restructured storage properties into a nested fs object; added required fields for various configurations; removed deprecated fields.
docs/api-references/docs.md Expanded documentation for new and modified structures; clarified requirements for various components, including ComponentSpec and Storage types.
controllers/..., tests/e2e/... Updated deployment logic and test cases to accommodate new configurations and ensure correct handling of Persistent Volume Claims (PVCs) and standalone WAL setups.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant GreptimeDBCluster
    participant Storage
    participant WALProvider

    User->>GreptimeDBCluster: Create Cluster
    GreptimeDBCluster->>Storage: Configure Storage
    GreptimeDBCluster->>WALProvider: Setup WAL
    Storage-->>GreptimeDBCluster: Storage Ready
    WALProvider-->>GreptimeDBCluster: WAL Ready
    GreptimeDBCluster-->>User: Cluster Created

🐰 In the garden, changes bloom bright,
New structures and constants take flight.
With clusters and storage all set,
A rabbit’s delight, no need for regret!
Hops of joy, we cheer and play,
For GreptimeDB's bright new day! 🌼


Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 79fee25 and e677524.

Files selected for processing (10)
  • controllers/greptimedbcluster/deployers/datanode.go (7 hunks)
  • controllers/greptimedbstandalone/deployer.go (9 hunks)
  • tests/e2e/greptimedbcluster/test_basic_cluster.go (1 hunks)
  • tests/e2e/greptimedbcluster/test_cluster_enable_flow.go (1 hunks)
  • tests/e2e/greptimedbcluster/test_cluster_enable_remote_wal.go (1 hunks)
  • tests/e2e/greptimedbcluster/test_cluster_standalone_wal.go (1 hunks)
  • tests/e2e/greptimedbcluster/test_scale_cluster.go (1 hunks)
  • tests/e2e/greptimedbstandalone/test_basic_standalone.go (1 hunks)
  • tests/e2e/helper/helper.go (2 hunks)
  • tests/e2e/testdata/resources/cluster/standalone-wal/cluster.yaml (1 hunks)
Files skipped from review as they are similar to previous changes (8)
  • controllers/greptimedbcluster/deployers/datanode.go
  • controllers/greptimedbstandalone/deployer.go
  • tests/e2e/greptimedbcluster/test_basic_cluster.go
  • tests/e2e/greptimedbcluster/test_cluster_enable_flow.go
  • tests/e2e/greptimedbcluster/test_cluster_enable_remote_wal.go
  • tests/e2e/greptimedbcluster/test_cluster_standalone_wal.go
  • tests/e2e/helper/helper.go
  • tests/e2e/testdata/resources/cluster/standalone-wal/cluster.yaml
Additional comments not posted (3)
tests/e2e/greptimedbstandalone/test_basic_standalone.go (2)

86-86: Correct usage of ObjectKey fields.

The change from using testStandalone.Namespace for both the name and namespace fields of the ObjectKey to using testStandalone.Name for the name field is correct.

The ObjectKey should contain the actual name of the standalone object in the Name field and the namespace in the Namespace field to properly retrieve the object.
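
For reference, the corrected construction looks roughly like this (client is sigs.k8s.io/controller-runtime/pkg/client, whose ObjectKey is an alias of types.NamespacedName; k8sClient and ctx are placeholders):

// Name carries the object's name and Namespace its namespace;
// the old code passed testStandalone.Namespace for both fields.
key := client.ObjectKey{
    Name:      testStandalone.Name,
    Namespace: testStandalone.Namespace,
}
err := k8sClient.Get(ctx, key, standalone) // fetch the standalone object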


90-90: Update GetPVCs method with options parameter.

The addition of the options *Options parameter to the GetPVCs method provides flexibility for specifying additional options when retrieving PVCs.

Passing nil as the value for options in this case indicates that no specific options are being used, maintaining the default behavior.

The change is appropriate and maintains compatibility with the existing usage of GetPVCs.

tests/e2e/greptimedbcluster/test_scale_cluster.go (1)

160-160: Approve the fix for the incorrect client.ObjectKey usage.

The change correctly uses testCluster.Name for the Name field and testCluster.Namespace for the Namespace field when constructing the client.ObjectKey. This ensures that the correct cluster is targeted when checking for deletion.

Using testCluster.Namespace for both fields, as in the original code, would result in an incorrect lookup and potentially lead to false positives or false negatives in the test results.



coderabbitai bot (Contributor) left a comment

Actionable comments posted: 2

Outside diff range and nitpick comments (3)
apis/v1alpha1/testdata/greptimedbstandalone/test00/input.yaml (1)

1-9: LGTM with a suggestion!

The YAML configuration follows the expected structure for a Kubernetes resource, specifying the necessary fields such as apiVersion, kind, metadata, and spec. The main component is correctly configured to use the greptime/greptimedb Docker image.

However, consider using a specific version tag for the Docker image instead of latest to ensure consistent behavior across deployments.

spec:
  base:
    main:
-     image: greptime/greptimedb:latest
+     image: greptime/greptimedb:v1.0.0
apis/v1alpha1/testdata/greptimedbstandalone/test00/expect.yaml (1)

21-27: Datanode storage configuration looks good!

The data home directory and mount path are configured correctly, and the "Retain" policy ensures data persistence.

The allocated storage size of 10Gi should be sufficient for testing, but please remember to review and adjust the storage requirements based on your production needs.

pkg/dbconfig/meta_config.go (1)

81-84: Consider adding validation or processing of the input config.

The SetInputConfig method correctly sets the InputConfig field of the MetaConfig struct. However, consider adding validation or processing of the inputConfig parameter before assigning it to ensure the input config is valid and well-formed.
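
A minimal sketch of such validation, assuming the input config is TOML and using the github.com/BurntSushi/toml parser; the real setter may have a different signature:

// SetInputConfig rejects input that does not parse as TOML before storing it.
func (c *MetaConfig) SetInputConfig(inputConfig string) error {
    var probe map[string]interface{}
    if err := toml.Unmarshal([]byte(inputConfig), &probe); err != nil {
        return fmt.Errorf("invalid input config: %w", err)
    }
    c.InputConfig = inputConfig
    return nil
}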

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between acf65ed and 3731472.

Files selected for processing (39)
  • apis/v1alpha1/common.go (9 hunks)
  • apis/v1alpha1/constants.go (1 hunks)
  • apis/v1alpha1/defaulting.go (7 hunks)
  • apis/v1alpha1/defaulting_test.go (1 hunks)
  • apis/v1alpha1/greptimedbcluster_types.go (6 hunks)
  • apis/v1alpha1/greptimedbstandalone_types.go (2 hunks)
  • apis/v1alpha1/testdata/greptimedbcluster/test00/expect.yaml (3 hunks)
  • apis/v1alpha1/testdata/greptimedbcluster/test00/input.yaml (1 hunks)
  • apis/v1alpha1/testdata/greptimedbcluster/test01/expect.yaml (2 hunks)
  • apis/v1alpha1/testdata/greptimedbcluster/test01/input.yaml (1 hunks)
  • apis/v1alpha1/testdata/greptimedbcluster/test02/expect.yaml (3 hunks)
  • apis/v1alpha1/testdata/greptimedbcluster/test02/input.yaml (1 hunks)
  • apis/v1alpha1/testdata/greptimedbstandalone/test00/expect.yaml (1 hunks)
  • apis/v1alpha1/testdata/greptimedbstandalone/test00/input.yaml (1 hunks)
  • apis/v1alpha1/zz_generated.deepcopy.go (12 hunks)
  • config/crd/resources/greptime.io_greptimedbclusters.yaml (11 hunks)
  • config/crd/resources/greptime.io_greptimedbstandalones.yaml (7 hunks)
  • controllers/common/common.go (2 hunks)
  • controllers/greptimedbcluster/controller.go (2 hunks)
  • controllers/greptimedbcluster/deployers/datanode.go (7 hunks)
  • controllers/greptimedbcluster/deployers/frontend.go (1 hunks)
  • controllers/greptimedbstandalone/controller.go (1 hunks)
  • controllers/greptimedbstandalone/deployer.go (9 hunks)
  • docs/api-references/docs.md (24 hunks)
  • examples/README.md (1 hunks)
  • examples/cluster/kafka-remote-wal/cluster.yaml (1 hunks)
  • examples/cluster/standalone-wal/cluster.yaml (1 hunks)
  • manifests/greptimedb-operator-crd.yaml (18 hunks)
  • manifests/greptimedb-operator-deployment.yaml (18 hunks)
  • pkg/dbconfig/datanode_config.go (3 hunks)
  • pkg/dbconfig/dbconfig.go (3 hunks)
  • pkg/dbconfig/dbconfig_test.go (4 hunks)
  • pkg/dbconfig/flownode_config.go (1 hunks)
  • pkg/dbconfig/frontend_config.go (1 hunks)
  • pkg/dbconfig/meta_config.go (1 hunks)
  • pkg/dbconfig/metasrv_config.go (0 hunks)
  • pkg/dbconfig/standalone_config.go (3 hunks)
  • tests/e2e/testdata/resources/cluster/enable-remote-wal/cluster.yaml (1 hunks)
  • tests/e2e/testdata/resources/cluster/scale/cluster.yaml (1 hunks)
Files not reviewed due to no reviewable changes (1)
  • pkg/dbconfig/metasrv_config.go
Files skipped from review due to trivial changes (2)
  • controllers/greptimedbstandalone/controller.go
  • examples/README.md
Additional context used
LanguageTool
docs/api-references/docs.md

[style] ~591-~591: ‘prior to’ might be wordy. Consider a shorter alternative.
Context: .../>Init containers are executed in order prior to containers being started. If any
i...

(EN_WORDINESS_PREMIUM_PRIOR_TO)


[style] ~591-~591: ‘taken into account’ might be wordy. Consider a shorter alternative.
Context: ...ceRequirements of an init container are taken into account during scheduling
by finding the h...

(EN_WORDINESS_PREMIUM_TAKEN_INTO_ACCOUNT)


[style] ~591-~591: Consider replacing this phrase with the adverb “similarly” to avoid wordiness.
Context: ...its are applied to init containers
in a similar fashion.
Init containers cannot currently ...

(IN_A_X_MANNER)


[style] ~696-~696: ‘prior to’ might be wordy. Consider a shorter alternative.
Context: .../>Init containers are executed in order prior to containers being started. If any
i...

(EN_WORDINESS_PREMIUM_PRIOR_TO)


[style] ~696-~696: ‘taken into account’ might be wordy. Consider a shorter alternative.
Context: ...ceRequirements of an init container are taken into account during scheduling
by finding the h...

(EN_WORDINESS_PREMIUM_TAKEN_INTO_ACCOUNT)


[style] ~696-~696: Consider replacing this phrase with the adverb “similarly” to avoid wordiness.
Context: ...its are applied to init containers
in a similar fashion.
Init containers cannot currently ...

(IN_A_X_MANNER)

Additional comments not posted (187)
apis/v1alpha1/testdata/greptimedbcluster/test00/input.yaml (3)

11-11: Verify if reducing the frontend replicas to 1 is intentional.

Reducing the number of frontend replicas from 3 to 1 eliminates redundancy and could lead to downtime if the single instance fails. This might be acceptable for testing or development but is not recommended for production environments.

Please confirm if this change is intentional and ensure that the reduced replica count is not used in production clusters.


13-14: LGTM!

The addition of the etcdEndpoints field is a positive change that integrates the GreptimeDB cluster with an external etcd cluster for service discovery and configuration management. The specified endpoint correctly points to the etcd service within the same Kubernetes cluster.


17-17: LGTM!

Increasing the number of datanode replicas from 1 to 3 is a positive change that enhances the data storage capacity, fault tolerance, and read performance of the GreptimeDB cluster. This aligns with best practices for production deployments of distributed databases.

examples/cluster/kafka-remote-wal/cluster.yaml (1)

Line range hint 17-21: Verify that all relevant places are updated to reflect the WAL configuration change.

Renaming remoteWal to wal and flattening the configuration structure improves clarity and aligns the configuration better with its purpose.

However, please ensure that:

  • All relevant documentation is updated to reflect this change.
  • Existing configurations or deployments using the old remoteWal structure are migrated to prevent breakage.
examples/cluster/standalone-wal/cluster.yaml (1)

1-23: LGTM!

The configuration file is well-structured and provides a solid foundation for deploying a GreptimeDB cluster with a standalone WAL. The use of the io2 storage class and the allocation of 5Gi storage for the WAL demonstrate a focus on performance and reliability.

The configuration of the frontend, metadata, and data node components with a single replica each is suitable for a minimal yet functional cluster setup.

Overall, this configuration file is a good example of how to set up a GreptimeDB cluster with a standalone WAL for enhanced write performance and data durability.

apis/v1alpha1/testdata/greptimedbcluster/test02/input.yaml (2)

22-28: Great addition! Verify the S3 configuration.

The introduction of the objectStorage section with an s3 configuration is a valuable addition to the system, enabling integration with S3-compatible object storage for handling large datasets and facilitating backup and restore operations.

To ensure the S3 configuration is valid and accessible, consider the following verification steps:

  1. Verify that the specified S3 bucket (greptimedb) exists and is accessible using the provided endpoint, region, and root path.

  2. Ensure that the referenced Kubernetes secret (s3-credentials) contains the necessary access credentials (access key and secret key) for authenticating with the S3 storage service.

  3. Run the following script to test the S3 connectivity and permissions:

Make sure to replace the placeholders with the actual values from the S3 configuration and ensure that the aws CLI tool is installed and properly configured with the access credentials.

Verification successful

S3 configuration looks good. Verify credentials manually.

The addition of the S3 object storage configuration is consistent with other examples in the codebase and aligns with the API references. The structure and parameters (bucket, endpoint, region, root, secretName) are correctly defined.

However, please ensure the following:

  1. Verify that the "s3-credentials" secret exists in the cluster and contains the correct AWS access key and secret key.
  2. Confirm that the specified S3 bucket "greptimedb" exists in the "us-west-2" region and is accessible with the provided credentials.
  3. Test the connection to the S3 bucket in your environment to ensure proper access and permissions.
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the S3 configuration is valid and accessible.

# Test: Check if the S3 bucket is accessible using the provided configuration.
# Expect: Successful listing of objects in the specified bucket and path.
aws --endpoint-url=http://s3.amazonaws.com s3 ls s3://greptimedb/greptimedb/

Length of output: 149


Script:

#!/bin/bash
# Search for S3-related configurations in the codebase
echo "Searching for S3 configurations:"
rg -n -i 's3|objectstorage|bucket|endpoint|region|secretname' --type yaml

echo "\nChecking for S3 validation or test files:"
fd -e yaml -e go -e rs -e js -e py | rg -i 'test|spec|s3|storage'

echo "\nLooking for S3 setup documentation:"
rg -n -i 's3|object storage|setup|configuration' README.md docs

Length of output: 36688


14-15: LGTM! Verify the etcd endpoint accessibility.

The addition of the etcdEndpoints field is a good practice for managing critical cluster metadata using a dedicated etcd cluster.

To ensure the etcd endpoint is accessible, run the following script:

apis/v1alpha1/testdata/greptimedbstandalone/test00/expect.yaml (2)

1-5: LGTM!

The resource definition and metadata look good. The GreptimeDBStandalone resource is correctly defined with the appropriate API version, and the metadata specifies a valid name and namespace.


19-20: Service configuration looks good!

The "ClusterIP" service type is appropriate for internal access within the Kubernetes cluster.

apis/v1alpha1/testdata/greptimedbcluster/test01/input.yaml (4)

18-18: LGTM!

Explicitly setting the number of frontend replicas to 1 is a good practice to ensure the desired replica count is maintained.


26-26: LGTM!

Explicitly setting the number of meta replicas to 1 is a good practice to ensure the desired replica count is maintained.


28-28: LGTM!

Using the fully qualified domain name etcd.etcd-cluster.svc.cluster.local instead of etcd.default for the etcd endpoint is a better practice in a Kubernetes environment. This change aligns with the best practices and improves the connectivity within the cluster.


36-36: LGTM!

Explicitly setting the number of datanode replicas to 1 is a good practice to ensure the desired replica count is maintained.

tests/e2e/testdata/resources/cluster/scale/cluster.yaml (1)

35-36: LGTM!

The structural change to move storageRetainPolicy under the new fs key improves the organization and modularity of the configuration. It allows for potential future additions of filesystem-related settings while maintaining the existing functionality.

tests/e2e/testdata/resources/cluster/enable-remote-wal/cluster.yaml (1)

34-34: Verify the impact of renaming remoteWal to wal across the codebase and documentation.

The change looks good as it simplifies the configuration structure. However, please ensure that:

  • All code that parses this configuration is updated to use the new wal key instead of remoteWal.
  • All relevant documentation and examples are updated to reflect the new key name.

Run the following script to verify the usage of remoteWal across the codebase:

Verification successful

Renaming from 'remoteWal' to 'wal' has been successfully implemented and verified.

The change has been consistently applied across the codebase:

  • No occurrences of 'remoteWal' were found in any files.
  • The new 'wal:' key is present in multiple YAML files, including:
    • CRD definitions
    • Example configurations
    • Test data files

The renaming appears to be part of a broader, consistent change in the project's configuration structure. While automated checks show the change has been applied correctly, I recommend a final manual review of key files, such as:

  • ./config/crd/resources/greptime.io_greptimedbclusters.yaml
  • ./examples/cluster/kafka-remote-wal/cluster.yaml
  • ./manifests/greptimedb-operator-crd.yaml

This will ensure the semantic correctness of the changes in critical parts of the system.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify usage of `remoteWal` across the codebase.

# Test: Search for `remoteWal` usage in Go code. Expect: No occurrences of `remoteWal`.
rg --type go $'remoteWal'

# Test: Search for `remoteWal` usage in YAML files. Expect: No occurrences of `remoteWal`.
rg --type yaml $'remoteWal'

# Test: Search for `remoteWal` usage in Markdown files. Expect: No occurrences of `remoteWal`.
rg --type markdown $'remoteWal'

Length of output: 82


Script:

#!/bin/bash
# Description: Verify 'remoteWal' removal and 'wal' usage, and check search tool availability

# Test 1: Check if 'rg' is available
echo "Test 1: Checking 'rg' availability"
if command -v rg &> /dev/null; then
    echo "'rg' is available"
    rg --version
else
    echo "'rg' is not available"
fi

# Test 2: Search for 'remoteWal' using grep
echo -e "\nTest 2: Searching for 'remoteWal' using grep"
grep -R "remoteWal" .

# Test 3: Search for 'wal:' in YAML files
echo -e "\nTest 3: Searching for 'wal:' in YAML files"
grep -R "wal:" --include="*.yaml" --include="*.yml" .

# Test 4: List all YAML files in the repository
echo -e "\nTest 4: Listing all YAML files"
find . -name "*.yaml" -o -name "*.yml"

Length of output: 4286

apis/v1alpha1/testdata/greptimedbcluster/test00/expect.yaml (5)

22-22: LGTM!

Reducing the number of replicas for the frontend component aligns with the refactoring objective and the AI summary.


33-34: LGTM!

Adding the etcdEndpoints field to the meta section improves the configuration by specifying the etcd cluster endpoints for the GreptimeDB cluster. This change aligns with the AI summary.


38-38: LGTM!

Reducing the number of replicas for the meta component aligns with the refactoring objective and the AI summary.


49-49: LGTM!

Increasing the number of replicas for the datanode component aligns with the refactoring objective and the AI summary, suggesting a need for enhanced data handling capacity or redundancy.


52-56: LGTM!

Using a nested structure for defining the filesystem parameters in the storage configuration enhances clarity and organization. The fs field specifies the necessary filesystem configuration for the datanode component. This change aligns with the AI summary.

apis/v1alpha1/testdata/greptimedbcluster/test02/expect.yaml (4)

18-24: LGTM!

The addition of the objectStorage section with S3 storage parameters is a great enhancement to the storage capabilities of the GreptimeDB cluster. The use of a secret to store the S3 credentials is also a secure way to manage sensitive information.


44-45: Excellent addition!

The addition of the etcdEndpoints field under the meta section is a great enhancement to the configuration capabilities of the GreptimeDB cluster. The use of etcd can improve the cluster's resilience and service discovery, making it more robust and easier to manage.


64-68: Nice refactoring!

The modification of the storage structure to nest name, mountPath, storageRetainPolicy, and storageSize under a new fs field is a great refactoring that enhances the organization and readability of the configuration file. The change makes the configuration more intuitive and easier to manage without affecting the functionality of storage configuration.


14-17: Verify the field usage across the codebase.

The renaming of remoteWal to wal is a good change that clarifies the purpose of the field. However, ensure that all references to the old remoteWal field have been updated across the codebase to avoid any potential issues.

Run the following script to verify the field usage:

Verification successful

Field renaming from 'remoteWal' to 'wal' successfully verified

The renaming of 'remoteWal' to 'wal' has been consistently applied across the codebase:

  • The new 'wal' field is used in 9 YAML files, including CRDs, examples, and test data.
  • No occurrences of the old 'remoteWal' field or potential typos were found.
  • Affected files include:
    • Custom Resource Definitions
    • Operator deployment configurations
    • Example cluster configurations
    • Test data files

The change appears to be thorough and correctly implemented.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify all references to the old `remoteWal` field have been updated.

# Test: Search for the old field usage. Expect: No occurrences of `remoteWal`.
rg --type yaml 'remoteWal'

Length of output: 26


Script:

#!/bin/bash
# Description: Verify the usage of the new 'wal' field and check for potential typos

# Test 1: Search for occurrences of the new 'wal' field in YAML files
echo "Occurrences of 'wal' field in YAML files:"
rg --type yaml '^\s*wal:' -c

# Test 2: Look for potential typos or variations of 'remoteWal'
echo "Potential typos or variations of 'remoteWal':"
rg --type yaml 'remote[wW]al|remove[wW]al|remote[vV]al'

# Test 3: Get a list of files using the new 'wal' field
echo "Files using the new 'wal' field:"
rg --type yaml '^\s*wal:' -l

Length of output: 1386

pkg/dbconfig/frontend_config.go (1)

32-33: LGTM!

The changes simplify the condition and improve code readability while maintaining the same functional outcome. The implementation is accurate and enhances the overall quality of the code.

pkg/dbconfig/flownode_config.go (1)

35-40: LGTM!

The changes in the code segment are good:

  • The code segment has been simplified and now uses getter methods for accessing the configuration, which improves readability and maintainability.
  • The changes do not affect the functionality of the method.
apis/v1alpha1/testdata/greptimedbcluster/test01/expect.yaml (2)

52-52: LGTM!

The change to the etcd endpoint looks good. Using the specific internal cluster address should improve the reliability of the etcd connection.


79-83: Looks good!

The restructuring of the storage configuration improves clarity and maintainability by grouping the filesystem-related properties under the fs key. The values of the properties remain unchanged, so this should not have any functional impact.

pkg/dbconfig/meta_config.go (5)

26-41: LGTM!

The MetaConfig struct is well-defined with clear field names and purposes. The struct is exported appropriately and the fields are annotated with tomlmapping tags for proper serialization.


44-63: LGTM!

The ConfigureByCluster method correctly sets the MetaConfig fields based on the given GreptimeDBCluster instance. It handles the configuration of region failover, store key prefix, input config, and WAL provider settings appropriately. The method also returns an error if setting the input config fails, which is a good practice.


66-68: LGTM!

The ConfigureByStandalone method is a placeholder that correctly indicates no configuration is needed in standalone mode. The method is exported appropriately and returns nil to indicate no error.


71-73: LGTM!

The Kind method correctly returns the MetaComponentKind constant from the v1alpha1 package. The method is exported appropriately.


76-78: LGTM!

The GetInputConfig method correctly returns the InputConfig field of the MetaConfig struct. The method is exported appropriately.

pkg/dbconfig/dbconfig_test.go (2)

Line range hint 33-56: LGTM!

The test function correctly sets up the test cluster with the updated ObjectStorageProviderSpec and WALProviderSpec configurations. The expected configuration string includes the new endpoint and region fields for the S3 storage configuration, and the generated configuration is correctly compared with the expected configuration.


Line range hint 86-110: LGTM!

The test function correctly sets up the test cluster with the updated ObjectStorageProviderSpec configuration and extra logging configuration. The expected configuration string includes the new endpoint and region fields for the S3 storage configuration and the extra logging configuration, and the generated configuration is correctly compared with the expected configuration.

apis/v1alpha1/constants.go (2)

1-57: LGTM!

The constants for default configurations are well-defined and will be useful across the codebase. The chosen default values seem appropriate.


59-75: LGTM!

The constants for secret keys are well-defined and follow good naming conventions. Centralizing these keys as constants promotes consistency and maintainability.

apis/v1alpha1/defaulting_test.go (1)

87-145: LGTM!

The addition of the TestStandaloneSetDefaults function is a valuable enhancement to the testing framework. It ensures that default values are correctly applied to the GreptimeDBStandalone configuration.

The function follows the same structure and error handling as the existing TestClusterSetDefaults function, maintaining consistency in the codebase. The test data is also organized in a separate directory, keeping it distinct from the GreptimeDBCluster test data.

Overall, this is a well-structured and thorough test function.

pkg/dbconfig/standalone_config.go (5)

64-65: LGTM!

The change from util.StringPtr to pointer.String for converting string values to pointers is consistent and appropriate.


67-71: LGTM!

The change from util.StringPtr to pointer.String for converting string values to pointers is consistently applied across S3 storage configurations.


79-86: LGTM!

The change from util.StringPtr to pointer.String for converting string values to pointers is consistently applied across OSS storage configurations.


95-102: LGTM!

The change from util.StringPtr to pointer.String for converting string values to pointers is consistently applied across GCS storage configurations.


106-116: LGTM!

The change to use getter methods (GetWALDir and GetDataHome) instead of directly accessing fields enhances encapsulation and aligns with a more object-oriented approach.

pkg/dbconfig/dbconfig.go (4)

54-54: LGTM!

The change from MetasrvConfig to MetaConfig improves naming consistency and clarity.


122-126: LGTM!

The change to use v1alpha1.ServiceAccountKey improves maintainability by centralizing the constant definition. The updated error message ensures accurate reporting.


142-152: LGTM!

The changes to use v1alpha1.AccessKeyIDSecretKey and v1alpha1.SecretAccessKeySecretKey improve maintainability by centralizing the constant definitions. The updated error messages ensure accurate reporting.


Line range hint 1-1: Approve the removal of the constants.

The removal of the constants AccessKeyIDSecretKey, SecretAccessKeySecretKey, and ServiceAccountKey aligns with the changes made in the getServiceAccountKey and getOCSCredentials functions. Centralizing the constant definitions in the v1alpha1 package improves maintainability and facilitates easier updates in the future.

pkg/dbconfig/datanode_config.go (11)

20-21: LGTM!

The import statement is valid and follows the correct syntax. The k8s.io/utils/pointer package is a well-known utility package from the Kubernetes project.


70-71: LGTM!

The changes are consistent with the overall update to use the k8s.io/utils/pointer package for creating string pointers. This improves code clarity and reduces dependency on the util package.


73-77: LGTM!

The changes are consistent with the overall update to use the k8s.io/utils/pointer package for creating string pointers. This improves code clarity and reduces dependency on the util package.


85-86: LGTM!

The changes are consistent with the overall update to use the k8s.io/utils/pointer package for creating string pointers. This improves code clarity and reduces dependency on the util package.


88-92: LGTM!

The changes are consistent with the overall update to use the k8s.io/utils/pointer package for creating string pointers. This improves code clarity and reduces dependency on the util package.


101-101: LGTM!

The change is consistent with the overall update to use the k8s.io/utils/pointer package for creating string pointers. This improves code clarity and reduces dependency on the util package.


104-108: LGTM!

The changes are consistent with the overall update to use the k8s.io/utils/pointer package for creating string pointers. This improves code clarity and reduces dependency on the util package.


112-115: LGTM!

The changes improve code readability by using the getter method cluster.GetWALDir() instead of directly accessing the field. The logic for setting the WalDir field is correct and follows the expected behavior.


117-118: LGTM!

The changes improve code readability by using the getter method cluster.GetDatanode().GetDataHome() instead of directly accessing the field. The logic for setting the StorageDataHome field is correct and follows the expected behavior.


121-125: LGTM!

The changes improve code readability by using the getter method cluster.GetDatanode().GetConfig() instead of directly accessing the field. The logic for setting the InputConfig field is correct and follows the expected behavior. The error handling is appropriate and ensures that any errors during the configuration setting are propagated.


127-129: LGTM!

The changes improve code readability by using the getter methods instead of directly accessing the fields. The logic for setting the WalProvider and WalBrokerEndpoints fields is correct and follows the expected behavior when the Kafka WAL is enabled.

controllers/common/common.go (4)

27-39: LGTM!

The introduction of DatanodeFileStorageLabels, WALFileStorageLabels, and CacheFileStorageLabels provides a standardized way to categorize file storage types for different components. The consistent naming convention for the label keys ensures clarity and improves the organization of storage resources in the codebase.


172-190: LGTM!

The new function FileStorageToPVC is a useful addition that enhances the ability to create persistent volume claims dynamically based on file storage configurations. It encapsulates the logic for constructing a PersistentVolumeClaim object, making it reusable across the codebase. The use of FileStorageAccessor allows for flexibility in specifying storage configurations.
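
A rough sketch of what such a helper might look like; the signature and the accessor method names (GetName, GetStorageClassName, GetSize) are assumptions, and on k8s.io/api before v0.29 the PVC Resources field is corev1.ResourceRequirements rather than VolumeResourceRequirements:

// FileStorageToPVC builds a PersistentVolumeClaim from a file storage spec.
func FileStorageToPVC(fs v1alpha1.FileStorageAccessor, labels map[string]string) *corev1.PersistentVolumeClaim {
    return &corev1.PersistentVolumeClaim{
        ObjectMeta: metav1.ObjectMeta{
            Name:   fs.GetName(),
            Labels: labels, // e.g. DatanodeFileStorageLabels
        },
        Spec: corev1.PersistentVolumeClaimSpec{
            StorageClassName: fs.GetStorageClassName(),
            AccessModes:      []corev1.PersistentVolumeAccessMode{corev1.ReadWriteOnce},
            Resources: corev1.VolumeResourceRequirements{
                Requests: corev1.ResourceList{
                    corev1.ResourceStorage: resource.MustParse(fs.GetSize()),
                },
            },
        },
    }
}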


20-25: LGTM!

The addition of necessary import statements for resource and metav1 packages ensures that the code compiles correctly. The reorganization of import statements, grouping related packages together, enhances code readability and maintainability.


Line range hint 1-19: LGTM!

Updating the copyright year to 2024 ensures that the license information is accurate and reflects the current year. Maintaining the same package name provides consistency and avoids potential breaking changes in other parts of the codebase.

apis/v1alpha1/greptimedbstandalone_types.go (11)

27-31: LGTM!

The Service field being optional and using a pointer to ServiceSpec is appropriate.


35-37: LGTM!

The HTTPPort field being optional and using int32 for the port number is appropriate.


39-41: LGTM!

The RPCPort field being optional and using int32 for the port number is appropriate.


43-45: LGTM!

The MySQLPort field being optional and using int32 for the port number is appropriate.


47-49: LGTM!

The PostgreSQLPort field being optional and using int32 for the port number is appropriate.


51-53: LGTM!

The PrometheusMonitor field being optional and using a pointer to PrometheusMonitorSpec is appropriate.


55-57: LGTM!

The Version field being optional and using string for the version is appropriate.


59-61: LGTM!

The Initializer field being optional and using a pointer to InitializerSpec is appropriate.


63-65: LGTM!

The ObjectStorageProvider field being optional and using a pointer to ObjectStorageProviderSpec is appropriate.


67-69: LGTM!

The DatanodeStorage field being optional and using a pointer to DatanodeStorageSpec is appropriate.


71-73: LGTM!

The WALProvider field being optional and using a pointer to WALProviderSpec is appropriate.

apis/v1alpha1/defaulting.go (14)

23-23: LGTM!

Importing the k8s.io/utils/pointer package is a common practice in Kubernetes codebases. It provides utility functions for creating pointers to primitive types.


31-35: LGTM!

Setting the version of the GreptimeDBClusterSpec based on the image tag is a good practice for consistency. The code segment handles the case when the version is not explicitly set.


37-37: LGTM!

Using a separate method to define the default spec improves code readability and maintainability. The mergo.Merge function is a good choice for merging the default settings into the GreptimeDBClusterSpec.
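
The merge step being described typically reduces to a single call; a sketch, assuming defaultSpec returns the fully populated defaults:

// mergo.Merge fills only zero-valued fields of dst from src,
// so values the user set explicitly always win over the defaults.
if err := mergo.Merge(&in.Spec, in.defaultSpec()); err != nil {
    return err
}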


49-79: LGTM!

The defaultSpec method provides a comprehensive set of default values for creating a GreptimeDBCluster. Using constants for default values improves code maintainability and reduces duplication. The inclusion of a liveness probe for the main container and default values for the components ensures a proper initialization of the cluster.


82-92: LGTM!

The defaultFrontend method provides a consistent configuration for the frontend component of the GreptimeDBCluster. Setting default values for the template, replicas, and service type ensures a proper initialization of the frontend.


94-108: LGTM!

The defaultMeta method provides a consistent configuration for the meta component of the GreptimeDBCluster. Setting default values for the template, replicas, ports, and region failover ensures a proper initialization of the meta component. Enabling region failover by default when using a remote WAL provider improves the reliability and fault tolerance of the cluster.


110-120: LGTM!

The defaultDatanode method provides a consistent configuration for the datanode component of the GreptimeDBCluster. Setting default values for the template, replicas, ports, and storage ensures a proper initialization of the datanode. Using a separate function to define the default storage configuration improves code readability and maintainability.


122-129: LGTM!

The defaultFlownodeSpec method provides a consistent configuration for the flownode component of the GreptimeDBCluster. Setting default values for the template, replicas, and RPC port ensures a proper initialization of the flownode.


160-160: LGTM!

Setting the liveness probe port for the frontend component based on the configured HTTPPort ensures consistency and correctness. Using intstr.FromInt32 is the recommended way to convert an integer port number to an IntOrString type in Kubernetes.


174-174: LGTM!

Setting the liveness probe port for the meta component based on the configured HTTPPort ensures consistency and correctness. Using intstr.FromInt32 is the recommended way to convert an integer port number to an IntOrString type in Kubernetes.


188-188: LGTM!

Setting the liveness probe port for the datanode component based on the configured HTTPPort ensures consistency and correctness. Using intstr.FromInt32 is the recommended way to convert an integer port number to an IntOrString type in Kubernetes.


208-222: LGTM!

The SetDefaults method for GreptimeDBStandalone follows a similar structure and logic as the SetDefaults method for GreptimeDBCluster. Setting the version based on the image tag ensures consistency when the version is not explicitly specified. Merging the default spec provides a consistent configuration for the GreptimeDBStandalone.


224-251: LGTM!

The defaultSpec method for GreptimeDBStandalone provides a comprehensive set of default values for creating a standalone instance. Using constants for default values improves code maintainability and reduces duplication. The inclusion of a liveness probe for the main container and default values for the service type and datanode storage ensures a proper initialization of the standalone instance.


253-263: LGTM!

The defaultDatanodeStorage function provides a consistent configuration for the datanode storage. Using constants for default values improves code maintainability and reduces duplication. The default file storage configuration specifies the storage size, mount path, and retain policy, ensuring a proper initialization of the datanode storage.

controllers/greptimedbcluster/deployers/frontend.go (2)

252-252: LGTM!

The change enhances modularity by using the v1alpha1.TLSCrtSecretKey constant to construct the path for the TLS certificate file. This centralization of the constant definition potentially improves maintainability and reduces duplication across the codebase.


253-253: LGTM!

The change enhances modularity by using the v1alpha1.TLSKeySecretKey constant to construct the path for the TLS key file. This centralization of the constant definition potentially improves maintainability and reduces duplication across the codebase.

apis/v1alpha1/greptimedbcluster_types.go (10)

Line range hint 21-29: LGTM!

Making the Replicas field required with a minimum value of 1 is a good practice to ensure high availability and prevent accidental misconfiguration of zero replicas.


Line range hint 41-63: LGTM!

The changes to the MetaSpec struct look good:

  • Making EtcdEndpoints required ensures that the etcd cluster configuration is always provided.
  • The new optional fields provide more flexibility in configuring the meta component.

66-93: LGTM!

The new getter methods for the MetaSpec struct fields are implemented correctly:

  • They provide a convenient way to access the fields with default values.
  • The methods follow the Go naming convention and have appropriate nil checks.

99-127: LGTM!

The changes to the FrontendSpec struct look good:

  • The new optional fields Service and TLS provide more flexibility in configuring the frontend component.
  • The new getter methods are implemented correctly, providing a convenient way to access the fields with default values.

133-164: LGTM!

The changes to the DatanodeSpec struct look good:

  • The new optional fields RPCPort, HTTPPort, and Storage provide more flexibility in configuring the datanode component.
  • The new getter methods are implemented correctly, providing a convenient way to access the fields with default values.

176-181: LGTM!

The new GetConfig method for the FlownodeSpec struct is implemented correctly:

  • It provides a convenient way to access the Config field with a default value.
  • The method follows the Go naming convention and has an appropriate nil check.
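
The nil-safe getter pattern praised here is roughly the following (illustrative; the empty string stands in for whatever default the field uses):

// GetConfig tolerates a nil receiver so callers can chain getters
// without guarding every intermediate pointer.
func (in *FlownodeSpec) GetConfig() string {
    if in == nil {
        return ""
    }
    return in.Config
}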

197-324: LGTM!

The changes to the GreptimeDBClusterSpec struct look good:

  • Making Frontend, Meta, and Datanode required ensures that these essential components are always specified.
  • The new optional fields provide more flexibility in configuring the GreptimeDB cluster.
  • The new getter methods are implemented correctly, providing a convenient way to access the fields with default values.

329-360: LGTM!

Adding comments to the fields in GreptimeDBClusterStatus struct improves code readability and maintainability:

  • The comments provide a clear description of each field in the struct.
  • The comments follow the Go documentation convention.

364-400: LGTM!

Adding comments to the Replicas field in FrontendStatus, MetaStatus, DatanodeStatus, and FlownodeStatus structs improves code readability and maintainability:

  • The comments provide a clear description of the Replicas field in each struct.
  • The comments follow the Go documentation convention.

426-429: LGTM!

Adding comments to the Spec and Status fields in GreptimeDBCluster struct improves code readability and maintainability:

  • The comments provide a clear description of the Spec and Status fields in the struct.
  • The comments follow the Go documentation convention.
controllers/greptimedbstandalone/deployer.go (14)

61-61: LGTM!

The function correctly initializes a new StandaloneDeployer instance.


Line range hint 78-90: LGTM!

The function correctly generates the Kubernetes objects for a standalone deployment using the builder pattern. The generated objects include the service, config map, and stateful set.


93-119: LGTM!

The function correctly cleans up the storage for a standalone deployment based on the configured storage retain policy. It checks the policy for the data node file storage, WAL file storage, and cache file storage, and deletes the respective storage if the policy is set to delete.
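
Sketched shape of that policy check; the getter and constant names here are guesses, not the actual implementation:

// Delete the datanode PVCs only when the retain policy asks for it.
if fs := standalone.GetDatanodeFileStorage(); fs != nil &&
    fs.GetStorageRetainPolicy() == v1alpha1.StorageRetainPolicyTypeDelete {
    if err := d.deleteStorage(ctx, standalone.Namespace, standalone.Name,
        common.DatanodeFileStorageLabels); err != nil {
        return err
    }
}
// ...and likewise for the WAL and cache file storages.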


Line range hint 120-144: LGTM!

The function correctly checks the status of a standalone deployment by retrieving the stateful set object and checking if it is ready using the IsStatefulSetReady function from the k8sutil package.


146-152: LGTM!

The function correctly retrieves the standalone instance from the provided CRD object by performing a type assertion to GreptimeDBStandalone. It returns an error if the type assertion fails.


Line range hint 154-186: LGTM!

The function correctly deletes the storage for a standalone deployment. It constructs a label selector based on the provided namespace, name, and additional labels to retrieve the matching persistent volume claims (PVCs). It then deletes each PVC.


Line range hint 195-223: LGTM!

The function correctly builds the service object for a standalone deployment. It sets the metadata and spec fields of the service based on the standalone instance, including the service type, selector, ports, and load balancer class. The generated service object is appended to the list of objects to be generated.


226-245: LGTM!

The function correctly builds the config map object for a standalone deployment. It generates the config data from the standalone instance using the dbconfig.FromStandalone function and then generates the config map object using the common.GenerateConfigMap function. The generated config map object is appended to the list of objects to be generated.


248-286: LGTM!

The function correctly builds the stateful set object for a standalone deployment. It sets the metadata and spec fields of the stateful set based on the standalone instance, including the replicas (always set to 1 for standalone mode), pod template spec, and persistent volume claims (PVCs). It also calculates the config hash and sets it as an annotation on the pod template spec. The generated stateful set object is appended to the list of objects to be generated.


289-311: LGTM!

The function correctly generates the pod template spec for a standalone deployment. It starts with the base pod template spec generated by the common.GeneratePodTemplateSpec function and then applies various customizations based on the standalone spec. These customizations include setting the main container args (if not provided), adding volume mounts, setting container ports, merging labels, mounting the config directory, and mounting the TLS secret (if TLS is enabled). The generated pod template spec is returned.


333-351: LGTM!

The function correctly generates the persistent volume claims (PVCs) for a standalone deployment. It initializes an empty slice to store the PVCs and then generates PVCs based on the storage configurations defined in the standalone spec. It generates a PVC for the data node file storage, WAL file storage, and cache file storage if they are defined. The generated PVCs are returned.


354-377: LGTM!

The function correctly returns the service ports for a standalone deployment. It defines a slice of corev1.ServicePort objects, which include ports for gRPC, HTTP, MySQL, and PostgreSQL. The port values are obtained from the standalone spec.


379-402: LGTM!

The function correctly returns the container ports for a standalone deployment. It defines a slice of corev1.ContainerPort objects, which include ports for gRPC, HTTP, MySQL, and PostgreSQL. The port values are obtained from the standalone spec.


404-424: LGTM!

The function correctly generates the main container arguments for a standalone deployment. It initializes a slice to store the arguments and appends various arguments based on the standalone spec. The basic arguments include "standalone", "start", data home, and addresses for RPC, MySQL, HTTP, and PostgreSQL. It also appends the config file argument. If TLS is enabled in the standalone spec, it appends the TLS-related arguments such as TLS mode, cert path, and key path.

controllers/greptimedbcluster/controller.go (2)

280-285: LGTM!

The changes to the TLS secret key constants enhance clarity and consistency in the codebase by standardizing the source of these constants.


312-314: LGTM!

The simplified region failover validation logic improves readability and maintainability. The changes directly utilize the IsEnableRegionFailover method and specifically require a Kafka WAL when region failover is enabled.
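
The simplified check plausibly reads like this (IsEnableRegionFailover is named in the review; the Kafka accessor is an assumption):

// Region failover relies on a shared WAL, so Kafka must be configured.
if cluster.GetMeta().IsEnableRegionFailover() &&
    cluster.GetWALProvider().GetKafkaWAL() == nil {
    return fmt.Errorf("region failover is enabled but no Kafka WAL is configured")
}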

controllers/greptimedbcluster/deployers/datanode.go (4)

84-98: LGTM!

The changes enhance the CleanUp function to handle the deletion of storage for the datanode's file storage, WAL provider's raft engine file storage, and object storage provider's cache file storage based on their respective retention policies. This results in a more comprehensive cleanup process.


Line range hint 237-267: LGTM!

The modifications to the deleteStorage function enhance its flexibility by accepting additional parameters such as namespace, name, and labels. This allows the function to handle the deletion of PVCs for different storage types based on the provided labels. The changes improve the reusability and modularity of the function.


479-497: LGTM!

The generatePVCs function has been updated to construct a list of PVCs for the datanode, considering the datanode's file storage, WAL provider's raft engine file storage, and object storage provider's cache file storage. It uses the common.FileStorageToPVC function to convert the file storage configurations to PVC objects, which promotes code reusability. The changes reflect a modular approach to generating PVCs for different storage types.


568-597: LGTM!

The addVolumeMounts function has been updated to mount the necessary storage volumes for the datanode pod. It considers the datanode's file storage, WAL provider's raft engine file storage, and object storage provider's cache file storage. For each configured storage type, it creates a volume mount using the storage's name and mount path and appends it to the main container's volume mounts in the pod template spec. These changes ensure that the appropriate storage volumes are mounted in the datanode pod.

apis/v1alpha1/common.go (13)

Line range hint 22-33: LGTM!

The StorageRetainPolicyType type and its constants are well-defined with clear comments explaining their purpose.


38-51: LGTM!

The updated comments for the phase constants provide better clarity by explicitly including standalone operations alongside cluster operations.


58-71: LGTM!

The new component kind constants are well-defined with clear comments explaining their roles.


Line range hint 75-105: LGTM!

The updated comments for the SlimPodSpec struct fields improve clarity and consistency by referencing the original corev1.PodSpec fields and formatting the links uniformly.


Line range hint 117-361: LGTM!

The FileStorage struct provides a detailed specification for file storage, including properties like Name, StorageClassName, StorageSize, MountPath, and StorageRetainPolicy. The FileStorageAccessor interface and its implementation for FileStorage offer a clean way to access the struct's fields.


363-415: LGTM!

The WALProviderSpec struct is well-designed, allowing for flexible configuration of either local (raft-engine) or remote (Kafka) WAL. The getter methods provide convenient access to the nested structs' fields, enhancing the usability of the API.


417-434: LGTM!

The ServiceSpec struct is well-defined, allowing for customization of the service's type, annotations, labels, and load balancer class. The comments clearly explain the purpose of each field.


436-449: LGTM!

The TLSSpec struct is well-defined, allowing for the specification of the secret containing the TLS certificates. The comments clearly explain the requirements for the secret, and the getter method provides convenient access to the SecretName field.


452-511: LGTM!

The ObjectStorageProviderSpec struct is well-designed, allowing for the configuration of various object storage providers (S3, OSS, GCS) and cache storage. The getter methods provide convenient access to the nested structs' fields, and the getSetObjectStorageCount method is useful for validating the configuration to ensure only one object storage provider is set.
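
A plausible shape for that counter, assuming the provider fields are the pointers S3, OSS, and GCS:

// getSetObjectStorageCount reports how many providers are configured,
// letting validation reject specs that set more than one.
func (in *ObjectStorageProviderSpec) getSetObjectStorageCount() int {
    count := 0
    if in.S3 != nil {
        count++
    }
    if in.OSS != nil {
        count++
    }
    if in.GCS != nil {
        count++
    }
    return count
}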


513-520: LGTM!

The DatanodeStorageSpec struct is well-defined, allowing for the specification of the data home directory and file storage configuration. The comments clearly explain the purpose of each field.


523-533: LGTM!

The CacheStorage struct is well-defined, allowing for the specification of the file storage configuration and cache capacity. The comments clearly explain the purpose of each field and provide guidance on the default behavior when the storage is not specified.


Line range hint 535-627: LGTM!

The S3Storage, OSSStorage, and GCSStorage structs are well-defined, allowing for the specification of the necessary configuration properties for each storage provider. The comments clearly explain the purpose of each field and provide guidance on the requirements for the secrets. The getter methods provide convenient access to the SecretName field in each struct.


633-648: LGTM!

The updates to the PrometheusMonitorSpec struct ensure that the Enabled and Labels fields are always specified when the monitor is enabled, improving the clarity and reliability of the configuration. The IsEnablePrometheusMonitor method provides a convenient way to check if the monitor is enabled.
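
A sketch of the spec and its convenience check; the nil-receiver guard is an implementation assumption.

type PrometheusMonitorSpec struct {
	// Enabled turns the Prometheus monitor on or off.
	Enabled bool
	// Labels are attached to the generated monitoring resources.
	Labels map[string]string
}

func (p *PrometheusMonitorSpec) IsEnablePrometheusMonitor() bool {
	return p != nil && p.Enabled
}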

apis/v1alpha1/zz_generated.deepcopy.go (10)

26-34: LGTM!

The DeepCopyInto function for CacheStorage is implemented correctly. It performs a nil check on the FileStorage field and deep copies it when not nil.


36-44: LGTM!

The DeepCopy function for CacheStorage is implemented correctly. It handles nil input by returning nil and uses DeepCopyInto to copy the fields when the input is not nil.
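
Both functions follow the standard controller-gen pattern, which for a struct like CacheStorage looks like this (a fragment mirroring zz_generated.deepcopy.go; FileStorage's own DeepCopyInto is generated alongside):

func (in *CacheStorage) DeepCopyInto(out *CacheStorage) {
	*out = *in
	if in.FileStorage != nil {
		in, out := &in.FileStorage, &out.FileStorage
		*out = new(FileStorage)
		(*in).DeepCopyInto(*out)
	}
}

func (in *CacheStorage) DeepCopy() *CacheStorage {
	if in == nil {
		return nil
	}
	out := new(CacheStorage)
	in.DeepCopyInto(out)
	return out
}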


92-96: LGTM!

The deep copy of the Storage field in the DeepCopyInto function for DatanodeSpec is implemented correctly. It performs a nil check and deep copies the field when not nil.


124-142: LGTM!

The deepcopy functions for DatanodeStorageSpec are implemented correctly:

  • The DeepCopyInto function performs a nil check on the FileStorage field and deep copies it when not nil.
  • The DeepCopy function handles nil input by returning nil and uses DeepCopyInto to copy the fields when the input is not nil.

144-162: LGTM!

The deepcopy functions for FileStorage are implemented correctly:

  • The DeepCopyInto function performs a nil check on the StorageClassName field and deep copies it when not nil.
  • The DeepCopy function handles nil input by returning nil and uses DeepCopyInto to copy the fields when the input is not nil.

199-203: LGTM!

The deep copy of the Service field in the DeepCopyInto function for FrontendSpec is implemented correctly. It performs a nil check and deep copies the field when not nil.


237-246: LGTM!

The deepcopy functions for GCSStorage (previously GCSStorageProvider) are implemented correctly:

  • The DeepCopyInto function performs a shallow copy of the fields, which is sufficient since there are no pointer fields.
  • The DeepCopy function handles nil input by returning nil and uses DeepCopyInto to copy the fields when the input is not nil.

350-357: LGTM!

The deep copy of the ObjectStorageProvider and WALProvider fields in the DeepCopyInto function for GreptimeDBClusterSpec is implemented correctly. It performs nil checks and deep copies the fields when not nil.


490-503: LGTM!

The deep copy of the ObjectStorageProvider, DatanodeStorage, and WALProvider fields in the DeepCopyInto function for GreptimeDBStandaloneSpec is implemented correctly. It performs nil checks and deep copies the fields when not nil.


Line range hint 558-962: LGTM!

The deepcopy functions for KafkaWAL, OSSStorage, ObjectStorageProviderSpec, RaftEngineWAL, S3Storage, TLSSpec, and WALProviderSpec are implemented correctly:

  • The DeepCopyInto functions perform necessary nil checks and deep copy the fields accordingly.
  • The DeepCopy functions handle nil input by returning nil and use DeepCopyInto to copy the fields when the input is not nil.

docs/api-references/docs.md (11)

18-33: LGTM!

The CacheStorage section is well-defined with clear descriptions for each field. The fields cover the essential aspects of configuring cache storage.


Line range hint 41-53: Minimum replicas set to 1 is a good move!

Enforcing a minimum of 1 replica for the components ensures that they are not deployed without redundancy. This change improves the reliability and availability of the components.


134-150: LGTM!

The DatanodeStorageSpec section is well-defined with clear descriptions for each field. The fields cover the essential aspects of configuring datanode storage.


152-172: LGTM!

The FileStorage section is well-defined with clear descriptions for each field. The fields cover the essential aspects of configuring file storage and generating PVCs.


Line range hint 121-132: Field descriptions added for better clarity!

The added descriptions for the replicas and readyReplicas fields in the DatanodeStatus section provide clarity on their purpose. This makes the documentation more comprehensive and easier to understand.


Line range hint 199-209: Field descriptions added for better clarity!

The added descriptions for the replicas and readyReplicas fields in the FlownodeStatus section provide clarity on their purpose. This makes the documentation more comprehensive and easier to understand.


249-267: Secret requirements clarified for GCS authentication!

The updated description for the secretName field in the GCSStorage section provides clear instructions on the secret requirements for GCS authentication. Specifying the required key and namespace ensures proper configuration and reduces potential issues.


Line range hint 285-332: Enhanced configurability with new fields!

The added fields in the GreptimeDBClusterSpec section provide more flexibility and control over the configuration of the GreptimeDB cluster. Notable enhancements include:

  • Specifying ports for HTTP, RPC, MySQL, and PostgreSQL.
  • Configuring Prometheus monitoring with the prometheusMonitor field.
  • Specifying the GreptimeDB version with the version field.
  • Customizing the init container with the initializer field.
  • Configuring storage and WAL providers with the objectStorage and wal fields.

These changes greatly improve the configurability and extensibility of the GreptimeDB cluster.
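
To see how the new WAL fields combine, a hedged Go fragment reusing the shapes sketched earlier in this review; the endpoint, size, and mount-path values are hypothetical, and a real spec also sets the required frontend, meta, and datanode sections:

// Local WAL on a dedicated PVC:
wal := &WALProviderSpec{
	RaftEngine: &RaftEngineWAL{
		FileStorage: &FileStorage{
			Name:        "wal",  // hypothetical values
			StorageSize: "5Gi",
			MountPath:   "/wal",
		},
	},
}

// Or remote WAL instead; only one provider may be set:
wal = &WALProviderSpec{
	Kafka: &KafkaWAL{
		BrokerEndpoints: []string{"kafka-bootstrap.kafka:9092"}, // hypothetical endpoint
	},
}
_ = wal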


Line range hint 353-400: Enhanced configurability with new fields!

The added fields in the GreptimeDBStandaloneSpec section provide more flexibility and control over the configuration of the GreptimeDB standalone instance. Notable enhancements include:

  • Configuring the service with the service field.
  • Configuring TLS with the tls field.
  • Specifying ports for HTTP, RPC, MySQL, and PostgreSQL.
  • Configuring Prometheus monitoring with the prometheusMonitor field.
  • Specifying the GreptimeDB version with the version field.
  • Customizing the init container with the initializer field.
  • Configuring storage providers with the objectStorage and datanodeStorage fields.
  • Configuring the WAL provider with the wal field.
  • Specifying the configuration file content with the config field.

These changes greatly improve the configurability and extensibility of the GreptimeDB standalone instance.


419-420: Initializer image can now be customized!

The added image field in the InitializerSpec section allows specifying a custom image for the init container. This provides flexibility in the initialization process and enables using custom initializer images tailored to specific requirements.


422-435: LGTM!

The KafkaWAL section is well-defined with a clear description for the brokerEndpoints field. The field specifies the list of Kafka broker endpoints, which is essential for configuring Kafka as the remote WAL provider.

config/crd/resources/greptime.io_greptimedbstandalones.yaml (4)

2805-2826: LGTM!

The datanodeStorage object has a well-structured schema for configuring datanode storage. The fs object provides flexibility and covers important storage-related configurations. The field validations for storageRetainPolicy and storageSize are appropriate.


2840-2861: Looks good!

The addition of the cache object under objectStorage is a nice enhancement. It allows configuring caching for object storage, with the cacheCapacity field for specifying the cache capacity. The reuse of the fs object maintains consistency with datanodeStorage and provides the necessary storage-related configurations.


2874-2876: Required fields look good!

The updates to the required fields for gcs, oss, and s3 objects are appropriate. Requiring bucket and root for all three storage types ensures that the essential information is provided. Additionally, including region as a required field for oss and s3 aligns with the typical requirements of those storage services.

Also applies to: 2890-2893, 2907-2910


2957-2988: WAL configuration looks good!

The introduction of the wal object is a valuable addition for configuring write-ahead logging (WAL). The kafka object allows specifying Kafka broker endpoints, while the raftEngine object configures the file storage backing the local raft-engine WAL.

The structure and fields of the wal object are well-defined, and the reuse of the fs object under raftEngine maintains consistency with other storage configurations.

config/crd/resources/greptime.io_greptimedbclusters.yaml (7)

2824-2824: Enforce minimum replica count for enhanced reliability.

Setting the minimum value of spec.datanode.replicas to 1 is a good practice. It ensures that at least one replica of the datanode component is always running, enhancing the reliability and availability of the GreptimeDB cluster.


2833-2849: Structured filesystem storage configuration for datanode.

The introduction of the spec.datanode.storage.fs field is a great addition to the GreptimeDBCluster CRD. It provides a clear and structured way to configure the filesystem storage for the datanode component.

The included properties (mountPath, name, storageClassName, storageRetainPolicy, and storageSize) cover the essential aspects of storage configuration. The storageRetainPolicy enum and storageSize pattern validations ensure that valid values are specified.

This change enhances the flexibility and maintainability of the storage configuration for the datanode component.
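
These CRD validations typically originate as kubebuilder markers on the Go fields; a hedged sketch follows, where the marker values are illustrative and not copied from the PR:

type FileStorage struct {
	// +kubebuilder:validation:Enum:={"Retain", "Delete"}
	// +optional
	StorageRetainPolicy StorageRetainPolicyType `json:"storageRetainPolicy,omitempty"`

	// +kubebuilder:validation:Pattern=`(^0|([0-9]*[.])?[0-9]+((K|M|G|T)i?)?)$`
	StorageSize string `json:"storageSize"`
}

controller-gen then renders such markers into the enum and pattern constraints that appear in the generated YAML.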


5613-5614: Require specifying datanode replicas.

Adding replicas as a required field under spec.datanode is a good decision. It ensures that the number of replicas for the datanode component is always explicitly specified.

This change complements the previous modification of setting the minimum value of replicas to 1. By making the field required, it prevents misconfiguration and guarantees a minimum level of redundancy for the datanode component.


8389-8390: Require specifying flownode replicas.

Adding replicas as a required field under spec.flownode is a consistent and beneficial change. It ensures that the number of replicas for the flownode component is always explicitly specified.

This change helps prevent misconfiguration and guarantees a minimum level of redundancy for the flownode component, similar to the datanode component.


11181-11182: Require specifying TLS secret name for frontend.

Adding secretName as a required field under spec.frontend.tls is an important security enhancement. It ensures that the secret containing the TLS certificate and key is always specified when TLS is configured for the frontend component.

This change helps prevent misconfiguration and guarantees that the necessary TLS secrets are provided for secure communication with the frontend component.


11184-11185: Require specifying frontend replicas.

Adding replicas as a required field under spec.frontend is a consistent and beneficial change. It ensures that the number of replicas for the frontend component is always explicitly specified.

This change helps prevent misconfiguration and guarantees a minimum level of redundancy for the frontend component, aligning with the modifications made for the datanode and flownode components.


13981-13983: Require specifying etcd endpoints and replicas for meta component.

Adding etcdEndpoints and replicas as required fields under spec.meta is a crucial configuration requirement.

Requiring the etcdEndpoints field ensures that the endpoints for the etcd cluster used by the meta component are always specified. This helps prevent misconfiguration and guarantees that the necessary etcd cluster information is provided for the meta component to function correctly.

Additionally, making the replicas field required for the meta component ensures that the number of replicas is always explicitly specified, similar to the modifications made for other components. This helps prevent misconfiguration and ensures a minimum level of redundancy for the meta component.

manifests/greptimedb-operator-crd.yaml (14)

2823-2823: LGTM!

Setting the minimum number of replicas to 1 for the datanode component is a good practice to ensure availability and prevent accidental deletion of all replicas.


2832-2848: LGTM!

The new spec.datanode.storage.fs field provides a structured way to configure the storage for the datanode component using a filesystem. It includes essential properties such as mountPath, storageClassName, storageRetainPolicy, and storageSize.

The storageRetainPolicy enum ensures valid values are used, and the storageSize regex pattern validates the specified size format.


5612-5613: LGTM!

Marking the spec.datanode.replicas field as required is a good practice. It ensures that the number of datanode replicas is always specified when creating a GreptimeDBCluster resource, preventing the creation of a datanode component without specifying the desired number of replicas.


5621-5621: LGTM!

Setting the minimum number of replicas to 1 for the flownode component is a good practice to ensure availability and prevent accidental deletion of all replicas.


8388-8389: LGTM!

Marking the spec.flownode.replicas field as required is a good practice. It ensures that the number of flownode replicas is always specified when creating a GreptimeDBCluster resource, preventing the creation of a flownode component without specifying the desired number of replicas.


8397-8397: LGTM!

Setting the minimum number of replicas to 1 for the frontend component is a good practice to ensure availability and prevent accidental deletion of all replicas.


11180-11181: LGTM!

Marking the spec.frontend.tls.secretName field as required is a good practice. It ensures that the secret name for TLS configuration is always specified when creating a GreptimeDBCluster resource, preventing the creation of a frontend component without specifying the TLS secret name.


11183-11184: LGTM!

Marking the spec.frontend.replicas field as required is a good practice. It ensures that the number of frontend replicas is always specified when creating a GreptimeDBCluster resource, preventing the creation of a frontend component without specifying the desired number of replicas.


11211-11211: LGTM!

Setting the minimum number of replicas to 1 for the meta component is a good practice to ensure availability and prevent accidental deletion of all replicas.


13980-13982: LGTM!

Marking the spec.meta.etcdEndpoints and spec.meta.replicas fields as required is a good practice.

  • It ensures that the etcd endpoints are always specified for the meta component when creating a GreptimeDBCluster resource, preventing the creation of a meta component without the necessary etcd configuration.
  • It ensures that the number of meta replicas is always specified, preventing the creation of a meta component without specifying the desired number of replicas.

14092-14094: LGTM!

Marking the spec.wal.kafka.brokerEndpoints field as required is a good practice. It ensures that the Kafka broker endpoints are always specified when using Kafka for write-ahead logging (WAL) in a GreptimeDBCluster resource. This prevents the creation of a WAL configuration without specifying the necessary Kafka endpoints.


14097-14113: LGTM!

The new spec.wal.raftEngine.fs field provides a structured way to configure the storage for the Raft engine used in write-ahead logging (WAL). It includes essential properties such as mountPath, storageClassName, storageRetainPolicy, and storageSize.

The storageRetainPolicy enum ensures valid values are used, and the storageSize regex pattern validates the specified size format.


14116-14119: LGTM!

Marking the spec.datanode, spec.frontend, and spec.meta fields as required is a good practice. It ensures that the essential components of a GreptimeDB cluster (datanode, frontend, and meta) are always specified when creating a GreptimeDBCluster resource. This prevents the creation of an incomplete or invalid cluster configuration.


Line range hint 17014-17196: LGTM!

The spec fields for the GreptimeDBStandalone custom resource provide a comprehensive set of configuration options for a standalone GreptimeDB instance. The fields are well-structured and cover essential aspects of the standalone setup, including:

  • Storage configuration for the datanode, object storage cache, and write-ahead logging (WAL) using a consistent structure.
  • Object storage configuration for different providers (GCS, OSS, S3) with the necessary credentials and bucket information.
  • Prometheus monitoring configuration for the standalone instance.

manifests/greptimedb-operator-deployment.yaml (9)

Line range hint 1-6: LGTM!

The Namespace resource is correctly defined to create the greptimedb-operator-system namespace for isolating the GreptimeDB operator resources.


Line range hint 7-14123: LGTM!

The CustomResourceDefinition for GreptimeDBCluster is comprehensively defined, covering the desired state and observed state of a GreptimeDB cluster. The schema includes various configuration options for fine-grained control over the cluster setup.


Line range hint 14124-17204: LGTM!

The CustomResourceDefinition for GreptimeDBStandalone is well-defined, covering the desired state and observed state of a standalone GreptimeDB instance. The schema provides various configuration options for customizing the standalone setup.


Line range hint 17205-17209: LGTM!

The ServiceAccount resource correctly defines the greptimedb-operator service account in the greptimedb-operator-system namespace. This service account will be used by the GreptimeDB operator to interact with the Kubernetes API.


Line range hint 17210-17235: LGTM!

The Role resource correctly defines the greptimedb-operator-leader-election-role in the greptimedb-operator-system namespace. The permissions granted to manage configmaps, leases, and events are appropriate for the leader election functionality of the GreptimeDB operator.


Line range hint 17236-17369: LGTM!

The ClusterRole resource comprehensively defines the greptimedb-operator-role with the necessary permissions for the GreptimeDB operator to manage GreptimeDB clusters and standalone instances across the cluster. The permissions cover all the required resources, including CRDs, deployments, statefulsets, configmaps, events, PVCs, pods, secrets, services, and custom resources.


Line range hint 17370-17380: LGTM!

The RoleBinding resource correctly binds the greptimedb-operator-leader-election-role to the greptimedb-operator service account in the greptimedb-operator-system namespace. This grants the necessary leader election permissions to the GreptimeDB operator.


Line range hint 17381-17390: LGTM!

The ClusterRoleBinding resource correctly binds the greptimedb-operator-role to the greptimedb-operator service account in the greptimedb-operator-system namespace. This grants the necessary permissions to the GreptimeDB operator across the cluster.


Line range hint 17391-17443: LGTM!

The Deployment resource properly defines the deployment of the GreptimeDB operator in the greptimedb-operator-system namespace. It ensures one replica of the operator is running, with the necessary configuration, including the command, image, health probes, resource limits and requests, and the associated service account. The deployment looks good and is ready to manage GreptimeDB clusters and standalone instances.

@zyy17 zyy17 force-pushed the refactor/add-wal-provider branch 4 times, most recently from 16700a7 to d6829c4 on September 18, 2024 09:11
Signed-off-by: zyy17 <zyylsxm@gmail.com>
@daviderli614
Member

daviderli614 commented Sep 19, 2024

What's the difference between Spec.Datanode.Storage and Spec.DatanodeStorage?
Maybe use the same naming?

@zyy17
Collaborator Author

zyy17 commented Sep 19, 2024

What's the difference between Spec.Datanode.Storage and Spec.DatanodeStorage? Maybe use the same naming?

It's the same thing, and I think the names are already almost the same.

@daviderli614
Member

daviderli614 commented Sep 20, 2024

Can we add a test that mounts the WAL and cache on different disks?

Member

@daviderli614 daviderli614 left a comment


LGTM

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

Outside diff range and nitpick comments (1)
tests/e2e/testdata/resources/cluster/standalone-wal/cluster.yaml (1)

7-9: Consider using a specific version tag for the image.

While using the latest tag ensures that the most recent version of GreptimeDB is deployed, it may lead to unexpected behavior if the image is updated without proper testing.

Consider using a specific version tag (e.g., greptime/greptimedb:v1.0.0) to ensure consistency and stability across deployments.

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 7cd8c90 and 84728ec.

Files selected for processing (8)
  • controllers/greptimedbstandalone/deployer.go (9 hunks)
  • tests/e2e/greptimedbcluster/test_basic_cluster.go (1 hunks)
  • tests/e2e/greptimedbcluster/test_cluster_enable_flow.go (1 hunks)
  • tests/e2e/greptimedbcluster/test_cluster_enable_remote_wal.go (1 hunks)
  • tests/e2e/greptimedbcluster/test_cluster_standalone_wal.go (1 hunks)
  • tests/e2e/greptimedbstandalone/test_basic_standalone.go (1 hunks)
  • tests/e2e/helper/helper.go (2 hunks)
  • tests/e2e/testdata/resources/cluster/standalone-wal/cluster.yaml (1 hunks)
Additional comments not posted (11)
tests/e2e/testdata/resources/cluster/standalone-wal/cluster.yaml (4)

1-2: LGTM!

The API version and kind are correctly specified for a GreptimeDB cluster custom resource.


3-5: LGTM!

The metadata section correctly specifies the name and namespace for the GreptimeDB cluster.


10-17: LGTM!

The frontend, meta, and datanode configurations are correctly specified with appropriate replica counts and etcd endpoints.


18-25: LGTM!

The WAL configuration correctly specifies the raft-engine backend with persistent storage and an appropriate retain policy for testing purposes.

tests/e2e/greptimedbstandalone/test_basic_standalone.go (1)

90-90: Verify that passing nil is safe.

The change to pass an additional nil argument to h.GetPVCs appears to be a refinement to the function's interface. It likely represents an optional parameter, such as options, that was not previously included.

This change is unlikely to alter the core logic or expected outcomes of the test. However, please verify that passing nil as the new argument is safe and doesn't lead to any unexpected behavior.

Run the following script to verify the GetPVCs function usage:

Verification successful

Passing nil as the last argument to GetPVCs is safe and consistent with its usage across the codebase.

After analyzing the usage of the GetPVCs function across multiple test files, it's clear that passing nil as the last argument is a common and intended pattern. The function is designed to handle both nil and non-nil values for the additionalLabels parameter.

The implementation in tests/e2e/helper/helper.go explicitly checks for nil before processing additional labels:

if additionalLabels != nil {
    // ... handle additional labels
}

This design ensures that passing nil is safe and doesn't lead to any unexpected behavior. The change in the reviewed file aligns with the consistent usage pattern observed throughout the codebase.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the `GetPVCs` function usage with the new `nil` argument.

# Test: Search for the function usage. Expect: Only occurrences with the new `nil` argument.
rg --type go -A 5 $'GetPVCs'

Length of output: 5001

tests/e2e/greptimedbcluster/test_basic_cluster.go (1)

90-90: Verify the purpose and usage of the new options parameter in GetPVCs.

The updated function call to GetPVCs with the nil argument looks valid. However, to ensure this change aligns with the overall design and requirements, please provide more information about the purpose and intended usage of the new options parameter.

tests/e2e/greptimedbcluster/test_cluster_enable_flow.go (1)

90-90: LGTM! The change enhances flexibility without altering the core test logic.

The addition of the nil argument to the h.GetPVCs function call provides flexibility for potential future extensions or modifications in how PVCs are retrieved. It allows for the possibility of adding optional filtering or specifying options when fetching the PVCs without altering the existing test logic.

The core functionality of the test, which checks if the number of datanode PVCs matches the number of datanode replicas, remains unchanged. This ensures that the test case continues to verify the expected behavior of the cluster.

Overall, this change is a positive enhancement that improves the adaptability of the code while maintaining the integrity of the test case.

tests/e2e/greptimedbcluster/test_cluster_enable_remote_wal.go (1)

90-90: Verify the behavior of GetPVCs with nil argument.

The change looks good in the context of this test. Passing nil as the new argument to GetPVCs seems to retain the original behavior, which is desirable here.

However, please ensure that passing nil for the new options parameter in GetPVCs is safe and aligns with the expected semantics. It's important to verify that it does not introduce any unintended behavior that could affect this test.

To verify, you can add a test case specifically for GetPVCs that passes nil for the options argument and asserts the expected behavior. This will help catch any potential issues early.

tests/e2e/helper/helper.go (1)

128-151: Excellent enhancements to the GetPVCs function!

The changes made to the GetPVCs function significantly improve its functionality and flexibility:

  • The addition of the additionalLabels parameter allows for more dynamic querying of PVCs by enabling the merging of custom labels with the existing component name label.
  • The use of the util.MergeStringMap function ensures a clean and efficient merging of the label maps.
  • The updated variable names (pvcList -> claims) enhance code readability and clarity.

These enhancements make the GetPVCs function more versatile and easier to use in different scenarios.
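
A hedged sketch of the resulting helper. The Helper embedding, the label key, and the exact signature are assumptions; the review confirms only the additionalLabels nil-check and the use of util.MergeStringMap (stubbed locally here to keep the sketch self-contained).

package helper

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// Helper is assumed to embed a controller-runtime client, as e2e helpers
// commonly do; the real type lives in tests/e2e/helper/helper.go.
type Helper struct {
	client.Client
}

// mergeStringMap stands in for the util.MergeStringMap named in the review.
func mergeStringMap(base, extra map[string]string) map[string]string {
	out := make(map[string]string, len(base)+len(extra))
	for k, v := range base {
		out[k] = v
	}
	for k, v := range extra {
		out[k] = v
	}
	return out
}

// GetPVCs lists the PVCs matching the component selector, optionally
// narrowed by caller-supplied labels; passing nil keeps the old behavior.
func (h *Helper) GetPVCs(ctx context.Context, namespace, componentName string, additionalLabels map[string]string) ([]corev1.PersistentVolumeClaim, error) {
	matchLabels := map[string]string{
		"app.greptime.io/component": componentName, // assumed label key
	}
	if additionalLabels != nil {
		matchLabels = mergeStringMap(matchLabels, additionalLabels)
	}

	var claims corev1.PersistentVolumeClaimList
	if err := h.List(ctx, &claims,
		client.InNamespace(namespace),
		client.MatchingLabels(matchLabels),
	); err != nil {
		return nil, err
	}
	return claims.Items, nil
}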

tests/e2e/greptimedbcluster/test_cluster_standalone_wal.go (1)

34-111: Test function implementation looks good

The TestClusterStandaloneWAL function is well-structured and follows best practices for end-to-end testing. It correctly sets up the cluster, executes the SQL test, and performs cleanup by deleting the cluster and associated resources. Resource handling and error checking are appropriately managed throughout the test.

controllers/greptimedbstandalone/deployer.go (1)

Line range hint 61-146: Consistent Receiver Naming Enhances Readability

The receiver names have been changed from s to d in StandaloneDeployer methods and from s to b in standaloneBuilder methods. This improves code readability by making receiver names more intuitive and consistent with their struct names. Ensure that this naming convention is applied consistently across the entire codebase for clarity.

tests/e2e/greptimedbcluster/test_cluster_standalone_wal.go (outdated)
controllers/greptimedbstandalone/deployer.go
controllers/greptimedbstandalone/deployer.go (outdated)
controllers/greptimedbstandalone/deployer.go
@zyy17 zyy17 merged commit 53816a2 into GreptimeTeam:develop Sep 20, 2024
4 checks passed
@zyy17 zyy17 deleted the refactor/add-wal-provider branch September 20, 2024 05:59