Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support MantleBackupConfig #11

Merged
merged 3 commits into from
Jul 9, 2024
Merged

Conversation

ushitora-anqou
Copy link
Contributor

@ushitora-anqou ushitora-anqou commented Jun 12, 2024

No description provided.

@ushitora-anqou ushitora-anqou changed the title change group from "backup" to "mantle" Support MantleBackupConfig Jun 12, 2024
@ushitora-anqou ushitora-anqou force-pushed the support-mantlebackupconfig branch 6 times, most recently from 2e9f2ae to a0bb4ef Compare June 17, 2024 00:41
@ushitora-anqou ushitora-anqou force-pushed the support-mantlebackupconfig branch 8 times, most recently from e4d2b84 to 2a6172a Compare June 26, 2024 04:29
@ushitora-anqou ushitora-anqou marked this pull request as ready for review June 26, 2024 04:41
@satoru-takeuchi
Copy link
Contributor

@ushitora-anqou Could you add signed-off-by in the 2nd and the 3rd commits?

@ushitora-anqou
Copy link
Contributor Author

@ushitora-anqou Could you add signed-off-by in the 2nd and the 3rd commits?

I'll fix it later. That being said, we may need to introduce DCO GitHub App here, like TopoLVM.

api/v1/mantlebackupconfig_types.go Show resolved Hide resolved

func getMBName(mbc *mantlev1.MantleBackupConfig, jobID string) string {
name := mbc.GetName()
if len(name) >= 43 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if len(name) >= 43 {
if len(name) > 43 {

cmd/backupandrotate/main.go Show resolved Hide resolved
return "", fmt.Errorf("JOB_NAME not found")
}
if len(jobName) < 8 {
return "", fmt.Errorf("the length of JOB_NAME is smaller than 8")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return "", fmt.Errorf("the length of JOB_NAME is smaller than 8")
return "", fmt.Errorf("the length of JOB_NAME must be >= 8")

To let users know what they should do by reading this message clearer.

return fmt.Errorf("can't get mbc: %s: %s: %w", mbcName, mbcNamespace, err)
}

if err := backup(ctx, cli, &mbc); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if err := backup(ctx, cli, &mbc); err != nil {
if err := createMantleBackup(ctx, cli, &mbc); err != nil {

This function doesn't create backup directly. So it's better to clarify this function creates MantleBackup resource.


// At this point we know that mb was created by the previous run of the current job and we're retrying it.
// Thus, it is safe to ignore this "already exists" error.
logger.Info("avoid creating a new MantleBackup because it already exists",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
logger.Info("avoid creating a new MantleBackup because it already exists",
logger.Info("MantleBackup already exists",

Actualy we tried to create MantleBackup. So omitting "avoid ..." is better.

mbc *mantlev1.MantleBackupConfig,
expireOffset time.Duration,
) error {
// List all MantleBackup objects associated with mbc.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// List all MantleBackup objects associated with mbc.
// List all MantleBackup objects associated with the mbc.

flags.StringVar(&mbcName, "name", "", "MantleBackupConfig resource's name")
flags.StringVar(&mbcNamespace, "namespace", "", "MantleBackupConfig resource's namespace")
flags.StringVar(&expireOffset, "expire-offset", "0s",
"An offset for expire field. Use this option for testing purposes only.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about writing down more strict meaning of this paramter? It would be like this: mb's expire becomes mb.spec.expire - expore-offset.

// - https://pkg.go.dev/sigs.k8s.io/controller-runtime@v0.16.3/pkg/reconcile
func (r *MantleBackupConfigReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
// Get the running pod info
runningPod, err := getRunningPod(ctx, r.Client)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
runningPod, err := getRunningPod(ctx, r.Client)
cronJobInfo, err := getCronJobInfo(ctx, r.Client)

How about remaning the function to exlain the role of this function clearly? In addition, it's better to define the dedicated struct for cronJobInfo because this informatino is not about the controller's pod itself. The new struct would be like this.

struct cronJobInfo {
  Namespace
  ServiceAccountName
  Image
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mostly agree with you. We should probably define two separate functions: getCronJobInfo and getRunningPod. Then, getCronJobInfo can call getRunningPod. This way, getRunningPod can be reused for other purposes, even though it's currently only used for creating CronJobs.

return ctrl.Result{}, fmt.Errorf("failed to get Ceph cluster ID: %s: %s: %w", mbc.Namespace, mbc.Spec.PVC, err)
}
if clusterID != r.managedCephClusterID {
logger.Info("mbc not managed", "name", req.Name, "namespace", req.Namespace, "error", err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
logger.Info("mbc not managed", "name", req.Name, "namespace", req.Namespace, "error", err)
logger.Info("the target pvc is not managed by this controller", "name", req.Name, "namespace", req.Namespace, "error", err)

@satoru-takeuchi
Copy link
Contributor

@ushitora-anqou I didn't check test code since they will be changed after reflecting my comments.

@ushitora-anqou ushitora-anqou force-pushed the support-mantlebackupconfig branch 2 times, most recently from 45e9704 to aa8e4fa Compare June 27, 2024 04:24

parsedExpireOffset, err := strfmt.ParseDuration(expireOffset)
if err != nil {
return fmt.Errorf("can't parse the expire offset: %w", err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return fmt.Errorf("can't parse the expire offset: %w", err)
return fmt.Errorf("couldn't parse the expire offset: %w", err)


cli, err := client.New(config.GetConfigOrDie(), client.Options{Scheme: scheme})
if err != nil {
return fmt.Errorf("can't create a new client: %w", err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return fmt.Errorf("can't create a new client: %w", err)
return fmt.Errorf("couldn't create a new client: %w", err)

// Get the target mbc.
var mbc mantlev1.MantleBackupConfig
if err := cli.Get(ctx, types.NamespacedName{Name: mbcName, Namespace: mbcNamespace}, &mbc); err != nil {
return fmt.Errorf("can't get mbc: %s: %s: %w", mbcName, mbcNamespace, err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return fmt.Errorf("can't get mbc: %s: %s: %w", mbcName, mbcNamespace, err)
return fmt.Errorf("couldn't get mbc: %s: %s: %w", mbcName, mbcNamespace, err)

return nil
}
if !errors.IsAlreadyExists(err) {
return fmt.Errorf("can't create a MantleBackup: %s: %s: %w", mbName, mbNamespace, err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return fmt.Errorf("can't create a MantleBackup: %s: %s: %w", mbName, mbNamespace, err)
return fmt.Errorf("couldn't create a MantleBackup: %s: %s: %w", mbName, mbNamespace, err)


var mb mantlev1.MantleBackup
if err := cli.Get(ctx, types.NamespacedName{Name: mbName, Namespace: mbNamespace}, &mb); err != nil {
return fmt.Errorf("can't get MantleBackup: %s: %s: %w", mbName, mbNamespace, err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return fmt.Errorf("can't get MantleBackup: %s: %s: %w", mbName, mbNamespace, err)
return fmt.Errorf("couldn't get MantleBackup: %s: %s: %w", mbName, mbNamespace, err)

return nil
}

// cf. https://zenn.dev/nobishii/articles/type_param_intro_2#%E5%85%AC%E5%BC%8F%E3%83%89%E3%82%AD%E3%83%A5%E3%83%A1%E3%83%B3%E3%83%88%E3%81%AB%E8%A6%8B%E3%82%8B%E5%88%B6%E7%B4%84%E5%9E%8B%E6%8E%A8%E8%AB%96%E3%81%AE%E6%B4%BB%E7%94%A8%E4%BE%8B
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you replace this article with another one writtein in English, if possible?

Expect(err).NotTo(HaveOccurred())
},
Entry("various expires 1", "0 0 * * *", "24h"),
Entry("various expires 1", "0 0 * * *", "1d"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Entry("various expires 1", "0 0 * * *", "1d"),
Entry("various expires 1", "0 0 * * *", "1d"),
Suggested change
Entry("various expires 1", "0 0 * * *", "1d"),
Entry("various expires 2", "0 0 * * *", "1d"),

Subsequent 1st argus should also be updated.

test/e2e/backup_test.go Show resolved Hide resolved
return err
}).Should(Succeed())

By("Creating a Job from the CronJob")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It it to emulating the cretion of Job by CronJob and we don't wait for one day in this test? If so, describing it in L296 is better. Or can we introduce another command line parameter for test (like expireOffset)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It it to emulating the cretion of Job by CronJob and we don't wait for one day in this test? If so, describing it in L296 is better.

Yes, it is. I should have added some comments on it.

Or can we introduce another command line parameter for test (like expireOffset)?

That sounds better. I'll fix the test that way.

Copy link
Contributor Author

@ushitora-anqou ushitora-anqou Jun 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of scheduleOffset, I introduced overwrite-mbc-schedule option. When we use this option with some value (e.g., * * * * *), every .spec.schedule of a CronJob created for a MBC is replaced with the value. We can set this to * * * * * for the e2e test. Then, the backup-and-rotate subcommand will be triggered every minute, and we don't need to create a new Job from the CronJob explicitly.

@@ -0,0 +1,2 @@
controller:
expireOffset: 100d
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we emulate the actual behavior by changing expireOffset to several seconds or decent secods like '2w-10s'?

Copy link
Contributor Author

@ushitora-anqou ushitora-anqou Jun 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm worried that we can't predict how quickly the MantleBackup resource will be deleted after it's created. For instance, if we set expireOffset to 1w6d23h59m50s (i.e., 2w-10s), the deletion needs to happen at least 10 seconds after the creation is complete. If it happens any sooner, the test will fail. This makes the test flakier.

Maybe we can add a short sleep in the test to ensure that the 10 seconds have passed. This might be a better approach. I'll give it a try and see how it works.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, please keep this code as is.

Copy link
Contributor Author

@ushitora-anqou ushitora-anqou Jun 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, after my fix for your previous feedback, backup-and-rotate subcommand will be triggered every 60 seconds, so we can safely use 1w6d23h59m50s now.

@ushitora-anqou ushitora-anqou force-pushed the support-mantlebackupconfig branch 5 times, most recently from 76b3e65 to 71cfd55 Compare July 4, 2024 08:35
@ushitora-anqou ushitora-anqou force-pushed the support-mantlebackupconfig branch 2 times, most recently from c896312 to ceb7a2d Compare July 8, 2024 04:58
…BackupConfig` and `make`

Signed-off-by: Ryotaro Banno <ryotaro.banno@gmail.com>
Signed-off-by: Ryotaro Banno <ryotaro.banno@gmail.com>
@ushitora-anqou ushitora-anqou force-pushed the support-mantlebackupconfig branch 3 times, most recently from 925a975 to 8548600 Compare July 8, 2024 05:29
Copy link
Contributor

@satoru-takeuchi satoru-takeuchi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ushitora-anqou LGTM. Please squash your commits.

Signed-off-by: Ryotaro Banno <ryotaro.banno@gmail.com>
@ushitora-anqou ushitora-anqou merged commit 4d07d80 into main Jul 9, 2024
2 checks passed
@ushitora-anqou ushitora-anqou deleted the support-mantlebackupconfig branch July 9, 2024 04:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants