feat: Parse kopia snapshot restore progress output. #2776

Merged
merged 10 commits into from
May 21, 2024
93 changes: 92 additions & 1 deletion pkg/kopia/command/parse_command_output.go
@@ -39,6 +39,7 @@ const (

//nolint:lll
snapshotCreateOutputRegEx = `(?P<spinner>[|/\-\\\*]).+[^\d](?P<numHashed>\d+) hashed \((?P<hashedSize>[^\)]+)\), (?P<numCached>\d+) cached \((?P<cachedSize>[^\)]+)\), uploaded (?P<uploadedSize>[^\)]+), (?:estimating...|estimated (?P<estimatedSize>[^\)]+) \((?P<estimatedProgress>[^\)]+)\%\).+)`
snapshotRestoreOutputRegEx = `Processed (?P<processedCount>\d+) \((?P<processedSize>.*)\) of (?P<totalCount>\d+) \((?P<totalSize>.*)\) (?P<dataRate>.*) \((?P<percentage>.*)%\) remaining (?P<remainingTime>.*)\.`
extractSnapshotIDRegEx = `Created snapshot with root ([^\s]+) and ID ([^\s]+).*$`
repoTotalSizeFromBlobStatsRegEx = `Total: (\d+)$`
repoCountFromBlobStatsRegEx = `Count: (\d+)$`
@@ -205,7 +206,10 @@ type SnapshotCreateStats struct {
ProgressPercent int64
}

var kopiaProgressPattern = regexp.MustCompile(snapshotCreateOutputRegEx) //nolint:lll
var (
kopiaProgressPattern = regexp.MustCompile(snapshotCreateOutputRegEx)
kopiaSnapshotRestorePattern = regexp.MustCompile(snapshotRestoreOutputRegEx)
)

// SnapshotStatsFromSnapshotCreate parses the output of a kopia snapshot
// create execution for a log of the stats for that execution.
@@ -327,6 +331,93 @@ func parseKopiaProgressLine(line string, matchOnlyFinished bool) (stats *Snapsho
}
}

// SnapshotRestoreStats is a container for stats parsed from the output of a
// `kopia snapshot restore` command.
type SnapshotRestoreStats struct {
Contributor

Can it just be called RestoreStats? I quickly checked what the kopia restore command is called, and it's kopia restore, so I think just RestoreStats should also be OK.

Contributor

But please feel free to disagree.

Contributor Author

Actually, I was always using kopia snapshot restore, but I've just checked the documentation, and you are right. It seems that snapshot restore is an alias for restore, so your suggestion makes sense.

FilesProcessed int64
SizeProcessedB int64
FilesTotal int64
SizeTotalB int64
ProgressPercent int64
}

// SnapshotStatsFromSnapshotRestore parses the output of a `kopia snapshot
// restore` execution for a log of the stats for that execution.
Contributor

Sorry, I was not able to understand this comment properly, especially the part "execution for a log of the stats for that execution".

Contributor Author

The comment was written in the same style (wording) as the one on a similar function:
https://github.com/kanisterio/kanister/pull/2776/files#diff-00217f95488de292da931fcdc8bdc46e611d503b26e17e13119cb4ce04548892R214-R215

In both cases it means that we have the output of some command, and we parse it to find the stats logged by that command.

Possible rephrasing (for both cases):
// XXX parses the output of 'ZZZZZ' line-by-line in search of progress statistics.
// It returns nil if no statistics are found, or the most recent statistic if multiple are encountered.

WDYT?

Contributor

sounds good.

Contributor Author

Replaced the comments on both SnapshotStatsFromSnapshotCreate and RestoreStatsFromRestoreOutput.
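
The "most recent statistic wins" behaviour described in the rephrased comment can be sketched in isolation. This is a minimal illustration, not the code from the PR: `lastMatchingLine` and `looksLikeProgress` are hypothetical names, and the predicate is a deliberately crude stand-in for the real regular expression. The split on `\r` and `\n` is the same one the diff uses, since progress output typically rewrites a single terminal line with carriage returns.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// lineBreaks splits on either carriage return or newline, as the diff does.
var lineBreaks = regexp.MustCompile("[\r\n]")

// lastMatchingLine is a hypothetical helper: it scans every line of output
// and returns the last one accepted by the predicate. ok is false when no
// line was accepted.
func lastMatchingLine(output string, accept func(string) bool) (line string, ok bool) {
	for _, l := range lineBreaks.Split(output, -1) {
		if accept(l) {
			line, ok = l, true
		}
	}
	return line, ok
}

// looksLikeProgress is a simplified stand-in for the real pattern match
// (an assumption made for this sketch only).
func looksLikeProgress(line string) bool {
	return strings.HasPrefix(line, "Processed ") && strings.Contains(line, " remaining ")
}

func main() {
	// Sample lines taken from the test cases in this PR, joined with "\r".
	output := "Processed 2 (32.8 KB) of 2 (3.1 GB).\r" +
		"Processed 2 (13.7 MB) of 2 (3.1 GB) 8.5 MB/s (0.4%) remaining 6m10s.\r" +
		"Processed 2 (397.5 MB) of 3 (3.1 GB) 14.9 MB/s (12.6%) remaining 3m3s."
	line, ok := lastMatchingLine(output, looksLikeProgress)
	fmt.Println(ok, line)
}
```

Returning an explicit `ok` flag plays the same role here as the `nil` result in the PR: callers can tell "no progress line seen yet" apart from a real value.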

func SnapshotStatsFromSnapshotRestore(snapRestoreStderrOutput string) (stats *SnapshotRestoreStats) {
Contributor

Suggested change
func SnapshotStatsFromSnapshotRestore(snapRestoreStderrOutput string) (stats *SnapshotRestoreStats) {
func RestoreStatsFromRestoreOP(snapRestoreStderrOutput string) (stats *SnapshotRestoreStats) {

maybe this or similar?

Contributor Author

I've renamed it to RestoreStatsFromRestoreOutput to make it as straightforward as possible.

if snapRestoreStderrOutput == "" {
return nil
}
logs := regexp.MustCompile("[\r\n]").Split(snapRestoreStderrOutput, -1)

for _, l := range logs {
lineStats := parseKopiaSnapshotRestoreProgressLine(l)
if lineStats != nil {
stats = lineStats
}
}

return stats
}

func parseKopiaSnapshotRestoreProgressLine(line string) (stats *SnapshotRestoreStats) {
match := kopiaSnapshotRestorePattern.FindStringSubmatch(line)
Contributor

I think it would be a good idea to also show (in a comment) what exactly this line looks like.

Contributor Author

Done.
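
For reference, here is a small standalone sketch of what the pattern captures from one such line. The regular expression is copied from `snapshotRestoreOutputRegEx` in this diff and the sample line comes from the PR's own test cases; `namedGroups` is a hypothetical helper written for this illustration, not a function from the codebase.

```go
package main

import (
	"fmt"
	"regexp"
)

// restoreLineRe uses the same expression as snapshotRestoreOutputRegEx.
var restoreLineRe = regexp.MustCompile(`Processed (?P<processedCount>\d+) \((?P<processedSize>.*)\) of (?P<totalCount>\d+) \((?P<totalSize>.*)\) (?P<dataRate>.*) \((?P<percentage>.*)%\) remaining (?P<remainingTime>.*)\.`)

// namedGroups is a hypothetical helper that returns the named capture
// groups of a restore progress line, or nil if the line does not match.
func namedGroups(line string) map[string]string {
	match := restoreLineRe.FindStringSubmatch(line)
	if match == nil {
		return nil
	}
	groups := make(map[string]string)
	for i, name := range restoreLineRe.SubexpNames() {
		if i != 0 && name != "" {
			groups[name] = match[i]
		}
	}
	return groups
}

func main() {
	line := "Processed 2 (397.5 MB) of 3 (3.1 GB) 14.9 MB/s (12.6%) remaining 3m3s."
	groups := namedGroups(line)
	// Print in the order the groups appear in the pattern.
	for _, name := range restoreLineRe.SubexpNames()[1:] {
		fmt.Printf("%s = %q\n", name, groups[name])
	}
}
```

A line printed while kopia is still estimating, such as `Processed 2 (32.8 KB) of 2 (3.1 GB).`, has no rate, percentage, or remaining time, so it does not match and `namedGroups` returns nil.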

if len(match) < 8 {
return nil
}

groups := make(map[string]string)
for i, name := range kopiaSnapshotRestorePattern.SubexpNames() {
if i != 0 && name != "" {
groups[name] = match[i]
}
}

processedCount, err := strconv.Atoi(groups["processedCount"])
if err != nil {
log.WithError(err).Print("Skipping entry due to inability to parse number of processed files", field.M{"processedCount": groups["processedCount"]})
return nil
}

processedSize, err := humanize.ParseBytes(groups["processedSize"])
if err != nil {
log.WithError(err).Print("Skipping entry due to inability to parse amount of processed bytes", field.M{"processedSize": groups["processedSize"]})
return nil
}

totalCount, err := strconv.Atoi(groups["totalCount"])
if err != nil {
log.WithError(err).Print("Skipping entry due to inability to parse expected number of files", field.M{"totalCount": groups["totalCount"]})
return nil
}

totalSize, err := humanize.ParseBytes(groups["totalSize"])
if err != nil {
log.WithError(err).Print("Skipping entry due to inability to parse expected amount of bytes", field.M{"totalSize": groups["totalSize"]})
return nil
}

progressPercent, err := strconv.ParseFloat(groups["percentage"], 64)
if err != nil {
log.WithError(err).Print("Skipping entry due to inability to parse progress percent string", field.M{"progressPercent": groups["percentage"]})
return nil
}

if progressPercent >= 100 {
// It may happen that kopia reports progress of 100 or higher without actually
// completing the task. This can occur due to inaccurate estimation.
// In such a case, we will return the progress as 99% to avoid confusion.
progressPercent = 99
}

return &SnapshotRestoreStats{
FilesProcessed: int64(processedCount),
SizeProcessedB: int64(processedSize),
FilesTotal: int64(totalCount),
SizeTotalB: int64(totalSize),
ProgressPercent: int64(progressPercent),
}
}
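
The percentage handling above combines two effects: values of 100 or more are capped at 99, and the float is truncated (not rounded) when converted to `int64`. A minimal sketch of just that step follows; `clampProgress` is a hypothetical name used for illustration, and the inputs are the percentages from this PR's test cases.

```go
package main

import (
	"fmt"
	"strconv"
)

// clampProgress is a hypothetical helper mirroring the logic in the diff:
// parse the percentage as a float, cap it at 99 when it is 100 or higher,
// then truncate toward zero when converting to int64.
func clampProgress(percentage string) (int64, error) {
	p, err := strconv.ParseFloat(percentage, 64)
	if err != nil {
		return 0, err
	}
	if p >= 100 {
		p = 99
	}
	return int64(p), nil
}

func main() {
	for _, s := range []string{"12.6", "0.4", "120.4"} {
		v, err := clampProgress(s)
		fmt.Printf("%s -> %d (err: %v)\n", s, v, err)
	}
}
```

Because of the truncation, a reported 0.4% becomes 0 and 12.6% becomes 12, which matches the expected values in the accompanying tests.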

// RepoSizeStatsFromBlobStatsRaw takes a string as input, interprets it as a kopia blob stats
// output in an expected format (Contains the line "Total: <size>"), and returns the integer
// size in bytes or an error if parsing is unsuccessful.
62 changes: 62 additions & 0 deletions pkg/kopia/command/parse_command_output_test.go
@@ -440,6 +440,68 @@ func (kParse *KopiaParseUtilsTestSuite) TestSnapshotStatsFromSnapshotCreate(c *C
}
}

func (kParse *KopiaParseUtilsTestSuite) TestSnapshotStatsFromSnapshotRestore(c *C) {
type args struct {
snapshotRestoreOutput string
}
tests := []struct {
name string
args args
wantStats *SnapshotRestoreStats
}{
{
name: "Basic test case",
args: args{
snapshotRestoreOutput: "Processed 2 (397.5 MB) of 3 (3.1 GB) 14.9 MB/s (12.6%) remaining 3m3s.",
},
wantStats: &SnapshotRestoreStats{
FilesProcessed: 2,
SizeProcessedB: 397500000,
FilesTotal: 3,
SizeTotalB: 3100000000,
ProgressPercent: 12,
},
},
{
name: "Real test case",
args: args{
snapshotRestoreOutput: "Processed 2 (13.7 MB) of 2 (3.1 GB) 8.5 MB/s (0.4%) remaining 6m10s.",
},
wantStats: &SnapshotRestoreStats{
FilesProcessed: 2,
SizeProcessedB: 13700000,
FilesTotal: 2,
SizeTotalB: 3100000000,
ProgressPercent: 0,
},
},
{
name: "Ignore incomplete stats during estimation",
args: args{
snapshotRestoreOutput: "Processed 2 (32.8 KB) of 2 (3.1 GB).",
},
wantStats: nil,
},
{
name: "Progress is over 100% and still not ready - set 99%",
args: args{
snapshotRestoreOutput: "Processed 2 (13.7 MB) of 2 (3.1 GB) 8.5 MB/s (120.4%) remaining 6m10s.",
},
wantStats: &SnapshotRestoreStats{
FilesProcessed: 2,
SizeProcessedB: 13700000,
FilesTotal: 2,
SizeTotalB: 3100000000,
ProgressPercent: 99,
},
},
}
for _, tt := range tests {
stats := SnapshotStatsFromSnapshotRestore(tt.args.snapshotRestoreOutput)
c.Check(stats, DeepEquals, tt.wantStats, Commentf("Failed for %s", tt.name))
}
}

func (kParse *KopiaParseUtilsTestSuite) TestPhysicalSizeFromBlobStatsRaw(c *C) {
for _, tc := range []struct {
blobStatsOutput string