Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Format reports on Golang CLI #18166

Merged
merged 15 commits into from
Sep 22, 2023

Conversation

twalluxio
Copy link
Contributor

@twalluxio twalluxio commented Sep 18, 2023

For now, Java is in charge of formatting the output of reports, e.g. file size in bytes to readable values; date formats.

This PR moves the format job to the golang CLI side. When checking reports, data transferred to golang CLI, and CLI deal with the formatting.

Current date format: 2006-01-02T15:04:05Z07:00
Current duration format: 0d 12h34m56s
Current file size format: 1024.00MB (scale up when number > 5120)

Format on dates:

  • summary object -> started object
  • job_service object -> masterStatus array -> startTime object
  • job_service object -> recentModifiedJobs array -> timestamp object
  • job_service object -> recentFailedJobs array -> timestamp object
  • job_service object -> longestRunningJobs array -> timestamp object

Format on duration:

  • summary object -> uptime object

Format on file size:

  • summary object -> freeCapacity object
  • summary object -> totalCapacityOnTiers map
  • summary object -> usedCapacityOnTiers map
  • ufs map -> ufsCapacityBytes object
  • ufs map -> ufsUsedBytes object

twalluxio and others added 8 commits September 18, 2023 11:21
Update the Golang side commands to be able to use this output:
1. Return either the yaml or json (default) output to the console.
2. Users can define the format they want with `--format` flag, like `bin/alluxio info report --format yaml`
3. In JSON format, print properties in a fixed, easy-to-read order
4. In YAML format, print properties alphabetically (since YAML specification regards property order non-significant)

Before:
```
{"safeMode":false,"masterVersions":[{"version":"304-SNAPSHOT","host":"localhost","port":19998,"state":"PRIMARY"}],"masterAddress":"localhost:19998","zookeeperAddress":[],"useZookeeper":false,"raftJournalAddress":["localhost:19200"],"useRaftJournal":true,"liveWorkers":1,"lostWorkers":0,"freeCapacity":"1024.00MB","totalCapacityOnTiers":{"MEM":"1024.00MB"},"usedCapacityOnTiers":{"MEM":"0B"},"version":"304-SNAPSHOT","webPort":19999,"started":"09-15-2023 15:54:56:635","uptime":"0 day(s), 0 hour(s), 26 minute(s), and 37 second(s)","rpcPort":19998}
```

After (in JSON):
```
{
    "rpcPort": 19998,
    "started": "09-15-2023 15:54:56:635",
    "uptime": "0 day(s), 0 hour(s), 55 minute(s), and 31 second(s)",
    "safeMode": false,
    "version": "304-SNAPSHOT",
    "webPort": 19999,
    "masterVersions": [
        {
            "version": "304-SNAPSHOT",
            "host": "localhost",
            "port": 19998,
            "state": "PRIMARY"
        }
    ],
    "masterAddress": "localhost:19998",
    "zookeeperAddress": [],
    "useZookeeper": false,
    "raftJournalAddress": [
        "localhost:19200"
    ],
    "useRaftJournal": true,
    "liveWorkers": 1,
    "lostWorkers": 0,
    "freeCapacity": "1024.00MB",
    "totalCapacityOnTiers": {
        "MEM": "1024.00MB"
    },
    "usedCapacityOnTiers": {
        "MEM": "0B"
    }
}
```

After (in YAML):
```
freeCapacity: 1024.00MB
liveWorkers: 1
lostWorkers: 0
masterAddress: localhost:19998
masterVersions:
    - host: localhost
      port: 19998
      state: PRIMARY
      version: 304-SNAPSHOT
raftJournalAddress:
    - localhost:19200
rpcPort: 19998
safeMode: false
started: 09-15-2023 15:54:56:635
totalCapacityOnTiers:
    MEM: 1024.00MB
uptime: 0 day(s), 1 hour(s), 1 minute(s), and 36 second(s)
useRaftJournal: true
useZookeeper: false
usedCapacityOnTiers:
    MEM: 0B
version: 304-SNAPSHOT
webPort: 19999
zookeeperAddress: []
```
			pr-link: Alluxio#18159
			change-id: cid-deb6e74552de9afcf45391c6c230a9fe00785e37

(cherry picked from commit 86308c3)
- parse JSON in report.go
- write format functions
- modify nested properties
- fix unit tests
- fix checkstyle issues
return stacktrace.Propagate(err, "error unmarshalling json from java command")
}

if reportArg == "summary" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this approach is very brute force, but it is hard to determine which fields need to be formatted. i have 2 ideas:

  1. represent the output as a struct. set annotations on struct fields to determine if any conversion function should be applied when serializing
  • pros: define exactly which fields need to be converted, no chance for mistakenly converting the wrong field
  • cons: need to define a struct for the output and the struct needs to be updated if the output is updated
  1. define a series of patterns/rules to determine if a field needs to be converted (ex. if field name ends with Bytes, cast value to float64 and run convertBytesToString
  • pros: no need to do anything if fields are changed
  • cons: need to be careful with field names. also it can be hard to navigate the output if the output is not a flat map (ex. some fields have a map as a value instead of a string or number)

example for 1:

type SummaryOutput struct {
  FreeCapacity float64 `json:"freeCapacity" yaml:"freeCapacity" format:"convertBytesToString"`
  Started float64 `json:"started" yaml:"started" format:"convertMsToDatetime"`
  ...
}

func (o SummaryOutput) MarshalJSON() ([]byte, error) {
  val := reflect.Value(o)
  // the below can be abstracted to a function, reusable for each struct
  obj := orderedmap.New()
  for i := 0; i < val.NumField(); i++ {
    f := val.Field(i)
    convertName, ok := f.Tag.Lookup("format")
    if !ok {
      obj.Set(<fieldName>, <fieldValue>)
      continue
    }
    convertFunc, ok := convertFuncMap[convertName] // convertFuncMap is a mapping of name (ex. "convertBytesToString") to the function definition
    // check ok
    obj.Set(<fieldName>, convertFunc(<fieldValue>)
  } 
  return json.Marshal(obj)
}

var summary SummaryOutput
json.Unmarshal(buf, &summary)
json.Marshal(summary)

example for 2:
use the argument suffix as the way to determine if the value should be formatted.

  • if the key ends with Bytes -> cast value to float64 and run convertBytesToString
  • if the key ends with Date -> cast value to float64 and run convertMsToDatetime
  • if the key ends with Duration -> cast the value to float64 and run convertMsToDuration
  • if the key ends with Tiers -> assume the value is a map of key -> bytes

this will require some keys to be renamed to fit the above mapping

  • freeCapacity -> freeCapacityBytes
  • started -> startDate
  • uptime -> uptimeDuration
// after populating ordered map obj
out := orderedMap.New()
for k, v := obj {
  if strings.HasSuffix(k, "Bytes") {
    out.Set(k, convertBytesToString(v.(float64))
  else if ...
}
json.Marshal(out)

Copy link
Contributor Author

@twalluxio twalluxio Sep 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer example 2, it's right time to review all summary property formats and keep them in similar formats.
But there is one more thing, need to think about a way to handle nested JSON conversion.
There are three scenarios:

  1. The key-value pair to convert is in an object node. e.g. "report": {"keyBytes": valueBytes}
  2. The key-value pair to convert is in an array node. e.g. "reports": [{"keyBytes": valueBytes}, {...}]
  3. The key-value pair to convert is in a map. e.g. "report": {"keyBytes": {"keyMap": valueBytes}, {...}}

I might need to think about all scenarios.

cli/src/alluxio.org/cli/cmd/info/report.go Outdated Show resolved Hide resolved
cli/src/alluxio.org/cli/cmd/info/report.go Outdated Show resolved Hide resolved
if val, ok := obj.Get("uptime"); ok {
obj.Set("uptime", convertMsToDuration(val.(float64)))
}
if val, ok := obj.Get("totalCapacityOnTiers"); ok {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jiacheliu3 can the tiers information be safely removed?

}
}

if reportArg == "jobservice" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jiacheliu3 also can you help to confirm if this report is still needed?

cli/src/alluxio.org/cli/cmd/info/report.go Show resolved Hide resolved
cli/src/alluxio.org/cli/cmd/info/report.go Outdated Show resolved Hide resolved
cli/src/alluxio.org/cli/cmd/info/report.go Show resolved Hide resolved
cli/src/alluxio.org/cli/cmd/info/report.go Outdated Show resolved Hide resolved
@Xenorith Xenorith added the type-feature This issue is a feature request label Sep 22, 2023
@Xenorith
Copy link
Contributor

alluxio-bot, merge this please

@alluxio-bot alluxio-bot merged commit ba7fedf into Alluxio:main Sep 22, 2023
12 checks passed
@twalluxio twalluxio deleted the golangCLI-space-date-formats branch November 8, 2023 02:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type-feature This issue is a feature request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants