Format reports on Golang CLI #18166
Update the Golang side commands to be able to use this output:

1. Return either the yaml or json (default) output to the console.
2. Users can define the format they want with `--format` flag, like `bin/alluxio info report --format yaml`
3. In JSON format, print properties in a fixed, easy-to-read order
4. In YAML format, print properties alphabetically (since YAML specification regards property order non-significant)

Before:
```
{"safeMode":false,"masterVersions":[{"version":"304-SNAPSHOT","host":"localhost","port":19998,"state":"PRIMARY"}],"masterAddress":"localhost:19998","zookeeperAddress":[],"useZookeeper":false,"raftJournalAddress":["localhost:19200"],"useRaftJournal":true,"liveWorkers":1,"lostWorkers":0,"freeCapacity":"1024.00MB","totalCapacityOnTiers":{"MEM":"1024.00MB"},"usedCapacityOnTiers":{"MEM":"0B"},"version":"304-SNAPSHOT","webPort":19999,"started":"09-15-2023 15:54:56:635","uptime":"0 day(s), 0 hour(s), 26 minute(s), and 37 second(s)","rpcPort":19998}
```

After (in JSON):
```
{
    "rpcPort": 19998,
    "started": "09-15-2023 15:54:56:635",
    "uptime": "0 day(s), 0 hour(s), 55 minute(s), and 31 second(s)",
    "safeMode": false,
    "version": "304-SNAPSHOT",
    "webPort": 19999,
    "masterVersions": [
        {
            "version": "304-SNAPSHOT",
            "host": "localhost",
            "port": 19998,
            "state": "PRIMARY"
        }
    ],
    "masterAddress": "localhost:19998",
    "zookeeperAddress": [],
    "useZookeeper": false,
    "raftJournalAddress": [
        "localhost:19200"
    ],
    "useRaftJournal": true,
    "liveWorkers": 1,
    "lostWorkers": 0,
    "freeCapacity": "1024.00MB",
    "totalCapacityOnTiers": {
        "MEM": "1024.00MB"
    },
    "usedCapacityOnTiers": {
        "MEM": "0B"
    }
}
```

After (in YAML):
```
freeCapacity: 1024.00MB
liveWorkers: 1
lostWorkers: 0
masterAddress: localhost:19998
masterVersions:
    - host: localhost
      port: 19998
      state: PRIMARY
      version: 304-SNAPSHOT
raftJournalAddress:
    - localhost:19200
rpcPort: 19998
safeMode: false
started: 09-15-2023 15:54:56:635
totalCapacityOnTiers:
    MEM: 1024.00MB
uptime: 0 day(s), 1 hour(s), 1 minute(s), and 36 second(s)
useRaftJournal: true
useZookeeper: false
usedCapacityOnTiers:
    MEM: 0B
version: 304-SNAPSHOT
webPort: 19999
zookeeperAddress: []
```

pr-link: Alluxio#18159
change-id: cid-deb6e74552de9afcf45391c6c230a9fe00785e37
(cherry picked from commit 86308c3)
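The difference between a fixed key order (point 3) and an alphabetical one (point 4) can be illustrated with the Go standard library alone. This is a minimal sketch, not the code from this PR: the helper names `prettyInSourceOrder` and `prettyAlphabetical` are hypothetical, and it only shows JSON, since YAML encoding needs a third-party package. It relies on two documented behaviours of `encoding/json`: `json.Indent` re-indents without reordering keys, while marshalling a Go map writes its keys sorted.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// prettyInSourceOrder (hypothetical helper) re-indents raw JSON and keeps
// the keys in the order in which they appear in the input.
func prettyInSourceOrder(raw []byte) (string, error) {
	var buf bytes.Buffer
	if err := json.Indent(&buf, raw, "", "  "); err != nil {
		return "", err
	}
	return buf.String(), nil
}

// prettyAlphabetical (hypothetical helper) decodes into a map and encodes
// it again; encoding/json writes map keys in sorted order.
func prettyAlphabetical(raw []byte) (string, error) {
	var m map[string]any
	if err := json.Unmarshal(raw, &m); err != nil {
		return "", err
	}
	out, err := json.MarshalIndent(m, "", "  ")
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	raw := []byte(`{"rpcPort":19998,"safeMode":false,"liveWorkers":1}`)
	ordered, _ := prettyInSourceOrder(raw)
	sorted, _ := prettyAlphabetical(raw)
	fmt.Println(ordered) // rpcPort, safeMode, liveWorkers
	fmt.Println(sorted)  // liveWorkers, rpcPort, safeMode
}
```

A plain Go map cannot carry a custom key order, which is presumably why an ordered map shows up in the review discussion for the JSON path.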
- parse JSON in report.go
- write format functions
- modify nested properties
- fix unit tests
- fix checkstyle issues
    return stacktrace.Propagate(err, "error unmarshalling json from java command")
}

if reportArg == "summary" {
this approach is very brute force, but it is hard to determine which fields need to be formatted. i have 2 ideas:
1. represent the output as a struct. set annotations on struct fields to determine if any conversion function should be applied when serializing
   - pros: define exactly which fields need to be converted, no chance of mistakenly converting the wrong field
   - cons: need to define a struct for the output, and the struct needs to be updated if the output is updated
2. define a series of patterns/rules to determine if a field needs to be converted (ex. if the field name ends with `Bytes`, cast the value to `float64` and run `convertBytesToString`)
   - pros: no need to do anything if fields are changed
   - cons: need to be careful with field names. also it can be hard to navigate the output if the output is not a flat map (ex. some fields have a map as a value instead of a string or number)
example for 1:

    type SummaryOutput struct {
        FreeCapacity float64 `json:"freeCapacity" yaml:"freeCapacity" format:"convertBytesToString"`
        Started      float64 `json:"started" yaml:"started" format:"convertMsToDatetime"`
        ...
    }

    func (o SummaryOutput) MarshalJSON() ([]byte, error) {
        val := reflect.ValueOf(o)
        typ := val.Type()
        // the below can be abstracted to a function, reusable for each struct
        obj := orderedmap.New()
        for i := 0; i < val.NumField(); i++ {
            f := typ.Field(i) // struct tags live on the StructField, not on the Value
            name := f.Tag.Get("json")
            convertName, ok := f.Tag.Lookup("format")
            if !ok {
                // no format tag: keep the value as is
                obj.Set(name, val.Field(i).Interface())
                continue
            }
            // convertFuncMap is a mapping of name (ex. "convertBytesToString") to the function definition
            convertFunc, ok := convertFuncMap[convertName]
            // check ok
            // assumes every tagged field is a float64
            obj.Set(name, convertFunc(val.Field(i).Float()))
        }
        return json.Marshal(obj)
    }

    var summary SummaryOutput
    json.Unmarshal(buf, &summary)
    json.Marshal(summary)
example for 2:
use the key suffix as the way to determine if the value should be formatted.
- if the key ends with `Bytes` -> cast the value to float64 and run `convertBytesToString`
- if the key ends with `Date` -> cast the value to float64 and run `convertMsToDatetime`
- if the key ends with `Duration` -> cast the value to float64 and run `convertMsToDuration`
- if the key ends with `Tiers` -> assume the value is a map of key -> bytes

this will require some keys to be renamed to fit the above mapping
- freeCapacity -> freeCapacityBytes
- started -> startDate
- uptime -> uptimeDuration

    // after populating ordered map obj
    out := orderedmap.New()
    for _, k := range obj.Keys() {
        v, _ := obj.Get(k)
        if strings.HasSuffix(k, "Bytes") {
            out.Set(k, convertBytesToString(v.(float64)))
        } else if ...
    }
    json.Marshal(out)
I prefer example 2. It's the right time to review all summary property formats and make them consistent.
One more thing, though: we need a way to handle conversion inside nested JSON.
There are three scenarios:
- The key-value pair to convert is in an object node, e.g.
  `"report": {"keyBytes": valueBytes}`
- The key-value pair to convert is in an array node, e.g.
  `"reports": [{"keyBytes": valueBytes}, {...}]`
- The key-value pair to convert is in a map, e.g.
  `"report": {"keyBytes": {"keyMap": valueBytes}, {...}}`

I need to think through all of these scenarios.
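One way to cover all three scenarios is a recursive walk over the decoded JSON. The following is only a sketch under stated assumptions, not the implementation in this PR: `formatNode` and `formatJSON` are hypothetical names, `convertBytesToString` is a simplified stand-in for the real formatter, and plain `map[string]any` values are used instead of an ordered map, so key order is not preserved.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// convertBytesToString is a simplified stand-in for the real formatter.
func convertBytesToString(b float64) string {
	return fmt.Sprintf("%.0fB", b)
}

// formatNode (hypothetical) walks a decoded JSON value and rewrites, in
// place, every number stored under a key that ends with "Bytes".
func formatNode(node any) {
	switch n := node.(type) {
	case map[string]any:
		for k, v := range n {
			isBytes := strings.HasSuffix(k, "Bytes")
			switch child := v.(type) {
			case float64:
				if isBytes {
					n[k] = convertBytesToString(child) // scenario 1: object node
				}
			case map[string]any:
				if !isBytes {
					formatNode(child) // keep descending
					continue
				}
				// scenario 3: a map of name -> bytes under a "...Bytes" key
				for mk, mv := range child {
					if f, ok := mv.(float64); ok {
						child[mk] = convertBytesToString(f)
					}
				}
			default:
				formatNode(child) // arrays are handled below, scalars are ignored
			}
		}
	case []any:
		for _, item := range n {
			formatNode(item) // scenario 2: array node
		}
	}
}

// formatJSON (hypothetical) decodes raw JSON, formats it, and encodes it again.
func formatJSON(raw []byte) (string, error) {
	var doc any
	if err := json.Unmarshal(raw, &doc); err != nil {
		return "", err
	}
	formatNode(doc)
	out, err := json.Marshal(doc)
	return string(out), err
}

func main() {
	raw := []byte(`{"report":{"keyBytes":1024},"reports":[{"keyBytes":2048}],"tiers":{"capacityBytes":{"MEM":4096}}}`)
	out, err := formatJSON(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

The same walk could dispatch on the `Date`, `Duration` and `Tiers` suffixes as well; only `Bytes` is shown to keep the sketch short.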
if val, ok := obj.Get("uptime"); ok {
    obj.Set("uptime", convertMsToDuration(val.(float64)))
}
if val, ok := obj.Get("totalCapacityOnTiers"); ok {
@jiacheliu3 can the tiers information be safely removed?
    }
}

if reportArg == "jobservice" {
@jiacheliu3 also can you help to confirm if this report is still needed?
alluxio-bot, merge this please
Currently, Java is in charge of formatting report output, e.g. converting file sizes in bytes to readable values and formatting dates.
This PR moves the formatting to the Golang CLI side: when a report is requested, the raw data is transferred to the Golang CLI, and the CLI handles the formatting.
Current date format: `2006-01-02T15:04:05Z07:00`
Current duration format: `0d 12h34m56s`
Current file size format: `1024.00MB` (scale up when number > 5120)

Format on dates:
- `summary` object -> `started` object
- `job_service` object -> `masterStatus` array -> `startTime` object
- `job_service` object -> `recentModifiedJobs` array -> `timestamp` object
- `job_service` object -> `recentFailedJobs` array -> `timestamp` object
- `job_service` object -> `longestRunningJobs` array -> `timestamp` object

Format on duration:
- `summary` object -> `uptime` object

Format on file size:
- `summary` object -> `freeCapacity` object
- `summary` object -> `totalCapacityOnTiers` map
- `summary` object -> `usedCapacityOnTiers` map
- `ufs` map -> `ufsCapacityBytes` object
- `ufs` map -> `ufsUsedBytes` object
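The three formats above could be produced by helpers along the following lines. This is a hedged sketch, not the code merged in this PR: the function names are taken from the review discussion, but the bodies are assumptions. In particular, rendering dates in UTC, zero-padding the duration fields, printing plain bytes without decimals, and the exact unit list are guesses made only to match the sample values shown in this PR.

```go
package main

import (
	"fmt"
	"time"
)

// convertMsToDatetime renders epoch milliseconds in the layout
// 2006-01-02T15:04:05Z07:00 (time.RFC3339). Using UTC is an assumption.
func convertMsToDatetime(ms float64) string {
	return time.UnixMilli(int64(ms)).UTC().Format(time.RFC3339)
}

// convertMsToDuration renders milliseconds as "0d 12h34m56s".
// Zero-padding of hours, minutes and seconds is an assumption.
func convertMsToDuration(ms float64) string {
	total := int64(ms) / 1000 // whole seconds
	days := total / 86400
	hours := total % 86400 / 3600
	minutes := total % 3600 / 60
	seconds := total % 60
	return fmt.Sprintf("%dd %02dh%02dm%02ds", days, hours, minutes, seconds)
}

// convertBytesToString moves to the next unit only while the number is
// greater than 5120, so 1 GiB is shown as "1024.00MB" rather than "1.00GB".
func convertBytesToString(b float64) string {
	if b <= 5120 {
		return fmt.Sprintf("%.0fB", b)
	}
	units := []string{"KB", "MB", "GB", "TB", "PB"}
	v, i := b/1024, 0
	for v > 5120 && i < len(units)-1 {
		v /= 1024
		i++
	}
	return fmt.Sprintf("%.2f%s", v, units[i])
}

func main() {
	fmt.Println(convertMsToDatetime(0))
	fmt.Println(convertMsToDuration(45296000))
	fmt.Println(convertBytesToString(1 << 30))
}
```

With these assumptions, 1 GiB stays at `1024.00MB` because 1024 does not exceed the 5120 threshold, which matches the `freeCapacity` value in the sample report.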