Slave Usage Monitoring UI #1405

Merged
merged 43 commits into from
Mar 20, 2017

Conversation

matush-v
Contributor

@matush-v matush-v commented Jan 24, 2017

This is the front-end for the slave usage tracking. Definitely a work in progress, but the general structure is there. Feel free to edit the task lists.

Design Tasks:

  • slave button styling
  • slave stat menu styling
  • aggregate page styling: do we want aggregate info/what do we want here?

Functionality Tasks:

  • what are the thresholds and where are they defined
  • over subscription issues
  • timestamp display locale
  • clicking on a slave stat copies to clipboard
  • how to show dead slaves?
  • show/sort by rack info?
  • filtering/sorting options

Updated - 2/21/2017
image

cc: @ssalinas @wsorenson @tpetr
BE: #1400

@ssalinas
Member

A few comments based on your task list above:

  • There is a method in utils for converting those timestamps to a more human-readable format
  • Not sure what you categorize as disabled slaves, but it might be good to separate out at least the 'Dead' ones like on the slaves page
  • Would be interesting to be able to aggregate by rack (i.e. one big box of summed up metrics for each rack)
  • would be nice to be able to search/filter for a slave in some way. Not sure what the best way for that would be since we don't want those boxes to be text-heavy at all
  • For the over-subscription piece, we should add extra attributes that Singularity will look for on the slave. If those attributes are present, it will use that as the 'total' for mem/cpu/etc. If they are not present it will use the values reported by the slave. This will need to be a change on the backend PR
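
The attribute fallback described in the last bullet could be sketched roughly as below. This is written in JavaScript purely for illustration, even though the real change belongs in the backend PR. The attribute and resource key names (`usageCpus`, `cpus`) and the shape of the slave object are assumptions, not Singularity's actual fields.

```javascript
// Hypothetical slave shape: { attributes: {...}, resources: {...} }.
// Key names passed in are assumptions for illustration only.
function resolveTotal(slave, attributeKey, resourceKey) {
  const attributes = slave.attributes || {};
  const resources = slave.resources || {};
  const override = Number(attributes[attributeKey]);
  // Use the attribute only when it is present and a positive finite number.
  if (Number.isFinite(override) && override > 0) {
    return override;
  }
  // Otherwise fall back to the value reported by the slave.
  return Number(resources[resourceKey]);
}

// Usage percentage with the resolved total as the denominator.
// Returns null when the total is missing or not positive.
function usagePercent(used, total) {
  if (!Number.isFinite(total) || total <= 0) {
    return null;
  }
  return (used / total) * 100;
}
```

Returning `null` rather than dividing by a bad total lets the UI decide how to render a slave whose capacity is unknown.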

Will look more at the actual js and leave comments there shortly

@matush-v
Contributor Author

The actual js has a lot of room for improvement. Modularization is one I'm working on now. I think that review is better saved for once the design is decided since it can change a lot before that.

@wsorenson
Contributor

cool!

We need to merge the data with the slave APIs so that they show hostname and cpusUsed vs cpusAllocated (as opposed to the raw number)

Maybe that was obvious / not implemented yet, wanted to make sure!

@ssalinas
Member

Yep we talked about using the attribute values (falling back to the resource ones) as the denominator for getting the usage percentages already

And yes @darcatron, it would be nice if we sub that slave ID in with the hostname; it would be much easier to reason about/read in the UI that way
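
A minimal sketch of swapping the slave ID for a hostname by joining usage records with the slave list. The field names (`slaveId`, `id`, `host`) are assumptions for illustration and may not match the actual API responses.

```javascript
// Field names (slaveId, id, host) are hypothetical.
function withHostnames(usages, slaves) {
  // Index slaves by ID so each usage record is resolved in constant time.
  const hostById = new Map(slaves.map((slave) => [slave.id, slave.host]));
  return usages.map((usage) => ({
    ...usage,
    // Fall back to the raw slave ID when no matching slave is known.
    hostname: hostById.get(usage.slaveId) || usage.slaveId
  }));
}
```

Falling back to the ID keeps a box labelled even when the slave list and the usage data are briefly out of sync.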

Member

@ssalinas ssalinas left a comment

What you have here will function. But for cleaner, easier-to-understand React code, I think it might be better to separate this out into more components.

Right now you have the one large SlaveUsage component. I think you could easily have a component for each Slave box (that can take props like the warn/crit levels and actual numerical stats), and maybe even a sub-component for the info box that has the raw stats displayed
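
One way to picture the split: keep the threshold logic in small pure functions that work only from props, so that a per-slave box component has nothing left to do but render. The sketch below is plain JavaScript without JSX. The prop shape and the `danger` style are assumptions; only `warning` and `success` appear in the diff.

```javascript
// 'warning' and 'success' come from the diff; 'danger' is an assumed critical style.
const STYLES = { critical: 'danger', warning: 'warning', ok: 'success' };

// Level of a single stat, given thresholds passed in as props.
function statLevel(value, thresholds) {
  if (value >= thresholds.crit) return 'critical';
  if (value >= thresholds.warn) return 'warning';
  return 'ok';
}

// What a per-slave box component could derive from its props:
// the style of the worst level across all of its stats.
function slaveBoxStyle(props) {
  const levels = props.stats.map((stat) => statLevel(stat.value, stat.thresholds));
  if (levels.includes('critical')) return STYLES.critical;
  if (levels.includes('warning')) return STYLES.warning;
  return STYLES.ok;
}
```

With the logic isolated like this, the info box sub-component could reuse `statLevel` to colour each raw stat individually.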

@@ -2,14 +2,14 @@ import { buildApiAction, buildJsonApiAction } from './base';

export const FetchSlaves = buildApiAction(
'FETCH_SLAVES',
{url: '/slaves'}
Member

why did you remove all of the leading slashes here?

Contributor Author

@matush-v matush-v Jan 25, 2017

the buildApiAction already has /api/ so it's just an extra slash. ../api//slaves will work, but it's not proper so I figured i'd remove it. I can drop it from the rest of the files too, but prob better for another pr
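
For illustration, a double slash can be avoided regardless of whether callers pass a leading slash by normalizing at the join point. `joinUrl` is a hypothetical helper and is not part of `buildApiAction`.

```javascript
// Hypothetical helper: join a base and a path with exactly one slash between them.
function joinUrl(base, path) {
  const trimmedBase = base.replace(/\/+$/, ''); // drop trailing slashes from the base
  const trimmedPath = path.replace(/^\/+/, ''); // drop leading slashes from the path
  return `${trimmedBase}/${trimmedPath}`;
}
```

With a helper like this, both `'/slaves'` and `'slaves'` resolve to the same URL, so existing call sites would not need to change.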

Contributor Author

turns out this is handled in staging so the buildApiAction doesn't return the extra slash

var warningStyle = 'warning';
var okStyle = 'success';

if (isStatCritical(slave, props.cpusUsedStat) || isStatCritical(slave, props.memoryBytesUsedStat) || isStatCritical(slave, props.numTasksStat)) {
Member

we should probably catch the exceptions you are throwing in the case where we can't properly find the stats. It'd be better to show less stats (excluding the ones that threw an exception), rather than throwing js errors all the way up
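
A rough sketch of that suggestion: read each stat inside its own `try`/`catch` and keep only the ones that succeed. `getStatValue` is a hypothetical stand-in for the lookup in this PR, and the stat names are only examples.

```javascript
// Hypothetical stand-in for the PR's stat lookup; throws when the stat is missing.
function getStatValue(slave, statName) {
  if (!(statName in slave)) {
    throw new Error(`Stat ${statName} not found`);
  }
  return slave[statName];
}

// Collect only the stats that can be read; skip the ones that throw.
function readableStats(slave, statNames) {
  const stats = [];
  statNames.forEach((statName) => {
    try {
      stats.push({ name: statName, value: getStatValue(slave, statName) });
    } catch (error) {
      // Skip this stat so the remaining ones still render.
    }
  });
  return stats;
}
```

The caller can then run the critical and warning checks over whatever `readableStats` returns, so one missing stat no longer breaks the whole page.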

};

const slaveWithStats = (slave, index, bsStyle, glyphicon) => (
<Dropdown key={slave.slaveId} id={index.toString()}>
Member

Any reasoning behind using dropdown here? Seems like an odd component choice

Contributor Author

@matush-v matush-v Jan 25, 2017

The dropdown is the little stat menu you get when clicking on a slave btn. Using Dropdown over DropdownMenu allows more styling flexibility. Not sure if that's what you were talking about

@@ -36,6 +36,7 @@ const rootComponent = (Wrapped, title, refresh = _.noop, refreshInterval = true,
}).catch((reason) => {
// Boot React errors out of the promise so they can be picked up by Sentry
setTimeout(() => {
console.log(reason.stack);
Member

don't forget to remove these after you're done testing ;)

Contributor Author

I'm never done testing 😈

@matush-v matush-v added the hs_qa label Feb 2, 2017
@ssalinas ssalinas changed the title [WIP] Slave Usage Monitoring Slave Usage Monitoring UI Feb 2, 2017
@ssalinas ssalinas added this to the 0.14.0 milestone Feb 9, 2017
@matush-v
Contributor Author

@ssalinas got the updates in for the default values. Once you think it's correct for staging, i'll set our config for shortening the slave name to true and deploy it

@@ -239,6 +239,10 @@ public String getRedirectOnUnauthorizedUrl() {
return redirectOnUnauthorizedUrl;
}

public boolean isShortenSlaveUsageHostname() {
Member

FYI @darcatron, without the getter here, this value is not available when rendering the template, so {{ shortenSlaveUsageHostname }} would come back empty in the rendered template

@ssalinas ssalinas modified the milestones: 0.14.0, 0.15.0 Mar 13, 2017
@demobox
Contributor

demobox commented Mar 17, 2017

Just related to this, or perhaps for potential inspiration: we are currently doing similar monitoring of our Mesos agent usage and utilization via https://github.com/Capgemini/mesos-ui (video at https://youtu.be/bKbOod8Pn4E)

@ssalinas ssalinas merged commit e2d5c5a into master Mar 20, 2017
@ssalinas ssalinas deleted the slave-usage branch March 20, 2017 14:57
4 participants