Time travel control #2524

Merged
fbarl merged 21 commits into master from timeline-prototype
Jun 12, 2017

Conversation

fbarl
Contributor

@fbarl fbarl commented May 16, 2017

Resolves #1620 by adding a timeline control next to the footer in the bottom-right corner of the screen.

Currently the feature is restricted to Weave Cloud (after #2575 is merged, we could also make it work with Scope standalone if the collector service is up and running) and hidden behind a feature flag (as we want to take time experimenting on dev while not blocking roll-outs to prod) - see #2607.

Also, time travel currently only happens on the websocket channel (the node deltas), so the other individual API calls also need to be extended with a custom timestamp to make the whole UI correct and consistent.

Notable changes
  • Extended the backend handleWebsocket function to accept an optional timestamp parameter from the UI, so that periodic reports are sent starting from that timestamp (instead of the default time.Now()).
  • Created a timeline component with a slider and moved the pause button there.
  • Moved nodes delta buffer into the global state to fix the bug with paused info not being updated after every change.
  • Improved empty topology error messaging a bit by distinguishing between two cases:
    • probe not reporting any nodes (because something is not properly configured)
    • some nodes are reported but none is displayed (because of the particular timestamp or the filters applied in the UI)
  • Refactored some UI websockets code.
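As a rough illustration of the first change above, the UI could pass the optional timestamp as a query parameter when opening the websocket. This is only a sketch: the helper name buildWebsocketUrl, the timestamp parameter name and the api/topology/<id>/ws path are assumptions made for illustration, not taken from this PR.

```javascript
// Hypothetical helper: builds a websocket URL and adds a timestamp only when
// the user has travelled into the past. All names here are illustrative.
function buildWebsocketUrl(baseUrl, topologyId, millisecondsInPast, now = Date.now()) {
  const path = `api/topology/${encodeURIComponent(topologyId)}/ws`;
  const url = new URL(path, baseUrl);
  if (millisecondsInPast > 0) {
    // ISO 8601 timestamp; omitting it lets the backend fall back to "now".
    const timestamp = new Date(now - millisecondsInPast).toISOString();
    url.searchParams.set('timestamp', timestamp);
  }
  return url.toString();
}

// Usage: live view vs. one minute in the past.
const exampleNow = Date.UTC(2017, 5, 6, 15, 33, 0);
const liveUrl = buildWebsocketUrl('ws://localhost:4040/', 'containers', 0, exampleNow);
const pastUrl = buildWebsocketUrl('ws://localhost:4040/', 'containers', 60000, exampleNow);
```

Leaving the parameter out in the live case keeps the default behaviour untouched, so a backend that does not know about timestamps would be unaffected.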
Next steps (discussed with @2opremio)
  1. Clicking on node details makes a direct call to api/topology/[topId]/[nodeId], disregarding the timestamp. So the API needs to be extended similarly to websockets to make the node details data consistent with time travel control. The same goes for api/topology? calls, which give the info about all the available (sub)topologies as well as their stats (node/edge count, etc.). Here we need to give some thought to the UI, e.g. what happens when we go back in time to a point where the topology we are currently viewing did not exist? - Time Travel: Use timestamp in calls to /api/topology #2609
  2. A couple of optimizations should be done on the backend to make nodes delta requests faster. The problem is that with the current architecture, most of the queries about the past would bypass Memcache and always call S3 for the data. Some hand-waving ideas:
    i. Whenever a request that comes through the query service ends up pulling the info from S3, we might want to add it to Memcache and optionally to an instance-local memory cache.
    ii. After that, we might try predicting and pre-filling Memcache with the next probable requested timestamps (e.g. near future from the current timestamp).
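The two caching ideas above could be sketched roughly as follows. This is not the backend's design (the real query service is not written in JavaScript): ReportCache, fetchFromStore and the step size are hypothetical names, and an in-process Map stands in for Memcache.

```javascript
// Hypothetical read-through cache in front of a slow store (standing in for S3).
class ReportCache {
  constructor(fetchFromStore, stepMs = 15000) {
    this.fetchFromStore = fetchFromStore; // assumed: (key) => value or Promise
    this.stepMs = stepMs;
    this.entries = new Map();
  }

  // Round timestamps down so that nearby requests share one cache key.
  keyFor(timestampMs) {
    return Math.floor(timestampMs / this.stepMs) * this.stepMs;
  }

  // Idea (i): on a miss, fetch from the store and remember the result.
  async get(timestampMs) {
    const key = this.keyFor(timestampMs);
    if (!this.entries.has(key)) {
      const pending = Promise.resolve(this.fetchFromStore(key));
      // Do not keep failed fetches around; a later request may succeed.
      pending.catch(() => this.entries.delete(key));
      this.entries.set(key, pending);
    }
    return this.entries.get(key);
  }

  // Idea (ii): warm the cache for the next few steps after the requested time.
  prefetchAfter(timestampMs, steps = 2) {
    for (let i = 1; i <= steps; i += 1) {
      this.get(timestampMs + i * this.stepMs).catch(() => {});
    }
  }
}
```

Caching the pending promise rather than the resolved value means concurrent requests for the same time bucket share a single fetch.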
Some other pain points
  • The initial load, which now happens after every jump in time, is still very slow and makes the timeline experience worse. This is a tough one as it would probably involve a lot of backend optimizations.
  • The position of the timeline component is not ideal, especially when expanded and on narrow screens. The buttons/filters/selectors in the Scope UI have piled up again, so we should probably revisit all the elements and come up with a cleaner view. That especially holds for the bottom of the screen (see Status pane overlaps and hides namespace pane #2569).
  • The graph layout is not deterministic, and the diff criterion for layout caching is too weak (two overlapping nodes, drawn at the same location #2356), which sometimes results in an ugly layout being rendered when travelling in time. My take on this is that we should try making the layout deterministic, and then we could maybe reset the cache when jumping in time.
  • The UI websockets code seems a bit fragile. A good time to address that would be when we add support for multiple channels/topologies to make resource views dynamic.
  • Some parts of the code could be unit tested nicely, but I didn't do it because I wanted to have the PR ready asap.

@fbarl fbarl self-assigned this May 16, 2017
@fbarl fbarl force-pushed the timeline-prototype branch from 40a4e78 to c6d96ca Compare May 16, 2017 16:43
@rade rade mentioned this pull request May 18, 2017
@fbarl fbarl force-pushed the timeline-prototype branch from c6d96ca to 865ba9a Compare May 29, 2017 17:39
@davkal
Contributor

davkal commented Jun 1, 2017

Unstructured comments:

  • works super smooth. But maybe a bit too smooth. The UI could show off a bit more that you're seeing the past
  • the time keeps ticking after clicking pause
  • "Pause" and the pause icon are too close together
  • The date format is not ISO
  • The component's background touches the border of the viewport (no other component does)
  • Showing the time itself should be reserved for showing past, when showing the current state better use "Now", as it's visually distinct
  • The time selector has too many options, I'd limit it to 3, maximum 6. Better to choose values that are nearer to now
  • The slider is not very intuitive, possibly because there are no ticks, like you would expect on a timeline. If anything, maybe a tape-recorder with play/fast forward may be a better metaphor.
  • Instana shows a big overlay across the screen after each big interaction (for us that could be a topology change), to remind the user that they are looking at the past
  • The clock icon is not very intuitive, here I prefer the play/forward metaphor again
  • Jump-to-now will be an important escape action, and the current icon on the right is a bit far removed from the other controls (I understand why it's there, but I think it belongs to the controls). Plus, a text that supports the icon would help as I suspect that the action would be quite important
  • The nodes could use some enter/exit animations to draw attention to what changed as a consequence of moving the slider. We could also add the date to the loading message on the topologies (where it says "Optimizing topologies...")
  • The title "Explore" should be replaced by something more instructive

@fbarl fbarl force-pushed the timeline-prototype branch 3 times, most recently from f6100fe to 6f43bd9 Compare June 6, 2017 15:33
@fbarl
Contributor Author

fbarl commented Jun 6, 2017

Thanks for your comments @davkal, they were really useful! Here are my responses:

works super smooth. But maybe a bit too smooth. The UI could show off a bit more that you're seeing the past

I made one or two changes that hopefully address that issue now. Most notably, I made the whole timestamp blink (unless we're at now).

the time keeps ticking after clicking pause

Yeah, the pausing was temporarily broken at that commit but that's been fixed now.

"Pause" and the pause icon are too close together

That was a CSS bug that has now also been fixed.

The date format is not ISO

Not sure how I feel about that. I chose this particular format because it's more readable, and the argument of ISO timestamp being nicer to copy & paste is a bit tricky because selection gets removed every time the value changes (which is normally every second). Are there any other arguments in favor of ISO?
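For comparison, here is a small sketch that prints the same instant in both styles. It uses only the built-in Date API rather than Moment.js, so formatReadableUtc is a hand-written, hypothetical stand-in for the 'MMMM Do YYYY, h:mm:ss a' format string and not the code from this PR.

```javascript
const MONTHS = ['January', 'February', 'March', 'April', 'May', 'June', 'July',
  'August', 'September', 'October', 'November', 'December'];

// 1 -> "1st", 2 -> "2nd", 3 -> "3rd", 11 -> "11th", 22 -> "22nd", ...
function ordinal(n) {
  const rem100 = n % 100;
  if (rem100 >= 11 && rem100 <= 13) return `${n}th`;
  return `${n}${['th', 'st', 'nd', 'rd'][n % 10] || 'th'}`;
}

// Hypothetical stand-in for the readable format, always rendered in UTC.
function formatReadableUtc(date) {
  const pad = value => String(value).padStart(2, '0');
  const hours24 = date.getUTCHours();
  const hours12 = hours24 % 12 === 0 ? 12 : hours24 % 12;
  const meridiem = hours24 < 12 ? 'am' : 'pm';
  const day = `${MONTHS[date.getUTCMonth()]} ${ordinal(date.getUTCDate())} ${date.getUTCFullYear()}`;
  const time = `${hours12}:${pad(date.getUTCMinutes())}:${pad(date.getUTCSeconds())} ${meridiem}`;
  return `${day}, ${time} UTC`;
}

const sample = new Date(Date.UTC(2017, 5, 6, 15, 33, 0));
const readable = formatReadableUtc(sample); // "June 6th 2017, 3:33:00 pm UTC"
const iso = sample.toISOString();           // "2017-06-06T15:33:00.000Z"
```

The ISO form sorts lexicographically and is unambiguous across locales, while the readable form is easier to scan at a glance, which is the trade-off being discussed here.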

The component's background touches the border of the viewport (no other component does)

Right, I made the timeline more consistent with how the footer is shown. A bigger problem is where to position the component on the screen. I think the current position is not that bad, but to make it better for low resolutions, we should probably either merge it back into the footer or clear up some other corner of the screen (and make sure it doesn't overlap with node details). Some different ideas:

[Screenshot: bottom-right]
[Screenshot: bottom-right-2]

Showing the time itself should be reserved for showing past, when showing the current state better use "Now", as it's visually distinct

That's a very good point - I made the changes.

The time selector has too many options, I'd limit it to 3, maximum 6. Better to choose values that are nearer to now

I dropped it from 16 to 12, which I think is quite visually manageable - in any case, it's much less than what Grafana has. I don't think we have a problem with vertical space there and at the same time my impression is that it might be quite useful to go back far into the past. @2opremio what do you think?

The slider is not very intuitive, possibly because there are no ticks, like you would expect on a timeline. If anything, maybe a tape-recorder with play/fast forward may be a better metaphor.

I was feeling the same so I added a short explanation just above the slider. Ticks would make things more explicit, but timestamps take too much space to make good use of them IMO. I would personally prefer to stick with the slider as it enables us to jump between timestamps faster and it's also a good visual indicator of where we are in time.

Instana shows a big overlay across the screen after each big interaction (for us that could be a topology change), to remind the user that they are looking at the past

That sounds like a good idea, but the new blinking timestamp already seems sufficiently aggressive to me :) If you disagree, maybe we can talk about how this could be done nicely.

The clock icon is not very intuitive, here I prefer the play/forward metaphor again

I made the whole timestamp + clock icon into a clickable button which should hopefully make it more intuitive.

Jump-to-now will be an important escape action, and the current icon on the right is a bit far removed from the other controls (I understand why it's there, but I think it belongs to the controls). Plus, a text that supports the icon would help as I suspect that the action would be quite important

I put the button between the timestamp and the pause button (it felt right since it had to do more with the timestamp and less with the paused state). I don't see the button as super important since we get the same effect by pulling the slider to the right end, so I decided to skip the text. Hopefully the tooltip is enough to suggest what the button does.

The nodes could use some enter/exit animations to draw attention to what changed as a consequence of moving the slider. We could also add the date to the loading message on the topologies (where it says "Optimizing topologies...")

The idea that @rade and @pidster suggested was to fade out the nodes while we're moving to a different point in time so that the diff in nodes and edges would be more visible between the states. Since we're skipping a usual loader, I also added a small spinner to the timeline control while the user is transitioning.

As for animating nodes & edges, it is already being done, but until we make the layout engine deterministic, I had to choose between not using the cache (a bunch of stuff that didn't change gets repositioned) and using the cache (things sometimes don't move around even when they change). I decided to go with the latter because the diff is easier to observe, but I'd be happy to work on further refining the caching and/or stability of the layout engine in some other PR.

The title "Explore" should be replaced by something more instructive

I'm still open to suggestions on what that something should be exactly, but I think that the text I inserted between the range options and the slider already removes a lot of confusion.

@fbarl fbarl changed the title from "[WIP] Timeline control prototype" to "Timeline control" Jun 7, 2017
@fbarl fbarl requested a review from davkal June 7, 2017 15:19
@fbarl fbarl force-pushed the timeline-prototype branch from 5c4c4e7 to 5236eb3 Compare June 7, 2017 15:22
@fbarl fbarl force-pushed the timeline-prototype branch from c47a363 to 5abaa8a Compare June 8, 2017 14:49
Contributor

@davkal davkal left a comment


Blocker: Add mixpanel tracking.

Otherwise, code and UX look good to merge. LGTM

Minor line comments.

Extensions:

  • query available times for travel, and grey out unavailable options
  • direct editing of times, for incident response

this.updateTimestamp.bind(this), TIMELINE_DEBOUNCE_INTERVAL);
}

componentWillMount() {

This comment was marked as abuse.


componentWillUnmount() {
clearInterval(this.timer);
this.updateTimestamp();

This comment was marked as abuse.

return (
<div className="timeline-control">
{showSliderPanel && <div className="slider-panel">
<span className="caption">Explore</span>

This comment was marked as abuse.

{this.renderRangeOption(sliderRanges.last90Days)}
{this.renderRangeOption(sliderRanges.last1Year)}
</div>
<div className="column">

This comment was marked as abuse.


componentWillMount() {
// Force periodic re-renders to update the slider position as time goes by.
this.timer = setInterval(() => { this.forceUpdate(); }, TIMELINE_SLIDER_UPDATE_INTERVAL);

This comment was marked as abuse.


this.setState({ millisecondsInPast });
this.debouncedUpdateTimestamp(millisecondsInPast);
this.props.startWebsocketTransition();
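
The handler above updates local state right away but defers the timestamp update through a debounced function. Below is a minimal sketch of that pattern with a hand-rolled debounce. The PR may well rely on a library helper instead, and createSlider, handleSliderChange and onTimestampChange are illustrative names only.

```javascript
// Returns a wrapper that calls fn only after waitMs without further calls.
function debounce(fn, waitMs) {
  let timer = null;
  return (...args) => {
    clearTimeout(timer);
    timer = setTimeout(() => {
      timer = null;
      fn(...args);
    }, waitMs);
  };
}

// Hypothetical slider model: state changes immediately, the expensive
// timestamp update (e.g. reconnecting the websocket) is debounced.
function createSlider(onTimestampChange, waitMs) {
  const state = { millisecondsInPast: 0 };
  const debouncedUpdate = debounce(onTimestampChange, waitMs);
  return {
    state,
    handleSliderChange(millisecondsInPast) {
      state.millisecondsInPast = millisecondsInPast; // instant UI feedback
      debouncedUpdate(millisecondsInPast);           // fires once dragging settles
    },
  };
}
```

With this shape, dragging the slider through many positions triggers a single update for the final position instead of one request per intermediate value.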

This comment was marked as abuse.

@@ -1,28 +1,23 @@
import debug from 'debug';
import find from 'lodash/find';
import { find } from 'lodash';

This comment was marked as abuse.

type: ActionTypes.CLICK_RESUME_UPDATE
});
// Periodically merge buffered nodes deltas until the buffer is emptied.
nodesDeltaBufferUpdateTimer = setInterval(

This comment was marked as abuse.

import { TIMELINE_TICK_INTERVAL } from '../constants/timer';


class TopologyTimestampButton extends React.Component {

This comment was marked as abuse.

const timestamp = isPaused ? updatePausedAt : moment().utc().subtract(millisecondsInPast);

return (
<time>{timestamp.format('MMMM Do YYYY, h:mm:ss a')} UTC</time>

This comment was marked as abuse.

@fbarl fbarl changed the title from "Timeline control" to "Time travel control" Jun 9, 2017
@fbarl
Copy link
Contributor Author

fbarl commented Jun 9, 2017

From the feedback I got today from @2opremio and @davkal, the UX of the Time Travel component as implemented in this PR still looks far from natural and intuitive and there are quite a few pain points to be addressed.

Since the PR is quite big already and it's been hanging there for a while, I will merge it as it is now so that we can test its behaviour in dev, where we'll be able to get a feeling of how bad the performance really is and how much additional backend work will be needed to achieve smooth performance. Everything will be hidden behind a feature flag which will allow uninterrupted roll-outs to prod while being able to test it on dev.

Regarding the next steps, I think it would be useful for me to get a fresh perspective about the UX from @bia. Otherwise, we should probably discuss more which controls would be most important for the users and which ones would be secondary/optional (while @pidster & @rade prefer the visual sliders, @davkal & @2opremio feel the direct timestamp editing would be more useful). :)

@fbarl fbarl merged commit b6dfe25 into master Jun 12, 2017
@fbarl fbarl deleted the timeline-prototype branch June 19, 2017 12:41
Development

Successfully merging this pull request may close these issues.

Time travel support
2 participants