Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial telemetry implementation #289

Merged
merged 3 commits into from
Apr 19, 2016
Merged

Conversation

peschkaj
Copy link
Contributor

@peschkaj peschkaj commented Apr 7, 2016

This implements basic telemetry, as outlined in #254

  • Telemetry can be enabled/disabled with rustup telemetry on|off
  • Telemetry configuration is stored in ~/.multirust/telemetry
  • Telemetry is stored in ~/.multirust/telemetry-YYYY-MM-DD

@brson
Copy link
Contributor

brson commented Apr 7, 2016

Thanks for the quick turnaround @peschkaj!

Here are some initial impressions:

First, telemetry will need to be in the multirust library, not multirust-cli so it will be available to other clients, e.g. IDEs. I'd suggest just taken the functionality you've implemented here and just lowering it by one 'layer', so that telemetry is measured in e.g. Toolchain::add_component etc.

This is going to sound mean, but I'd like you to switch from serde to rustc-serialize for the serialization, both so we don't add more nightly dependencies, but also because serde still breaks a lot. rustc-serialize should work fine for our purpose.

On where telemetry 'lives', I'd say you've got the right idea by putting the TelemetryEvent enum in utils - telemetry may be needed from several rustup crates and utils has no dependencies. I'd further suggest putting all telemetry stuff in its own telemetry module; that includes the logic for dealing with filesystem, so your 3 function on Cfg would just just small pass-troughs to functions in utils::telemetry.

I'll also go and leave some individual comments on the patch.

@peschkaj
Copy link
Contributor Author

peschkaj commented Apr 7, 2016

Many thanks, none of it sounds mean at all.

I was worried about the serde dependency and what you say makes sense. I
think serde may be a leftover from when I was attempting to record
timestamps as an ISO8601 string before I punted and just used seconds since
epoch.

On Thu, Apr 7, 2016 at 1:44 PM Brian Anderson notifications@github.com
wrote:

Thanks for the quick turnaround @peschkaj https://github.com/peschkaj!

Here are some initial impressions:

First, telemetry will need to be in the multirust library, not
multirust-cli so it will be available to other clients, e.g. IDEs. I'd
suggest just taken the functionality you've implemented here and just
lowering it by one 'layer', so that telemetry is measured in e.g.
Toolchain::add_component etc.

This is going to sound mean, but I'd like you to switch from serde to
rustc-serialize for the serialization, both so we don't add more nightly
dependencies, but also because serde still breaks a lot. rustc-serialize
should work fine for our purpose.

On where telemetry 'lives', I'd say you've got the right idea by putting
the TelemetryEvent enum in utils - telemetry may be needed from several
rustup crates and utils has no dependencies. I'd further suggest putting
all telemetry stuff in its own telemetry module; that includes the logic
for dealing with filesystem, so your 3 function on Cfg would just just
small pass-troughs to functions in utils::telemetry.

I'll also go and leave some individual comments on the patch.


You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub
#289 (comment)

Jeremiah Peschka


if (arg0 == "rustc" || arg0 == "rustc.exe") && cfg.telemetry_enabled() == true {
return run_rustc_with_telemetry(cmd, &args, &cfg);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks to me like all this logic, including common::run_inner and common::run_rustc_with_telemetry will need to move into multirust, perhaps into something like Cfg::run_command_for_dir.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(because telemetry needs to be done in the lib).

@brson
Copy link
Contributor

brson commented Apr 7, 2016

OK, I've given lots of feedback on what I'd consider "architectural" issues. Let's get the basic organization squared away and move from there.

@peschkaj
Copy link
Contributor Author

peschkaj commented Apr 7, 2016

Thanks for the quick and thorough review! I'll get on this.

@peschkaj peschkaj force-pushed the jp254-telemetry branch 4 times, most recently from 26f302f to ed1b623 Compare April 10, 2016 14:45
@@ -1,3 +1,5 @@
#![feature(custom_derive, plugin)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't be needed without serde.

@brson
Copy link
Contributor

brson commented Apr 11, 2016

OK, I left a lot of feedback and requests for change, but it's all heading in the right direction. Looking good.

@bors
Copy link
Contributor

bors commented Apr 13, 2016

☔ The latest upstream changes (presumably 221bf75) made this pull request unmergeable. Please resolve the merge conflicts.

@peschkaj peschkaj force-pushed the jp254-telemetry branch 2 times, most recently from 77c062c to bac8baf Compare April 13, 2016 23:01
@peschkaj
Copy link
Contributor Author

On further review, I think I've addressed the main issues from the last code review, all tests pass, and the only outstanding item is the loss of formatting when we pipe stderr on Windows, and this is only when telemetry is on. This may be an unfortunate side effect of piping stderr and using the Windows control sequences. The problem should be eliminated on Windows 10 and newer (see my comment on command.rs).

This also might not be considered that big of an issue, I would assume that many of the people who are willing to generate telemetry are also gather telemetry on an unattended build farm, so they may not care that they're losing formatting.

@brson
Copy link
Contributor

brson commented Apr 15, 2016

@peschkaj Looks like I'm not going to get to this until tomorrow. Thanks for your patience.


try!(utils::write_file("temp", &work_file, ""));

try!(utils::rename_file("telemetry", &*work_file, &self.multirust_dir.join("telemetry_on")));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just personal preference: can you call this "telemetry-on" instead of "telemetry_on"?

@brson
Copy link
Contributor

brson commented Apr 16, 2016

OK, this looks great. I did make two comments about minor changes I'd like to do before landing.

Some thoughts about next steps:

We need to be concerned about the on-disk format and forward-compatibility. Every time we change rustup's metedata format in the future we're going to have to provide an upgrade path. Right now, while we're still iterating a lot, I've been pretty sloppy about it, but soon we're going to need to be really careful about not breaking the metadata on upgrade. This scheme we have now with the "telemetry-on" file and the "telemetry" folder will suite us for now, but will almost certainly change again in the future - this isn't a problem, just something to be aware of. The way data upgrades work presently is that the "version" file contains a number representing the format of the ~/.multirust directory - when we need to make incompatible changes we bump that number, and write some transition code that runs during the rustup self upgrade-data command.

I think we will want to put a similar "version" field inside of our LogMessage type so that later analysis can distinguish exactly what it's looking at (this structure is going to change a lot over time). You can start it at version 1 and just bump every time we make any change.

Telemetry should probably limit the amount of log files it will accumulate to avoid growing without bound. Perhaps a simple way to do this would be to just check whether there are more than e.g. 100 files in the telemetry directory and delete the oldest.

Besides those small concerns, I'd say there are three avenues for further work: analysis, upload, and defining more events. The latter is probably least important - best to get everything working end-to-end to prove it out.

So I'd suggest we start thinking about what it will take to deliver the telemetry logs to us and then for us to analyze them.

Just some ideas:

We can probably start without a server at all, and just ask friends to turn on the telemetry and email them to us. We do though need a way to package up the logs in a convenient form for sending, and this form we should design to be the same whether the logs are sent manually by email or automatically over HTTPS.

To that end I'd suggest that we add functionality to convert the telemetry logs to a tarball, which can be either emailed or uploaded. We might have a rustup telemetry package command that stuffs all the logs into a tarball with a unique name, puts it in the current directory, prints the name of the tarball, then deletes all the old telemetry data.

The tarball will need to have a unique name so I'd suggest a scheme like rustup-telemetry-YYYY-MM-DD-$randomhexdigits.tar.gz. The date helps with sorting and the random number just makes it unique.

With that we can start imagining how to create tooling to turn a big mess of these tarballs into useful information. I think it makes sense to put the analysis in the same crate as the rest of our telemetry code (and we'll probably end up creating a rustup-telemetry crate to hold it all eventually) so that we can share the code between rustup and the hypothetical analysis tools.

Again just spitballing, but we might add another CLI tool called maybe rustup-telemetry-report, in this same repo for doing the later analysis.

OK, that's all the ideas I've got right now. Let me know what you think.

@peschkaj
Copy link
Contributor Author

Sounds good. I'll get the fixes together by the end of Monday and respond in more detail tomorrow.

In short, this all sounds like a good plan.

Thanks for the review.

@brson
Copy link
Contributor

brson commented Apr 16, 2016

Maybe we can just do all the reporting in rustup itself. Make it useful both for individuals and the group. So perhaps it could draw its data from the users telemetry directory or a directory of tarballs.

Oh, maybe we make the log files and the tarballs have the same naming scheme, and the analysis just works off of a directory of log files or tarballs. That seems pretty clean.

@peschkaj
Copy link
Contributor Author

I've got basic implementation of the "must haves" done.

I've started working on the delete files, but when I try the usual test case shenanigans, the proxy'd multirust/rustup directory isn't created. Is there an incantation I should be using for this? An example is at https://gist.github.com/peschkaj/dc1211f3511320da7cfb8f0ef84bfb24

@peschkaj
Copy link
Contributor Author

Oh, should add that if we want to hold off on deleting files so we can ship the initial feature, I can pull out the code and then create multiple issues so we can track the new features separately. That does seem like a better fit to me, but I spent 8 hours in aircraft yesterday, so my decision making might be suspect.

- Can currently set the `telemetry` flag.
- Telemetry status is stored in `MULTIRUST_HOME/telemetry`
- Adds routing for telemetry for rustc calls and target add
- Provides basic command proxying for telemetry
- Create a Telemetry struct and impl
- Add a Telemetry to the Toolchain struct
- Clean up telemetry checks to look for a telemetry_on file
- Remove unused directives and imports
- Modifying telemetry_rustc to stread stederr to the user. This strips formatting for some reason, not sure why.
@peschkaj
Copy link
Contributor Author

Travis appears to be drunk - cp: cannot stat ‘target/i686-unknown-linux-gnu/release/rustup-init’: No such file or directory

@brson
Copy link
Contributor

brson commented Apr 19, 2016

I definitely do want to land this before moving on to more features. I think we can land this right now. Just taking another look over it.

@brson brson merged commit dd24b42 into rust-lang:master Apr 19, 2016
@brson
Copy link
Contributor

brson commented Apr 19, 2016

The travis failures are due to an upstream change, nbd.

@brson
Copy link
Contributor

brson commented Apr 19, 2016

I've started working on the delete files, but when I try the usual test case shenanigans, the proxy'd multirust/rustup directory isn't created. Is there an incantation I should be using for this? An example is at https://gist.github.com/peschkaj/dc1211f3511320da7cfb8f0ef84bfb24

Do you mean that the directory represented in tessts by config.rustupdir isn't created (i.e. what is usually ~/.multirust)? From reading your test case I don't see anything obviously wrong, though I would not expect you to need to write ensure_dir_exists since rustup should create that directory itself. Which line in the test is failing?

Please do file issues about further features so we can continue this discussion there (I'll probably forget to check on this closed PR again).

@brson
Copy link
Contributor

brson commented Apr 19, 2016

Thanks @peschkaj!

@peschkaj
Copy link
Contributor Author

👍 happy to help

On Mon, Apr 18, 2016 at 8:57 PM Brian Anderson notifications@github.com
wrote:

Thanks @peschkaj https://github.com/peschkaj!


You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub
#289 (comment)

Jeremiah Peschka

@peschkaj peschkaj deleted the jp254-telemetry branch April 19, 2016 01:08
@messense
Copy link

messense commented May 2, 2017

@brson

Now that serde reached 1.0 release and rustc-serialize is deprecated, should rustup switch to serde?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants