Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle git blame #761

Merged
merged 4 commits into from
Nov 16, 2021
Merged

Handle git blame #761

merged 4 commits into from
Nov 16, 2021

Conversation

dandavison
Copy link
Owner

Ref #426

@dandavison dandavison changed the title Blame resurrection Handle git blame Nov 4, 2021
@dandavison
Copy link
Owner Author

cc @th1000s @Kr1ss-XD FYI in case you'd like to give thoughts. I may merge this to master, but please don't let that dissuade you from commenting if you feel like it.

This PR resurrects and extends the git blame work initially discussed in #426.

There's quite a bit of discussion in that issue: basically, the implementation here is intended to work for users with vanilla git configs i.e. who are not making use of git settings such as blame.coloring, color.blame.highlightRecent, blame.date, blame.showEmail. But a different approach might be required if delta is to enhance blame for users making use of those git settings.

I've made two changes to what I showed in #426

  1. I'm only showing blame info for the first line when there would be multiple identical successive lines.
  2. Added default palettes (3 colors) for delta's light and dark "modes" (--light --dark).
image image

@Kr1ss-XD
Copy link
Contributor

Kr1ss-XD commented Nov 4, 2021

I'm only showing blame info for the first line when there would be multiple identical successive lines.

This is a great idea. The output looks much more clearly arranged like that.

I'm looking forward to check this branch out for testing, most likely the upcoming weekend 😃

@zachriggle
Copy link

@dandavison I downloaded this branch and built it (per #426, after nuking ~/.cargo to get the build to work) and get good results with Rust files (i.e. delta's own source)...

Screen Shot 2021-11-08 at 8 34 56 PM

... but it does not appear to work on C\C++ source code (e.g. torvalds/linux file init/main.c).

Screen Shot 2021-11-08 at 8 34 16 PM

@Kr1ss-XD
Copy link
Contributor

Kr1ss-XD commented Nov 9, 2021

I'm seeing this too, however I don't necessarily think it has to do with the files being C source code vs. Rust. I rather suspect this to be caused by large sized repositories, more precise large repo histories.

For example, I have cloned this repository, which only contains a handful of .c and .h files, the history of [master] is about 100 commits long. git blame output is formatted by delta here, as expected. The same goes for a test repo I've set up with two commits to a test.c file :

blame-test

On the other hand, if I do git blame on any file in the John repo or the Linux sources (which both have an immense git history), the output looks as if delta would just pass it through. (git blame <file> seems to yield the same output as git -P blame <file> | less)

@dandavison
Copy link
Owner Author

Thanks @zachriggle @Kr1ss-XD: yes, as @Kr1ss-XD says the problem was due to history size: in my initial implementation I had hardcoded an assumption that the commit hash would be truncated to 8 characters, and then forgotten to go back and do something actually reasonable. It appears that in repos with more history, git automatically displays more characters of the hash to avoid the possibility of displaying two distinct commits with identical formatting. So, we assume 7-40 characters now.

-    [0-9a-f]{8}    # commit hash
+    [0-9a-f]{7,40} # commit hash

@dandavison
Copy link
Owner Author

One change I'm planning to make to this branch is the following:

Currently, due to recycling of colors from the palette, it's possible that two distinct commits appearing next to each other would be assigned the same color. I'm thinking of choosing a different color in that situation. In other words I would give more importance to "give neighboring commits different colors" than to "ensure that a given commit always receives the same color".

@zachriggle
Copy link

zachriggle commented Nov 9, 2021 via email

@dandavison
Copy link
Owner Author

How can one customize the palette of available colors?

That's the blame-palette option.

@Kr1ss-XD
Copy link
Contributor

Kr1ss-XD commented Nov 9, 2021

I would give more importance to "give neighboring commits different colors" than to "ensure that a given commit always receives the same color".

I like the idea, too. It would be even better if this could be toggled on or off by a config setting.

@Kr1ss-XD
Copy link
Contributor

Kr1ss-XD commented Nov 9, 2021

It appears that in repos with more history, git automatically displays more characters of the hash to avoid the possibility of displaying two distinct commits with identical formatting.

That's correct. It can also be configured by core.abbrev (which may be auto).

So, we assume 7-40 characters now.

The smallest allowed length is 4 though, according to the documentation. Maybe it would be better to use that as a minimum for the regex ?

@dandavison
Copy link
Owner Author

The smallest allowed length is 4 though, according to the documentation. Maybe it would be better to use that as a minimum for the regex ?

Thanks; done. (Btw for anyone building this branch you may have to put up with force-pushes to it, so git fetch origin blame-resurrection && git reset --hard origin/blame-resurrection )

@th1000s
Copy link
Collaborator

th1000s commented Nov 9, 2021

What is the current state and reasoning for/against syntax highlighting? I see the first pair of screenshot still have it, but later ones and my local build do not.

@dandavison
Copy link
Owner Author

What is the current state and reasoning for/against syntax highlighting? I see the first pair of screenshot still have it, but later ones and my local build do not.

Thank you, I had entirely forgotten about this. In 0b8f145 I added a default-language option. blame handling currently relies upon this to know what language to use, because the blame output itself does not contain the usual file extensions used for this purpose. On the plus side, this can be set on a per-repository basis, so if your repo is monolingual then we have a reasonable solution:

delta(blame-resurrection) head -n2 .git/config
[delta]
    default-language = rs

Anyone have any better suggestions here?

One thing that occurs to me is that it if someone wanted to create a rust tool to automatically predict language from source code, then that could make a nice rust crate. It really seems that it is not a problem pushing at the frontiers of machine learning :) Github has a tool that does this: https://github.com/github/linguist/blob/master/docs/how-linguist-works.md I believe that it wouldn't be hard to use the data used by their classifier (naive Bayes FWIW) and make a Rust version, i.e. skipping all the hard work of collecting the dataset. But, something of a tangent for a branch trying to add blame handling.

@Kr1ss-XD
Copy link
Contributor

Kr1ss-XD commented Nov 10, 2021

When the blame-format setting contains "constant" characters, for instance I have set it to "{author:<18} ({commit:>7}) ┊{timestamp:^16}┊ " (note the parenthesises around commit), delta prints them out on consecutive lines of the same commit.

blame-format

I think it would be nicer to suppress those characters, too. @dandavison ?

EDIT/ I just realized that it would probably look worse if delta did that, because then also the (or whatever users define as limiter lines) would be omitted in those lines ...
Maybe it's better to just not use something like () or [] in this setting.

@dandavison
Copy link
Owner Author

dandavison commented Nov 10, 2021

@Kr1ss-XD hm, yes as you say, we can either have those lines blank for variable content only (as currently) or we can have them completely blank. I went for the former because at the very least, users are going to want a separating character at the right-hand end to be present in a blank line.

Incidentally, hyperlinks makes a lot of sense with git blame. I see you're using tmux. Are you building tmux with the hyperlinks patch applied, and if so have you managed to apply it against a recent tmux version? I changed laptops and there seemed to be a problem with the older tmux commit I was using on my new laptop (apple M1).

(I propose languid as the name for the language-identifying crate... although perhaps if it's a straight port of linguist it should retain that name.)

@th1000s
Copy link
Collaborator

th1000s commented Nov 10, 2021

blame handling / which language to use

One quick solution would be to check the command line of the parent process, then git blame hello_world.rs with #770 will return rs.

@dandavison dandavison force-pushed the blame-resurrection branch 4 times, most recently from 776b746 to e423ecf Compare November 12, 2021 01:46
@dandavison
Copy link
Owner Author

One quick solution would be to check the command line of the parent process, then git blame hello_world.rs with #770 will return rs.

Fantastic. This PR is now using @th1000s's parent process inspection utility, so there should no longer be any need for default-language, if the file you are blameing has a standard file extension for the language.

@Kr1ss-XD
Copy link
Contributor

Are you building tmux with the hyperlinks patch applied, and if so have you managed to apply it against a recent tmux version? I changed laptops and there seemed to be a problem with the older tmux commit I was using on my new laptop (apple M1).

I have actually tried that a while ago, and I kind of got hyperlinks working in tmux, in combination with your open-in-editor script. Although there was something that really annoyed me, which I don't remember any more. In any case, I've switched back to the tmux binary package from the Archlinux repos.

@@ -15,6 +15,8 @@ name = "delta"
path = "src/main.rs"

[dependencies]
chrono = "0.4.19"
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a security bug in this library (flagged by cargo audit). But I think there's nothing we can do about our exposure other than wait for upstream changes because syntect uses the library, and we can't replace syntect.

trishume/syntect#385

if let Some(field) = field {
let field = pad(&field);
if is_repeat {
s.push_str(&" ".repeat(measure_text_width(&field)));
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: it should be possible to determine the required width once only and thereafter construct the blank line more efficiently.

Fixes #426

Partial versions of these changes were previously in master and then
reverted multiple times.

See #746
0745f85
3aab5d1
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants