Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable GitHub-Actions CI and deployment #1023

Merged
merged 12 commits into from
Jun 3, 2020
Merged

Conversation

rivy
Copy link
Contributor

@rivy rivy commented May 26, 2020

Fixes #978

@rivy rivy marked this pull request as draft May 26, 2020 03:51
@rivy rivy changed the title [WIP] Maint/CI ~ add GitHub-Actions CI (aka GHA) Maint/CI ~ add GitHub-Actions CI (aka GHA) May 28, 2020
@rivy rivy force-pushed the add.gha-cicd branch 2 times, most recently from 8aa4cd3 to c30d7d0 Compare May 28, 2020 17:45
@rivy
Copy link
Contributor Author

rivy commented May 28, 2020

@sharkdp , I found some time to work on this ...

The CICD file is essentially the same as the one used for pastel with an additional build target and the two windows-gnu targets disabled (for linker issues; some explanation and refs are in the commit which disables them). I also added (automatic) 'deb' packaging for two more of the platforms that coincide with your builds.

I don't have a build/pkg for the '..._i386.deb' package. I'm not sure how to build it, and I don't see a target platform on the rust platform list to build toward for that package.

I found two problems (and included solutions):

  1. All the text comparison integrations tests were failing on the 'windows' platforms. I thought it might be a line ending issue but couldn't tell from the test output. So, I instrumented the tests to give me more information. Line endings were a problem, so I added some .gitattribute settings to force LF line endings for the test fixture files in examples and snapshots.

As the test time seemed unchanged, I left the instrumentation in the tests (future value?).

  1. The pager process execution was failing for echo (which is a built-in command for Windows). To execute built-in commands, you have to go execute them via the shell. Also, the direct execution method in-use bypassed usual shell algorithm for finding the executable along the PATH (ie, it doesn't try the various extension in PATHEXT). So, the pager won't be found if it is missing an extension. This is how I worked around that in env for coreutils.
#[cfg(not(windows))]
fn build_command<'a, 'b>(prog_name: &'b str, args: &'a mut Vec<&'b str>) -> (std::borrow::Cow<'b, str>, &'a [&'b str]) {
    (prog_name, &args[..])
}

#[cfg(windows)]
fn build_command<'a, 'b>(prog_name: &'b str, args: &'a mut Vec<&'b str>) -> (std::borrow::Cow<'b, str>, &'a [&'b str]) {
    args.insert(0, prog_name);
    args.insert(0, "/d/c");
    let prog_name = std::env::var("ComSpec")
        .map(std::borrow::Cow::from)
        .unwrap_or_else(|_| std::borrow::Cow::from("cmd"));

    (prog_name, &args[..])
}

I created a simpler (maybe slightly hacky 🤷 ) solution for the pager process logic.

#[cfg(windows)]
let (pager_path, args) = {
    let p = std::env::var("ComSpec").unwrap_or_else(|_| "cmd".to_string());
    let mut a = args.to_vec();
    a.insert(0, pager_path.to_str().unwrap().to_string());
    a.insert(0, "/d/c".to_string());
    (p, a)
};

Of course, with this, and reverting to using echo, the EOL issue popped up again. So, I added end of line normalization (via predicates) for the tests expecting echo output. The normalization could be used on all tests, but the current hybrid '.gitattributes' and targeted normalization is working.

Ultimately, all tests are passing and auto-deployment is working.

I should be able to muck with this more later this weekend if you'd like any changes.

Cheers!

@rivy rivy marked this pull request as ready for review May 28, 2020 20:00
@sharkdp
Copy link
Owner

sharkdp commented May 28, 2020

@rivy Thank you so much for working on this!

I haven't looked into the details, but a short comment regarding the pager issue: Instead of always using an intermediate cmd shell on Windows, couldn't we call it with something like this?

bat --pager="cmd /c echo …"

.github/workflows/CICD.yml Outdated Show resolved Hide resolved
@rivy
Copy link
Contributor Author

rivy commented May 29, 2020

@rivy Thank you so much for working on this!

I haven't looked into the details, but a short comment regarding the pager issue: Instead of always using an intermediate cmd shell on Windows, couldn't we call it with something like this?

bat --pager="cmd /c echo …"

Yes, that's a solution, but I think it's not ideal. The included solution (or a future variation) let's users of bat use it in a similar way cross-platform and in a "normal" way for them on the windows platform.

Asking just windows users to specify the pager in an unusual, somewhat arcane, way seems counter-productive to wider use of the tool and creates more "user friction".

(If it's not obvious, I'm a cross-platform developer, using Windows primarily and linux of various stripes quite commonly. And I really like to have all of my tools on both platforms. And, as much as possible, I expect them to work in the same manner, minimizing platform-specific quirks.)

As an example that will bite windows users... less when installed on a windows machine is usually saved into a file named less.exe somewhere on the PATH, but the user will almost always just use it as just less, no extension used (eg, dir | less). But bat --pager=less won't work. It would have to be bat --pager=less.exe. It can be even more complicated with some commands which commonly use a preliminary BAT or CMD file as an alias or preprocessor which calls a later EXE executable (transparent to the user).

And, concretely, you'd have to special case several of your tests and the bat.conf file just for windows tests.

As written here, bat --pager=less means the same thing and works the same way on all linux/unix, macos, and windows platforms.

So, I've pushed a variation of this solution into env for coreutils and I'm a fan of it's more common use in other CLIs.

@sharkdp
Copy link
Owner

sharkdp commented May 29, 2020

But bat --pager=less won't work. It would have to be bat --pager=less.exe. It can be even more complicated with some commands which commonly use a preliminary BAT or CMD file as an alias or preprocessor which calls a later EXE executable (transparent to the user).

I absolutely get your points. In principle, I'm all for getting users the best experience. Three counter arguments:

  • The --pager=<cmd> option is not something that the average user will typically use from the shell. If a user really wants to use a different pager (which one?!), they would put a --config=my-pager.exe line in their config file. Typing .exe once is not too bad. The large majority of users will never change the pager and never use this option. Which leads us to the next point.

  • Spawning an intermediate shell will lead to a performance overhead on every single bat call. It may very well be negligible. But it's something I would like to see measured before we activate this. On Linux, spawning an intermediate bash shell takes 2.4 ms:

    Command Mean [ms] Min [ms] Max [ms] Relative
    bash -c 'echo test' 2.4 ± 0.7 1.6 6.2 1.00

    … which would only be a 3% overhead* (but see Improve bat startup speed #951):

    Command Mean [ms] Min [ms] Max [ms] Relative
    bat --paging=always --pager=less test.txt 57.9 ± 2.9 53.8 66.7 1.00
    bat --paging=always --pager="bash -c less" test.txt 59.9 ± 2.3 55.8 65.3 1.03 ± 0.07

    but how does the situation look like on Windows with cmd?

  • Last, I'm not sure what kind of weird side effects we might have by passing the unfiltered <cmd> in --pager=<cmd> to a shell. But that might just be an uneasy feeling from someone who has seen too many SQL injection bugs.

* Benchmarks via:

hyperfine \
  --warmup 5 \
  'bat --paging=always --pager=less test.txt' \
  'bat --paging=always --pager="bash -c less" test.txt' \
  --export-markdown test.md

@sharkdp sharkdp changed the title Maint/CI ~ add GitHub-Actions CI (aka GHA) Enable GitHub-Actions CI and deployment May 29, 2020
@sharkdp
Copy link
Owner

sharkdp commented May 29, 2020

All the text comparison integrations tests were failing on the 'windows' platforms. I thought it might be a line ending issue but couldn't tell from the test output. So, I instrumented the tests to give me more information. Line endings were a problem, so I added some .gitattribute settings to force LF line endings for the test fixture files in examples and snapshots.

These tests seem to be running just fine on AppVeyor: https://ci.appveyor.com/project/sharkdp/bat/builds/33205057 - any idea what is different on the GHA machines?

@rivy
Copy link
Contributor Author

rivy commented May 30, 2020

  • The --pager=<cmd> option is not something that the average user will typically use from the shell. If a user really wants to use a different pager (which one?!), they would put a --config=my-pager.exe line in their config file. Typing .exe once is not too bad. The large majority of users will never change the pager and never use this option. Which leads us to the next point.

True, but, reiterating the point, if I, as a user, use a paging program which operates via a BAT/CMD entry point, and it "just works" as say less. Setting the pager configuration in the configuration file to just less will surprisingly fail, without any explanation.

  • Spawning an intermediate shell will lead to a performance overhead on every single bat call. It may very well be negligible. But it's something I would like to see measured before we activate this.

Sure, I'd like to see an analysis of that as well.

I do think it's generally irrelevant here as a premature optimization, because when you're using a pager, you're planning to look at the results. And the console display + actual reading of the content should far outweigh the time spent starting an extra process. bat doesn't start a pager every time, even with redirected output, does it?

  • Last, I'm not sure what kind of weird side effects we might have by passing the unfiltered <cmd> in --pager=<cmd> to a shell. But that might just be an uneasy feeling from someone who has seen too many SQL injection bugs.

This, I agree with... but "injection bugs" really only occur with permission differentials. On windows, I don't see how the pager would ever have any different permissions than bat, so I can't see any way to do harm. And, by not using this kind of implementation, you're exchanging some possible CMD parsing weirdness (and, I agree, it's weird sometimes) for increased user confusion/surprise and more difficult cross-platform testing (you'll have to at least re-write all of the pager tests which are using echo, likely with platform-based conditionals).

  • Benchmarks via:
hyperfine \
  --warmup 5 \
  'bat --paging=always --pager=less test.txt' \
  'bat --paging=always --pager="bash -c less" test.txt' \
  --export-markdown test.md

I've not see a way to do this with the CMD shell, as of yet, although I'd like to be able to do timings such as this...

Just my cross-platform centric opinion... I'm happy to help investigate and implement a "best" strategy for you. Let me know what I can do to help.

... later ...

And now I've found Measure-Command on PowerShell, I'll try to play around with it to get some timing numbers for you. Initial (naive) numbers show no difference, both hover around 105ms +/- 5ms.

@rivy
Copy link
Contributor Author

rivy commented May 30, 2020

These tests seem to be running just fine on AppVeyor: https://ci.appveyor.com/project/sharkdp/bat/builds/33205057 - any idea what is different on the GHA machines?

Hmm, well, I'm sure that is a line ending difference. So, my suspicion is that AppVeyor and GHA have different baseline git configuration setups. Likely, AppVeyor is using core.autocrlf=false (my preferred configuration as well), and GHA is using nothing or, the equivalent, core.autocrlf=input, which would checkout text files with CRLF on a windows platform.

I generally use a .gitattributes file in the main directory when I'm concerned about this issue. Something like...

# default to LF EOLs for all (`git`-determined) text files
* text=auto eol=lf

# BAT/CMD shell script files *require* CRLF EOLs
*.[Bb][Aa][Tt] text eol=crlf
*.[Cc][Mm][Dd] text eol=crlf
## *maybe* needed ## *.[Pp][Ss]1 text eol=crlf

# for specific fixtures
*.eol-cr.* text eol=cr
*.eol-lf.* text eol=lf
*.eol-crlf.* text eol=crlf

@rivy
Copy link
Contributor Author

rivy commented May 30, 2020

Well, ok then, I didn’t realize you wrote hyperfine and that it’s cross-platform. 😃
I’ll try that ...

@rivy
Copy link
Contributor Author

rivy commented May 30, 2020

After testing a few command variations, it appears that there is about a 13.3 ms penalty (or about 12% overhead) for using the cmd shell to execute the pager.

C:\>env BAT_PAGER="more.com" hyperfine --warmup 5 "bat.exe --paging=always test.txt"
Benchmark #1: bat.exe --paging=always test.txt
  Time (mean ± σ):     108.6 ms ±   2.2 ms    [User: 4.4 ms, System: 6.8 ms]
  Range (min … max):   106.6 ms … 116.7 ms    23 runs
C:\>env BAT_PAGER="cmd.exe /c more.com" hyperfine --warmup 5 "bat.exe --paging=always test.txt"
Benchmark #1: bat.exe --paging=always test.txt
  Time (mean ± σ):     121.9 ms ±   2.0 ms    [User: 2.9 ms, System: 5.4 ms]
  Range (min … max):   118.6 ms … 126.6 ms    21 runs

I had to use the BAT_PAGER environment variable because, ironically, quoting issues block the use of hyperfine to compare the variations directly.

C:\> hyperfine --warmup 5 "bat.exe --paging=always --pager=more.com test.txt" "bat.exe --paging=always --pager=\"cmd.exe /c more.com\" test.txt"
Benchmark #1: bat.exe --paging=always --pager=more.com test.txt                                                                                 0
  Time (mean ± σ):     109.2 ms ±   2.4 ms    [User: 4.2 ms, System: 9.1 ms]                                                                    0
  Range (min … max):   106.5 ms … 116.0 ms    23 runs

Benchmark #2: bat.exe --paging=always --pager="cmd.exe /c more.com" test.txt
Error: Command terminated with non-zero exit code. Use the '-i'/'--ignore-failure' option if you want to ignore this. Alternatively, use the '--show-output' option to debug what went wrong.

@mati865
Copy link

mati865 commented May 31, 2020

Hmm, well, I'm sure that is a line ending difference. So, my suspicion is that AppVeyor and GHA have different baseline git configuration setups. Likely, AppVeyor is using core.autocrlf=false (my preferred configuration as well), and GHA is using nothing or, the equivalent, core.autocrlf=input, which would checkout text files with CRLF on a windows platform.

That's true AppVeyor has sane false default but GH Actions actually use dumbest possible choice here which is true.
input converts CRLF to LF when committing but doesn't convert LF to CRLF on clone (i.e. doesn't break stuff).

@sharkdp
Copy link
Owner

sharkdp commented Jun 1, 2020

bat doesn't start a pager every time, even with redirected output, does it?

It does not, no. Only if the output goes to a TTY.

I do think it's generally irrelevant here as a premature optimization, because when you're using a pager, you're planning to look at the results. And the console display + actual reading of the content should far outweigh the time spent starting an extra process.

After testing a few command variations, it appears that there is about a 13.3 ms penalty (or about 12% overhead) for using the cmd shell to execute the pager.

The actual reading time outweighs the 13 ms by far, sure. However, what I'm more concerned about is latency / startup speed. I really like programs that feel fast. This is why bat is written in Rust. The baseline startup time is already quite bad (which is why I opened #951). I really wouldn't like to add another 13 ms for a slight enhancement of a feature that does not seem to be used at all (changing the default pager). bat for Windows has been available for two years and I have not heard any reports/complaints about this.

I generally use a .gitattributes file in the main directory when I'm concerned about this issue.

I didn't really know that CR/LF issues could be handled via a .gitattributes file. Thanks! 👍

src/output.rs Outdated Show resolved Hide resolved
tests/integration_tests.rs Outdated Show resolved Hide resolved
@rivy
Copy link
Contributor Author

rivy commented Jun 1, 2020

The actual reading time outweighs the 13 ms by far, sure. However, what I'm more concerned about is latency / startup speed. I really like programs that feel fast. This is why bat is written in Rust. The baseline startup time is already quite bad (which is why I opened #951). I really wouldn't like to add another 13 ms for a slight enhancement of a feature that does not seem to be used at all (changing the default pager). bat for Windows has been available for two years and I have not heard any reports/complaints about this.

Alright, I can see I'm not going to prevail here, but, for the record, you officially have one windows user reporting this as a problem (me 👋 😄) and thinks a slightly slower pager startup would not be a problem. And it likely affects the default pager path for windows users as well.

I'll drop the code, but it will break testing on windows platforms.

I didn't really know that CR/LF issues could be handled via a .gitattributes file. Thanks! 👍

Alternatively, I just saw another .gitattributes variant to solve the same problem...

* -text diff

@rivy
Copy link
Contributor Author

rivy commented Jun 1, 2020

That's true AppVeyor has sane false default but GH Actions actually use dumbest possible choice here which is true.

Thanks for the clarification.
And, wow!, that's a poor choice on their part.
🤷‍♂️ ... Maybe, they just want to play the junior developer role, throwing a weird spanner into the works, so we have to fix it preemptively. 😄

rivy added 8 commits June 1, 2020 10:32
- using `echo` on 'windows' platforms requires process execution indirectly via the shell
- `printf` is available on all GHA CI platforms
  - `printf` is *not* available on usual 'windows' platforms; so this is just temporizing, awaiting a true fix
- fixes windows test failures
- avoids `git` platform-dependent conversion of line endings for checkout of test fixtures (*tests/examples/...*)
## [why]

For 'windows' platforms, directly spawning a process (eg, called PATHNAME) bypasses the
usual windows shell machinery for determining which process to execute. Specifically,
the extensions in PATHEXT will not be used to determine the final executable. So,
`PATHNAME.bat`, `PATHNAME.cmd`, ... will *not* be executed even if on they exist on the
PATH; and this is counter to the usual expectation of a Windows user. Additionally,
built-in commands, such as `echo` and `dir`, will never be accessible as they do not
have a PATH to execute and, so, will never be found.

To use the usual machinery, giving access to PATHNAME.bat and `echo`, execute the PATHNAME
using the windows shell, eg `cmd /d/c PATHNAME`. Note this may expose the constructed
command line to the windows shell quoting vagaries (sadly, that may be part of the price).

Following Windows standards, the ComSpec environment variable is used to determine which
shell to use, with a fallback to the "modern", built-in `cmd` shell.
@rivy rivy force-pushed the add.gha-cicd branch 2 times, most recently from 3997cde to 8b70bbd Compare June 1, 2020 21:59
@rivy
Copy link
Contributor Author

rivy commented Jun 1, 2020

Reversions added and rebased to master.
Cheers!

@rivy rivy force-pushed the add.gha-cicd branch 2 times, most recently from 93a43c0 to b77c4c0 Compare June 1, 2020 22:22
…ests) per PR feedback/owner request

- reverts commit f8ed8aa
@sharkdp
Copy link
Owner

sharkdp commented Jun 3, 2020

@rivy: Thank you very much!

I'm going to merge this as-is and leave it running in parallel to TravisCI / AppVeyor for now.

One thing that would be really valuable for bat would be a special CI run that executes:

cargo install --release
bash assets/create.sh
cargo test --release # '--release' because we could prevent another full recompile

The bash assets/create.sh script is something that I typically run prior to a new release. It uses the installed bat to build a binary cache of syntaxes and themes. The subsequent run of cargo test will then use these new binary caches.

Quite a lot of tests depend on the builtin set of syntaxes, so this special run would be very valuable in catching bugs/inconsistencies in the syntax additions early.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Github actions integration
3 participants