Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

in-toto Github Actions: different builds for windows fail with the same errors #337

Closed
SolidifiedRay opened this issue Feb 22, 2021 · 12 comments

Comments

@SolidifiedRay
Copy link

Description of issue or feature request:
As issue #407 suggests, I set up the Github Actions Workflow for in-toto in this draft PR. However, each time I run the workflow on my fork, different builds for windows fail, and they all give the same errors:

======================================================================
ERROR: test_timeout_setting (tests.test_runlib.TestLinkCmdExecTimeoutSetting)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "d:\a\in-toto\in-toto\.tox\py\lib\site-packages\securesystemslib\process.py", line 220, in run_duplicate_streams
    raise subprocess.TimeoutExpired(cmd, timeout)
subprocess.TimeoutExpired: Command '['d:\\a\\in-toto\\in-toto\\.tox\\py\\scripts\\python.exe', '-c', 'while True: pass']' timed out after 0.1 seconds

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\a\in-toto\in-toto\tests\test_runlib.py", line 463, in test_timeout_setting
    in_toto.runlib.execute_link([sys.executable, '-c', 'while True: pass'], True)
  File "D:\a\in-toto\in-toto\in_toto\runlib.py", line 362, in execute_link
    timeout=float(in_toto.settings.LINK_CMD_EXEC_TIMEOUT))
  File "d:\a\in-toto\in-toto\.tox\py\lib\site-packages\securesystemslib\process.py", line 230, in run_duplicate_streams
    os.remove(stdout_name)
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\\Users\\RUNNER~1\\AppData\\Local\\Temp\\tmpzhlq0vde'

Here is the link for the full test results: https://github.com/SolidifiedRay/in-toto/runs/1948989796?check_suite_focus=true

Not sure if it is relevant, but another person also encounter a test failure in the same process module, and it is also related to the run_duplicate_streams function.

Current behavior:
Failing test: test_timeout_setting (tests.test_runlib.TestLinkCmdExecTimeoutSetting)

Expected behavior:
No failing test.

@jku
Copy link
Collaborator

jku commented Feb 23, 2021

Not sure if it is relevant, but another person also encounter a test failure in the same process module, and it is also related to the run_duplicate_streams function.

These don't seem related: in your case TimeoutExpired happens as expected but cleanup fails because something (subprocess or some other process) is still using the file. In my case the TimeoutExpired is expected but does not happen (I suspect our process froze for a moment and the subprocess exited before we started checking timeout).

@jku
Copy link
Collaborator

jku commented Feb 23, 2021

I have an idea though... the code uses wait() after killing: I think we may need communicate() instead in the exception case. Otherwise I think the fds might still be open after the process terminates on windows (see the windows implementation of _wait() and _communicate() https://github.com/python/cpython/blob/2d6f2eed14ff5d89155b52771cc8ef957e8145b4/Lib/subprocess.py#L1473)

@SolidifiedRay are you able to test this?

@SolidifiedRay
Copy link
Author

@SolidifiedRay are you able to test this?

Thanks for the idea! I will test it.

@SolidifiedRay
Copy link
Author

I have an idea though... the code uses wait() after killing: I think we may need communicate() instead in the exception case.

Unfortunately, this doesn't work, and the tests give the same errors. Here is the link for the results. I also run the tests on my Windows machine, and no errors show up (as expected). I believe this error is related to how the Github Actions workflow runs windows tests. I think I can run the Github Action locally and see if I can trace the error.

@jku
Copy link
Collaborator

jku commented Mar 4, 2021

Are you sure this tests the correct securesystemslib branch? The pinned requirement does not seem to mention a branch and your master does not have the communicate() change?

(or maybe i'm reading this incorrectly?)

@SolidifiedRay
Copy link
Author

SolidifiedRay commented Mar 4, 2021

Are you sure this tests the correct securesystemslib branch?

Yes, the tests are using my securesystemslib fork with the communicate() change: SolidifiedRay@d285d7d. Is this what you mean? Thanks! @jku (And I have the pinned requirement here. It is in my gh-action-windows branch)

20210304083538

@jku
Copy link
Collaborator

jku commented Mar 4, 2021

oh your securesystemslib forks default branch is gh-actions, that explains it.

Thanks for the test, sadly we can drop the communicate() theory.

@SolidifiedRay
Copy link
Author

SolidifiedRay commented Mar 7, 2021

I just realized (not sure if this is correct) that we don't use tox for the in-toto windows tests. Instead, we use python tests/runtest.py in the appveyor.yml file. If I use python tests/runtests.py to run windows tests in Github Actions, all the tests can pass. You may see the tests result here.
So maybe we should use tox for mac and linux and python tests/runtest.py for windows? What do you think? @lukpueh

@lukpueh
Copy link
Member

lukpueh commented Mar 8, 2021

Hm. That's interesting. It could indeed be related. Maybe with tox there is one subprocess level too much? Would still be nice to figure out the exact source of the problem. Maybe we do have a bug in our process module.

@lukpueh
Copy link
Member

lukpueh commented Mar 8, 2021

Just tried to find out why we didn't use tox on appveyor, but I couldn't dig up anything quickly. I suppose it was because tox "didn't work" on appveyor (see theupdateframework/python-tuf@f0cb767 from the PR, which added an appveyor config to TUF, which served as blueprint for the config in in-toto).

Regardless, taking a deeper look into tox' implementation should be worthwhile. I saw that they do some special configuration wrt to processes and signals on Windows.

@lukpueh
Copy link
Member

lukpueh commented Mar 19, 2021

It looks like it is not related to tox after all. I was able to reproduce the error by running this stripped down variant of securesystemslib.process.run_duplicate_streams on a Windows VM:

# I ran this using CPython 3.9.1

import os, sys, io
from tempfile import mkstemp
from subprocess import Popen
from time import sleep

stdout_fd, stdout_name = mkstemp()
with os.fdopen(stdout_fd, "w") as stdout_fp:
    proc = Popen(
            [sys.executable, "-c", "while True: pass"],
            stdout=stdout_fp)

    # NOTE: For some reason the issue never occurred
    # when I killed the process right away. 
    sleep(0.1)

    proc.kill()
    proc.wait()

os.remove(stdout_name) # raises  (most of the times, but not always) ...
# PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\\Users\\vagrant\\AppData\\Local\\Temp\\tmp8ouavoii'

systeminfo:

OS Name: Microsoft Windows Server 2019 Standard Evaluation
OS Version: 10.0.17763 N/A Build 17763

I am pretty sure that it is related to Popen.kill() being an alias for Popen.terminate() on Windows and Popen.wait() returning before the process has released the file. Unfortunately, the Popen.communicate() workaround does not work.

lukpueh added a commit to lukpueh/in-toto that referenced this issue Feb 9, 2023
* Instead of `securesystemslib.process.run`, use `subprocess.run`
  directly. The former is only a thin wrapper over the latter,
  which provides no advantage in Py3-only times.

* Copy `securesystemslib.process.run_duplicate_streams` to a helper
  in runlib. This function is really only needed internally by
  in-toto-run tooling, for which it was developped originally
  (in-toto#160). It does not have to be pulic API in securesystemslib.
  Moreover, the function is not trivial and there have been
  multiple related bugs, one of them still unresolved
  (see secure-systems-lab/securesystemslib#337).

This patch allows us to deprecate `securesystemslib.process`, and
fix or workaround related issue wrt the specific runlib use case.

Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
lukpueh added a commit to lukpueh/in-toto that referenced this issue Feb 9, 2023
* Instead of `securesystemslib.process.run`, use `subprocess.run`
  directly. The former is only a thin wrapper over the latter,
  which provides no advantage in Py3-only times.

* Copy `securesystemslib.process.run_duplicate_streams` to a helper
  in runlib. This function is really only needed internally by
  in-toto-run tooling, for which it was developped originally
  (in-toto#160). It does not have to be pulic API in securesystemslib.
  Moreover, the function is not trivial and there have been
  multiple related bugs, one of them still unresolved
  (see secure-systems-lab/securesystemslib#337).

This patch allows us to deprecate `securesystemslib.process`, and
fix or workaround related issue wrt the specific runlib use case.

**Note to git historians:**
run_duplicate_streams was originally removed from in-toto in
in-toto/in-toto@29f063a, and now copied back from
https://github.com/secure-systems-lab/securesystemslib/blob/v0.26.0/securesystemslib/process.py

Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
lukpueh added a commit to lukpueh/in-toto that referenced this issue Feb 10, 2023
* Instead of `securesystemslib.process.run`, use `subprocess.run`
  directly. The former is only a thin wrapper over the latter,
  which provides no advantage in Py3-only times.

* Copy `securesystemslib.process.run_duplicate_streams` to a helper
  in runlib. This function is really only needed internally by
  in-toto-run tooling, for which it was developped originally
  (in-toto#160). It does not have to be pulic API in securesystemslib.
  Moreover, the function is not trivial and there have been
  multiple related bugs, one of them still unresolved
  (see secure-systems-lab/securesystemslib#337).

This patch allows us to deprecate `securesystemslib.process`, and
fix or workaround related issue wrt the specific runlib use case.

**Note to git historians:**
run_duplicate_streams was originally removed from in-toto in
in-toto/in-toto@29f063a, and now copied back from
https://github.com/secure-systems-lab/securesystemslib/blob/v0.26.0/securesystemslib/process.py

Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
lukpueh added a commit to lukpueh/in-toto that referenced this issue Feb 10, 2023
* Instead of `securesystemslib.process.run`, use `subprocess.run`
  directly. The former is only a thin wrapper over the latter,
  which provides no advantage in Py3-only times.

* Copy `securesystemslib.process.run_duplicate_streams` to a helper
  in runlib. This function is really only needed internally by
  in-toto-run tooling, for which it was developped originally
  (in-toto#160). It does not have to be public API in securesystemslib.
  Moreover, the function is not trivial and there have been
  multiple related bugs, one of them still unresolved
  (see secure-systems-lab/securesystemslib#337).

This patch allows us to deprecate `securesystemslib.process`, and
fix or workaround related issue wrt the specific runlib use case.

**Note to git historians:**
run_duplicate_streams was originally removed from in-toto in
in-toto/in-toto@29f063a, and now copied back from
https://github.com/secure-systems-lab/securesystemslib/blob/v0.26.0/securesystemslib/process.py

Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
lukpueh added a commit to lukpueh/securesystemslib that referenced this issue Feb 10, 2023
This module was originally developed in in-toto and transferred to
securesystemslib in secure-systems-lab#174, primarily as Py2/Py3-agnostic wrapper
around stdlib's `subprocess.run`, to to execute `gpg`.
In secure-systems-lab#502 we switched to using `subprocess.run` directly.

Another wrapper around `subprocess.run`, provided by the module,
allows capturing standard streams but still write them to a
terminal. It was developed as specific `in-toto-run` feature and
does not need to be public API in sslib.  in-toto/in-toto#544 moves
the function back to in-toto.

closes secure-systems-lab#345,
transfers secure-systems-lab#337 (to in-toto)

Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
lukpueh added a commit to lukpueh/securesystemslib that referenced this issue Feb 10, 2023
This module was originally developed in in-toto and transferred to
securesystemslib in secure-systems-lab#174, primarily as Py2/Py3-agnostic wrapper
around stdlib's `subprocess.run`, to run the `gpg` command. In secure-systems-lab#502
we switched to using `subprocess.run` directly.

Another wrapper around `subprocess.run`, provided by the module,
allows capturing standard streams AND write them to a terminal. It
was developed as specific `in-toto-run` feature and does not need
to be public API in sslib. in-toto/in-toto#544 moves the function
back to in-toto.

closes secure-systems-lab#345,
transfers secure-systems-lab#337 (to in-toto)

Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
@lukpueh
Copy link
Member

lukpueh commented Feb 16, 2023

Transferred to in-toto/in-toto#547

@lukpueh lukpueh closed this as completed Feb 16, 2023
idunbarh pushed a commit to idunbarh/in-toto that referenced this issue Mar 22, 2023
* Instead of `securesystemslib.process.run`, use `subprocess.run`
  directly. The former is only a thin wrapper over the latter,
  which provides no advantage in Py3-only times.

* Copy `securesystemslib.process.run_duplicate_streams` to a helper
  in runlib. This function is really only needed internally by
  in-toto-run tooling, for which it was developped originally
  (in-toto#160). It does not have to be public API in securesystemslib.
  Moreover, the function is not trivial and there have been
  multiple related bugs, one of them still unresolved
  (see secure-systems-lab/securesystemslib#337).

This patch allows us to deprecate `securesystemslib.process`, and
fix or workaround related issue wrt the specific runlib use case.

**Note to git historians:**
run_duplicate_streams was originally removed from in-toto in
in-toto/in-toto@29f063a, and now copied back from
https://github.com/secure-systems-lab/securesystemslib/blob/v0.26.0/securesystemslib/process.py

Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants