Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

resolve random CI failures #809

Merged
merged 9 commits into from
Feb 17, 2023
Merged

resolve random CI failures #809

merged 9 commits into from
Feb 17, 2023

Conversation

prjemian
Copy link
Contributor

@prjemian prjemian added the bug label Feb 16, 2023
@prjemian prjemian added this to the 1.6.14 milestone Feb 16, 2023
@prjemian prjemian self-assigned this Feb 16, 2023
@prjemian
Copy link
Contributor Author

Unit test workflow has a manual trigger to re-run. Trigger a re-run.

@prjemian
Copy link
Contributor Author

Py 3.11:

apstools/devices/tests/test_positioner_soft_done.py::test_move_and_stop_nonzero FAILED [ 21%]

=================================== FAILURES ===================================
__________________________ test_move_and_stop_nonzero __________________________

rbv = EpicsSignal(read_pv='gp:gp:float1', name='rbv', value=0.5, timestamp=[167](https://github.com/BCDA-APS/apstools/actions/runs/4198440690/jobs/7282085823#step:15:168)6584057.792367, tolerance=0.0001, auto_monitor=True, string=False, write_pv='gp:gp:float1', limits=False, put_complete=False)
pos = PVPositionerSoftDoneWithStop(prefix='gp:gp:', name='pos', settle_time=0.0, timeout=None, read_attrs=['readback', 'setpoint'], configuration_attrs=['done', 'tolerance', 'target'], limits=None, egu='')

    def test_move_and_stop_nonzero(rbv, pos):
        timed_pause()
    
        # move to non-zero
        longer_delay = 2
        target = round(rand(2, 5), 2)
        delayed_complete(pos, rbv, delay=longer_delay)
        # t0 = time.time()  # time it
        status = pos.move(target)  # readback set by delayed_complete()
        # dt = time.time() - t0
        assert status.done
        assert status.success
        # assert status.elapsed >= longer_delay
        timed_pause()
        # assert dt >= longer_delay
>       assert pos.inposition
E       AssertionError: assert False
E        +  where False = PVPositionerSoftDoneWithStop(prefix='gp:gp:', name='pos', settle_time=0.0, timeout=None, read_attrs=['readback', 'setpoint'], configuration_attrs=['done', 'tolerance', 'target'], limits=None, egu='').inposition

/home/runner/work/apstools/apstools/apstools/devices/tests/test_positioner_soft_done.py:266: AssertionError
=============================== warnings summary ===============================
../../../micromamba-root/envs/anaconda-test-env-py-3.11/lib/python3.11/site-packages/setuptools_scm/git.py:135
  /home/runner/micromamba-root/envs/anaconda-test-env-py-3.11/lib/python3.11/site-packages/setuptools_scm/git.py:135: UserWarning: "/home/runner/work/apstools/apstools" is shallow and may cause errors
    warnings.warn(f'"{wd.path}" is shallow and may cause errors')

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=========================== short test summary info ============================
FAILED apstools/devices/tests/test_positioner_soft_done.py::test_move_and_stop_nonzero - AssertionError: assert False
 +  where False = PVPositionerSoftDoneWithStop(prefix='gp:gp:', name='pos', settle_time=0.0, timeout=None, read_attrs=['readback', 'setpoint'], configuration_attrs=['done', 'tolerance', 'target'], limits=None, egu='').inposition
!!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
============= 1 failed, 134 passed, 1 warning in 110.85s (0:01:50) =============
Error: Process completed with exit code 1.

@prjemian
Copy link
Contributor Author

different test had a failure:

apstools/plans/tests/test_alignment.py::test_direct_implementation_with_rel_scan[signal1-mover1--1.2-1.2-11-cen] FAILED [ 41%]

=================================== FAILURES ===================================
___ test_direct_implementation_with_rel_scan[signal1-mover1--1.2-1.2-11-cen] ___

signal = SynPseudoVoigt(name='pvoigt', value=385, timestamp=1676585458.7826302)
mover = EpicsSignal(read_pv='gp:gp:float1', name='axis', value=-0.9345, timestamp=1676585458.971158, tolerance=0.0001, auto_monitor=True, string=False, write_pv='gp:gp:float1', limits=False, put_complete=False)
start = -1.2, finish = 1.2, npts = 11, feature = 'cen'

    @pytest.mark.parametrize(
        "signal, mover, start, finish, npts, feature",
        [
            # FIXME: [noisy, m1, -1.2, 1.2, 11, "max"],  # slower
            [pvoigt, axis, -1.2, 1.2, 11, "max"],  # faster
            [pvoigt, axis, -1.2, 1.2, 11, "cen"],
            [pvoigt, axis, -1.2, 1.2, 51, "com"],
        ],
    )
    def test_direct_implementation_with_rel_scan(signal, mover, start, finish, npts, feature):
        RE(bps.mv(mover, 0))
        assert get_position(mover) == 0.0
    
generator rel_scan ['e498790c'] (scan num: 2)



=============================== warnings summary ===============================
../../../micromamba-root/envs/anaconda-test-env-py-3.9/lib/python3.11/site-packages/setuptools_scm/git.py:135
  /home/runner/micromamba-root/envs/anaconda-test-env-py-3.9/lib/python3.11/site-packages/setuptools_scm/git.py:135: UserWarning: "/home/runner/work/apstools/apstools" is shallow and may cause errors
    warnings.warn(f'"{wd.path}" is shallow and may cause errors')

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=========================== short test summary info ============================
FAILED apstools/plans/tests/test_alignment.py::test_direct_implementation_with_rel_scan[signal1-mover1--1.2-1.2-11-cen] - AssertionError: lo=0.8084099249862092 position=-0.934 fwhm=0.[291](https://github.com/BCDA-APS/apstools/actions/runs/4198581472/jobs/7282422710#step:15:292)00000000000015
assert 0.8084099249862092 <= (-0.934 + 0.29100000000000015)
!!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
============= 1 failed, 261 passed, 1 warning in 234.80s (0:03:54) =============
Error: Process completed with exit code 1.

@prjemian
Copy link
Contributor Author

This lint workflow failure is not related to the issue:

2023-02-16 22:38:28 [ERROR]   Found errors in [mypy] linter!
2023-02-16 22:38:28 [ERROR]   Error code: 1. Command output:
------
WARNING: The directory '/github/home/.cache/pip' or its parent directory is not owned or is not writable by the current user. The cache has been disabled. Check the permissions and owner of that directory. If executing pip with sudo, you should use sudo's -H flag.
Collecting types-PyYAML
  Downloading types_PyYAML-6.0.12.6-py3-none-any.whl (14 kB)
Installing collected packages: types-PyYAML
Successfully installed types-PyYAML-6.0.12.6

Notice:  A new release of pip available: 22.3.1 -> 23.0
Notice:  To update, run: python -m pip install --upgrade pip
apstools/plans/command_list.py:419: error: Incompatible types in assignment (expression has type "Callable[[Any, Any, Any], Any]", variable has type "None")  [assignment]
Installing missing stub packages:
/venvs/mypy/bin/python -m pip install types-PyYAML


Found 1 error in 1 file (checked 1 source file)
------

@prjemian
Copy link
Contributor Author

The error from the lint CI is a puzzle since it does not point to th eline with the problem. Instead, it points to the last line in the file:

COMMAND_LIST_REGISTRY.command = execute_command_list

And the suggestion to upgrade pip or pip install types-PyYAML is all under the control of the github/super-linter@v4 actions workflow. Not sure how best to resolve.

Switch to github/super-linter/slim@v4 on advice.

@prjemian
Copy link
Contributor Author

The root cause appears to be a timing situation involving the EPICS PV monitor events. Some of the test code requires a new value from EPICS faster than the ophyd code has updated its in-memory cache of values as received by CA monitor events. In such cases, it was found that using SIGNAL.get(use_monitor=False) was effective to eliminate the random CI failures observed. The use_monitor=False kwarg forces an immediate CA get from the EPICS IOC. This produced the new value.

@prjemian
Copy link
Contributor Author

Trigger the workflow two more times to confirm no random failures.

@prjemian
Copy link
Contributor Author

Both failed.

Py3.9

apstools/devices/tests/test_positioner_soft_done.py::test_put_and_stop FAILED [ 21%]

=================================== FAILURES ===================================
______________________________ test_put_and_stop _______________________________

rbv = EpicsSignal(read_pv='gp:gp:float1', name='rbv', value=0.5, timestamp=1676590437.933418, tolerance=0.0001, auto_monitor=True, string=False, write_pv='gp:gp:float1', limits=False, put_complete=False)
prec = EpicsSignal(read_pv='gp:gp:float1.PREC', name='prec', value=4, timestamp=631[152](https://github.com/BCDA-APS/apstools/actions/runs/4199116508/jobs/7283603125#step:15:153)000.0, auto_monitor=True, string=False, write_pv='gp:gp:float1.PREC', limits=False, put_complete=False)
pos = PVPositionerSoftDoneWithStop(prefix='gp:gp:', name='pos', settle_time=0.0, timeout=None, read_attrs=['readback', 'setpoint'], configuration_attrs=['done', 'tolerance', 'target'], limits=None, egu='')

    def test_put_and_stop(rbv, prec, pos):
        assert pos.tolerance.get() == -1
        assert pos.precision == prec.get()
    
        def motion(rb_initial, target, rb_mid=None):
            rbv.put(rb_initial)  # make the readback to different
            c_sp = pos._sp_count
            pos.setpoint.put(target)
            assert pos._sp_count == c_sp + 1
            assert math.isclose(pos.readback.get(use_monitor=False), rb_initial, abs_tol=0.02)
            assert math.isclose(pos.setpoint.get(use_monitor=False), target, abs_tol=0.02)
            assert pos.done.get() != pos.done_value
            assert not pos.inposition
    
            if rb_mid is not None:
                # move the readback part-way, but move is not over yet
                rbv.put(rb_mid)
                assert math.isclose(pos.readback.get(use_monitor=False), rb_mid, abs_tol=0.02)
                assert not pos.inposition
    
                # force a stop now
                pos.stop()
                pos.cb_readback()
                assert pos.setpoint.get(use_monitor=False) == rb_mid
                assert pos.readback.get(use_monitor=False) == rb_mid
                assert pos.position == rb_mid
            else:  # interrupted move
                rbv.put(target)  # make the readback match
                rb_new = pos.readback.get(use_monitor=False)
                assert math.isclose(rb_new, target, abs_tol=0.02)
    
            assert pos.inposition
            assert pos.done_value is True
            pos.cb_readback()
            assert pos.done.get() is True
    
        motion(1, 0)  # ensure starting value at 0.0
        motion(0, 1)  # new setpoint
>       motion(1, 0, 0.5)  # interrupted move

/home/runner/work/apstools/apstools/apstools/devices/tests/test_positioner_soft_done.py:247: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

rb_initial = 1, target = 0, rb_mid = 0.5

    def motion(rb_initial, target, rb_mid=None):
        rbv.put(rb_initial)  # make the readback to different
        c_sp = pos._sp_count
        pos.setpoint.put(target)
        assert pos._sp_count == c_sp + 1
        assert math.isclose(pos.readback.get(use_monitor=False), rb_initial, abs_tol=0.02)
        assert math.isclose(pos.setpoint.get(use_monitor=False), target, abs_tol=0.02)
        assert pos.done.get() != pos.done_value
        assert not pos.inposition
    
        if rb_mid is not None:
            # move the readback part-way, but move is not over yet
            rbv.put(rb_mid)
            assert math.isclose(pos.readback.get(use_monitor=False), rb_mid, abs_tol=0.02)
            assert not pos.inposition
    
            # force a stop now
            pos.stop()
            pos.cb_readback()
>           assert pos.setpoint.get(use_monitor=False) == rb_mid
E           AssertionError: assert 1.0 == 0.5
E            +  where 1.0 = <bound method EpicsSignalBase.get of EpicsSignal(read_pv='gp:gp:float2', name='pos_setpoint', parent='pos', value=1.0, timestamp=[167](https://github.com/BCDA-APS/apstools/actions/runs/4199116508/jobs/7283603125#step:15:168)6590437.934744, tolerance=0.0001, auto_monitor=True, string=False, write_pv='gp:gp:float2', limits=False, put_complete=True)>(use_monitor=False)
E            +    where <bound method EpicsSignalBase.get of EpicsSignal(read_pv='gp:gp:float2', name='pos_setpoint', parent='pos', value=1.0, timestamp=1676590437.934744, tolerance=0.0001, auto_monitor=True, string=False, write_pv='gp:gp:float2', limits=False, put_complete=True)> = EpicsSignal(read_pv='gp:gp:float2', name='pos_setpoint', parent='pos', value=1.0, timestamp=1676590437.934744, tolerance=0.0001, auto_monitor=True, string=False, write_pv='gp:gp:float2', limits=False, put_complete=True).get
E            +      where EpicsSignal(read_pv='gp:gp:float2', name='pos_setpoint', parent='pos', value=1.0, timestamp=1676590437.934744, tolerance=0.0001, auto_monitor=True, string=False, write_pv='gp:gp:float2', limits=False, put_complete=True) = PVPositionerSoftDoneWithStop(prefix='gp:gp:', name='pos', settle_time=0.0, timeout=None, read_attrs=['readback', 'setpoint'], configuration_attrs=['done', 'tolerance', 'target'], limits=None, egu='').setpoint

/home/runner/work/apstools/apstools/apstools/devices/tests/test_positioner_soft_done.py:[232](https://github.com/BCDA-APS/apstools/actions/runs/4199116508/jobs/7283603125#step:15:233): AssertionError
=============================== warnings summary ===============================
../../../micromamba-root/envs/anaconda-test-env-py-3.9/lib/python3.11/site-packages/setuptools_scm/git.py:135
  /home/runner/micromamba-root/envs/anaconda-test-env-py-3.9/lib/python3.11/site-packages/setuptools_scm/git.py:135: UserWarning: "/home/runner/work/apstools/apstools" is shallow and may cause errors
    warnings.warn(f'"{wd.path}" is shallow and may cause errors')

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=========================== short test summary info ============================
FAILED apstools/devices/tests/test_positioner_soft_done.py::test_put_and_stop - AssertionError: assert 1.0 == 0.5
 +  where 1.0 = <bound method EpicsSignalBase.get of EpicsSignal(read_pv='gp:gp:float2', name='pos_setpoint', parent='pos', value=1.0, timestamp=1676590437.934744, tolerance=0.0001, auto_monitor=True, string=False, write_pv='gp:gp:float2', limits=False, put_complete=True)>(use_monitor=False)
 +    where <bound method EpicsSignalBase.get of EpicsSignal(read_pv='gp:gp:float2', name='pos_setpoint', parent='pos', value=1.0, timestamp=1676590437.934744, tolerance=0.0001, auto_monitor=True, string=False, write_pv='gp:gp:float2', limits=False, put_complete=True)> = EpicsSignal(read_pv='gp:gp:float2', name='pos_setpoint', parent='pos', value=1.0, timestamp=1676590437.934744, tolerance=0.0001, auto_monitor=True, string=False, write_pv='gp:gp:float2', limits=False, put_complete=True).get
 +      where EpicsSignal(read_pv='gp:gp:float2', name='pos_setpoint', parent='pos', value=1.0, timestamp=1676590437.934744, tolerance=0.0001, auto_monitor=True, string=False, write_pv='gp:gp:float2', limits=False, put_complete=True) = PVPositionerSoftDoneWithStop(prefix='gp:gp:', name='pos', settle_time=0.0, timeout=None, read_attrs=['readback', 'setpoint'], configuration_attrs=['done', 'tolerance', 'target'], limits=None, egu='').setpoint
!!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
============= 1 failed, 133 passed, 1 warning in 111.56s (0:01:51) =============
Error: Process completed with exit code 1.

Py3.10

apstools/devices/tests/test_positioner_soft_done.py::test_move_and_stopped_early FAILED [ 21%]

=================================== FAILURES ===================================
_________________________ test_move_and_stopped_early __________________________

rbv = EpicsSignal(read_pv='gp:gp:float1', name='rbv', value=2.27, timestamp=[167](https://github.com/BCDA-APS/apstools/actions/runs/4199113730/jobs/7283596237#step:15:168)6590367.240589, tolerance=0.0001, auto_monitor=True, string=False, write_pv='gp:gp:float1', limits=False, put_complete=False)
pos = PVPositionerSoftDoneWithStop(prefix='gp:gp:', name='pos', settle_time=0.0, timeout=None, read_attrs=['readback', 'setpoint'], configuration_attrs=['done', 'tolerance', 'target'], limits=None, egu='')

    def test_move_and_stopped_early(rbv, pos):
        def motion(target, delay, interrupt=False):
            timed_pause(0.1)  # allow previous activities to settle down
    
            t0 = time.time()
            delayed_complete(pos, rbv, delay=delay)
            status = pos.move(target, wait=False)  # readback set by delayed_complete()
    
            timed_pause(delay / 2 if interrupt else delay + 0.1)
            dt = time.time() - t0
            # fmt: off
            rb_new = pos.readback.get(use_monitor=False)
            arrived = math.isclose(rb_new, target, abs_tol=pos.actual_tolerance)
            # fmt: on
            if interrupt:
                assert not status.done
                assert not status.success
                assert not arrived, f"{dt=:.3f}"
                pos.stop()
            else:
                assert status.done
                assert status.success
                assert arrived
            confirm_in_position(pos, dt)
    
        target = round(rand(2, 5), 2)
        motion(target, rand(0.4, 0.3))
>       motion(target - 1, rand(0.4, 0.3), True)

/home/runner/work/apstools/apstools/apstools/devices/tests/test_positioner_soft_done.py:299: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

target = 1.27, delay = 0.544[177](https://github.com/BCDA-APS/apstools/actions/runs/4199113730/jobs/7283596237#step:15:178)327374102, interrupt = True

    def motion(target, delay, interrupt=False):
        timed_pause(0.1)  # allow previous activities to settle down
    
        t0 = time.time()
        delayed_complete(pos, rbv, delay=delay)
        status = pos.move(target, wait=False)  # readback set by delayed_complete()
    
        timed_pause(delay / 2 if interrupt else delay + 0.1)
        dt = time.time() - t0
        # fmt: off
        rb_new = pos.readback.get(use_monitor=False)
        arrived = math.isclose(rb_new, target, abs_tol=pos.actual_tolerance)
        # fmt: on
        if interrupt:
>           assert not status.done
E           assert not True
E            +  where True = MoveStatus(done=True, pos=pos, elapsed=0.0, success=True, settle_time=0.0).done

/home/runner/work/apstools/apstools/apstools/devices/tests/test_positioner_soft_done.py:287: AssertionError
=============================== warnings summary ===============================
../../../micromamba-root/envs/anaconda-test-env-py-3.10/lib/python3.11/site-packages/setuptools_scm/git.py:135
  /home/runner/micromamba-root/envs/anaconda-test-env-py-3.10/lib/python3.11/site-packages/setuptools_scm/git.py:135: UserWarning: "/home/runner/work/apstools/apstools" is shallow and may cause errors
    warnings.warn(f'"{wd.path}" is shallow and may cause errors')

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=========================== short test summary info ============================
FAILED apstools/devices/tests/test_positioner_soft_done.py::test_move_and_stopped_early - assert not True
 +  where True = MoveStatus(done=True, pos=pos, elapsed=0.0, success=True, settle_time=0.0).done
!!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
============= 1 failed, 135 passed, 1 warning in 101.31s (0:01:41) =============
Error: Process completed with exit code 1.

@prjemian
Copy link
Contributor Author

Decided to disable the super_linter workflow. Usefulness was offset by reports that did not make any sense.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Certain tests still fail randomly in CI
1 participant