[BUG] Runners occasionally fail with "RuntimeError: dictionary changed size during iteration" #65082

Closed
2 of 9 tasks
vkotarov opened this issue Aug 29, 2023 · 7 comments
Labels
Bug broken, incorrect, or confusing behavior

Comments

@vkotarov

Description
Recently we have been observing occasional runner failures that always raise the same exception, "RuntimeError: dictionary changed size during iteration". The traceback does not point at the runner code itself. There have been no recent changes to runner-related configuration, code, or environments. This seems to happen with any runner we have developed, but not on every run.

Setup
Internally developed runners that are executed from salt orchestrations to call external APIs.

  • on-prem machine
  • VM (Virtualbox, KVM, etc. please specify)
  • VM running on a cloud service, please be explicit and add details
  • container (Kubernetes, Docker, containerd, etc. please specify)
  • or a combination, please be explicit
  • jails if it is FreeBSD
  • classic packaging
  • onedir packaging
  • used bootstrap to install

Steps to Reproduce the behavior
Not easily reproducible as it is an intermittent issue.

Expected behavior
Runners are successfully executed without throwing exceptions

Screenshots

Exception occurred in runner myrunner.myfunction: Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/salt/client/mixins.py", line 349, in low
    for mod_name in self_functions.keys():
  File "/usr/lib64/python3.6/_collections_abc.py", line 720, in __iter__
    yield from self._mapping
  File "/usr/lib/python3.6/site-packages/salt/utils/lazy.py", line 117, in __iter__
    self._load_all()
  File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 1055, in _load_all
    self._load_module(name)
  File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 849, in _load_module
    if hasattr(mod_named_context, "default"):
  File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 357, in __getattr__
    if self._load_module(name) and mod_name in self.loaded_modules:
  File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 822, in _load_module
    self.__clean_sys_path()
  File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 641, in __clean_sys_path
    importlib.invalidate_caches()
  File "/usr/lib64/python3.6/importlib/__init__.py", line 71, in invalidate_caches
    finder.invalidate_caches()
  File "<frozen importlib._bootstrap_external>", line 1063, in invalidate_caches
RuntimeError: dictionary changed size during iteration
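
For readers unfamiliar with this error class: the traceback above ends inside `importlib.invalidate_caches()`, and the message is the one CPython raises when a dict gains or loses keys while it is being iterated. The sketch below is a generic illustration only, not Salt's or importlib's code. The names `visit_live`, `visit_snapshot`, and `add_entry` are hypothetical, and the idea that something else (for example, another thread importing a module) modifies a cache mid-iteration is an assumption that this thread does not confirm.

```python
def visit_live(cache, on_visit):
    """Iterate the live dict; fails if on_visit changes the dict's size."""
    visited = []
    for key in cache:
        on_visit(cache, key)
        visited.append(key)
    return visited


def visit_snapshot(cache, on_visit):
    """Iterate a snapshot of the keys; later size changes are harmless."""
    visited = []
    for key in list(cache):
        on_visit(cache, key)
        visited.append(key)
    return visited


def add_entry(cache, key):
    # Hypothetical stand-in for a concurrent writer adding a cache entry.
    cache[key + "/sub"] = None


cache = {"/a": None, "/b": None}

try:
    visit_live(dict(cache), add_entry)
    live_error = None
except RuntimeError as exc:
    live_error = str(exc)

snapshot_result = visit_snapshot(dict(cache), add_entry)

print(live_error)
print(snapshot_result)
```

Iterating over `list(cache)` rather than the dict itself is the usual way to tolerate such modifications, at the cost of not visiting entries added during the loop.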

Versions Report

salt --versions-report
Salt Version:
          Salt: 3004.2
 
Dependency Versions:
          cffi: 1.9.1
      cherrypy: unknown
      dateutil: 2.8.2
     docker-py: Not Installed
         gitdb: Not Installed
     gitpython: Not Installed
        Jinja2: 2.11.1
       libgit2: Not Installed
      M2Crypto: 0.35.2
          Mako: Not Installed
       msgpack: 0.6.2
  msgpack-pure: Not Installed
  mysql-python: Not Installed
     pycparser: 2.21
      pycrypto: 3.14.1
  pycryptodome: 3.14.1
        pygit2: Not Installed
        Python: 3.6.8 (default, Jun  9 2023, 11:59:08)
  python-gnupg: Not Installed
        PyYAML: 5.4.1
         PyZMQ: 21.0.0
         smmap: 5.0.0
       timelib: Not Installed
       Tornado: 4.5.3
           ZMQ: 4.3.3
 
Salt Extensions:
        SSEAPE: 8.8.0.7+1.g2c14d89
 
System Versions:
          dist: oracle 7.9 n/a
        locale: UTF-8
       machine: x86_64
       release: 5.4.17-2136.320.7.1.el7uek.x86_64
        system: Linux
       version: Oracle Linux Server 7.9 n/a


vkotarov added the Bug (broken, incorrect, or confusing behavior) and needs-triage labels on Aug 29, 2023
@anilsil

anilsil commented Aug 29, 2023

@vkotarov can you test the issue with the latest 3006?

anilsil added this to the Sulfur v3006.4 milestone on Aug 29, 2023
@vkotarov
Author

> @vkotarov can you test the issue with the latest 3006?

Unfortunately, no. We are stuck with classic packaging on the salt-masters due to SSC-related constraints.

@dmurphy18
Contributor

@vkotarov Can you try the latest classic package version, 3005.2, given that you are running 3004.2?

@vkotarov
Author

> @vkotarov Can you try the latest classic package version, 3005.2, given that you are running 3004.2?

In theory, yes, but this will take a lot of time because SSC also needs to be upgraded.

Also, I'm trying to understand whether this comment relates to something internal to the master that breaks it, or to the "new-style" minion response that leads to master overload.

@frebib
Contributor

frebib commented Aug 30, 2023

That comment is in reference to a patch I wrote to fix the issue you linked. Deltaproxy works fine in released/master versions of Salt to my knowledge. The master overload is a result of #61468 which is included in 3005.x. It was fixed the other day in #65053 which should be available in the next 3006.3 release and 3007 whenever that eventually drops.

@dmurphy18
Contributor

@vkotarov Can you consider closing this, since it is fixed in Salt 3006.x? Salt 3005.x reaches EOL next month.

vkotarov closed this as not planned (won't fix, can't repro, duplicate, stale) on Jan 9, 2024
@vkotarov
Author

vkotarov commented Jan 9, 2024

Closing, as I no longer get these after a master restart either.


5 participants