cli: add top-level retry loop. #21

Merged
merged 1 commit into from
May 20, 2020


zevweiss
Contributor

We've been hitting some intermittent crashes of the following
form recently:

Traceback (most recent call last):
  File "/usr/local/bin/packet-networking", line 11, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python3.5/dist-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.5/dist-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.5/dist-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/packetnetworking/cli.py", line 107, in cli
    tasks = builder.run(rootfs)
  File "/usr/local/lib/python3.5/dist-packages/packetnetworking/builder.py", line 67, in run
    return builder.run(rootfs_path)
  File "/usr/local/lib/python3.5/dist-packages/packetnetworking/distros/distro_builder.py", line 163, in run
    rendered_tasks = self.render()
  File "/usr/local/lib/python3.5/dist-packages/packetnetworking/distros/distro_builder.py", line 156, in render
    rendered_tasks[path] = template.render(self.context())
  File "/usr/local/lib/python3.5/dist-packages/jinja2/environment.py", line 1008, in render
    return self.environment.handle_exception(exc_info, True)
  File "/usr/local/lib/python3.5/dist-packages/jinja2/environment.py", line 780, in handle_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python3.5/dist-packages/jinja2/_compat.py", line 37, in reraise
    raise value.with_traceback(tb)
  File "<template>", line 13, in top-level template code
jinja2.exceptions.UndefinedError: 'None' has no attribute 'address'

Current speculation is that this is due to some sort of hegel race;
until the root cause is determined & fixed we're hoping this will keep
things running.

[As requested by @truongmd, CC also @dlaube, @dustinmiller1337. Note that I have basically no idea what I'm doing in this codebase, so please review accordingly; apologies if I've done something grossly wrong.]
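
For readers skimming the thread, the general shape of a top-level retry loop can be sketched as below. This is only an illustration of the technique under stated assumptions, not the code merged in this PR: `run_with_retries`, `max_attempts`, `delay`, and the injectable `sleep` parameter are all hypothetical names, and the real change lives in packetnetworking/cli.py. The sketch avoids f-strings because the traceback shows Python 3.5.

```python
import sys
import time


def run_with_retries(task, max_attempts=3, delay=1.0, sleep=time.sleep):
    """Call task() until it succeeds or max_attempts is used up.

    Hypothetical helper for illustration; none of these names come
    from packetnetworking/cli.py.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except Exception as exc:
            if attempt == max_attempts:
                # Out of attempts: let the original error propagate.
                raise
            print(
                "attempt {} of {} failed: {!r}; retrying".format(
                    attempt, max_attempts, exc
                ),
                file=sys.stderr,
            )
            # Pause before the next attempt so a transient race can settle.
            sleep(delay)
```

A caller would wrap the whole unit of work, for example `run_with_retries(lambda: builder.run(rootfs))`, so that a transient failure such as the `UndefinedError` above triggers a fresh attempt while a persistent failure is still surfaced after the last one.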

@zevweiss zevweiss requested a review from mikemrm May 20, 2020 22:25
@zevweiss zevweiss force-pushed the add-retry-loop branch 2 times, most recently from 7c0dede to 36f1067 Compare May 20, 2020 22:31
[Resolved review thread on packetnetworking/cli.py (outdated)]
@zevweiss
Contributor Author

I'm a bit confused as to why drone CI is failing; black seems to be complaining about a missing comma after the quiet parameter here, but from what I can see in the commit listed on that page (36f1067) the comma's there...anyone have any thoughts on what might be wrong there? It seems OK when I run it on that file locally:

[zev@packtop: packet-networking]% black --check --diff packetnetworking/cli.py 
All done! ✨ 🍰 ✨
1 file would be left unchanged.

@zevweiss
Contributor Author

Hmm, I guess drone was just confused about which revision it was really looking at or something; it now seems to be placated.

@zevweiss zevweiss requested a review from truongmd May 20, 2020 22:59
[Resolved review thread on packetnetworking/cli.py (outdated)]
@mikemrm
Contributor

mikemrm commented May 20, 2020

> Hmm, I guess drone was just confused about which revision it was really looking at or something; it now seems to be placated.

Yeah, this is a common issue with drone. Sometimes it seems to pull back stale data. Typically re-running the build will resolve it.

@zevweiss
Contributor Author

> Yeah, this is a common issue with drone. Sometimes it seems to pull back stale data. Typically re-running the build will resolve it.

Ah, okay -- is there an easy way to trigger another run? I could have sworn I saw a "retry" button in its UI at some point, but now I can't find it...

@zevweiss zevweiss requested a review from mikemrm May 20, 2020 23:14
[Two resolved review threads on packetnetworking/cli.py (outdated)]
@zevweiss
Contributor Author

Yeah, committed & pushed a little prematurely there...now updated with the fix.

@mikemrm
Contributor

mikemrm commented May 20, 2020

> > Yeah, this is a common issue with drone. Sometimes it seems to pull back stale data. Typically re-running the build will resolve it.
>
> Ah, okay -- is there an easy way to trigger another run? I could have sworn I saw a "retry" button in its UI at some point, but now I can't find it...

Yeah, due to the way this is configured using a drone matrix, you'd have to go into each of the Python versions and restart it by clicking the hamburger icon at the right and then clicking restart.

@zevweiss zevweiss requested a review from mikemrm May 20, 2020 23:24
Contributor

@mikemrm mikemrm left a comment


lgtm!

@zevweiss zevweiss merged commit 3da6ef9 into master May 20, 2020
@zevweiss zevweiss deleted the add-retry-loop branch May 20, 2020 23:35