Failing to persist outcomes due to URL vs. DecodedURL confusion #619

Closed
twm opened this issue Apr 10, 2020 · 2 comments

twm commented Apr 10, 2020

An issue much like twisted/treq#282 is biting Yarrharr:

[yarrharr.fetch] Failed to persist 5 outcomes
  Traceback (most recent call last):
    File "/usr/local/lib/yarrharr/lib/python3.5/site-packages/twisted/internet/defer.py", line 654, in _runCallbacks
      current.result = callback(current.result, *args, **kw)
    File "/usr/local/lib/yarrharr/lib/python3.5/site-packages/twisted/internet/defer.py", line 1475, in gotResult
      _inlineCallbacks(r, g, status)
    File "/usr/local/lib/yarrharr/lib/python3.5/site-packages/twisted/internet/defer.py", line 1416, in _inlineCallbacks
      result = result.throwExceptionIntoGenerator(g)
    File "/usr/local/lib/yarrharr/lib/python3.5/site-packages/twisted/python/failure.py", line 512, in throwExceptionIntoGenerator
      return g.throw(self.type, self.value, self.tb)
  --- <exception caught here> ---
    File "/usr/local/lib/yarrharr/lib/python3.5/site-packages/yarrharr/fetch.py", line 371, in poll
      yield deferToThread(persist_outcomes, outcomes)
    File "/usr/local/lib/yarrharr/lib/python3.5/site-packages/twisted/python/threadpool.py", line 250, in inContext
      result = inContext.theWork()
    File "/usr/local/lib/yarrharr/lib/python3.5/site-packages/twisted/python/threadpool.py", line 266, in <lambda>
      inContext.theWork = lambda: context.call(ctx, func, *args, **kw)
    File "/usr/local/lib/yarrharr/lib/python3.5/site-packages/twisted/python/context.py", line 122, in callWithContext
      return self.currentContext().callWithContext(ctx, func, *args, **kw)
    File "/usr/local/lib/yarrharr/lib/python3.5/site-packages/twisted/python/context.py", line 85, in callWithContext
      return func(*args,**kw)
    File "/usr/local/lib/yarrharr/lib/python3.5/site-packages/yarrharr/fetch.py", line 630, in persist_outcomes
      outcome.persist(feed)
    File "/usr/local/lib/yarrharr/lib/python3.5/site-packages/yarrharr/fetch.py", line 161, in persist
      self._upsert_article(feed, upsert)
    File "/usr/local/lib/yarrharr/lib/python3.5/site-packages/yarrharr/fetch.py", line 240, in _upsert_article
      created.set_content(upsert.raw_title, upsert.raw_content)
    File "/usr/local/lib/yarrharr/lib/python3.5/site-packages/yarrharr/models.py", line 240, in set_content
      self.content = content = sanitize.sanitize_html(raw_content)
    File "/usr/local/lib/yarrharr/lib/python3.5/site-packages/yarrharr/sanitize.py", line 170, in sanitize_html
      return serializer.render(source)
    File "/usr/local/lib/yarrharr/lib/python3.5/site-packages/html5lib/serializer.py", line 398, in render
      return "".join(list(self.serialize(treewalker)))
    File "/usr/local/lib/yarrharr/lib/python3.5/site-packages/html5lib/serializer.py", line 265, in serialize
      for token in treewalker:
    File "/usr/local/lib/yarrharr/lib/python3.5/site-packages/html5lib/filters/optionaltags.py", line 19, in __iter__
      for previous, token, next in self.slider():
    File "/usr/local/lib/yarrharr/lib/python3.5/site-packages/html5lib/filters/optionaltags.py", line 10, in slider
      for token in self.source:
    File "/usr/local/lib/yarrharr/lib/python3.5/site-packages/html5lib/filters/sanitizer.py", line 765, in __iter__
      for token in base.Filter.__iter__(self):
    File "/usr/local/lib/yarrharr/lib/python3.5/site-packages/yarrharr/sanitize.py", line 456, in _wp_smileys
      for token in source:
    File "/usr/local/lib/yarrharr/lib/python3.5/site-packages/yarrharr/sanitize.py", line 434, in _video_attrs
      for token in source:
    File "/usr/local/lib/yarrharr/lib/python3.5/site-packages/yarrharr/sanitize.py", line 416, in _adjust_links
      for token in source:
    File "/usr/local/lib/yarrharr/lib/python3.5/site-packages/yarrharr/sanitize.py", line 384, in __iter__
      for token in BaseFilter.__iter__(self):
    File "/usr/local/lib/yarrharr/lib/python3.5/site-packages/yarrharr/sanitize.py", line 352, in __iter__
      ((None, 'href'), self._watch_url(url).to_text()),
    File "/usr/local/lib/yarrharr/lib/python3.5/site-packages/yarrharr/sanitize.py", line 282, in _watch_url
      query=(('v', video_id),),
    File "/usr/local/lib/yarrharr/lib/python3.5/site-packages/hyperlink/_url.py", line 838, in __init__
      for k, v in iter_pairs(query))
    File "/usr/local/lib/yarrharr/lib/python3.5/site-packages/hyperlink/_url.py", line 838, in <genexpr>
      for k, v in iter_pairs(query))
    File "/usr/local/lib/yarrharr/lib/python3.5/site-packages/hyperlink/_url.py", line 475, in _textcheck
      % (''.join(delims), name, value))
  builtins.ValueError: one or more reserved delimiters &# present in query parameter value: 'Nd7exbDzU1c&amp'
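The `ValueError` above comes down to a decoded value (`'Nd7exbDzU1c&amp'`, which contains a literal `&`) being handed to an API that expects already-encoded query text. Below is a minimal sketch of that encoded-versus-decoded distinction. It uses only the standard library's `urllib.parse` rather than hyperlink, and it is not the change made in the fixing commit. The `RESERVED` constant and the `has_reserved` helper are hypothetical names introduced for illustration; only the value itself is taken from the error message.

```python
from urllib.parse import parse_qsl, quote, urlencode

# Value copied from the ValueError message in the traceback.
video_id = "Nd7exbDzU1c&amp"

# Delimiters named in the error message ("&#"). Hypothetical constant,
# used here only to mimic the kind of check that rejected the value.
RESERVED = "&#"


def has_reserved(value):
    """Return True if the text contains a raw reserved delimiter."""
    return any(ch in value for ch in RESERVED)


# Treated as *encoded* text, the value is invalid: the raw "&" would be
# read as a separator between query parameters.
print(has_reserved(video_id))

# Treated as *decoded* text, the value is percent-encoded before it is
# placed in the URL, so the "&" becomes "%26".
encoded = quote(video_id, safe="")
print(encoded)

# urlencode applies the same escaping when building a whole query string,
# and parse_qsl recovers the original decoded value.
query = urlencode({"v": video_id})
print(query)
print(parse_qsl(query))
```

The round trip through `urlencode` and `parse_qsl` shows that nothing is lost when the value is consistently treated as decoded text. The failure only appears when decoded text crosses into an interface that expects encoded text.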
twm added the bug label Apr 10, 2020
twm changed the title from "Failing to persist outcomes due to treq#282" to "Failing to persist outcomes due to URL vs. DecodedURL confusion" Apr 10, 2020

twm commented Apr 10, 2020

See also python-hyper/hyperlink#125


twm commented Apr 10, 2020

Fixed in 5dfafff

twm closed this as completed Apr 10, 2020