Use charset-normalizer instead of chardet #744

Merged
merged 3 commits into master from 739-use-chardet-normalizer
Apr 20, 2022

Member

@pietermarsman pietermarsman commented Apr 4, 2022

Pull request

Fix #739.

Use charset-normalizer instead of chardet. It has a less restrictive
licence, is faster and supports more encodings.

How Has This Been Tested?

Ran the test suite and checked that other big projects are using it. It
should be a drop-in replacement.
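
As a rough illustration of why the swap can be small: charset-normalizer exposes a chardet-style `detect()` function that returns a dict with an `encoding` key. The sketch below is not the code from this pull request. The helper name `decode_guessing_encoding` and the UTF-8 fallback are assumptions made for the example, and the import guard exists only so the snippet runs where the package is not installed.

```python
from typing import Union

try:
    # charset-normalizer ships a chardet-style detect() for drop-in use.
    from charset_normalizer import detect
except ImportError:  # guard only so this sketch runs without the package
    detect = None


def decode_guessing_encoding(data: Union[bytes, str]) -> str:
    """Hypothetical helper: return text, guessing the encoding of bytes."""
    if isinstance(data, str):
        return data
    # detect() returns a dict; "encoding" may be None if nothing fits.
    guess = detect(data) if detect is not None else {"encoding": None}
    encoding = guess.get("encoding") or "utf-8"
    try:
        return data.decode(encoding)
    except (UnicodeDecodeError, LookupError):
        # Detection is only a guess; fall back to a lossy but safe decode.
        return data.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(decode_guessing_encoding(b"hello world"))
```

Code that previously called `chardet.detect(data)["encoding"]` would, under these assumptions, only need its import changed.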

Checklist

  • I have formatted my code with black.
  • I have added tests that prove my fix is effective or that my feature
    works
  • I have added docstrings to newly created methods and classes
  • I have optimized the code at least one time after creating the initial
    version
  • I have updated the README.md or verified that this
    is not necessary
  • I have updated the readthedocs documentation or
    verified that this is not necessary
  • I have added a concise human-readable description of the change to
    CHANGELOG.md

Contributor

@pettzilla1 pettzilla1 left a comment

As far as I can tell, charset-normalizer has been used correctly, and this change will allow the project to be used by many more projects.

@pietermarsman pietermarsman merged commit 1bf3c42 into master Apr 20, 2022
@pietermarsman pietermarsman deleted the 739-use-chardet-normalizer branch April 20, 2022 19:42
Beants added a commit to HiTalentAlgorithms/pdfminer.six that referenced this pull request Apr 26, 2022
* commit '1bf3c42b59125f4491d863e1c11dca7ebbe96adc':
  Use charset-normalizer instead of chardet (pdfminer#744)
  Refactor ImageWriter and add method for exporting an image from bytes. (pdfminer#737)
  Log warning and continue gracefully if errors in cmap (pdfminer#731)
  Fix log.debug statement in lzw.py by ensuring that self.table is always set (pdfminer#732)
  Raise KeyError when name in name2unicode is not of type str (pdfminer#733)
  Convert fontname to str if it is bytes in HTMLConverter (pdfminer#734)
  Fix github actions tag regex
  Fix github actions tag regex
  Bump version
  Add github action for releasing to pypi if git tag is added. (pdfminer#727)
@musicinmybrain
Contributor

As the maintainer of the python-pdfminer package in Fedora Linux and in EPEL, I have to work with the dependency versions that are in the distribution, so it’s nice to learn if there are known issues before I start loosening dependency specifications.

Is there a specific incompatibility that prompted pinning cryptography to 36.x, or is this just precautionary?

Thanks!

@pietermarsman
Member Author

Nope, there was not. The dependency cap was removed yesterday in #755 (comment).

Thanks for notifying us about this.

@musicinmybrain
Contributor

Thanks!

@stefan6419846

Jumping in on this: is there any need to actually pin cryptography to at least version 36? Judging from the commits here, there does not seem to be, and the pin appears to be too restrictive.

Successfully merging this pull request may close these issues.

GNU licence in dependencies
5 participants