Version 2.0 - Candidate #45

Merged
merged 38 commits into from
Jul 2, 2021

Ousret (Member) commented Jun 24, 2021

This package is now two years old, so it is a good time for a refresh.

Main improvements:

  • 4x to 5x faster than v1.4
  • 2x faster than chardet v4.0

Up to 10x faster when the "preemptive" detector is active (it is active by default).

  • Emphasis has been placed on UTF-8 detection, as it is the most common encoding nowadays. It should be nearly instant.
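
To illustrate why UTF-8 can be handled almost instantly, here is a minimal sketch of a fast path that tries a strict UTF-8 decode before any heavier analysis. This is not the package's actual implementation; the function name `quick_utf8_check` and its return values are assumptions made for this example.

```python
from typing import Optional


def quick_utf8_check(payload: bytes) -> Optional[str]:
    """Hypothetical helper: return an encoding name if the payload is
    trivially identifiable as UTF-8, otherwise None."""
    # A UTF-8 byte order mark is an unambiguous signal.
    if payload.startswith(b"\xef\xbb\xbf"):
        return "utf_8_sig"
    try:
        # Strict decoding raises on the first invalid byte sequence.
        payload.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8: a real detector would fall back to slower analysis.
        return None
    return "utf_8"
```

Because a strict decode is a single linear pass that stops at the first invalid sequence, a check of this kind costs little compared with scoring many candidate encodings.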

🎨

  • Code refactoring to improve readability and maintainability

  • Dropping cached_property for Python 3.5
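
One plausible reading of the last item is that a cached-property decorator is replaced by a plain property with a manual cache, which works on Python 3.5 (`functools.cached_property` only exists from Python 3.8). The sketch below shows that pattern under this assumption; the `Match` class and its attributes are hypothetical and not taken from the package.

```python
class Match:
    """Hypothetical class used only to illustrate the caching pattern."""

    def __init__(self, payload: bytes) -> None:
        self._payload = payload
        self._text = None  # cache slot, filled on first access

    @property
    def text(self) -> str:
        # Decode once, then reuse the stored result on later accesses.
        if self._text is None:
            self._text = self._payload.decode("utf-8")
        return self._text
```

The trade-off is a little boilerplate per property in exchange for having no dependency on a decorator that older interpreters lack.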

After much consideration, this release will not drop Python 3.5 yet.
This PR ships with stricter CI/CD checks. Apologies in advance: they will be strict in order to ensure backward compatibility.

…ormance gain | Maintainability Improvement | Base commit

This package is reaching its two years of existence, now is a good time for a nice refresh
Ousret added the documentation and enhancement labels Jun 24, 2021
Ousret changed the title from "❇️ 🎉 Version 2.0" to "Version 2.0 - Candidate" Jun 29, 2021

Ousret (Member, Author) commented Jun 30, 2021

Performance tests were run on a GitHub Actions (GHA) runner.
The original claims are now confirmed. On medium and large payloads, the performance gap is huge.

On normal-sized files

------------------------------
--> Chardet Conclusions
   --> Avg: 0.08440268115942029s
   --> 99th: 0.61577s
   --> 95th: 0.33059s
   --> 50th: 0.03201s
------------------------------
--> Charset-Normalizer Conclusions
   --> Avg: 0.04982222222222222s
   --> 99th: 0.48197s
   --> 95th: 0.2635s
   --> 50th: 0.01967s
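
The benchmark script itself is not shown in this conversation. As a rough sketch of how a summary like the one above (average plus 50th, 95th, and 99th percentiles) could be computed from a list of per-file timings, assuming a nearest-rank percentile definition; the function names are hypothetical:

```python
import math
from statistics import mean


def percentile(samples, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(samples):
    """Return the same four figures the log prints, keyed by label."""
    return {
        "Avg": mean(samples),
        "99th": percentile(samples, 99),
        "95th": percentile(samples, 95),
        "50th": percentile(samples, 50),
    }
```

The actual script may use a different percentile convention (for example interpolation), so small differences from these figures would be expected.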

On medium-sized files

The content of each original file is multiplied by 16.

------------------------------
--> Chardet Conclusions
   --> Avg: 1.1437113768115943s
   --> 99th: 9.66639s
   --> 95th: 4.76477s
   --> 50th: 0.30817s
------------------------------
--> Charset-Normalizer Conclusions
   --> Avg: 0.06866171497584542s
   --> 99th: 0.64688s
   --> 95th: 0.41864s
   --> 50th: 0.02136s

On large-sized files

The content of each original file is multiplied by 32.

------------------------------
--> Chardet Conclusions
   --> Avg: 2.3235273188405796s
   --> 99th: 20.01236s
   --> 95th: 10.22639s
   --> 50th: 0.62519s
------------------------------
--> Charset-Normalizer Conclusions
   --> Avg: 0.07789222222222222s
   --> 99th: 0.67865s
   --> 95th: 0.45936s
   --> 50th: 0.02218s
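
To put the three runs side by side, the average timings reported above can be turned into speedup ratios. The numbers are copied from the logs in this comment; the dictionary layout and the `speedup` helper are only an illustration.

```python
# Average seconds per file, copied from the logs: (chardet, charset-normalizer).
reported_avg = {
    "normal": (0.08440268115942029, 0.04982222222222222),
    "medium (x16)": (1.1437113768115943, 0.06866171497584542),
    "large (x32)": (2.3235273188405796, 0.07789222222222222),
}


def speedup(chardet_s, normalizer_s):
    """How many times faster charset-normalizer is, on average."""
    return chardet_s / normalizer_s


for label, (chardet_s, normalizer_s) in reported_avg.items():
    print("{}: {:.1f}x".format(label, speedup(chardet_s, normalizer_s)))
```

On these averages the ratio is roughly 1.7x for normal-sized files, 16.7x for medium-sized files, and 29.8x for large files: chardet's average time grows roughly in proportion to payload size, while charset-normalizer's stays nearly flat.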

Ousret merged commit 642e717 into master Jul 2, 2021
Ousret deleted the v2-proposal branch Jul 2, 2021 19:12