-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
3.7 support #16
Merged
Merged
3.7 support #16
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Codecov Report
@@ Coverage Diff @@
## master #16 +/- ##
==========================================
- Coverage 92.47% 90.49% -1.99%
==========================================
Files 17 6 -11
Lines 851 852 +1
Branches 129 134 +5
==========================================
- Hits 787 771 -16
- Misses 31 39 +8
- Partials 33 42 +9
Continue to review full report at Codecov.
|
- Update Travis-CI, Appveyor, and tox config files. - Add tests that cover certain edge cases and Py37 situations. - Autoformat the tests.
Instead of having ~1 function per file, functions have been grouped into files with a higher level concept. In general there is now a 1:1 relationship between source and header files. Along with the refactor has been improved extraction of single numeric unicode characters that works for all python versions and is significantly more performant than the previous implementation. Also, some wonderful things have happened with the string parsing: - On Python >=3.6, if an underscore is detected in a string the underscores are preemptively removed. This way no processing of underscores is required in downstream parsing. - The issue that started #15 was actually solved! Essentially the Logic of _PyUnicode_TransformDecimalAndSpaceToASCII from Python 3.6 has been reimplemented (with fewer memory allocations to boot!).
If it looks like an octal with base 0 it should be illegal, UNLESS the number would be 0. That UNLESS was missed previously.
If a unicode object is stored internally like ASCII, it can be cast to a char* array directly instead of allocating new memory. This is a significant time saver. In cases where the internal unicode data is not already stored in ASCII form, the speed of the for loop to perform the conversion has been improved. This has been done by putting the simplest conditions to test at the top of the if statement, and also by applying the register modifier to the looping variables.
When allocating memory for char* arrays or converting to float, it is possible that some sort of memory error will occur. The intention is that these events will ALWAYS raise an exception. The top-level object conversion code now checks for errno==ENOMEM, and if so returns NULL no matter if exceptions are requested or not.
When using Python's built-in string->float converters, fastnumbers used to pre-validate that the input was OK in order to avoid the python built-in converter from setting an exception. The rational was that exception setting takes a long time. There were a couple of invalid assumptions with that logic. - Setting an exception does not take that long, and it's certainly not offset by the amount of time it takes to pre-validate. - The float built-in can be made to not raise an exception, which makes pre-validation a complete waste of time. This is still done for integers because there is no way to disable exception generation for PyLong_FromString, and generating an exception actually does take a long time for that function because of an unnecessary creation of a PyUnicode object just to build the exception.
The new algorithm tries to do as little work as possible, and uses memchr to find the decimal place instead of explicit searching.
1) All sign parsing happens in one place. 2) Gratuitous use of the register keyword. 3) Use of booleans to determine input validity instead of marking string locations.
Clang on OSX doesn't seem to emit many warnings...
SethMMorton
force-pushed
the
3.7-support
branch
8 times, most recently
from
August 1, 2018 05:47
21459c2
to
51ac4b6
Compare
There are two sources of error occurring. 1) For older versions of MSVC, the conversion in PyLong_FromDouble does not seem to be robust... the same value is not returned on subsequent calls. This is "fixed" simply by marking those functions as flaky and ignoring them if they fail on windows for Python 3.7 and 3.4. 2) For older versions of MSVC the smallest negative exponent that can be safely converted is smaller than it used to be. This may have been because of a separation of float parsing and sign handling. An attempt to unify them again has been made.
SethMMorton
force-pushed
the
3.7-support
branch
from
August 1, 2018 05:57
51ac4b6
to
d6444bb
Compare
- Remove extraneous includes and improve include order. - Move PyNumberType from objects.h to options.h. - Apply static keyword consistently. - typedef struct Options to Options. - Remove PyBool_from_bool.
All the failures are related to #17. |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This is intended to close issue #15. It ended up as a re-write of the internal structure to convert all input to a uniform data structure to make parsing simpler. The result was a net reduction of 10 files.