Switch to 2byte unicode implementation like cpython does on linux #979

Open
wants to merge 3,342 commits into base: master


undingen
Contributor

                                   upstream/master:   origin/2b_unicode:
           django_template3.py             2.2s (4)             2.1s (4)  -2.0%
                 pyxl_bench.py             1.8s (4)             1.8s (4)  -0.3%
     sqlalchemy_imperative2.py             2.3s (4)             2.3s (4)  -0.1%
                pyxl_bench2.py             1.3s (4)             1.3s (4)  +2.0%
       django_template3_10x.py            14.2s (4)            13.8s (4)  -3.4%
             pyxl_bench_10x.py            14.3s (4)            13.8s (4)  -3.3%
 sqlalchemy_imperative2_10x.py            17.5s (4)            17.5s (4)  +0.2%
            pyxl_bench2_10x.py            12.2s (4)            12.1s (4)  -0.2%
                       geomean                 5.2s                 5.1s  -0.9%
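
As a rough cross-check of the geomean row, the per-benchmark times can be recombined by hand. This is only a sketch: it uses the rounded times printed in the table, so it should reproduce the geomean column to about one decimal place but not the exact percentage delta, which was presumably computed from unrounded timings.

```python
import math

# Per-benchmark times in seconds, copied from the table above (already rounded).
upstream = [2.2, 1.8, 2.3, 1.3, 14.2, 14.3, 17.5, 12.2]
branch = [2.1, 1.8, 2.3, 1.3, 13.8, 13.8, 17.5, 12.1]


def geomean(times):
    """Geometric mean: exp of the arithmetic mean of the logs."""
    return math.exp(sum(math.log(t) for t in times) / len(times))


print(f"upstream/master: {geomean(upstream):.2f}s")
print(f"2b_unicode:      {geomean(branch):.2f}s")
```

Both values land close to the 5.2s and 5.1s reported in the table.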

Daetalus and others added 30 commits August 29, 2015 12:42
Move closer towards exposing the public gc interface only in one file.
…gs in a tuple.

Before we created a tuple just to pass 1 or 2 args and then immediately extracted them
and destroyed the tuple again.
In order to allow rewrites.
I'm wondering if we can't enable this for more classes...
- use callattr directly instead of getattr+runtimeCall
- compvar: teach it that None is never nonzero
For marking collectors, the redundant visits no-op to avoid the
performance hit.
when calling a BoxedWrapperDescriptor don't create a BoxedWrapperObject
Notion of redundant visits to slowly move towards scanning everything
(motivated by namedtuple)

This involves two main changes:
- changing the calling convention to pass `globals` as an argument if needed
  (this only applies going into compiled code, it's already passed into the interpreter)
- changing the llvm irgenerator to use the new globals object
It does use the old parser, but it also forces the use
of the llvm tier for everything which usually ends up being
the more important part of the configuration.
With no modifications in this commit, so that we can track changes.
Unmodified from llvm rev r230300.
- add note + license
- put in our namespace
- change header guards
- make it match our lint style
- don't apply our formatting
The default is 64, which is quite a lot for our uses.
We lower it by switching to our DenseMap/Set fork, which allows parameterizing
on this.

This is the same number that CPython uses.  We were previously
using llvm's default of 64, which is pretty high -- probably fine
for their use cases, but in Python programs with lots of sets/dicts
we spend a lot of time doing malloc() for all those 1kB+ allocations.
This also increases the time it takes to iterate over the dict/set,
since there are more empty buckets that have to be read + skipped.
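
The "1kB+" figure in the commit message above can be sanity-checked with simple arithmetic. This is a back-of-the-envelope sketch, not a measurement: it assumes a 64-bit build and that a dict bucket stores one key pointer plus one value pointer, neither of which is stated in the commit message.

```python
POINTER_SIZE = 8  # assumption: 64-bit build, 8-byte pointers


def initial_table_bytes(num_buckets, pointers_per_bucket):
    """Bytes needed for a freshly created, still empty hash table."""
    return num_buckets * pointers_per_bucket * POINTER_SIZE


# Assumption: a dict bucket holds a key pointer and a value pointer.
old_dict_bytes = initial_table_bytes(64, 2)  # llvm's default of 64 buckets
new_dict_bytes = initial_table_bytes(8, 2)   # the new initial size of 8

print(old_dict_bytes, new_dict_bytes)
```

Under these assumptions an empty dict drops from 1024 bytes to 128 bytes of bucket storage, and iteration has an eighth as many empty buckets to skip.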
Support non-module-globals in the llvm tier
Lower initial dict/set size from 64->8
One of the problems was that, depending on whether all files were already parsed, we JITed more or fewer functions;
this changed the total number of JITed functions, which changed all the module names.
As a result, in order to have all entries inside the cache we needed at least 3 runs if the pyc files were old.

The whole naming scheme should at some point get improved, but for now it should be enough.
add tp_nextiter implementation for our iterators
Generate better module names to make the object cache more effective
kmod and others added 24 commits October 15, 2015 15:10
Remove unneeded pyston workarounds + fix misc cpython test errors
object cache: hash IR before running any opt passes
For types which don't have Py_TPFLAGS_HAVE_VERSION_TAG set we keep the old behaviour
typeLookup: guard on tp_version_tag instead of mro
This converts a lot of -(int) into (-int)
Clean up the __str__, __repr__ support and let them handle unicode strings
We've been allocating slice objects for slicing operations, but
CPython's internal slice methods already take separate start+stop
arguments.  So if we see a slice, instead of creating the slice
and sending it off to getitem, try calling PySequence_GetSlice.

This will be slower for classes with user-defined __getitem__
functions that can handle slices; we can fix that by adding rewriting
to this new endpoint, but it seems to not matter too much right now.
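
To illustrate the allocation the commit message above is talking about: at the Python level, `obj[1:4]` hands a freshly built slice object to `__getitem__`, whereas CPython's C-level `PySequence_GetSlice` takes the start and stop as plain integers. The `Recorder` class below is a hypothetical helper for illustration only; it shows the slice object arriving, not the runtime's fast path.

```python
class Recorder:
    """Hypothetical user-defined sequence that logs what __getitem__ receives."""

    def __init__(self, data):
        self.data = list(data)
        self.keys = []

    def __getitem__(self, key):
        # For r[1:4] the key is a slice object, not two integers.
        self.keys.append(key)
        return self.data[key]


r = Recorder("abcdef")
result = r[1:4]

print(result)
print(r.keys)
```

Classes like this one, with a user-defined `__getitem__` that accepts slices, are the case the commit message says will be slower until rewriting is added for the new endpoint.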
Some stuff to enable "test_set"
Avoid creating most slice objects
LLVM tier: adjust num of IC slots depending on the num of times the IC got rewritten in the bjit tier
@kmod
Collaborator

kmod commented Oct 23, 2015

Unless you feel pretty confident about this, we might want to hold off on it since I have no idea what kinds of subtle effects this might have :/

@undingen
Contributor Author

While I liked the django perf improvement I'm fine with not merging this change :)
If this shows up in the future as a large perf change we can reevaluate whether we want to risk it.

@kmod
Collaborator

kmod commented Feb 2, 2016

It looks like we have an assert in the metaserver codebase that the Python build is UCS-4 :/
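
For context, a build-width check of this kind is usually written against `sys.maxunicode`. The sketch below is not the metaserver's actual assert, which is not shown in this thread; `is_ucs4_build` and `utf16_code_units` are hypothetical names used only for illustration.

```python
import sys


def is_ucs4_build():
    """True when a single code unit can hold any Unicode code point.

    CPython 2 narrow (UCS-2) builds report sys.maxunicode == 0xFFFF;
    wide (UCS-4) builds, and every CPython >= 3.3, report 0x10FFFF.
    """
    return sys.maxunicode == 0x10FFFF


def utf16_code_units(code_point):
    """Number of 16-bit units one code point needs in a 2-byte representation."""
    return 1 if code_point <= 0xFFFF else 2


print(is_ucs4_build())
print(utf16_code_units(0x41), utf16_code_units(0x1F600))
```

On a 2-byte build, code points above U+FFFF occupy two code units (a surrogate pair), so `len()` and indexing on such strings differ from a UCS-4 build. That is the sort of subtle effect code asserting on the build width is guarding against.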

@kmod kmod added the idea label Feb 13, 2016
@kmod kmod force-pushed the master branch 2 times, most recently from 352fd89 to 6488a3e Compare October 28, 2020 21:01
6 participants