Switch to 2byte unicode implementation like cpython does on linux #979

undingen · 2015-10-22T10:47:03Z

                                   upstream/master:   origin/2b_unicode:
           django_template3.py             2.2s (4)             2.1s (4)  -2.0%
                 pyxl_bench.py             1.8s (4)             1.8s (4)  -0.3%
     sqlalchemy_imperative2.py             2.3s (4)             2.3s (4)  -0.1%
                pyxl_bench2.py             1.3s (4)             1.3s (4)  +2.0%
       django_template3_10x.py            14.2s (4)            13.8s (4)  -3.4%
             pyxl_bench_10x.py            14.3s (4)            13.8s (4)  -3.3%
 sqlalchemy_imperative2_10x.py            17.5s (4)            17.5s (4)  +0.2%
            pyxl_bench2_10x.py            12.2s (4)            12.1s (4)  -0.2%
                       geomean                 5.2s                 5.1s  -0.9%

Move closer towards exposing the public gc interface only in one file.

…gs in a tuple. Before we created a tuple just to pass 1 or 2 args and then immediately extracted them and destroyed the tuple again.

In order to allow rewrites. I'm wondering if we can't enable this for more classes...

BoxedWrapper optimizations

- use callattr directly instead of getattr+runtimeCall
- compvar: teach it that None is never nonzero

For marking collectors, the redundant visits are no-ops to avoid the performance hit.

when calling a BoxedWrapperDescriptor don't create a BoxedWrapperObject

Complex improvement

Notion of redundant visits to slowly move towards scanning everything

(motivated by namedtuple) This involves two main changes:
- changing the calling convention to pass `globals` as an argument if needed (this only applies going into compiled code, it's already passed into the interpreter)
- changing the llvm irgenerator to use the new globals object
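As a rough illustration of what "non-module globals" means at the Python level, the sketch below defines a function through `exec` with a custom namespace dict, which is similar in spirit to how `namedtuple` builds its classes. The names `namespace`, `GREETING`, and `greet` are made up for this example and do not come from the Pyston sources.

```python
# A function created via exec() in a custom dict uses that dict as its
# globals, not the globals of the module that called exec().
namespace = {"__name__": "custom_ns", "GREETING": "hi"}
exec("def greet():\n    return GREETING", namespace)

greet = namespace["greet"]

# The function resolves GREETING through its own globals dict.
first = greet()

# Mutating the custom namespace is visible to the function afterwards.
namespace["GREETING"] = "hello"
second = greet()
```

A compiled tier that assumes every function's globals are its module's dict would resolve `GREETING` in the wrong place here, which is presumably why the globals object has to be passed in explicitly.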

It does use the old parser, but it also forces the use of the llvm tier for everything which usually ends up being the more important part of the configuration.

With no modifications in this commit, so that we can track changes. Unmodified from llvm rev r230300.

- add note + license
- put in our namespace
- change header guards
- make it match our lint style
- don't apply our formatting

The default is 64, which is quite a lot for our uses.

Done by switching to our DenseMap/Set fork, which allows parameterizing on the initial size. The new value, 8, is the same number that CPython uses. We were previously using llvm's default of 64, which is pretty high -- probably fine for their use cases, but in Python programs with lots of sets/dicts we spend a lot of time doing malloc() for all those 1kB+ allocations. This also increases the time it takes to iterate over the dict/set, since there are more empty buckets that have to be read + skipped.
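The "1kB+" figure can be checked with some back-of-the-envelope arithmetic. The sketch below assumes a 16-byte bucket (one key pointer plus one value pointer on a 64-bit build); the real DenseMap bucket layout may differ, and the helper names are hypothetical.

```python
# Assumption: each bucket holds one 8-byte key pointer and one 8-byte
# value pointer. This is an illustrative figure, not measured from Pyston.
BUCKET_BYTES = 16


def initial_table_bytes(min_buckets, bucket_bytes=BUCKET_BYTES):
    """Bytes allocated for a freshly created table with `min_buckets` buckets."""
    return min_buckets * bucket_bytes


def empty_buckets_scanned(n_entries, min_buckets):
    """Empty buckets an iterator must skip in a table that has not grown yet."""
    return min_buckets - n_entries


old_bytes = initial_table_bytes(64)  # llvm's default minimum
new_bytes = initial_table_bytes(8)   # the lowered minimum

# A dict with 3 entries: how many empty buckets does iteration skip?
old_skipped = empty_buckets_scanned(3, 64)
new_skipped = empty_buckets_scanned(3, 8)
```

Under these assumptions a small dict drops from a 1024-byte initial allocation to 128 bytes, and iterating over three entries skips 5 empty buckets instead of 61.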

Support non-module-globals in the llvm tier

Lower initial dict/set size from 64->8

One of the problems was that, depending on whether all files were already parsed, we JITed more or fewer functions. This changed the total number of JITed functions, which in turn changed all the module names. As a result, we needed at least 3 runs to get all entries into the cache if the pyc files were old. The whole naming scheme should be improved at some point, but for now this should be enough.
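The cache-miss effect can be modeled in a few lines. This is a sketch under assumptions, not Pyston's actual naming code: the cache key is modeled as a hash over the module name plus the function source, `by_counter` stands in for a name that depends on how many functions were JITed before, and `by_content` is a hypothetical content-derived scheme used only for contrast.

```python
import hashlib


def by_counter(index, source):
    """Name depends on how many functions were JITed before this one."""
    return "module_%d" % index


def by_content(index, source):
    """Hypothetical alternative: name derived only from the function itself."""
    return "module_" + hashlib.sha1(source.encode()).hexdigest()[:8]


def cache_keys(functions, namer):
    """Model the object-cache key as a hash of module name + source."""
    keys = set()
    for index, source in enumerate(functions):
        blob = (namer(index, source) + source).encode()
        keys.add(hashlib.sha1(blob).hexdigest())
    return keys


# Run 1 JITs three functions; run 2 JITs one fewer, so the counters shift.
run1 = ["def a(): pass", "def b(): pass", "def c(): pass"]
run2 = run1[1:]

hits_counter = len(cache_keys(run1, by_counter) & cache_keys(run2, by_counter))
hits_content = len(cache_keys(run1, by_content) & cache_keys(run2, by_content))
```

With counter-based names, none of the entries from the first run are reusable in the second, while names that do not depend on the total count let both remaining functions hit the cache.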

add tp_nextiter implementation for our iterators

Generate better module names to make the object cache more effective

Reenable unboxed values

Remove unneeded pyston workarounds + fix misc cpython test errors

Remove an old assert

object cache: hash IR before running any opt passes

For types which don't have Py_TPFLAGS_HAVE_VERSION_TAG set we keep the old behaviour

typeLookup: guard on tp_version_tag instead of mro

…ings

This converts a lot of -(int) into (-int)

Clean up the __str__, __repr__ support and let them handle unicode strings

We've been allocating slice objects for slicing operations, but CPython's internal slice methods already take separate start+stop arguments. So if we see a slice, instead of creating the slice and sending it off to getitem, try calling PySequence_GetSlice. This will be slower for classes with user-defined __getitem__ functions that can handle slices; we can fix that by adding rewriting to this new endpoint, but it seems to not matter too much right now.
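To make the allocation concrete, the sketch below shows what a user-defined `__getitem__` receives for a slicing expression: a `slice` object built just for that call. The `Recorder` class is invented for this illustration and is not part of Pyston.

```python
class Recorder(object):
    """Sequence-like class that records every key passed to __getitem__."""

    def __init__(self, items):
        self.items = list(items)
        self.seen = []

    def __getitem__(self, key):
        self.seen.append(key)
        return self.items[key]


r = Recorder("abcdef")
part = r[1:4]       # the interpreter builds slice(1, 4, None) for this call
single = r[2]       # plain indexing passes the integer through unchanged
```

For built-in sequences the start and stop values can be handed over directly, which is the case the change targets; classes like `Recorder` still need the slice object, which is the slower path the commit message mentions.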

Some stuff to enable "test_set"

Avoid creating most slice objects

…C got rewritten in the bjit tier

LLVM tier: adjust num of IC slots depending on the num of times the IC got rewritten in the bjit tier

rewrite instance_setattro

kmod · 2015-10-23T22:33:53Z

Unless you feel pretty confident about this, we might want to hold off on it since I have no idea what kinds of subtle effects this might have :/

undingen · 2015-10-27T18:10:52Z

While I liked the django perf improvement I'm fine with not merging this change :)
If this shows up in the future as a large perf change, we can reevaluate whether we want to risk it.

kmod · 2016-02-02T00:09:49Z

It looks like we have an assert in the metaserver codebase that the Python build is UCS-4 :/
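The assert itself is not shown in this thread, so the following is only a guess at what such a check might look like, together with an illustration of why the storage width matters. `is_wide_unicode_build` and `utf16_code_units` are hypothetical helper names.

```python
import sys


def is_wide_unicode_build():
    """True when code points above U+FFFF are stored as a single unit.

    On Python 2 this distinguishes UCS-4 ("wide") from UCS-2 ("narrow")
    builds. On Python 3.3 and later it is always True.
    """
    return sys.maxunicode > 0xFFFF


def utf16_code_units(text):
    """Number of 16-bit units needed to store `text` in a 2-byte layout."""
    return len(text.encode("utf-16-le")) // 2


bmp_text = u"abc"            # all characters fit in one 16-bit unit each
astral = u"\U0001F600"       # one code point outside the Basic Multilingual Plane

bmp_units = utf16_code_units(bmp_text)
astral_units = utf16_code_units(astral)
```

A 2-byte representation needs a surrogate pair for the astral character, so code that assumes one storage unit per code point (for example in length or indexing checks) can behave differently between the two build types.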

Daetalus and others added 30 commits August 29, 2015 12:42

complex improvements

3159688

Move closer towards exposing the public gc interface only in one file.

7d5ea0c

Merge pull request pyston#875 from rudi-c/publicgc

8a1b2e0

Move closer towards exposing the public gc interface only in one file.

Add capi wrapper calling convention which does not require to pass ar…

cb36720

…gs in a tuple. Before we created a tuple just to pass 1 or 2 args and then immediately extracted them and destroyed the tuple again.

unicode_cls: set is_constant = true and is_user_defined = false

aacf932

In order to allow rewrites. I'm wondering if we can't enable this for more classes...

Merge pull request pyston#880 from undingen/wrapper

0a8385c

BoxedWrapper optimizations

When calling a BoxedWrapperDescriptor don't create a BoxedWrapperObject

0e98930

Optimize nonzero()

65e3ce5

- use callattr directly instead of getattr+runtimeCall
- compvar: teach it that None is never nonzero

Add a notion of redundant visits, useful for moving collectors.

ee5286c

For marking collectors, the redundant visits are no-ops to avoid the performance hit.

Redundantly visit some fields of hidden classes that weren't visited.

592f854

Merge pull request pyston#881 from undingen/wrapper2

c5d2083

when calling a BoxedWrapperDescriptor don't create a BoxedWrapperObject

Merge pull request pyston#817 from Daetalus/complex_improment

0abd691

Complex improvement

Merge pull request pyston#874 from rudi-c/redundantvisit

7c96b62

Notion of redundant visits to slowly move towards scanning everything

Rename 'old_parser' configuration to 'force_llvm'

db7da04

It does use the old parser, but it also forces the use of the llvm tier for everything which usually ends up being the more important part of the configuration.

Copy in llvm's DenseMap.h and DenseSet.h

4b115c1

With no modifications in this commit, so that we can track changes. Unmodified from llvm rev r230300.

Match DenseMap/Set to our codebase

22950ec

- add note + license
- put in our namespace
- change header guards
- make it match our lint style
- don't apply our formatting

Two more ubenches

ec6a4c7

Parameterize DenseMap on the minimum size

b7ad951

The default is 64, which is quite a lot for our uses.

Support getting custom globals via introspection

7267c07

I have no idea how/where this could get out of sync

46101af

Merge pull request pyston#882 from kmod/perf3

5b0b1fe

Support non-module-globals in the llvm tier

Merge pull request pyston#886 from kmod/perf2

7eec7fa

Lower initial dict/set size from 64->8

add tp_nextiter implementation for our iterators

38d3b28

microoptimize getIsDefinedName + startswith

5c60ed7

Merge pull request pyston#888 from undingen/wrapper4

5963912

add tp_nextiter implementation for our iterators

Merge pull request pyston#887 from undingen/objectcache_2runs

e44735c

Generate better module names to make the object cache more effective

Add infrastructure to call GC handler on stack-bound objects.

5e89ecb

kmod and others added 24 commits October 15, 2015 15:10

Merge pull request pyston#969 from kmod/unboxed3

464c98e

Reenable unboxed values

Merge pull request pyston#944 from undingen/workarounds

4818b5a

Remove unneeded pyston workarounds + fix misc cpython test errors

Remove an old assert

eea94f3

Merge pull request pyston#973 from kmod/unboxed_fix

accd183

Remove an old assert

Merge pull request pyston#970 from undingen/cache_optimized

758a9b3

object cache: hash IR before running any opt passes

typeLookup: guard on tp_version_tag instead of mro + Box::getattr

3ca2326

For types which don't have Py_TPFLAGS_HAVE_VERSION_TAG set we keep the old behaviour

Merge pull request pyston#966 from undingen/method_cache2

a9118c1

typeLookup: guard on tp_version_tag instead of mro

enable test_set and add more tests to set.py

da5d8d0

make dict not recompute the hash value of the element in set

e8fd652

Clean up the __str__, __repr__ support and let them handle unicode str…

1f22072

…ings

pypa: enable optimizations

dbd46f8

This converts a lot of -(int) into (-int)

some set stuff

7dc0be4

Merge pull request pyston#974 from undingen/str_cleanup

a419290

Clean up the __str__, __repr__ support and let them handle unicode strings

Work around a pyopenssl test bug

de4d556

Merge pull request pyston#921 from Daetalus/test_set

d88d20e

Some stuff to enable "test_set"

Merge pull request pyston#972 from kmod/avoid_slices

c380bbc

Avoid creating most slice objects

LLVM tier: adjust num of IC slots depending on the num of times the I…

f09ceff

…C got rewritten in the bjit tier

Merge pull request pyston#977 from undingen/adjust_ICs_slots

5472b5e

LLVM tier: adjust num of IC slots depending on the num of times the IC got rewritten in the bjit tier

rewrite instance_setattro

26cbd0d

Merge pull request pyston#976 from undingen/instance_setattr

a4b0a15

rewrite instance_setattro

Switch to 2byte unicode implementation like cpython does on linux

4c101f1

add missing template instances

e6fc53a

Update section ordering

fcd11d3

kmod added the idea label Feb 13, 2016

kmod force-pushed the master branch 2 times, most recently from 352fd89 to 6488a3e Compare October 28, 2020 21:01
