Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cythonize CombinatorialFreeModuleElement #22632

Closed
tscrim opened this issue Mar 17, 2017 · 95 comments
Closed

Cythonize CombinatorialFreeModuleElement #22632

tscrim opened this issue Mar 17, 2017 · 95 comments

Comments

@tscrim
Copy link
Collaborator

tscrim commented Mar 17, 2017

We move CombintorialFreeModuleElement to its own class IndexedFreeModuleElement and cythonize it for (future) speed gains.

CC: @sagetrac-sage-combinat @nthiery @jdemeyer @fchapoton

Component: performance

Keywords: days85

Author: Travis Scrimshaw

Branch: 20cd0da

Reviewer: Nicolas M. Thiéry

Issue created by migration from https://trac.sagemath.org/ticket/22632

@tscrim tscrim added this to the sage-8.0 milestone Mar 17, 2017
@tscrim
Copy link
Collaborator Author

tscrim commented Mar 17, 2017

@tscrim
Copy link
Collaborator Author

tscrim commented Mar 17, 2017

New commits:

e918674Moving CombinatorialFreeModuleElement to own (Cython) file.

@tscrim
Copy link
Collaborator Author

tscrim commented Mar 17, 2017

Commit: e918674

@hivert
Copy link

hivert commented Mar 17, 2017

comment:3

I Travis,

I'm curious, is there any performance gain. I had the impression that storing everything in python dict simply kills all possibilities of optimization ?

We discussed during past sage days, to re-implement combinatorial free modules, using arrays with a global dict for all object used only during input and output and not during linear algebra.

Florent

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Mar 17, 2017

Changed commit from e918674 to 6b9d88b

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Mar 17, 2017

Branch pushed to git repo; I updated commit sha1. New commits:

6b9d88bDoing some additional tweaks.

@tscrim
Copy link
Collaborator Author

tscrim commented Mar 17, 2017

comment:5

There seems to be a little bit with doing basic operations:

sage: C = CombinatorialFreeModule(QQ, [1,2,3])
sage: x = sum(C.basis())
sage: %timeit [x + x for _ in range(1000)]
100 loops, best of 3: 2.57 ms per loop

vs old:

sage: C = CombinatorialFreeModule(QQ, [1,2,3])
sage: x = sum(C.basis())
sage: %timeit [x + x for _ in range(1000)]
100 loops, best of 3: 2.94 ms per loop

While it's not much, it is something like a 10-15% speedup, but it is something likely to be called a significant number of times. It's even more so for scalar multiplication:

sage: s = SymmetricFunctions(ZZ).s()
sage: x = s.an_element()
sage: %timeit [4*x for _ in range(1000)]
100 loops, best of 3: 2.49 ms per loop

vs old:

sage: %timeit [4*x for _ in range(1000)]
100 loops, best of 3: 3.42 ms per loop

I might even be able to get some more speed if I am smarter in how we create the results of, e.g., _add_.

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Mar 17, 2017

Branch pushed to git repo; I updated commit sha1. New commits:

f1fcdd9Do not go through _from_dict but directly create an element.

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Mar 17, 2017

Changed commit from 6b9d88b to f1fcdd9

@tscrim
Copy link
Collaborator Author

tscrim commented Mar 17, 2017

comment:7

That gives a very nice speed boost:

sage: C = CombinatorialFreeModule(QQ, [1,2,3])
....: x = sum(C.basis())
....: %timeit [x + x for _ in range(1000)]
....: 
1000 loops, best of 3: 1.84 ms per loop
sage: s = SymmetricFunctions(ZZ).s()
sage: x = s.an_element()
sage: %timeit [4*x for _ in range(1000)]
100 loops, best of 3: 2.05 ms per loop

Now those Python calls become relatively expensive, so we should avoid them.

@tscrim
Copy link
Collaborator Author

tscrim commented Mar 17, 2017

comment:8

Frédéric, I'm cc-ing you because you might want to note that I am taking care of the __cmp__ issue here as well.

@fchapoton
Copy link
Contributor

comment:9

many broken doctests..

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Mar 25, 2017

Branch pushed to git repo; I updated commit sha1. New commits:

1d4aa1cFixing unpickling issues.
4ee3313Fixing Cython not inheriting `_rmul_` and `_lmul_` from different signatures.

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Mar 25, 2017

Changed commit from f1fcdd9 to 4ee3313

@tscrim
Copy link
Collaborator Author

tscrim commented Mar 25, 2017

comment:11

Fixed the pickling issues and the other failures (incompatible signatures for _lmul_ and _rmul_).

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Mar 28, 2017

Changed commit from 4ee3313 to d87735d

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Mar 28, 2017

Branch pushed to git repo; I updated commit sha1. New commits:

d87735dFixed trivial doctest failure.

@tscrim
Copy link
Collaborator Author

tscrim commented Apr 14, 2017

comment:13

Patchbot is green (essentially, doctest failure is independent).

@tscrim
Copy link
Collaborator Author

tscrim commented May 13, 2017

comment:14

ping

@nthiery
Copy link
Contributor

nthiery commented May 16, 2017

comment:15
diff --git a/src/doc/en/thematic_tutorials/tutorial-objects-and-classes.rst b/src/doc/en/thematic_tutorials/tutorial-objects-and-classes.rst
index 3f964bf..749ab5f 100644
--- a/src/doc/en/thematic_tutorials/tutorial-objects-and-classes.rst
+++ b/src/doc/en/thematic_tutorials/tutorial-objects-and-classes.rst
@@ -285,8 +285,7 @@ Some particular actions modify the data structure of ``el``::
        sage: el.__class__
         <class 'sage.combinat.free_module.CombinatorialFreeModule_with_category.element_class'>
        sage: el.__dict__
-       {'__custom_name': 'foo',
-        '_monomial_coefficients': {[1, 2, 3]: 1, [1, 3, 2]: 3}}
+       {'__custom_name': 'foo'}

Plausibly the explanation around that change needs to be updated a bit: there remains no "normal" attribute in this example. Maybe el.foo=1 could be issued before, and a comment added after that _monomial_coefficients is a Cython attribute?

In general, this seems to be calling for changing the running example for this tutorial. The current one is a bit complicated. But that's for a later ticket.

@nthiery
Copy link
Contributor

nthiery commented May 16, 2017

comment:16
@@ -242,7 +238,8 @@ class FreeAlgebraElement(AlgebraElement, CombinatorialFreeModuleElement):
             if self_on_left:
                 return Factorization([(self, 1)]) * scalar
             return scalar * Factorization([(self, 1)])
-        return super(FreeAlgebraElement, self)._acted_upon_(scalar, self_on_left)
+        ret = super(FreeAlgebraElement, self)._acted_upon_(scalar, self_on_left)
+        return ret
 

What's the point?

@nthiery
Copy link
Contributor

nthiery commented May 20, 2017

comment:63

Arg, the exact same _mul_ mistake has appeared a couple days ago in some other ticket.

I'll set back a positive review when the pdf doc compilation will be finished, in case there wouldbe something else.

Thanks Volker for reporting ...

@nthiery
Copy link
Contributor

nthiery commented May 20, 2017

comment:64

The only other error I get is:

[docpdf] l.42 \addto
[docpdf]            \extrasmagyar{\def\pageautorefname{page}}

And I assume this is related to an outdate latex install on my machine:

[docpdf] Package sphinx Warning: 
[docpdf] ******** ERROR !! PLEASE UPDATE titlesec.sty !!********
[docpdf] ******** THIS VERSION SWALLOWS SECTION NUMBERS.********.

So, putting this back to positive review. Travis, can you double check things compile on your machine?

@tscrim
Copy link
Collaborator Author

tscrim commented May 20, 2017

comment:66

pdf doc built for me too.

@vbraun
Copy link
Member

vbraun commented May 21, 2017

Changed branch from public/combinat/cythonize_CFM_element-22632 to 20cd0da

@jdemeyer
Copy link

Changed commit from 20cd0da to none

@jdemeyer
Copy link

comment:68

Question: what exactly is the purpose of the condition involving is_extension_type() in __make_element_class__?

I am trying to understand how/why is_extension_type is used in Sage. In other places in parent.pyx, it is to determine whether __class__ can be assigned to. But in __make_element_class__, it's not so clear to me.

(I know it's not directly related to this ticket, but git blame brought me here since this ticket made changes to __make_element_class__)

@nthiery
Copy link
Contributor

nthiery commented Jul 13, 2017

comment:69

Hi Jeroen!

In __make_element_class__, we need to decide how elements shall inherit from categories: either by subclassing P.Element (or the relevant class), or by using the getattr trick. The current rule is to subclass if P.Element is a Python class, and use getattr if P.Element is an extension type. is_extension_type is used to do the test.

@jdemeyer
Copy link

comment:70

Thanks. That answers the factual question ("how is it done?") but not really the reason.

There are many differences between Python and Cython classes and I'm trying to understand which property matters here. I am asking because I tried to always use the dynamic class (using inherit=True in __make_element_class__ even for Cython element classes) and there is not much that breaks.

These are the main differences between Python and Cython classes that I can think of:

  1. The fact that __class__ can be assigned to for Python classes (I don't think this is the reason here)

  2. The fact that Python classes have a __dict__ (this is no longer true: Python classes with __slots__ don't have a __dict__ by default and Cython classes do support __dict__ if you declare it as attribute)

  3. The fact that Python classes always support multiple inheritance, which may or may not work for Cython classes

  4. Pickling (although Cython 0.26 does support pickling of Cython classes)

  5. Various performance differences

  6. ...

@jdemeyer
Copy link

comment:71

Replying to @jdemeyer:

I am asking because I tried to always use the dynamic class (using inherit=True in __make_element_class__ even for Cython element classes) and there is not much that breaks.

To elaborate on this: after making this change

diff --git a/src/sage/structure/parent.pyx b/src/sage/structure/parent.pyx
index a387136..ea7c60c 100644
--- a/src/sage/structure/parent.pyx
+++ b/src/sage/structure/parent.pyx
@@ -571,7 +571,7 @@ cdef class Parent(sage.structure.category_object.CategoryObject):
         # By default, don't fiddle with extension types yet; inheritance from
         # categories will probably be achieved in a different way
         if inherit is None:
-            inherit = not is_extension_type(cls)
+            inherit = True
         if inherit:
             if name is None:
                 name = "%s_with_category"%cls.__name__

I ran all doctests in rings, categories and structure and the only non-trivial doctest failures were involving enumerated sets (src/sage/categories/examples/infinite_enumerated_sets.py, src/sage/categories/sets_cat.py, src/sage/categories/enumerated_sets.py)

@nthiery
Copy link
Contributor

nthiery commented Jul 14, 2017

comment:72

Hi Jeroen,

Thanks for running this interesting experiment!

I indeed would have been expecting things not too break too much. Systematically using dynamic classes would certainly simplify Sage's infrastructure. The only potential reason for not doing it is performance (speed and memory footprint). At the time we set this up, there was some consensual claim that systematically using a dynamic class would induce a performance loss for low granularity objects (e.g. finite field elements). Hence the above rule of thumb I implemented for deciding when to do what.

However I don't remember anyone doing some serious benchmarking to actually assess the validity of that claim. That would be very welcome! And this hints again at having some speed-regression test infrastructure for Sage.

Cheers,
Nicolas

PS: I could imagine code breaking in situations like:

cdef class MyElement:
    ....

cdef foo(MyElement x):
    x.bar()

where foo would be called on an instance of a dynamic subclass of MyElement, and foo be provided by some category.

Apparently we have no instance of that idiom in Sage :-)

@jdemeyer
Copy link

comment:73

Comments on this actual ticket: contrary to most element classes which are implemented in Cython, it seems that CombinatorialFreeModule_with_category.element_class actually does do the inheritance from the category in the "normal" way, using dynamic_class(), just like Python classes. So this is a good testcase for #23435.

@jdemeyer
Copy link

comment:74

In the light of #23435, is it important that CombinatorialFreeModule_with_category.element_class has a __dict__? In other words, is it important to support the following (example taken from src/doc/en/thematic_tutorials/tutorial-objects-and-classes.rst):

Some particular actions modify the data structure of ``el``::

    sage: el.rename("bla")
    sage: el
    bla

.. note::

    The class is stored in a particular attribute called ``__class__``,
    and the normal attributes are stored in a dictionary called ``__dict__``::

        sage: F = CombinatorialFreeModule(QQ, Permutations())
        sage: el = 3*F([1,3,2])+ F([1,2,3])
        sage: el.rename("foo")
        sage: el.blah = 42
        sage: el.__class__
        <class 'sage.combinat.free_module.CombinatorialFreeModule_with_category.element_class'>
        sage: el.__dict__
        {'__custom_name': 'foo',
         'blah': 42}

@jdemeyer
Copy link

comment:75

Another question about this ticket: why are the arithmetic methods of IndexedFreeModuleElement implemented as cdef instead of cpdef?

    cdef _add_(self, other):

Every other class (except for IndexedFreeModuleElement and LieAlgebraElement) implements those as cpdef _add_.

This is confusing because

sage: F = CombinatorialFreeModule(QQ, Permutations())
sage: el = 3*F([1,3,2])+ F([1,2,3])
sage: el._add_(el)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-31-91def7298b70> in <module>()
----> 1 el._add_(el)

TypeError: 'NotImplementedType' object is not callable

It also prevents a Python class inheriting from IndexedFreeModuleElement to override this _add_.

@tscrim
Copy link
Collaborator Author

tscrim commented Jul 15, 2017

comment:76

Replying to @jdemeyer:

Another question about this ticket: why are the arithmetic methods of IndexedFreeModuleElement implemented as cdef instead of cpdef?

    cdef _add_(self, other):

Every other class (except for IndexedFreeModuleElement and LieAlgebraElement) implements those as cpdef _add_.

That is not quite true. They are cdef in Element, and so I followed suit. I see now in the doc for Element, that it says it should be cpdef. Is it because Element is special in that they are essentially acting as pure virtual methods that they aren't cpdef?

@jdemeyer
Copy link

comment:77

Follow-up: #23440

@jdemeyer
Copy link

comment:78

Replying to @tscrim:

That is not quite true. They are cdef in Element, and so I followed suit. I see now in the doc for Element, that it says it should be cpdef. Is it because Element is special in that they are essentially acting as pure virtual methods that they aren't cpdef?

Yes, the Element base class is special. The arithmetic methods of Element do indeed act as "pure virtual" methods. I implemented them that way to allow fast calling in the Cython world while at the same time not existing in the Python world.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants