Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Structure code into submodules #15

Closed
fredrik-johansson opened this issue Apr 5, 2021 · 15 comments
Closed

Structure code into submodules #15

fredrik-johansson opened this issue Apr 5, 2021 · 15 comments

Comments

@fredrik-johansson
Copy link
Collaborator

Currently pyflint.pyx includes all .pyx files textually. This is a headache for development: changing any one file and recompiling forces a recompile of all of python-flint, which takes forever. There should be separate submodules that can be compiled independently.

@GiacomoPope
Copy link
Contributor

So, in #61 I did the absolute minimum to start exploring how we can start factoring this repository into submodules. The work so far has taken the base classes flint_* (e.g. flint_scalar) and put them into flint_base.flint_base. I have also taken the global context and placed this within flint_base.flint_context

For the remaining code, my thought was to do something like:

─ fmpz/
│  ├─ fmpz_types.pxd 
│  ├─ fmpz.pxd
│  ├─ fmpz.pyx
│  ├─ fmpz_vec.pxd
│  ├─ fmpz_vec.pyx
│  ├─ fmpz_mat.pxd
│  ├─ fmpz_mat.pyx
│  ├─ fmpz_poly.pxd
│  ├─ fmpz_poly.pyx
│  ├─ fmpz_mpoly.pxd
│  ├─ fmpz_mpoly.pyx
...

For each of the main types, then a user would do

from flint.fmpz import fmpz, fmpz_poly

The same would naturally extend to whatever functions we have.

@GiacomoPope
Copy link
Contributor

However, if we're going through the work of refactoring and building up sensible class structures, then I think we might want something like

cdef flint_elem:
cdef flint_scalar(flint_elem):
cdef generic_ring(flint_scalar):
cdef fmpz(generic_ring):
cdef fmpq(generic_ring):
...

and so maybe a good first step would be to include the generic ring bindings into submodules in a clean and well documented way, and then we can refactor the existing code to sit on top of all these generics?

@oscarbenjamin
Copy link
Collaborator

For each of the main types, then a user would do

from flint.fmpz import fmpz, fmpz_poly

This still makes it impossible to import fmpz without loading fmpz_poly etc.

I think that it is better to just have precisely two ways of importing things:

Either use fully qualified imports so

from flint.fmpz.fmpz_poly import fmpz_poly

or we provide a convenience layer:

from flint.interactive import *

The codebase should only ever use fully qualified imports internally and each module should only import the things that it needs so that one import does not recursively import everything else.

I suggested having levels here:
#55 (comment)

Probably all "level 1" things like these should be under a particular submodule like flint.types or something and then levels 0 and 2 could have separate submodules.

Some Python tooling (Sphinx, mypy etc) struggles with the idea that a module has the same name as something inside the module like:

from flint.fmpz.fmpz import fmpz

It is especially problematic if the flint/fmpz/__init__.py does from .fmpz import fmpz because it creates an ambiguity about whether flint.fmpz.fmpz means the module or the type. I suggest just systematically adding something like _type to the name so then it could just be something like:

from flint.types.fmpz_type import fmpz

It is a bit redundant but that is not a problem for the internals of the codebase as long as the import statements are systematic and predictable. It is also fine to have long import statements like this for downstream codebases as long as module names are stable over time so it is better to make sure that you would never regret putting something at top-level in future.

For interactive use it is not nice to have to fully qualify the imports, hence flint.interactive (perhaps a better name can be used).

@GiacomoPope
Copy link
Contributor

GiacomoPope commented Aug 21, 2023

OK -- this looks sensible to me. I personally will be offline for a few weeks, so will be delayed in any work on this but my rough plan would be:

  1. First remove all includes from the main file
    1. Create a corresponding .pxd for each .pyx for each current type. This will allow use to lighten _flint too, as we can have the type specific header info in these .pxd files.
    2. Add these files to the dir. types with the following structure
    ─ types/
    │  ├─ flint_types.pxd
    │  ├─ fmpz_type/
    │  │  ├─ fmpz.pxd
    │  │  ├─ fmpz.pyx
    │  ├─ fmpz_vec_type/
    │  │  ├─ fmpz_vec.pxd
    │  │  ├─ fmpz_vec.pyx
    │  ├─ fmpz_mat_type/
    │  │  ├─ fmpz_mat.pxd
    │  |  ├─ fmpz_mat.pyx
    │  ├─ fmpz_poly_type/
    │  │  ├─ fmpz_poly.pxd
    │  │  ├─ fmpz_poly.pyx
    │  ├─ fmpz_mpoly_type/
    │  │  ├─ fmpz_mpoly.pxd
    │  │  ├─ fmpz_mpoly.pyx
    │  ├─ fmpq_type/
    │  │  ├─ ...
    │  ├─ nnmod_type/
    │  │  ├─ ...
    
    1. Import all of these back into pyflint.pxd so that all current tests still work

Then, after this has been done, the PR can be modified so that the imports throughout the module work nicely at the python level?

@oscarbenjamin
Copy link
Collaborator

2. Add these files to the dir. types with the following structure

─ types/
│  ├─ fmpz_type/
│  │  ├─ fmpz.pxd
│  │  ├─ fmpz.pyx
│  ├─ fmpz_vec_type/
│  │  ├─ fmpz_vec.pxd
│  │  ├─ fmpz_vec.pyx

Would this create packages i.e with __init__.py for each of fmpz_type etc (not sure how this works in Cython)?

Also would it mean that there is an fmpz module with the same name as the type it contains e.g.:

from flint.types.fmpz_type.fmpz import fmpz

Perhaps simpler would just be

flint/
├─ types/
│  ├─ __init__.py
│  ├─ fmpz_type.pxd
│  ├─ fmpz_type.pyx
│  ├─ fmpz_vec_type.pxd
│  ├─ fmpz_vec_type.pyx

Then each .pyx becomes a runtime module in the types package and it is:

from flint.types.fmpz_type import fmpz

I'm not especially tied to any particular names here like types or _type but I do think that:

  • Everything should be in a subpackage analogous to types/ above (this is needed for forward compatibility in case any redesign of the public import interface might ever be wanted in future).
  • The module names should be different from the types (although in a predictable way).
  • The general naming scheme and structure should be simple and predictable both for source layout and for knowing how to write the import statement.
  • The overall structure should make it (at least in principle) possible that you could import only a subset of things without loading every module into memory at runtime.

@GiacomoPope
Copy link
Contributor

I think your suggestion is better and agree with your bullet points.

After the refactor, I think what we can do in pyflint.pyx is

from flint.types.fmpz_type cimport fmpz
fmpz = fmpz

and then the .pxd will export it as it currently is done. Of course, we then want to do this

from flint.interactive import *

Or

from flint.types import fmpz

in the python world (or something similar) but this will be solved as we go?

@oscarbenjamin
Copy link
Collaborator

f course, we then want to do this

from flint.interactive import *

That would be my suggestion. Again I'm not tied to "interactive" as a name. Some people would certainly want to use it in a non-interactive context.

Ideally it would be all like:

from flint.all import *

However "all" is a builtin in Python so that name is best avoided.

Or

from flint.types import fmpz

It would have to be

from flint.types.fmpz_type import fmpz

because if the flint/types/__init__.py imports fmpz etc from each submodule then it again becomes impossible to import anything without loading everything into memory.

Even if we don't want to make a change to from flint import * yet I think that it is important to verify that the structure would make it possible to avoid loading everything into memory so that change should be tested (and memory use etc checked) before committing to any structure now.

@GiacomoPope
Copy link
Contributor

In the context of #59 I'm back from my holiday and so can assist with the refactoring of the current code if this would be helpful. Once we have all come to a solid idea of a plan we should be able to all work together to make quicker work of the refactoring

@deinst
Copy link
Contributor

deinst commented Sep 4, 2023

If you could look at my fork and make any suggestions, they would be appreciated. I know the structure is all wrong and will spend some time today to rectify that by moving everything under a types subdirectory.

@deinst
Copy link
Contributor

deinst commented Sep 4, 2023

Now that I have caffinated, there are two simple questions that I basically punted.

  1. Where do we store the ctx object? flint_base.flint_context seems like the best place. Why do ctx and thectx both exist?
  2. What to do with FMPZ_UNKNOWN, FMPZ_REF, and FMPZ_TMP. It seems best to just cdef them as constants in _flint.pxd

@GiacomoPope
Copy link
Contributor

I don't have good answers for these:

  1. I have no idea, for me it seems like thectx should simply be ctx and there's no need for both pointers to the global context to be available.

  2. This seems sensible.

I had a look at your branch, as well as the movement of _XXX.pyx into /type/XXX.pyx I think it would be good to remove the need for

from flint._flint cimport *

and instead move the type specific imports from _flint into the type .pxd files so that the only remaining imports from _flint are the generic types which are used by many different types.

As an example, I was imagining moving

cdef extern from "flint/nmod_mat.h":
    ctypedef struct nmod_mat_struct:
        mp_limb_t * entries
        long r
        long c
        mp_limb_t ** rows
        nmod_t mod
    ctypedef nmod_mat_struct nmod_mat_t[1]
    mp_limb_t nmod_mat_entry(nmod_mat_t mat, long i, long j)
    long nmod_mat_nrows(nmod_mat_t mat)
    long nmod_mat_ncols(nmod_mat_t mat)
    void _nmod_mat_set_mod(nmod_mat_t mat, mp_limb_t n)
    void nmod_mat_init(nmod_mat_t mat, long rows, long cols, mp_limb_t n)
    void nmod_mat_init_set(nmod_mat_t mat, nmod_mat_t src)
    void nmod_mat_clear(nmod_mat_t mat)
    void nmod_mat_randtest(nmod_mat_t mat, flint_rand_t state)
    void nmod_mat_randfull(nmod_mat_t mat, flint_rand_t state)
    void nmod_mat_randrank(nmod_mat_t, flint_rand_t state, long rank)
    void nmod_mat_randops(nmod_mat_t mat, long count, flint_rand_t state)
    void nmod_mat_randtril(nmod_mat_t mat, flint_rand_t state, int unit)
    void nmod_mat_randtriu(nmod_mat_t mat, flint_rand_t state, int unit)
    void nmod_mat_print_pretty(nmod_mat_t mat)
    int nmod_mat_equal(nmod_mat_t mat1, nmod_mat_t mat2)
    int nmod_mat_is_zero(nmod_mat_t mat)
    int nmod_mat_is_empty(nmod_mat_t mat)
    int nmod_mat_is_square(nmod_mat_t mat)
    void nmod_mat_zero(nmod_mat_t mat)
    void nmod_mat_set(nmod_mat_t B, nmod_mat_t A)
    void nmod_mat_transpose(nmod_mat_t B, nmod_mat_t A)
    void nmod_mat_add(nmod_mat_t C, nmod_mat_t A, nmod_mat_t B)
    void nmod_mat_sub(nmod_mat_t C, nmod_mat_t A, nmod_mat_t B)
    void nmod_mat_neg(nmod_mat_t B, nmod_mat_t A)
    void nmod_mat_scalar_mul(nmod_mat_t B, nmod_mat_t A, mp_limb_t c)
    void nmod_mat_mul(nmod_mat_t C, nmod_mat_t A, nmod_mat_t B)
    void nmod_mat_mul_classical(nmod_mat_t C, nmod_mat_t A, nmod_mat_t B)
    void nmod_mat_mul_strassen(nmod_mat_t C, nmod_mat_t A, nmod_mat_t B)
    void nmod_mat_addmul(nmod_mat_t D, nmod_mat_t C, nmod_mat_t A, nmod_mat_t B)
    void nmod_mat_submul(nmod_mat_t D, nmod_mat_t C, nmod_mat_t A, nmod_mat_t B)
    mp_limb_t nmod_mat_det(nmod_mat_t A)
    long nmod_mat_rank(nmod_mat_t A)
    int nmod_mat_inv(nmod_mat_t B, nmod_mat_t A)
    void nmod_mat_solve_tril(nmod_mat_t X, nmod_mat_t L, nmod_mat_t B, int unit)
    void nmod_mat_solve_triu(nmod_mat_t X, nmod_mat_t U, nmod_mat_t B, int unit)
    long nmod_mat_lu(long * P, nmod_mat_t A, int rank_check)
    int nmod_mat_solve(nmod_mat_t X, nmod_mat_t A, nmod_mat_t B)
    int nmod_mat_solve_vec(mp_ptr x, nmod_mat_t A, mp_srcptr b)
    long nmod_mat_rref(nmod_mat_t A)
    long nmod_mat_nullspace(nmod_mat_t X, nmod_mat_t A)

From _flint.pxd into _nmod_mat.pxd

@oscarbenjamin
Copy link
Collaborator

  1. Where do we store the ctx object? flint_base.flint_context seems like the best place. Why do ctx and thectx both exist?

One name only exists in C and the other is a runtime Python object:

In [1]: import flint

In [2]: flint.ctx
Out[2]: 
pretty = True      # pretty-print repr() output
unicode = False    # use unicode characters in output
prec = 53          # real/complex precision (in bits)
dps = 15           # real/complex precision (in digits)
cap = 10           # power series precision
threads = 1        # max number of threads used internally

In [3]: flint.thectx
---------------------------------------------------------------------------
AttributeError 

Maybe it is possible to do without thectx if ctx is just declared without cdef in the first place or maybe that would slow things down or something.

2. What to do with FMPZ_UNKNOWN, FMPZ_REF, and FMPZ_TMP. It seems best to just cdef them as constants in _flint.pxd

Currently these are defined as DEF:

DEF FMPZ_UNKNOWN = 0
DEF FMPZ_REF = 1
DEF FMPZ_TMP = 2

That is deprecated but no alternative for cdef constants exists yet:
cython/cython#4310 (comment)

@deinst
Copy link
Contributor

deinst commented Sep 4, 2023

I'll leave the FMPZ constants as they are, but the endless warnings are annoying. It looks like the ultimate solution for DEF is still unresolved.

Yes, I will soon split _flint.pxd into separate module specific pieces. This is essentially level 0 of issue #55 . The various submodules will be libflint/fmpz.pxd, libflint/fmpz_poly.pxd, etc.

@GiacomoPope
Copy link
Contributor

Seems like you have this next refactor covered but lmk if I can help

@oscarbenjamin
Copy link
Collaborator

I think this is fixed now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants