Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Upgrade igraph/C to 0.10 #840

Merged
merged 362 commits into from
Dec 11, 2023
Merged

feat: Upgrade igraph/C to 0.10 #840

merged 362 commits into from
Dec 11, 2023

Conversation

Antonov548
Copy link
Contributor

@Antonov548 Antonov548 commented Jun 13, 2023

based on: #768

  • error on NOTE in GHA
  • @param normalized TBD
  • review changes to get.incidence.dense()
  • Instances of #' if
  • skip("Waiting for #935")
  • Make glpk (and others) optional again with an anticonf script
  • Unskip tests which are freezing on Windows 🔴 🅱️
  • Unskip tests related to random seed, or refactor seeds to make them seed-free 🔴 🅱️ (@krlmlr)
  • Implement sparse in as_adjacency_matrix 🔴
  • Fix is_tree auto-generation (auto-generated code is currently manually modified) 🔴 🅱️
  • Finalize random number generation 🔴 (🅱️) (@krlmlr)
  • Review and merge this PR 🔴 🅱️ (@krlmlr)
  • Investigate failing revdepchecks 🔴 (🅱️) (@Antonov548)
  • Use appropriate types and API consistently:
    • Always use long vector API in R (e.g. xlength instead of length, check what else needs changes) 🟡
    • Make sure that all indices in R are transferred to C as double, then converted to igraph_integer_t. 🟡
    • All enums should be transferred as int, then converted to the appropriate enum type.
    • Stimulus: Check for overflow and NaN when doing double -> igraph_integer_t conversion.
    • Stimulus: Check that scalars are scalars during conversion (i.e. do not try to index into a zero-length vector with REAL(vec)[0]) — waiting on refactor: make_empty_graph() is now fully auto-generated #1068
    • Check matrix dimensions during conversion: row and column counts are limited to what an int can hold, but the number of matrix elements is not. 🟡
  • Make sure that error codes are checked and handled in all manually written and all auto-generated code. This is essential now that the error handler must return. Previously the error handler would just longjmp(), so there is likely quite a bit of sloppiness about this in the current code, which can potentially lead to crashes. Warning, the test suite won't catch these issues! 🟡
  • Move more functions to be auto-generated, so that they don't need to be updated for 0.10 manually, and so that they get all the type conversion checks without extra work.
  • Go through C/igraph's changelog, and update R docs for all functions that changed in 0.10. 🔴
  • Caching:
    • Make sure that the property cache is working
    • Expose igraph_invalidate_cache() and use it as appropriate in tests 🟡
  • ... ?

Closes #753

@Antonov548
Copy link
Contributor Author

@krlmlr
I have removed 32 mode from igraph.

Related to this I have question: How to deal with input and output vectors which in C igraph are 64 integer and on R should be double? Does it mean that we need to always copy such vectors with type conversion? Since currently we just have use memory mapping for input vectors and for output vectors just memory copy.

And should we keep logic of memory mapping for C double vectors?

@krlmlr
Copy link
Contributor

krlmlr commented Jun 13, 2023

I don't follow. AFAICT we no longer are using memory mapping in the main branch.

@krlmlr
Copy link
Contributor

krlmlr commented Jun 13, 2023

Oh, we do, in R_SEXP_to_vector() . Can we replace all these calls with R_SEXP_to_vector_copy() and make sure that the memory is properly freed? I think we should do this in a separate branch, off the main branch, before switching to 0.10.

@szhorvat
Copy link
Member

It makes sense to avoid a copy when handling vectors of doubles, otherwise performance may be severely impared in some cases. Consider weighted graphs: there's no reason to copy the weigths if they are already stored as doubles. Some algorithms will only access a few of the edges, so avoiding copying all edge weights will be a significant saving.

@krlmlr
Copy link
Contributor

krlmlr commented Jun 13, 2023

Oh, right! So, we need to be careful about which instances to change.

@Antonov548
Copy link
Contributor Author

Oh, right! So, we need to be careful about which instances to change.

Okay. I will start update of types conversion in new pull request.

@krlmlr
Copy link
Contributor

krlmlr commented Jun 19, 2023

What are the next steps here?

@Antonov548
Copy link
Contributor Author

Antonov548 commented Jun 19, 2023

What are the next steps here?

I think to have such next steps:

  • Merge with main branch
  • Fix compilation errors of C code based on old branch for igraph 0.10
  • Define next steps for R interface

@krlmlr
Copy link
Contributor

krlmlr commented Jun 29, 2023

Thanks. This still has weird compilation errors mentioning "igraph.h" on GitHub Actions. Does it work locally?

@Antonov548
Copy link
Contributor Author

Thanks. This still has weird compilation errors mentioning "igraph.h" on GitHub Actions. Does it work locally?

I have errors related to linking in R. On gitpod I also haven't problem with include.

@krlmlr
Copy link
Contributor

krlmlr commented Jun 29, 2023

Can't build locally, with weird errors. Trying in https://github.com/cynkra/rig-ubuntu now.

@Antonov548
Copy link
Contributor Author

Can't build locally, with weird errors. Trying in https://github.com/cynkra/rig-ubuntu now.

Can it be because different Makevars is used?

@krlmlr
Copy link
Contributor

krlmlr commented Jun 29, 2023

In the Docker image:

$ R CMD INSTALL . --no-byte-compile
...
** libs
using C compiler: ‘gcc (Ubuntu 11.3.0-1ubuntu1~22.04.1) 11.3.0’
using C++ compiler: ‘g++ (Ubuntu 11.3.0-1ubuntu1~22.04.1) 11.3.0’
make: Nothing to be done for 'all'.
installing to /root/R/x86_64-pc-linux-gnu-library/4.3/00LOCK-workspace/00new/igraph/libs
** R
** demo
** inst
** preparing package for lazy loading
** help
*** installing help indices
*** copying figures
** building package indices
** installing vignettes
** testing if installed package can be loaded from temporary location
Error: package or namespace load failed for ‘igraph’ in dyn.load(file, DLLpath = DLLpath, ...):
 unable to load shared object '/root/R/x86_64-pc-linux-gnu-library/4.3/00LOCK-workspace/00new/igraph/libs/igraph.so':
  /root/R/x86_64-pc-linux-gnu-library/4.3/00LOCK-workspace/00new/igraph/libs/igraph.so: undefined symbol: R_igraph_scg_adjacency
Error: loading failed
Execution halted
ERROR: loading failed
* removing ‘/root/R/x86_64-pc-linux-gnu-library/4.3/igraph’

@krlmlr
Copy link
Contributor

krlmlr commented Jun 29, 2023

Priorities:

  • Fix in Docker container
  • Fix in CI/CD
  • Fix in our respective development environments

We're on a good path! Thanks for your efforts.

@Antonov548
Copy link
Contributor Author

  • Fix in Docker container

I have compiling rigraph with 0.10 (Still some warnings which should be fixed).

Now it's time to fix R code and communication R with C. I have some weird errors when I try to run tests. @krlmlr Would be great if you can check it.

@krlmlr
Copy link
Contributor

krlmlr commented Jun 29, 2023

What do the errors look like? Does it happen in the Docker image?

@szhorvat
Copy link
Member

szhorvat commented Jun 29, 2023

R_SEXP_to_igraph() now looks like this:

int R_SEXP_to_igraph(SEXP graph, igraph_t *res) {

  res->n=R_igraph_get_n(graph);
  res->directed=R_igraph_get_directed(graph);
  R_igraph_get_from(graph, &res->from);
  R_igraph_get_to(graph, &res->to);
  R_igraph_get_oi(graph, &res->oi);
  R_igraph_get_ii(graph, &res->ii);
  R_igraph_get_os(graph, &res->os);
  R_igraph_get_is(graph, &res->is);

  /* attributes */
  REAL(VECTOR_ELT(VECTOR_ELT(graph, igraph_t_idx_attr), 0))[0] = 1; /* R objects refcount */
  REAL(VECTOR_ELT(VECTOR_ELT(graph, igraph_t_idx_attr), 0))[1] = 0; /* igraph_t objects */
  res->attr=VECTOR_ELT(graph, igraph_t_idx_attr);

  return 0;
}

Why are the fields copied one by one instead of just copying the entire igraph_t in one go, and then setting the attribute table? It is generally fine to do igraph_t g1, g2; g2 = g1 for as long as g2 and g1 are treated as read-only (!!).

Note that in 0.10 there's also a cache entry which must be set correctly.

Is there an problem I'm missing with a simple *res = *R_igraph_get_pointer(graph), then setting the attribute table?

@szhorvat
Copy link
Member

szhorvat commented Jun 29, 2023

Similarly, R_SEXP_to_igraph_copy() should use igraph_copy(), as this function knows how to copy the cache. You don't want to re-implement cache copying in the R interface, as the cache structure might change across even bugfix versions. It is not part of the ABI!

I am not really familiar with the R attribute handler. You'll need to make sure that the copy field of igraph_attribute_table_t is properly implemented for igraph_copy() to work correctly.

@Antonov548
Copy link
Contributor Author

Why are the fields copied one by one instead of just copying the entire igraph_t in one go, and then setting the attribute table?

It was just changed iterativly. Firstly with getters of fields, then with pointer to igraph. So it's just about refactoring I think. In general I guess it's possible what you suggested. But I would like to do this in seperate pull request instead of this one. Since this is not changes which are needed for 0.10 migration.

@szhorvat
Copy link
Member

Since this is not changes which are needed for 0.10 migration.

You need to handle the cache somehow before 0.10 will work correctly.

In fact, is it even necessary to copy one igraph_t into another? Would it not be sufficient to work with a single igraph_t?

The problem with making refactoring too incremental is that you won't see the forest for the tree. Some changes can't be decoupled from each other.

@Antonov548
Copy link
Contributor Author

The problem with making refactoring too incremental is that you won't see the forest for the tree. Some changes can't be decoupled from each other.

Oh, I see. For caching it's indeed required to change.

@krlmlr
Copy link
Contributor

krlmlr commented Jun 29, 2023

R_SEXP_to_igraph() already looks like this in the 1.5.0 version that is on CRAN now. If this is wrong, we should release a 1.5.1 indeed.

@szhorvat
Copy link
Member

R_SEXP_to_igraph() already looks like this in the 1.5.0 version that is on CRAN now. If this is wrong, we should release a 1.5.1 indeed.

0.9 does not use a cache, so there's no issues with setting or copying the cache. 1.5.0 is most likely working fine, just not the prettiest.

@krlmlr
Copy link
Contributor

krlmlr commented Jun 29, 2023

I don't mind fixing and releasing 1.5.1 so that the 0.10 version works correctly from the start.

@Antonov548
Copy link
Contributor Author

0.9 does not use a cache, so there's no issues with setting or copying the cache. 1.5.0 is most likely working fine, just not the prettiest.

I see at lease first tests from test suite is failing due to the cache in graph. I think we need to add cache support to rigraph. Not sure how we should manage cache on R side. But I will start look in this direction.

@szhorvat
Copy link
Member

The cache does not need to be managed explicitly. Just use public API functions to modify igraph_t as much as possible.

Is there any instance when this cannot be done, other than setting the attributes?

Also, is it possible to avoid using = on igraph_t objects? Can we just use the original igraph_t instead of always doing something equivalent to igraph_t g = *get_igraph_pointer()? I.e. can we just igraph_t *p = get_igraph_pointer() then simply operate with p?

@krlmlr
Copy link
Contributor

krlmlr commented Dec 6, 2023

@ntamas @szhorvat @GroteGnoom: I could use more eyes for the changes to rinterface.c (3200 lines changed): https://github.com/igraph/rigraph/pull/840/files#diff-44d6b09dbc4b7f024f1b2df8dcc51a6e3d489408a6f82def8dd49ffb1bcd5dc4 . I have reviewed everything else, looks good to me. Thanks!

@szhorvat
Copy link
Member

szhorvat commented Dec 8, 2023

I see some instances of vector_set / vector_get (or the vector_int equivalents) which were added by this PR. These should not be used. Always prefer VECTOR.

@szhorvat
Copy link
Member

szhorvat commented Dec 8, 2023

There are multiple instances of function calls without checking for errors.

@szhorvat
Copy link
Member

szhorvat commented Dec 8, 2023

There is no clarity about which functions return an error code an which don't. There are still a lot of int return types. It is unclear to me if this is because some of these are called by R directly, and need this, or they were just neglected to be updated. For example, R_igraph_SEXP_to_strvector_copy() has an int return code even though it seems this should be an igraph_error_t, it contains an unchecked igraph_strvector_init() inside, and is itself used in unchecked ways.

Something I mentioned before, and I believe is an absolute must before any 2.0 release, is to have additional clarity about which function is "top level", i.e. is called directly from R, and which function is called by other C function. Errors must be handled differently in these two, with IGRAPH_R_CHECK() and IGRAPH_CHECK(), respectively. I'd be surprised if there are no crash bugs at the moment due to this being too messy.

@szhorvat
Copy link
Member

szhorvat commented Dec 8, 2023

Lots of deprecated functions are still used. These should be brought up to date. Either this is very easy (so do it now) or it might require some non-trivial preparation (which means we need to become aware of it now).

@szhorvat
Copy link
Member

szhorvat commented Dec 8, 2023

That's all from me for now for rinterface_extra.c. My recommendation is to merge this PR right away (before fixing the issues I mention), and after that focus on cleaning up things on main so we can see through the chaos, starting with removing redundant copies of the C core.

@krlmlr
Copy link
Contributor

krlmlr commented Dec 8, 2023

Thanks. The 1.6.0 release is still being processed. Happy to merge ASAP after that.

@szhorvat
Copy link
Member

szhorvat commented Dec 8, 2023

Great! Do you mind if I make some fixes here until that happens, such as replacing deprecated functions?

@krlmlr
Copy link
Contributor

krlmlr commented Dec 8, 2023

Thanks. It makes no difference if we fix here or after merging t the main branch, as long as the builds in this branch remain clean.

@szhorvat
Copy link
Member

szhorvat commented Dec 8, 2023

Good. Looking forward to this being merged into main.

@krlmlr krlmlr changed the title igraph 0.10 feat: Upgrade igraph/C to 0.10 Dec 11, 2023
@krlmlr krlmlr marked this pull request as ready for review December 11, 2023 08:17
@krlmlr krlmlr merged commit 20fff35 into main Dec 11, 2023
12 checks passed
@krlmlr krlmlr deleted the igraph-0.10 branch December 11, 2023 08:19
@krlmlr
Copy link
Contributor

krlmlr commented Dec 11, 2023

igraph 1.6.0 is on CRAN. Entering anarchy.

All unresolved discussion points in this PR should become issues in https://github.com/igraph/rigraph/milestone/14.

@krlmlr
Copy link
Contributor

krlmlr commented Dec 11, 2023

Congrats @Antonov548 @szhorvat @ntamas, thanks for the great work!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use igraph/C 0.10
5 participants