Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: Make CApi test cross-platform #2934

Merged
merged 1 commit into from
Nov 10, 2021
Merged

Conversation

toonn
Copy link
Contributor

@toonn toonn commented Nov 9, 2021

  • Closes #xxxx
  • Tests added
  • Added clear title that can be used to generate release notes
  • Fully documented, including updating docs/source/*.rst for new API

The CApi.open_plenty_of_contexts test has been failing for us on Darwin and this is due to the Linux-based assumption of having ~1024 available file descriptors.

My first solution was to rely on the OPEN_MAX value from limits.h but that didn't fix the issue for me, not sure why but its value is 10240 for me and it seems like some sort of hard limit, it's also not entirely accurate since the limit can vary during the runtime of a process.

So I'm relying on sysconf(_SC_OPEN_MAX) instead, could use getrlimit(2) but that's only available on XSI-compliant systems and I'm not aware of an advantage to doing so.

@toonn toonn force-pushed the cross-platform-test branch 2 times, most recently from 11eca03 to 4c392e8 Compare November 9, 2021 18:36
@toonn
Copy link
Contributor Author

toonn commented Nov 9, 2021

CI is failing for Linux but I'm at a loss why. I added a commit using getrlimit rather than sysconf because the man page reads:

_SC_OPEN_MAX
        The maximum number of open files per user id.

And I figured maybe the per-process limit can differ from the limit "per user id." But it doesn't seem to make a difference.

test/unit/test_c_api.cpp Outdated Show resolved Hide resolved
test/unit/test_c_api.cpp Outdated Show resolved Hide resolved
@toonn toonn force-pushed the cross-platform-test branch from 960ab1b to abb186a Compare November 9, 2021 20:53
@toonn
Copy link
Contributor Author

toonn commented Nov 9, 2021

I'm not sure why this is failing for the s390x and arm64 architectures. According to the IBM docs both sysconf and getrlimit should be supported.

@rouault
Copy link
Member

rouault commented Nov 9, 2021

I'm not sure why this is failing for the s390x and arm64 architectures

maybe you could printf() the value and/or issue "ulimit -a" in travis/linux_generic/install.sh ? This might be perhaps related to the container solution used on the CI server (LXD?).

@toonn toonn force-pushed the cross-platform-test branch from f877051 to f39176c Compare November 9, 2021 23:12
@rouault
Copy link
Member

rouault commented Nov 10, 2021

ok, so the limit on those s390x and arm64 conf is 1048576. What we could potentially do is : if rlim_cur > 1024, then set it to 1024 and call setrlimit( RLIMIT_NOFILE, &open_max_limits)

@rouault rouault closed this Nov 10, 2021
@rouault rouault reopened this Nov 10, 2021
@toonn toonn force-pushed the cross-platform-test branch from f39176c to cbfd807 Compare November 10, 2021 10:53
@toonn
Copy link
Contributor Author

toonn commented Nov 10, 2021

Ok, so here's where we get the nullptr:

RLIM 1048576, sysconf 1048576
1021690,

So it would seem the process already has ~20k FDs open or something. I'll try your setrlimit suggestion, though I'm not sure we'll be able to open files if there's already over 1024 FDs open.

An alternative would be to open FDs until we get a nullptr, check the error is relevant to FDs running out maybe, then back off to close 50 FDs (not sure why we leave ~50 available btw, there's a couple open already but I assume it's not exactly 49 so we end up with few but not just one available FD, maybe we should open the maximum then close just one?).

@toonn
Copy link
Contributor Author

toonn commented Nov 10, 2021

Setting the limit to 1024 seems to have worked. One advantage of the alternative solution would be that it doesn't rely on XSI compliance but if that's not considered important I'll clean up what I have now, @rouault?

@rouault
Copy link
Member

rouault commented Nov 10, 2021

'll clean up what I have now

sounds good. Fore example, you can avoid the use of the open_file_max variable and just use open_max_limits.rlim_cur

@toonn toonn force-pushed the cross-platform-test branch from ed93a18 to 0a4315b Compare November 10, 2021 12:28
@toonn
Copy link
Contributor Author

toonn commented Nov 10, 2021

Updated, kept a couple commits that might be informative if someone searches the history of this test. If you prefer I'm ok with squashing it all into one commit though, @rouault?

@rouault
Copy link
Member

rouault commented Nov 10, 2021

If you prefer I'm ok with squashing it all into one commit though, @rouault?

yes please

The test made an assumption of being able to open 1024 - 50 files. On
some platforms, like older Darwin, the default limit is only 256. To
avoid the issue entirely we retrieve the current limit for the process.

We decrease the OPEN_MAX limit if it's too high. On some platforms fopen
returned nullptrs before reaching the limit (-50) and this doesn't
happen if we decrease the limit to 1024.
@toonn toonn force-pushed the cross-platform-test branch from 0a4315b to e29acba Compare November 10, 2021 13:27
@toonn
Copy link
Contributor Author

toonn commented Nov 10, 2021

@rouault, squashed and added an explanatory comment of why we cap the limit.

@knl
Copy link

knl commented Nov 10, 2021

I came to this thread as I was seeing failure with this test on my macOS box under Nix. I'm happy that there is a working fix, thanks!

I'm wondering though, wouldn't it be a better solution to remove this test? IIUC, it tests if one can open certain number of file descriptors. However, the check is done during the test/build time, which might be running on a differently configured machine than the machine it'll run on. In other words, the test seems to exercise a behavior that is not needed for the running system.

@rouault
Copy link
Member

rouault commented Nov 10, 2021

IIUC, it tests if one can open certain number of file descriptors.

not really: it tests that creating many PROJ contexts will not open as many file descriptors as they are PROJ contexts, that is that the connection to proj.db is shared among them. This is perhaps not needed to check on every running system, but this is definitely something we want to check for non regression.

@rouault rouault merged commit ac113a8 into OSGeo:master Nov 10, 2021
github-actions bot pushed a commit that referenced this pull request Nov 10, 2021
test: Make CApi test cross-platform
toonn added a commit to toonn/nixpkgs that referenced this pull request Nov 10, 2021
One of the tests had a Linux-based assumption on the number of files
that can be opened. I have submitted this patch upstream here,
OSGeo/PROJ#2934, and it's been merged but I
don't expect to be able to update to a release before ZHF ends.

ZHF: NixOS#144627

Fixes NixOS#142875
a0x8o added a commit to a0x8o/PROJ that referenced this pull request Nov 10, 2021
test: Make CApi test cross-platform
veprbl pushed a commit to NixOS/nixpkgs that referenced this pull request Nov 10, 2021
One of the tests had a Linux-based assumption on the number of files
that can be opened. I have submitted this patch upstream here,
OSGeo/PROJ#2934, and it's been merged but I
don't expect to be able to update to a release before ZHF ends.

Fixes #142875
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants