Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Force ppc64le builds to use gcc 13 #139

Merged
merged 31 commits into from
Oct 20, 2023
Merged

Conversation

djhoese
Copy link
Contributor

@djhoese djhoese commented Oct 17, 2023

Replaces #137

Update: This PR was originally just for testing to find out why ppc64le builds were producing inconsistent results (ex. cs2cs). The problem was whittled down to gcc 12 being a problem. Rather than using the older gcc 11 (which works fine), this PR now changes the compiler to be gcc 13.

I've also added a test to the recipe that should fail if things go wrong again in a future version of the compiler.

Checklist

  • Used a personal fork of the feedstock to propose changes
  • Bumped the build number (if the version is unchanged)
  • Reset the build number to 0 (if the version changed)
  • Re-rendered with the latest conda-smithy (Use the phrase @conda-forge-admin, please rerender in a comment in this PR for automated rerendering)
  • Ensured the license file is being packaged.

@conda-forge-webservices
Copy link

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

@djhoese
Copy link
Contributor Author

djhoese commented Oct 17, 2023

I made some guess as to how to get ppc64le to build on travis. Let me know (if the builds themselves don't) how this could/should theoretically be done. As mentioned in the other PR and issues, it seems there may be errors in the emulated ppc64le so we'll try native on travis.

@djhoese
Copy link
Contributor Author

djhoese commented Oct 17, 2023

Hm the output still shows the invalid result:

363301.33	546152.98 0.00

363301.33	546152.98 0.00

589057.58	-3.54 0.00

569722.40	4268814.27 0.00

569722.40	4268814.27 0.00

(the -3.54 in the above is bad).

I don't know enough about travis but I think the configuration I did is telling it to use ppc64le natively...right?

Edit:

hostname: 707dbe23-d8d4-4c61-a985-31c94d3bb919@75129.lxd-ppc64le-travis-ci-production-1-worker2-london

@xylar
Copy link
Contributor

xylar commented Oct 17, 2023

I don't know enough about travis but I think the configuration I did is telling it to use ppc64le natively...right?

That's my understanding, though my expertise is also somewhat limited.

@djhoese
Copy link
Contributor Author

djhoese commented Oct 17, 2023

Note: Windows is failing just because I was lazy setting the debug env vars.

@djhoese
Copy link
Contributor Author

djhoese commented Oct 17, 2023

Well...that's what I get for copying a patch from upstream main. Fixing up now...

@djhoese
Copy link
Contributor Author

djhoese commented Oct 17, 2023

@rouault Sorry to pull you into this PR, but it might just be easier to discuss this this way: I've added your test from OSGeo/PROJ#3923 as a patch to this feedstock. It fails on the travis ppc64le build. Conda-forge's CI scripts and build process is much more complicated than PROJ's travis setup. For reference:

PROJ:
https://app.travis-ci.com/github/OSGeo/PROJ/jobs/611673341

conda-forge:
https://app.travis-ci.com/github/conda-forge/proj.4-feedstock/jobs/611746723

The conda-forge build should be "native" but it does run through docker compared to PROJ's direct execution. @xylar let me know if there is some further option you can think of to run even more directly on the native ppc64le. I assume docker is smart enough to realize that it doesn't need to emulate the architecture at all (ppc running on ppc). The worker/build information shows that they are both running ppc64le.

Otherwise @rouault, conda-forge is still using:

https://github.com/conda-forge/proj.4-feedstock/blob/e622434c8fb2dd7134d582451d163fba7fab3943/recipe/define_OLD_BUGGY_REMQUO.patch

Do you imagine that this could be causing these ppc issues?

@rouault
Copy link

rouault commented Oct 17, 2023

Otherwise @rouault, conda-forge is still using:

https://github.com/conda-forge/proj.4-feedstock/blob/e622434c8fb2dd7134d582451d163fba7fab3943/recipe/define_OLD_BUGGY_REMQUO.patch

Do you imagine that this could be causing these ppc issues?

no, that's totally unrelated.

@rouault
Copy link

rouault commented Oct 17, 2023

Conda-forge's CI scripts and build process is much more complicated than PROJ's travis setup.

another difference is that PROJ's travis setup was using Ubuntu Bionic (18.04), so a much older build toolchain than conda-forge builds. I've updated OSGeo/PROJ#3923 to use Jammy (22.04)

@rouault
Copy link

rouault commented Oct 17, 2023

Tests also pass on Jammy: https://app.travis-ci.com/github/OSGeo/PROJ/builds/266655965 (the final failure is unrelated. The test suite success status is at https://app.travis-ci.com/github/OSGeo/PROJ/builds/266655965#L789)

@djhoese
Copy link
Contributor Author

djhoese commented Oct 18, 2023

@xylar or other conda-forge expert: Is there an existing script/configuration for running linux builds natively (no docker)? I don't see any here:

https://github.com/conda-forge/conda-smithy/blob/main/conda_smithy/templates/run_docker_build.sh.tmpl

So I'm trying to hack my own just to get something working in this feedstock that passes the ppc64le test. Then I'll work from there to figure out what's causing the differences between here and upstream tests (docker build itself, conda-forge dependencies, etc).

@xylar
Copy link
Contributor

xylar commented Oct 18, 2023

Is there an existing script/configuration for running linux builds natively (no docker)?

@djhoese, I don't think so, but I'm ready to be corrected by a conda-forge expert.

@djhoese
Copy link
Contributor Author

djhoese commented Oct 19, 2023

@conda-forge-admin restart ci

@djhoese
Copy link
Contributor Author

djhoese commented Oct 19, 2023

@ocefpaf same question for you if you're around and know the answer ^

@ocefpaf
Copy link
Member

ocefpaf commented Oct 19, 2023

@ocefpaf same question for you if you're around and know the answer ^

I'm quite removed from the compiler choices in conda-forge and maybe @isuruf can comment more. While I believe it is possible we need to be careful with compatibility.

With that said, I think I already said this but I'll say it again, we don't really make ppc64le a priority. I'm OK with just dropping it going forward and re-adding when a new compiler is available everywhere.

@djhoese
Copy link
Contributor Author

djhoese commented Oct 19, 2023

I don't have any personal stake in ppc support so I don't think I should have a say in the outcome. I was just annoyed that something that seemed to be working before was now not working.

I do think this failure is an important one. @rouault what is your opinion on this test (or one like it) being included in PROJ's tests? It doesn't seem to be conda-forge specific anymore, but rather gcc 12 as the problem. I'd also be fine cleaning up my patch in this repository to only have the test here and/or update the shell-level cs2cs test to fail when it gets the wrong result.

Thanks @isuruf!

@rouault
Copy link

rouault commented Oct 19, 2023

what is your opinion on this test (or one like it) being included in PROJ's tests?

why not, but it needs some care as we try to make PROJ's test not to rely too much on the network (some builds might have curl disabled). If that's indeed a compiler bug, it is hard to correlate that with functional tests...

@djhoese
Copy link
Contributor Author

djhoese commented Oct 19, 2023

Will the test you added in OSGeo/PROJ#3923 silently skip if network isn't available or does it cause the test to fail in those environments? I'm thinking what I might do is do a simple shell-level test in conda-forge for $(cs2cs ...) == $(cs2cs ...).

@rouault
Copy link

rouault commented Oct 19, 2023

or does it cause the test to fail in those environments?

it would fail

@djhoese djhoese changed the title Debug ppc64le failures Force ppc64le builds to use gcc 13 Oct 19, 2023
@djhoese djhoese marked this pull request as ready for review October 19, 2023 20:08
Copy link
Contributor

@xylar xylar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't say my proj expertise is deep but this fix seems very reasonable and I think we want to get it in as soon as possible.

@ocefpaf ocefpaf merged commit 1f931ac into conda-forge:main Oct 20, 2023
7 checks passed
@ocefpaf
Copy link
Member

ocefpaf commented Oct 20, 2023

Awesome debugging work @djhoese! And thanks to all that helped and reviewed this one!!

@dopplershift
Copy link
Member

Thanks for the hard work digging in @djhoese !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants