Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CI: Adding build for ARM64 #30641

Merged
merged 5 commits into from
May 12, 2020
Merged

Conversation

ossdev07
Copy link
Contributor

@ossdev07 ossdev07 commented Jan 3, 2020

Added arm64 test support in travis-ci .
Modified environment creation by using archiconda instead of miniconda as miniconda is not supported in arm64 currently .

  • closes Add arm64 builds to CI #28986
  • tests added / passed
  • passes black pandas
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls make minimal changes this is changing almost everything

ci/run_tests.sh Outdated Show resolved Hide resolved
.travis.yml Outdated Show resolved Hide resolved
ci/run_tests.sh Outdated Show resolved Hide resolved
ci/run_tests.sh Outdated Show resolved Hide resolved
ci/setup_env.sh Outdated Show resolved Hide resolved
ci/setup_env.sh Outdated Show resolved Hide resolved
@simonjayhawkins simonjayhawkins added the CI Continuous Integration label Jan 3, 2020
Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if we should be testing ARM, since NumPy is not testing it.

But looks like some old changes got mixed with this PR. And also, I don't think we need most of the if statements that you have. I'd start this PR from zero, you shouldn't need much more than the new build in .travis.yml, the dependencies file, and the downloading the Anaconda installer for ARM.

ci/setup_env.sh Outdated Show resolved Hide resolved
ci/setup_env.sh Outdated Show resolved Hide resolved
ci/setup_env.sh Outdated Show resolved Hide resolved
@datapythonista datapythonista changed the title added travis-ci support for arm64 CI: Adding build for ARM64 Jan 3, 2020
Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please address all the comments and then push? Thanks!

.travis.yml Outdated Show resolved Hide resolved
ci/run_tests.sh Outdated Show resolved Hide resolved
ci/run_tests.sh Outdated Show resolved Hide resolved
ci/run_tests.sh Outdated Show resolved Hide resolved
ci/setup_env.sh Outdated Show resolved Hide resolved
ci/setup_env.sh Outdated Show resolved Hide resolved
ci/setup_env.sh Outdated Show resolved Hide resolved
ci/setup_env.sh Outdated Show resolved Hide resolved
ci/setup_env.sh Outdated Show resolved Hide resolved
ci/setup_env.sh Outdated Show resolved Hide resolved
@WillAyd
Copy link
Member

WillAyd commented Feb 12, 2020

@ossdev07 still active? If so can you fix merge conflicts and address comments?

@ossdev07
Copy link
Contributor Author

Hi @WillAyd , I am working on it. I will let you know once I am done.

@ossdev07
Copy link
Contributor Author

ossdev07 commented Mar 3, 2020

Hi @WillAyd and @datapythonista,

  • I have removed archiconda and updated it to miniforge. It does not require sudo permissions to run conda commands as it is similar to miniconda. By this we can remove all the unwanted code as suggested previously.

  • For arm64 xvfb need to be installed seperately, so added in apt-addons in travis.

  • Flags "-n auto --dist=loadfile -s --strict" mentioned in pytest command in run_tests.sh are giving timeout issue in arm64 platform. So, I have removed the same for arm64 platform.

Please review and provide feedback if any more updates are required.

Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ossdev07 looks quite good now, added few more comments

ci/run_tests.sh Outdated Show resolved Hide resolved
ci/setup_env.sh Outdated Show resolved Hide resolved
ci/deps/travis-37-aarch64.yaml Outdated Show resolved Hide resolved
.travis.yml Outdated Show resolved Hide resolved
ci/deps/travis-37-arm64.yaml Show resolved Hide resolved
ci/setup_env.sh Show resolved Hide resolved
Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ossdev07 can you address the pending comments please, this is close now.

.travis.yml Outdated Show resolved Hide resolved
ci/setup_env.sh Show resolved Hide resolved
@ossdev07
Copy link
Contributor Author

ossdev07 commented Mar 9, 2020

@datapythonista ,

Sorry, for the extra commits that have been added. Rebase has gone wrong and all commits got added. The actual commit which needs to be committed is 887591e.

@datapythonista
Copy link
Member

Sorry, for the extra commits that have been added. Rebase has gone wrong and all commits got added. The actual commit which needs to be committed is 887591e.

You'll have to fix your branch for us to merge this. I wrote some time ago on what I usually do in these cases in this post: https://datapythonista.me/blog/useful-git-commands.html

Hope that helps.

@ossdev07
Copy link
Contributor Author

Sorry, for the extra commits that have been added. Rebase has gone wrong and all commits got added. The actual commit which needs to be committed is 887591e.

You'll have to fix your branch for us to merge this. I wrote some time ago on what I usually do in these cases in this post: https://datapythonista.me/blog/useful-git-commands.html

Hope that helps.

Thanks @datapythonista. I have resolved the conflicts in the commits and added the commit with changes suggested. Please review and provide feedback if anything needs to be updated.

ci/run_tests.sh Outdated Show resolved Hide resolved
ci/run_tests.sh Outdated Show resolved Hide resolved
.travis.yml Show resolved Hide resolved
ci/setup_env.sh Outdated Show resolved Hide resolved
@rhenwood-arm
Copy link

(I work with @ossdev07 ) : @ossdev07 as an independent task, can you make a note of the missing xvfb for Travis-CI in their community forums: https://travis-ci.community/

@ossdev07
Copy link
Contributor Author

(I work with @ossdev07 ) : @ossdev07 as an independent task, can you make a note of the missing xvfb for Travis-CI in their community forums: https://travis-ci.community/

@rhenwood-arm ,

I will take a note of it and will raise it in the travis community.

@datapythonista
Copy link
Member

Travis is red, can you have a look and see what's wrong please

@ossdev07
Copy link
Contributor Author

Travis is red, can you have a look and see what's wrong please

@datapythonista ,

After adding all flags in pytest command, it is failing with the timeout issue at different steps for each build for arm64 platform. That is the reason I have removed those three flags.

Could you please suggest if there is any resolution to run without these flags for a specific platform?

@datapythonista
Copy link
Member

IIRC the flags you changed were for pytest, and in this case pytest didn't even started, so that's surely not going to solve the problem.

The build was stalled during ci/submit_cython_cache.sh. Can you merge master, so we can see if the problem happens at the same stage. I'm wondering if Travis is reliable for ARM64...

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you need to merge master, lots of things are failing.

.travis.yml Outdated
@@ -33,6 +35,10 @@ matrix:
- env:
- JOB="3.7" ENV_FILE="ci/deps/travis-37.yaml" PATTERN="(not slow and not network and not clipboard)"

- arch: arm64
env:
- JOB="3.7" PYTEST_WORKERS=8 ENV_FILE="ci/deps/travis-37-arm64.yaml" PATTERN="(not slow and not network and not clipboard)"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are likely to have much better performance with 2 workers here. I would guess that's what we have physically, and what we use on other processors.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have updated to 2 workers as suggested. Please review.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it still using 8 on the arm build - make this 2

and taking 56 minutes which is too long

put the build in the allowed failures section

Copy link
Contributor Author

@ossdev07 ossdev07 May 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have updated to 2 workers in the previous commit, but the build was getting stalled and failing while running tests. So, I have moved back to 8 workers. Pandas compilation and test execution are taking more time to complete in ARM64 when compared to the AMD64 platform. I have added arm64 build in allowed_failures, but it is not getting triggered in Travis after the commit.

@ossdev07 ossdev07 force-pushed the pandas_ci_arm64 branch from 269b57b to e251fec Compare May 8, 2020 13:30
@jreback
Copy link
Contributor

jreback commented May 8, 2020

@cphang99
Copy link

cphang99 commented May 9, 2020

Do the workarounds described in #17792 need to be incorporated in order for the test suite to pass?

@cphang99
Copy link

cphang99 commented May 9, 2020

It looks like it does. many of the failing tests appear to be DateTime tests as described in #17792

More details in https://gitlab.com/libreml/libreml/-/issues/110#note_339463783

@ossdev07
Copy link
Contributor Author

these changes are breaking other builds: https://dev.azure.com/pandas-dev/pandas/_build/results?buildId=35039&view=logs&j=bef1c175-2c1b-51ae-044a-2437c76fc339&t=770e7bb1-09f5-5ebf-b63b-578d2906aac9&l=17

I have added global variable PYTEST_WORKERS in azure pipelines yml file and issues have been resolved with my azure account. Please review the changes and let me know if any more changes required.

@ossdev07
Copy link
Contributor Author

@datapythonista / @jreback / @WillAyd ,

All checks are passing as well now. Please review and let me know if anything needs to be updated.

@jreback
Copy link
Contributor

jreback commented May 11, 2020

this build takes 56 minutes which is way longer than the longest build

- dist: bionic
python: 3.9-dev
env:
- arch: arm64
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you have have to have it in both sections (allow_failures) and the top section; they must be identical.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion. I have added it as suggested and all checks are passing as well. Please review and let me know if any more updates are required.

ossdev07 and others added 4 commits May 12, 2020 04:30
Signed-off-by: ossdev07 <ossdev@puresoftware.com>
Signed-off-by: ossdev07 <ossdev@puresoftware.com>
@ossdev07 ossdev07 requested a review from jreback May 12, 2020 05:53
@jreback jreback mentioned this pull request May 12, 2020
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. @TomAugspurger ok here

@jreback jreback added this to the 1.1 milestone May 12, 2020
@TomAugspurger
Copy link
Contributor

Thanks. So to confirm: we're putting it under allowed failures just because it takes longer? We'd consider any failures on that job to be a blocker to merging / cause for reverting if a failure is noticed?

@ossdev07 will you be available to respond to build / CI issues as they come up?

@ossdev07
Copy link
Contributor Author

Thanks. So to confirm: we're putting it under allowed failures just because it takes longer? We'd consider any failures on that job to be a blocker to merging / cause for reverting if a failure is noticed?

@ossdev07 will you be available to respond to build / CI issues as they come up?

Yes, I will be available to respond if any issues come up.

@jreback
Copy link
Contributor

jreback commented May 12, 2020

yeah i think under allowed failures for now until we can get the time to be more reasonable see #34115

@ossdev07 feel free to comment on the issue about building wheels for this (note we would have to actualy run this on the azure side and not here, is this possible?)

can you create an issue about moving this out of allowed failures (we can do this as soon as we see this working reasonbaly for a little)

@ossdev07
Copy link
Contributor Author

yeah i think under allowed failures for now until we can get the time to be more reasonable see #34115

@ossdev07 feel free to comment on the issue about building wheels for this (note we would have to actualy run this on the azure side and not here, is this possible?)

can you create an issue about moving this out of allowed failures (we can do this as soon as we see this working reasonbaly for a little)

Thanks for the approval. I will raise a ticket about moving this out of allowed failures.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI Continuous Integration
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants