e2e tests are currently not working #3453

Closed
rootulp opened this issue May 9, 2024 · 5 comments · Fixed by #3487
Labels
WS: Maintenance 🔧 includes bugs, refactors, flakes, and tech debt etc

Comments

@rootulp
Collaborator

rootulp commented May 9, 2024

Context

--> Running end to end tests
go run ./test/e2e 
test-e2e2024/05/08 06:01:04 No particular test specified. Running all tests.
test-e2e2024/05/08 06:01:04 go run ./test/e2e <test_name> to run a specific test
test-e2e2024/05/08 06:01:04 Valid tests are: MinorVersionCompatibility, MajorUpgradeToV2, E2ESimple

test-e2e2024/05/08 06:01:04 === RUN MinorVersionCompatibility
test-e2e2024/05/08 06:01:04 no versions to test
exit status 1
make: *** [Makefile:140: test-e2e] Error 1

https://github.com/celestiaorg/celestia-app/actions/runs/8997023702/job/24714448567

test-e2e2024/05/15 16:35:44 Starting testnet
2024/05/15 16:37:50 Failed to start testnet: node val0 failed to start: forwarding port 26657: error forwarding port after 5 retries: timed out waiting for port forwarding to be ready
exit status 1

https://github.com/celestiaorg/celestia-app/actions/runs/9099118540/job/25011842760?pr=3487

2024/05/17 10:38:49 Failed to start testnet: node val0 failed to start: error waiting for instance 'val0-ee2a49ab' to be running: timeout while waiting for instance 'val0-ee2a49ab' to be running
exit status 1

https://github.com/celestiaorg/celestia-app/actions/runs/9126237806/job/25095937942?pr=3487

Problem

e2e tests are failing locally and also in the nightly workflow.

Proposal

Fix the e2e tests so that they pass.

[Optional follow-up] Instead of running it nightly, run it on every PR and make it an optional check. Currently, no one notices when the nightly fails because we have no hooks set up to alert us of failures.

@rootulp rootulp added needs:priority WS: Maintenance 🔧 includes bugs, refactors, flakes, and tech debt etc labels May 9, 2024
@evan-forbes
Member

We think this is currently caused by a broken step that fetches git tags. When we run this test locally, it works.
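
For illustration, here is a minimal Go sketch of how missing git tags could lead to the `no versions to test` message in the log above. It assumes the test builds its version list from the repository's tags, which this thread suggests but does not show. `filterVersions`, `errNoVersions`, and the `v1.` prefix are hypothetical names chosen for the example, not code from celestia-app.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errNoVersions mirrors the "no versions to test" message from the CI log.
var errNoVersions = errors.New("no versions to test")

// filterVersions is a hypothetical helper: it keeps the tags that start with
// the given prefix and returns errNoVersions when nothing matches.
func filterVersions(tags []string, prefix string) ([]string, error) {
	var versions []string
	for _, tag := range tags {
		tag = strings.TrimSpace(tag)
		if strings.HasPrefix(tag, prefix) {
			versions = append(versions, tag)
		}
	}
	if len(versions) == 0 {
		return nil, errNoVersions
	}
	return versions, nil
}

func main() {
	// A checkout where tags were not fetched yields an empty tag list.
	if _, err := filterVersions(nil, "v1."); err != nil {
		fmt.Println("without tags:", err)
	}

	// With tags available, the same call returns the matching versions.
	versions, err := filterVersions([]string{"v1.0.0", "v1.2.0", "v2.0.0"}, "v1.")
	fmt.Println("with tags:", versions, err)
}
```

Under this assumption the test logic itself is fine, and the failure depends only on whether the CI checkout has tags, which would match the test passing locally.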

@ninabarbakadze ninabarbakadze self-assigned this May 15, 2024
This was referenced May 15, 2024
@ninabarbakadze
Member

Fixed the issue where it wasn't able to fetch tags, but now I get a port forwarding error in MinorVersionCompatibility.

The test works locally, but the error in CI could be happening because the GitHub pipeline workers are very limited. Looking into this with DevOps @smuu.

@evan-forbes
Member

evan-forbes commented May 16, 2024

Instead of running it nightly, run it on every PR and make it an optional check. Currently, no one notices when the nightly fails because we have no hooks set up to alert us of failures.

imo, this does not have to be tied to fixing this test and could instead be a separate issue. That separate issue could optionally be tied to enabling alerts in Slack when the tests fail after merging to main.

@rootulp
Collaborator Author

rootulp commented May 16, 2024

Agreed, clarified in the OP

@ninabarbakadze
Member

Not being able to fetch tags was the least of our worries. At first we weren't able to start the testnet because of port forwarding, and now different knuu issues are coming up. You can view the runs in CI on my PR. Working on resolving this with @smuu.

@ninabarbakadze ninabarbakadze changed the title MinorVersionCompatibility fails in nightly workflow e2e tests are currently not working May 21, 2024
ninabarbakadze added a commit that referenced this issue May 28, 2024
## Overview

Fixes #3453 

I encountered a lot of flakiness while running knuu tests in CI (view the
logs in the issue description). To resolve it, I added a few workarounds
with @smuu, which broke other things in our existing suite. Those changes
are now reverted since the tests started passing on main again after
scaling some of the clusters that were stuck. DevOps is working on
addressing flakiness in knuu.

E2E tests are green:
https://github.com/celestiaorg/celestia-app/actions/runs/9179315870/job/25241185362?pr=3487

My changes:
- Fix git tag fetching issue
- Extract e2e tests into their own YAML file
- Separate MinorVersionCompatibility and MajorUpgradeToV2
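
Running MinorVersionCompatibility and MajorUpgradeToV2 separately relies on the runner accepting a test name, as the log in the issue shows (`go run ./test/e2e <test_name>`). The following Go sketch shows one way such name-based selection could work. It is an assumption-laden illustration rather than the code in `./test/e2e`: `testFunc` and `selectTests` are hypothetical, and the test bodies are placeholders.

```go
package main

import (
	"fmt"
	"os"
	"sort"
)

// testFunc is a hypothetical signature for a single e2e test.
type testFunc func() error

// selectTests returns every registered test name (sorted) when no argument is
// given, or only the requested name when it is registered.
func selectTests(all map[string]testFunc, args []string) ([]string, error) {
	if len(args) == 0 {
		names := make([]string, 0, len(all))
		for name := range all {
			names = append(names, name)
		}
		sort.Strings(names)
		return names, nil
	}
	name := args[0]
	if _, ok := all[name]; !ok {
		return nil, fmt.Errorf("unknown test %q", name)
	}
	return []string{name}, nil
}

func main() {
	// Placeholder bodies; the real tests start a testnet.
	tests := map[string]testFunc{
		"MinorVersionCompatibility": func() error { return nil },
		"MajorUpgradeToV2":          func() error { return nil },
		"E2ESimple":                 func() error { return nil },
	}

	names, err := selectTests(tests, os.Args[1:])
	if err != nil {
		fmt.Println(err)
		os.Exit(1)
	}
	for _, name := range names {
		fmt.Println("=== RUN", name)
		if err := tests[name](); err != nil {
			fmt.Println(err)
			os.Exit(1)
		}
	}
}
```

With a selector like this, each CI job can pass one test name, so a failure in one test does not prevent the other from running.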