Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create repository official Dockerfile #415

Merged
merged 10 commits into from
May 19, 2022

Conversation

rach-id
Copy link
Member

@rach-id rach-id commented May 17, 2022

This PR simplifies the main Dockerfile for Celestia-app.
It creates an image containing the Celestia-app binary and doesn't take into account the other development related to it. For example, under docker/ directory, we had a Dockerfile that was shaped to suite the ephemeral cluster needs. However, when tagged using GitHub actions, it was acting as the main Celestia-app docker image. Thus, if someone wants to use that image, they will have to follow a complex way of specifying the home directory and existing files. In the meanwhile, the official repository Dockerfile shouldn't be as opinionated as possible.

Also, I renamed docker/Dockerfile to docker/Dockerfile_ephemeral in case we need to keep that image for the ephemeral cluster. However, I suggest we add a new GitHub action that creates that image and tags it with a tag that reflect that it will be used for the ephemeral cluster, example : ghcr.io/celestiaorg/celestia-app-ephemeral:latest (Let me know if we want to add that action so that I add it to this PR).

This comes from the need of having a base Celestia-app docker image that will be built during integration tests.

@rach-id rach-id added enhancement New feature or request github_actions labels May 17, 2022
@rach-id rach-id requested review from Bidon15 and jbowen93 May 17, 2022 11:45
@rach-id rach-id self-assigned this May 17, 2022
@rach-id rach-id changed the base branch from qgb-integration to master May 17, 2022 11:52
@rach-id
Copy link
Member Author

rach-id commented May 17, 2022

The issue with this approach is priv_validator_state.json:

ERR CONSENSUS FAILURE!!! err="rename /opt/data/write-file-atomic-08895712196458512116 /opt/data/priv_validator_state.json: device or resource busy" module=consensus s
tack="goroutine 216 [running]:\nruntime/debug.Stack()\n\t/usr/local/go/src/runtime/debug/stack.go:24 +0x65\ngit.luolix.top/tendermint/tendermint/consensus.(*State).receiveRoutine.func2()\n\t/go/
pkg/mod/github.com/celestiaorg/celestia-core@v1.0.1-tm-v0.34.16/consensus/state.go:726 +0x4c\npanic({0x1978200, 0xc0022989c0})\n\t/usr/local/go/src/runtime/panic.go:1038 +0x215\ngit.luolix.top/t
endermint/tendermint/privval.(*FilePVLastSignState).Save(0x78)\n\t/go/pkg/mod/github.com/celestiaorg/celestia-core@v1.0.1-tm-v0.34.16/privval/file.go:139 +0x90\ngit.luolix.top/tendermint/tenderm
int/privval.(*FilePV).saveSigned(0xc0000fc600, 0xc0027bb700, 0x4106b4, 0x0, {0xc0027c0280, 0x80, 0x203000}, {0xc0022ac700, 0x40, 0x40})\n\t/go/pkg/mod/github.com/celestiaorg/celestia-core@v1
.0.1-tm-v0.34.16/privval/file.go:395 +0x88\ngit.luolix.top/tendermint/tendermint/privval.(*FilePV).signProposal(0xc008daec80, {0xc00906ded7, 0x7}, 0xc0000fc680)\n\t/go/pkg/mod/github.com/celesti
aorg/celestia-core@v1.0.1-tm-v0.34.16/privval/file.go:381 +0x2ba\ngit.luolix.top/tendermint/tendermint/privval.(*FilePV).SignProposal(0xc0000fc600, {0xc00906ded7, 0x0}, 0x0)\n\t/go/pkg/mod/githu
b.com/celestiaorg/celestia-core@v1.0.1-tm-v0.34.16/privval/file.go:265 +0x1e\ngit.luolix.top/tendermint/tendermint/consensus.(*State).defaultDecideProposal(0xc009272700, 0x1, 0x0)\n\t/go/pkg/mod
/github.com/celestiaorg/celestia-core@v1.0.1-tm-v0.34.16/consensus/state.go:1132 +0x2e2\ngit.luolix.top/tendermint/tendermint/consensus.(*State).enterPropose(0xc009272700, 0x1, 0x0)\n\t/go/pkg/m
od/github.com/celestiaorg/celestia-core@v1.0.1-tm-v0.34.16/consensus/state.go:1096 +0x6a3\ngit.luolix.top/tendermint/tendermint/consensus.(*State).enterNewRound(0xc009272700, 0x1, 0x0)\n\t/go/pk
g/mod/github.com/celestiaorg/celestia-core@v1.0.1-tm-v0.34.16/consensus/state.go:1019 +0xa67\ngit.luolix.top/tendermint/tendermint/consensus.(*State).handleTimeout(0xc009272700, {0xc000d923f0, 0
xc000d923f0, 0xd923f0, 0xc0}, {0x1, 0x0, 0x1, {0x181e0134, 0xeda15924b, ...}, ...})\n\t/go/pkg/mod/github.com/celestiaorg/celestia-core@v1.0.1-tm-v0.34.16/consensus/state.go:892 +0x4a5\ngith
ub.com/tendermint/tendermint/consensus.(*State).receiveRoutine(0xc009272700, 0x0)\n\t/go/pkg/mod/github.com/celestiaorg/celestia-core@v1.0.1-tm-v0.34.16/consensus/state.go:791 +0x6bf\ncreate
d by github.com/tendermint/tendermint/consensus.(*State).OnStart\n\t/go/pkg/mod/github.com/celestiaorg/celestia-core@v1.0.1-tm-v0.34.16/consensus/state.go:378 +0x12f\n"

Even when mounted with the correct permissions, it still panics. Thus, forcing users to either:

  • Run a custom script when starting the container. This means we should change entrypoint to cmd
  • Or, create a new image which creates that file somewhere.

Will keep on investigating this.

@rach-id
Copy link
Member Author

rach-id commented May 17, 2022

The current way the above problem is solved is via adding a custom script that creates the priv_validator_state.json file in case we're executing the start command and the file doesn't exist.
This can be a temporary solution until we investigate the above logs more.

@Bidon15
Copy link
Member

Bidon15 commented May 17, 2022

Noice. I think creating an action for epheremal cluster makes sense and will be cool if this can be done within this PR
Afterwards, we can edit the path in ephemeral-cluster repo to reflect latest changes made in this PR

Passing to @jbowen93 if he is ok with this approach

Bidon15
Bidon15 previously approved these changes May 17, 2022
Copy link
Member

@Bidon15 Bidon15 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Final say to @jbowen93

@rach-id
Copy link
Member Author

rach-id commented May 17, 2022

@Bidon15 Awesome. Will add the GitHub action for ephemeral cluster if Josh agrees to these changes and think they make sense.

docker/entrypoint.sh Outdated Show resolved Hide resolved
@adlerjohn
Copy link
Member

Is this ready for review?

@rach-id
Copy link
Member Author

rach-id commented May 18, 2022

@adlerjohn Waiting for Josh input to see whether to add a github action to this PR for the ephemeral cluster docker image or not.

Copy link
Contributor

@jbowen93 jbowen93 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies for the delay in review. Left some comments. I'm fine adding a gh action for ephemeral to a later PR.

Dockerfile Show resolved Hide resolved
docker/entrypoint.sh Show resolved Hide resolved
@rach-id rach-id changed the title Simplify dockerfile Create repository official Dockerfile May 19, 2022
@jbowen93
Copy link
Contributor

Resolved comments and approved

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
No open projects
Archived in project
Development

Successfully merging this pull request may close these issues.

4 participants