Skip to content
This repository has been archived by the owner on Mar 3, 2023. It is now read-only.

upgrade to Apache Bookkeeper 4.7.3 to fix DistributedLog based stateful storage #3219

Merged
merged 2 commits into from
Mar 19, 2019

Conversation

simingweng
Copy link
Contributor

@simingweng simingweng commented Mar 16, 2019

This fixes #3210

  1. upgrade to apache/bookkeeper:4.7.3 to allow DlogStorage successfully persist stateful state to bookie
  2. update variety of components dependent on DistributedLog API to org.apache.distributedlog:distributedlog-core-shaded:4.7.3
  3. update those Kubernetes deployments that use Bookkeeper cluster

@simingweng
Copy link
Contributor Author

@nwangtw

@nwangtw nwangtw requested a review from jerrypeng March 19, 2019 16:27
@nwangtw
Copy link
Contributor

nwangtw commented Mar 19, 2019

LGTM.

We don't use dlog internally. @jerrypeng do you have time to double check? Thanks!

@nwangtw
Copy link
Contributor

nwangtw commented Mar 19, 2019

Thanks @jerrypeng ~

@nwangtw nwangtw merged commit c8ff3d6 into apache:master Mar 19, 2019
@the-scott-hand
Copy link
Contributor

has there been a new heron helm package released since these changes have been made? I would like to use the heron helm charts with apache bookkeeper 4.7.3, but I cant find any updated docs/packages. Just wanted to check before I resorted trying to package it myself. Thanks!

@nwangtw
Copy link
Contributor

nwangtw commented Jun 11, 2019

Not yet.

Dmitry is working on stmgr memory management and hopefully it fixes a mem leak. I think we can have another RC after the fix is merged: #3282

@the-scott-hand
Copy link
Contributor

cool thanks, should i expect that to be put here? https://storage.googleapis.com/heron-charts

@nwangtw
Copy link
Contributor

nwangtw commented Jun 11, 2019

Interesting. I have no idea about that chart. :D @kramasamy @jerrypeng @nlu90 @huijunw any idea?

@the-scott-hand
Copy link
Contributor

@joshfischer1108
Copy link
Member

That was Streamlio’s chart repository. I think we will have to do vote for an official release. Otherwise you will have to build from source and create your own distribution. We are in progress of updating the docs now. There is an open PR for the website that has about half of the work done with a checklist of things to do for updating our docs. Feel free to check it out. The PR is here -> #3280

@the-scott-hand
Copy link
Contributor

the-scott-hand commented Jun 11, 2019

If I did want to build the chart myself, can I just replace VERSION with 0.17.8, remove ".template" from the Chart.yaml and values.yaml, put it in a folder called heron and run helm package? or are there more steps? asking because I did that and I'm getting "release heron failed: the server could not find the requested resource", wondering if I'm missing something.

Also as a side note, I have rook-ceph set up and was previously using that for my bookkeeper storage. Is there a way in the heron helm chart to change the storage class for the journal-disk and ledger-disk volumeClaimTemplates from default using the values.yaml? I also am on a "bare-metal" install, so im not sure I get the option to use the volumeClaimTemplates at all

Thanks again

@nwangtw
Copy link
Contributor

nwangtw commented Jun 11, 2019

I am wondering who is the owner of that file and if there is a log of changes in that file?

@simingweng
Copy link
Contributor Author

simingweng commented Jun 11, 2019

I did adapt the relevant Helm templates in this PR. So, theoretically, you should be able to install the chart from locally. But, the tricky part is that there's no post-0.17.8 official release available in public, which means no heron post-0.17.8 docker image in public.

The reason you will need post-0.17.8 Heron docker image is because there're other changes across the code base to migrate to newer Bookkeeper Client dependencies.

So, what we did in our deployment is to build heron RC release, push the docker image built to our own Artifactory repository, also push the generated maven artifacts there. Then, in our k8s cluster (we're on-promise), we configure the cluster to include our own Artifactory docker image repo.

Then we maintain a slightly different version of Helm chart based on the bare metal flavor of the templates available in this code base, because we're on-promise, and we don't need to care about GKE, AWS. So, our in house helm chart is simplified, but fairly close to the one in the source.

@the-scott-hand
Copy link
Contributor

sorry if this isnt the right place for this discussion at this point. I was working through the issue, I've hit a couple issues looking at the code on master that maybe have been fixed in another branch I dont know about.

  1. Baremetal refers to ledger disk, everything else refers to data-disk
    trying to run helm chart with baremetal fails because it doesnt know what data-disk is

So I changed it to minikube and got past that error as a temp fix

  1. my "the server could not find the requested resource" problem
    I've narrowed my problem down to the line below
    {{- $platform := .Values.platform -}}

my theory, is that the"-" in "{{- $platform" and the one on the other side of ".Values.apiServerMemory -}}" removes all white space between the comments and "apiVersion: v1". The problem, I think, is that this removes so much whitespace that "apiVersion: v1" actually gets commented out, like so:
"# under the License.apiVersion: v1"
removing the first "-" from "{{- $platform" caused my error to go away, as well as copying the "apiVersion: v1" line and pasting it directly below itself.

That at least gets me to the point where I can helm install and get new errors

@the-scott-hand
Copy link
Contributor

Added storageClassName: nameOfMyStorageClass directly after each of these lines with the same indentation to use rook-ceph for my storage, ended up needing it for bookie and zookeeper. still using "minikube" as my platform so I get the volumeClaimTemplate stuff.

accessModes: [ "ReadWriteOnce" ]

accessModes: [ "ReadWriteOnce" ]

accessModes: [ "ReadWriteOnce" ]

@the-scott-hand
Copy link
Contributor

the-scott-hand commented Jun 11, 2019

This line was also an issue for me:

command: ['sh', '-c', '/opt/bookkeeper/bin/dlog admin bind -l /ledgers -s {{ .Release.Name }}-zookeeper {{ .Values.zookeeper.clientPort }}:2181 -c distributedlog://{{ .Release.Name }}-zookeeper {{ .Values.zookeeper.clientPort }}/heron']

This seemed to work better for me:
command: ['sh', '-c', '/opt/bookkeeper/bin/dlog admin bind -l /ledgers -s {{ .Release.Name }}-zookeeper:{{ .Values.zookeeper.clientPort }} -c distributedlog://{{ .Release.Name }}-zookeeper:{{ .Values.zookeeper.clientPort }}/heron']

@the-scott-hand
Copy link
Contributor

sorry again if im spamming this, I just want to keep track of these somewhere.

On a somewhat more personal preference note, I don't like using the latest tag so that I can better keep track of what versions of things I'm running. It would be a "nice to have" to let the kubectl image be configurable in the values yaml as well

@nwangtw
Copy link
Contributor

nwangtw commented Jun 11, 2019

No worries. If you like, feel free to file an issue and we can continue the discussion.

agree that "latest" can be tricky.

@the-scott-hand
Copy link
Contributor

netcat (nc) is either not installed or not in the path of the container apache/bookkeeper:4.7.3, so the init container steps below fail:

initContainers:
# Wait until the zookeeper pods are up and running
- name: bookie-init-container
image: {{ .Values.bookkeeper.image }}
command:
- sh
- -c
- >-
while true; do
status=$(echo ruok | nc {{ .Release.Name }}-zookeeper {{ .Values.zookeeper.clientPort }});
writestatus=$(echo isro | nc {{ .Release.Name }}-zookeeper {{ .Values.zookeeper.clientPort }})
if [ "$status" = "imok" ] && [ "$writestatus" == "rw" ]; then
break;
fi;
echo 'Zookeeper {{ .Release.Name }}-zookeeper:{{ .Values.zookeeper.clientPort }} not ready';
sleep 4;
done

tried using the section from the minikube bookkeeper, I get "Error not able to create the directory /bookkeeper/data/journal" and "Error formatting journal directory failed"
https://github.com/apache/incubator-heron/blob/master/deploy/kubernetes/minikube/bookkeeper.yaml#L54-L62

ended up removing that section, that seemed to work

@the-scott-hand
Copy link
Contributor

the-scott-hand commented Jun 11, 2019

This line needs to change from org.apache.heron.uploader.dlog.DLUploader to com.twitter.heron.uploader.dlog.DLUploader ( see #2913 (comment))

https://github.com/apache/incubator-heron/blob/master/deploy/kubernetes/helm/templates/tools.yaml#L136

@the-scott-hand
Copy link
Contributor

so after those changes, everything comes up healthy with a helm install! I now get an error when i try to submit a example topology where it says "Path length must be > 0". any thoughts?

@nwangtw
Copy link
Contributor

nwangtw commented Jun 12, 2019

Which topology? Maybe be good to start with a basic one, like word count.
Also it could be helpful to see if the topology runs ok locally.

Another guess might be the config.

sreev pushed a commit to sreev/incubator-heron that referenced this pull request Apr 9, 2020
…ul storage (apache#3219)

* upgrade to bookkeeper 4.7.3

* update kubernetes deployment manifest files
nicknezis pushed a commit that referenced this pull request Sep 14, 2020
…ul storage (#3219)

* upgrade to bookkeeper 4.7.3

* update kubernetes deployment manifest files
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

DistributedLog based stateful storage backend is not working
5 participants