[agentrunner] Split the docker image and clean up the logs #404

Merged 2 commits on Sep 13, 2023

eolivelli
Member

Summary:

Changes to the docker images:

  • split the build of the langstream-runtime image into two steps (langstream-runtime-base and langstream-runtime)
  • langstream-runtime-base contains only the OS, the JDK, Python and some system tools
  • with this split, rebuilding the images locally and uploading them to minikube is faster, because you no longer have to rebuild 1 GB of Docker image every time

Changes to logging:

  • remove all the logging due to classloading
  • add logging every 5 seconds about the number of records in the pipeline and the amount of memory used
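
The periodic logging described in the last bullet could look roughly like the following. This is a minimal sketch, not the code from this PR: the class name `PipelineStatsLogger`, its counter, and the log message format are all hypothetical, and it prints to standard output instead of using the project's logger.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper, not taken from the LangStream code base.
public class PipelineStatsLogger implements AutoCloseable {

    // Assumed counter of records that entered the pipeline and are not yet completed.
    private final AtomicLong pendingRecords = new AtomicLong();

    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "pipeline-stats");
                t.setDaemon(true); // do not keep the JVM alive just for logging
                return t;
            });

    public void recordEntered() {
        pendingRecords.incrementAndGet();
    }

    public void recordCompleted() {
        pendingRecords.decrementAndGet();
    }

    // Builds the log line; kept separate so it can be checked without waiting for the timer.
    public String formatStats() {
        Runtime rt = Runtime.getRuntime();
        long usedMb = (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024);
        long maxMb = rt.maxMemory() / (1024 * 1024);
        return String.format(
                "Records in pipeline: %d, memory used: %d MB (max %d MB)",
                pendingRecords.get(), usedMb, maxMb);
    }

    // Prints the stats line at a fixed rate, e.g. every 5 seconds.
    public void start(long periodSeconds) {
        scheduler.scheduleAtFixedRate(
                () -> System.out.println(formatStats()),
                periodSeconds, periodSeconds, TimeUnit.SECONDS);
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }

    public static void main(String[] args) {
        try (PipelineStatsLogger stats = new PipelineStatsLogger()) {
            stats.start(5);
            stats.recordEntered();
            stats.recordEntered();
            stats.recordCompleted();
            // Print one line immediately instead of waiting for the first tick.
            System.out.println(stats.formatStats());
        }
    }
}
```

A single summary line on a fixed schedule keeps the log volume constant regardless of throughput, which seems to be the point of replacing the classloading noise with a periodic report.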

<groupId>io.netty</groupId>
<artifactId>netty-transport-native-unix-common</artifactId>
<version>${netty.version}</version>
</dependency>
Member

Remove?

Member Author

This is commented-out code.

@@ -58,8 +27,8 @@ ENV NLTK_DATA="/app/nltk_data"

# Install python runtime deps
RUN cd /app && pipenv requirements --categories="packages full" > /app/requirements.txt \
&& python3 -m pip install -r /app/requirements.txt \
&& python3 -m nltk.downloader -d /app/nltk_data punkt averaged_perceptron_tagger
&& python3 -m pip install -r /app/requirements.txt \
Member

Since this step takes a long time and is unlikely to change during development, can we put it in the base image?

Member Author

Good point

Member Author

But we cannot do it, because of this line, which depends on the langstream-runtime build (the "maven" directory):

ADD maven/Pipfile.lock /app/Pipfile.lock

cc @cbornet

Member

I think we should isolate the Python stuff in its own Maven subproject.

@nicoloboschi nicoloboschi merged commit 3bebc25 into main Sep 13, 2023
8 checks passed
@nicoloboschi nicoloboschi deleted the impl/agent-runner-backpressure branch September 13, 2023 14:43
benfrank241 pushed a commit to vectorize-io/langstream that referenced this pull request May 2, 2024