fix: upsert not working because sqlite3 version #3425
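
The title attributes the upsert failure to the SQLite version available in the image. As a minimal sketch (not code from this PR), the check below reports the SQLite library that Python's `sqlite3` module is linked against and runs a small upsert against an in-memory database. The `ON CONFLICT ... DO UPDATE` syntax has been available since SQLite 3.24.0; the PR does not say which version the old image shipped or which exact feature was missing, so the `MIN_UPSERT_VERSION` threshold, the `settings` table, and the function names are assumptions for illustration.

```python
import sqlite3

# Assumption: 3.24.0 is the release that introduced ON CONFLICT ... DO UPDATE.
# Raise this threshold if the failing statement relies on a newer feature.
MIN_UPSERT_VERSION = (3, 24, 0)


def supports_upsert(version_info=sqlite3.sqlite_version_info):
    """Return True if the given SQLite library version is new enough for upserts."""
    return tuple(version_info) >= MIN_UPSERT_VERSION


def upsert_demo():
    """Insert the same key twice; the second insert updates the existing row."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical table, used only to exercise the upsert syntax.
        conn.execute("CREATE TABLE settings (key TEXT PRIMARY KEY, value TEXT)")
        sql = (
            "INSERT INTO settings (key, value) VALUES (?, ?) "
            "ON CONFLICT(key) DO UPDATE SET value = excluded.value"
        )
        conn.execute(sql, ("theme", "light"))
        conn.execute(sql, ("theme", "dark"))
        return conn.execute("SELECT key, value FROM settings").fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    print("SQLite library version:", sqlite3.sqlite_version)
    if supports_upsert():
        print("Rows after upsert:", upsert_demo())
    else:
        print("This SQLite build predates upsert support")
```

Running this inside a container shows which SQLite library the interpreter actually uses, which can differ from any `sqlite3` command-line binary installed in the same image.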

Merged 16 commits on Jul 18, 2023
Changes from 2 commits
2 changes: 1 addition & 1 deletion docker/Dockerfile
@@ -1,4 +1,4 @@
-FROM python:3.9.16-slim
+FROM python:3.10.12-slim

# Environment Variables
ENV ARGILLA_HOME_PATH=/var/lib/argilla
58 changes: 37 additions & 21 deletions docker/quickstart.Dockerfile
@@ -1,35 +1,51 @@
FROM docker.elastic.co/elasticsearch/elasticsearch:8.5.3

ENV DEBIAN_FRONTEND=noninteractive
# TODO(gabrielmbmb): update this `Dockerfile` to multi-staged build to reduce the image size
FROM argilla/argilla-server:${ARGILLA_VERSION:-latest}

USER root

# Create a directory where Elasticsearch and Argilla will store their data
# We will use this directory as a volume to persist data between container restarts (mainly in HF spaces)
RUN mkdir /data
RUN chown -R elasticsearch:elasticsearch /data
RUN apt-get update && apt-get install -y \
apt-transport-https \
gnupg \
wget

# Install Elasticsearch signing key
RUN wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | gpg --dearmor -o /usr/share/keyrings/elasticsearch-keyring.gpg

# Add Elasticsearch repository
RUN echo "deb [signed-by=/usr/share/keyrings/elasticsearch-keyring.gpg] https://artifacts.elastic.co/packages/8.x/apt stable main" | tee /etc/apt/sources.list.d/elastic-8.x.list

# Copy Argilla distribution files
COPY scripts/* /
COPY quickstart.requirements.txt /packages/requirements.txt
COPY dist/*.whl /packages/

RUN apt update && \
apt install -y curl git python3.9 python3.9-dev python3.9-distutils gcc gnupg apache2-utils sudo openssl systemctl && \
curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py && \
python3.9 get-pip.py \
# Install Argilla
&& pip3 install -r /packages/requirements.txt && \
RUN \
# Create an user to run the Argilla server and Elasticsearch
useradd -ms /bin/bash argilla && \
# Create a directory where Elasticsearch and Argilla will store their data
mkdir /data && \
# Install Elasticsearch and configure it
apt-get update && apt-get install -y elasticsearch=8.8.2 && \
mkdir /usr/share/elasticsearch/config && \
echo "cluster.name: \"docker-cluster\"\nnetwork.host: 0.0.0.0\npath.data: \"/data/elasticsearch\"bootstrap.system_call_filter: false" > /usr/share/elasticsearch/config/elasticsearch.yml && \
Member:
Not sure if it is possible, but we could use the defined ARGILLA_HOME as the parent path for the Elasticsearch data: $ARGILLA_HOME/elasticsearch

Member:
Or just bypass the ARGILLA_HOME definition and force the value to /data.

Member Author:
A few lines below, we set ENV ARGILLA_HOME_PATH=/data/argilla.

chown -R argilla:argilla /usr/share/elasticsearch /etc/elasticsearch /var/lib/elasticsearch /var/log/elasticsearch && \
chown argilla:argilla /etc/default/elasticsearch && \
# Install quickstart image dependencies
pip install -r /packages/requirements.txt && \
chmod +x /start_quickstart_argilla.sh && \
for wheel in /packages/*.whl; do pip install "$wheel"[server]; done && \
rm -rf /packages && \
rm -rf /var/lib/apt/lists/* \
# This line add context to this image. This solution should be improved
&& echo -e "{ \"deployment\": \"quickstart\" }" \
> /usr/local/lib/python3.9/dist-packages/argilla/server/static/deployment.json
# Give ownership of the data directory to the argilla user
chown -R argilla:argilla /data && \
# Clean up
apt-get clean && \
rm -rf /var/lib/apt/lists/* && \
rm -rf /packages

# echo -e "{ \"deployment\": \"quickstart\" }" \
# > /usr/local/lmib/python/dist-packages/argilla/server/static/deployment.json

USER elasticsearch
USER argilla

RUN echo "path.data: /data/elasticsearch" >> /usr/share/elasticsearch/config/elasticsearch.yml
ENV ELASTIC_CONTAINER=true

ENV OWNER_USERNAME=owner
ENV OWNER_PASSWORD=12345678
12 changes: 6 additions & 6 deletions docker/scripts/start_quickstart_argilla.sh
Member (@frascuchon, Jul 18, 2023):
At some point in this file we can force ARGILLA_HOME_PATH to /data.

Member Author:
We're forcing it to /data/argilla.
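
The review suggestion to keep the Elasticsearch data under the Argilla home directory can be sketched as follows. This is an illustration only, not code from the PR: `elasticsearch_data_path` is a hypothetical helper, and the `/data/argilla` fallback is assumed from the author's reply about `ARGILLA_HOME_PATH`.

```python
import os
from pathlib import PurePosixPath

# Assumption: fallback taken from the review reply (ENV ARGILLA_HOME_PATH=/data/argilla).
DEFAULT_ARGILLA_HOME = "/data/argilla"


def elasticsearch_data_path(env=None):
    """Derive the Elasticsearch data directory from ARGILLA_HOME_PATH.

    `env` is any mapping of environment variables; it defaults to os.environ.
    """
    env = os.environ if env is None else env
    home = PurePosixPath(env.get("ARGILLA_HOME_PATH", DEFAULT_ARGILLA_HOME))
    return str(home / "elasticsearch")


if __name__ == "__main__":
    # The printed value could be written to elasticsearch.yml as `path.data`.
    print(elasticsearch_data_path())
```

Deriving the path from one variable keeps both data directories under the same volume, which is the point of the reviewer's proposal; the merged diff instead hard-codes `/data/elasticsearch`.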

@@ -3,16 +3,16 @@
set -e

echo "Starting Elasticsearch"
-elasticsearch 1>/dev/null 2>/dev/null &
+/usr/share/elasticsearch/bin/elasticsearch 1>/dev/null 2>/dev/null &

echo "Waiting for elasticsearch to start"
sleep 30

echo "Running database migrations"
-python3.9 -m argilla database migrate
+python -m argilla database migrate

echo "Creating owner user"
-python3.9 -m argilla users create \
+python -m argilla users create \
--first-name "Owner" \
--username "$OWNER_USERNAME" \
--password "$OWNER_PASSWORD" \
@@ -21,7 +21,7 @@ python3.9 -m argilla users create \
--workspace "$ARGILLA_WORKSPACE"

echo "Creating admin user"
-python3.9 -m argilla users create \
+python -m argilla users create \
--first-name "Admin" \
--username "$ADMIN_USERNAME" \
--password "$ADMIN_PASSWORD" \
Expand All @@ -30,15 +30,15 @@ python3.9 -m argilla users create \
--workspace "$ARGILLA_WORKSPACE"

echo "Creating annotator user"
-python3.9 -m argilla users create \
+python -m argilla users create \
--first-name "Annotator" \
--username "$ANNOTATOR_USERNAME" \
--password "$ANNOTATOR_PASSWORD" \
--role annotator \
--workspace "$ARGILLA_WORKSPACE"

# Load data
-python3.9 /load_data.py "$OWNER_API_KEY" "$LOAD_DATASETS" &
+python /load_data.py "$OWNER_API_KEY" "$LOAD_DATASETS" &

# Start Argilla
echo "Starting Argilla"
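
The start script waits for Elasticsearch with a fixed `sleep 30`. As an alternative that is not part of this PR, the wait could poll until the service answers. The sketch below assumes Elasticsearch listens on `http://localhost:9200` over plain HTTP without authentication; the function names, the URL, and the timeouts are all hypothetical and would need adjusting if security is enabled.

```python
import time
import urllib.error
import urllib.request


def wait_until_ready(probe, timeout=30.0, interval=1.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Call `probe` until it returns True or `timeout` seconds have elapsed.

    Returns True as soon as the probe succeeds, False if the deadline passes.
    """
    deadline = clock() + timeout
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


def elasticsearch_is_up(url="http://localhost:9200"):
    """Assumed health probe: True if the URL answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=2) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status: not ready yet.
        return False


if __name__ == "__main__":
    ready = wait_until_ready(elasticsearch_is_up, timeout=60.0)
    print("Elasticsearch ready" if ready else "Elasticsearch did not start in time")
```

Polling returns as soon as the service is reachable and reports a failure explicitly, whereas a fixed sleep either wastes time or proceeds before Elasticsearch is up.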