65 - Containerized development environment #87

Merged: 8 commits, May 11, 2023


@GPortas GPortas commented May 6, 2023

What this PR does / why we need it:

Adds a containerized development environment for Dataverse frontend development.

Which issue(s) this PR closes:

Special notes for your reviewer:

It would be worth exploring ways to optimize the environment in the future: the run-env.sh script can be quite slow whenever the Dataverse Frontend image has to be rebuilt, for example on the first run or after the package.json dependencies change.

This environment depends on the Dataverse image push mechanism (See: IQSS/dataverse#9447).

Suggestions on how to test this:

Inside the dev-env folder, run the following command:

./run-env.sh <DATAVERSE_BRANCH>

Choose an existing tag/branch name from https://github.com/orgs/gdcc/packages/container/package/dataverse

Follow the new README instructions to access the environment once deployed and to remove it.

Does this PR introduce a user interface change? If mockups are available, please link/include them here:

N/A

Is there a release notes update needed for this change?:

New containerized development environment.

Additional documentation:

No

@GPortas GPortas self-assigned this May 6, 2023
@GPortas GPortas marked this pull request as ready for review May 8, 2023 10:24
@GPortas GPortas removed their assignment May 8, 2023
  dev_dataverse:
    container_name: 'dev_dataverse'
    hostname: dataverse
    image: gdcc/dataverse:${DATAVERSE_BRANCH_NAME}
Contributor

May I suggest here that you modify this slightly?

Suggested change
- image: gdcc/dataverse:${DATAVERSE_BRANCH_NAME}
+ image: ${REGISTRY}/gdcc/dataverse:${DATAVERSE_BRANCH_NAME}

And provide $REGISTRY as docker.io by default in the .env? You will need to switch that to ghcr.io to receive the preview images from pull requests.

Contributor Author

Thanks! Since it is a dev environment and the usual thing will be to pull images of backend feature branches, I think it is more appropriate to set ghcr.io as the default registry value.
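A minimal sketch of how that default could be expressed in shell, assuming the compose file reads ${REGISTRY} as in the suggested change above. The image_ref helper is hypothetical and only illustrates the fallback; it is not part of run-env.sh.

```shell
#!/bin/sh
# Hypothetical helper: prints the image reference docker-compose would pull.
# REGISTRY falls back to ghcr.io (the default proposed in this thread)
# unless the caller has set another value, such as docker.io.
image_ref() {
  printf '%s/gdcc/dataverse:%s\n' "${REGISTRY:-ghcr.io}" "$1"
}
```

With REGISTRY unset, `image_ref 9444-push-images` resolves against ghcr.io; running `REGISTRY=docker.io image_ref unstable` switches the registry for that one call.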

Contributor

This is really cool, I finally have dataverse running locally 🎉

I just have a comment about these same lines.

I tried ./run-env.sh 9444-push-images but since the branch was merged the script couldn't clone the branch.

Then I ran ./run-env.sh develop but since the image doesn't exist there was an error.

So I had to change the docker-compose to get the image:

image: docker.io/gdcc/dataverse:unstable

I know there is a line in the README with a disclaimer about this, but what if I want to run the develop branch because the changes in the backend were already merged? Could we provide an easy way to point to develop?

Member

Hmm, I tried this:

export REGISTRY=docker.io
./run-env.sh unstable

But I get this output:

INFO - Setting up Dataverse on branch unstable...
INFO - Removing current environment if exists...
Removing network dev-env_dataverse
WARNING: Network dev-env_dataverse not found.
Removing network dev-env_default
WARNING: Network dev-env_default not found.
INFO - Cloning Dataverse backend repository...
Cloning into 'dataverse'...
fatal: Remote branch unstable not found in upstream origin
INFO - Running docker containers...
...
INFO - Bootstrapping dataverse...
./run-env.sh: line 38: ./scripts/dev/docker-final-setup.sh: No such file or directory
INFO - Cleaning up repository...
...

So I guess we might need different variables for the branch name to clone (DATAVERSE_BRANCH_NAME) vs. the tag name in the registry (DATAVERSE_IMAGE_TAG?).

For now I'm ok with giving people (us, for now) one of the following workarounds:

I suspect that for quite a while frontend devs will be pointing at images with new APIs they need, so I'm ok with deferring this until later.
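The two-variable idea could be sketched roughly as follows. This is an illustration only: DATAVERSE_IMAGE_TAG is the name floated above, not something run-env.sh reads today, and the resolve_dataverse_refs function is hypothetical.

```shell
#!/bin/sh
# Hypothetical sketch: keep the git branch to clone and the registry tag
# to pull as two separate values.
resolve_dataverse_refs() {
  # First argument: backend branch to clone.
  DATAVERSE_BRANCH_NAME="$1"
  # Optional second argument: image tag to pull. When omitted, the tag
  # mirrors the branch name, which matches the current behaviour.
  DATAVERSE_IMAGE_TAG="${2:-$1}"
  export DATAVERSE_BRANCH_NAME DATAVERSE_IMAGE_TAG
}
```

Under these assumptions, `resolve_dataverse_refs develop unstable` would clone develop while pulling the unstable tag, and a single argument keeps branch and tag identical.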

Contributor Author

The mechanism is closely related to the ghcr.io registry and its PR image tags, designed to be used when you want to test a particular SPA feature you are developing that depends on a backend feature branch of Dataverse.

The docker.io registry has no such relationship between branch names and tags, so the scripts do not work there without the modifications you mention.

We can extend the script logic with new variables, as you suggest, to support these cases as well. Anyway, I'm happy to see this approved as it is for now. Let's see what Melina, Ellen or other developers find more useful to improve the mechanism.

@GPortas GPortas self-assigned this May 8, 2023
@GPortas GPortas removed their assignment May 8, 2023
mreekie commented May 9, 2023

Test - temporarily added a random extra issue to closes list.

@mreekie mreekie linked an issue May 9, 2023 that may be closed by this pull request
5 tasks
@mreekie mreekie removed a link to an issue May 9, 2023
5 tasks
mreekie commented May 9, 2023

testing - removed the extra issue.

@pdurbin pdurbin self-assigned this May 9, 2023
Member

@pdurbin pdurbin left a comment

Works great! Approved!

I did leave some feedback. I think we can further refine this in the future but this is a great step in the right direction!

Comment on lines +18 to +21
echo "INFO - Running docker containers..."
docker-compose -f "./docker-compose-dev.yml" up -d --build

echo "INFO - Waiting for containers to be ready..."
Member

Whoops, I had an existing containerized dev env so I'm getting these errors:

 => [3/4] COPY package.json ./                                                                                    0.0s
 => [4/4] RUN npm install                                                                                       155.4s
 => exporting to image                                                                                           32.3s
 => => exporting layers                                                                                          32.3s
 => => writing image sha256:a74728f7351267800afd342ddfa2ce75cd075bab2dda9cc8a1414a7136cca1aa                      0.0s
 => => naming to docker.io/library/dev-env_dev_frontend                                                           0.0s

Use 'docker scan' to run Snyk tests against images to find vulnerabilities and learn how to fix them
Pulling dev_nginx (nginx:stable)...
stable: Pulling from library/nginx
9e3ea8720c6d: Pull complete
ee7feb8b89d4: Pull complete
3726de4affbb: Pull complete
4b5188c33a72: Pull complete
3c90f9dd7b85: Pull complete
2246bda193a8: Pull complete
Digest: sha256:b1a2c7bcc61be621eae24851a976179bfbc72591e43c1fb340f7497ff72128ff
Status: Downloaded newer image for nginx:stable
Creating dev_postgres         ... error
Creating dev_smtp             ... error
Creating dev_solr_initializer ... error

ERROR: for dev_smtp  Cannot create container for service dev_smtp: Conflict. The container name "/dev_smtp" is already in use by container "625132d69842de36ceb48e97802bf572c09cebad36fc180c680448707339c268". You have to remove (or rename) that container to be able to reuse that name.

ERROR: for dev_postgres  Cannot create container for service dev_postgres: Conflict. The container name "/dev_postgres" is already in use by container "cbf5acc6b793fa032268eb2371f934af8603ec094c0e03a371985cbec507db46". You have to remove (or rename) that container to be able to reuse that name.

ERROR: for dev_solr_initializer  Cannot create container for service dev_solr_initializer: Conflict. The container name "/dev_solr_initializer" is already in use by container "991489f935adac2e3bf6dd129f561e10de097fd493319c6606480463237671dd". You have to remove (or rename) that container to be able to reuse that name.

ERROR: for dev_smtp  Cannot create container for service dev_smtp: Conflict. The container name "/dev_smtp" is already in use by container "625132d69842de36ceb48e97802bf572c09cebad36fc180c680448707339c268". You have to remove (or rename) that container to be able to reuse that name.

ERROR: for dev_postgres  Cannot create container for service dev_postgres: Conflict. The container name "/dev_postgres" is already in use by container "cbf5acc6b793fa032268eb2371f934af8603ec094c0e03a371985cbec507db46". You have to remove (or rename) that container to be able to reuse that name.

ERROR: for dev_solr_initializer  Cannot create container for service dev_solr_initializer: Conflict. The container name "/dev_solr_initializer" is already in use by container "991489f935adac2e3bf6dd129f561e10de097fd493319c6606480463237671dd". You have to remove (or rename) that container to be able to reuse that name.
ERROR: Encountered errors while bringing up the project.
INFO - Waiting for containers to be ready...

Member

I tried this:

$ ./rm-env.sh 
WARNING: The DATAVERSE_BRANCH_NAME variable is not set. Defaulting to a blank string.
Removing network dev-env_dataverse
Removing network dev-env_default

Member

Hmm, that didn't help. I think I need to docker rm these three manually:

ERROR: for dev_smtp  Cannot create container for service dev_smtp: Conflict. The container name "/dev_smtp" is already in use by container "625132d69842de36ceb48e97802bf572c09cebad36fc180c680448707339c268". You have to remove (or rename) that container to be able to reuse that name.

ERROR: for dev_postgres  Cannot create container for service dev_postgres: Conflict. The container name "/dev_postgres" is already in use by container "cbf5acc6b793fa032268eb2371f934af8603ec094c0e03a371985cbec507db46". You have to remove (or rename) that container to be able to reuse that name.

ERROR: for dev_solr_initializer  Cannot create container for service dev_solr_initializer: Conflict. The container name "/dev_solr_initializer" is already in use by container "991489f935adac2e3bf6dd129f561e10de097fd493319c6606480463237671dd". You have to remove (or rename) that container to be able to reuse that name.

Member

Running docker rm on these manually fixed it. I was able to bring up the dev env.
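For anyone hitting the same conflict, the manual cleanup could look roughly like this. The container names are the three from the errors above; the remove_stale_containers function and the DRY_RUN switch are hypothetical, added so the commands can be previewed before anything is deleted.

```shell
#!/bin/sh
# Hypothetical cleanup helper for leftover containers whose names collide
# with the ones docker-compose wants to create.
remove_stale_containers() {
  for name in "$@"; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      # Preview only: print the command instead of running it.
      echo "docker rm -f $name"
    else
      # -f also stops the container first if it is still running.
      docker rm -f "$name"
    fi
  done
}
```

A dry run would be `DRY_RUN=1 remove_stale_containers dev_smtp dev_postgres dev_solr_initializer`; dropping DRY_RUN performs the removal.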

Comment on lines +57 to +58
echo "INFO - Creating sample data..."
python3 create_sample_data.py
Member

Some sample data was created (screenshot below, honestly this is probably enough) but then the script exited early with this error:

Creating dataset cause-of-death.json in dataverse king
Dataset doi:10.5072/FK2/J3IUDU created.
<Response [201]>
data/dataverses/king/datasets/cause-of-death/files
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  391k  100   496  100  391k    370   292k  0:00:01  0:00:01 --:--:--  294k
{'status': 'OK', 'data': {'files': [{'description': '', 'label': 'adjacency_subset.sav', 'restricted': False, 'version': 1, 'datasetVersionId': 5, 'dataFile': {'id': 18, 'persistentId': '', 'filename': 'adjacency_subset.sav', 'contentType': 'application/x-spss-sav', 'filesize': 400569, 'description': '', 'storageIdentifier': 'local://1880138b119-b5ae7b172c21', 'rootDataFileId': -1, 'md5': '719388b2dfe326bd926a81cce8621211', 'checksum': {'type': 'MD5', 'value': '719388b2dfe326bd926a81cce8621211'}, 'creationDate': '2023-05-09'}}]}}
Lock found for dataset id 17... sleeping...
<Response [200]>
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0 4842k  100   183    0     0   7878      0 --:--:-- --:--:-- --:--:-- 12200
Traceback (most recent call last):
  File "/Users/pdurbin/github/iqss/dataverse-frontend/dev-env/dataverse-sample-data/create_sample_data.py", line 76, in <module>
    resp = api.upload_file(dataset_pid, "'" + filepath + "'")
  File "/Users/pdurbin/github/iqss/dataverse-frontend/dev-env/dataverse-sample-data/venv/lib/python3.10/site-packages/pyDataverse/api.py", line 1035, in upload_file
    resp = json.loads(result.stdout)
  File "/usr/local/Cellar/python@3.10/3.10.8/Frameworks/Python.framework/Versions/3.10/lib/python3.10/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/usr/local/Cellar/python@3.10/3.10.8/Frameworks/Python.framework/Versions/3.10/lib/python3.10/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/local/Cellar/python@3.10/3.10.8/Frameworks/Python.framework/Versions/3.10/lib/python3.10/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
INFO - Cleaning up repository...
HMDC-beamish:dev-env pdurbin$ 

Screenshot 2023-05-09 at 11-56-35 Root

A containerized environment, oriented to local development, is available to be run from the repository.

This environment contains a dockerized instance of the Dataverse backend with its dependent services (database, mailserver, etc), as well as an npm development server running the SPA frontend (With code autoupdating).
Member

I just wanted to note that the autoupdating is working great!

I added "FOOBAR" to the Hello Dataverse page and as soon as I saved the file the text appeared, like this:

Screen Shot 2023-05-09 at 12 06 28 PM

Here was the change:

$ git diff
diff --git a/src/sections/hello-dataverse/HelloDataverse.tsx b/src/sections/hello-dataverse/HelloDataverse.tsx
index 52091d7..a4bb4f5 100644
--- a/src/sections/hello-dataverse/HelloDataverse.tsx
+++ b/src/sections/hello-dataverse/HelloDataverse.tsx
@@ -8,6 +8,7 @@ export function HelloDataverse() {
   return (
     <section className={styles.container}>
       <h2 className={styles.title}>{t('title')}</h2>
+      <h1>FOOBAR</h1>
       <img src={logo} className={styles.logo} alt={t('altImage')} />
       <p>
         <Trans t={t} i18nKey="description" components={{ 1: <code /> }} />

@pdurbin pdurbin removed their assignment May 9, 2023
@kcondon kcondon self-assigned this May 11, 2023
@kcondon kcondon merged commit 0caa83b into develop May 11, 2023
@kcondon kcondon deleted the feature/dev-environment branch May 11, 2023 19:36
jayanthkomarraju pushed a commit to jayanthkomarraju/dataverse-frontend that referenced this pull request May 31, 2024
65 - Containerized development environment
Successfully merging this pull request may close these issues.

Setup local development environment for Dataverse backend
6 participants