Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

✨♻️ (⚠️ devops) Remove 5gb limit when uploading data via nodeports #2993

Merged
merged 91 commits into from
May 5, 2022
Merged
Show file tree
Hide file tree
Changes from 60 commits
Commits
Show all changes
91 commits
Select commit Hold shift + click to select a range
24f90e4
added r_clone and r_clone_utils
Apr 20, 2022
7dca9b7
added node_ports_v2_r_clone
Apr 20, 2022
ede7db6
refactor settings
Apr 20, 2022
965ef2a
Merge remote-tracking branch 'upstream/master' into remove-5gb-limit-…
Apr 20, 2022
f641299
refactor settings
Apr 20, 2022
1391d21
refactor tests
Apr 20, 2022
33d8bbf
nodeports supports optional r_clone_settings
Apr 20, 2022
a33cfb0
pylint
Apr 20, 2022
002ad5f
pylint
Apr 20, 2022
66da40d
moved endpoint where it makes sense
Apr 21, 2022
e3730f9
moved rclone endpoint and settings
Apr 21, 2022
94e4c5f
new way of importing
Apr 21, 2022
a7be73b
added tests and fixed existing tests for nodeports
Apr 22, 2022
db6be1f
extended stprage service
Apr 22, 2022
dace16c
Merge remote-tracking branch 'upstream/master' into remove-5gb-limit-…
Apr 22, 2022
1789bc1
fix tests
Apr 22, 2022
efc1d1d
pylint
Apr 22, 2022
77c9b33
dynamic-sidecar now requires nodeports
Apr 22, 2022
76a0528
fix simcore-sdk requiremetns
Apr 22, 2022
8409f38
properly formatting URL
Apr 24, 2022
c61de67
using http in urls
Apr 24, 2022
365b34b
moved requirements
Apr 24, 2022
df29fd5
injecting mandatory settings
Apr 24, 2022
49e48f9
test refactoring
Apr 24, 2022
92d8b62
fix env vars for rlcone
Apr 24, 2022
a3b595b
adding r_clone
Apr 24, 2022
5c564d6
Merge remote-tracking branch 'upstream/master' into remove-5gb-limit-…
Apr 24, 2022
aea65b9
fix unit test simore-sdk
Apr 25, 2022
8113043
fix directr-v2 integration tests
Apr 25, 2022
dad78df
fixed first barch of directorv-2 unittest fixes
Apr 25, 2022
3032ab3
pylint
Apr 25, 2022
1fabc74
second btach of fixes director-v2
Apr 25, 2022
123495d
fixed other tests
Apr 26, 2022
14be36c
moved tests
Apr 26, 2022
fcd0ad0
refactor to use storage_client module
Apr 26, 2022
f9af13d
replacing caching
Apr 26, 2022
2b80e15
removed r_clone exra settings
Apr 26, 2022
b557dac
moved region to settings
Apr 26, 2022
24c1236
rename
Apr 26, 2022
5093401
more refactoring
Apr 26, 2022
e847cbd
revert
Apr 26, 2022
b74cca0
Merge remote-tracking branch 'upstream/master' into remove-5gb-limit-…
Apr 26, 2022
3237d15
file metadata update check permissions
Apr 26, 2022
caf5b17
refactor command exception
Apr 26, 2022
ffb00bb
command is now a list not a string
Apr 26, 2022
039cd4e
moving from list to *args
Apr 26, 2022
69db8bd
removed test dir which no longer exists
Apr 26, 2022
a13b90e
fix issues with encoding invalid characters
Apr 26, 2022
c17f082
removing defaults
Apr 26, 2022
3b1c38e
Merge remote-tracking branch 'upstream/master' into remove-5gb-limit-…
Apr 27, 2022
486d481
refactor to use exising upload_link fetching
Apr 27, 2022
eb6f940
removed unused from specs
Apr 27, 2022
81ace91
pylint
Apr 27, 2022
c75fbc0
removed old comment
Apr 27, 2022
e43cb28
pylint
Apr 27, 2022
5e18032
dynamic-sidecar work also gets transferred via rclone
Apr 27, 2022
f5204d4
add missing file
Apr 27, 2022
8bf58c5
directory watcher disabled during docker-compse up
Apr 27, 2022
14abcdc
removed log message
Apr 27, 2022
a53eff7
Merge remote-tracking branch 'upstream/master' into remove-5gb-limit-…
Apr 27, 2022
5f2ea52
Merge remote-tracking branch 'upstream/master' into remove-5gb-limit-…
Apr 29, 2022
8bcce57
addind middleware to log HTTP errors serialized to client
Apr 29, 2022
6f5cecc
removing logs since they get automatically captured
Apr 29, 2022
1160e57
returning AnyUrl
Apr 29, 2022
8e7fb41
more changes
Apr 29, 2022
9d6acd0
rename and quote shell
Apr 29, 2022
4455007
Merge remote-tracking branch 'upstream/master' into remove-5gb-limit-…
Apr 29, 2022
9af8adb
added quotes and optimized checks
Apr 29, 2022
656e499
fixing tests
Apr 29, 2022
de6c151
renaming
Apr 29, 2022
0af0f33
removed delete file metedata
Apr 29, 2022
01f56eb
removes file and metadata in case of upload error
Apr 29, 2022
b02d8bb
added nosec
Apr 29, 2022
46d4f7a
fix pylint
Apr 29, 2022
9f2707e
new dict format migration
May 2, 2022
181d3eb
update requirements
May 2, 2022
c34badc
refacror installation
May 2, 2022
5d18d78
@sanderegg always upload instead of raising error
May 2, 2022
2091408
Merge remote-tracking branch 'upstream/master' into remove-5gb-limit-…
May 3, 2022
7c6cd29
fix tests after merge
May 3, 2022
c1248a7
Merge remote-tracking branch 'upstream/master' into remove-5gb-limit-…
May 3, 2022
1bd4839
Merge remote-tracking branch 'upstream/master' into remove-5gb-limit-…
May 4, 2022
448bce1
using full parameter names
May 4, 2022
7b226fa
repalced helper with validator
May 4, 2022
b109667
using full parameter names
May 4, 2022
5a922dc
fix issue with minio S3 endpoint format
May 4, 2022
ac3e214
using full parameter names
May 4, 2022
fb14b4c
Merge branch 'remove-5gb-limit-nodeports' of github.com:GitHK/osparc-…
May 4, 2022
f9b9116
Merge remote-tracking branch 'upstream/master' into remove-5gb-limit-…
May 5, 2022
5b9e167
changing endpoint formatting
May 5, 2022
32fda96
moved s3_region
May 5, 2022
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion .env-devel
Original file line number Diff line number Diff line change
Expand Up @@ -60,7 +60,7 @@ S3_BUCKET_NAME=simcore
S3_ENDPOINT=172.17.0.1:9001
S3_SECRET_KEY=12345678
S3_SECURE=0
R_CLONE_S3_PROVIDER=MINIO
R_CLONE_PROVIDER=MINIO

SCICRUNCH_API_BASE_URL=https://scicrunch.org/api/1
SCICRUNCH_API_KEY=REPLACE_ME_with_valid_api_key
Expand Down
34 changes: 29 additions & 5 deletions api/specs/storage/openapi.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -257,11 +257,11 @@ paths:
required: true
schema:
type: string
requestBody:
content:
application/json:
schema:
$ref: "#/components/schemas/FileMetaData"
- name: user_id
in: query
required: true
schema:
type: string
responses:
"200":
description: "Returns file metadata"
Expand All @@ -271,6 +271,30 @@ paths:
$ref: "#/components/schemas/FileMetaEnvelope"
default:
$ref: "#/components/responses/DefaultErrorResponse"
delete:
summary: Removes a file's meta data entry
operationId: delete_file_meta_data
parameters:
- name: fileId
in: path
required: true
schema:
type: string
- name: location_id
in: path
required: true
schema:
type: string
- name: user_id
in: query
required: true
schema:
type: string
responses:
"204":
description: "Removes the file meta data entry from the database"
default:
$ref: "#/components/responses/DefaultErrorResponse"

/locations/{location_id}/files/{fileId}:
get:
Expand Down
3 changes: 1 addition & 2 deletions ci/github/unit-testing/director-v2.bash
Original file line number Diff line number Diff line change
Expand Up @@ -18,14 +18,13 @@ test() {
--color=yes --cov-report=term-missing --cov-report=xml --cov-config=.coveragerc \
-v -m "not travis" services/director-v2/tests/unit --ignore=services/director-v2/tests/unit/with_dbs \
--asyncio-mode=auto \
--ignore=services/director-v2/tests/unit/with_swarm
# these tests cannot be run in parallel
pytest --log-format="%(asctime)s %(levelname)s %(message)s" \
--log-date-format="%Y-%m-%d %H:%M:%S" \
--cov=simcore_service_director_v2 --durations=10 --cov-append \
--color=yes --cov-report=term-missing --cov-report=xml --cov-config=.coveragerc \
--asyncio-mode=auto \
-v -m "not travis" services/director-v2/tests/unit/with_swarm services/director-v2/tests/unit/with_dbs
-v -m "not travis" services/director-v2/tests/unit/with_dbs
}

# Check if the function exists (bash specific)
Expand Down
4 changes: 2 additions & 2 deletions packages/pytest-simcore/src/pytest_simcore/minio_service.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@
# pylint: disable=unused-variable

import logging
from typing import Dict, Iterator
from typing import Any, Dict, Iterator

import pytest
from _pytest.monkeypatch import MonkeyPatch
Expand Down Expand Up @@ -44,7 +44,7 @@ def _ensure_remove_bucket(client: Minio, bucket_name: str):
@pytest.fixture(scope="module")
def minio_config(
docker_stack: Dict, testing_environ_vars: Dict, monkeypatch_module: MonkeyPatch
) -> Dict[str, str]:
) -> Dict[str, Any]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

very minor: python 3.9 allows to use dict instead of Dict

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice to know!
@colinRawlings you were wondering about None in typing. This is similar.

assert "pytest-ops_minio" in docker_stack["services"]

config = {
Expand Down
17 changes: 17 additions & 0 deletions packages/settings-library/src/settings_library/r_clone.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
from enum import Enum

from pydantic import Field
from .base import BaseCustomSettings
from .s3 import S3Settings


class S3Provider(str, Enum):
AWS = "AWS"
CEPH = "CEPH"
MINIO = "MINIO"


class RCloneSettings(BaseCustomSettings):
R_CLONE_S3: S3Settings = Field(auto_default_from_env=True)
R_CLONE_PROVIDER: S3Provider
R_CLONE_REGION: str = Field("us-east-1", description="S3 region to use")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q: why not move this to S3Settings?

Copy link
Contributor Author

@GitHK GitHK Apr 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I wanted to put it inside the S3Settings, it is something only used by rclone.
If in the future this must be moved to S3Settings I have a plant to refactor.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while I agree it's currently only used by RClone, it is still a S3 setting. and it will be used in AWS deployments cause currently we probably use some default. Therefore I would put it there. Can you please check how storage connects to AWS S3 then?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Storage uses Minio as seen below. It does not use the region, which by default None as a parameter. Not specifying the region in s3 clients equals to using us-east-1

def __init__(
self,
endpoint: str,
access_key: str = None,
secret_key: str = None,
secure: bool = False,
):
self.__metadata_prefix = "x-amz-meta-"
self.endpoint = endpoint
self.access_key = access_key
self.secret_key = secret_key
self.secure = secure
self.endpoint_url = ("https://" if secure else "http://") + endpoint
try:
self._minio = Minio(
endpoint, access_key=access_key, secret_key=secret_key, secure=secure
)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

whether the S3_REGION is used or not is up to the client code. so I do not think this should create any problem.

16 changes: 12 additions & 4 deletions packages/settings-library/src/settings_library/s3.py
Original file line number Diff line number Diff line change
@@ -1,12 +1,20 @@
from typing import Optional

from .base import BaseCustomSettings
from functools import cached_property


class S3Settings(BaseCustomSettings):
S3_ENDPOINT: str = "minio:9000"
S3_ACCESS_KEY: str = "12345678"
S3_SECRET_KEY: str = "12345678"
S3_ENDPOINT: str
S3_ACCESS_KEY: str
S3_SECRET_KEY: str
S3_ACCESS_TOKEN: Optional[str] = None
S3_BUCKET_NAME: str = "simcore"
S3_BUCKET_NAME: str
S3_SECURE: bool = False

@cached_property
def endpoint(self) -> str:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pair review comment: check if still required

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved this to a validator

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spent too much time on this. Requires much more time to refactor and make working with validator. Will leave as is.

if not self.S3_ENDPOINT.startswith("http"):
scheme = "https" if self.S3_SECURE else "http"
return f"{scheme}://{self.S3_ENDPOINT}"
return self.S3_ENDPOINT
47 changes: 47 additions & 0 deletions packages/settings-library/src/settings_library/utils_r_clone.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,47 @@
import configparser
from copy import deepcopy
from io import StringIO
from typing import Dict

from .r_clone import RCloneSettings, S3Provider

_COMMON_ENTRIES: Dict[str, str] = {
"type": "s3",
"access_key_id": "{access_key}",
"secret_access_key": "{secret_key}",
"region": "{aws_region}",
"acl": "private",
}

_PROVIDER_ENTRIES: Dict[S3Provider, Dict[str, str]] = {
# NOTE: # AWS_SESSION_TOKEN should be required for STS
S3Provider.AWS: {"provider": "AWS"},
S3Provider.CEPH: {"provider": "Ceph", "endpoint": "{endpoint}"},
S3Provider.MINIO: {"provider": "Minio", "endpoint": "{endpoint}"},
}


def _format_config(entries: Dict[str, str]) -> str:
config = configparser.ConfigParser()
config["dst"] = entries
with StringIO() as string_io:
config.write(string_io)
string_io.seek(0)
return string_io.read()


def get_r_clone_config(r_clone_settings: RCloneSettings) -> str:
provider = r_clone_settings.R_CLONE_PROVIDER
entries = deepcopy(_COMMON_ENTRIES)
entries.update(_PROVIDER_ENTRIES[provider])

r_clone_config_template = _format_config(entries=entries)

# replace entries in template
r_clone_config = r_clone_config_template.format(
endpoint=r_clone_settings.R_CLONE_S3.endpoint,
access_key=r_clone_settings.R_CLONE_S3.S3_ACCESS_KEY,
secret_key=r_clone_settings.R_CLONE_S3.S3_SECRET_KEY,
aws_region=r_clone_settings.R_CLONE_REGION,
)
return r_clone_config
28 changes: 28 additions & 0 deletions packages/settings-library/tests/test_utils_r_clone.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,28 @@
# pylint: disable=redefined-outer-name

import pytest
from settings_library.r_clone import RCloneSettings, S3Provider
from settings_library.utils_r_clone import _COMMON_ENTRIES, get_r_clone_config


@pytest.fixture(params=list(S3Provider))
def r_clone_settings(request, monkeypatch) -> RCloneSettings:
monkeypatch.setenv("R_CLONE_PROVIDER", request.param)
monkeypatch.setenv("S3_ENDPOINT", "endpoint")
monkeypatch.setenv("S3_ACCESS_KEY", "access_key")
monkeypatch.setenv("S3_SECRET_KEY", "secret_key")
monkeypatch.setenv("S3_BUCKET_NAME", "bucket_name")
monkeypatch.setenv("S3_SECURE", False)
return RCloneSettings()


def test_r_clone_config_template_replacement(r_clone_settings: RCloneSettings) -> None:
r_clone_config = get_r_clone_config(r_clone_settings)
print(r_clone_config)

assert "{endpoint}" not in r_clone_config
assert "{access_key}" not in r_clone_config
assert "{secret_key}" not in r_clone_config

for key in _COMMON_ENTRIES.keys():
assert key in r_clone_config
2 changes: 2 additions & 0 deletions packages/simcore-sdk/requirements/_base.in
Original file line number Diff line number Diff line change
Expand Up @@ -4,8 +4,10 @@
--constraint ../../../requirements/constraints.txt
--requirement ../../../packages/postgres-database/requirements/_base.in
--requirement ../../../packages/service-library/requirements/_base.in
--requirement ../../../packages/settings-library/requirements/_base.in
--requirement ../../../packages/models-library/requirements/_base.in

aiocache
aiofiles
aiohttp
aiopg[sa]
Expand Down
19 changes: 15 additions & 4 deletions packages/simcore-sdk/requirements/_base.txt
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,8 @@
#
# pip-compile --output-file=requirements/_base.txt --strip-extras requirements/_base.in
#
aiocache==0.11.1
# via -r requirements/_base.in
aiodebug==2.3.0
# via -r requirements/../../../packages/service-library/requirements/_base.in
aiofiles==0.8.0
Expand All @@ -15,6 +17,7 @@ aiohttp==3.8.1
# -c requirements/../../../packages/models-library/requirements/../../../requirements/constraints.txt
# -c requirements/../../../packages/postgres-database/requirements/../../../requirements/constraints.txt
# -c requirements/../../../packages/service-library/requirements/../../../requirements/constraints.txt
# -c requirements/../../../packages/settings-library/requirements/../../../requirements/constraints.txt
# -c requirements/../../../requirements/constraints.txt
# -r requirements/_base.in
aiopg==1.3.3
Expand All @@ -34,6 +37,8 @@ attrs==20.3.0
# jsonschema
charset-normalizer==2.0.12
# via aiohttp
click==8.1.2
# via typer
dnspython==2.2.1
# via email-validator
email-validator==1.1.3
Expand Down Expand Up @@ -71,13 +76,15 @@ pydantic==1.9.0
# -c requirements/../../../packages/models-library/requirements/../../../requirements/constraints.txt
# -c requirements/../../../packages/postgres-database/requirements/../../../requirements/constraints.txt
# -c requirements/../../../packages/service-library/requirements/../../../requirements/constraints.txt
# -c requirements/../../../packages/settings-library/requirements/../../../requirements/constraints.txt
# -c requirements/../../../requirements/constraints.txt
# -r requirements/../../../packages/models-library/requirements/_base.in
# -r requirements/../../../packages/service-library/requirements/_base.in
# -r requirements/../../../packages/settings-library/requirements/_base.in
# -r requirements/_base.in
pyinstrument==4.1.1
# via -r requirements/../../../packages/service-library/requirements/_base.in
pyparsing==3.0.7
pyparsing==3.0.8
# via packaging
pyrsistent==0.18.1
# via jsonschema
Expand All @@ -87,15 +94,17 @@ pyyaml==5.4.1
# -c requirements/../../../packages/postgres-database/requirements/../../../requirements/constraints.txt
# -c requirements/../../../packages/service-library/requirements/../../../requirements/constraints.txt
# -c requirements/../../../packages/service-library/requirements/./constraints.txt
# -c requirements/../../../packages/settings-library/requirements/../../../requirements/constraints.txt
# -c requirements/../../../requirements/constraints.txt
# -r requirements/../../../packages/service-library/requirements/_base.in
six==1.16.0
# via jsonschema
sqlalchemy==1.4.32
sqlalchemy==1.4.35
# via
# -c requirements/../../../packages/models-library/requirements/../../../requirements/constraints.txt
# -c requirements/../../../packages/postgres-database/requirements/../../../requirements/constraints.txt
# -c requirements/../../../packages/service-library/requirements/../../../requirements/constraints.txt
# -c requirements/../../../packages/settings-library/requirements/../../../requirements/constraints.txt
# -c requirements/../../../requirements/constraints.txt
# -r requirements/../../../packages/postgres-database/requirements/_base.in
# aiopg
Expand All @@ -104,9 +113,11 @@ tenacity==8.0.1
# via
# -r requirements/../../../packages/service-library/requirements/_base.in
# -r requirements/_base.in
tqdm==4.63.1
tqdm==4.64.0
# via -r requirements/_base.in
typing-extensions==4.1.1
typer==0.4.1
# via -r requirements/../../../packages/settings-library/requirements/_base.in
typing-extensions==4.2.0
# via
# aiodebug
# pydantic
Expand Down
1 change: 1 addition & 0 deletions packages/simcore-sdk/requirements/_test.in
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,7 @@ pytest-xdist
pytest-lazy-fixture

# mockups/fixtures
aioboto3
aioresponses
alembic
click
Expand Down
Loading