✨♻️ (⚠️ devops) Remove 5gb limit when uploading data via nodeports #2993
Conversation
Codecov Report
@@            Coverage Diff             @@
##           master    #2993    +/-   ##
=========================================
  Coverage    79.7%    79.7%
=========================================
  Files         693      698      +5
  Lines       29049    29215    +166
  Branches     3744     3755     +11
=========================================
+ Hits        23156    23294    +138
- Misses       5058     5082     +24
- Partials      835      839      +4

Flags with carried forward coverage won't be shown.
storage does not use RClone, right? Why is it now installed in the CI, or did I miss something?
Also, I think you still have async-cache installed in director-v2.
@@ -44,7 +44,7 @@ def _ensure_remove_bucket(client: Minio, bucket_name: str):
 @pytest.fixture(scope="module")
 def minio_config(
     docker_stack: Dict, testing_environ_vars: Dict, monkeypatch_module: MonkeyPatch
-) -> Dict[str, str]:
+) -> Dict[str, Any]:
very minor: Python 3.9 allows using `dict` instead of `Dict`
Very nice to know! @colinRawlings you were wondering about None in typing. This is similar.
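For readers following along, here is what the tip about built-in generics looks like in practice (PEP 585, Python 3.9+). The function names and signatures below are made up for illustration; only the typing change mirrors the diff above:

```python
from typing import Any, Dict, Optional

# pre-3.9 style: generic aliases must be imported from typing
def minio_config_old(overrides: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    return dict(overrides or {})

# Python 3.9+ (PEP 585): built-in containers are subscriptable directly,
# so typing.Dict / typing.List are no longer needed in annotations
def minio_config_new(overrides: Optional[dict[str, Any]] = None) -> dict[str, Any]:
    return dict(overrides or {})
```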
formatted_error = "".join(
    [f"\n{_SEP}{k}{_SEP}\n{v}" for k, v in fields.items()]
)
logger.debug("Error serialized to client:%s", formatted_error)
question: as far as I can see, this creates a text error. Could we not have a JSON-formatted one like everywhere else?
I have it formatted in an easily readable way. This is intended to be used for development/debugging. I'm not sure that having it formatted as JSON will help us.
It is just a matter of consistency, actually. It will also prevent us from having to handle both text and JSON cases when parsing the responses.
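To make the trade-off concrete, here is a minimal sketch of the two serialization options being discussed. The value of `_SEP` and the JSON envelope shape are assumptions, not the service's actual format:

```python
import json

_SEP = "-" * 20  # assumed separator; the real _SEP is defined elsewhere

def format_as_text(fields: dict[str, str]) -> str:
    # current approach (see snippet above): human-readable blocks for debugging
    return "".join(f"\n{_SEP}{k}{_SEP}\n{v}" for k, v in fields.items())

def format_as_json(fields: dict[str, str]) -> str:
    # reviewer's suggestion: a single JSON payload, consistent with other error responses
    return json.dumps({"error": {"fields": fields}}, indent=2)
```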
class RCloneSettings(BaseCustomSettings):
    R_CLONE_S3: S3Settings = Field(auto_default_from_env=True)
    R_CLONE_PROVIDER: S3Provider
    R_CLONE_REGION: str = Field("us-east-1", description="S3 region to use")
While I agree it's currently only used by RClone, it is still an S3 setting, and it will be used in AWS deployments since we currently probably rely on some default. Therefore I would put it there. Can you please check how storage connects to AWS S3, then?
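A hypothetical sketch of the suggestion, assuming pydantic v1 BaseSettings-style classes. Apart from `S3_SECURE` (which appears later in this PR), the field names are illustrative only:

```python
from pydantic import BaseSettings, Field

class S3Settings(BaseSettings):
    # illustrative fields only; the real S3Settings has more entries
    S3_ENDPOINT: str = "minio:9000"
    S3_SECURE: bool = False
    # the reviewer's point: the region is an S3 concern, so it would live here ...
    S3_REGION: str = Field("us-east-1", description="S3 region to use")

class RCloneSettings(BaseSettings):
    R_CLONE_S3: S3Settings = Field(default_factory=S3Settings)
    # ... instead of being duplicated as R_CLONE_REGION on RCloneSettings
```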
packages/simcore-sdk/src/simcore_sdk/node_ports_common/filemanager.py (outdated, resolved)
lgtm! thanks and kudos!
Whether the S3_REGION is used or not is up to the client code, so I do not think this should create any problem.
    S3_SECURE: bool = False

    @cached_property
    def endpoint(self) -> str:
pair review comment: check if still required
Moved this to a validator.
Spent too much time on this. It would require much more time to refactor and make it work with a validator. Will leave it as is.
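For context, a minimal sketch of the pattern under discussion, assuming pydantic v1: a `cached_property` can stay on the settings class if it is excluded from field detection, which is often simpler than recomputing the value in a validator. The field names mirror the snippet above, but the URL-scheme logic is an assumption:

```python
from functools import cached_property
from pydantic import BaseSettings

class _S3SettingsSketch(BaseSettings):
    S3_ENDPOINT: str = "minio:9000"
    S3_SECURE: bool = False

    @cached_property
    def endpoint(self) -> str:
        # derived lazily from the other fields (scheme choice is an assumption)
        scheme = "https" if self.S3_SECURE else "http"
        return f"{scheme}://{self.S3_ENDPOINT}"

    class Config:
        # pydantic v1: leave cached_property descriptors alone when collecting fields
        keep_untouched = (cached_property,)
```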
Only checked devops-changes, not code. Devops changes are fine and acknowledged.
Kudos, SonarCloud Quality Gate passed!
@mrnicegyu11 @Surfict
Environment variable changes (make sure they exist on the deployment): `R_CLONE_S3_PROVIDER` renamed to `R_CLONE_PROVIDER` in `director-v2`.

What do these changes do?

To avoid hitting a wall when uploading more than 5GB via AWS S3, `rclone` is used to upload data to ports:
- when `nodeports_v2` is provided with an `r_clone_settings` instance it will use `rclone` to upload data to s3
- the `s3_link` used by r_clone is generated by `storage`
- `storage` is called (using the same code as the previous method)
- if the `rclone` upload fails, metadata entries will be removed (trying to reduce trash)
- `dynamic-sidecar` now has a mandatory `r_clone_settings` field

BONUS: … (`docker-compose up`)

Related issue/s
How to test

- test with the `jupyter-math:2.0.5` service
- in the `dynamic-sidecar` you will see `r_clone` being used (see the sketch after this list for how such an invocation can look)
- `storage` service calls to the metadata endpoint will be issued to: `GET v0/locations/0/files/{s3_object}/s3/link` and `PATCH v0/locations/0/files/_S3_OBJECT_/metadata`
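For orientation, here is a rough, hypothetical sketch of how an rclone upload to a storage-provided S3 link could be driven from Python. This is not the project's actual implementation (that lives in the simcore-sdk, see filemanager.py above); the remote name, credentials, endpoint, and link format are illustrative assumptions:

```python
import asyncio
import os

async def upload_with_rclone(local_file: str, s3_link: str) -> None:
    """Hypothetical sketch: upload one file with rclone using an S3 remote
    configured entirely through environment variables."""
    env = {
        **os.environ,
        # rclone picks up remotes defined as RCLONE_CONFIG_<NAME>_<OPTION>
        "RCLONE_CONFIG_DST_TYPE": "s3",
        "RCLONE_CONFIG_DST_PROVIDER": "Minio",        # e.g. value of R_CLONE_PROVIDER
        "RCLONE_CONFIG_DST_ACCESS_KEY_ID": "dummy",   # from S3 settings (assumed)
        "RCLONE_CONFIG_DST_SECRET_ACCESS_KEY": "dummy",
        "RCLONE_CONFIG_DST_ENDPOINT": "http://minio:9000",
        "RCLONE_CONFIG_DST_REGION": "us-east-1",      # R_CLONE_REGION default
    }
    # assume s3_link looks like "s3://bucket/object/key", as returned by
    # GET v0/locations/0/files/{s3_object}/s3/link
    destination = s3_link.replace("s3://", "DST:", 1)
    process = await asyncio.create_subprocess_exec(
        "rclone", "copyto", local_file, destination,
        env=env,
    )
    if await process.wait() != 0:
        raise RuntimeError(f"rclone upload of {local_file} failed")

# usage (requires rclone on PATH and reachable S3 credentials):
# asyncio.run(upload_with_rclone("big_file.bin", "s3://simcore/some/key/big_file.bin"))
```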
Checklist

- `make openapi-specs`, `git commit ...` and then `make version-*`