-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add embargo re-design doc #1772
Conversation
@AlmightyYakob - a few folks have asked for unembargoing a subset of files. they haven't gone into discussing whether this becomes a new dandiset (probably initial approach) or whether a dandiset contains both open and embargoed files (future approach, also related to restricted data in dandiset). also, in the current implementation, one can add a public file to an embargoed dandiset and it won't copy it over to the embargoed bucket. and this doesn't need to be implemented right away, but this use case may help think about design and ReBAC that we will likely need to implement in the future. |
As far as I can tell, this design allows for that use case, as public read access is permitted on a per-object basis. |
@satra, I assume these ideas are not blocking development of Jake's proposal. If you would like, we can make mention of these things in Jake's doc as future directions to address, especially if that lets Jake get moving on this proposed work. |
no, these are non-blocking. the only comment i have is on the worker job for unembargoing. is there something that catches failure modes there, and what happens when it fails/partially fails? |
The S3 Batch job returns a report on if there were any failures per object, and what the reason was (after the auto-retrying of trivial errors). So we would be able to see if there was a problem with anything. |
perhaps it should return those in some form to the user in the validation page. |
I don't think so. Failures in this process indicate an error internally, and should be hidden from the user (we could just say "there was an error un-embargoing this data). There's nothing the user can do with the details of the failure to resolve the situation, and would just lead to more clutter, confusion, etc. If it were to fail, we'd get a sentry error about it, or in the worst case scenario, the user could say "hey this failed", and we'd look at the details of it to see what happened. |
isn't this the same thing we are seeing with some of the validation logic, and nothing informs us or the user really when failures exist (e.g. stuck in pending validation). hence it would be good to have some kind of timeout or process so the users don't have to manually check when the unembargoing is done. i.e. something that monitors the process and has some sanity checks in it based on size/number of objects being unembargoed. the human thing hasn't been very robust given the number of issues i have myself filed about a weird state. i agree the user shouldn't have to bother with this, but the system should have something that notifies/handles errors or stuck states for us. |
🚀 PR was released in |
This PR adds the design doc that outlines the new embargo design, with unified bucket storage.