Add setting to ignore backup files or manage them in a separate bucket #11

Open
brendanheywood opened this issue Dec 12, 2016 · 1 comment · Fixed by #619

Comments

@brendanheywood
Contributor

brendanheywood commented Dec 12, 2016

Backups are massive, churn constantly, and get deleted. We want the flexibility to map these to a different object store, which might have different permission, performance, and cost profiles.

@brendanheywood brendanheywood changed the title Add setting to ignore backup files Add setting to ignore backup files or manage them in a separate bucket Jan 24, 2021
@matthewhilton
Contributor

Have had a big think about this, comparing three different options:

| Method | Pros | Cons |
| --- | --- | --- |
| Separate bucket | ✅ Very safe<br>✅ Very clear what objects are what | ❌ Quite a bit of extra config and API changes required, because the API assumes all objects are stored in one location<br>❌ Increases chance of hitting bucket limit |
| Prefix subdirectory in single bucket | ✅ Can be used in access control + lifecycle policies<br>✅ Easier to see in the dashboard (rather than clicking on every object individually to see metadata) | ❌ Quite a bit of API changes required, since the API assumes objects all have a single deterministic path, e.g. bucket + content hash. We would need to change this to a list of 'likely' locations, e.g. /, /backup, etc., because during the migration process objects could realistically be in either.<br>❌ Added complexity: conflates object directory with metadata and diverges from the Moodle sitedata directory structure<br>❌ Can only add 1 dimension of metadata, unless we go really wild with multi-layer directories<br>❌ Moodle likely needs delete perms (at least during the migration process) in the entire bucket, since renaming (i.e. moving) is a copy-and-delete operation, which somewhat defeats the purpose<br>❌ Once you turn it on you can't really back out, since things will be shuffled around in the bucket |
| Object tags | ✅ Can support multiple tags, e.g. environment, type, last access<br>✅ Keeps the files themselves and metadata separate: no change to where objects are stored, and remains equivalent with sitedata<br>✅ Can be used in access control + lifecycle policies<br>✅ Moodle does not need delete object permission, only object tagging permission<br>✅ Easy to add / remove metadata: just clear the tags and re-run the process with new definitions<br>✅ Easier to run as a completely separate process, e.g. a process that classifies and tags objects after the fact, rather than requiring them to be set at upload time (otherwise we incur cost and complexity moving them around)<br>✅ Easy backout: just turn off the classifier; objects and their locations remain exactly the same | ❌ Extra cost, but this is quite small (e.g. AWS is 1c per 10k objects per month, so 10 million objects = $10 / month) |
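
To make the tagging cost figure above easy to re-check with other numbers, here is a minimal sketch of the arithmetic. The rate (1 cent per 10k per month) is the one quoted in the comparison; the function name and the `tags_per_object` parameter are hypothetical, added on the assumption that the charge scales with the number of tags applied to each object.

```python
def monthly_tag_cost_usd(object_count, tags_per_object=1, cents_per_10k=1):
    """Rough monthly cost of object tags in USD.

    Assumption: the rate is the 1c per 10k per month figure quoted above,
    and the charge scales linearly with tags per object. Check current
    provider pricing before relying on this.
    """
    total_tags = object_count * tags_per_object
    return total_tags * cents_per_10k / 10_000 / 100


if __name__ == "__main__":
    # 10 million objects with one tag each, as in the comparison above.
    print(monthly_tag_cost_usd(10_000_000))
    # Same objects with three tags each (e.g. type, env, last access).
    print(monthly_tag_cost_usd(10_000_000, tags_per_object=3))
```

If several tags are stored per object, the cost grows proportionally, which is still small next to the storage savings discussed here.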

I'm of the opinion currently that object tags are the best way to go:

  • Simpler - keeps the data location the same (big win) and we can just tack on a more or less separate file classifier module (think type=backup,env=prod,infilelessbackup=true etc...) to handle the tags
  • Cost - slightly more (assuming equivalent API call costs), but realistically tiny compared to the savings
  • Features - More than 1 dimension of data - IMO this is the biggest win. If we are going to spend all this time trying to move these files, we might as well be able to expand on it in the future.
  • Risk - Easier backout is a big plus
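
As a rough illustration of the "more or less separate file classifier module" idea, here is a minimal sketch in Python (the plugin itself is PHP, so this only shows the shape of the logic). Everything in it is an assumption made for illustration: the function names, the `type` / `env` tag keys, and the rule that a file counts as a backup when its component is `backup` or it sits in a user's `backup` file area. The output dictionary follows the `TagSet` shape that S3's PutObjectTagging request expects.

```python
from typing import Dict, List


def classify_file(component: str, filearea: str, env: str = "prod") -> Dict[str, str]:
    """Return the tags a file should carry (hypothetical classification rules).

    Assumption: backups are identified by component == "backup", or by a
    user's private "backup" file area. A real classifier would define its
    own rules and could add more dimensions.
    """
    is_backup = component == "backup" or (component == "user" and filearea == "backup")
    return {"type": "backup" if is_backup else "other", "env": env}


def to_tagging(tags: Dict[str, str]) -> Dict[str, List[Dict[str, str]]]:
    """Convert a flat tag dict into the TagSet structure used by S3 tagging."""
    return {"TagSet": [{"Key": k, "Value": v} for k, v in sorted(tags.items())]}


if __name__ == "__main__":
    tagging = to_tagging(classify_file("backup", "automated"))
    print(tagging)
    # With boto3 this could then be applied to an existing object, e.g.:
    #   s3.put_object_tagging(Bucket=bucket, Key=key, Tagging=tagging)
    # (bucket, key and the s3 client are placeholders, not shown here).
```

Because classification only reads file metadata and writes tags, it can run after upload as its own task, and clearing the tags reverts everything without touching object locations.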
