Update File Storage to be able to Scale to Multiple Servers #1716

Closed
ElijahLynn opened this issue May 7, 2020 · 5 comments
Labels
Content API / export [CMS feature] DevOps CMS team practice area Epic Issue type Infrastructure Objective 1 2020: Develop CMS infrastructure for evolving API, e2e testing and demo needs.

Comments

@ElijahLynn
Contributor

ElijahLynn commented May 7, 2020

Follow up from #1360.

Problem
Currently we perform a pre-deployment task to back up the state of /sites/default/files on the instance about to be replaced, then restore that state to the new instance once it is up. This works fine for the single-instance deployment we currently use. However, it causes a problem for the launch of a high-availability (HA) improvement:

  • If more than one instance is launched behind the ELB, files that users upload go to only one instance and not the other(s). The state is not tracked across instances and gets out of sync, which is not acceptable.

Possible Solution
The solution that @indytechcook and I discussed is to implement the Drupal module S3 File System (dgo.to/s3fs), which would keep the "state" in an S3 bucket so that all servers see the same files.
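
For reference, a minimal sketch of how the module can be told to take over the public and private file schemes, so that uploads stop landing on one instance's disk. It assumes the 8.x-3.x branch of s3fs, where these are `$settings` keys; the bucket name is a placeholder and not a value from this project.

```php
<?php
// settings.php fragment (sketch only; assumes s3fs 8.x-3.x).

// Placeholder bucket; the region is the GovCloud region used elsewhere in this thread.
$config['s3fs.settings']['bucket'] = 'example-cms-files';
$config['s3fs.settings']['region'] = 'us-gov-west-1';

// Route public:// and private:// through S3 instead of the local disk.
$settings['s3fs.use_s3_for_public'] = TRUE;
$settings['s3fs.use_s3_for_private'] = TRUE;
```

Whether taking over both schemes is workable here would depend on the edge cases around tar performance and image derivative generation.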

Notes
We used to use EFS for this, which worked fine until we needed a faster file system for our unique implementation of Tome Sync (dgo.to/tome). EFS was too slow to let us TAR up the sites/default/files folder, which is where we store the JSON exports that Tome creates. There is more info in other tickets, but in short, EFS is not an acceptable solution because we now need a fast SSD for these operations.

@indytechcook
Contributor

indytechcook commented May 13, 2020

FYI, the Drush commands of s3fs require Drush 9+.

Linking #1230

@indytechcook
Contributor

I set up a test bucket and have files being transferred up to S3 from Lando. Here is the process:

bucket: dsva-vagov-ci-cms-test-files
Added this config to settings.local.php:

$config['s3fs.settings']['use_instance_profile'] = TRUE;
$config['s3fs.settings']['credentials_file'] = '/app/credentials';
$config['s3fs.settings']['bucket'] = 'dsva-vagov-ci-cms-test-files';
$config['s3fs.settings']['region'] = 'us-gov-west-1';
$config['s3fs.settings']['root_folder'] = 'neil';
$config['s3fs.settings']['public_folder'] = 'public';
$config['s3fs.settings']['private_folder'] = 'private';

To use MFA:

  • First, update the MFA token.
  • Copy the credentials file, normally stored at ~/.aws/credentials, to the project root directory.
  • Lando can now connect to S3/AWS.
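
One way to keep this from breaking for developers who have not copied the credentials file is to guard the override. This is only a sketch: it assumes Lando mounts the project root at /app (consistent with the '/app/credentials' path in the config above), and the `$credentials_file` variable is a hypothetical name introduced here.

```php
<?php
// settings.local.php fragment (sketch; not the project's actual config).

// Hypothetical helper variable; /app is assumed to be the Lando project root.
$credentials_file = '/app/credentials';

if (is_readable($credentials_file)) {
  // Only point s3fs at the copied credentials file when it actually exists.
  $config['s3fs.settings']['credentials_file'] = $credentials_file;
}
```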

@indytechcook indytechcook added the Epic Issue type label May 15, 2020
@indytechcook indytechcook changed the title Use S3FS for file assets storage Update File Storage to be able to Scale to Multiple Servers May 15, 2020
@indytechcook indytechcook removed this from the CMS Sprint 6 milestone May 15, 2020
@indytechcook
Contributor

Looking into s3fs, there are several edge cases that would need to be solved, including CEX tar performance and imagecache generation. The best solution would be FS-CACHE, which @ElijahLynn briefly looked into during the CEX implementation. We need to take a deeper look at the performance issues we saw. If we find that FS-CACHE is not going to work, we will move on to s3fs and use some information from #1759 (comment).

@ElijahLynn ElijahLynn added this to the CMS Q2 milestone May 26, 2020
@ElijahLynn
Contributor Author

Linking a discussion here that @indytechcook and I had about Amazon FSx for Lustre (not available in GovCloud yet).

https://dsva.slack.com/archives/CT4GZBM8F/p1590094576409200

@kevwalsh kevwalsh modified the milestones: CMS Q2, CMS Q3 2020 Jul 7, 2020
@jefflbrauer jefflbrauer modified the milestones: CMS Q3 2020, CMS Q4 2020 Jul 7, 2020
@cmaeng cmaeng added the Infrastructure Objective 1 2020: Develop CMS infrastructure for evolving API, e2e testing and demo needs. label Sep 15, 2020
@oksana-c oksana-c removed this from the CMS Q4 2020 milestone Jan 3, 2021
@oksana-c
Contributor

@ElijahLynn @olivereri, now that we have moved back to EFS, is this epic obsolete?


6 participants