Server migration – convert .htaccess to Nginx conf #1197

Closed
adamziel opened this issue Apr 4, 2024 · 28 comments
Comments

@adamziel
Collaborator

adamziel commented Apr 4, 2024

playground.wordpress.net is shared hosting and sometimes becomes unusably slow. There's a new server @brandonpayton set up, but it runs Nginx and not Apache.

We need to express the below rules using Nginx. These could be useful:

AddEncoding x-gzip .gz

<FilesMatch "index\.html">
Header unset ETag
Header set Cache-Control "max-age=0, no-cache, no-store, must-revalidate"
</FilesMatch>
<FilesMatch "index\.js|blueprint-schema\.json|logger.php|wp-cli.phar|wordpress-importer.zip">
Header set Access-Control-Allow-Origin "*"
Header unset ETag
Header set Cache-Control "max-age=0, no-cache, no-store, must-revalidate"
</FilesMatch>

SetEnv ENV_VARIABLE ****

AddType application/wasm .wasm
AddType application/octet-stream .data

<FilesMatch "iframe-worker.html$">
  Header set Origin-Agent-Cluster: ?1
</FilesMatch>
<FilesMatch "store.zip$">
  SetEnv no-gzip 1
  SetEnv no-brotli 1
  Header set Access-Control-Allow-Origin: *
</FilesMatch>

RewriteEngine on
RewriteRule ^scope:.*?/(.*)$ $1 [NC]
RewriteRule plugin-proxy$ /plugin-proxy.php [NC]
RedirectMatch 301 /wordpress-browser.html /

RewriteCond %{HTTP_REFERER} ^https://developer\.wordpress\.org/
RewriteRule wordpress.html /index.html [R=302,L]

RewriteCond %{HTTP_REFERER} ^https://wordpress\.org/
RewriteRule wordpress.html /index.html [R=302,L]
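
As a rough illustration only (not part of the original .htaccess), the FilesMatch header rules above could be captured as a data table that request-handling code consults. The names `HeaderRule`, `rules`, and `headersForFile` are hypothetical, and the dots in the patterns are escaped here even though the original leaves some of them unescaped. The `SetEnv no-gzip` / `no-brotli` directives are not modeled.

```typescript
// Hypothetical rule table mirroring the FilesMatch blocks above.
interface HeaderRule {
  pattern: RegExp;             // tested against the basename, like FilesMatch
  set: Record<string, string>; // headers to add or overwrite
  unset: string[];             // headers to remove
}

const NO_CACHE = "max-age=0, no-cache, no-store, must-revalidate";

const rules: HeaderRule[] = [
  { pattern: /index\.html/, set: { "Cache-Control": NO_CACHE }, unset: ["ETag"] },
  {
    pattern: /index\.js|blueprint-schema\.json|logger\.php|wp-cli\.phar|wordpress-importer\.zip/,
    set: { "Access-Control-Allow-Origin": "*", "Cache-Control": NO_CACHE },
    unset: ["ETag"],
  },
  { pattern: /iframe-worker\.html$/, set: { "Origin-Agent-Cluster": "?1" }, unset: [] },
  { pattern: /store\.zip$/, set: { "Access-Control-Allow-Origin": "*" }, unset: [] },
];

// Collect the header changes of every rule whose pattern matches the file name.
function headersForFile(path: string): { set: Record<string, string>; unset: string[] } {
  const basename = path.split("/").pop() || "";
  const set: Record<string, string> = {};
  const unset: string[] = [];
  for (const rule of rules) {
    if (!rule.pattern.test(basename)) continue;
    for (const name of Object.keys(rule.set)) {
      set[name] = rule.set[name];
    }
    for (const name of rule.unset) {
      if (unset.indexOf(name) === -1) unset.push(name);
    }
  }
  return { set, unset };
}
```

Keeping the rules as data means the same table can answer two questions: which headers to send for a request, and whether a file needs any special handling at all.
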
@brandonpayton
Member

With the new site, we don't have direct access to nginx config, but we may be able to get there with a combination of

  • PHP-handled redirects
  • Requesting support of additional default MIME types

The site is currently set up as a non-WP site, but we could create a different WP-based site if it turns out we'd like to use other features like a page cache based on memcached.

@bgrgicak
Collaborator

bgrgicak commented Apr 5, 2024

There is also one .htaccess in the root folder on the server which would need to be migrated.

@brandonpayton brandonpayton self-assigned this Apr 11, 2024
@brandonpayton
Member

So far, it looks like we will need to use PHP to handle requests, both to dynamically handle different URIs and to add custom HTTP headers. Using .htaccess is certainly more convenient, but it is not supported. PHP will be slower than directly serving static files, but we have edge caching available that should speed things up.

It's easy to turn on edge caching for a site, but I need to look into how we can automate cache invalidation when deploying Playground updates.

For now, I'm working on handling the different requests with PHP. Once that is working, we can worry about edge caching.

@brandonpayton
Member

brandonpayton commented Apr 18, 2024

Serving Playground from the test site seems to be working OK:
https://playground-dot-wordpress-dot-net.atomicsites.blog/

Because we want to customize HTTP headers and perform redirects, and we do not have the ability to customize the nginx config, all files, including static files, are served via PHP using a platform feature, custom-redirects.php.
https://gist.github.com/brandonpayton/9e3da0845791f6e5c833013d83cf86d6#file-custom-redirects-php

It's super verbose compared to htaccess, but it works and is surprisingly fast IMO given that PHP is also serving every static asset -- at least when it isn't under heavy load.

Before we move this to production, we can enable edge caching so those responses can be served from the cache.

What is left:

  • Create a separate deployment workflow that deploys to the test site.
  • Turn on edge caching and verify it is working
  • Update new deployment workflow to purge the edge cache when it completes

@bgrgicak
Collaborator

// TODO: Set these
/*
SetEnv GITHUB_TOKEN --secret--

Does this need to be set in PHP or is there a more secure way? I know that VIP allows admins to set env variables.

@brandonpayton
Member

brandonpayton commented Apr 19, 2024

Does this need to be set in PHP or is there a more secure way? I know that VIP allows admins to set env variables.

@bgrgicak, on WP Cloud, there are APIs for setting and accessing secrets, so I was thinking we'd use those.

Btw, the comments at the end of custom-redirects.php are just parts of the htaccess file that haven't been addressed yet. Maybe that was clear, but I feel like explaining since they just look like cruft at the end :)

@adamziel
Collaborator Author

adamziel commented Apr 19, 2024

@brandonpayton let's start a new issue to create and document a Playground serving flow that uses just the PHP dev server or... Playground itself :-) It's not high-priority work, but let's still track it. It's cool how we can handle all the rewrite rules, headers, etc. using just PHP. I'm already thinking through scenarios where it makes everything self-contained and removes even the webserver dependency.

@brandonpayton
Member

Some updates:

For the 429s, we might need to request a greater allowance for these requests, which will hopefully be OK once edge caching is enabled. Also, we might consider carefully diffing files between website builds and requesting cache invalidation only for changed files.
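
The idea of purging only changed files could look roughly like the sketch below. It assumes each build produces a manifest mapping file paths to content hashes; `BuildManifest` and `pathsToPurge` are hypothetical names, and how the hashes are computed or how the purge is requested from the host is left open.

```typescript
// Maps a deployed file path to a content hash (assumed to exist per build).
type BuildManifest = Record<string, string>;

// Return every path whose cached copy could be stale after deploying `next`.
function pathsToPurge(previous: BuildManifest, next: BuildManifest): string[] {
  const changed: string[] = [];
  for (const path of Object.keys(next)) {
    // New files have no previous hash; modified files have a different one.
    if (previous[path] !== next[path]) changed.push(path);
  }
  for (const path of Object.keys(previous)) {
    // Deleted files should be purged as well so the cache stops serving them.
    if (!(path in next)) changed.push(path);
  }
  return changed.sort();
}
```

Unchanged files keep their cached entries, which keeps the number of invalidation requests proportional to the size of the change rather than the size of the site.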

@brandonpayton
Member

@adamziel, sure! Created #1294. Please feel free to add more context if I missed anything. 🙇

@brandonpayton
Member

I am currently looking into how to avoid the 429s and to get the edge cache to work for requests from Chrome. It is priming the cache for 2+ requests for the same resource via curl, but when requesting in Chrome, the dev tools Network tab just shows cache misses. It's later in the day, so perhaps what is going on will be obvious to fresh eyes in the morning.

@brandonpayton
Member

It is priming the cache for 2+ requests for the same resource via curl, but when requesting in Chrome, the dev tools Network tab just shows cache misses.

I think the hideExperimentalNotice cookie is probably causing edge cache misses. Will look at what we can do for this.

@brandonpayton
Member

I think the hideExperimentalNotice cookie is probably causing edge cache misses. Will look at what we can do for this.

Safari does not allow using WebStorage in private windows, so if we want Playground to run in that context (and I would guess we want it to run in as many contexts as reasonable), we cannot rely upon localStorage.

One possibility is to augment the fetch request for static files to omit credentials.

@adamziel
Collaborator Author

One possibility is to augment the fetch request for static files to omit credentials.

Sounds like a great idea

@brandonpayton
Member

brandonpayton commented Apr 23, 2024

One possibility is to augment the fetch request for static files to omit credentials.

Sounds like a great idea

One downside is that this may cause issues for anyone wanting to host a Playground behind some kind of auth requirement. If we need to do it, it seems fine for now, and we could later add configurability so creds can still be relayed by Service Worker fetch.
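
A minimal sketch of that idea, including the configurability mentioned above. Everything here is an assumption for illustration: `isStaticFile` decides by file extension, and `relayCredentials` is a hypothetical option, not an existing Playground setting. The returned object is shaped so it could be passed as the second argument to `fetch()`.

```typescript
type CredentialsMode = "omit" | "same-origin" | "include";

interface StaticFetchOptions {
  // Hypothetical switch for deployments behind auth that need cookies relayed.
  relayCredentials?: boolean;
}

// Assumption: "static" is decided by extension; the real criterion may differ.
const STATIC_FILE_PATTERN = /\.(js|css|wasm|data|zip|png|svg)$/;

function isStaticFile(pathname: string): boolean {
  return STATIC_FILE_PATTERN.test(pathname);
}

// Build the fetch options for a request to the given path.
function fetchInitFor(
  pathname: string,
  options: StaticFetchOptions = {}
): { credentials: CredentialsMode } {
  if (isStaticFile(pathname) && !options.relayCredentials) {
    // No cookies are sent, so a cookie cannot turn the request into a cache miss.
    return { credentials: "omit" };
  }
  // "same-origin" is the default credentials mode of fetch().
  return { credentials: "same-origin" };
}
```

In a Service Worker this would be used along the lines of `fetch(url, fetchInitFor(pathname))`.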

That said, I've been thinking, and we may be able to have nginx serve most of our static files directly. We don't really have special rules for most files, so I think we may be able to deploy most static files in a way that nginx can find them directly without involving PHP. And when we do want to add custom headers we can omit those files from their usual location and have PHP serve them with custom headers instead.

@adamziel, is there a reason we cannot rewrite "scope:xyz/" paths in the Service Worker so that doesn't need to be handled on the web server? If we can do that, I think we can cut PHP out of the loop on the web server for most static file requests, and that should address the 429 errors we are seeing due to rate limiting (which can be due to using too many PHP workers at a time).
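
For reference, the rewrite itself is small. The sketch below mirrors the `^scope:.*?/(.*)$` rule from the .htaccess file as a client-side helper; `stripScope` is a hypothetical name and this is not taken from Playground's Service Worker code.

```typescript
// Matches a leading "/scope:<anything-but-slash>/" segment, case-insensitively
// to mirror the [NC] flag on the original RewriteRule.
const SCOPE_PREFIX = /^\/scope:[^/]*\//i;

// Remove the scope segment so the remaining path points at the real file.
function stripScope(pathname: string): string {
  return pathname.replace(SCOPE_PREFIX, "/");
}
```

Paths without a scope segment pass through unchanged, so the helper can be applied to every request path.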

@adamziel
Collaborator Author

One downside is that this may cause issues for anyone wanting to host a Playground behind some kind of auth requirement. If we need to do it, it seems fine for now, and we could later add configurability so creds can still be relayed by Service Worker fetch.

Good point, let's open a new issue to keep track of this. Someone will report it sooner or later, let's prioritize then.

we may be able to deploy most static files in a way that nginx can find them directly without involving PHP

That sounds good, too!

@adamziel, is there a reason we cannot rewrite "scope:xyz/" paths in the Service Worker so that doesn't need to be handled on the web server?

I thought we did that already? Do requests with scope:xyz in them make it to the server?

@brandonpayton
Member

Good point, let's open a new issue to keep track of this. Someone will report it sooner or later, let's prioritize then.

I created a PR to omit credentials from the Service Worker requests for static files here. When that is merged, I will create an issue to acknowledge credentials are omitted in case anyone has a problem with that.

@adamziel, is there a reason we cannot rewrite "scope:xyz/" paths in the Service Worker so that doesn't need to be handled on the web server?

I thought we did that already? Do requests with scope:xyz in them make it to the server?

Interesting! The Chrome dev tools show that as the effective URL, and even in the request's "Initiator" sub-tab, it looked like the final URL included the scope:xyz. But when I log requests on the server, all I see for scope mentions are HTTP Referer headers like:

'HTTP_REFERER' => 'https://playground-dot-wordpress-dot-net.atomicsites.blog/scope:0.8919731437847068/',

we may be able to deploy most static files in a way that nginx can find them directly without involving PHP

That sounds good, too!

Great! My plan is to:

  • Rework the code that performs redirects and adds custom headers so we can ask whether a given path needs special treatment
  • Create a playground-files directory on the web server to contain all Playground files
  • Mirror directory structure of playground-files in /srv/htdocs/ dir which is the web root
  • Symlink all files that do not need special treatment into their corresponding dir under /srv/htdocs so nginx locates and serves those directly
  • Rely on nginx to defer to PHP when files are not found on disk so we can continue customizing headers for specific files.

We can do all of this in a working dir outside of the web root and then rsync everything into place under /srv/htdocs/.
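
The split described in the plan above can be sketched as a pure function over the list of built files. `DeployPlan`, `planDeployment`, and the predicate are hypothetical names; the actual scripts run on the host and are not shown in this thread.

```typescript
interface DeployPlan {
  symlink: string[];     // linked into the web root so nginx serves them directly
  serveViaPhp: string[]; // kept out of the web root so nginx falls back to PHP
}

// Sort every file into one of the two buckets using the supplied predicate.
function planDeployment(
  files: string[],
  needsSpecialTreatment: (path: string) => boolean
): DeployPlan {
  const plan: DeployPlan = { symlink: [], serveViaPhp: [] };
  for (const file of files) {
    const bucket = needsSpecialTreatment(file) ? plan.serveViaPhp : plan.symlink;
    bucket.push(file);
  }
  return plan;
}
```

The predicate is the "does this path need special treatment" question from the first bullet, which is why the redirect and header code has to be answerable without a live request.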

🤞 I think this will take care of the 429 errors, and it should be faster regardless.

brandonpayton added a commit that referenced this issue Apr 25, 2024
## What is this PR doing?

This PR adds a workflow for deploying the playground.wordpress.net
website to WP Cloud.

Related to #1197

## What problem is it solving?

Sometimes the website has availability issues due to its current shared
hosting environment. Switching to WP Cloud is intended to address those
issues.

## Testing Instructions

- Make sure the new workflow completed successfully
- Test Playground using the temporary staging URL
https://playground-dot-wordpress-dot-net.atomicsites.blog/

NOTE: There are some outstanding issues with rate-limiting that break
certain aspects of WordPress Playground on the test site, but Playground
should still load the WP home page. The rate-limiting issues will be
addressed separately, and possibly with the host itself rather than in a
follow-up PR.
@brandonpayton
Member

I have scripts written to do the above-mentioned setup on the host. It is mostly working, but there is a confusing bug where PHP claims that str_ends_with() does not exist even though PHP_VERSION reflects 8.1.

Planning to continue this work in the morning.

@brandonpayton
Member

brandonpayton commented Apr 26, 2024

After merging #1331 and the fix #1333, Nginx is now serving most Playground files directly without involving PHP. The 429s have disappeared, and I am seeing more edge cache HITs.

The last thing AFAIK is to get the logger working. This should be straightforward. The platform has an API for setting and retrieving secrets, and we can use the same custom-redirects-lib.php file to notice requests to logger.php and set secret-related environment vars ahead of time (so logger.php doesn't need to change). I am planning to finish that in the morning.

To see the current state of the site, check out:
https://playground-dot-wordpress-dot-net.atomicsites.blog/

@brandonpayton
Member

#1337 wired up the error logger on the new WP Cloud site.

A few things left:

  • Redirects to / for /wordpress.html requests with wordpress.org "Referer" are not working properly. This was derived from our existing htaccess file. @adamziel is this still needed?
  • Add a brief README to website-deployment/ to explain our setup on WP Cloud
  • Move mime-types.php into a shared JSON file and generate mime-types.php during deployment process

The only possible MUST before switching the playground.wordpress.net domain over to WP Cloud is the wordpress.html redirect. We probably also want to do more thorough testing before making the switch.

@adamziel
Collaborator Author

Redirects to / for /wordpress.html requests with wordpress.org "Referer" are not working properly. This was derived from our existing htaccess file. @adamziel is this still needed?

@brandonpayton we’d have to patch and re-deploy this line (and potentially another one in the same repo):

https://github.com/WordPress/wporg-wasm/blob/ce5c76146eb46578a8a80239cfb2f05cd7ac7dfe/source/wp-content/themes/wporg-wasm/src/wasm-demo/src/components/playground.js#L14

@brandonpayton
Member

brandonpayton commented Apr 27, 2024

@brandonpayton we’d have to patch and re-deploy this line (and potentially another one in the same repo):

@adamziel, I think we should be able to fix the redirect on our side as well. It's just broken at the moment. I can take a quick look at fixing the redirect, and if that doesn't work, we can adjust the behavior under wporg-wasm.

@adamziel
Collaborator Author

adamziel commented Apr 30, 2024

@brandonpayton cool! FYI that code powers the Playground demo embedded on https://w.org/playground.

brandonpayton added a commit that referenced this issue Apr 30, 2024

## What is this PR doing?

This PR solves an issue where `/wordpress.html` was not being served by
PHP because it was not moved into `files-to-serve-via-php` during
website deploy. And it was not being moved because the
should-serve-via-PHP check relied upon the current referer, which is
something we don't know at deploy time.

## How is the problem addressed?

This PR solves the issue by having the maybe-redirect function return a
declaration of its intent to redirect for specific referers. Then we can
see there is need for special treatment at deploy time, and the request
handler can see the declaration and act upon it at request time.

## Testing Instructions

- Tested manually via SSH on
playground-dot-wordpress-dot-net.atomicsites.blog.
- Will also test post-deploy by manually running the deploy workflow for
WP Cloud and confirming the fix afterward

Related to #1197
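
The "declaration of intent" approach from this commit can be illustrated with the sketch below. It is written in TypeScript for illustration only (the real handler is PHP), the names `RedirectDeclaration`, `declareRedirect`, `shouldServeViaPhp`, and `resolveRedirect` are hypothetical, and the rules are taken from the .htaccess file at the top of this issue, matched here by exact path for simplicity.

```typescript
interface RedirectDeclaration {
  location: string;
  status: number;
  // When present, redirect only if the Referer starts with one of these prefixes.
  onlyForReferers?: string[];
}

// Describe what would happen for a path without looking at the request.
function declareRedirect(path: string): RedirectDeclaration | null {
  if (path === "/wordpress-browser.html") {
    return { location: "/", status: 301 };
  }
  if (path === "/wordpress.html") {
    return {
      location: "/index.html",
      status: 302,
      onlyForReferers: ["https://developer.wordpress.org/", "https://wordpress.org/"],
    };
  }
  return null;
}

// Deploy time: the referer is unknown, so the existence of a declaration is enough.
function shouldServeViaPhp(path: string): boolean {
  return declareRedirect(path) !== null;
}

// Request time: apply the declaration against the actual Referer header.
function resolveRedirect(path: string, referer: string | undefined): RedirectDeclaration | null {
  const declaration = declareRedirect(path);
  if (declaration === null) return null;
  if (declaration.onlyForReferers === undefined) return declaration;
  const ref = referer || "";
  const matches = declaration.onlyForReferers.some((prefix) => ref.indexOf(prefix) === 0);
  return matches ? declaration : null;
}
```

Because the condition is data rather than a runtime check, the deploy step and the request handler can share one source of truth.
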
@brandonpayton
Member

brandonpayton commented Apr 30, 2024

@adamziel, thank you for the tip about how the redirect is used by https://w.org/playground.

I fixed the redirect to work. There is a secondary bug where the redirect may not happen when edge-caching has already cached a non-redirecting /wordpress.html response, and that is being fixed under #1351.

It looks like https://w.org/playground is embedding an iframe referring to wasm.wordpress.net. Is there any difference between wasm.wordpress.net and playground.wordpress.net?

brandonpayton added a commit that referenced this issue Apr 30, 2024
## What is this PR doing?

This PR disables edge caching for conditionally redirected resources.

## What problem is it solving?

The WP Cloud edge cache seems to avoid caching some redirects. But for
conditionally redirected resources like `/wordpress.html`, the edge
cache can cache the resource when not redirected, and after that, PHP is
no longer given a chance to conditionally redirect.

## How is the problem addressed?

If a resource has conditional redirects, we now explicitly disable
caching for that resource.

## Testing Instructions

- Tested manually on playground-dot-wordpress-dot-net.atomicsites.blog

Related to #1197
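
A sketch of the rule this commit describes: if a resource's redirect depends on the request, mark its response as uncacheable. The PR text does not say how caching is disabled on WP Cloud, so the `Cache-Control` value below (borrowed from the .htaccess file in this issue) is an assumption, as are the names `RedirectRule` and `cacheHeadersFor`.

```typescript
interface RedirectRule {
  location: string;
  onlyForReferers?: string[]; // presence makes the redirect conditional
}

// Assumption: this header keeps the edge cache from storing the response.
const NO_STORE = "max-age=0, no-cache, no-store, must-revalidate";

// Return extra response headers for a resource, given its redirect rule (if any).
function cacheHeadersFor(rule: RedirectRule | null): Record<string, string> {
  const headers: Record<string, string> = {};
  if (rule !== null && rule.onlyForReferers !== undefined) {
    // Both the redirecting and the non-redirecting response must stay uncached,
    // otherwise whichever is cached first would be served to everyone.
    headers["Cache-Control"] = NO_STORE;
  }
  return headers;
}
```
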
@adamziel
Collaborator Author

Is there any difference between wasm.wordpress.net and playground.wordpress.net?

both point to the same server, that redirect is the only difference

@brandonpayton
Member

Here is what I believe remains before switching to the new site:

  • Learn what kind of GitHub token is used for plugin-proxy.php, and obtain one for use with the WP Cloud site (thinking that it is poor practice to reuse secrets in different places, though if this token is just allowed access to public repos, reusing should be totally fine)
  • Update deployment workflow to create mime-types.php from the shared mime-types.json file (low priority but might as well be done before the WP Cloud site becomes production)
  • Thoroughly test new site -- test it personally from various angles and then post a call for testing

@WordPress/playground-maintainers any other thoughts on this?

brandonpayton added a commit that referenced this issue May 3, 2024
## What is this PR doing?

This PR eliminates duplication in the code base where we maintained
separate MIME type mappings for PHPRequestHandler and for files served
via PHP on WP Cloud.

Related to #1197

## How is the problem addressed?

This PR generates `mime-types.php` from shared JSON when deploying the
website to WP Cloud.

## Testing Instructions

- Temporarily adjust WP Cloud deploy workflow to be tested under this PR
- Confirm the workflow completes without error and that
playground-dot-wordpress-dot-net.atomicsites.blog loads properly
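
Generating a PHP file from shared JSON could look roughly like this. The shape of the JSON (extension to MIME type), the `$mime_types` variable name, and the function names are assumptions made for illustration; the actual generated file is not shown in this thread.

```typescript
// Escape backslashes and single quotes for a single-quoted PHP string literal.
function phpString(value: string): string {
  return "'" + value.replace(/\\/g, "\\\\").replace(/'/g, "\\'") + "'";
}

// Render an extension-to-MIME-type mapping as PHP source code.
function renderMimeTypesPhp(mimeTypes: Record<string, string>): string {
  const lines = Object.keys(mimeTypes)
    .sort() // stable output keeps deploy diffs small
    .map((ext) => "\t" + phpString(ext) + " => " + phpString(mimeTypes[ext]) + ",");
  return ["<?php", "// Generated file: do not edit by hand.", "$mime_types = array("]
    .concat(lines, [");", ""])
    .join("\n");
}
```

With a single JSON source, the in-browser request handler and the PHP side cannot drift apart, which is the duplication this PR removes.
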
brandonpayton added a commit that referenced this issue May 4, 2024
## What is this PR doing?

Adds secrets on demand for PHP endpoints that need them.

Related to #1197

## What problem is it solving?

We are not yet relaying secrets required by the endpoints
`plugin-proxy.php` and `oauth.php`.

## How is the problem addressed?

This PR adds secrets as environment variables when requests for those
endpoints are processed.

## Testing Instructions

- Test briefly on the WP Cloud staging site
- Test use of the updates to identify files-to-serve-via-php during
deployment
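
The on-demand approach can be sketched as a lookup from endpoint to the secrets it needs. This is an illustration in TypeScript, not the PHP from the PR: `GITHUB_TOKEN` is the name mentioned earlier in this thread, the `oauth.php` names are placeholders, and `getSecret` stands in for the platform's secrets API, whose real interface is not shown here.

```typescript
// Hypothetical mapping from endpoint path to the secrets it needs.
const SECRETS_BY_ENDPOINT: Record<string, string[]> = {
  "/plugin-proxy.php": ["GITHUB_TOKEN"],
  "/oauth.php": ["EXAMPLE_OAUTH_CLIENT_ID", "EXAMPLE_OAUTH_CLIENT_SECRET"], // placeholder names
};

type SecretLookup = (name: string) => string | undefined;

// Build the environment variables to expose for a single request.
function envForRequest(path: string, getSecret: SecretLookup): Record<string, string> {
  const env: Record<string, string> = {};
  const known = Object.prototype.hasOwnProperty.call(SECRETS_BY_ENDPOINT, path);
  const names = known ? SECRETS_BY_ENDPOINT[path] : [];
  for (const name of names) {
    const value = getSecret(name);
    if (value !== undefined) env[name] = value; // skip secrets that are not configured
  }
  return env;
}
```

Requests for any other path get an empty environment, so secrets are only loaded for the endpoints that use them.
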
@brandonpayton
Member

Deduplication of MIME type mappings is done. I also discovered oauth.php used secrets and made an update for that.

What is left is:

  • Work out whether to reuse secrets or create new ones in the case of auth tokens
  • Thoroughly test new site -- test it personally from various angles and then post a call for testing

brandonpayton added a commit that referenced this issue May 10, 2024
## What is this PR doing?

The custom-redirects-lib.php file is setting different environment
variable names than are needed for the GitHub export oauth script.

Related to #1197

## Testing Instructions

- Tested manually on playground-dot-wordpress-dot-net.atomicsites.blog
@brandonpayton
Member

brandonpayton commented May 10, 2024

@brandonpayton
Member

We migrated the website to WP Cloud today and this can be closed.


3 participants