Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Nginx ssl.no-default.conf to drop requests for unknown hosts #888

Merged
merged 4 commits into from
Mar 7, 2018

Conversation

fullyint
Copy link
Contributor

The only new material in this PR is the new ssl.no-default.conf.j2.

The other changed LOC are primarily just moving the tags: nginx-sites tasks (unchanged) from the nginx role to the wordpress-setup role. This avoids the problem that could occur if ssl.no-default.conf is enabled in the nginx role: The subsequent nginx reload in the letsencrypt role could otherwise potentially run before the cert listed in ssl.no-default.conf has been created. Nginx would fail to reload due to no such file.

I would have preferred to not move these tasks, but maybe there's still order in the sense that now tasks creating Nginx confs for "no-default" sites are right next to tasks creating Nginx confs for wordpress_sites.

The ssl.no-default.conf follows the h5bp ssl.no-default but adds the deferred option to the listen directive (notes in #885) and adds ssl_certificate and ssl_certificate_key.

Without ssl_certificate and ssl_certificate_key in the default server block, https sites fail:

  • in browser: "The connection was reset" ERR_CONNECTION_RESET
  • curl: curl: (35) Server aborted the SSL handshake

Why doesn't h5bp's ssl.no-default include ssl_certificate and ssl_certificate_key?

It's implicit that if you want to use ssl.no-default you're already using ssl for another server block, in which case you'd already have an ssl cert and can use that if necessary
-- h5bp/server-configs-nginx#177 (comment)

Which cert and key will it use? By default, it will pick the cert and key from one of the ssl-enabled sites. However, users may specify which site's cert and key to use in group_vars:
ssl_default_cert_site: example.com

The ssl_certificate and ssl_certificate_key section is the same as in wordpress-site.conf.j2 except that it uses _item, avoiding any clash with the item resulting from the template task's with_items: "{{ nginx_sites_confs }}".

This default ssl server block does not validate client certificates, avoiding complications with SNI.

@@ -22,7 +22,7 @@
changed_when: false
when: site_uses_letsencrypt
with_dict: "{{ wordpress_sites }}"
tags: [wordpress, wordpress-setup, nginx-includes]
tags: [wordpress, wordpress-setup, nginx-includes, nginx-sites]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: Why this task needs those tags?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a user runs server.yml with --tags <any tag above>, Trellis may template Nginx confs that use the letsencrypt_cert_ids variable (e.g., in this PR's ssl.no-default.conf.j2 or in wordpress-site.conf.j2). The letsencrypt_cert_ids variable definition relies on the above task's registered var generate_cert_ids. If the task above does not also run for the tags above, letsencrypt_cert_ids will be malformed and the Nginx conf templating tasks will fail.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

ssl_trusted_certificate {{ nginx_path }}/ssl/{{ _item.key }}.cert;
ssl_certificate_key {{ nginx_path }}/ssl/{{ _item.key }}.key;
{% endif -%}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it return 444 if we include includes.d/{{ item.key }}/*.conf; here? The goal is to allow third party roles to inject ssl_certificate and ssl_certificate_key here.

  {% block includes_d -%}
  include includes.d/{{ item.key }}/*.conf;
  {% endblock -%}

Copy link
Contributor Author

@fullyint fullyint Sep 15, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By default, it will pick the cert and key from one of the ssl-enabled sites. However, users may specify which site's cert and key to use in group_vars:
ssl_default_cert_site: example.com

So, users can specify which site's cert and key to use from wordpress_sites. But it seems that you anticipate that users or third-party roles may want to specify a cert and key other than any in wordpress_sites. I guess it's possible. I hadn't considered it.

My first idea to accommodate this scenario would be add to this PR the vars ssl_certificate_default and ssl_certificate_key_default for users or roles to define. In that case, this ssl.no-default.conf could use those values if defined or fall back on the cert and key for a site from wordpress_sites, as in the code above.

Considerations for integration with third-party roles are important but are not fresh in my mind. Would vars ssl_certificate_default and ssl_certificate_key_default be sufficient? Or does the include includes.d/... approach seem superior?

My first inclination is to avoid adding include includes.d/... to this ssl.no-default.conf. Given that this server block is filling requests for unknown hosts, my sense is that it should just get rid of the request with as little processing as possible (i.e., avoid all the potential processing of confs in includes.d). If there are examples of strong use cases for includes here, perhaps we should give users the option to specify includes paths:

# some group_vars location
# corresponding templates in local `nginx-includes` directory
nginx_no_default_include_paths:
  - includes.d/no-default/foo.conf
# in `no-default.conf.j2` and `ssl.no-default.conf.j2`
{% for path in nginx_no_default_include_paths | default([]) %}
include {{ path }};
{% endfor %}

I'm reluctant to add all this complexity unless it is likely to be used. But if likely, then 👍

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am over-concerned because this block only runs for unknown domains. Thus, we can't give it a valid ssl certificate for most of the cases.

Thanks!

Copy link
Collaborator

@tangrufus tangrufus Sep 15, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Questions:

  • why not ssl_verify_client off; explicitly?
  • why skipping ssl_certificate, ssl_trusted_certificate and ssl_certificate_key and let Nginx get them form another server block?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the Nginx default ssl_verify_client off;, does it help to add the explicit directive?

When those other directives were absent, the real ssl-enabled wordpress_sites seemed to fail:

Without ssl_certificate and ssl_certificate_key in the default server block, https sites fail:

  • in browser: "The connection was reset" ERR_CONNECTION_RESET
  • curl: curl: (35) Server aborted the SSL handshake

In any case, I don't think those other server blocks would match an unknown host.

If there were only one site, never a need for different certs/keys for different hosts, then we could specify the cert and key just once in an http block. But I assume we want to enable multiple ssl-enabled wordpress_sites with separate certs.

These issues are new to me however, so I'm eager for suggestions.

Copy link
Collaborator

@tangrufus tangrufus Sep 16, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About ssl_verify_client off;, over-concerned again, sorry.

When those other directives were absent, the real ssl-enabled wordpress_sites seemed to fail

Those other directives are could be absent even with this patch. Edge case: All sites using third party ssl providers. The if/else conditions never matched.

Possible solution: Always generate self-signed cert for ssl.no-default.conf as h5bp/server-configs-nginx#177 suggested.

This opens another question: Is it good to enable ssl.no-default.conf even no site using SSL?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it good to enable ssl.no-default.conf even no site using SSL?

I imagine the answer is "no," which is why this PR adds ssl.no-default.conf.j2 to the nginx_sites_confs list with the option enabled: "{{ sites_using_ssl != [] }}". The result is that ssl.no-default.conf is only symlinked into sites-enabled when at least one of the wordpress_sites has SSL enabled.

I hadn't considered the edge case that someone would get a third party cert and key on the server through means other than the Trellis built-in Let's Encrypt, manual, or self-signed options. If it were common, perhaps always adding a self-signed cert to ssl.no-default.conf would be good, or it could be an argument for adding the ssl_certificate_default and ssl_certificate_key_default vars mentioned above.

I figure that such would be a rare edge case, however, and this PR in its current form still allows users to customize the list of nginx_sites_confs, optionally replacing the default ssl.no-default.conf.j2 with some-custom-no-default.conf.j2.

Based on h5bp/server-configs-nginx/sites-available/ssl.no-default.
Adds ssl_certificate and ssl_certificate_key, required for Nginx
to load properly. Users can choose which site's cert and key are
used by defining `ssl_default_cert_site: example.com`.
@fullyint
Copy link
Contributor Author

fullyint commented Mar 5, 2018

I added a commit with the change that @tangrufus 💚 suggested in the discussion above:

Those other directives are could be absent even with this patch. Edge case: All sites using third party ssl providers. The if/else conditions never matched.

Possible solution: Always generate self-signed cert for ssl.no-default.conf as h5bp/server-configs-nginx#177 suggested.

Thus ssl-related third party roles need not be cognizant of ssl.no-default.conf.

Implementation notes:

  • if no sites create a self-signed cert, this creates one for throw-away example.com
  • ssl.no-default.conf uses the first existing self-signed cert, if any, else the example.com cert

git logistics:

  • Had to rebase for travis to succeed. Branch was old, not compatible with Ansible 2.4.
  • Commits should be squashed upon merge.

@fullyint
Copy link
Contributor Author

fullyint commented Mar 5, 2018

tags. I replaced some long tags lists with always. This way the diffie-hellman and letsencrypt cert IDs tasks will always run despite any --tags, but still only if the task's when condition is true. This seems better than trying to track and add any tags that these tasks would need as things evolve over time, as we've been doing. Using the always tag is also a little more conservative in case third-party roles might need those tasks to run even when users specify --tags galaxy_role_only.

no_default.cert I changed to always using a new no_default.cert instead of only using it when no real site's cert is available. My original goal was to avoid creating an extra cert if we could just use an existing one. This was an example of me falling victim to a pattern of complicating things while trying to enable Trellis to potentially run fewer tasks.

Creating this self-signed no_default.cert is not costly, taking less than one second. It makes for slightly more understandable ansible code. Users will find it more straightforward that ssl.no-default.conf always uses the no_default.cert instead of sometimes using a real site's cert.

Copy link
Member

@swalkinshaw swalkinshaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One question but looks great 🎉

ssl_default_site:
no_default:
site_hosts:
- canonical: example.com
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a little odd to hardcode this is a 2nd place especially since it's the production value only. Is this actually needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(comment was addressed in internal discussion)

@fullyint
Copy link
Contributor Author

fullyint commented Mar 6, 2018

I reverted the tags from always back to lists of tag names. Sorry I didn't reflect much before making that change. I hadn't considered the implications of tagging always and I realize that in fact, I really don't want these tasks to run always.

Example. Suppose a user has a fresh server and runs --tags common for some reason. The Diffie-Hellman task will try to run due to its tag always but its relevant /etc/nginx/ssl dir won't exist yet and the task will fail.

Added changelog entry. I think this one is finally good to go. Should be squashed.

@fullyint fullyint merged commit f38b782 into master Mar 7, 2018
@fullyint fullyint deleted the nginx-ssl-no-default branch March 7, 2018 04:48
primozcigler added a commit to proteusthemes/pt-ops that referenced this pull request Mar 13, 2018
* trellis/master:
  Ensure Diffie-Hellman group is generated for Let's Encrypt (roots#964)
  Fix raw_vars feature to properly handle int values (roots#959)
  Update Ansible default plugin paths in config files (roots#958)
  Add Nginx ssl.no-default.conf to drop requests for unknown hosts (roots#888)
  Disable memcached UDP support by default
  Git: Ignore `vagrant.local.yml`
  Add sponsors [ci skip]
  Set Vagrant's ansible compatibility_mode
  Check and stop php7.1-fpm service
  Update to PHP 7.2
  Fix failed_when in template_root check with wp-cli 1.5.0 (roots#948)
  Bump Ansible version_tested_max to 2.4.3.0 (roots#945)
  Update wp-cli to 1.5.0 (roots#944)
  Update `vagrant_box_version` to `>= 201801.02.0`
  Update `vagrant_box_version` to `<= 201801.02.0`
  Update Vagrant requirement to v2.0.1.
  Support for IPv6 in Letsencrypt challenges
  Allow Trellis VMs to be created/provisioned in Windows Subsystem for Linux.
swalkinshaw added a commit that referenced this pull request May 28, 2020
…icate_common_name

Fix #888 & #1189 to comply with RFC2606
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants