v1.0.0-rc.1 #1
This is essentially a complete rewrite and expansion of the original functionality of `v0.1.0`:

- modular schema support, which allows for easy future expansion
- supports `RFC 7239` & `X-Forwarded-*` as primary trusted schemas
- allows implementers to choose which schemas to process by passing an `options` object
- this *is* a breaking API change: where `v0.1.0` returns an array of IPs, `v1.0.0` returns an object representing the parameters of `RFC 7239`
- the version increment is required to allow dependent libraries/implementers to continue to operate safely without version breakage *(assuming version resolution occurs with `npm`)*
- [`RFC 7239` Parameters](http://tools.ietf.org/html/rfc7239#section-5)
- coupled with support for less common schemas, named in association with vendor usage:
  - cloudflare
  - fastly
  - microsoft
  - nginx
  - zscaler
host: req.headers && req.headers.host ? req.headers.host : undefined,
port: req.connection.remotePort.toString(),
ports: [req.connection.remotePort.toString()],
proto: req.connection.encrypted ? 'https' : 'http'
Are you sure this actually works in Node.js 0.6 and the others? I only ask because the test is just a mock, so we have to verify manually if we cannot add a real test.
I was relying on the travis test to be honest, I managed to manually test all the way down to 0.8, then gave up on getting 0.6 installed on my machine. I'll get that working today and confirm.
They are only as honest as your tests are--since you are using mocks, then you are not actually testing anything (you have to test a real `req` for Travis CI to be of any help to you).
addrs: [req.connection.remoteAddress],
by: null,
host: req.headers && req.headers.host ? req.headers.host : undefined,
port: req.connection.remotePort.toString(),
On a disconnected socket, `req.connection.remotePort` is `undefined` and this will throw. This needs to be handled.
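One way the disconnected-socket case could be handled is sketched below. The helper name `getSocketPort` is hypothetical and not part of the PR; the sketch only assumes that `remotePort` may be missing and that the caller wants a string or `undefined`.

```javascript
'use strict'

// Hypothetical helper (not part of the PR): read the remote port without
// throwing when the socket has already been disconnected.
function getSocketPort (req) {
  var socket = req && req.connection
  var port = socket ? socket.remotePort : undefined

  // remotePort is undefined on a disconnected socket, so calling
  // .toString() on it directly would throw a TypeError
  return port === undefined || port === null
    ? undefined
    : String(port)
}

// Plain objects stand in for a request here; as noted in the review,
// a real req is still needed to verify behaviour across Node.js versions.
console.log(getSocketPort({ connection: { remotePort: 51234 } }))
console.log(getSocketPort({ connection: {} }))
```

Returning `undefined` rather than an empty string is a design choice made for this sketch; it mirrors how the diff above already uses `undefined` for a missing `host`.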
it's actually a bug in your PR here that it revealed. the issue is that you are using a simple mock for […] Here is the follow-through using Node.js 0.10.24:
Another concern I have is why does this PR seem to be significantly slower? So this module does not have a benchmark suite, but […]
The matching benchmark is the closest one that measures just the […] As you can see above, based on those three benchmarks, performance dropped significantly. Since this module would theoretically be called on every request to a web server, it has a direct impact on the maximum req/sec a single Node.js instance can serve. I understand, of course, that the current implementation was very simple, but I did not expect to see it drop all the way down to 55k ops/sec. My main question is: can we do better? What are the bottlenecks in this code? Can v8 optimize the functions, or does it keep bailing out for some reason? This note on performance is not technically a blocker, but just something I noted and think one of us should at least glance into :)
function isSecure (req) {
try {
var cf = JSON.parse(req.headers['cf-visitor'])
return cf.scheme !== undefined
This is broken; according to https://support.cloudflare.com/hc/en-us/articles/200170536-How-do-I-redirect-HTTPS-traffic-with-Flexible-SSL-and-Apache- the value `{"scheme":"http"}` is valid and this routine would incorrectly mark the request as secure.
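A stricter check might look like the sketch below. It assumes that only an explicit `"https"` scheme in `CF-Visitor` should count as secure, and that a missing or malformed header should count as not secure; both are assumptions for illustration rather than behaviour confirmed in this thread.

```javascript
'use strict'

// Sketch of a stricter check: only an explicit "https" scheme is secure.
function isSecure (req) {
  var header = req && req.headers && req.headers['cf-visitor']
  if (typeof header !== 'string') return false

  try {
    var cf = JSON.parse(header)
    // JSON.parse may legitimately return null or a primitive
    return cf !== null && typeof cf === 'object' && cf.scheme === 'https'
  } catch (e) {
    // malformed JSON: treat the request as not secure
    return false
  }
}

console.log(isSecure({ headers: { 'cf-visitor': '{"scheme":"http"}' } }))
console.log(isSecure({ headers: { 'cf-visitor': '{"scheme":"https"}' } }))
```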
that is concerning. I did not expect much change in the way of performance, so it never occurred to me to run a benchmark. I'll take a deeper dive into this and find the bottleneck. thanks for all the great feedback, I'll make the appropriate changes.
Absolutely! I wanted to get you the feedback quickly :) I noticed the repo was last changed in 2014, so I just made a few mundane updates to […] Also, with the significant code, please feel free to also add your name for 2015 in the LICENSE file.
return forwarded
}

splitMap(header, ELEMENT_SEPARATOR, function parseElement (el) {
It seems like this will fail to parse the header `Forwarded: for="foo,bar"` properly, opening a vector for security issues when relying on the header for access control.
from my reading of RFC 7239, the spec states that each `for` parameter represents a single address, unless I'm reading this wrong. can you link to or highlight the section that says otherwise?
That is irrelevant; an incoming header can contain any values, valid or not. The current implementation will do the wrong thing and this is not good, as the only reason you are parsing these headers is for logging, auditing, or access controls.
Just think of a standard proxy that does not modify anything existing in the `Forwarded` header and simply adds its own data to it; it will pass through the bogus value, which will create false parses.
Basically, what I'm saying is that the current parse is actually not RFC 7239 compliant, as you are not correctly following the ABNF, which allows for `,` and `;` to appear within a `quoted-string`, and the defined ABNF for the header is as follows:
Forwarded = 1#forwarded-element
forwarded-element = [ forwarded-pair ] *( ";" [ forwarded-pair ] )
forwarded-pair = token "=" value
value = token / quoted-string
token = 1*tchar
tchar = "!" / "#" / "$" / "%" / "&" / "'" / "*"
/ "+" / "-" / "." / "^" / "_" / "`" / "|" / "~"
/ DIGIT / ALPHA
quoted-string = DQUOTE *( qdtext / quoted-pair ) DQUOTE
qdtext = HTAB / SP / %x21 / %x23-5B / %x5D-7E / obs-text
obs-text = %x80-FF
The given header, `Forwarded: for="foo,bar"`, is syntactically valid, yet this module will definitely parse it incorrectly.
Because the RFC allows extensions, perhaps your proxy decides it's going to include a custom field of the authenticated user's name. Uh, oh!
# this is in bash, so the \" is an escaped double quote (i.e. it's only a single double quote in the program)
$ node -pe "require('forwarded')({connection:{remotePort:0},headers:{forwarded:'user=\"bad,for=_whitelisted_ip\";for=_blacklisted_ip'}})"
{ addrs: [ 'whitelisted_ip', 'blacklisted_ip' ],
by: null,
host: undefined,
port: '0',
ports: [ '0' ],
proto: 'http' }
Even though the header is 100% syntactically valid, with a single forwarded entry of `{for: '_blacklisted_ip', user: 'bad,for=_whitelisted_ip'}`, we end up incorrectly thinking that the proxy got the request from `_whitelisted_ip` due to the invalid parsing this module did :(
And I'm not even a security expert; I can tell you that if we don't fix the parsing, it will most likely end up as a critical security bug.
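To make the concern concrete, here is a minimal sketch of a parse that tracks quoted-strings while scanning, so that `,` and `;` inside quotes are not treated as separators. The function name `parseForwarded` and its return shape (an array of plain objects keyed by lower-cased parameter name) are assumptions for illustration, not the module's API. The sketch is deliberately lenient: it does not reject syntactically invalid input such as an unterminated quote.

```javascript
'use strict'

// Hypothetical sketch, not the module's implementation: split a Forwarded
// header value into elements and pairs while honouring quoted-string values.
function parseForwarded (header) {
  var elements = []
  var element = {}
  var name = ''
  var value = ''
  var seenEquals = false
  var inQuotes = false

  function endPair () {
    var key = name.trim().toLowerCase()
    if (key && seenEquals) element[key] = value
    name = ''
    value = ''
    seenEquals = false
  }

  function endElement () {
    endPair()
    if (Object.keys(element).length > 0) elements.push(element)
    element = {}
  }

  for (var i = 0; i < header.length; i++) {
    var ch = header.charAt(i)

    if (inQuotes) {
      if (ch === '\\' && i + 1 < header.length) {
        value += header.charAt(++i) // quoted-pair: keep the escaped character
      } else if (ch === '"') {
        inQuotes = false
      } else {
        value += ch // "," and ";" are ordinary characters inside quotes
      }
    } else if (ch === '"' && seenEquals) {
      inQuotes = true
    } else if (ch === ',') {
      endElement()
    } else if (ch === ';') {
      endPair()
    } else if (ch === '=' && !seenEquals) {
      seenEquals = true
    } else if (seenEquals) {
      if (ch !== ' ' && ch !== '\t') value += ch // skip optional whitespace
    } else {
      name += ch
    }
  }

  endElement()
  return elements
}

console.log(parseForwarded('user="bad,for=_whitelisted_ip";for=_blacklisted_ip'))
```

With the header from the example above, this yields a single element whose `for` is `_blacklisted_ip`, which is the reading the ABNF calls for.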
> The current implementation will do the wrong thing

can you clarify what you mean by that?
the token syntax does not actually allow a comma to be present:
token = 1*tchar
tchar = "!" / "#" / "$" / "%" / "&" / "'" / "*"
/ "+" / "-" / "." / "^" / "_" / "`" / "|" / "~"
/ DIGIT / ALPHA
; any VCHAR, except delimiters
Delimiters are chosen from the set of US-ASCII visual characters not allowed in a token (DQUOTE and "(),/:;<=>?@[\]{}").
so, if anything, we should discard `for="foo,bar"` since it's not a valid token
verification vs. validation
further, imo, the purpose of this library is to parse the headers, _not to verify values_, the resulting values should be filtered through a secondary layer if the implementer chooses to verify values, such as network addresses, host names, etc ...
which leaves us with the question: should we add validation here?
specifically: to validate addresses, hostnames, ports.
as an aside, RFC 7239 node identifier syntax could also include unknown and obfuscated identifiers:
MUST have a leading underscore "_". Furthermore, it MUST also
consist of only "ALPHA", "DIGIT", and the characters ".", "_", and
"-".
which we can easily validate as well.
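As a rough illustration of that validation, the sketch below classifies a node identifier using the rule quoted above (a leading underscore followed by one or more of ALPHA, DIGIT, ".", "_", "-"). The function name `classifyNode` and its return labels are hypothetical, and it makes no attempt to validate IP addresses or ports.

```javascript
'use strict'

// Leading "_" followed by one or more of ALPHA / DIGIT / "." / "_" / "-"
var OBFUSCATED_NODE = /^_[A-Za-z0-9._-]+$/

// Hypothetical helper: label a node identifier taken from a "for" or "by"
// parameter. Anything that is neither "unknown" nor obfuscated is left for
// a separate address check.
function classifyNode (node) {
  if (typeof node !== 'string') return 'other'
  if (node.toLowerCase() === 'unknown') return 'unknown'
  if (OBFUSCATED_NODE.test(node)) return 'obfuscated'
  return 'other'
}

console.log(classifyNode('_hidden'))
console.log(classifyNode('unknown'))
console.log(classifyNode('192.0.2.43'))
```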
…r' property is an array
I've added some minor performance improvements and switched to using […]
@lpinca next steps:
@dougwilson any further direction from you to the todo list above?
This part I don't understand. The entire purpose of this module is literally to do that operation, not depend on another module to do it. This module's name matches the name of the header for a reason...
Basically I think there is a huge misunderstanding of the current separation of concerns between the modules here. This module is meant only to be concerned with parsing/formatting of the HTTP "Forwarded" header (and I suppose the X headers that it unified). The proxy-addr module is designed to determine the "proxied address" of a request, which could include configuring the source of proxy headers. This PR seems to be heavily mixing those two together into this single library to me.
A good litmus test in this organization is that if your code logically organizes into more than a single JavaScript file, that is multiple modules, not one. There are still some old, inherited modules that have a lib folder, but we are slowly working to get rid of that such that every module consists only of a single index.js file (for runtime code).
Oh, I know I keep making comments but I keep thinking of new things to say :) I have not yet looked at the new changes, or even really looked through to remind myself of the existing changes, but I think all the features and stuff in here are good. More thought from us (and perhaps the technical committee that now oversees this organization) needs to be done around which changes go into which modules, what new modules should be created, etc.
I suppose I did not expand on my thoughts here... mea culpa, let me explain:
so my thoughts up to this point:
hence the separation and added dependency, mostly due to the extension parsing allowed by spec, as I'm a fan of separation of concerns.
all good, same here :)
as I was reviewing the original work (almost a year old now!) I started thinking of a better way to break this down too. I'm no longer a fan of having the vendor headers included in the core of this module. I believe now it should only concern itself with the RFC 7239 spec, and the vendor specs can be additional, separate modules that can be passed to this one for additional processing (e.g. building the stack of IPs in a desired order across multiple specs). not sure what a good pattern for the latter is though, or how it would propagate across higher dependencies for user-level control? (e.g. a simple pattern can perhaps just be an array of module names, or direct references to functions (that in turn can be loaded from modules) that provide a common object)
now that I'm back at a desktop keyboard, here's an example of how my previous thoughts might shape out:

minimal / default

const forwarded = require('forwarded') // included dependencies: RFC 7239, X-Forwarded-*
let result = forwarded(request)

verbose

const forwarded = require('forwarded')
const rfc7239 = require('forwarded-rfc7239')
const xforwarded = require('forwarded-x-forwarded')
const cloudflare = require('forwarded-cloudflare')
const fastly = require('forwarded-fastly')
// custom order
let options = {
  schemas: [cloudflare, rfc7239, fastly, xforwarded]
}
let result = forwarded(request, options)
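For illustration only, one possible contract behind a `schemas` option is sketched below. It assumes each schema is a function that takes the request and returns `{ addrs: [...] }` or `null`, and that results are concatenated in the caller's order. The schema functions are hypothetical stand-ins written inline rather than published modules, and the merge rule is an assumption; the thread does not settle either.

```javascript
'use strict'

// Hypothetical schema functions: each takes a request and returns
// { addrs: [...] } or null. None of these are published modules.
function xforwarded (req) {
  var header = req.headers['x-forwarded-for']
  if (!header) return null
  return {
    addrs: header.split(',').map(function (ip) { return ip.trim() })
  }
}

function cloudflare (req) {
  var ip = req.headers['cf-connecting-ip']
  return ip ? { addrs: [ip] } : null
}

// Assumed contract: run the schemas in the caller's order and concatenate
// whatever addresses each one reports.
function forwarded (req, options) {
  var schemas = (options && options.schemas) || [xforwarded]
  var result = { addrs: [] }

  schemas.forEach(function (schema) {
    var partial = schema(req)
    if (partial && Array.isArray(partial.addrs)) {
      result.addrs = result.addrs.concat(partial.addrs)
    }
  })

  return result
}

var request = {
  headers: {
    'x-forwarded-for': '203.0.113.7, 198.51.100.2',
    'cf-connecting-ip': '203.0.113.7'
  }
}

console.log(forwarded(request, { schemas: [cloudflare, xforwarded] }).addrs)
```

Passing functions rather than module names keeps the core free of vendor logic, which is the separation discussed above, at the cost of pushing schema selection up to whichever layer calls `forwarded`.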
I think |
would |
There is no reason why it cannot do that, since those are all basically part of the "proxied address". Ideally one should at least get enough information to construct the URL for the request, and just the hostname/IP is not enough information, because it could be on a non-default port, could be using HTTP or HTTPS, and more. Typically if people just wanted to parse their headers, they would directly depend on the module that just does the header parsing they need. The […] I think the long-term plan is that […] It used to also parse things like […] Because we keep going back and forth on this discussion, I still have not even had time to look at your new code or even refresh myself on the existing content of this pull request, sorry!
no worries, I think your point about […] I think we're debating semantics of where the logic sits. The logic used here can be abstracted to […]

TL;DR: I think we're saying the same thing. :) I'll let you review the current code first, then we can discuss further.
Update mocha to version 2.5.1 🚀
Update mocha to version 2.5.3 🚀
Update istanbul to version 0.4.3 🚀
Update mocha to version 3.0.1 🚀
BREAKING CHANGE: This module no longer supports Node.js 0.10
👻😱 Node.js 0.10 is unmaintained 😱👻
Update dependencies to enable Greenkeeper 🌴
Update mocha to the latest version 🚀
TODO

- `forwarded().addrs`
- `hapi-forward` has no tests
- `hapi-forwarded` works
- `proxy-addr` works (with minor fix to its own test)
- `express` works
- `http` server / client
- `Forwarded` header elements using `RFC 7230` token syntax