http: allow raw header capture (#347) #349

Merged: 1 commit, Feb 18, 2024

Conversation

codyprime

See also: #347

The golang textproto library does a few things when parsing the HTTP
headers:

  • consumes some whitespace characters (e.g. \r\n)
  • canonicalizes the header keys (e.g. "content-type" => "Content-Type")
  • moves the headers into a map
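
To make the three behaviors above concrete, here is a minimal sketch that uses only the standard library's net/textproto package. The helper name parseHeaders and the sample header block are assumptions made for illustration; they are not part of zgrab2.

```go
package main

import (
	"bufio"
	"fmt"
	"net/textproto"
	"strings"
)

// parseHeaders (hypothetical helper) parses a raw header block the way
// net/textproto does and returns the resulting map-backed MIMEHeader.
func parseHeaders(raw string) (textproto.MIMEHeader, error) {
	r := textproto.NewReader(bufio.NewReader(strings.NewReader(raw)))
	return r.ReadMIMEHeader()
}

func main() {
	// Sample input with non-canonical keys in a specific order.
	raw := "content-type: text/html\r\nx-b: 2\r\nX-A: 1\r\n\r\n"

	h, err := parseHeaders(raw)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}

	// Keys are canonicalized, so the original spelling is no longer available.
	fmt.Println(textproto.CanonicalMIMEHeaderKey("content-type"))
	fmt.Println(h.Get("Content-Type"))

	// MIMEHeader is a map, so iteration order is unrelated to wire order.
	for k, v := range h {
		fmt.Println(k, v)
	}
}
```

After parsing, neither the original key spelling nor the header order can be recovered from the map, which is the information a scanner may want to keep.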

This all makes sense for an HTTP client, but users of a scanner may want
the exact headers as sent, so they can match on order, non-canonical
keys, etc.

This PR adds that option, enabled by passing '--raw-headers' to an HTTP
scan. It is implemented as a tee reader on the pconn interface that
copies the bytes before the bufio reader is put in place. The tee copy
can be disabled once the headers have been read, so that no memory is
wasted while consuming the HTTP body.

While denoted as "raw headers", this also captures the raw status
line.

(cherry picked from commit 83e55e0)
Signed-off-by: Jeff Cody <jcody@censys.io>

How to Test

echo "8.8.8.8"  | ./zgrab2 http -p 443 --use-https --raw-headers --max-redirects=0 | \
    jq -r .data.http.result.response.headers_raw | base64 -d

The output on stdout should then look something like:

X-Content-Type-Options: nosniff
Access-Control-Allow-Origin: *
Location: https://dns.google/
Date: Tue, 12 Apr 2022 15:53:04 GMT
Content-Type: text/html; charset=UTF-8
Server: HTTP server (unknown)
Content-Length: 216
X-XSS-Protection: 0
X-Frame-Options: SAMEORIGIN
Alt-Svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000,h3-Q050=":443"; ma=2592000,h3-Q046=":443"; ma=2592000,h3-Q043=":443"; ma=2592000,quic=":443"; ma=2592000; v="46,43"

Notes & Caveats

This has already been merged & tested on the TLS 1.3 feature branch.

@codyprime

@p-l-

p-l- commented Apr 21, 2022

@codyprime excellent! Thank you

@elliotcubit elliotcubit left a comment

Changes look good to me.

Same as the other PR that was approved & sanity tested against a few hosts in the wild with interesting headers.

dav3 commented Jun 14, 2022

@codyprime the "--raw-headers" extension works well until you use "--max-redirects=1" and hit a device that does a redirect. In that case, zgrab2 panics, as shown below:

$ echo 45.76.78.21 |  ./zgrab2 http -p 443 --source-ip=$IP3 --use-https --raw-headers --max-redirects=1
INFO[0000] started grab at 2022-06-14T16:10:27Z
panic: runtime error: slice bounds out of range [-3:]

goroutine 1046 [running]:
github.com/zmap/zgrab2/lib/http.(*TeeConn).Bytes(...)
        /home/dav3/src/zgrab2-jcody-http-raw-headers-main/lib/http/transport.go:1166
github.com/zmap/zgrab2/lib/http.readResponse(0xc000288940, 0xc000618400)
        /home/dav3/src/zgrab2-jcody-http-raw-headers-main/lib/http/response.go:227 +0x71e
github.com/zmap/zgrab2/lib/http.ReadResponseTee(...)
        /home/dav3/src/zgrab2-jcody-http-raw-headers-main/lib/http/response.go:170
github.com/zmap/zgrab2/lib/http.(*persistConn).readResponse(0xc00035a240, {0xc000618400, 0xc000118540, 0x1, 0x0, 0xc0001184e0}, 0x0)
        /home/dav3/src/zgrab2-jcody-http-raw-headers-main/lib/http/transport.go:1708 +0x94
github.com/zmap/zgrab2/lib/http.(*persistConn).readLoop(0xc00035a240)
        /home/dav3/src/zgrab2-jcody-http-raw-headers-main/lib/http/transport.go:1550 +0x3a5
created by github.com/zmap/zgrab2/lib/http.(*Transport).dialConn
        /home/dav3/src/zgrab2-jcody-http-raw-headers-main/lib/http/transport.go:1138 +0x16dc

zgrab2 does not panic with either of the following flag combinations:

  • --raw-headers --max-redirects=0
  • --max-redirects=1
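
The implementation of TeeConn.Bytes is not shown in this thread, so the cause of the panic is not established here. As a purely hypothetical illustration of how this class of panic arises in Go, the sketch below computes a slice start by subtraction; when the buffer holds fewer bytes than assumed, the start goes negative and buf[start:] panics with a message of the form seen in the trace. The helper tailFrom and its parameters are invented names. Clamping avoids the crash, but it may also hide an underlying accounting bug, so it is not offered as the fix for this issue.

```go
package main

import "fmt"

// tailFrom (hypothetical) returns buf[start:], clamping start into
// the valid range [0, len(buf)] so the slice expression cannot panic.
func tailFrom(buf []byte, start int) []byte {
	if start < 0 {
		start = 0
	}
	if start > len(buf) {
		start = len(buf)
	}
	return buf[start:]
}

func main() {
	buf := []byte("ab")

	// An offset derived from an assumed length of 5 ends up negative.
	start := len(buf) - 5

	// Writing buf[start:] here would panic with:
	//   runtime error: slice bounds out of range [-3:]
	fmt.Printf("%q\n", tailFrom(buf, start))
}
```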

@codyprime

@dav3 Thanks, I can confirm. I'm working on a fix now; I'll push it to this PR and cherry-pick it to the TLS 1.3 feature branch when it's done.

@zakird zakird merged commit 1e97dd8 into zmap:master Feb 18, 2024