get_destination for local server not using direct connection since 0.8.0 #161

Closed
marcOcram opened this issue Jul 22, 2022 · 8 comments

@marcOcram

Hello,

We have a local server, server01. Up to version 0.7.2, Px used a direct connection to that server:

MainProcess: Thread_0: 1658467520: /do_GET/do_curl/__init__: b9bd7302aa1fbccb742fc69ec481329f41812d6f: New curl instance
MainProcess: Thread_0: 1658467520: /do_curl/__init__/_setup: b9bd7302aa1fbccb742fc69ec481329f41812d6f: GET http://server01:8080/aaa/bbb/ccc/ddd/eee/fff/ggg?hhh=iii using HTTP/1.1
MainProcess: Thread_0: 1658467520: /handle_one_request/do_GET/do_curl: b9bd7302aa1fbccb742fc69ec481329f41812d6f: Path = http://server01:8080/aaa/bbb/ccc/ddd/eee/fff/ggg?hhh=iii
MainProcess: Thread_0: 1658467520: /find_proxy_for_url/find_proxy_for_url/get_netloc: netloc = ('server01', 8080), path = /aaa/bbb/ccc/ddd/eee/fff/ggg?hhh=iii
MainProcess: Thread_0: 1658467520: /do_GET/do_curl/get_destination: b9bd7302aa1fbccb742fc69ec481329f41812d6f: Direct connection

Since version 0.8.0 it goes through the proxy, which it should not (the log below is from version 0.8.3 because of #160):

MainProcess: Thread_0: 1658467178: /do_GET/do_curl/__init__: 4009a79a8679119cf2dc3d5bdce8e1b6ca6cde6e: New curl instance
MainProcess: Thread_0: 1658467178: /do_curl/__init__/_setup: 4009a79a8679119cf2dc3d5bdce8e1b6ca6cde6e: GET http://server01:8080/aaa/bbb/ccc/ddd/eee/fff/ggg?hhh=iii using HTTP/1.1
MainProcess: Thread_0: 1658467178: /handle_one_request/do_GET/do_curl: 4009a79a8679119cf2dc3d5bdce8e1b6ca6cde6e: Path = http://server01:8080/aaa/bbb/ccc/ddd/eee/fff/ggg?hhh=iii
MainProcess: Thread_0: 1658467178: /find_proxy_for_url/find_proxy_for_url/get_netloc: netloc = ('server01', 8080), path = /aaa/bbb/ccc/ddd/eee/fff/ggg?hhh=iii
MainProcess: Thread_0: 1658467178: /find_proxy_for_url/find_proxy_for_url/__init__: Loading PAC utils
MainProcess: Thread_0: 1658467178: /find_proxy_for_url/find_proxy_for_url/load_url: Loading PAC url: http://xxx/web.pac
MainProcess: Thread_0: 1658467178: /find_proxy_for_url/load_url/__init__: 5fccabdc3c2be4dfcde49a1b9313248d57a7abe2: New curl instance
MainProcess: Thread_0: 1658467178: /load_url/__init__/_setup: 5fccabdc3c2be4dfcde49a1b9313248d57a7abe2: GET http://xxx/web.pac using HTTP/1.1
MainProcess: Thread_0: 1658467178: /find_proxy_for_url/load_url/buffer: 5fccabdc3c2be4dfcde49a1b9313248d57a7abe2: Setting up buffers for bridge
MainProcess: Thread_0: 1658467178: /load_url/buffer/bridge: 5fccabdc3c2be4dfcde49a1b9313248d57a7abe2: Setting up bridge
MainProcess: Thread_0: 1658467178: /perform/do/add: 5fccabdc3c2be4dfcde49a1b9313248d57a7abe2: Handles = 0
MainProcess: Thread_0: 1658467178: /do/add/_add_handle: 5fccabdc3c2be4dfcde49a1b9313248d57a7abe2: Add handle
MainProcess: Thread_0: 1658467178: /do/add/_add_handle: 5fccabdc3c2be4dfcde49a1b9313248d57a7abe2: Added handle
MainProcess: Thread_0: 1658467178: /_perform/_socket_action/_debug_callback: 5fccabdc3c2be4dfcde49a1b9313248d57a7abe2: Curl info: Trying aa.bb.cc.dd:80...
MainProcess: Thread_0: 1658467178: /_perform/_socket_action/_debug_callback: 5fccabdc3c2be4dfcde49a1b9313248d57a7abe2: Curl info: Connected to xxx (aa.bb.cc.dd) port 80 (#0)
MainProcess: Thread_0: 1658467178: /_perform/_socket_action/_debug_callback: 5fccabdc3c2be4dfcde49a1b9313248d57a7abe2: Sent header => GET /web.pac HTTP/1.1
MainProcess: Thread_0: 1658467178: /_perform/_socket_action/_debug_callback: 5fccabdc3c2be4dfcde49a1b9313248d57a7abe2: Sent header => Host: xxx
MainProcess: Thread_0: 1658467178: /_perform/_socket_action/_debug_callback: 5fccabdc3c2be4dfcde49a1b9313248d57a7abe2: Sent header => Accept: */*
MainProcess: Thread_0: 1658467178: /_perform/_socket_action/_debug_callback: 5fccabdc3c2be4dfcde49a1b9313248d57a7abe2: Curl info: Mark bundle as not supporting multiuse
MainProcess: Thread_0: 1658467178: /_perform/_socket_action/_debug_callback: 5fccabdc3c2be4dfcde49a1b9313248d57a7abe2: Received header <= HTTP/1.1 200 OK
MainProcess: Thread_0: 1658467178: /_perform/_socket_action/_debug_callback: 5fccabdc3c2be4dfcde49a1b9313248d57a7abe2: Received header <= Content-Type: text/plain
MainProcess: Thread_0: 1658467178: /_perform/_socket_action/_debug_callback: 5fccabdc3c2be4dfcde49a1b9313248d57a7abe2: Received header <= Last-Modified: Wed, 25 May 2022 14:13:46 GMT
MainProcess: Thread_0: 1658467178: /_perform/_socket_action/_debug_callback: 5fccabdc3c2be4dfcde49a1b9313248d57a7abe2: Received header <= Accept-Ranges: bytes
MainProcess: Thread_0: 1658467178: /_perform/_socket_action/_debug_callback: 5fccabdc3c2be4dfcde49a1b9313248d57a7abe2: Received header <= ETag: "97d7e7a54170d81:0"
MainProcess: Thread_0: 1658467178: /_perform/_socket_action/_debug_callback: 5fccabdc3c2be4dfcde49a1b9313248d57a7abe2: Received header <= Server: Microsoft-IIS/8.5
MainProcess: Thread_0: 1658467178: /_perform/_socket_action/_debug_callback: 5fccabdc3c2be4dfcde49a1b9313248d57a7abe2: Received header <= X-Powered-By: ASP.NET
MainProcess: Thread_0: 1658467178: /_perform/_socket_action/_debug_callback: 5fccabdc3c2be4dfcde49a1b9313248d57a7abe2: Received header <= Date: Fri, 22 Jul 2022 05:19:39 GMT
MainProcess: Thread_0: 1658467178: /_perform/_socket_action/_debug_callback: 5fccabdc3c2be4dfcde49a1b9313248d57a7abe2: Received header <= Content-Length: 14768
MainProcess: Thread_0: 1658467178: /_perform/_socket_action/_header_callback: 5fccabdc3c2be4dfcde49a1b9313248d57a7abe2: Done sending headers
MainProcess: Thread_0: 1658467178: /_perform/_socket_action/_write_callback: 5fccabdc3c2be4dfcde49a1b9313248d57a7abe2: Wrote 5094 bytes
MainProcess: Thread_0: 1658467179: /_perform/_socket_action/_write_callback: 5fccabdc3c2be4dfcde49a1b9313248d57a7abe2: Wrote 9674 bytes
MainProcess: Thread_0: 1658467179: /_perform/_socket_action/_debug_callback: 5fccabdc3c2be4dfcde49a1b9313248d57a7abe2: Curl info: Connection #0 to host xxx left intact
MainProcess: Thread_0: 1658467179: /perform/remove/_remove_handle: 5fccabdc3c2be4dfcde49a1b9313248d57a7abe2: Remove handle: 
MainProcess: Thread_0: 1658467179: /do_GET/do_curl/get_destination: 4009a79a8679119cf2dc3d5bdce8e1b6ca6cde6e: Proxy = [('ee.ff.gg.hh', 3128)]

Is it intended that Px now uses the proxy instead of a direct connection?

@genotrance
Owner

On v0.8.0+, PAC files specified to Px directly (rather than in Internet Options) are processed on Windows using QuickJS instead of WinHttp. This is because Windows doesn't like local PAC files anymore. Px used to work around that by hosting the PAC file on its own web server, but the workaround was removed once QuickJS became a requirement for processing PAC files on non-Windows platforms. Using QuickJS everywhere made for a simpler, consistent solution for both local and remote PAC files specified to Px. In your case the PAC file is a URL, but it still goes through the QuickJS engine since it isn't configured in Internet Options.

It would help to have more context from the v0.7.2 log. I'm curious whether the PAC file was loaded and processed by WinHttp.

I'm also curious what the PAC file looks like and why WinHttp (assuming it was processing the PAC file) decided on a direct connection, whereas QuickJS thinks otherwise. It is likely that the handling of the return value from PAC file processing is still wrong. Adding print(proxies) after line 60 will probably provide some insight.

@marcOcram
Author

Here is the minified, anonymized web.pac. Note that our server has an IP address starting with 10.62.

function FindProxyForURL(url, host)
{
  if (isPlainHostName(host)) return "DIRECT";

  if (isInNet(host, "172.xx.xx.x", "255.255.255.0")) return "DIRECT";  

  if (dnsDomainIs(host,"xx.xyz")) return "PROXY aa.bb.cc.dd:3128";
  if (dnsDomainIs(host,"xx.xyz")) return "DIRECT";

  if (dnsDomainIs(host,"xx.xyz")) return "PROXY aa.bb.cc.dd:3128";
  
  // deny
   if (dnsDomainIs(host,"xx.xyz")) return "PROXY 0.0.0.0:3128";
   
  // other proxies
  if (dnsDomainIs(host,"xx.xyz")) return "PROXY a2.b2.c2.d2:8080; " + "PROXY a3.b3.c3.d3:8080";
  if (dnsDomainIs(host,"xx.xyz")) return "PROXY a4.b4.c4.d4:8080";
  if (dnsDomainIs(host,"xx.xyz")) return "PROXY a5.b5.c5.d5:8080";
  if (dnsDomainIs(host,"xx.xyz")) return "PROXY a6.b6.c6.d6:8080";
  if (dnsDomainIs(host,"xx.xyz")) return "PROXY a7.b7.c7.d7:8080";

  // direct connection
  if (dnsDomainIs(host,"xx.xyz")) return "DIRECT";
  if (host.substring(0,6) == "10.62.") return "DIRECT";
  if (host.substring(0,8) == "192.168.") return "DIRECT";  

  // residue
  return "PROXY aa.bb.cc.dd:3128";
}

The additional debug line can only be added in version 0.8.0+. For version 0.7.2 you would have to point me to a different file, as pac.py does not exist there.

@genotrance
Owner

Thanks for taking the time to provide this info.

The debug output is only needed for v0.8.2. The older version doesn’t have access to the PAC file since WinHttp does all that.

Which if statement should your server match in this list? My first thought is the second-to-last one, but that one only matches when you use the actual IP address itself.
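To illustrate why the `10.62.` rule only fires for IP literals: the PAC function receives the host as it appears in the URL, and `substring` is a plain string comparison with no DNS lookup involved. A minimal Python sketch of that single check follows; the function name is hypothetical and `10.62.0.5` is a made-up address used only for illustration.

```python
def matches_10_62_rule(host: str) -> bool:
    # Mirrors host.substring(0,6) == "10.62." from the PAC file.
    # Python slicing, like JS substring, tolerates strings shorter than 6 chars.
    return host[:6] == "10.62."


print(matches_10_62_rule("server01"))   # False: the hostname is never resolved
print(matches_10_62_rule("10.62.0.5"))  # True: only an IP literal matches
```

So a request to http://server01:8080/ would fall through this rule even though the server's address starts with 10.62.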

Since everything is xx.xyz, I’m not able to tell what is really happening here.

@marcOcram
Author

marcOcram commented Jul 22, 2022

Alright, v0.8.2 outputs the expected PROXY aa.bb.cc.dd:3128. server01 does not match anything inside web.pac.

I do not know if I can share the details of the web.pac, but I hosted it myself and changed the IP of the proxy to detect which line gets returned. It is the residue line, so nothing matches, as expected.

I also tested v0.7.2 with a web.pac which returns PROXY aa.bb.cc.dd:3128 for every URL. The proxy does not get used in 0.7.2.

function FindProxyForURL(url, host) 
{ 
  return "PROXY aa.bb.cc.dd:3128";
}

EDIT: after some more testing, v0.7.2 uses the proxy now, too

EDIT: With WinHttp in version 0.7.2, server01 matches the line if (isPlainHostName(host)) return "DIRECT"; in 0.8.3 it does not match this line.

@genotrance
Owner

with winhttp it matches the line if (isPlainHostName(host)) return "DIRECT"; in version 0.7.2, in 0.8.3 it does not match this line.

That is what I suspected.

px/px/pacutils.py

Lines 61 to 63 in 39df09c

function isPlainHostName(host) {
return host.search("(\\.)|:") == -1;
}

Either QuickJS doesn't handle search() correctly or the function is wrongly implemented. The code comes from Mozilla, so I'm not sure what to make of it.

@genotrance
Owner

Looks like it is QuickJS - I tested the above function in Chrome's console and it returned true for "server01" but QuickJS returns false.

import quickjs
f = quickjs.Function("isPlainHostName", """
  function isPlainHostName(host) {
      return host.search("(\\.)|:") == -1;
  }
""")

print(f("server01"))

Prints False.

@marcOcram
Author

Maybe it is an escaping issue or an issue with the Python wrapper of QuickJS not forwarding the escaping correctly. I do not know exactly how Python escapes characters but if you use three backslashes it works and matches dots or colons (just as intended). Maybe you have to do some kind of multiple escaping due to the layering between wrapper and quickjs.

Try the following:

import quickjs
f = quickjs.Function("isPlainHostName", """
  function isPlainHostName(host) {
      return host.search("(\\\.)|:") == -1;
  }
""")

print(f("server01"))

Prints True.
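One possible explanation, sketched without QuickJS: the pattern is decoded twice, once by Python's string literal rules and once by JavaScript's. In a non-raw Python string, `\\.` reaches the JS source as `\.`, which JavaScript decodes to a bare `.` that matches any character. With three backslashes, Python leaves the unrecognized `\.` escape untouched, so the JS source receives `\\.` and the regex gets a literal dot. The helper `js_string_value` below is a hypothetical, simplified stand-in for JavaScript's string literal decoding (it only handles a backslash followed by an ordinary character), and Python's `re` stands in for the JS regex engine. Treat it as an illustration, not as what QuickJS does internally.

```python
import re


def js_string_value(literal_body: str) -> str:
    # Simplified model (assumption): JS drops the backslash in front of
    # an ordinary character, so "\\" becomes "\" and "\." becomes ".".
    return re.sub(r"\\(.)", lambda m: m.group(1), literal_body)


def is_plain_host_name(js_literal_body: str, host: str) -> bool:
    # Mirrors host.search(pattern) == -1 from pacutils.py.
    pattern = js_string_value(js_literal_body)
    return re.search(pattern, host) is None


NON_RAW = "(\\.)|:"  # the JS source sees (\.)|:  and the regex becomes (.)|:
RAW = r"(\\.)|:"     # the JS source sees (\\.)|: and the regex becomes (\.)|:

print(is_plain_host_name(NON_RAW, "server01"))          # False
print(is_plain_host_name(RAW, "server01"))              # True
print(is_plain_host_name(RAW, "server01.example.com"))  # False
```

Wrapping the JavaScript source in a raw string (`r"""..."""`) would be one way to avoid the double decoding. Whether v0.8.4 took that approach is not stated in this thread.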

@genotrance
Owner

This should be fixed in v0.8.4.
