Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature] Add SSL Certificate Context And Setting For Async/Sync HTTP Requests #6976

Open
wants to merge 57 commits into
base: develop
Choose a base branch
from

Conversation

deeleeramone
Copy link
Contributor

@deeleeramone deeleeramone commented Nov 29, 2024

  1. Why?:

The Requests library accepts an environment variable, REQUESTS_CA_BUNDLE, and AIOHTTP does not. This creates inconsistencies between synchronous and async HTTP requests. Furthermore, defining this variable limits the Requests library to that particular file. If you want to use a self-signed certificate for only some things, it is not straightforward.

This solution combines the specified certificate with the certifi defaults.

  1. What?:

    • Allows a CA certificate file to be specified for verifying HTTPS requests.
      • Self-signed certificates do not need to be part of the system's trust store.
    • Allows "python_settings" in system_settings.json to accept extras.
      • Uses keys http as a nested dictionary.
    • Applies the configuration to the internal helpers:
      • make_request
      • amake_request
      • amake_requests
    • Handles the scenarios where libraries are using the Requests library directly. This is done by manipulating environment variables before and after making the request.
      • yFinance
      • Finviz
      • Posthog
      • OpenBB Hub
    • Allows keyword arguments to be passed to uvicorn.run by storing them as a dictionary to: system_settings.python_settings.uvicorn
  2. Impact:

    • Should have absolutely no impact unless these items are specified.
    • The behavior of Requests environment variable, REQUESTS_CA_BUNDLE, ports directly to the async requests.
    • Allows it to be defined in system_settings.json instead of environment variables.
    • cafile is an equivalent to REQUESTS_CA_BUNDLE, when pointing to a file.
    • When "verify_ssl": false, SSL certificate verification is disabled within all OpenBB functions.
    • Keyword arguments supplied to the "python_settings["uvicorn"]" dictionary are passed directly to uvicorn.run when launching the API as:
      • python -m openbb_core.api.rest_api
      • openbb-api

These items, in system_settings.json, will take precedence over environment variables:

{
    "python_settings": {
        "http": {
            "cafile": "/full/path/to/certificate/localhost.crt",
            "verify_ssl": null,
            "proxy": null,
            "certfile": null,
            "keyfile": null,
        }
    }
}

certfile and keyfile are intended more for future use, the most important key is cafile.

Note: Keyword arguments added to the command line from openbb-api take precedence over the system_settings.json file.

  1. Testing Done:

    • Create a self-signed certificate and start an OpenBB API over HTTPS
    • Add the full path to the .crt file as shown above.
    • Import both requests and openbb_core.provider.utils.helpers.make_request
      • requests.get should fail while make_request succeeds.
      • Set as an environment variable instead, "cafile" is null and add REQUESTS_CA_BUNDLE='/full/path/to/certificate/localhost.crt' to the .env file.
        • Both requests.get and make_request should succeed.
        • Making an outside call using requests.get("https://google.com") will fail, make_request should succeed.

Screenshot 2024-11-29 at 3 01 25 PM

With the environment variable defined, and not system_settings.json, the same A/B can be applied to yfinance.download() vs. `obb.equity.price.historical(provider="yfinance")

This fails because yFinance is only verifying against the self-signed certificate for localhost. openbb_yfinance.utils.helpers.yf_download applies the environment configuration for the duration of the request.

Update to the behavior: the implementation now impacts the yFinance singleton.

Screenshot 2024-11-29 at 3 07 01 PM

@deeleeramone deeleeramone added enhancement Enhancement platform OpenBB Platform v4 PRs for v4 labels Nov 29, 2024
@deeleeramone deeleeramone added the do not merge Label to prevent pull request merge label Nov 29, 2024
@deeleeramone deeleeramone removed the do not merge Label to prevent pull request merge label Dec 1, 2024
Copy link
Contributor

@montezdesousa montezdesousa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a few concerns with this solution.

  1. We are modifying environment variables from within the application and the logic to so in get_certificates and restore_certs is complex to follow.
  2. We are introduction a state inside Env(). I assume that's why we have some lines with Env() to refresh the state (?). This class should be stateless and readonly for consistency and predictability.
  3. The implementation requires us to constantly get and restore certificates, which is not obvious when implementing new requests.

I suggest using a single session that picks configurations and is shared across the app as proposed here #6939.

@deeleeramone
Copy link
Contributor Author

I suggest using a single session that picks configurations and is shared across the app as proposed here #6939.

OK, try it out now, @montezdesousa. I found a way to hack the session into the Posthog handler. The implementation now goes the other way where trust_env is disabled for Requests - which is the default for AIOHTTP - and cut everything down into two helper functions:

  • get_requests_session()
  • get_async_requests_session()

Using any of the utility HTTP request functions will read the system_settings.json and take precedence over any environment variable. Binding session objects to the obb app doesn't accomplish much because it needs to happen at the provider level, and we also don't want singleton behavior because it would mess with threading and concurrency. But, a session of the appropriate type can be passed into the functions as a parameter.

  • make_request
  • amake_request
  • amake_requests

Any unclosed async session object will be closed on exit.

@montezdesousa
Copy link
Contributor

When trying to test I get the same error both with requests and make_request. Also tried to provide the path to the certificate through verify and get the same error. Uvicorn is running over HTTPS and I get access it through the browser fine.

SSLError: HTTPSConnectionPool(host='0.0.0.0', port=8000): Max retries exceeded with url: / (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1133)')))

@deeleeramone
Copy link
Contributor Author

deeleeramone commented Jan 7, 2025

When trying to test I get the same error both with requests and make_request. Also tried to provide the path to the certificate through verify and get the same error. Uvicorn is running over HTTPS and I get access it through the browser fine.

SSLError: HTTPSConnectionPool(host='0.0.0.0', port=8000): Max retries exceeded with url: / (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1133)')))

This error seems to indicate that the self-signed certificate is not being found at all. I know there's lots of ways to make it not work, so rooting out the edge cases would be nice.

What are your exact deployment steps here, is this a network deployment? Is the same .crt file accessible to both uvicorn and the Python environment? Something like:

    "python_settings": {
        "http": {
            "cafile": "/Users/darrenlee/OpenBBUserData/localhost.crt"
        },
        "uvicorn": {
            "ssl_certfile": "/Users/darrenlee/OpenBBUserData/localhost.crt",
            "ssl_keyfile": "/Users/darrenlee/OpenBBUserData/localhost.key"
        }
    }

Was the certificate created by the same version of Python and openssl used in the deployment and environment?

@montezdesousa montezdesousa self-requested a review January 8, 2025 15:04
Copy link
Contributor

@montezdesousa montezdesousa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested and it works as expected, just left some comments and suggestions mostly on code readability.

Comment on lines 99 to 124
def get_python_request_settings() -> dict:
"""Get the python settings."""
# pylint: disable=import-outside-toplevel
from openbb_core.app.service.system_service import SystemService

python_settings = SystemService().system_settings.python_settings.model_dump()
http_settings = python_settings.get("http", {})
allowed_keys = [
"cafile",
"certfile",
"keyfile",
"password",
"verify_ssl",
"fingerprint",
"proxy",
"proxy_auth",
"proxy_headers",
"timeout",
"auth",
"headers",
"cookies",
]

return {
k: v for k, v in http_settings.items() if v is not None and k in allowed_keys
}
Copy link
Contributor

@montezdesousa montezdesousa Jan 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

http_settings should be defined explicitly in openbb_platform/core/openbb_core/app/model/python_settings.py as a dict, this becomes hidden if just mentioned here. That way we also don't need extra_allow=True in PythonSettings

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add these to documentation so they are not hidden.

extra_allow makes it extendable for other futures uses. Adding extra keys to the JSON file won't actually do anything because those keys would not be reached by any code.

I don't see any real benefit to not allowing extras here. Think, VS Code settings file. All the VS Code extensions define specific behaviors and preferences in the same settings file as the main program.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are hidden in the code, documentation is useful for detailing what code cannot convey. The benefit is explicit field definition, one should know all the python settings by looking at the PythonSettings model.

With extra_allow=True you open the door to implicit definitions all over the code with no real benefit, like http (I think a better name would be something like request_settings, http is too general) that only lives inside the helper function get_python_request_settings. If we add a few more like this and it will become hard to track which settings exist.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While you do raise a good point, the intention is to allow others (developers, users, etc.) to extend functionality without having to change code in the core library. This is a major benefit, and has very little downside.

I'll add it into the model, but I feel pretty strongly about making it way easier for people to build custom components and configurations. It won't be hard to keep track of.

I would expect more people to look for this type of information in the documentation - i.e, where you are supposed to find out about the available settings - rather than trying to figure out how to import some subclass just to read the annotations.

http settings is a more accurate name for what the key is responsible for, and it doesn't have to exist only for serving those specific keys. It's filtered in the function to not allow other settings in those functions, but there can easily be other cases where you need additional settings - like maybe httpx or some other library somebody wants to use.

openbb_platform/core/openbb_core/provider/utils/helpers.py Outdated Show resolved Hide resolved
@montezdesousa
Copy link
Contributor

When trying to test I get the same error both with requests and make_request. Also tried to provide the path to the certificate through verify and get the same error. Uvicorn is running over HTTPS and I get access it through the browser fine.

SSLError: HTTPSConnectionPool(host='0.0.0.0', port=8000): Max retries exceeded with url: / (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1133)')))

This error seems to indicate that the self-signed certificate is not being found at all. I know there's lots of ways to make it not work, so rooting out the edge cases would be nice.

What are your exact deployment steps here, is this a network deployment? Is the same .crt file accessible to both uvicorn and the Python environment? Something like:

    "python_settings": {
        "http": {
            "cafile": "/Users/darrenlee/OpenBBUserData/localhost.crt"
        },
        "uvicorn": {
            "ssl_certfile": "/Users/darrenlee/OpenBBUserData/localhost.crt",
            "ssl_keyfile": "/Users/darrenlee/OpenBBUserData/localhost.key"
        }
    }

Was the certificate created by the same version of Python and openssl used in the deployment and environment?

Thanks for the help. I'm using mkcert and it turns out the cafile needs to come from /Users<username>/Library/Application Support/mkcert/rootCA.pem and not the self signed certificate itself.

Copy link
Contributor

@montezdesousa montezdesousa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm

Edit: updating review

@montezdesousa montezdesousa self-requested a review January 13, 2025 10:29
Comment on lines 99 to 124
def get_python_request_settings() -> dict:
"""Get the python settings."""
# pylint: disable=import-outside-toplevel
from openbb_core.app.service.system_service import SystemService

python_settings = SystemService().system_settings.python_settings.model_dump()
http_settings = python_settings.get("http", {})
allowed_keys = [
"cafile",
"certfile",
"keyfile",
"password",
"verify_ssl",
"fingerprint",
"proxy",
"proxy_auth",
"proxy_headers",
"timeout",
"auth",
"headers",
"cookies",
]

return {
k: v for k, v in http_settings.items() if v is not None and k in allowed_keys
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are hidden in the code, documentation is useful for detailing what code cannot convey. The benefit is explicit field definition, one should know all the python settings by looking at the PythonSettings model.

With extra_allow=True you open the door to implicit definitions all over the code with no real benefit, like http (I think a better name would be something like request_settings, http is too general) that only lives inside the helper function get_python_request_settings. If we add a few more like this and it will become hard to track which settings exist.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Enhancement platform OpenBB Platform v4 PRs for v4
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG][FR] Impossible to specify SSL certification
3 participants