SNOW-1904122: Retryable client error isn't being retried #2161

Open
zanebclark opened this issue Jan 30, 2025 · 3 comments
Labels: bug, status-triage (Issue is under initial triage)


zanebclark commented Jan 30, 2025

Python version

Python 3.10.7 (tags/v3.10.7:6cc6b13, Sep 5 2022, 14:08:36) [MSC v.1933 64 bit (AMD64)]

Operating system and processor architecture

Windows-10-10.0.19045-SP0

Installed packages

aiohttp==3.8.3
aiosignal==1.3.2
anyio==4.3.0
asn1crypto==1.5.1
async-timeout==4.0.3
attrs==25.1.0
black==24.4.2
boto3==1.24.70
boto3-stubs==1.34.125
botocore==1.27.96
botocore-stubs==1.34.125
certifi==2024.12.14
cffi==1.17.1
charset-normalizer==2.1.1
click==8.1.7
cloudpickle==2.2.1
colorama==0.4.6
cryptography==44.0.0
exceptiongroup==1.2.2
filelock==3.17.0
frozenlist==1.5.0
gitdb==4.0.11
GitPython==3.1.43
idna==3.10
iniconfig==2.0.0
jmespath==1.0.1
multidict==6.1.0
mypy-boto3-dynamodb==1.34.114
mypy-boto3-s3==1.34.120
mypy-extensions==1.0.0
numpy==1.26.4
packaging==24.2
pandas==2.2.2
pathspec==0.12.1
platformdirs==4.3.6
pluggy==1.5.0
propcache==0.2.1
protobuf==5.29.3
psycopg2-binary==2.9.10
py4j==0.10.9.7
pyarrow==16.1.0
pycparser==2.22
PyJWT==2.10.1
pyOpenSSL==24.3.0
pyspark==3.5.1
pyspark-stubs==3.0.0.post3
pytest==8.3.4
pytest-asyncio==0.25.2
python-dateutil==2.9.0.post0
pytz==2024.2
PyYAML==6.0.2
requests==2.32.3
s3transfer==0.6.2
six==1.17.0
smmap==5.0.1
sniffio==1.3.1
snowflake-connector-python==3.13.2
snowflake-snowpark-python==1.26.0
sortedcontainers==2.4.0
structlog==25.1.0
tomli==2.2.1
tomlkit==0.13.2
types-awscrt==0.20.12
types-s3transfer==0.10.1
typing_extensions==4.12.2
tzdata==2025.1
tzlocal==5.2
urllib3==1.26.20
yarl==1.18.3

What did you do?

An ECONNRESET error (surfaced as a ProtocolError) is usually treated as a retryable error, as seen here. However, if the ECONNRESET occurs while the response body is being read, the retryable ProtocolError is wrapped in a ChunkedEncodingError, which is not handled as retryable. I'd like to add ChunkedEncodingError to the list of retryable errors.
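
Until the connector treats this error as retryable, one possible application-level workaround is to retry the call yourself. The following is a minimal sketch, not the connector's own retry logic: `call_with_retry`, `make_flaky`, and `StandInChunkedEncodingError` are hypothetical names introduced here so the example runs without Snowflake installed.

```python
import time


def call_with_retry(fn, retry_on, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call fn(); retry with exponential backoff if it raises one of retry_on."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the original error
            sleep(base_delay * 2 ** (attempt - 1))


class StandInChunkedEncodingError(Exception):
    """Hypothetical stand-in for the vendored ChunkedEncodingError."""


def make_flaky(fail_times):
    """Build a callable that fails fail_times times, then returns its call count."""
    calls = {"n": 0}

    def flaky():
        calls["n"] += 1
        if calls["n"] <= fail_times:
            raise StandInChunkedEncodingError("Connection broken")
        return calls["n"]

    return flaky


if __name__ == "__main__":
    result = call_with_retry(make_flaky(2), (StandInChunkedEncodingError,), base_delay=0.01)
    print("succeeded on call", result)
```

In real code, the exception class would be imported from `snowflake.connector.vendored.requests.exceptions` (the module path shown in the traceback) and `fn` would wrap the `cursor.execute(...)` call. Because the reset happens while the response is being read, the server may already have executed the statement, so blind retries are only safe for idempotent statements.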

Can you set logging to DEBUG and collect the logs?

25/01/30 16:38:15 ERROR ProcessLauncher: Error from Python:Traceback (most recent call last):
  File "/home/spark/.local/lib/python3.10/site-packages/snowflake/connector/vendored/urllib3/contrib/pyopenssl.py", line 318, in recv_into
    return self.connection.recv_into(*args, **kwargs)
  File "/home/spark/.local/lib/python3.10/site-packages/OpenSSL/SSL.py", line 2268, in recv_into
    self._raise_ssl_error(self._ssl, result)
  File "/home/spark/.local/lib/python3.10/site-packages/OpenSSL/SSL.py", line 1962, in _raise_ssl_error
    raise SysCallError(errno, errorcode.get(errno))
OpenSSL.SSL.SysCallError: (104, 'ECONNRESET')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/spark/.local/lib/python3.10/site-packages/snowflake/connector/vendored/urllib3/response.py", line 444, in _error_catcher
    yield
  File "/home/spark/.local/lib/python3.10/site-packages/snowflake/connector/vendored/urllib3/response.py", line 828, in read_chunked
    self._update_chunk_length()
  File "/home/spark/.local/lib/python3.10/site-packages/snowflake/connector/vendored/urllib3/response.py", line 758, in _update_chunk_length
    line = self._fp.fp.readline()
  File "/usr/local/lib/python3.10/socket.py", line 705, in readinto
    return self._sock.recv_into(b)
  File "/home/spark/.local/lib/python3.10/site-packages/snowflake/connector/vendored/urllib3/contrib/pyopenssl.py", line 323, in recv_into
    raise SocketError(str(e))
OSError: (104, 'ECONNRESET')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/spark/.local/lib/python3.10/site-packages/snowflake/connector/vendored/requests/models.py", line 816, in generate
    yield from self.raw.stream(chunk_size, decode_content=True)
  File "/home/spark/.local/lib/python3.10/site-packages/snowflake/connector/vendored/urllib3/response.py", line 624, in stream
    for line in self.read_chunked(amt, decode_content=decode_content):
  File "/home/spark/.local/lib/python3.10/site-packages/snowflake/connector/vendored/urllib3/response.py", line 816, in read_chunked
    with self._error_catcher():
  File "/usr/local/lib/python3.10/contextlib.py", line 153, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/home/spark/.local/lib/python3.10/site-packages/snowflake/connector/vendored/urllib3/response.py", line 461, in _error_catcher
    raise ProtocolError("Connection broken: %r" % e, e)
snowflake.connector.vendored.urllib3.exceptions.ProtocolError: ('Connection broken: OSError("(104, \'ECONNRESET\')")', OSError("(104, 'ECONNRESET')"))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/tmp/GW_S3_to_Snowflake_MRG_prod.py", line 1344, in <module>
    main(
  File "/tmp/GW_S3_to_Snowflake_MRG_prod.py", line 1263, in main
    raise exc
  File "/tmp/GW_S3_to_Snowflake_MRG_prod.py", line 1250, in main
    exceptions = handler.load_tables_to_mrg(
  File "/tmp/GW_S3_to_Snowflake_MRG_prod.py", line 1121, in load_tables_to_mrg
    future.result()
  File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 439, in result
    return self.__get_result()
  File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 391, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/tmp/GW_S3_to_Snowflake_MRG_prod.py", line 753, in load_to_mrg
    self.load_to_s3(executor=s3_executor, handler=handler, cursor=cursor)
  File "/tmp/GW_S3_to_Snowflake_MRG_prod.py", line 724, in load_to_s3
    set_jc_load_status_post_s3_load(
  File "/tmp/GW_S3_to_Snowflake_MRG_prod.py", line 221, in set_jc_load_status_post_s3_load
    insert_into_jc_process_log(
  File "/tmp/GW_S3_to_Snowflake_MRG_prod.py", line 158, in insert_into_jc_process_log
    cursor.execute(
  File "/home/spark/.local/lib/python3.10/site-packages/snowflake/connector/cursor.py", line 994, in execute
    ret = self._execute_helper(query, **kwargs)
  File "/home/spark/.local/lib/python3.10/site-packages/snowflake/connector/cursor.py", line 700, in _execute_helper
    ret = self._connection.cmd_query(
  File "/home/spark/.local/lib/python3.10/site-packages/snowflake/connector/connection.py", line 1368, in cmd_query
    ret = self.rest.request(
  File "/home/spark/.local/lib/python3.10/site-packages/snowflake/connector/network.py", line 501, in request
    return self._post_request(
  File "/home/spark/.local/lib/python3.10/site-packages/snowflake/connector/network.py", line 749, in _post_request
    ret = self.fetch(
  File "/home/spark/.local/lib/python3.10/site-packages/snowflake/connector/network.py", line 864, in fetch
    ret = self._request_exec_wrapper(
  File "/home/spark/.local/lib/python3.10/site-packages/snowflake/connector/network.py", line 992, in _request_exec_wrapper
    raise e
  File "/home/spark/.local/lib/python3.10/site-packages/snowflake/connector/network.py", line 914, in _request_exec_wrapper
    return_object = self._request_exec(
  File "/home/spark/.local/lib/python3.10/site-packages/snowflake/connector/network.py", line 1191, in _request_exec
    raise err
  File "/home/spark/.local/lib/python3.10/site-packages/snowflake/connector/network.py", line 1081, in _request_exec
    raw_ret = session.request(
  File "/home/spark/.local/lib/python3.10/site-packages/snowflake/connector/vendored/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/spark/.local/lib/python3.10/site-packages/snowflake/connector/vendored/requests/sessions.py", line 747, in send
    r.content
  File "/home/spark/.local/lib/python3.10/site-packages/snowflake/connector/vendored/requests/models.py", line 899, in content
    self._content = b"".join(self.iter_content(CONTENT_CHUNK_SIZE)) or b""
  File "/home/spark/.local/lib/python3.10/site-packages/snowflake/connector/vendored/requests/models.py", line 818, in generate
    raise ChunkedEncodingError(e)
snowflake.connector.vendored.requests.exceptions.ChunkedEncodingError: ('Connection broken: OSError("(104, \'ECONNRESET\')")', OSError("(104, 'ECONNRESET')"))
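
The traceback above is an implicit exception chain ("During handling of the above exception, another exception occurred"), so the original ECONNRESET is still reachable from the outermost exception. The sketch below shows one way to detect it by walking `__cause__` and `__context__`. The helper names and the stand-in exception classes are hypothetical. Note that `pyopenssl.py` re-raises the error as `SocketError(str(e))`, so the `errno` attribute is not set on that `OSError` and the check also falls back to matching the message text.

```python
import errno


def iter_exception_chain(exc):
    """Yield exc and each exception linked through __cause__ or __context__."""
    seen = set()
    while exc is not None and id(exc) not in seen:
        seen.add(id(exc))
        yield exc
        exc = exc.__cause__ or exc.__context__


def is_connection_reset(exc):
    """Return True if any OSError in the chain looks like an ECONNRESET."""
    for link in iter_exception_chain(exc):
        if isinstance(link, OSError):
            # errno is None when the OSError was built from a plain string,
            # so fall back to matching the message text.
            if link.errno == errno.ECONNRESET or "ECONNRESET" in str(link):
                return True
    return False


class StandInProtocolError(Exception):
    """Hypothetical stand-in for the vendored urllib3 ProtocolError."""


class StandInChunkedEncodingError(Exception):
    """Hypothetical stand-in for the vendored requests ChunkedEncodingError."""


def simulate_failure():
    """Rebuild the chain from the traceback: OSError, then two wrappers."""
    try:
        try:
            raise OSError("(104, 'ECONNRESET')")
        except OSError as e:
            raise StandInProtocolError("Connection broken: %r" % e)
    except StandInProtocolError as e:
        raise StandInChunkedEncodingError(e)


if __name__ == "__main__":
    try:
        simulate_failure()
    except StandInChunkedEncodingError as exc:
        print("connection reset in chain:", is_connection_reset(exc))
```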
@github-actions github-actions bot changed the title Retryable client error isn't being retried SNOW-1904122: Retryable client error isn't being retried Jan 30, 2025
@sfc-gh-sghosh sfc-gh-sghosh self-assigned this Feb 3, 2025
@sfc-gh-sghosh

Hello @zanebclark ,

Thanks for raising the issue.
Could you share a code snippet that reproduces the issue, and describe your application scenario?
Have you configured any timeouts, such as login_timeout, network_timeout, or socket_timeout?
https://docs.snowflake.com/en/developer-guide/python-connector/python-connector-connect#managing-connection-timeouts
What is the value of the MAX_CON_RETRY_ATTEMPTS parameter in your account?
Please share a sample code snippet to reproduce the issue.

Regards,
Sujan

@sfc-gh-sghosh sfc-gh-sghosh added status-triage Issue is under initial triage and removed needs triage labels Feb 3, 2025
@zanebclark
Author

@sfc-gh-sghosh, thanks for reaching out! Reproducing this issue is difficult. We have the same code deployed in a number of environments, but we're only seeing it regularly in the highest-volume environment. We're using concurrent.futures ThreadPoolExecutor with 20 workers to execute three sequential stored procedures in parallel for ~8,000 tables. We run this job every hour of the business day, and it succeeds most of the time. All that to say, the failure rate is very low and we haven't been able to discern a pattern.
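
For readers who want to approximate this workload, the pattern described above can be sketched roughly as follows. This is not the production script: `run_steps_for_tables` and the step callables are hypothetical, and each step stands in for one stored procedure call. Collecting exceptions per table, rather than re-raising the first one, makes it easier to see how often a given error occurs across a run.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_steps_for_tables(tables, steps, max_workers=20):
    """Run every step in order for each table, with tables processed in parallel.

    Returns a dict mapping each failed table to the exception it raised.
    """

    def run_table(table):
        for step in steps:  # steps for a single table stay sequential
            step(table)

    failures = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(run_table, table): table for table in tables}
        for future in as_completed(futures):
            exc = future.exception()
            if exc is not None:
                failures[futures[future]] = exc
    return failures


if __name__ == "__main__":
    def step_one(table):
        pass  # placeholder for the first stored procedure call

    def step_two(table):
        if table == "table_7":
            raise ConnectionResetError("simulated ECONNRESET")

    failed = run_steps_for_tables([f"table_{i}" for i in range(10)], [step_one, step_two])
    print(sorted(failed))
```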

@zanebclark
Author

Have you configured any timeout such as login_timeout, network_timeout, socket_timeout?

I can't say that I have. Would a socket timeout influence an ECONNRESET error? I'm under the impression that this isn't a delay or latency issue but a forceful termination of the socket by the peer.

What is the value of parameter MAX_CON_RETRY_ATTEMPTS in your account?

I understand this to be an environment variable, not an account parameter. We don't set this environment variable. When I run SHOW PARAMETERS LIKE 'MAX_CON_RETRY_ATTEMPTS', I don't get any results. SHOW PARAMETERS LIKE 'MAX_CON_RETRY_ATTEMPTS' IN ACCOUNT also returns no values.
