await file_obj.credential.close() : TypeError: object NoneType can't be used in 'await' expression #431

Open
ELToulemonde opened this issue Oct 6, 2023 · 10 comments

Comments

@ELToulemonde

ELToulemonde commented Oct 6, 2023

My problem

At the end of my Python script, I get a cleanup error: TypeError: object NoneType can't be used in 'await' expression

The complete trace is:

Traceback (most recent call last):
  File ".../lib/python3.10/weakref.py", line 667, in _exitfunc
    f()
  File ".../lib/python3.10/weakref.py", line 591, in __call__
    return info.func(*info.args, **(info.kwargs or {}))
  File ".../lib/python3.10/site-packages/fsspec/asyn.py", line 103, in sync
    raise return_result
  File ".../lib/python3.10/site-packages/fsspec/asyn.py", line 56, in _runner
    result[0] = await coro
  File ".../lib/python3.10/site-packages/adlfs/utils.py", line 78, in close_credential
    await file_obj.credential.close()
TypeError: object NoneType can't be used in 'await' expression

Which is weird, because I don't do anything asynchronous.

A reproducible example

At least as much as I can:

from azure.identity import ChainedTokenCredential, ManagedIdentityCredential, AzureCliCredential
# Synchronous credentials, tried in order: managed identity first, then the Azure CLI
azure_cli = AzureCliCredential()
managed_identity = ManagedIdentityCredential()
CREDENTIAL_CHAIN = ChainedTokenCredential(managed_identity, azure_cli)

import pandas as pd
# Pass the chain defined above (the name is case-sensitive)
pd.read_parquet("abfs://blob-name@datalake.blob.core.windows.net/path_to_parquets.parquet", storage_options={"credential": CREDENTIAL_CHAIN})
print("Done")

I do get the "Done" printed before failure.

My config

  • ubuntu,
  • python 3.10,
  • azure-storage-blob==12.16.0
  • pandas==2.0.0
  • pyarrow==11.0.0
  • adlfs==2023.9.0
@TomAugspurger
Contributor

TomAugspurger commented Oct 7, 2023 via email

@davidsteinar

@TomAugspurger @ELToulemonde I get the same error. It makes no difference whether I import DefaultAzureCredential from azure.identity.aio or from azure.identity. Did you solve this?

@mkp-jansen

Same error here, any solutions?

@marktodisco

Importing DefaultAzureCredential from azure.identity.aio silenced that error for me.

Python 3.10.13 on Ubuntu.

Package                     Version  
--------------------------- ---------
adlfs                       2023.12.0
azure-ai-ml                 1.12.1
azure-common                1.1.28
azure-core                  1.29.6
azure-datalake-store        0.0.53
azure-identity              1.15.0
azure-mgmt-core             1.4.0
azure-mgmt-resource         23.0.1
azure-mgmt-storage          21.1.0
azure-mgmt-subscription     3.1.1
azure-storage-blob          12.19.0
azure-storage-file-datalake 12.14.0
azure-storage-file-share    12.15.0
pyarrow                     14.0.2

@mhtrinh

mhtrinh commented Jun 5, 2024

I am aware that using azure.identity.aio silences the error, but why is async involved in a non-async call?

@TomAugspurger
Contributor

TomAugspurger commented Jun 9, 2024

fsspec uses asyncio internally.

I'd recommend people use credentials from azure.identity.aio. If someone wants, we could add an inspect.iscoroutine check before we call .close.
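To illustrate why the error appears and what such a check might look like, here is a minimal sketch using stand-in classes rather than the real Azure credentials. The class and function names are hypothetical and this is not the actual adlfs code. It uses inspect.isawaitable on the value returned by close(), which is one possible variation of the check suggested above.

```python
import asyncio
import inspect


class SyncCredential:
    """Stand-in for a credential from azure.identity: close() is a plain method."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True  # implicitly returns None, which cannot be awaited


class AsyncCredential:
    """Stand-in for a credential from azure.identity.aio: close() is a coroutine."""

    def __init__(self):
        self.closed = False

    async def close(self):
        self.closed = True


async def close_credential_unguarded(credential):
    # Mirrors the failing line in the traceback: awaits whatever close() returns.
    await credential.close()


async def close_credential_guarded(credential):
    # Hypothetical guard: call close(), then await only if the result is awaitable.
    result = credential.close()
    if inspect.isawaitable(result):
        await result
```

With the sync stand-in, the unguarded version raises TypeError because it awaits None, while the guarded version closes both kinds of credential without error.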

@mhtrinh

mhtrinh commented Jun 9, 2024

That may solve one of our issues: I am using adlfs as part of complex code that uses a ThreadPool. At the end of the run, I get the message below. It does not change the exit code, so it is not fatal, but it looks a bit ugly:

Traceback (most recent call last):
  File ".../lib/python3.10/weakref.py", line 667, in _exitfunc
    f()
  File ".../lib/python3.10/weakref.py", line 591, in __call__
    return info.func(*info.args, **(info.kwargs or {}))
  File ".../lib/python3.10/site-packages/fsspec/asyn.py", line 103, in sync
    raise return_result
  File ".../lib/python3.10/site-packages/fsspec/asyn.py", line 56, in _runner
    result[0] = await coro
  File ".../lib/python3.10/site-packages/adlfs/utils.py", line 78, in close_credential
    await file_obj.credential.close()
TypeError: object NoneType can't be used in 'await' expression

I did not manage to create a small reproducible example ...

@bikeshedder

Reproduction example

Just create an AzureBlobFileSystem with non-AIO DefaultAzureCredential:

from adlfs import AzureBlobFileSystem
from azure.identity import DefaultAzureCredential

AzureBlobFileSystem(
    account_name="_",
    credential=DefaultAzureCredential(),
)

When the code exits an exception is raised:

Traceback (most recent call last):
  File "/usr/lib/python3.12/weakref.py", line 666, in _exitfunc
    f()
  File "/usr/lib/python3.12/weakref.py", line 590, in __call__
    return info.func(*info.args, **(info.kwargs or {}))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.12/site-packages/fsspec/asyn.py", line 103, in sync
    raise return_result
  File ".venv/lib/python3.12/site-packages/fsspec/asyn.py", line 56, in _runner
    result[0] = await coro
                ^^^^^^^^^^
  File ".venv/lib/python3.12/site-packages/adlfs/utils.py", line 78, in close_credential
    await file_obj.credential.close()
TypeError: object NoneType can't be used in 'await' expression

What about azure.identity.aio.DefaultAzureCredential

I'm using both azure.storage.blob.BlobServiceClient and adlfs.AzureBlobFileSystem in my code. I made the mistake of importing the AIO version and using it for both. ABFS no longer raised an exception, but BlobServiceClient stopped working because it expects a sync credential.

I fixed that by using non-AIO credentials for the BlobServiceClient and AIO ones for AzureBlobFileSystem. It's not pretty having to create two credential objects, but it fixes the issue.

@BnJam

BnJam commented Aug 24, 2024

fsspec / adlfs use async internally. Can you try using the credentials from azure.identity.aio instead?

The azure.identity.aio solution worked for me in a simple script while testing out remote storage connectivity:

import io
import os
import adlfs

from azure.identity.aio import DefaultAzureCredential

credential = DefaultAzureCredential()

azfs = adlfs.AzureBlobFileSystem(
    account_name="<storage-account-name>",  # placeholder: replace with your storage account name
    credential=credential
)

with io.BytesIO() as buf:
    buf.write(str('hello from the byterealm!').encode())
    buf.seek(0)
    azfs.write_bytes('/container/path/msg.txt', buf)

file_output = azfs.read_text('/container/path/msg.txt')

print(file_output)

Giving the print out:

$ python remote-storage.py 
hello from the byterealm!

(Quoting mhtrinh:) That may solve one of our issues: I am using adlfs as part of complex code that uses a ThreadPool. At the end of the run, I get this message [same traceback as in the comment above]. I did not manage to create a small reproducible example ...

Using an asynchronous object in a thread pool sounds like a headache... You could create a credential in each thread to isolate the async handlers per action, but perhaps collecting the pool and finalising the credential once all threads are complete could work too? Practicing the former would help when you move from threads to processes or external Workers/Jobs.
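A rough sketch of the first option (one credential per thread), using only the standard library. The helper name get_thread_credential and the make_credential factory are hypothetical; in real code the factory might be something like `lambda: DefaultAzureCredential()` from azure.identity.aio, which is an assumption and not something tested here.

```python
import threading

# Each thread sees its own independent attributes on this object.
_thread_state = threading.local()


def get_thread_credential(make_credential):
    """Return the calling thread's credential, creating it on first use.

    make_credential is a hypothetical zero-argument factory that builds
    a new credential object.
    """
    credential = getattr(_thread_state, "credential", None)
    if credential is None:
        credential = make_credential()
        _thread_state.credential = credential
    return credential
```

Repeated calls on the same thread reuse one credential, while a different thread gets its own. Closing each credential when its thread finishes is left out of this sketch.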

@f-vt

f-vt commented Nov 23, 2024

I had the same issue but I needed to use the exclude_interactive_browser_credential=False parameter which is not available in the azure.identity.aio one.

What I ended up doing is creating a custom class that wraps the sync credential (not the aio one) and adds a dummy async close function so fsspec is happy:

import asyncio
import functools

from azure.core.credentials import AccessToken
from azure.core.credentials_async import AsyncTokenCredential
from azure.identity import DefaultAzureCredential


class AsyncDefaultAzureCredentialWrapper(AsyncTokenCredential):
    """
    This class wraps the synchronous DefaultAzureCredential to provide an asynchronous interface.
    It allows the use of DefaultAzureCredential with fsspec, which requires asynchronous credentials.
    The close method is overridden to be a no-op to prevent issues when fsspec tries to close the credential.
    """

    def __init__(self, **kwargs):
        # Initialize the synchronous DefaultAzureCredential
        self._credential = DefaultAzureCredential(**kwargs)

    async def get_token(self, *scopes, **kwargs):
        # Run the synchronous get_token in the default executor to avoid blocking the event loop.
        # run_in_executor only forwards positional arguments, so bind kwargs with functools.partial.
        loop = asyncio.get_running_loop()
        call = functools.partial(self._credential.get_token, *scopes, **kwargs)
        token = await loop.run_in_executor(None, call)
        return AccessToken(token.token, token.expires_on)

    async def close(self):
        # No-op close method to prevent issues when fsspec tries to close the credential.
        pass

# Create the async credential from the sync one, with the no-op close
credential = AsyncDefaultAzureCredentialWrapper(exclude_interactive_browser_credential=False)
