feat: globbing with fsspec #1061

Merged
merged 37 commits into from
Dec 13, 2023

lobis
Collaborator

@lobis lobis commented Dec 12, 2023

Implement globbing support using fsspec.

Currently not working for all protocols due to fsspec/filesystem_spec#1459. In scikit-hep/fsspec-xrootd#40 we implement the necessary changes to make it work for xrootd.

A new release of fsspec-xrootd is necessary for this to work. (DONE!)

Before this PR, glob expressions were ignored for all but local file paths.

This PR tries to expand them using fsspec, but this may not always work due to fsspec/filesystem_spec#1459. We made sure it worked for xrootd. Not working here means that the glob expression will be expanded but the full path will be wrong due to the protocol prefix being incomplete (missing auth information, etc.). uproot will just throw an exception as if it tried to open a bad path.

Added tests for xrootd and s3 globs (http glob does not work at the moment).

In order to correctly expand the glob expression, fsspec may need additional storage_options. These are part of the options dictionary that needs to be passed to the regularize_files method.
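
As a rough illustration of the expansion described above (a sketch, not the code in this PR): `fsspec.open_files` accepts a glob expression plus storage options as keyword arguments, and each returned `OpenFile` exposes a `full_name` that includes the protocol. The helper name `expand_glob` and the temporary local files are assumptions made for the example; a remote glob (s3, xrootd) would additionally need the corresponding fsspec backend and credentials.

```python
import os
import tempfile

import fsspec


def expand_glob(urlpath, **storage_options):
    """Return the full names (protocol included) matching a glob expression.

    Hypothetical helper for illustration; storage_options are forwarded to
    fsspec so that the filesystem can authenticate if it needs to.
    """
    # open_files expands the glob but does not read any file contents.
    open_files = fsspec.open_files(urlpath, **storage_options)
    return [f.full_name for f in open_files]


# Demonstration on the local filesystem, where no credentials are needed.
tmp_dir = tempfile.mkdtemp()
for name in ("a.root", "b.root", "notes.txt"):
    with open(os.path.join(tmp_dir, name), "wb") as handle:
        handle.write(b"")

matches = expand_glob(os.path.join(tmp_dir, "*.root"))
for match in matches:
    print(match)
```

Only the two `.root` files are listed. For protocols affected by fsspec/filesystem_spec#1459, the names returned this way may lack parts of the original URL, which is the failure mode described above.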

lobis and others added 11 commits December 11, 2023 16:38
* fsspec requirements

* simplify fsspec import

* use loop property

* correctly create schemes list
* feat: add fsspec as required dependency (#1021)

* fsspec requirements

* simplify fsspec import

* use loop property

* correctly create schemes list

* remove deprecated handlers from docs

* simplify source selection

* return object source

* pickle executor

* rename test

* test more handlers

* option to check writeable file-like object

* rename test

* explicitly set handler

* fix s3 source

* rename test

* Revert "fix s3 source"

This reverts commit e76fdbb.

* separate PR for s3 fix (#1024)

* strip file://

* rename test

* rename tests

* add aiohttp skip

* attempt to parse windows paths

* test ci

* Revert "test ci"

This reverts commit 4c1c8a5.

* rename test

* remove fsspec from test

* remove *_handler options

* update defaults

* do not override default s3

* do not use fsspec for multiprocessing

* rename test

* fix not selecting object source

* missing import

* normalize doc

* remove helper

* never return None as source

* remove unnecessary xrootd source default override since fsspec is default now

* rename test

* add empty class to pass old pickle test
* feat: add fsspec as required dependency (#1021)

* feat: set fsspec as default source (#1023)

* set version to 5.2.0rc1 (release candidate)

* set s3fs as default for s3

* test different handlers

* correct serialization of fsspec source
* simplify object path split

* add example from #975

* fix tests

* add more test cases

* test case update

* remove scheme unused regex
…coded (#1034)

* writing goes through fsspec

* increase rc version

* type hints and docs

* add helper methods, create

* throw more specific error

* add additional test for `create` failure with scheme other than local

* simplify source selection

* remove windows specific code

* raise exception if invalid combination of handler / input (file-like object and fsspec)

* use softer check for file-like object

* cover problematic case with additional slash (file:///c:/file.root)

* test "file:" scheme (no slash)

* test backslash
* feat: add fsspec as required dependency (#1021)

* feat: set fsspec as default source (#1023)

* feat: set `fsspec` (`s3fs`) as default handler for s3 paths (#1032)

* feat: simplify object path split (#1028)

* feat: fsspec for all non-object writing - %-encoded urls no longer decoded (#1034)

* add new test case

* split big test in two

* retry on socket error

* xrootd iterator

* iterate over different files

* iterate over tree

* pytest fixture for test directory

* pytest fixture for test directory

* add annotation to open argument

* remove repeated test
…th colons in name) (#1055)

* add test for issue 1054

* additional test

* make sure fsspec fix works

* try new test in older fsspec version (need to test windows)

* skip test in windows due to colons in name

* add explicit object-path split with open

* revert use fsspec fork in ci
@lobis lobis changed the base branch from main to main-fsspec December 12, 2023 05:54
@lobis lobis requested review from jpivarski and nsmith- December 12, 2023 06:11
@lobis lobis marked this pull request as ready for review December 12, 2023 06:12
@lgray
Contributor

lgray commented Dec 12, 2023

fsspec-xrootd 0.2.3 that includes scikit-hep/fsspec-xrootd#40 is now on pypi

lobis and others added 8 commits December 12, 2023 08:48
* feat: add fsspec as required dependency (#1021)

* feat: set fsspec as default source (#1023)

* feat: set `fsspec` (`s3fs`) as default handler for s3 paths (#1032)

* feat: simplify object path split (#1028)

* feat: fsspec for all non-object writing - %-encoded urls no longer decoded (#1034)

…th colons in name) (#1055)
@lobis lobis changed the title feat: globbing with fsspec (xrootd only) feat: globbing with fsspec Dec 12, 2023
@jpivarski
Member

The Windows tests are failing because it's thinking that the drive letter is a URI scheme:

        if isinstance(files, str):
            if parse_colon:
                file_path, object_path = file_object_path_split(files)
            else:
                file_path, object_path = files, None
    
            parsed_url = urlparse(file_path)
            scheme = parsed_url.scheme
            if scheme in fsspec.available_protocols():
                # user specified a protocol, so we use fsspec to expand the glob and return the full paths
                file_names_full = [file.full_name for file in fsspec.open_files(files)]
                # https://github.com/fsspec/filesystem_spec/issues/1459
                # Not all protocols return the full_name attribute correctly (if they have url parameters)
                for file_name_full in file_names_full:
                    yield file_name_full, object_path, maybe_steps
            elif scheme != "":
                # user specified a protocol, but it's not supported by fsspec (e.g. user does not have s3fs installed)
>               raise ValueError(
                    f"Protocol {scheme} is not supported by fsspec. Please install the corresponding package."
                )
E               ValueError: Protocol c is not supported by fsspec. Please install the corresponding package.

HasBranches = <class 'uproot.behaviors.TBranch.HasBranches'>
counter    = [1]
file_path  = 'C:\\Users\\runneradmin\\.local\\skhepdata\\uproot-Zmumu.root'
files      = 'C:\\Users\\runneradmin\\.local\\skhepdata\\uproot-Zmumu.root:events'
files2     = 'C:\\Users\\runneradmin\\.local\\skhepdata\\uproot-Zmumu.root:events'
maybe_steps = None
object_path = 'events'
parse_colon = True
parsed_url = ParseResult(scheme='c', netloc='', path='\\Users\\runneradmin\\.local\\skhepdata\\uproot-Zmumu.root', params='', query='', fragment='')
scheme     = 'c'
steps_allowed = True

The same preprocessing that protects "open file" should be applied to "expand glob."
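
The misparse shown in the traceback can be reproduced with the standard library alone. The sketch below is only an illustration of one possible guard, not the fix that was actually committed: `scheme_of` and the `_WINDOWS_DRIVE` pattern are hypothetical names, and the rule "one letter, a colon, then a slash or backslash means a drive" is an assumption.

```python
import re
from urllib.parse import urlparse

# Hypothetical guard: a single letter followed by ":" and a slash or
# backslash is treated as a Windows drive, not as a URI scheme.
_WINDOWS_DRIVE = re.compile(r"^[A-Za-z]:[\\/]")


def scheme_of(file_path):
    """Return the URI scheme of file_path, or "" for local paths."""
    if _WINDOWS_DRIVE.match(file_path):
        return ""
    return urlparse(file_path).scheme


windows_path = "C:\\Users\\runneradmin\\.local\\skhepdata\\uproot-Zmumu.root"

# urlparse on its own reports the drive letter as the scheme.
print(repr(urlparse(windows_path).scheme))
# With the guard, the path is recognised as local.
print(repr(scheme_of(windows_path)))
print(repr(scheme_of("root://server//data/*.root")))
```

With a check of this kind in front of the `fsspec.available_protocols()` test, a drive-letter path would fall through to the local-file branch instead of raising `ValueError`.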

@jpivarski
Member

It looks like you just got it in 50835ae.

@jpivarski
Member

jpivarski commented Dec 12, 2023

> lobis changed the title feat: globbing with fsspec (xrootd only) feat: globbing with fsspec

Does that mean that the new fsspec is out? No, fsspec/filesystem_spec#1459 hasn't been merged, and Martin has a response about it.

@jpivarski
Member

It looks like it only tests globbing in XRootD, so I'll assume that's what it does.

Member

@jpivarski jpivarski left a comment

When the tests pass, this will be a good addition, adding globbing for XRootD. (So far, XRootD only.)

If this is opening the door to globbing with any URI scheme and the set of schemes that work depends on what gets implemented in fsspec, that's fine. We can refer users to fsspec if they're wondering why it works in XRootD and not in HTTP. That's part of the separation of concerns we're setting up between Uproot and XRootD.

@lgray
Contributor

lgray commented Dec 12, 2023

Looks like this one needs its failed tests rerun as well. Failures appear unrelated.

@jpivarski
Member

I wonder if this is one of the tests that had previously been called "super-flaky"? @lobis reinstated quite a few of these and they hadn't been acting up, maybe until now. The issue is an XRootD connection failure.

Maybe we want to install this: https://github.com/marketplace/actions/retry-step

(Maybe also here: https://github.com/CoffeaTeam/integration-test/blob/main/.github/workflows/test.yml)

@lgray
Contributor

lgray commented Dec 13, 2023

Since this one is approved - if it can go in 5.2.0 please add it :-)

@lobis
Collaborator Author

lobis commented Dec 13, 2023

> I wonder if this is one of the tests that had previously been called "super-flaky"? @lobis reinstated quite a few of these and they hadn't been acting up, maybe until now. The issue is an XRootD connection failure.
>
> Maybe we want to install this: https://github.com/marketplace/actions/retry-step
>
> (Maybe also here: https://github.com/CoffeaTeam/integration-test/blob/main/.github/workflows/test.yml)

Yes the xrootd tests are the most problematic and likely to fail due to server issues.

We are already doing reruns for the xrootd tests, with the following settings: --reruns 3 --reruns-delay 30, so I'm not sure about adding this retry-step action, it feels redundant (maybe I didn't understand what it does). Perhaps we should increase the number of reruns and the time between them.

@lobis
Collaborator Author

lobis commented Dec 13, 2023

> It looks like it only tests globbing in XRootD, so I'll assume that's what it does.

Actually this should work for all fsspec filesystems that support globbing. I added some tests for s3 to show this. http will not work (I added a test that should fail if fsspec ever implements globbing for http, so that we are notified).

Since these tests expand a glob expression they may fail in the future if some other files are added that match the glob. I set the tests in such a way that this is hopefully easy to spot and fix.

@jpivarski
Member

Right, let's not add multiple ways to retry. But if the error rate is high enough that with 3 retries we're still seeing it, then perhaps the retry parameters should get increased: --reruns 5 --reruns-delay 60 (I assume that's 60 seconds).

That doesn't need to be done for this PR, though, which has nothing to do with the failing tests. For this PR, we can "rerun failed tests" manually to get all the green checkmarks and then merge the PR when everything is verified-okay.
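
For reference, the behaviour of `--reruns N --reruns-delay S` discussed here can be sketched in plain Python. This is an illustration only, assuming the flags come from the pytest-rerunfailures plugin, that `reruns` counts additional attempts after the first failure, and that the delay is in seconds. `run_with_reruns` is a hypothetical helper, not part of pytest.

```python
import time


def run_with_reruns(test, reruns=3, reruns_delay=30.0, sleep=time.sleep):
    """Call test() until it passes or the rerun budget is exhausted.

    Returns the number of attempts that were made. The last failure is
    re-raised once `reruns` additional attempts have been used up.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            test()
            return attempts
        except Exception:
            if attempts > reruns:
                raise
            # Wait before the next attempt, e.g. for a server to recover.
            sleep(reruns_delay)


# Demonstration: a test that fails twice with a connection error, then passes.
state = {"calls": 0}


def flaky_connection_test():
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("simulated XRootD connection failure")


recorded_delays = []  # record the delays instead of actually sleeping
attempts_used = run_with_reruns(
    flaky_connection_test, reruns=5, reruns_delay=60, sleep=recorded_delays.append
)
print(attempts_used, recorded_delays)
```

Raising the rerun count or the delay only widens the window in which a transient server problem can clear; it does not help when the server stays unavailable for the whole run.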

@lobis lobis merged commit 5dc898f into main-fsspec Dec 13, 2023
21 checks passed
@lobis lobis deleted the globbing-fsspec branch December 13, 2023 18:41
jpivarski pushed a commit that referenced this pull request Dec 13, 2023
* feat: add fsspec as required dependency (#1021)

* feat: set fsspec as default source (#1023)

* feat: set `fsspec` (`s3fs`) as default handler for s3 paths (#1032)

* feat: simplify object path split (#1028)

* feat: fsspec for all non-object writing - %-encoded urls no longer decoded (#1034)

* test: improve path object split tests (#1039)

* add new test case

* split big test in two

* retry on socket error

* xrootd iterator

* iterate over different files

* iterate over tree

* pytest fixture for test directory

* pytest fixture for test directory

* add annotation to open argument

* remove repeated test

* test: add test for issue 1054 (newer fsspec failing to parse files with colons in name) (#1055)

* add test for issue 1054

* additional test

* make sure fsspec fix works

* try new test in older fsspec version (need to test windows)

* skip test in windows due to colons in name

* add explicit object-path split with open

* revert use fsspec fork in ci

* use fsspec to expand glob

* skip root from remote_schemas

* test iterate over xrootd

* test

* add temporary install to ci

* remove ci debug

* feat: add fsspec as required dependency (#1021)

* fsspec requirements

* simplify fsspec import

* use loop property

* correctly create schemes list

* feat: set fsspec as default source (#1023)

* feat: add fsspec as required dependency (#1021)

* fsspec requirements

* simplify fsspec import

* use loop property

* correctly create schemes list

* remove deprecated handlers from docs

* simplify source selection

* return object source

* pickle executor

* rename test

* test more handlers

* option to check writeable file-like object

* rename test

* explicitly set handler

* fix s3 source

* rename test

* Revert "fix s3 source"

This reverts commit e76fdbb.

* separate PR for s3 fix (#1024)

* strip file://

* rename test

* rename tests

* add aiohttp skip

* attempt to parse windows paths

* test ci

* Revert "test ci"

This reverts commit 4c1c8a5.

* rename test

* remove fsspec from test

* remove *_handler options

* update defaults

* do not override default s3

* do not use fsspec for multiprocessing

* rename test

* fix not selecting object source

* missing import

* normalize doc

* remove helper

* never return None as source

* remove unnecessary xrootd source default override since fsspec is default now

* rename test

* add empty class to pass old pickle test

* feat: set `fsspec` (`s3fs`) as default handler for s3 paths (#1032)

* feat: add fsspec as required dependency (#1021)

* fsspec requirements

* simplify fsspec import

* use loop property

* correctly create schemes list

* feat: set fsspec as default source (#1023)

* feat: add fsspec as required dependency (#1021)

* fsspec requirements

* simplify fsspec import

* use loop property

* correctly create schemes list

* remove deprecated handlers from docs

* simplify source selection

* return object source

* pickle executor

* rename test

* test more handlers

* option to check writeable file-like object

* rename test

* explicitly set handler

* fix s3 source

* rename test

* Revert "fix s3 source"

This reverts commit e76fdbb.

* separate PR for s3 fix (#1024)

* strip file://

* rename test

* rename tests

* add aiohttp skip

* attempt to parse windows paths

* test ci

* Revert "test ci"

This reverts commit 4c1c8a5.

* rename test

* remove fsspec from test

* remove *_handler options

* update defaults

* do not override default s3

* do not use fsspec for multiprocessing

* rename test

* fix not selecting object source

* missing import

* normalize doc

* remove helper

* never return None as source

* remove unnecessary xrootd source default override since fsspec is default now

* rename test

* add empty class to pass old pickle test

* set version to 5.2.0rc1 (release candidate)

* set s3fs as default for s3

* test different handlers

* correct serialization of fsspec source

* feat: simplify object path split (#1028)

* simplify object path split

* add example from #975

* fix tests

* add more test cases

* test case update

* remove unused scheme regex

* feat: fsspec for all non-object writing - %-encoded urls no longer decoded (#1034)

* writing goes through fsspec

* increase rc version

* type hints and docs

* add helper methods, create

* throw more specific error

* add additional test for `create` failure with scheme other than local

* simplify source selection

* remove windows specific code

* raise exception if invalid combination of handler / input (file-like object and fsspec)

* use softer check for file-like object

* cover problematic case with additional slash (file:///c:/file.root)

* test "file:" scheme (no slash)

* test backslash

* test: add test for issue 1054 (newer fsspec failing to parse files with colons in name) (#1055)

* add test for issue 1054

* additional test

* make sure fsspec fix works

* try new test in older fsspec version (need to test windows)

* skip test in windows due to colons in name

* add explicit object-path split with open

* revert use fsspec fork in ci

* test: improve path object split tests (#1039)

* feat: add fsspec as required dependency (#1021)

* fsspec requirements

* simplify fsspec import

* use loop property

* correctly create schemes list

* feat: set fsspec as default source (#1023)

* feat: add fsspec as required dependency (#1021)

* fsspec requirements

* simplify fsspec import

* use loop property

* correctly create schemes list

* remove deprecated handlers from docs

* simplify source selection

* return object source

* pickle executor

* rename test

* test more handlers

* option to check writeable file-like object

* rename test

* explicitly set handler

* fix s3 source

* rename test

* Revert "fix s3 source"

This reverts commit e76fdbb.

* separate PR for s3 fix (#1024)

* strip file://

* rename test

* rename tests

* add aiohttp skip

* attempt to parse windows paths

* test ci

* Revert "test ci"

This reverts commit 4c1c8a5.

* rename test

* remove fsspec from test

* remove *_handler options

* update defaults

* do not override default s3

* do not use fsspec for multiprocessing

* rename test

* fix not selecting object source

* missing import

* normalize doc

* remove helper

* never return None as source

* remove unnecessary xrootd source default override since fsspec is default now

* rename test

* add empty class to pass old pickle test

* feat: set `fsspec` (`s3fs`) as default handler for s3 paths (#1032)

* feat: add fsspec as required dependency (#1021)

* fsspec requirements

* simplify fsspec import

* use loop property

* correctly create schemes list

* feat: set fsspec as default source (#1023)

* feat: add fsspec as required dependency (#1021)

* fsspec requirements

* simplify fsspec import

* use loop property

* correctly create schemes list

* remove deprecated handlers from docs

* simplify source selection

* return object source

* pickle executor

* rename test

* test more handlers

* option to check writeable file-like object

* rename test

* explicitly set handler

* fix s3 source

* rename test

* Revert "fix s3 source"

This reverts commit e76fdbb.

* separate PR for s3 fix (#1024)

* strip file://

* rename test

* rename tests

* add aiohttp skip

* attempt to parse windows paths

* test ci

* Revert "test ci"

This reverts commit 4c1c8a5.

* rename test

* remove fsspec from test

* remove *_handler options

* update defaults

* do not override default s3

* do not use fsspec for multiprocessing

* rename test

* fix not selecting object source

* missing import

* normalize doc

* remove helper

* never return None as source

* remove unnecessary xrootd source default override since fsspec is default now

* rename test

* add empty class to pass old pickle test

* set version to 5.2.0rc1 (release candidate)

* set s3fs as default for s3

* test different handlers

* correct serialization of fsspec source

* feat: simplify object path split (#1028)

* simplify object path split

* add example from #975

* fix tests

* add more test cases

* test case update

* remove unused scheme regex

* feat: fsspec for all non-object writing - %-encoded urls no longer decoded (#1034)

* writing goes through fsspec

* increase rc version

* type hints and docs

* add helper methods, create

* throw more specific error

* add additional test for `create` failure with scheme other than local

* simplify source selection

* remove windows specific code

* raise exception if invalid combination of handler / input (file-like object and fsspec)

* use softer check for file-like object

* cover problematic case with additional slash (file:///c:/file.root)

* test "file:" scheme (no slash)

* test backslash

* add new test case

* split big test in two

* retry on socket error

* xrootd iterator

* iterate over different files

* iterate over tree

* pytest fixture for test directory

* pytest fixture for test directory

* add annotation to open argument

* remove repeated test

* test: add test for issue 1054 (newer fsspec failing to parse files with colons in name) (#1055)

* add test for issue 1054

* additional test

* make sure fsspec fix works

* try new test in older fsspec version (need to test windows)

* skip test in windows due to colons in name

* add explicit object-path split with open

* revert use fsspec fork in ci

* try to expand all glob strings if they have the protocol

* making it work on windows

* testing globbing for s3

* add failing test for http globbing

* test more handlers, failing test for xrootd (missing files)

* understanding error

* add class method to extract fsspec options

* call super constructor for fsspec source

* pass options to regularize files util

* python 3.12 aiohttp test in other PR

* attempt to hide the ssl destructor error

* retry on "expired"

* style: pre-commit fixes

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
3 participants