-
-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support filtering glob paths on known extensions #495
Conversation
The current implementation doesn't filter files (it does filter folders though!). What this means is that if a (glob) path contains an unsupported file path, eventually an unsupported file format error will be thrown. By pre-filtering on known image file extensions, we attempt to prevent having a user specify very complex and specific globs. This implementation re-uses "DetermineEncodingFormat.by_extension(ext)" to check whether an extension is known. A disadvantage of this approach is that the above trait methods does slightly more than checking just whether the extension is supported (it also constructs an encoding format). Additionally, now this trait method will be run twice. Once here, and once when determining the format to save it to disk. Perhaps it can be cached in the future? The implementation handles both the extensions supported directly by sic, as well as the extensions as defined by the image crate fallback decider.
…ttern re-use After this, we can start using GlobWalker::from_patterns and add our custom filters to the patterns
use globwalk::GlobWalkerBuilder; | ||
use std::path::PathBuf; | ||
|
||
/// Function to determine the base directory and (remaining) pattern. | ||
/// Create a generic glob builder | ||
pub fn glob_builder_base<PAT: AsRef<str>>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realised something here: we can probably combine this implementation with CommonDir
specified elsewhere (they both have the goal of finding the greatest common base directory; here so the smallest amount of paths have to be matched against, and there to find the top level for mirroring the directory structure). For this PR, that doesn't need to be included however, since it was already separately available prior to this PR.
Thanks again for the PR btw! |
The current implementation doesn't filter files (it does filter folders though!). What this means is that if a (glob) path contains an unsupported file path, eventually an unsupported file format error will be thrown. By pre-filtering on known image file extensions, we attempt to prevent having a user specify very complex and specific globs.
This implementation re-uses "DetermineEncodingFormat.by_extension(ext)" to check whether an extension is known. A disadvantage of this approach is that the above trait methods does slightly more than checking just whether the extension is supported (it also constructs an encoding format). Additionally, now this trait method will be run twice. Once here, and once when determining the format to save it to disk. Perhaps it can be cached in the future?
The implementation handles both the extensions supported directly by sic, as well as the extensions as defined by the image crate fallback decider.
An alternative idea to the current implementation (this PR) could add specific only known image extensions glob patterns to globwalk iterator to disallow any unsupported image extensions through globwalk. This would limit to filter the walkdir iterator once during globwalk iteration, instead of twice (during globwalk iteration and the in this PR implemented post globwalk filter). I went with the current implementation because I wasn't sure yet how to merge such patterns with the user given pattern.
closes #494