Questions, Feedback, and Suggestions #4 #5262
Comments
For most sites I'm able to sort files into year/month folders like this:
However for redgifs it doesn't look like there's a date keyword available for |
There's a typo in
|
There's also another typo in |
Can you grab all the media from quoted tweets? Example. |
#5262 (comment) It's implemented as a search for 'quoted_tweet_id:…' on Twitter.
#5262 (comment) This one was on the same line as the previous one ... (9fd851c)
Regarding typos, thanks for pointing them out. @biggestsonicfan |
EDIT: Actually, I think there's just something wrong with that URL. I had it saved for a long time and searching that tag normally gives a different URL ( |
You could use |
Is there support to remove metadata like this?
Post-processor: "filter-metadata":
{
    "name": "metadata",
    "mode": "delete",
    "event": "prepare",
    "fields": ["preview[images][0][resolutions]"]
}
I've tried a few variations but no dice.
"fields": ["preview[images][][resolutions]"]
"fields": ["preview[images][N][resolutions]"]
"fields": ["preview['images'][0]['resolutions']"] |
Hello, I left a comment in #4168 . Does the |
@taskhawk
def remove_resolutions(metadata):
    for image in metadata["preview"]["images"]:
        del image["resolutions"]
(untested, might need some check whether
@YuanGYao |
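A more defensive variant of the function above, as a minimal sketch. It assumes the metadata is a plain dict shaped like the field path in the question (preview, then images, then resolutions) and silently skips posts where any of those keys is missing. The sample dicts are made up for illustration, and how the function gets hooked into gallery-dl is not shown here.

```python
def remove_resolutions(metadata):
    """Delete the 'resolutions' entry from every preview image, if present."""
    preview = metadata.get("preview")
    if not isinstance(preview, dict):
        return  # this post has no preview data at all
    for image in preview.get("images") or ():
        if isinstance(image, dict):
            # pop() with a default never raises when the key is absent
            image.pop("resolutions", None)


# Hypothetical sample metadata, shaped like the field path in the question.
sample = {
    "id": "abc",
    "preview": {
        "images": [{"source": {"url": "u"}, "resolutions": [{"width": 108}]}]
    },
}
remove_resolutions(sample)

# A post without any preview must pass through unchanged.
no_preview = {"id": "def"}
remove_resolutions(no_preview)
```

Using `pop()` instead of `del` is what makes the missing-key case harmless.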
@mikf |
Not sure if I'm missing something, but are directory specific configurations exclusive to running gallery-dl via the executable? Basically, I have a directory for regular tags, and a directory for artist tags. For regular tags I use So right now the only way I know to get this per-directory configuration to work, is to copy the gallery-dl executable everywhere I want to use a master configuration override. Am I missing something? It feels like there should be a better way. |
Huh? No, the configuration works always in the same way. You're simply using different configuration files? |
From the readme:
I want to override my master configuration |
You can load additional configuration files from the console with the -c option. You just need to specify the path to the file, and any options there will overwrite those in your main configuration file.
Edit: From my understanding, yes: automatic loading of local config files in each directory is only possible by having the standalone executable in each directory. Are different directory options the only thing you need? |
Thanks, that's exactly what I was looking for! Guess I didn't read the documentation thoroughly enough. For now the only thing I'd want to override is the directory structure for artist tags. I don't think it's possible to determine from the metadata alone if a given tag is the name of an artist or not, so I thought the best way to go about it is to just have a separate directory for artists, and use a configuration override. So yeah, loading that override with the -c flag works great for that purpose, thanks again! |
You kinda can, but you need to enable
"gelbooru": {
    "directory": {
        "search_tags in tags_artists": ["{category}", "{search_tags[0]!u}", "{search_tags}", "{date:%Y}", "{date:%m}"],
        ""                           : ["{category}", "{search_tags}", "{date:%Y}", "{date:%m}"]
    },
    "tags": true
},
Set Of course, this depends on the artists being correctly tagged. Not sure if it happens on Gelbooru, but at least in other boorus and booru-like sites I've come across posts with the artist tagged as a general tag instead of an artist tag.
Another limitation is that your search tag can only include one artist at a time, doing more will require a more complex expression to check all tags are present in
What I do instead is that I inject a keyword to influence where it will be saved, like this:
And in my config I have
"gelbooru": {
    "directory": ["boorus", "{search_tags_type}", "{search_tags}"]
},
You can have:
"gelbooru": {
    "directory": {
        "search_tags_type == 'artists'": ["{category}", "{search_tags[0]!u}", "{search_tags}", "{date:%Y}", "{date:%m}"],
        ""                             : ["{category}", "{search_tags}", "{date:%Y}", "{date:%m}"]
    }
},
You can do this for other tag types, like general, copyright, characters, etc.
Because it's a chore to type that option every time I made a wrapper script, so I just call it like this because artists is my default:
For other tag types I can do:
|
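The conditional "directory" objects in this comment can be read as "use the first entry whose expression is true for the current metadata, else the one with the empty key". Below is a simplified Python model of that idea, not gallery-dl's actual code: `pick_directory` is a hypothetical helper, and the metadata dicts are made up.

```python
def pick_directory(conditions, metadata):
    """Return the format list of the first condition that evaluates truthy.

    `conditions` maps Python expressions to directory format lists; the
    empty string "" acts as the fallback. Simplified model for illustration.
    """
    for expr, fmt in conditions.items():
        if not expr:
            continue  # the fallback is handled after the loop
        try:
            # metadata fields are visible as local names inside the expression
            if eval(expr, {}, metadata):
                return fmt
        except NameError:
            pass  # the field is missing from this post's metadata
    return conditions.get("", [])


conditions = {
    "search_tags_type == 'artists'": ["{category}", "{search_tags[0]!u}", "{search_tags}"],
    "": ["{category}", "{search_tags}"],
}

# With the injected keyword, the artist layout is chosen.
artist = pick_directory(conditions, {"search_tags_type": "artists", "search_tags": "foo"})
# Without it, the expression raises NameError and the fallback is used.
general = pick_directory(conditions, {"search_tags": "foo"})
```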
Thanks for pointing out there's a tags option available for the gelbooru extractor. I already used it in the kemono extractor to get the name of the artist, but it didn't occur to me that gelbooru might also have such an option (and just accepted that the tags aren't categorized). For artists I store all the URLs in their respective gelbooru.txt, rule34.txt, etc. files like so:
And then just run |
When I'm making an extractor, what do I do if the site doesn't have different URL patterns for different page types? Every single page is just a numerical ID that could be a forum post, image, blog post, or something completely different. |
@Wiiplay123 You handle everything with a single extractor and decide what type of result to return on the fly. The |
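One way to picture "a single extractor that decides on the fly" is a classifier that inspects the fetched page and a single entry point that branches on the result. The sketch below is generic Python and does not use gallery-dl's extractor API; the HTML markers, `classify_page`, and `extract` are all hypothetical names chosen for illustration.

```python
import re

# Hypothetical markers; a real site needs its own detection rules.
PAGE_MARKERS = (
    ("image", re.compile(r'<div class="image-view"')),
    ("forum", re.compile(r'<div class="forum-post"')),
    ("blog", re.compile(r'<article class="blog-entry"')),
)


def classify_page(html):
    """Return the first page type whose marker appears in the HTML."""
    for kind, marker in PAGE_MARKERS:
        if marker.search(html):
            return kind
    return "unknown"


def extract(page_id, html):
    """Single entry point: the result type is decided after fetching the page."""
    kind = classify_page(html)
    if kind == "image":
        return {"id": page_id, "type": kind, "download": True}
    # Forum posts, blog posts and unknown pages yield metadata only here.
    return {"id": page_id, "type": kind, "download": False}


image_result = extract(1, '<html><div class="image-view"><img src="x.png"></div></html>')
other_result = extract(2, "<html><p>nothing recognisable</p></html>")
```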
Hi, what options should I use in my config file to change the format of dates in metadata files? I would like to use And would it also be possible to do this for json files that ytdl creates? I downloaded some videos with gallery-dl but the dates got saved as |
Trying to download this: using: produces this error: until it hits 5/5 then fails. It happens for all misskey.gg links. In contrast, misskey.io links work without even needing to preface the link with "misskey:". For example: Is there anything I can do to make misskey.gg links work? |
Trying to download resources from Imgur: |
Is there a "correct" way to convert large deviantart "*.gif" files to webm? I suppose this is doable with "exec" post-processor, but this seems quite tricky, especially given this functionality "almost" exists for ugoira:
So, maybe I'm missing the correct way to do that? And if there is none, maybe it makes sense to add an ugoira-like filter specifically for that to gallery-dl? |
I'm finding case-differences in my twitter directory ("Username" vs "UserName"). It's a btrfs partition under linux so it can handle that, but what's the best way to find out what twitter currently considers the case of the username? Should I take a current tweet, convert it to |
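Finding the affected folders in the first place can be scripted. This is a minimal sketch that only groups directory names differing in case; it does not ask Twitter which spelling is current. `group_case_collisions` and `scan_directory` are hypothetical helper names, and the sample names are made up.

```python
import os
from collections import defaultdict


def group_case_collisions(names):
    """Group names that are equal when compared case-insensitively."""
    groups = defaultdict(list)
    for name in names:
        groups[name.casefold()].append(name)
    # keep only the keys that map to more than one spelling
    return {key: sorted(found) for key, found in groups.items() if len(found) > 1}


def scan_directory(root):
    """Apply the grouping to the sub-directories of `root`."""
    with os.scandir(root) as entries:
        return group_case_collisions(e.name for e in entries if e.is_dir())


collisions = group_case_collisions(["Username", "UserName", "someone_else"])
```

Calling `scan_directory()` on the twitter download directory would list every user that currently has more than one folder.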
I realize I am double posting here, but I think I have a solution for at least fanbox posts for this. Fanbox metadata has |
@biggestsonicfan
{
    "extractor": {
        "fanbox": {
            "postprocessors": [
                {
                    "name": "metadata",
                    "event": "post",
                    "filename": "{id}.json",
                    "directory": ["metadata", "supported"],
                    "filter": "locals().get('isSupported')"
                },
                {
                    "name": "metadata",
                    "event": "post",
                    "filename": "{id}.json",
                    "directory": ["metadata", "unsupported"],
                    "filter": "not locals().get('isSupported')"
                }
            ]
        }
    }
}
|
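The `locals().get('isSupported')` idiom in the filters above makes sense if a filter is a Python expression evaluated with the metadata fields as local names. The following sketch works under that assumption; `make_filter` is a hypothetical stand-in, not gallery-dl's own function. It shows why `.get()` is safer than the bare field name when a field may be absent.

```python
def make_filter(expr):
    """Compile a filter expression; metadata fields become local names."""
    code = compile(expr, "<filter>", "eval")

    def check(metadata):
        return bool(eval(code, {}, metadata))

    return check


supported = make_filter("locals().get('isSupported')")
unsupported = make_filter("not locals().get('isSupported')")
bare_name = make_filter("isSupported")

with_field = {"id": 1, "isSupported": True}
without_field = {"id": 2}  # metadata that lacks the field entirely

results = (supported(with_field), supported(without_field), unsupported(without_field))

# The bare name fails on metadata without the field, the .get() form does not.
try:
    bare_name(without_field)
    bare_failed = False
except NameError:
    bare_failed = True
```

Note that metadata lacking the field lands in the "unsupported" branch, because `not None` is true.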
Holy Christ, I've only recently started using filters with gallery-dl and I didn't realize it had potential like this. |
I would like to amend this to say that misskey.gg links actually do successfully download, but only sometimes? Specific links appear to seemingly always fail in the way I explained in this reply, but others will succeed no problem. It appears that I simply didn't let the command run long enough to reach media it would successfully download. Maybe I need to be logged in to download everything? I haven't tried that yet, but also I don't think I will, since the profile I tried to scrape hosts their media elsewhere, so I have no incentive to make an account just to test this. |
@mikf Looks like adding both those filters as a fanbox postprocessor is throwing everything in "unsupported":
Also not entirely sure this would work out well anyway anymore. As higher tier metadata that I don't support also set I'll play with the filter system a bit to see if I can fine tune it. |
When installing from source using
I understand it's inaccurate but I can't figure out why. There is no |
@fireattack I personally use this command on Windows |
Now it says 1.26.9.dev0, even though the built wheel clearly says |
I feel like something has gone awry for sure. Try creating a fresh venv and installing in that, just in case? |
Ah thanks, I figured it out. Apparently I have billions of And doing So, I have to run I suspect this is caused by
Maybe we shouldn't let the users use it unless really needed, @mikf ? (Or change to --force-reinstall instead.) Log if interested
|
Is there a way to download specifically the revisions on an artist's page on kemono.su? For example, one artist has had many of their posts updated with a revision that removed the content, while the original revision retains them. There are hundreds of posts on their page like that, so I was wondering if there was a way to set it to download the original revisions for all of them automatically. |
I just noticed that the latest gallery-dl release made this "just werk":
I still don't know whether it was possible to download such artwork using gallery-dl before (I thought it was, so I was just asking for someone to explain to me in simple terms how to do it), but, again, it "just werks" now, so, much appreciated. |
So Seiga is now region-locked. Can I proxy/wireguard just that extractor? EDIT: I've managed to get Wireguard locally to proxy via a port using wireproxy, but I just need a post(pre)processor to launch it as a daemon and close it when it's done. EDIT2: Figured it out:
|
I hate posting so frequently here but I hate making new issues more. This is once again an issue for me. I've just supported a user that has a preview image and download urls in their post. I normally parse the json files with a python script, however this preview image had been downloaded previously and I don't overwrite json data anymore. So I will re-run the user with skip set to EDIT: I also don't get how the metadata archive works either. Will the metadata entry be the same as the one for the extractor? |
@biggestsonicfan
A
|
could it be allowed that the default config be in toml? so the user does not have to specify i.e. add to
(And it would probably make sense to also add the equivalent yaml paths) |
The example link I provided no longer seems to be online, but I just noticed when downloading a profile on misskey.gg that the 5/5 timeout error no longer happened. But, it also doesn't appear that it added any older media that I assumed was being skipped. I didn't actually look at what was causing the 5/5 timeout error to see if it was media, but, since it appears to "just werk" at this point, I assume what was timing out simply was not media at all. I don't know. Either way, I am saying that I just noticed this is no longer reproducible. |
@mikf |
|
#5262 (comment) fixes regression introduced in 9e72968 'argparse' sets a flag and changes its behavior when using something that looks like a negative number as option string, '-4' and '-6' in this case.
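The argparse behavior described in this commit message can be reproduced in isolation. In the sketch below, `--offset` is a made-up option used only for the demonstration. Once a parser defines option strings that look like negative numbers (such as '-4' and '-6'), an argument like '-1' is no longer accepted as a value and is treated as an unknown option instead.

```python
import argparse


def build_parser(with_numeric_flags):
    parser = argparse.ArgumentParser(prog="demo", add_help=False)
    parser.add_argument("--offset")  # hypothetical option taking a value
    if with_numeric_flags:
        # option strings that look like negative numbers
        parser.add_argument("-4", dest="ipv4", action="store_true")
        parser.add_argument("-6", dest="ipv6", action="store_true")
    return parser


def try_parse(parser, argv):
    """Return the parsed namespace, or None if argparse rejects the input."""
    try:
        return parser.parse_args(argv)
    except SystemExit:  # argparse reports errors by exiting
        return None


# Without numeric-looking flags, '-1' is accepted as the value of --offset.
plain = try_parse(build_parser(False), ["--offset", "-1"])
# With '-4' / '-6' defined, the same command line is rejected.
numeric = try_parse(build_parser(True), ["--offset", "-1"])
# Writing the value with '=' sidesteps the ambiguity.
joined = try_parse(build_parser(True), ["--offset=-1"])
```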
Can the postprocessor use multiple filters? I'm trying but I'm getting |
@God-damnit-all @biggestsonicfan |
Gotcha. It might be nice to clarify that in the post-processor docs, as that's where I got the idea to use it as a list. My idea is to use filters to run specific postprocessors in order if:
Which I think would resolve to:
|
#5262 (comment) allow (theoretically*) all filter expression statements to be a list of individual filters (*) except for 'filename' and 'directory' conditionals, as dict keys cannot be lists
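A rough model of what a list of filters could mean, assuming the individual expressions are combined with a logical AND. The commit message above does not state how they are combined, so treat that as an assumption. `compile_filters` is a hypothetical helper and the field names are invented.

```python
def compile_filters(filters):
    """Accept one expression or a list of expressions and AND them together."""
    if isinstance(filters, str):
        filters = [filters]
    # Assumption: every filter in the list has to pass.
    expr = " and ".join("(" + f + ")" for f in filters)
    code = compile(expr, "<filter>", "eval")
    return lambda metadata: bool(eval(code, {}, metadata))


check = compile_filters(["extension == 'gif'", "width > 100"])

big_gif = check({"extension": "gif", "width": 200})
small_gif = check({"extension": "gif", "width": 50})
single = compile_filters("width > 100")({"width": 150})
```

Wrapping each expression in parentheses keeps operators such as `or` inside one filter from leaking into the next.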
@biggestsonicfan |
Continuation of the previous issue as a central place for any sort of question or suggestion not deserving their own separate issue.
Links to older issues: #11, #74, #146.