[photovogue] added portfolio extractor #1253
Thanks a lot for your contribution.
There are some minor things that should be changed to fit the usual conventions.
from .common import Extractor, Message
from datetime import datetime


BASE_PATTERN = r"(?:https?://)?(?:www.vogue.it(?:/en)?/photovogue)"
- BASE_PATTERN = r"(?:https?://)?(?:www.vogue.it(?:/en)?/photovogue)"
+ BASE_PATTERN = r"(?:https?://)?(?:www\.)?vogue\.it/(?:en/)?photovogue"
This would also allow non-www URLs.
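To make the difference between the two patterns concrete, here is a small standalone check using only the standard `re` module. The three sample URLs come from the tests in this PR; the fourth, malformed host is a made-up string that shows the unescaped dots in the original pattern also match arbitrary characters.

```python
import re

OLD_PATTERN = r"(?:https?://)?(?:www.vogue.it(?:/en)?/photovogue)"
NEW_PATTERN = r"(?:https?://)?(?:www\.)?vogue\.it/(?:en/)?photovogue"

urls = [
    "https://www.vogue.it/en/photovogue/portfolio/?id=221252",
    "https://www.vogue.it/photovogue/portfolio/?id=221252",
    "https://vogue.it/photovogue/portfolio/?id=221252",  # no "www."
]


def matches(pattern, url):
    """Return True if the pattern matches at the start of the URL."""
    return re.match(pattern, url) is not None


old_results = [matches(OLD_PATTERN, u) for u in urls]
new_results = [matches(NEW_PATTERN, u) for u in urls]

# Made-up host: an unescaped "." in the old pattern matches any character.
bogus = "https://wwwxvoguexit/photovogue"
old_accepts_bogus = matches(OLD_PATTERN, bogus)
new_accepts_bogus = matches(NEW_PATTERN, bogus)
```

With the suggested pattern all three URLs match and the malformed host is rejected.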
directory_fmt = ("{category}", "{photographer[id]}_{photographer[name]}")
filename_fmt = "{id}_{title}.{extension}"
archive_fmt = "{id}"
root = "https://www.vogue.it/en/photovogue/"
- root = "https://www.vogue.it/en/photovogue/"
+ root = "https://www.vogue.it/en/photovogue"
By convention, `root` doesn't end with a `/`.
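One practical reason for such a convention, assuming request URLs are built by appending a path that starts with `/` to `root` (an assumption for illustration, not something stated in this review), is that a trailing slash produces a doubled separator:

```python
# Assumption: URLs are formed as root + "/some/path".
ROOT_WITH_SLASH = "https://www.vogue.it/en/photovogue/"
ROOT = "https://www.vogue.it/en/photovogue"

path = "/portfolio/?id=221252"

doubled = ROOT_WITH_SLASH + path  # contains "photovogue//portfolio"
clean = ROOT + path               # contains "photovogue/portfolio"
```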
("https://www.vogue.it/en/photovogue/portfolio/?id=221252",),
("https://www.vogue.it/photovogue/portfolio/?id=221252",),
- ("https://www.vogue.it/en/photovogue/portfolio/?id=221252",),
- ("https://www.vogue.it/photovogue/portfolio/?id=221252",),
+ ("https://www.vogue.it/en/photovogue/portfolio/?id=221252"),
+ ("https://vogue.it/photovogue/portfolio/?id=221252"),
Tests should either be a single string or a `(url, results)` tuple. `python test_results.py photovogue` fails when they are single-element tuples.
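The distinction is easy to miss because only the trailing comma differs. The sketch below shows what Python makes of each form; `split_test` is a hypothetical helper written for this illustration and is not the actual code in `test_results.py`, and the `{"count": 1}` results dict is likewise a placeholder.

```python
URL = "https://www.vogue.it/en/photovogue/portfolio/?id=221252"

as_tuple = (URL,)   # trailing comma: a one-element tuple
as_string = (URL)   # parentheses only: still a plain str


def split_test(test):
    """Hypothetical unpacking of one test entry into (url, results)."""
    if isinstance(test, str):
        return test, None
    url, results = test  # raises ValueError for a one-element tuple
    return url, results


string_case = split_test(as_string)
pair_case = split_test((URL, {"count": 1}))  # placeholder results dict
```

Calling `split_test(as_tuple)` raises `ValueError`, since a one-element tuple cannot be unpacked into two names.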
res = self.request(
    "https://api.vogue.it/production/photos",
    params={
        "count": 50,
        "order_by": "DESC",
        "page": page,
        "photographer_id": self.user_id,
    },
).json()
This is just personal preference, but I'd prepare `url` and `params` before the loop, and then just call `res = self.request(url, params=params).json()` inside the loop.
This would also get rid of the `page` variable, since you could just do `params["page"] += 1` instead.
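A rough sketch of that loop shape follows. To keep it runnable on its own, `fake_request` stands in for `self.request(url, params=params).json()` and returns canned pages; the starting page of 1 and the "stop on an empty `items` list" condition are assumptions, since the original loop's start and exit conditions are not shown in this excerpt.

```python
def fake_request(url, params):
    """Stand-in for self.request(url, params=params).json() (canned data)."""
    pages = {
        1: {"items": [{"id": 1}, {"id": 2}]},
        2: {"items": [{"id": 3}]},
    }
    return pages.get(params["page"], {"items": []})


def photos(user_id, request=fake_request):
    # Prepare url and params once, outside the loop.
    url = "https://api.vogue.it/production/photos"
    params = {
        "count": 50,
        "order_by": "DESC",
        "page": 1,  # assumed starting page
        "photographer_id": user_id,
    }

    while True:
        res = request(url, params)
        if not res["items"]:  # assumed termination condition
            return
        yield from res["items"]
        params["page"] += 1  # replaces the separate page variable


ids = [item["id"] for item in photos("221252")]
```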
for item in res["items"]:
    item["extension"] = "jpg"
    item["title"] = item["title"].strip()
    item["_mtime"] = datetime.fromisoformat(
        item["date"].replace("Z", "+00:00")
    ).timestamp()

    yield item
Just do a `yield from res["items"]` and do all other processing outside.
Speaking of.
- add `from .. import text` at the very top
- `item["extension"] = "jpg"` -> `text.nameext_from_url(<image url>, item)`
- `item["_mtime"] = …` -> `item["date"] = text.parse_datetime(item["date"], "%Y-%m-%dT%H:%M:%S.%f%z")`
  - by convention, `date` is a `datetime` object
  - photovogue images have a `Last-Modified` header, so mtime modification should happen automatically. If you want the contents of `date` as mtime, use `--mtime-from-date` or the `mtime` post processor.
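For readers without the gallery-dl source at hand, the per-item processing suggested above can be approximated with the standard library. This is only a sketch: `prepare` is a hypothetical function, the `os.path`/`urllib` calls are stand-ins for `text.nameext_from_url`, `datetime.strptime` is a stand-in for `text.parse_datetime`, and the sample item and image URL are invented. The real helpers may differ in detail, for example in how they handle time zones.

```python
import os.path
from datetime import datetime, timezone
from urllib.parse import urlsplit

DATE_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"  # format string from the review


def prepare(item, image_url):
    """Fill in filename, extension, title and date for one item."""
    # Derive name and extension from the URL path instead of hardcoding "jpg".
    path = urlsplit(image_url).path
    name, ext = os.path.splitext(os.path.basename(path))
    item["filename"] = name
    item["extension"] = ext.lstrip(".").lower()

    item["title"] = item["title"].strip()

    # Store a datetime object; %z accepts a trailing "Z" on Python 3.7+.
    item["date"] = datetime.strptime(item["date"], DATE_FORMAT)
    return item


# Invented sample data, for illustration only.
sample = prepare(
    {"id": 1, "title": "  Untitled ", "date": "2021-01-30T12:34:56.000Z"},
    "https://example.org/images/12345.JPG?size=large",
)
```

Keeping `date` as a `datetime` object leaves the mtime decision to the `--mtime-from-date` option or the `mtime` post processor mentioned above, rather than computing `_mtime` in the extractor.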
@@ -94,6 +94,7 @@ Nyafuu Archive https://archive.nyafuu.org/ Boards, Search Results,
Patreon https://www.patreon.com/ Creators, Posts, User Profiles `Cookies <https://github.com/mikf/gallery-dl#cookies>`__
Pawoo https://pawoo.net/ Images from Statuses, User Profiles `OAuth <https://github.com/mikf/gallery-dl#oauth>`__
Photobucket https://photobucket.com/ Albums, individual Images
PhotoVogue https://www.vogue.it/en/photovogue/ User profiles
The contents of this file are auto-generated by `supportedsites.py`.
Update `CATEGORY_MAP` in there if necessary and run it once.
Thanks! Sorry I didn't have time during the week to address the mentioned issues.
First contribution, there's probably something missing/not ideal. Let me know and I'll fix it. 😃