Pixiv User Background Banner #2495

Closed
afterdelight opened this issue Apr 13, 2022 · 41 comments

Comments

@afterdelight

Please add an option to download the Pixiv user background banner. ty

@AlttiRi

AlttiRi commented Apr 16, 2022

It's also possible to replace
https://i.pximg.net/c/1920x960_80_a2_g5/background/img/...
with
https://i.pximg.net/background/img/...
to get the original image.

(That is, remove /c/1920x960_80_a2_g5.)
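
As a rough illustration of the URL rewrite described above, here is a minimal Python sketch. The helper name `original_background_url` is made up for this example, and it assumes the resize segment always has the form `/c/<parameters>` directly before `/background/`; only the single example from this thread backs that assumption.

```python
import re

# Matches a resize path segment such as "/c/1920x960_80_a2_g5".
# Assumption: the segment always sits directly before "/background/".
_RESIZE_SEGMENT = re.compile(r"/c/[^/]+(?=/background/)")


def original_background_url(url: str) -> str:
    """Return the URL with the '/c/<parameters>' segment removed, if present."""
    return _RESIZE_SEGMENT.sub("", url, count=1)


# The "..." is the elided part of the URL exactly as quoted in the thread.
resized = "https://i.pximg.net/c/1920x960_80_a2_g5/background/img/..."
print(original_background_url(resized))
```

A URL that is already in the original form passes through unchanged, so the helper can be applied unconditionally.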

BTW, are banners supported for naming (is it possible to add a special rule for them, since they have no title, num, or id?) and for the download archive? (I would not like to download the same banner each time.)

@afterdelight
Author

@AlttiRi

AlttiRi commented Apr 17, 2022

@afterdelight
Author

It's 403 Forbidden.

@AlttiRi

AlttiRi commented Apr 18, 2022

It requires a Referer HTTP request header whose value is Pixiv's origin.
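
For anyone trying this outside the browser, a minimal sketch with Python's standard `urllib.request` might look like the following. The helper name `build_image_request` and the exact origin string are assumptions for illustration; the thread only says the header must carry Pixiv's origin.

```python
import urllib.request

# Assumption: this origin value is accepted by the image server.
PIXIV_ORIGIN = "https://www.pixiv.net/"


def build_image_request(image_url: str) -> urllib.request.Request:
    """Create a request that carries a Referer header pointing at Pixiv."""
    return urllib.request.Request(image_url, headers={"Referer": PIXIV_ORIGIN})


# The "..." is the elided part of the URL exactly as quoted in the thread.
request = build_image_request("https://i.pximg.net/background/img/...")
print(request.get_header("Referer"))

# To actually download, pass the request to urllib.request.urlopen():
#     with urllib.request.urlopen(request) as response:
#         data = response.read()
```

The download step is left as a comment because it needs a complete image URL and network access.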

@afterdelight
Author

How do I call it?

@AlttiRi

AlttiRi commented Apr 18, 2022

Edit any visible image HTML element <img src=""> on the site with DevTools, replacing the original src attribute value with any other image URL.
The browser will load the new image with the site origin as the Referer header value.

Never mind.

@afterdelight
Author

I don't get it.

@AlttiRi

AlttiRi commented Apr 19, 2022

Okay, a UserScript (updated on 2022.12.21):

// ==UserScript==
// @name        Pixiv BG Download Script
// @namespace   Pixiv
// @version     0.0.6-2022.12.21
// @match       https://www.pixiv.net/en/users/*
// @match       https://www.pixiv.net/users/*
// @description Pixiv BG Download Button
// @grant       GM_registerMenuCommand
// @grant       GM_xmlhttpRequest
// @connect     i.pximg.net
// @noframes
// @author      [Alt'tiRi]
// @supportURL  https://github.com/mikf/gallery-dl/issues/2495#issuecomment-1102505269
// ==/UserScript==



// ------------------------------------------------------------------------------------
// Init
// ------------------------------------------------------------------------------------

const globalFetch = ujs_getGlobalFetch();
const fetch = GM_fetch;

if (globalThis.GM_registerMenuCommand /* undefined in Firefox with VM */ || typeof GM_registerMenuCommand === "function") {
    GM_registerMenuCommand("Download BG", downloadBg);
}



// ------------------------------------------------------------------------------------
// Main code
// ------------------------------------------------------------------------------------

function downloadBg() {
    const userId = parseUserId();
    void downloadBgWithApi(userId);
}

function parseUserId(url = location.href) {
    const _url = new URL(url);
    const id = _url.pathname.match(/(?<=users\/)\d+/)[0];
    return id;
}

async function downloadBgWithApi(userId) {
    const titleText = document.title;
    try {
        document.title = "💤" + titleText;
        const resp = await globalFetch("https://www.pixiv.net/ajax/user/" + userId);
        const json = await resp.json();

        if (!json?.body?.background?.url) {
            document.title = "⬜" + titleText;
            console.log("[ujs] no bg");
            await sleep(1000);            
            return;
        }

        const {name: userName, background} = json.body;
        const url = background.url.replace("/c/1920x960_80_a2_g5", "");

        document.title = "⏳" + titleText;
        const {blob, lastModifiedDate, filename} = await fetchResource(url, {headers: {"referer": location.href}});
        const filenamePrefix = userId + "_";
        const _filename = filename.startsWith(filenamePrefix) ? filename.slice(filenamePrefix.length) : filename;
        const name = `[pixiv][bg] ${userId}—${userName}—${lastModifiedDate}—${_filename}`;
        download(blob, name, url);
      
        document.title = "✅" + titleText;
        await sleep(5000);
    } catch (e) {
        console.error(e);
        document.title = "❌" + titleText;
        await sleep(5000);        
    } finally {
        document.title = titleText;
    }
}


// ------------------------------------------------------------------------------------
// GM Util
// ------------------------------------------------------------------------------------

function ujs_getGlobalFetch({verbose, strictTrackingProtectionFix} = {}) {
    const useFirefoxStrictTrackingProtectionFix = strictTrackingProtectionFix === undefined ? true : strictTrackingProtectionFix; // Let's use by default
    const useFirefoxFix = useFirefoxStrictTrackingProtectionFix && typeof wrappedJSObject === "object" && typeof wrappedJSObject.fetch === "function";
    // --- [VM/GM + Firefox ~90+ + Enabled "Strict Tracking Protection"] fix --- //
    function fixedFirefoxFetch(resource, init = {}) {
        verbose && console.log("wrappedJSObject.fetch", resource, init);
        if (init.headers instanceof Headers) {
            // Since `Headers` are not allowed for structured cloning.
            init.headers = Object.fromEntries(init.headers.entries());
        }
        return wrappedJSObject.fetch(cloneInto(resource, document), cloneInto(init, document));
    }
    return useFirefoxFix ? fixedFirefoxFetch : globalThis.fetch;
}

// The simplified `fetch` — wrapper for `GM_xmlhttpRequest`
/* Using:
// @grant       GM_xmlhttpRequest

const response = await fetch(url);
const {status, statusText} = response;
const lastModified = response.headers.get("last-modified");
const blob = await response.blob();
*/
async function GM_fetch(url, init = {}) {
    const defaultInit = {method: "get"};
    const {headers, method} = {...defaultInit, ...init};

    return new Promise((resolve, _reject) => {
        const blobPromise = new Promise((resolve, reject) => {
            GM_xmlhttpRequest({
                url,
                method,
                headers,
                responseType: "blob",
                onload: (response) => resolve(response.response),
                onerror: reject,
                onreadystatechange: onHeadersReceived
            });
        });
        blobPromise.catch(_reject);
        function onHeadersReceived(response) {
            const {
                readyState, responseHeaders, status, statusText
            } = response;
            if (readyState === 2) { // HEADERS_RECEIVED
                const headers = parseHeaders(responseHeaders);
                resolve({
                    headers,
                    status,
                    statusText,
                    arrayBuffer: () => blobPromise.then(blob => blob.arrayBuffer()),
                    blob: () => blobPromise,
                    json: () => blobPromise.then(blob => blob.text()).then(text => JSON.parse(text)),
                    text: () => blobPromise.then(blob => blob.text()),
                });
            }
        }
    });
}
function parseHeaders(headersString) {
    class Headers {
        get(key) {
            return this[key.toLowerCase()];
        }
    }
    const headers = new Headers();
    for (const line of headersString.trim().split("\n")) {
        const [key, ...valueParts] = line.split(":"); // last-modified: Fri, 21 May 2021 14:46:56 GMT
        headers[key.trim().toLowerCase()] = valueParts.join(":").trim();
    }
    return headers;
}


// ------------------------------------------------------------------------------------
// Util
// ------------------------------------------------------------------------------------

function sleep(time) {
    return new Promise(resolve => setTimeout(resolve, time));
}

// Using:
// const {blob, lastModifiedDate, contentType, filename, name, extension, status} = await fetchResource(url);
//
async function fetchResource(url, init = {}) {
    const response = await fetch(url, {
        cache: "force-cache",
        ...init,
    });
    const {status} = response;
    const lastModifiedDateSeconds = response.headers.get("last-modified");
    const contentType = response.headers.get("content-type");

    const lastModifiedDate = dateToDayDateString(lastModifiedDateSeconds);
    const extension = extensionFromMime(contentType);
    const blob = await response.blob();

    const _url = new URL(url);
    const {filename} = (_url.origin + _url.pathname).match(/(?<filename>[^\/]+$)/).groups;
    const {name} = filename.match(/(?<name>^[^\.]+)/).groups;

    return {blob, lastModifiedDate, contentType, filename, name, extension, status};
}

// "Sun, 10 Jan 2021 22:22:22 GMT" -> "2021.01.10"
function dateToDayDateString(dateValue, utc = true) {
    const _date = new Date(dateValue);
    if (_date.toString() === "Invalid Date") {
        throw new Error("Invalid Date");
    }
    function pad(str) {
        return str.toString().padStart(2, "0");
    }
    const _utc = utc ? "UTC" : "";
    const year  = _date[`get${_utc}FullYear`]();
    const month = _date[`get${_utc}Month`]() + 1;
    const date  = _date[`get${_utc}Date`]();

    return year + "." + pad(month) + "." + pad(date);
}

function extensionFromMime(mimeType) {
    let extension = mimeType.match(/(?<=\/).+/)[0];
    extension = extension === "jpeg" ? "jpg" : extension;
    return extension;
}

function download(blob, name, url) {
    const anchor = document.createElement("a");
    anchor.setAttribute("download", name || "");
    const blobUrl = URL.createObjectURL(blob);
    anchor.href = blobUrl + (url ? ("#" + url) : "");
    anchor.click();
    setTimeout(() => URL.revokeObjectURL(blobUrl), 5000);
}

https://www.pixiv.net/en/users/42083333

@afterdelight
Author

Where did you get that code?

@afterdelight
Author

I don't see any download button on the page after installing the script.

@AlttiRi

AlttiRi commented Apr 19, 2022

I don't see any download button on the page

It's not on the page. It is registered with GM_registerMenuCommand, so it appears in the userscript manager's menu.

@afterdelight
Author

I clicked the button, but nothing happened.

@AlttiRi

AlttiRi commented Apr 19, 2022

Found the typo. Fixed. Should work now.

(It worked for me only because another script of mine was installed.)


It works for me. Tested in Firefox with Violentmonkey, Tampermonkey and in Chrome.

@afterdelight
Author

Still doesn't work. I use Firefox.

@AlttiRi

AlttiRi commented Apr 19, 2022

Well, I have fixed multiple bugs (3 in total; the last two were Firefox-only). Maybe you tried an intermediate version, or a cached one.

If it still does not work for you, I don't know the reason.

@afterdelight
Author

Still doesn't work for me. Maybe I'm missing the other script you have installed.

@AlttiRi

AlttiRi commented Apr 20, 2022

By the way, in a way you were right.
I tested it in 3 different browsers (I even checked that it works with different adblock extensions), but it worked for me only because I have the PixivToolkit extension installed in all of them. It modifies all requests on the Pixiv site (it adds the Access-Control-Allow-Origin: * header), not only the requests made by the extension itself. That side effect was unexpected.

Okay, finally fixed.

@afterdelight
Author

Finally it works! Now we need somebody who can implement it in gallery-dl.

@afterdelight
Author

ty!

@AlttiRi

AlttiRi commented May 2, 2022

8475698 looks good, however, I don't think that these are proper:

archive_fmt = "avatar_{user[id]}"

archive_fmt = "background_{user[id]}"

The avatar and BG can change from time to time, and this archive_fmt does not handle that case.

It should additionally use a value from the URL:

  • either the date (2022/02/22/17/50/48),
  • or the hash from the filename, a96f0c7e94df3824568249734d7933a5 (42083333_a96f0c7e94df3824568249734d7933a5.jpg),

so that new versions of the bg/ava are downloaded.
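
To make the suggestion concrete, here is a hedged Python sketch that pulls the date and hash out of a banner URL and builds a per-version archive key. The full URL layout is an assumption pieced together from the fragments quoted in this thread, and `parse_banner_url` / `archive_key` are hypothetical helper names, not gallery-dl functions.

```python
import re
from datetime import datetime
from typing import Optional

# Assumed URL layout, pieced together from fragments quoted in this thread:
#   .../img/<YYYY/MM/DD/hh/mm/ss>/<user id>_<32-char hash>.<ext>
_BANNER_URL = re.compile(
    r"/img/(?P<date>\d{4}(?:/\d{2}){5})/"
    r"(?P<user_id>\d+)_(?P<hash>[0-9a-f]{32})\.\w+$"
)


def parse_banner_url(url: str) -> Optional[dict]:
    """Extract user id, path date and hash; None if the layout differs."""
    match = _BANNER_URL.search(url)
    if match is None:
        return None
    return {
        "user_id": match["user_id"],
        "date": datetime.strptime(match["date"], "%Y/%m/%d/%H/%M/%S"),
        "hash": match["hash"],
    }


def archive_key(kind: str, url: str) -> Optional[str]:
    """Build a key such as 'background_<user id>_<hash>'."""
    info = parse_banner_url(url)
    if info is None:
        return None
    return f"{kind}_{info['user_id']}_{info['hash']}"


example = (
    "https://i.pximg.net/background/img/2022/02/22/17/50/48/"
    "42083333_a96f0c7e94df3824568249734d7933a5.jpg"
)
print(archive_key("background", example))
```

Because the key changes whenever the hash changes, a replaced banner would no longer be skipped as "already downloaded", while an unchanged one still would.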


BTW, a question:
How do I specify a special filename pattern for bg/ava?
Currently I get the error:

[pixiv][error] FilenameFormatError: Applying filename format string failed (TypeError: unsupported format string passed to NoneType.__format__)

I would like to create a filename like the one in #2495 (comment):

  • [pixiv][bg] 42083333—トックリブンブク—2022.02.22—a96f0c7e94df3824568249734d7933a5.jpg

It uses the date and the "hash" from the filename (for example, filename[id.length + 1: filename.length] will not work in the config).

@afterdelight
Author

afterdelight commented May 2, 2022

I think [bg] filename_hash.jpg would be sufficient. You could add the date too, so it would be like [bg] [date] filename_hash.jpg
Sorry, I don't know how to solve that error.

mikf added a commit that referenced this issue May 4, 2022
- add 'date' metadata to avatar/background files when available
  and use that in default filenames / archive ids
- remove deprecation warnings as their option names clash with
  subcategory names
@mikf
Owner

mikf commented May 4, 2022

The avatar and BG can change from time to time, and this archive_fmt does not handle that case.

I did not consider that it can change over time. Should be fixed in 9adea93.

How do I specify a special filename pattern for bg/ava?

By putting options for them in an avatar / background block inside your pixiv settings, like with any other "subcategory":

"pixiv":
{
    "avatar"    : {
        "directory": ["{category}", "{user[id]}"],
        "filename" : "..."
    },
    "background": {
        "directory": ["{category}", "{user[id]}"],
        "filename" : "..."
    }
}

Currently I get the error:

Because the date field for avatars/bgs was always None, and applying a datetime format to None results in an error.
With 9adea93, date now usually has a valid value, but there are still instances where date is None (default avatars and bgs). You can use the ? format operator to ignore this field in that case.
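
The error and the guard can be reproduced in plain Python. This is only a sketch of the idea behind skipping an empty field, not gallery-dl's formatter; `format_optional_date` is a hypothetical helper and the `%Y.%m.%d` spec is borrowed from the filename patterns used elsewhere in this thread.

```python
from datetime import datetime
from typing import Optional


def format_optional_date(value: Optional[datetime], spec: str = "%Y.%m.%d") -> str:
    """Format the date, or return an empty string when it is None."""
    if value is None:
        return ""
    return format(value, spec)


print(format_optional_date(datetime(2022, 2, 22, 17, 50, 48)))
print(repr(format_optional_date(None)))

# Without the guard, formatting None raises the TypeError seen in the log:
try:
    format(None, "%Y.%m.%d")
except TypeError as exc:
    print(exc)
```

The guarded version yields an empty string for default avatars and backgrounds, so the rest of the filename can still be built.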

@afterdelight
Author

@mikf With the "directory": ["{category}", "{user[id]}"] setting, will they both create avatar and background subfolders? Sorry, I'm new to this.

@AlttiRi

AlttiRi commented May 4, 2022

Looks like it's working. One minor thing I still want is the "hash" from the filename.
It's just:

filename_hash = filename.split("_")[1]

It's not an important value, but I would prefer to save it just in case.

Currently "filename": "[{category}][bg] {user[id]}—{user[name]}—{date:?//%Y.%m.%d}—{filename}.{extension}" has a duplicate {user[id]} in {filename}.


with "directory": ["{category}", "{user[id]}"] setting.

{category} will in any case be the string pixiv.

For subfolders: "directory": ["{category}", "{user[id]}", "{subcategory}"]


BTW, "include" is order-dependent:

  • "include": ["artworks", "background", "avatar"] is not the same as
  • "include": ["background", "artworks", "avatar"]

@mikf
Owner

mikf commented May 4, 2022

one minor thing I still want is the "hash" from the filename.

Since the hash, if it exists, is always 32 characters long, you can get it by slicing the last 32 characters from the filename: {filename[-32:]}
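
A quick Python check of how the slice compares with the split suggested earlier. It assumes `filename` holds the name without its extension, as the `{filename}.{extension}` patterns in this thread suggest; `filename_hash` is just an illustrative helper.

```python
def filename_hash(filename: str) -> str:
    """Mirror the slice {filename[-32:]}: keep the last 32 characters."""
    return filename[-32:]


name = "42083333_a96f0c7e94df3824568249734d7933a5"
print(filename_hash(name))
print(filename_hash(name) == name.split("_")[1])

# A name without a hash (hypothetical example) is returned unchanged by the
# slice, whereas split("_")[1] would raise IndexError for it.
print(filename_hash("no-hash"))
```

So the slice is the more forgiving of the two when a file has no hash part at all.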

BTW, "include" is order-dependent:

That's by design. All other include options for other sites behave the same way.

@AlttiRi

AlttiRi commented May 4, 2022

Probably the last issue: how do I prevent the "metadata" postprocessor from running for "background" and "avatar"?

That is, limit it to the "artworks" subcategory only.

https://github.com/mikf/gallery-dl/blob/master/docs/configuration.rst
has no mention of subcategories in the "Postprocessor Options" section.


UPD:
I can just put the "postprocessors" in the "artworks" subcategory, as in the filename example above.

"pixiv": {
    "artworks": {
        "postprocessors": []
    }
}

@afterdelight
Author

afterdelight commented May 4, 2022

I want to put folders for specific tags inside an R-18 folder. How do I do that?
If a post has a tag, for example 'hand', I want to put it in
pixiv/user id/non NSFW/hand
pixiv/user id/r-18(NSFW)/hand

The rest of the downloads, which don't have any of the tags I specify, should go to the NSFW or non-NSFW folder.
Could you please explain? ty

@mikf
Owner

mikf commented May 4, 2022

@AlttiRi this concept is explained at the very top and applies to all/most options:
https://github.com/mikf/gallery-dl/blob/master/docs/configuration.rst#extractor-options

@afterdelight conditional directory and filenames:

"directory": {
    "'女の子' in tags": ["{category}", "{user[id]}", "{rating}", "girl"],
    "'足指' in tags"  : ["{category}", "{user[id]}", "{rating}", "toes"],
    ""               : ["{category}", "{user[id]}", "{rating}"]
}

tags by default are Japanese only on Pixiv.
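
To show how such a conditional mapping can be read, here is a plain-Python sketch of the selection logic. It is not gallery-dl's implementation: it assumes conditions are tried in the order written, the first true one wins, and the empty key is the fallback. `select_directory` is a hypothetical helper, and `eval` is used only for brevity on trusted, hand-written conditions.

```python
from typing import Any, Dict, List


def select_directory(rules: Dict[str, List[str]], metadata: Dict[str, Any]) -> List[str]:
    """Return the directory template of the first condition that holds."""
    for condition, directory in rules.items():
        if condition == "":
            continue  # the empty key is the fallback, handled at the end
        # Evaluate the expression with the metadata fields as variables.
        if eval(condition, {"__builtins__": {}}, dict(metadata)):
            return directory
    return rules[""]


rules = {
    "'女の子' in tags": ["{category}", "{user[id]}", "{rating}", "girl"],
    "'足指' in tags": ["{category}", "{user[id]}", "{rating}", "toes"],
    "": ["{category}", "{user[id]}", "{rating}"],
}

print(select_directory(rules, {"tags": ["女の子", "足指"]}))
print(select_directory(rules, {"tags": ["風景"]}))
```

Under these assumptions a post carrying both tags lands in the folder of whichever condition is listed first, so the order of the entries matters.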

@afterdelight
Author

afterdelight commented May 4, 2022

So I can't write the tags in English? What's that rating for? To separate R-18 and non-R-18?

@AlttiRi

AlttiRi commented May 5, 2022

tags by default are Japanese only on Pixiv.

"tags": "translated"

R-18 is just a tag.

@AlttiRi

AlttiRi commented May 5, 2022

A somewhat questionable behavior:

$ gg https://www.pixiv.net/en/users/81311664 -K
[pixiv][info] This extractor only spawns other extractors and does not provide any metadata on its own.
[pixiv][info] Showing results for 'https://www.pixiv.net/users/81311664/background' instead:

[pixiv][info] No results for https://www.pixiv.net/users/81311664/background

This user has no bg. I have set "include": ["background", "artworks"],.

Although https://www.pixiv.net/en/users/81311664/artworks -K works as expected.

@afterdelight
Author

tags by default are Japanese only on Pixiv.

"tags": "translated"

R-18 is just a tag.

okay man, thanks for the info

@afterdelight
Author

How do I set it up when a post has two tags, e.g. if it has r-18 and feet, it goes to
pixiv/user id/r-18(NSFW)/hand

@AlttiRi

AlttiRi commented Jun 3, 2022

Bug

BG downloading fails if the extension is gif.
Currently it tries to download jpg first, then png.

Does it really require guessing the extension?
For example, https://www.pixiv.net/ajax/user/ returns the full URL with the extension. Does the mobile API not?

@mikf
Owner

mikf commented Jun 3, 2022

Guessing the extension sadly is necessary, or at least I haven't found a better way.

https://www.pixiv.net/ajax/user/ would return the full URL, but it sometimes returns an empty background entry even though that user clearly has a background image in their profile. Maybe this is R18 related?

The mobile API only returns the _master1200 version, which almost always has .jpg as extension, but I think I've seen a .gif background that worked without fallback / guessing, which is why .gif wasn't included in the fallback list until now.
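
The fallback idea can be sketched in a few lines of Python. This is an illustration, not gallery-dl's code: the order jpg, png, gif is an assumption based on this comment, and `candidate_urls`, `first_available` and the `exists` callback are hypothetical names. The network check is left to the caller so the logic stays testable offline.

```python
from typing import Callable, Iterable, Iterator, Optional

# Assumed fallback order: jpg and png as described, gif appended last.
FALLBACK_EXTENSIONS = ("jpg", "png", "gif")


def candidate_urls(
    url_without_ext: str, extensions: Iterable[str] = FALLBACK_EXTENSIONS
) -> Iterator[str]:
    """Yield the URL once per extension, in fallback order."""
    for ext in extensions:
        yield f"{url_without_ext}.{ext}"


def first_available(
    url_without_ext: str, exists: Callable[[str], bool]
) -> Optional[str]:
    """Return the first candidate accepted by `exists`, or None."""
    for candidate in candidate_urls(url_without_ext):
        if exists(candidate):
            return candidate
    return None


# Offline stand-in for a real HTTP check: pretend only the .gif exists.
base = "https://i.pximg.net/background/img/.../42083333_a96f0c7e94df3824568249734d7933a5"
print(first_available(base, lambda url: url.endswith(".gif")))
```

In real use, `exists` would issue a request (with the Referer header) and report whether the server answered successfully.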

@AlttiRi

AlttiRi commented Jun 3, 2022

I just rechecked the BGs downloaded by gallery-dl with my userscript above (it uses https://www.pixiv.net/ajax/user/).

It (this endpoint) seems to work fine.

Maybe you tested without cookies?

@mikf
Owner

mikf commented Jun 3, 2022

Well, yes. gallery-dl doesn't use cookies for Pixiv, only an OAuth access_token.

@afterdelight
Author

How do I set it up when a post has two tags, e.g. if it has r-18 and feet, it goes to pixiv/user id/r-18(NSFW)/hand

Hi guys, could you please answer my previous question? Thanks. Or should I create a new issue?

@AlttiRi

AlttiRi commented Jun 4, 2022

Bump #2513, or create a new one in Discussions.

For me, it's not a good idea to group Pixiv works by tag instead of by artist.

I keep descriptions and tags in metadata HTML files placed nearby in a subfolder.

BTW, Windows search can find text content within text (HTML) files if you enable indexing of the folder. (By default, indexing is enabled only for "Downloads" and a few other system folders.)

My config:

        "pixiv":
        {
            "tags": "translated",
            "directory": ["[gallery-dl]", "[{category}] {user[id]}—{user[name]}"],
            "filename": "[{category}] {user[id]}—{id}—{user[name]}—{date:%Y.%m.%d}—{title}—{num}.{extension}",
            "include": ["background", "artworks"],
            "background": {
                "filename" : "[{category}][bg] {user[id]}—{user[name]}—{date:?//%Y.%m.%d}—{filename[-32:]}.{extension}"
            },
            "avatar": {
                "filename" : "[{category}][ava] {user[id]}—{user[name]}—{date:?//%Y.%m.%d}—{filename[-32:]}.{extension}"
            },
            "artworks": {
                "postprocessors": [{
                    "mtime": true,
                    "directory": "metadata",
                    "filename": "[{category}] {user[id]}—{id}—{user[name]}—{date:%Y.%m.%d}—{title}.html",
                    "name": "metadata",
                    "mode": "custom",
                    "format": "<div id='{id}'><h4><a href='https://www.pixiv.net/artworks/{id}'>{title}</a> by <a href='https://www.pixiv.net/users/{user[id]}'>{user[name]}</a></h4><div class='content'>{caption}</div><hr><div>{user[id]}—{id}—{date:%Y.%m.%d %H:%M:%S}{frames[0][delay]:?<br>Ugoira delay: / ms/}</div><hr><div class='tags'>[\"{tags:J\", \"}\"]</div><hr></div>"
                }]
            }
        },

It's also useful to concatenate all HTML files into one and open it in a browser with a bash command:

cat *.html > "$TEMP/_temp-catahtml-result.html"; start "$TEMP/_temp-catahtml-result.html"; sleep 0; exit;

This helps when you are looking for links in descriptions.
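
For shells where `start` or `$TEMP` are not available, roughly the same thing can be done with Python's standard library. This is a sketch; the function names are made up for the example and only the result file name is taken from the command above.

```python
import tempfile
import webbrowser
from pathlib import Path


def concatenate_html(source_dir: Path, target: Path) -> int:
    """Append every *.html file in source_dir to target; return the file count."""
    files = sorted(
        path for path in source_dir.glob("*.html")
        if path.resolve() != target.resolve()  # never read the result file itself
    )
    with target.open("wb") as out:
        for path in files:
            out.write(path.read_bytes())
    return len(files)


def default_target() -> Path:
    """Result path in the system temp directory, named as in the bash command."""
    return Path(tempfile.gettempdir()) / "_temp-catahtml-result.html"


def open_in_browser(path: Path) -> None:
    """Open the combined file in the default browser."""
    webbrowser.open(path.resolve().as_uri())
```

Typical use would be `concatenate_html(Path.cwd(), default_target())` followed by `open_in_browser(default_target())`, run from the metadata folder.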

@afterdelight
Author

No, what I really want is this:

A post with the r18 and hand tags goes to: pixiv-user/r-18/hand/file
A post with r18 but none of the specified tags goes to: pixiv-user/r-18/file
A non-r18 post with the feet tag goes to: pixiv-user/r-0/feet/file
A non-r18 post with none of the specified tags goes to: pixiv-user/r-0/file

Please guide me.
