[newgrounds] Add post_url and HTML content of comment #2328

Closed
AlttiRi opened this issue Feb 22, 2022 · 9 comments
Comments


AlttiRi commented Feb 22, 2022

I would like to have a key (post_url) that leads to the post.

The value of it is for images:
https://www.newgrounds.com/art/view/{user}/{filename.split("_").slice(2).join("")}
(Not tested, should work.)

for videos:
https://www.newgrounds.com/portal/view/{index}
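
The two URL patterns above could be sketched as a small helper. This is only a rough sketch and, like the expression above, not tested against the site: `buildPostUrl` and its `type` field are hypothetical names, the sample filename is made up, and it assumes image filenames have the form `<id>_<user>_<slug>`, as the expression above implies.

```javascript
// Hypothetical helper: builds the proposed post_url from metadata fields.
// Assumes image filenames look like "<id>_<user>_<slug>" (not verified).
function buildPostUrl({ type, user, filename, index }) {
    if (type === "image") {
        // Drop the first two "_"-separated parts, as in the expression above.
        const slug = filename.split("_").slice(2).join("");
        return `https://www.newgrounds.com/art/view/${user}/${slug}`;
    }
    // Videos are addressed by their numeric index.
    return `https://www.newgrounds.com/portal/view/${index}`;
}

// Made-up sample values, for illustration only.
console.log(buildPostUrl({ type: "image", user: "someuser", filename: "123_someuser_some-title" }));
console.log(buildPostUrl({ type: "video", index: 741256 }));
```

If a slug could itself contain an underscore, `join("")` would drop it; whether that can happen on Newgrounds is not something this sketch checks.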

It would be useful in HTMLs generated with postprocessors:

        "newgrounds": 
        {
            "directory": ["[gallery-dl]", "[{category}] {user}"],
            "filename": "[{category}] {user}—{date:%Y.%m.%d}—{_index}—{title}.{extension}",
            "postprocessors": [{
                "directory": "metadata",
                "filename": "[{category}] {user}—{date:%Y.%m.%d}—{index}—{title}.html",
                "name": "metadata",
                "mode": "custom",
                "format": "<div id='{index}'><h4>{title} by <a href='https://{user}.newgrounds.com/'>{user}</a></h4><div class='content'>{comment}</div><hr><div>{date:%Y.%m.%d %H:%M:%S}—{index}</div><hr><div class='tags'>[\"{tags:J\", \"}\"]</div><hr></div><br>"
            }]
        },

Formatted HTML:

<div id='{index}'>
   <h4><a href='{post_url}'>{title}</a> by <a href='https://{user}.newgrounds.com/'>{user}</a></h4>
   <div class='content'>{comment}</div>
   <hr>
   <div>{date:%Y.%m.%d %H:%M:%S}—{index}</div>
   <hr>
   <div class='tags'>[\"{tags:J\", \"}\"]</div>
   <hr>
</div>
<br>

Currently it (the concatenated HTML files) looks like this:

image

(With just {title} instead of <a href='{post_url}'>{title}</a>.)


By the way:

  • {num} is None for the first image.
  • {_index} is not listed with -K
Author

AlttiRi commented Feb 23, 2022

It definitely needs a key with the HTML text.

Currently for this:
image

I get

PatreonTwitterMegaSound by @RealAudiodudeVoice by AndrastaeTracer model by @HydraFXX

text in {comment}.

  • no URLs,
  • no new line characters.

Author

AlttiRi commented Feb 23, 2022

Also, the {description} key is sometimes empty.

For example: https://www.newgrounds.com/portal/view/741256

In HTML:

<meta name="description" content="Best part of Overwatch 2">
<meta property="og:description" content="Best part of Overwatch 2">
<meta name="twitter:description" content="Best part of Overwatch 2">

However, the {description} key is empty (or contains only whitespace).

Author

AlttiRi commented Feb 23, 2022

The content (HTML code) of the id="author_comments" element looks clean and simple. There is no need to change this HTML; just return it as is.

AlttiRi changed the title from "[newgrounds] Add post_url" to "[newgrounds] Add post_url and HTML content of comment" on Feb 23, 2022
Author

AlttiRi commented Feb 23, 2022

Summary, in one post:

  1. Request for a new {post_url} key.
  2. Return raw HTML in the {comment} key (as in deviantart's {description}, pixiv's {caption}, and mastodon's {content}).
  3. The {description} key is sometimes empty.
  4. {num} is None for the first image.
  5. {_index} is not listed with -K.

Author

AlttiRi commented Mar 11, 2022

{post_url} appears to work.

But the missing HTML in {comment} makes this key useless in most cases (whenever the comment contains hypertext), since the most important information is lost.

Owner

mikf commented Mar 11, 2022

I don't want to touch the current {comment} if possible.

For example for https://www.newgrounds.com/art/view/tomfulp/ryu-is-hawt it is

      "comment": "Consider this the bottom threshold for scouted artists.In fact consider it BELOW the bottom threshold.",

but would become this bloated mess when leaving all HTML tags in:

      "_comment": "<p>Consider this the bottom threshold for scouted artists.</p><p><br></p><p>In fact consider it BELOW the bottom threshold.</p><p><br></p><ul class=\"itemlist alternating\">\n\t<li><a href=\"https://www.newgrounds.com/portal/view/495979\" class=\"item-portalsubmission\" >\n\t<div class=\"item-icon\" style=\"position: relative\">\n\t\t<div\n\t\t\t\t\t>\n\t\t\t<img src=\"https://picon.ngfiles.com/495000/flash_495979_medium.jpg?f1601082331\" width=\"93\" height=\"60\" alt=\"Street Fighter Collab\">\n\t\t",
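
The difference between the two values can be illustrated with a deliberately naive tag stripper. This is only an illustration in JavaScript, not gallery-dl's actual implementation, and `stripTags` is a hypothetical name. Removing tags without putting any separator in their place glues paragraphs together and drops link targets, which is the effect reported earlier in this thread.

```javascript
// Hypothetical, deliberately naive tag stripper (not gallery-dl's code).
function stripTags(html) {
    // Remove everything that looks like a tag; nothing is inserted in its place.
    return html.replace(/<[^>]*>/g, "");
}

// First part of the HTML quoted above.
const html = "<p>Consider this the bottom threshold for scouted artists.</p>" +
             "<p><br></p><p>In fact consider it BELOW the bottom threshold.</p>";

// The two sentences end up joined without a space or line break.
console.log(stripTags(html));
```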

I could add an option for that like with furaffinity.descriptions, but you can also just directly use {_comment} instead of {comment}.

Keys starting with an underscore get ignored by -K, but you can view them with -j by enabling output.private. (Yes, I know, this option should work for both)

Newgrounds in particular has a lot of backwards compatibility baggage, like "{num} is None for the first image", which results from the extractor expecting only one file per post, combined with trying to preserve filenames and archive keys when adding support for multiple files per post.

> {description} key sometimes empty

Do you have an example where this shouldn't be the case?

Author

AlttiRi commented Mar 12, 2022

#2328 (comment)

Also {description} key sometimes is empty.

For example: https://www.newgrounds.com/portal/view/741256

In HTML:

<meta name="description" content="Best part of Overwatch 2">
<meta property="og:description" content="Best part of Overwatch 2">
<meta name="twitter:description" content="Best part of Overwatch 2">

However, {description} key is empty (or it has only some spaces).

Author

AlttiRi commented Mar 12, 2022

I see:
image


I faced a similar problem with kemonoparty's subscribestar posts, which have an extra </div>.

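    // Note: `service` is not defined here; it comes from the enclosing scope.
    function fixHTML(html) {
        // Turn plain newlines into HTML line breaks.
        html = html.replaceAll("\n", "<br>");
        if (service === "subscribestar") {
            // Let the browser's parser repair the markup (e.g. an extra </div>).
            const e = document.createElement("div");
            e.innerHTML = html;
            return e.innerHTML.replaceAll("&nbsp;", " ");
        }
        return html;
    }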

I think you can do the same thing.

Create an element object from the invalid HTML, then get the valid HTML back from the element object.

Author

AlttiRi commented Mar 12, 2022

Wait, it is valid HTML.
It looks like you just have a bug when cutting out part of the id="author_comments" content.

If you want to remove <div class="item-details"> correctly, create an element object first, remove the node, then get the new (correct) HTML.
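
The steps just described could look roughly like this in a browser. This is a sketch under assumptions, not gallery-dl code: `removeBySelector` is a hypothetical name, and it relies on the DOM (`document.createElement`), so it does not run in plain Node.js without a DOM library.

```javascript
// Browser-only sketch: relies on the DOM (document.createElement).
// `removeBySelector` is a hypothetical name, not part of gallery-dl.
function removeBySelector(html, selector) {
    const container = document.createElement("div");
    container.innerHTML = html;        // the parser builds a well-formed tree
    for (const node of container.querySelectorAll(selector)) {
        node.remove();                 // drop each matching subtree as a whole
    }
    return container.innerHTML;        // serialize what is left
}
```

For the case above, the call would be something like `removeBySelector(authorCommentsHtml, "div.item-details")`, where `authorCommentsHtml` stands for the inner HTML of the id="author_comments" element. gallery-dl itself is written in Python, so the same idea would need an HTML parser there rather than the browser DOM.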
