fix: fix handling of image and PDF links with special characters #84

Merged: ashishb merged 1 commit into main from fix1 on Nov 18, 2024

@ashishb (Owner) commented on Nov 18, 2024


Handling of other media files, such as audio, is still broken and is not worth fixing right now.

From the linked GitHub issue:

> Aeons ago I made [a blog post](https://habi.gna.ch/2003/07/26/canyoning/) including the image `https://habi.gna.ch/blog/images/Picture(2).jpg`
>
> When calling `wp2hugo` with
> ```bash
> ./src/wp2hugo/bin/wp2hugo -source habignach.WordPress.2024-10-29.xml  -download-media
> ```
> the process fails with
> ```text
> 05:33:50PM DBG hugo_gen_setup.go:412 > Embedded media links links=1 page=https://habi.gna.ch/2003/07/26/canyoning/
> 05:33:50PM DBG hugo_gen_setup.go:416 > Downloading media files links=1
> 05:33:50PM INF media_cache_setup.go:33 > media https://habi.gna.ch/blog/images/Picture(2 will be fetched
> 05:33:50PM FTL main.go:43 > Error: error fetching media file https://habi.gna.ch/blog/images/Picture(2: error fetching media https://habi.gna.ch/blog/images/Picture(2: 404 Not Found
> ```
>
> I think the fetching link needs escaping of the parenthesis :)
>
> `grep "Picture(2" habignach.WordPress.2024-10-29.xml` returns `<a href="http://habi.gna.ch/blog/images/Picture(2).jpg"><img src="http://habi.gna.ch/blog/images/Picture(2)-tm.jpg" height="288" width="352" align="middle" border="2" hspace="0" vspace="0" alt="" longdesc="" /></a><p>` BTW

Ref: #81
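
For illustration, the failure in the log above is consistent with a URL pattern that stops at `(`, so `Picture(2).jpg` is cut to `Picture(2`. The following is a minimal Go sketch, not the code from commit b1da435, of one way to pull `href` and `src` values out of post content so that such characters survive. The names `attrURL` and `extractMediaURLs` are hypothetical, and the sketch assumes double-quoted attribute values.

```go
package main

import (
	"fmt"
	"regexp"
)

// attrURL is a hypothetical pattern: it captures the double-quoted value of
// an href or src attribute. The value is delimited by the quotes only, so
// characters such as '(' and ')' inside the URL are kept.
var attrURL = regexp.MustCompile(`(?i)\b(?:href|src)\s*=\s*"([^"]+)"`)

// extractMediaURLs returns every href/src value found in html, in order of
// appearance. It does not deduplicate or validate the URLs.
func extractMediaURLs(html string) []string {
	var urls []string
	for _, m := range attrURL.FindAllStringSubmatch(html, -1) {
		urls = append(urls, m[1])
	}
	return urls
}

func main() {
	// Shortened form of the markup quoted in the issue.
	html := `<a href="http://habi.gna.ch/blog/images/Picture(2).jpg">` +
		`<img src="http://habi.gna.ch/blog/images/Picture(2)-tm.jpg" /></a>`
	for _, u := range extractMediaURLs(html) {
		fmt.Println(u)
	}
}
```

Delimiting the value by the surrounding quotes, rather than by a list of allowed URL characters, avoids having to enumerate every character that is valid in a path. For malformed markup, a real HTML parser would likely be more robust than a regular expression.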

@ashishb merged commit b1da435 into main on Nov 18, 2024 (3 checks passed).
@ashishb deleted the fix1 branch on Nov 18, 2024, 02:02.