Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Exception for DOI search in case DOI contains masked characters #8787

Closed
2 tasks done
claell opened this issue May 11, 2022 · 15 comments · Fixed by #8812
Closed
2 tasks done

Exception for DOI search in case DOI contains masked characters #8787

claell opened this issue May 11, 2022 · 15 comments · Fixed by #8812
Labels
component: import-load good first issue An issue intended for project-newcomers. Varies in difficulty. [outdated] type: bug Confirmed bugs or reports that are very likely to be bugs

Comments

@claell
Copy link
Contributor

claell commented May 11, 2022

JabRef version

5.6 (latest release)

Operating system

Windows

Details on version and operating system

Windows 10

Checked with the latest development build

  • I made a backup of my libraries before testing the latest development version.
  • I have tested the latest development version and the problem persists

Steps to reproduce the behaviour

Similar issue to #8786 in case of a DOI search (masked underscore) in DOI.

Appendix

...

Log File
org.jabref.logic.importer.FetcherException: No DOI data exists
	at org.jabref@5.6.60000/org.jabref.logic.importer.fetcher.DoiFetcher.performSearchById(Unknown Source)
	at org.jabref@5.6.60000/org.jabref.gui.mergeentries.FetchAndMergeEntry.lambda$fetchAndMerge$0(Unknown Source)
	at org.jabref@5.6.60000/org.jabref.gui.util.BackgroundTask$1.call(Unknown Source)
	at org.jabref@5.6.60000/org.jabref.gui.util.DefaultTaskExecutor$1.call(Unknown Source)
	at org.jabref.merged.module@5.6.60000/javafx.concurrent.Task$TaskCallable.call(Unknown Source)
	at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
	at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.base/java.lang.Thread.run(Unknown Source)
Caused by: java.io.FileNotFoundException: https://doi.org/10.18726/2018%7B%5Ctextunderscore%7D3
	at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
	at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
	at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Unknown Source)
	at java.base/java.lang.reflect.Constructor.newInstance(Unknown Source)
	at java.base/sun.net.www.protocol.http.HttpURLConnection$10.run(Unknown Source)
	at java.base/sun.net.www.protocol.http.HttpURLConnection$10.run(Unknown Source)
	at java.base/java.security.AccessController.doPrivileged(Unknown Source)
	at java.base/sun.net.www.protocol.http.HttpURLConnection.getChainedException(Unknown Source)
	at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream0(Unknown Source)
	at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown Source)
	at java.base/sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(Unknown Source)
	at org.jabref@5.6.60000/org.jabref.logic.net.URLDownload.asString(Unknown Source)
	at org.jabref@5.6.60000/org.jabref.logic.net.URLDownload.asString(Unknown Source)
	... 11 more
Caused by: java.io.FileNotFoundException: https://doi.org/10.18726/2018%7B%5Ctextunderscore%7D3
	at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream0(Unknown Source)
	at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown Source)
	at java.base/java.net.HttpURLConnection.getResponseCode(Unknown Source)
	at java.base/sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(Unknown Source)
	at org.jabref@5.6.60000/org.jabref.logic.net.URLDownload.openConnection(Unknown Source)
	... 13 more
@claell
Copy link
Contributor Author

claell commented May 11, 2022

If #8743 is fixed, this will just show that no data could be retrieved for that DOI, which is also not ideal.

@fly-ing-fish
Copy link
Contributor

fly-ing-fish commented May 12, 2022

Hello! I am a junior undergraduate cs student. Seems that this issue is meaningful and not very complicated so I wanna have a try. Hope you can give me the chance. By the way, Can you give me more information about this? like the expected results after repair or what I can do for this issue. Thank you!

@claell
Copy link
Contributor Author

claell commented May 12, 2022

Thanks @fly-ing-fish for your interest in this.

I assume that somebody from the team (like @ThiloteE) will soon tell you more.

Personally, I am not entirely sure what to do in those cases.

First, I am not entirely sure, whether the entry is conforming to BibTeX standard.

Assuming it is, then JabRef should either just understand the masked characters and replace them internally for the DOI lookup, or it can notify the user about it and suggest modifying ("fixing" this entry (in case it is decided that this formatting is not compliant with JabRef internal standards).

@mlep
Copy link
Contributor

mlep commented May 12, 2022

@fly-ing-fish It may be easier for you if you choose one of the issues tagged as good first issue (https://github.com/JabRef/jabref/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22)
@Siedlerchr There are not that many new good first issues lately.

@Siedlerchr
Copy link
Member

@mlep Well, there are still enough issues... And even more unfinished pull requests for those...
And for some issues it's more like deciding on what solution we want (for example UI).

(We really need a random shuffle function for github issues so that not always the latest opened issues get picked)

@ThiloteE
Copy link
Member

@fly-ing-fish Thank you once again for your last contribution and the introduction of the \$ cleanup action. Glad you are willing to tackle another issue. If I understand it right, this issue here is a duplicate of #8786 and we probably should mark it as such (and close it).

I agree tackling "good first issues" will be good. Of course, since you already have contributed once, maybe "good first issues" is not challenging enough for you. If you think so, you could have a look at the projects page also, but keep in mind that those might be of larger scope and require more work, thought and commitment :-)

@ThiloteE ThiloteE added status: duplicate [outdated] type: bug Confirmed bugs or reports that are very likely to be bugs component: import-load labels May 12, 2022
@claell
Copy link
Contributor Author

claell commented May 12, 2022

@ThiloteE This is not an exact duplicate. The exception originates from different causes, also one is talking about handling the exception, the other is about handling the masking. But both issues might have the same solution in common (but not necessarily), which is why I created this as a separate issue.

In case you still think this should be treated as a duplicate, then I suggest editing the scope of the other issue (to cover all masked characters in a bib file, which might require different solutions depending on the case).

Let me know how to proceed.

@ThiloteE
Copy link
Member

You are right. The error is different. Excuse my trigger happy fingers pushing buttons too fast.

Caused by: java.io.FileNotFoundException: https://doi.org/10.18726/2018%7B%5Ctextunderscore%7D3

@ThiloteE
Copy link
Member

But this is also caused by Citavi export, right?

@claell
Copy link
Contributor Author

claell commented May 12, 2022

No problem, the issues are easy to mix up, I myself needed to think a bit about it when creating them.

Yes, both are caused by Citavi export.

And there also might be a shared solution (at least to parts of them), but not necessarily.

@Siedlerchr
Copy link
Member

Siedlerchr commented May 12, 2022

Okay, solution would be two steps:

  1. Improve DOI cleanup formatter, detect and decode URL encoded characters (The very first checkbox in the cleanup dialog is the Cleanup DOI formatter. This will at least show that there is something problematic, and ther user can afterwards run the Latex2Uniciode formatter over it.
  2. Add a test for the URL decoding
  3. Do not show the full exception, only the message from the exception in the dialog (see class FetchAndMergeEntry)

@Siedlerchr Siedlerchr added the good first issue An issue intended for project-newcomers. Varies in difficulty. label May 12, 2022
@fly-ing-fish
Copy link
Contributor

Thanks for your reply ! The issue becomes more clear after disscussion. I will work on it. Also, I will pay attention to the progress of #8786 and #8743

@fly-ing-fish
Copy link
Contributor

Okay, solution would be two steps:

  1. Improve DOI cleanup formatter, detect and decode URL encoded characters (The very first checkbox in the cleanup dialog is the Cleanup DOI formatter. This will at least show that there is something problematic, and ther user can afterwards run the Latex2Uniciode formatter over it.
  2. Add a test for the URL decoding
  3. Do not show the full exception, only the message from the exception in the dialog (see class FetchAndMergeEntry)

According to step1, Seems that URLDecoder.decode() can handle the problem.
Some code will be added into org/jabref/logic/cleanup/DoiCleanup.java like

image
image

and cleanedDOI = URLDecoder.decode(cleanedDOI, "UTF-8");will be add into org/jabref/model/entry/identifier/DOI.java like

image

According to step2, another test class may named DoiDecodeCleanupTest.java like org/jabref/logic/cleanup/DoiCleanupTest.java will be added.

According to step3, seems that the dialog box concise enough?

image

If it is fine, I may try to start a PR.

@claell
Copy link
Contributor Author

claell commented May 16, 2022

Regarding step 3: That should be solved by #8743.

@Siedlerchr
Copy link
Member

@fly-ing-fish That's a good start. Additionally, I would also integrate this directly into the DOI.parse method as well, so that even if the user has a DOI in that format, fetching will work.

And @claell is right, for the message the other issue. We try to be a bit more user-friendly. The exception should not be shown to the user in the dialog, but instead just be logged.

fly-ing-fish added a commit to fly-ing-fish/jabref that referenced this issue May 17, 2022
add " We fixed an issue that there are exceptions for DOI search in case DOI contains masked characters. [JabRef#8787](JabRef#8787) " in unreleased fixed
koppor pushed a commit that referenced this issue May 23, 2022
Co-authored-by: Christoph <siedlerkiller@gmail.com>
@koppor koppor moved this to Done in Prioritization Nov 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component: import-load good first issue An issue intended for project-newcomers. Varies in difficulty. [outdated] type: bug Confirmed bugs or reports that are very likely to be bugs
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

5 participants