Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix IllegalArgumentException with non-ascii paths #1195

Merged
merged 2 commits into from
Oct 2, 2024

Conversation

tachyonicClock
Copy link
Contributor

Fix #1194

@tachyonicClock tachyonicClock changed the title Fix with non-ascii paths Fix IllegalArgumentException with non-ascii paths Jun 16, 2024
Copy link

codecov bot commented Jun 16, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 87.21%. Comparing base (09f325f) to head (e2e60a5).

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #1195   +/-   ##
=======================================
  Coverage   87.21%   87.21%           
=======================================
  Files         113      113           
  Lines       10277    10280    +3     
  Branches     4088     4088           
=======================================
+ Hits         8963     8966    +3     
  Misses        718      718           
  Partials      596      596           
Flag Coverage Δ
87.21% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@Thrameos Thrameos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LTGM

@Thrameos
Copy link
Contributor

Looks like I have some maintenance tasks before the CI will run.

@tachyonicClock
Copy link
Contributor Author

@Thrameos I'm working on a test and it seems emoji still don't work although that appears to be a different reason.

@tachyonicClock
Copy link
Contributor Author

The emoji thing seems to be a problem with java itself

 java -classpath "😊/mypackage.jar" mypackage.MyClass
Error: A JNI error has occurred, please check your installation and try again
Exception in thread "main" java.lang.IllegalArgumentException: Error decoding percent encoded characters
        at java.base/sun.net.www.ParseUtil.decode(ParseUtil.java:218)
        at java.base/jdk.internal.loader.FileURLMapper.getPath(FileURLMapper.java:64)
        ...

https://youtrack.jetbrains.com/issue/IJPL-19330/Scratch-Java-file-wont-run-if-theres-an-emoji-in-the-project-name

@Thrameos
Copy link
Contributor

I am hoping it is safe to ignore emoji. Although a coworker of mine had great fun using the poop emoji when we added internationalization to string conversions, I think that it is an edge case.

It is also a good idea to add a line to "doc/CHANGELOG.rst" describing the change so that others will know it is fixes. Thanks for taking the time to add tests.

@tachyonicClock
Copy link
Contributor Author

Since Java appears to have difficulties with emojis in classpaths, it's probably impossible to fix this issue. The important thing is to support non-English character sets that are likely in user paths. That is the problem that prompted me to open this issue in the first place. Although a poop emoji directory does sound like a good place to store certain parts of java :D.

I've added my changes to the changelog. Let me know if its not in the right place or the wording is bad.

@Thrameos Thrameos requested a review from marscher June 19, 2024 15:04
@marscher
Copy link
Member

Honestly I do not understand right now, why the pipeline isnt converging. On the main branch everything was fine.

@marscher
Copy link
Member

@tachyonicClock I think the hung tests are due to the process creation within the new regression test. Could you please check, if using https://pypi.org/project/pytest-forked/ and marking the test with the provided marker could be a solution?

@pytest.mark.forked
def test_with_leaky_state():
    run_some_monkey_patches()

This would require of course to install pytest-forked for the test suite. It should go to test-requirements.txt. Thank you!

@tachyonicClock tachyonicClock force-pushed the bug_1194 branch 4 times, most recently from 9463137 to 693c6f5 Compare August 28, 2024 00:39
@tachyonicClock
Copy link
Contributor Author

I fixed the hanging issue and packaged it with the related tests in test_startup.py. These are run as sub-processes using `subrun'. I'm running into an issue with them failing on Windows. I'll try to fix this later this week.

@tachyonicClock
Copy link
Contributor Author

tachyonicClock commented Sep 15, 2024

Last week I tried fixing this on windows but ran into some other issues related to UTF-8 vs UTF-16 character encoding on Windows in the JNI. It will likely take me a little longer to figure out a solution or workaround.

@marscher
Copy link
Member

@tachyonicClock Thanks for your work on this so far! Any updates on this one? We would like to conclude a set of changes for the next release and I would like to include this fix if it also works on Windows.

@tachyonicClock tachyonicClock force-pushed the bug_1194 branch 5 times, most recently from a242500 to dea04b7 Compare September 30, 2024 02:52
@tachyonicClock
Copy link
Contributor Author

tachyonicClock commented Sep 30, 2024

  • Launching Java with non-ascii class paths appears impossible on Windows (https://stackoverflow.com/questions/20052455/jni-start-jvm-with-unicode-support). To resolve this issue, I tried adding the classpath after initialization since this supports Unicode. But the change to how jpype starts up is significant.
  • I fixed a problem in Python 3.8 with JDK8 todo with "zip file system provider returns doubly % encoded URIs" (native/java/org/jpype/pkg/JPypePackageManager.java). A pre-existing fix removed % signs if '%2520' was present. This did not fix doubly encoded Unicode characters and risked removing % signs that were intentional. The new version will work in these cases.
  • Emoji now work.
  • I added project/jars/unicode_à😎 to describe how to build the Unicode jar test case. I tried to follow the pattern of other examples, but I'm not confident I did this correctly.

@marscher marscher merged commit 354968c into jpype-project:master Oct 2, 2024
4 of 6 checks passed
// by re-encoding the URI after decoding it.
uri = new URI(
uri.getScheme(),
URLDecoder.decode(uri.getSchemeSpecificPart()),

Check notice

Code scanning / CodeQL

Deprecated method or constructor invocation Note

Invoking
URLDecoder.decode
should be avoided because it has been deprecated.
Comment on lines +267 to +281
"""Prior versions of JPype used the jvmargs to setup the class paths via
JNI (Java Native Interface) option strings:
i.e -Djava.class.path=...
See: https://docs.oracle.com/javase/7/docs/technotes/guides/jni/spec/invocation.html

Unfortunately, unicode is unsupported by this interface on windows, since
windows uses wide-byte (16bit) character encoding.
See: https://stackoverflow.com/questions/20052455/jni-start-jvm-with-unicode-support

To resolve this issue we add the classpath after initialization since jpype
itself supports unicode class paths.
"""
for cp in _handleClassPath(classpath):
addClassPath(Path.cwd() / Path(cp).resolve())

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It appears this code is going to cause some serious issues because certain types of classpath absolutely must be loaded prior to starting the JVM. The two that come to mind are database drivers and anything containing a classloader.

I believe there is a solution in that we just need to send a bytes with the correct encoding, but I need to test it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry this ended up breaking things.

I believe there is a solution in that we just need to send a bytes with the correct encoding, but I need to test it.

I wish you luck. I spent some time struggling to get this to work, but did not try exhaustively since I found the stack-overflow response suggesting I set the properties at runtime. I found some discussion of this problem from 2017 but it does not look like it went anywhere: https://mail.openjdk.org/pipermail/jdk10-dev/2017-May/000191.html.

Thrameos added a commit to Thrameos/jpype that referenced this pull request Oct 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

IllegalArgumentException after using dir(jpype.JPackage("mypackage")) and non-ascii paths.
3 participants