-
Notifications
You must be signed in to change notification settings - Fork 588
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Loader] improve performance of resource extraction and library search #512
base: master
Are you sure you want to change the base?
Conversation
} else if (!cacheDirectory || !file.exists() || file.length() != entrySize | ||
|| file.lastModified() != entryTimestamp || !file.equals(file.getCanonicalFile())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what the intention of the comparison with the canonical file is, but the file is not canonicalized in the called extractResource() method. Furthermore canonicalization does not (or at least should not) change the effective target the file is pointing to, so the metadata checked before should not change because of that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Symbolic links get created and we need to resolve them to prevent creating cycles.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please explain how this prevents cycles or is there an example anywhere? Because I don't understand it.
I'm asking because avoiding the canonicalization had a significant impact on the runtime improvement (around one third of the runtime reduction).
So if we can avoid this call or at least have a cheaper pre-check (e.g. canCreateSymbolicLink==true
? or Files.isSymbolicLink()==true
, I'm not sure if this is much faster), it should make it faster.
But of course the logic has to be correct in any case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There can be symbolic links anywhere in the path, not just the last file in the path.
I'm pretty sure we can disable those checks when org.bytedeco.javacpp.cachedir.nosubdir is set though, and I think that's what you're already using, right? If so, I'd be happy to skip all that when that flag is set. How does that sound?
@@ -1247,7 +1251,7 @@ public static String load(Class cls, Properties properties, boolean pathsFirst, | |||
foundLibraries.put(preload, urls = findLibrary(cls, p, preload, pathsFirst)); | |||
} | |||
String filename = null; | |||
if (oldUrls == null || urls.length > 0) { | |||
if (oldUrls == null && urls.length > 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure the 'or' was intentional or not, but before it called the following loadLibrary method when
oldURLs==null and urls.length>0
, so the library in question was not searched before and found now.oldURLs==null and urls.length==0
, e.g. the library in question was not searched before but not found now.oldURLs!=null and urls.length>0
, e.g. the library in question was searched before but and found.
For me it does not make sense to attempt to load the library when it was not found (case 2) or was already loaded (case 3). Or did I oversee something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a previously found library failed to load, we should try to keep finding other versions that might be able to load.
Maybe this also useful regarding issues #452. |
If you know of a more efficient way without changing the logic, please try to do that! Thanks |
So, is there anything in particular that is preventing you from making progress with those pull requests? |
It is just my limited time and shifted priorities, but I definitely plan to complete this PR (as well as my other open PRs!) |
BTW, concerning symbolic links on Windows, we could definitively disable them and assume by default that they are never supported. The user could still enable them with some system property, but since they are not available to users by default, and with bugs such as https://bugs.openjdk.java.net/browse/JDK-8003887 that seem like they are never going to get fixed but that @devjeonghwan found can still cause problems, it doesn't sound like we should be relying on them for anything on Windows. But we should still be able to enable their use in JavaCPP for users that really need them. |
Based on suggestions in this pull request, I've updated a couple of things in commit e8b5734. With these modifications, it takes a bit less than 1200 ms on my Windows 10 machine to execute the following, when everything is already cached, extracted, and all: Py_Initialize(org.bytedeco.scipy.presets.scipy.cachePackages());
org.bytedeco.numpy.global.numpy._import_array(); I haven't tried to do something about the unnecessary calls to loadLibrary(), but it wouldn't even gain us 100 ms, so I'm reluctant to change the logic there just for that. If you see other places where we can gain more, please update this pull request! Thanks @yukoba Does your javacpp-embedded-python incur any additional overhead that we should know about? |
Thanks for implementing the changes and sorry for the delay. Seem that my estimation was bad but I hope to be able to check for possible updates of this and my other PR/issues in the next few weeks. |
With this PR I suggest changes to reduce the runtime of resource extraction and library search performed by the javacpp.Loader.
My benchmark is a stripped variant of the embeddedpython.Python that uses only javacpp and cpython (but not numpy) and the 'regular' variant of embeddedpython.Python but with updated javacpp dependencies.
The benchmark is just the following program, which is absolutly dominated by the initialization of the embeddedpython.Python :
On my Windows 10 computer this program takes (when all files are already cached in a previous run) with the current javacpp master 2000ms+-20 for the stripped version respectively 3630ms+-20 for the regular version.
With the suggested changes, the stripped version takes around 800+-20ms and the regular version 1400ms+-20.
So this improves the runtime of the initialization by more than the factor of two.