Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modularize presets for ffmpeg #669

Merged
merged 1 commit into from
Jan 19, 2019
Merged

Modularize presets for ffmpeg #669

merged 1 commit into from
Jan 19, 2019

Conversation

HGuillemet
Copy link
Collaborator

No description provided.

@saudet
Copy link
Member

saudet commented Jan 17, 2019

BTW, the builds fail, let's make them pass! :)

@HGuillemet
Copy link
Collaborator Author

Argh.
The build fails on those OS where the file names are case insensitive. In ffmeg the avcodec lbrary defines an AVCodec struct and the files get corrupted since the parser generates avcodec.java and AVCodec.java at the same time.
I see two possible fixes:

  1. finally choose another convention for naming the global class. Like Global or G.
  2. make an exception for avcodec and the other libraries using a struct or class with the same name than the library.

I'd tend to prefer 1. It's less likely than a library defines an object named Global or G than named like the library itself. And we'd conform to the Java casing convention.

@saudet
Copy link
Member

saudet commented Jan 18, 2019 via email

@HGuillemet
Copy link
Collaborator Author

opencv_core_global ?
We'd have the class org.bytedeco.opencv_core.Global instead of org.bytedeco.opencv_core.opencv_core. What do you mean ?

@saudet
Copy link
Member

saudet commented Jan 18, 2019

That won't work well when we need to use more than one Global class at the same time, something that happens quite often.

@HGuillemet
Copy link
Collaborator Author

Other option:
3. insert a "preset" level in the package hierarchy, as we already discussed before.
We'd have:
org.bytedeco.opencv.core.Mat (or org.bytedeco.opencv.opencv_core.Mat)
org.bytedeco.opencv.Core (global class) (or org.bytedeco.opencv.OpenCVCore)
This should prevent all name clashes + offer a package for some classes that doesn't fit well presently anywhere like cvkernels or opencv_java + conform to Java conventions

@saudet
Copy link
Member

saudet commented Jan 18, 2019

That could work, but we'd end up with names like this:

org.bytedeco.tensorflow.tensorflow.Tensor
org.bytedeco.tensorflow.TensorFlow

org.bytedeco.ffmpeg.avcodec.AVCodec
org.bytedeco.ffmpeg.AVCodec

Not exactly the most user-friendly naming scheme...

@HGuillemet
Copy link
Collaborator Author

Ok.
4. a preset level in the hierarchy and back to big inner classes named like
org.bytedeco.opencv.opencv_core or org.bytedeco.opencv.Core ?

@HGuillemet
Copy link
Collaborator Author

Packages and outer classes are namespaces. If we want to be rigorous, we should try to match the C/C++ namespace concept, which means:

  1. one Java unit (package or class) where to put all that lie in the unnamed namespace (that is all C symbols and all C++ symbols from the unnamed namespace).
  2. a Java unit per C++ namespace.

Concerning the constraints, we must define:

  1. at least one package per presets (JPMS constraint)
  2. a class to put symbols not encapsulated in native classes, since in Java we cannot have global functions (the "global" class).

So the solution would be one preset-level package matching the unnamed namespace, and containing one special class for the global symbols. If we want to avoid a constant name for this class (like Global) to avoid clash when working with multiple presets, we need to name it after the preset. C++ namespace could be mapped to outer class or package. If I recall correctly, name lookup in C++ follows the nesting of namespaces (symbol is searched in current namespace, then enclosing one, etc.). This rather match the class nesting Java behaviour, not the package nesting. But that is not a strong point. Using packages would be more Java-friendly.

We would end up with the following classes:

org.bytedeco.ffmpeg.FFmpeg
org.bytedeco.ffmpeg.AVCodec
org.bytedeco.ffmpeg.AVFilter

org.bytedeco.opencv.OpenCV
org.bytedeco.opencv.cv.Mat
org.bytedeco.opencv.cv.ocl.Kernel

But that would need some changes in JavaCPP: sharing of the global class between libraries and mapping of namespaces to class or package. And that will not suppress definitively the risk of clash between the global class name (FFmpeg, OpenCV...) and a class defined in the unnamed namespace of the presets, but that would limit the odds and we could always work around by adjusting the names.

What do you think ?

@saudet
Copy link
Member

saudet commented Jan 19, 2019 via email

@saudet
Copy link
Member

saudet commented Jan 19, 2019

Namespaces can also span multiple native libraries, and we can have multiple namespaces in any given header file. It's quite a mess really. So we'd basically have to load all libraries to make sure we get everything for any given class. Also, trying to map namespaces doesn't solve the problem of where to put functions and variables that don't belong to a class. We'd need to come up with "global" classes for each namespace! We can map Java packages much more easily to native libraries, where each library contains lists of dependencies and symbols, similarly to how a Java package contains lists of imports and classes, especially when constrained by a Java module.

How about something like this where we basically replace "javacpp" with the name of the submodules themselves and move the global classes to a "global" package alongside "helper" and "presets"? It would make it clear that we're accessing global functions via import statements without appending something clunky like "global" to the class names.

org.bytedeco.opencv.global.opencv_core
org.bytedeco.opencv.global.opencv_imgproc
org.bytedeco.opencv.helper.opencv_core
org.bytedeco.opencv.helper.opencv_improc
org.bytedeco.opencv.presets.opencv_core
org.bytedeco.opencv.presets.opencv_imgproc
org.bytedeco.opencv.opencv_core.Mat
...

org.bytedeco.ffmpeg.global.avcodec
org.bytedeco.ffmpeg.global.avformat
org.bytedeco.ffmpeg.presets.avcodec
org.bytedeco.ffmpeg.presets.avformat
org.bytedeco.ffmpeg.avcodec.AVCodec
...

org.bytedeco.tensorflow.global.tensorflow
org.bytedeco.tensorflow.helper.tensorflow
org.bytedeco.tensorflow.presets.tensorflow
org.bytedeco.tensorflow.tensorflow.Tensor
...

In cases like TensorFlow where there is only a single fat library we could also drop the last component of the package name, so we'd get instead org.bytedeco.tensorflow.Tensor.

And while we're at it also remove "javacpp-presets" from the groupId of the submodules since they are no longer limited to bindings generated exclusively with JavaCPP.

@SamCarlberg Any opinions?

In any case, I'll merge this PR, rename the branch to something like jpms_fail and restart a new jpms branch from current master.

@saudet saudet merged commit 5c03601 into bytedeco:jpms Jan 19, 2019
@HGuillemet
Copy link
Collaborator Author

HGuillemet commented Jan 19, 2019

Before commenting your suggestion and giving up the idea of mapping namespaces, I'd like to understand why it's a bad idea.
We are mapping C/C++ symbols to Java symbols, so it seems natural to map the symbol tables/name space too. And symbol tables in Java are handled by packages (classes can play this role too like it's currently the case with the big global class and its inner classes).

If we do not map symbol tables and put symbols from different C++ namespace in the same package or same outer class, how can we handle cases where the same library defines both ns1::Symb and ns2::Symb ? Hasn't the situation already occur in existing presets ?

I don't understand the question about std::vector.

Yes, namespaces can span multiple libraries. If a library exports ns::Symb1 and another library of the same presets exports ns::Symb2, why is it a a problem if Symb1 and Symb2 end up in the same Java package ?

Yes a library can export ns1::Symb1 and ns2:Symb2.Symb1 and Symb2 will be in 2 different Java packages. Why is it a problem ?

Yes I agree this has nothing to do with the problem of finding a name for the global class. And this class should be different for each namespace indeed since two global functions with same name can exist in two different namespaces.

A C++ program doesn't use the symbol of one library or another, it uses the symbol of a namespace. It's the silent job of the linker to find the symbol in a library or another. So I'm just wondering if we couldn't abstract away the concept of library in Java too and only keep the concepts of presets (= JPMS modules) and namespaces.

@saudet
Copy link
Member

saudet commented Jan 20, 2019

A C++ program doesn't use the symbol of one library or another, it uses the symbol of a namespace. It's the silent job of the linker to find the symbol in a library or another. So I'm just wondering if we couldn't abstract away the concept of library in Java too and only keep the concepts of presets (= JPMS modules) and namespaces.

Well, if you really want to try something, go for it. I'm not going to stop you.

Think about it though, C++ namespaces are just a lazy way of naming things with "::" instead of "_" like we have to do in C. Nothing more, nothing else. It doesn't offer anything beyond that! Look just for example at imgproc.hpp: Everything's in "cv". There is no "cv::imgproc" namespace. What would you do about that?
https://github.com/opencv/opencv/blob/master/modules/imgproc/include/opencv2/imgproc.hpp

@HGuillemet
Copy link
Collaborator Author

Well, if you really want to try something, go for it. I'm not going to stop you.

Does that mean "It's a good idea, but the needed changes are too deep for little benefit and we don't have the resources" or " It's a clearly bad idea, I won't take time arguing with this guy" ?

We are talking about reorganisation of packages, which is needed for JPMS, I'm just suggesting ideas so that the result is the most logical we can do and so that the solution will last.

Think about it though, C++ namespaces are just a lazy way of naming things with "::" instead of "_" like we have to do in C. Nothing more, nothing else. It doesn't offer anything beyond that! Look just for example at imgproc.hpp: Everything's in "cv". There is no "cv::imgproc" namespace. What would you do about that?

Put all in the org.bytedeco.opencv.cv package.

@HGuillemet
Copy link
Collaborator Author

Think about it though, C++ namespaces are just a lazy way of naming things with "::" instead of "_" like we have to do in C.

Exactly, so it's part of the name and I wonder if we should not keep it. We could prefix the names replacing :: by _, but using . and packages looks cleaner.

@saudet
Copy link
Member

saudet commented Jan 20, 2019

Well, if you really want to try something, go for it. I'm not going to stop you.

Does that mean "It's a good idea, but the needed changes are too deep for little benefit and we don't have the resources" or " It's a clearly bad idea, I won't take time arguing with this guy" ?

The point is, you'll need to figure out what your priorities are, for example, below. I can't decide that for you.

Think about it though, C++ namespaces are just a lazy way of naming things with "::" instead of "_" like we have to do in C. Nothing more, nothing else. It doesn't offer anything beyond that! Look just for example at imgproc.hpp: Everything's in "cv". There is no "cv::imgproc" namespace. What would you do about that?

Put all in the org.bytedeco.opencv.cv package.

Since pretty much all OpenCV modules define identifiers in the root "cv" namespace, all libraries will have to be loaded together at once, effectively turning them into a single fat library, and that is how the Python bindings work, but you said that's not something you wanted to do.

@HGuillemet
Copy link
Collaborator Author

HGuillemet commented Jan 20, 2019

Since pretty much all OpenCV modules define identifiers in the root "cv" namespace, all libraries will have to be loaded together at once,

I think I see. This is not a problem for classes in the cv package. Classes in the same package could have a static { } that results in loading different native libraries. But for the global class we would need to load all the libraries that define global functions or variables. Is that right ?

but you said that's not something you wanted to do.

I don't know what you are referencing but yes, loading all opencv libraries when you need just the core functionalities must be avoided.

@HGuillemet
Copy link
Collaborator Author

About your suggestion for the new package layout:

  • I like the replacement of javacpp by the submodule (if you don't have plan to use the bytedeco name for many other projects). cvkernels and opencv_java could be put directly in org.bytdeco.opencv
  • Ok for global/helper/presets subpackages. But what is a helper class ? Is it only the class from which the global class inherits instead of inheriting from the presets class ? Or also the non-automatically generated classes like AbstractArray ?
  • Ok for dropping the .library. package for mono-library modules. Dropping it in
    all cases could maybe work too (that is org.bytedeco.opencv.Mat, org.bytedeco.opencv.CvFont, ...), but, for opencv, I see some classes with same name in different libraries (*OpticalFlow).

@saudet
Copy link
Member

saudet commented Jan 22, 2019

Ideally, the triplet of global, helper and presets classes should be only 1 class, but because we're generating one from the other, this isn't possible. That's basically the only reason why it's split in 3, and isn't related to other helper classes such as AbstractArray, so I'd put those in the same package as the rest of the generated classes.

@HGuillemet
Copy link
Collaborator Author

HGuillemet commented Jan 22, 2019

Ok.
And don't you think we can use org.bytedeco.module instead of org.bytedeco.module.lib for those, in all cases ?

Technically, I don't see much reason in using separate packages. There may be cases where libraries are not meant to be loaded at the same time, for instance different flavours of the same library exporting different implementations for the same symbols. But we can always configure a special case for these libraries using the target parameter in the annotations. I don't know what are the *OpticalFlow classes in opencv that are presently defined in more than 1 package and if they are actually the same classes or not.

Practically, it may be nice to keep things separated just for the clarity of documentation.

So no strong opinion about this here.

If ever you want to keep the door open for mapping C++ namespaces one day (when it's important to preserve them), the layout could be:

org.bytedeco.module.SomeClassFromUnamedModule
org.bytedeco.module.ns1.SomeClass
org.bytedeco.module.global.unnamed
org.bytedeco.module.global.ns1

We can imagine a parameter in the presets saying if we want a "library-wise" or "namespace-wise" layout.

BTW, we need to change ClassProperties in javacpp so that the global class is not necessarily in target package if we go this way.

@saudet
Copy link
Member

saudet commented Jan 23, 2019

Hum, so you want to mix library names with namespaces. Even C++ doesn't do that, it just ignores library names and hopes the linker resolves everything. I don't see this working nicely. FYI, the Panama team is mapping the names of header files to class names for C right now:
http://hg.openjdk.java.net/panama/dev/raw-file/tip/doc/panama_foreign.html
But the names of header files change between every single minor version of libraries. I don't see this working well either, I really don't want to go there.

We can already specify a FQN for the @Properties(global=... value. It doesn't need to be in the same package, but yeah I think there is a bug here. It should check if there is already a "." in it.

saudet added a commit to bytedeco/javacpp that referenced this pull request Jan 23, 2019
@HGuillemet
Copy link
Collaborator Author

HGuillemet commented Jan 23, 2019

There were actually 2 independent points in my last post:

  1. whether the .library. package component is really necessary in, e.g. org.bytedeco.opencv.opencv_core.Mat. (just a question, I have no opinion about the answer).
  2. what would be the the package layout if we want to preserve namespaces. (about this one, I believe there may be libraries where we would like to preserve namespaces, so it'd be good to choose a layout that is compatible with a future feature that allows to map namespaces).

In fact, with your last layout suggestion:

org.bytedeco.opencv.global.opencv_core
org.bytedeco.opencv.global.opencv_imgproc
org.bytedeco.opencv.helper.opencv_core
org.bytedeco.opencv.helper.opencv_improc
org.bytedeco.opencv.presets.opencv_core
org.bytedeco.opencv.presets.opencv_imgproc
org.bytedeco.opencv.opencv_core.Mat

and if we activate some "keepNamespace" option in the presets, we could have the exact same layout but with:

org.bytedeco.opencv.opencv_core.cv.Mat

and with global class org.bytedeco.opencv.global.opencv_core having inner classes like org.bytedeco.opencv.global.opencv_core.cv.
Thus we have namespace preservation AND we avoid the problem you pointed out above about the need to load all libraries putting symbols in the same namespace simultaneously. This is what you call "mixing library names with namespaces" but I think that would work. Users will do:

import org.bytedeco.opencv.opencv_core.ns.*;
import static org.bytedeco.opencv.global.opencv_core.ns.*;

in lieu of the C++ using namespace ns.

Again, this is not something "I want", just some ideas for the future that should be kept in mind when making choices now for JPMS support.

To link my 2 points, another idea is to have a second option "keepLibraryNames", that when turned off gets rid of the library names in the package names, and replaces the library name by the module name in the global class name. This could be the default behaviour for mono-library modules.

@saudet
Copy link
Member

saudet commented Jan 23, 2019

We can name classes whatever we want, that's not an issue, but what do we gain doing this? Your imports are going to look a little different, and that's it. If you want to do this for the case where we have clashing names, they are still going to clash when trying to import them together. People are not going to like to use FQNs just because instead of giving them different names you put them in different packages!

Anyway, would you have an actual example that exemplifies the usefulness of tinkering with names like that inconsistently between libraries?

@HGuillemet
Copy link
Collaborator Author

No I don't right now. Let's give up point 2.
I can redo the modularization of opencv if you'd like, but I need your decision about point 1: separated packages for non-global classes or not ?

@saudet
Copy link
Member

saudet commented Jan 24, 2019

Ok, so let's try again with my last proposal? It should actually be easier to refactor projects that way than what we tried before.

As for the library component in the package name, yes, let's leave it there for consistency. It only changes the import statements, which BTW provide a useful hint about which libraries we're actually loading.

BTW, if you're wondering about what happens with namespaces, they are not thrown away completely. They still show up in the documentation:

@Namespace(value="cv")
 @NoOffset
public static class opencv_core.Mat
extends opencv_core.AbstractMat

http://bytedeco.org/javacpp-presets/opencv/apidocs/org/bytedeco/javacpp/opencv_core.Mat.html

@HGuillemet HGuillemet deleted the jpms_ffmpeg branch January 31, 2019 00:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants