Project.jl as an extension to Project.toml #1828

Closed
aminya opened this issue May 19, 2020 · 12 comments

@aminya

aminya commented May 19, 2020

It would be nice if we can have a Project.jl as an extension to Project.toml in our root directory to perform custom things that are not covered in Project.toml.

Pkg.instantiate() should run this file after it processes Project.toml.

Some use cases:

Interactive experience

function prompt(message::String="")::String
   print(message)
   return chomp(readline())
end

use_newfeature  = prompt("Do you want to use the new feature of this package? (Y, N)")
# do some stuff with use_newfeature  

backend_lib = prompt("Which XML library do you want to use as the backend?")
# do some stuff with backend_lib 

Dependency based on the VERSION or OS

using Pkg
@static if VERSION > v"1.3"
    Pkg.add("AcuteML")
else
    # Pkg rejects a `version` combined with a `url`, so request the 0.5 series by name.
    Pkg.add(PackageSpec(name="AcuteML", version="0.5"))
end

Unregistered packages

I might want to add a dependency that is not registered yet. I can simply do it in the Project.jl file.

using Pkg
Pkg.add(PackageSpec(url="https://github.com/aminya/AcuteML.jl", rev="master"))

Adding compat methods based on the VERSION or OS

using Pkg
@static if VERSION < v"1.3"
    # Write a fallback definition for older Julia versions.
    Base.write("src/compat.jl", """
        static_hasmethod(args...) = hasmethod(args...)
    """)
    # inside the package, one can use `@static if isfile("compat.jl")`
else
    Pkg.add("Tricks") # some package with compat `julia = 1.3`
end

Custom script after installation

include("src/build.jl")
include("src/postinstallation.jl")

Run the tests after installation

using Pkg
Pkg.test()

and many other examples.

This is similar to deps/build.jl, but being in the root, separates the Pkg and init stuff from building.

@DilumAluthge
Member

This reminds me of installing Python packages. When you install a python package, it can run basically any code it wants. This makes installing Python packages a nightmare.

Personally I do not like the idea of running arbitrary code when I instantiate a project.

@aminya
Author

aminya commented May 19, 2020

> This reminds me of installing Python packages. When you install a python package, it can run basically any code it wants. This makes installing Python packages a nightmare.
> Personally I do not like the idea of running arbitrary code when I instantiate a project.

You can run any code in Julia too. Not having this option means that people will run this stuff inside their __init__. It is just deferred and will be repeated! This makes it a nightmare each time you want to use the package (rather than only once).

This:

  • Forces the mixture of the runtime code and package management code
  • Prevents Julia from implementing proper Tree Shaking. If the __init__ functions were empty, Julia could do far more optimizations than is possible now.
  • Prevents tools such as PackageCompiler from working properly.
  • For having fully static Julia, we need to separate Project.jl from the packages.
  • Makes people use Require for their optional deps
  • Makes the loading time of the package slow. See here for example.
  • Makes it very hard to use a package with high Julia compat. See here
  • ...

The limitation is not desirable all the time. Sometimes people want more flexibility, and not having that results in undesirable things that are very hard to fix later.

Running code is already allowed by Pkg.build. I don't understand what is different here, and what the reason for the objection is.

Side thing: not having this option does not help security. See here.

@KristofferC
Sponsor Member

> Dependency based on the VERSION or OS
> I might want to add a dependency that is not registered yet. I can simply do it in the Project.jl file.

Dependencies and features should be declarative. If we want OS/VERSION dependent dependencies or unregistered dependencies then that would be added by adding syntax for it to the existing Project file, not by running some script file where things are queried. Interactive things during package install strike me as particularly bad.

In general, I would say we want to move more things towards more declarative and less arbitrary code style (cf the artifact system over the build scripts) so this would be a step in the wrong direction in my opinion.

@StefanKarpinski
Sponsor Member

100% agree with @KristofferC. Arguing that making things less declarative somehow makes it easier to statically compile things doesn't make sense to me.

@aminya
Author

aminya commented May 22, 2020

If you can give me solutions for the things I said in this issue, I would appreciate it!

Personally, I will abuse deps/build.jl with include("../Project.jl") for package management until this is added to Pkg.

I don't understand why Julia is different here...

@KristofferC. Pkg.add adds deps to the Project.toml. What is the difference here?

Even if it edited the Manifest, static compilation should mostly use Manifest.toml, not Project.toml. You want all the dependencies, not just the direct ones (unless you want to do incremental compilation).

When you have all the code in init, you can't do anything.

> Interactive things during package install strike me as particularly bad.

That was just an example of the possibilities! I would use global parameters and isdefined instead of prompting the user directly.

@DilumAluthge
Member

DilumAluthge commented May 22, 2020

@staticfloat has opened some good issues and pull requests regarding this kind of stuff.

The goal is to eventually make package directories completely immutable.

This is possible with a combination of:

  1. The Project.toml file
  2. Artifacts - this has already been implemented into Pkg.
  3. "Scratch spaces" - Implement Scratch Spaces #1833
  4. "Preferences" - Add Preferences subsystem #1835

I think that all of the stuff you want to do can be covered by those four.

For example:

  • Preferences such as "do you want to use feature X" and "which backend do you want to use" would go into "preferences".
  • Installing different binaries depending on what operating system you are on would go into "artifacts."

If there is something that you want to do that is not covered by Project.toml, artifacts, scratch spaces, or preferences, it might be best to open a specific issue to track that feature request. Then we can figure out how to implement it in a declarative way.
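
As a rough illustration of the "preferences" idea above, the choices that the original proposal gathered with interactive prompts can be stored as plain data and looked up with a default. This is only a sketch using Julia's `TOML` standard library (Julia 1.6+); it is not the Preferences subsystem's actual API, and the table name `MyPackage`, the keys, and the helper `preference` are all hypothetical.

```julia
using TOML  # standard library since Julia 1.6

# Hypothetical preferences content; the table and key names are invented.
prefs_text = """
[MyPackage]
use_newfeature = true
backend = "EzXML"
"""

prefs = TOML.parse(prefs_text)

# Look up a preference for `pkg`, falling back to `default` when it is unset.
function preference(prefs::AbstractDict, pkg::AbstractString, key::AbstractString, default)
    return get(get(prefs, pkg, Dict{String,Any}()), key, default)
end

backend = preference(prefs, "MyPackage", "backend", "LightXML")
use_newfeature = preference(prefs, "MyPackage", "use_newfeature", false)
timeout = preference(prefs, "MyPackage", "timeout", 30)  # not set, so the default is used
```

Because the settings are data rather than code, a tool can read them without executing anything from the package.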

@DilumAluthge
Member

Also, for what it is worth, (and this may be off-topic): I think it would be cool for us to eventually completely remove deps/build.jl.

I am hoping that this will be possible eventually. But I will let @staticfloat correct me as to the feasibility of this.

@KristofferC
Sponsor Member

Let me explain why doing package operations like

> I might want to add a dependency that is not registered yet. I can simply do it in the Project.jl file.

or

> Dependency based on the VERSION or OS

in a generic post-script hook is not such a good idea. The way Pkg (and many other package managers) works is that they gather up all the packages you want to have installed and what compatibility bounds they have as well as all packages and versions it knows about (from registries) and sends this to a "resolver".
The job of the resolver is to give back a set of package versions so that all dependencies are fulfilled and all compatibility info is adhered to.

However, if you run another package operation as a part of the installation of a package, you call back into the resolver to give you a new set of versions. Those versions might be different than in the first resolver call because the package you added in the post hook can introduce new compatibility bounds. So now you are re-running the resolver and might install new packages which might have their own post-installation hooks and might do Pkg operations calling back into the resolver etc. It seems likely that you can even end up in cycles here where you are just spinning around running post-installation hooks forever.

Therefore, we want to be able to up-front gather all packages that are going to be installed so we only have to run the resolver once. This is done by e.g. making sure that we know what dependencies and compat info packages have without having to execute arbitrary Julia code.
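
To illustrate the point about resolving once, here is a toy backtracking resolver. It is a minimal sketch and not Pkg's actual resolver: the registry contents, the package names `A`, `B`, `C`, and the functions `resolve` and `solve!` are all made up. What it tries to show is that when dependencies and compatibility bounds are plain data, the whole answer comes out of a single call, with no package code being run along the way.

```julia
# Toy registry: package => version => (dependency => allowed versions).
# All names and versions are invented for illustration.
const REGISTRY = Dict(
    "A" => Dict(v"1.0.0" => Dict("B" => [v"1.0.0", v"2.0.0"])),
    "B" => Dict(v"1.0.0" => Dict{String,Vector{VersionNumber}}(),
                v"2.0.0" => Dict("C" => [v"1.0.0"])),
    "C" => Dict(v"1.0.0" => Dict{String,Vector{VersionNumber}}()),
)

# Try to satisfy every (package, allowed versions) requirement in `todo`,
# recording picks in `chosen` and backtracking when a pick leads to a conflict.
function solve!(chosen, todo)
    isempty(todo) && return true
    (pkg, allowed), rest = todo[1], todo[2:end]
    if haskey(chosen, pkg)
        # Already picked: the earlier pick must satisfy this requirement too.
        return chosen[pkg] in allowed && solve!(chosen, rest)
    end
    for v in sort(allowed; rev=true)  # prefer the newest version
        haskey(REGISTRY[pkg], v) || continue
        chosen[pkg] = v
        deps = [(d, vs) for (d, vs) in REGISTRY[pkg][v]]
        solve!(chosen, vcat(deps, rest)) && return true
        delete!(chosen, pkg)  # undo the pick and try an older version
    end
    return false
end

# Resolve all wanted packages in one call; returns `nothing` if unsatisfiable.
function resolve(wanted::Vector{String})
    chosen = Dict{String,VersionNumber}()
    todo = [(p, collect(keys(REGISTRY[p]))) for p in wanted]
    return solve!(chosen, todo) ? chosen : nothing
end

versions = resolve(["A"])
```

If installing `B` could run a script that adds another package, `REGISTRY` would no longer describe the full problem and `resolve` would have to be called again after each installation.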

@StefanKarpinski
Sponsor Member

StefanKarpinski commented May 22, 2020

@DilumAluthge: I think it would be cool for us to eventually completely remove deps/build.jl.

Yes, that would be great. Not sure if we'll ever be able to fully get there, but we want to at least eliminate as much non-declarative package setup as possible.

@aminya: I don't understand why Julia is different here...

Allowing people to run arbitrary code when installing, configuring or setting up packages is certainly the easiest thing to do from a design perspective and it's very seductive. I suspect that's why so many systems do it: it's easy and maximally flexible.

But it really ruins reproducibility, predictability and portability. If you run arbitrary code that can look at anything on the system it's running on when configuring or installing a package, then how to even install a project implicitly depends on all of the global mutable state of the environment it happens to be running on. If arbitrary code can be run to determine the dependencies of a package, then it's not even possible to definitively say what a package's dependencies are 🙀

Most of Pkg work in 1.0 and since has been in the exact opposite direction of this: we're trying to make as much of package installation and setup declarative and immutable. This proposal does the opposite and thus is antithetical to the philosophy of Pkg.

@StefanKarpinski
Sponsor Member

StefanKarpinski commented May 22, 2020

Another point: CMake is a build system, not a package manager. Build systems have to run code; that's how you build things. That's a totally different situation. Conan is a package manager but it invokes an arbitrary build script, so it makes sense it would also allow arbitrary execution; they're just not in a position to constrain things more even if it would be beneficial. Rust is/has a build system, not a package manager. I don't believe that Cargo actually lets you run arbitrary code to determine what the dependencies of a package are. Even if it does, that doesn't mean we should.

@staticfloat
Sponsor Member

One of the primary things that kept my PyEnv.jl from being useful was the fact that pip installation often must run arbitrary code in order to determine which packages to install. It makes trying to install dependencies for a Python package (especially for a foreign platform/architecture) a real chore.

Because you can't know what packages need to be installed beforehand, you end up with longer install times (you can't fetch all packages in parallel, you have to fetch some packages, start installing them, then go fetch the dependencies that are missing from those dependencies, start installing those dependencies, and recurse), and you end up needing to actually set up every environment that you'd want to be able to install into, rather than being able to just collect everything you need for all platforms from the get-go. The Pythonistas know this is an issue, and they're moving toward a fully-declarative model with .whl files that don't allow this kind of flexibility, because it is a disaster for reproducibility.
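
A small sketch of the contrast described above, assuming a statically declared dependency table: the complete set of packages to fetch can be computed by walking the table before anything is downloaded or executed, so every fetch could then be started at once. The table `DEPS`, its package names, and the function `fetch_plan` are hypothetical and only illustrate the idea.

```julia
# Hypothetical, statically declared dependency table (names invented).
const DEPS = Dict(
    "App"        => ["HTTPClient", "XMLBackend"],
    "HTTPClient" => ["Sockets2"],
    "XMLBackend" => String[],
    "Sockets2"   => String[],
)

# Collect everything reachable from `root` without running any package code.
function fetch_plan(deps::Dict{String,Vector{String}}, root::String)
    seen = Set([root])
    queue = [root]
    while !isempty(queue)
        pkg = popfirst!(queue)
        for d in deps[pkg]
            if !(d in seen)
                push!(seen, d)
                push!(queue, d)
            end
        end
    end
    return sort!(collect(seen))
end

plan = fetch_plan(DEPS, "App")
```

With install-time scripts, the entries of `DEPS` would only become known after each package had been fetched and run, which forces the fetch, install, discover loop described above.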

@StefanKarpinski
Sponsor Member

> you can't fetch all packages in parallel, you have to fetch some packages, start installing them, then go fetch the dependencies that are missing from those dependencies, start installing those dependencies, and recurse

Not only is this slow, but it means that Python projects can't actually do version resolution correctly for reasons like what @KristofferC described. You need a full dependency graph in advance in order to do version resolution, and even with that graph, it's a non-trivial problem. If you have to run code for a version to find out what the dependencies are, that means that to do proper version resolution, you'd have to run the code for every single version of any package that you're considering installing, just to find out what it depends on and then run the setup code for that, and so on. All just to get a graph that you can use to compute which versions of which packages you actually need to install. This unfortunate design choice in pip is probably one of the major reasons why package management is such an unreliable mess in Python. Once your dependency graph requires running arbitrary code to generate, you're basically screwed. Even with a static graph it's already a very hard problem!
