Support building fully remotely #260

Open
wmertens opened this issue Jan 27, 2015 · 28 comments

Comments

@wmertens
Contributor

If you're running nixops on your OS X laptop and deploy, it will first download a bunch of things to your laptop and fail at some point because you're on Darwin. Next, you log in on the deployed remote host and add your own ssh key so that nix-build can use the host. Then, you deploy again and it starts uploading what it just downloaded. This is not a nice user story.

It would be great if there was support for a buildhost, which runs the deployment build and then pushes to the other nodes in the network. One or more of the machines in the network are marked as workers and one is marked as nixops master. If the nixops master is defined it will perform all downloads and coordinate builds, and all other machines are populated from it.

Something like that? Or at the very least documenting a workflow like "deploy a VM from this image, do this to make it into a regular nixos machine with nixops and copy your network definitions on there" would be nice.

@copumpkin
Member

👍 ran into the exact same issue

@domenkozar
Member

I have a patch somewhere...

@wmertens
Contributor Author

@domenkozar about that patch... send it to me and I'll clean it up? Or is it ready to go?

@edolstra
Member

In theory this is already supposed to work. From the code:

        # If we're not running on Linux, then perform the build on the
        # target machines.  FIXME: Also enable this if we're on 32-bit
        # and want to deploy to 64-bit.
        if platform.system() != 'Linux' and os.environ.get('NIX_REMOTE') != 'daemon':
            if os.environ.get('NIX_REMOTE_SYSTEMS') == None:
                remote_machines = []
                for m in sorted(selected, key=lambda m: m.index):
                    key_file = m.get_ssh_private_key_file()
                    if not key_file: raise Exception("do not know private SSH key for machine ‘{0}’".format(m.name))
                    # FIXME: Figure out the correct machine type of ‘m’ (it might not be x86_64-linux).
                    remote_machines.append("root@{0} {1} {2} 2 1\n".format(m.get_ssh_name(), 'i686-linux,x86_64-linux', key_file))
                    # Use only a single machine for now (issue #103).
                    break
                remote_machines_file = "{0}/nix.machines".format(self.tempdir)
                with open(remote_machines_file, "w") as f:
                    f.write("".join(remote_machines))
                os.environ['NIX_REMOTE_SYSTEMS'] = remote_machines_file
            else:
                self.logger.log("using predefined remote systems file: {0}".format(os.environ['NIX_REMOTE_SYSTEMS']))

            # FIXME: Use ‘--option use-build-hook true’ instead of setting
            # $NIX_BUILD_HOOK, once Nix supports that.
            os.environ['NIX_BUILD_HOOK'] = os.path.dirname(os.path.realpath(nixops.util.which("nix-build"))) + "/../libexec/nix/build-remote.pl"

            load_dir = "{0}/current-load".format(self.tempdir)
            if not os.path.exists(load_dir): os.makedirs(load_dir, 0700)
            os.environ['NIX_CURRENT_LOAD'] = load_dir
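
For reference, each line the snippet above writes to nix.machines has five whitespace-separated fields: the SSH target, a comma-separated list of system types, the SSH key file, the maximum number of parallel jobs, and a speed factor. A minimal sketch of building such a line outside NixOps follows; the function name `machines_line` and the example host and key path are hypothetical and not part of NixOps.

```python
def machines_line(ssh_name, key_file, systems=("i686-linux", "x86_64-linux"),
                  max_jobs=2, speed_factor=1):
    """Format one remote-systems entry the way the quoted snippet does."""
    if not key_file:
        raise ValueError("no private SSH key known for {0}".format(ssh_name))
    return "root@{0} {1} {2} {3} {4}\n".format(
        ssh_name, ",".join(systems), key_file, max_jobs, speed_factor)


if __name__ == "__main__":
    # Hypothetical host and key path, for illustration only.
    print(machines_line("203.0.113.10", "/tmp/id_deploy"), end="")
```

The defaults mirror the hard-coded values in the snippet, including the system list that its FIXME flags as possibly wrong for a given machine.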

@domenkozar
Member

Yeah the work that needs to be done is to check if platforms match and provide a way to override the default to on or off.

@domenkozar
Member

@edolstra
Member

Hm, calling uname on every machine seems unnecessary. Presumably NixOps already knows the architecture of the remote machines (e.g. from the nixpkgs.system option).
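
A rough sketch of that idea, assuming the target system strings (for example from nixpkgs.system) are already available as plain strings: compare them with the local system and allow an explicit override. The names `local_system` and `needs_remote_build` are hypothetical, and the comparison is deliberately simple; it does not model cases where one machine can build for another system type.

```python
import platform


def local_system():
    """Return a Nix-style system string such as 'x86_64-linux'."""
    machine = platform.machine().lower()
    # Assumption: map a few common uname spellings to Nix's names.
    machine = {"amd64": "x86_64", "arm64": "aarch64"}.get(machine, machine)
    return "{0}-{1}".format(machine, platform.system().lower())


def needs_remote_build(target_systems, local=None, override=None):
    """Decide whether builds should be delegated to the targets.

    override: True or False forces the decision; None means auto-detect.
    """
    if override is not None:
        return override
    local = local or local_system()
    # Any target whose system differs from ours cannot be built locally.
    return any(system != local for system in target_systems)
```

With this shape, no uname call on the remote machines is needed, and a command-line flag could feed `override` to force the behaviour on or off.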

@wmertens
Contributor Author

@edolstra actually there are two problems and my report above doesn't highlight them properly.

  • Remote building doesn't work out of the box, due to nix-build not getting SSH credentials from nixops
  • When building remotely does work, all packages are still downloaded locally and pushed from local. This is very slow.

@wmertens
Contributor Author

How about designating a push host, from which the other systems get populated? It would default to localhost, but if it were set to the build host, and the SSH behaviour were fixed, it would behave the way I'd prefer.

@edolstra
Member

The problem with a push host is that there is no guarantee that it has connectivity to the other machines (e.g. if you deploy a network with EC2 machines in different regions).

@wmertens
Contributor Author

@edolstra But in that case you simply keep the push host as localhost and then things are as they are now.

I'm now wondering how to make that work. The push host needs to initiate the builds, so it needs to run nix-daemon and the .drv files need to make it over there, but other than that it doesn't need anything, right?

So deploying from scratch with a remote push host would be:

  • nixops scans config for hosts, brings up servers
  • if remote push host
    • build .drv files with correct platform
    • once push host up, copy .drv files over, run nix-daemon and instantiate

Would that work?

@wmertens
Contributor Author

Forgot to add:

  • Once everything is built, use the push host to push everything and activate as normal
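
The push-host steps proposed here could be sketched as a list of commands. This is only an illustration under stated assumptions: the .drv path comes from a local nix-instantiate run, the output path is known, the push host can reach every target over SSH with suitable credentials, and activation is left to NixOps as usual. The function name `push_host_plan` is hypothetical.

```python
def push_host_plan(drv_path, out_path, push_host, targets, user="root"):
    """Return the commands, in order, for a build coordinated by a push host.

    drv_path is assumed to come from a local nix-instantiate run, and
    out_path is the store path that realising drv_path produces.
    """
    push = "{0}@{1}".format(user, push_host)
    plan = [
        # Copy the .drv closure from the deploying machine to the push host.
        ["nix-copy-closure", "--to", push, drv_path],
        # Realise (build or substitute) the derivation on the push host.
        ["ssh", push, "nix-store", "--realise", drv_path],
    ]
    for target in targets:
        if target == push_host:
            continue  # the push host already has the result
        # Push the result from the push host to each remaining machine.
        plan.append(["ssh", push, "nix-copy-closure", "--to",
                     "{0}@{1}".format(user, target), out_path])
    return plan
```

The plan is returned rather than executed so that it can be inspected first. The connectivity concern raised earlier still applies: the last step only works if the push host can actually reach the other machines.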

@domenkozar
Member

Just for reference, this also addresses #195

@ryanartecona
Contributor

I just downloaded and started playing with nixops, starting with examples from the manual, and hit this issue pretty quickly. I'm also too new to Nix{,OS,ops} to unstick myself even by manually fixing things up as described in this issue description. No fun!

@jezen

jezen commented Jan 16, 2016

👍

I'm trying to deploy and it's taken ~8 hours so far.

@domenkozar
Member

@aszlig can you post your findings here? Just for future reference :)

@aszlig
Member

aszlig commented Feb 27, 2016

The problem with NIX_REMOTE_SYSTEMS is that it still needs setup on the deployment machine, because you can't just "insert" build hooks into an already existing Nix daemon. So it only works if you unset NIX_REMOTE but you'd need to have write access to the local store.

Another idea I had was to instantiate the individual machines, copy-closure the .drv to the target machines and run a nix-build over there. The problem however is that the results are still needed on the deployment machine, but we can't copy them back unless we're in trusted-users, so there still is setup required. Also this doesn't properly divide the builds among all the machines in the deployment, so slow target machines could still be the bottleneck.
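
The first point can be made concrete with a small check that mirrors the condition in the snippet quoted earlier in this thread: the build-hook path is only taken when the deploying machine is not Linux and NIX_REMOTE is not set to daemon. The function name `build_hook_applicable` is hypothetical, and the check says nothing about write access to the local store, which is still required.

```python
import platform


def build_hook_applicable(environ, system=None):
    """Mirror the condition under which the quoted snippet sets up the hook."""
    system = system or platform.system()
    # Going through an existing daemon means the hook cannot be injected.
    return system != "Linux" and environ.get("NIX_REMOTE") != "daemon"
```

Calling it as `build_hook_applicable(os.environ)` before a deployment would show up front whether the remote-systems setup has any chance of being used.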

@wmertens
Contributor Author

wmertens commented Aug 3, 2016

I made issue #483 for allowing builds on OS X to work out of the box for everyone.

I will leave this one open to discuss the push-host idea more. @aszlig would the push-host described above fix your comments?

@gilligan
Contributor

FWIW I put together something to easily set up a linux remote builder running docker (https://github.com/holidaycheck/nix-remote-builder). Based on the code that @edolstra pasted having a remote builder configured would be a workaround for this I assume?

Obviously not needing the remote builder in the first place would be better I suppose.

@domenkozar
Member

Relevant #412

@samuela
Member

samuela commented Oct 3, 2020

This feature is unfortunately a dealbreaker for me. I love Nix, but I also have to run on macOS. I was surprised to learn that this is not how nixops actually works. IMHO it severely limits the potential userbase for nixops and puts a damper on growth relative to other tools like terraform.

Also seems that this would solve a number of other issues brought up in the past: #560, #976.

@jezen

jezen commented Oct 5, 2020

> This feature is unfortunately a dealbreaker for me. I love Nix, but I also have to run on macOS. I was surprised to learn that this is not how nixops actually works. IMHO it severely limits the potential userbase for nixops and puts a damper on growth relative to other tools like terraform.
>
> Also seems that this would solve a number of other issues brought up in the past: #560, #976.

FWIW, I did end up getting this working just fine with the help of this article:

https://medium.com/@zw3rk/provisioning-a-nixos-server-from-macos-d36055afc4ad

@tobiasBora

I think I still don't get why we need to download everything on the client side. Couldn't we just add an option to push the configuration.nix file to all the servers, and then run a switch on each server? That way, most of the time the servers will just download precompiled binaries from the NixOS cache, which is much more efficient than downloading them from the client (who may have a very poor connection).
And if one really wants to avoid building things several times when there are multiple servers, we could set up an optional cache/builder machine.
Also, this issue is not only about macOS: you have the same trouble when the client and the server run on different architectures (in my case the client is x86_64 and the server is a Raspberry Pi aarch server).
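
A minimal sketch of that proposal, assuming each server can be handed a self-contained configuration.nix (which a NixOps deployment does not necessarily provide) and that root SSH access is available. The function name `remote_switch_commands` is hypothetical; the commands are only assembled, not run.

```python
def remote_switch_commands(hosts, config="configuration.nix", user="root"):
    """Build the copy and switch commands for each server."""
    cmds = []
    for host in hosts:
        dest = "{0}@{1}".format(user, host)
        # Copy the configuration to the default NixOS location.
        cmds.append(["scp", config, dest + ":/etc/nixos/configuration.nix"])
        # Let the server evaluate, fetch or build, and activate by itself.
        cmds.append(["ssh", dest, "nixos-rebuild", "switch"])
    return cmds
```

Each server then pulls substitutes from whatever binary caches it is configured to use, so nothing has to pass through the client's connection.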

@wmertens
Contributor Author

wmertens commented Oct 7, 2020

@tobiasBora It's not as easy as it sounds because the configuration needs to be complete. So in fact the configuration.nix would need to set the environment to what it is on the build host, import the nixops deployment configuration and extract its host configuration, and include all its nix dependencies. Possible, but not trivial.

But indeed, adding your own cache is probably pretty easy and then what you propose would be really nice. The added bonus is being able to run nixos-rebuild on the server and it working.

@samuela
Member

samuela commented Oct 7, 2020

> @tobiasBora It's not as easy as it sounds because the configuration needs to be complete. So in fact the configuration.nix would need to set the environment to what it is on the build host, import the nixops deployment configuration and extract its host configuration, and include all its nix dependencies. Possible, but not trivial.

I'm not super familiar with nixops, but this doesn't actually sound that hard. In fact, all these steps could still run on the client machine, IIUC. The client machine could evaluate the nixops expression, send the right subexpressions to the right machines, and then run nixos-rebuild on each of those machines over ssh.

@wmertens
Contributor Author

wmertens commented Oct 7, 2020

Indeed, this is all technically possible and even desirable, and if this were JavaScript someone would probably have already done it.

I know there is a Haskell parser for Nix, perhaps that project can do the heavy lifting.

There's also nixjs written by @svanderburg but I don't know if it is at a level that it can extract dependency trees from Nix.

Even something that copies all the files that might be needed instead of only the files that are needed would be great.
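
As a rough illustration of that over-approximation, the sketch below lists every .nix file under a deployment directory instead of tracing real imports. The function name `candidate_files` and the choice of suffixes are assumptions; a real deployment may also depend on non-.nix files, which can be covered by widening `suffixes`.

```python
import os


def candidate_files(root, suffixes=(".nix",)):
    """List every file under root that a deployment might import."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip hidden directories such as .git.
        dirnames[:] = [d for d in dirnames if not d.startswith(".")]
        for name in filenames:
            if name.endswith(suffixes):
                full = os.path.join(dirpath, name)
                found.append(os.path.relpath(full, root))
    return sorted(found)
```

The resulting relative paths could then be copied to the server in one go, trading some wasted transfer for not needing a Nix parser.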

@samuela
Member

samuela commented Oct 7, 2020

> Even something that copies all the files that might be needed instead of only the files that are needed would be great.

Yeah, this seems like a good way to go. And pretty easy too!

@tobiasBora

tobiasBora commented Oct 7, 2020 via email


10 participants