Artifact Specification #8
Make sure to talk about reproducibility, Nix vs Docker, the CI/CD system and staging of the artifacts, and also fixed-output derivations. @mokuki082 Can you add in all the links I sent to you? |
While investigating the Artifact Specification, we are comparing Docker Artifacts vs Nix Artifacts:
|
The Artifact composes several ideas:
With regards to initialisation routines, there can be complexity when the artifact specifies not a single executable, but a group of executables that are executed under an init process, sort of like supervisord, or Erlang supervisor processes. If we take Docker containers as an example, the final image layer often has a file injected into it. What the executable might then do is bind to ports (if it is a networked application), or perform operations on a filesystem path (if it is a batch application).

The V1 spec specifies what ports are available to be exposed by the image. So if the container has MySQL, then the V1 spec specifies that 3306 is exposed; if it is a Postgres container, there is metadata that says 5432 is exposed. This information is important, because these ports are often fixed ("hardcoded") within the image layer. If this is true, we can't change it at the launch stage. Preferably these port bindings would not be hardcoded into the container, and we could dynamically inject these parameters during the launching phase. Think of the launching phase as calling a function: this is the difference between a bound parameter and a hardcoded value. For example see https://docs.docker.com/engine/reference/builder/#expose (the Dockerfile is the build expression for a Docker container).

It seems that most containers do hardcode the port expose. If this is true, port expose is a property of the Artifact, and thus must be understood by the Orchestrator when it wants to map it to an IP address. Note that port bindings by the internal application tend to be hardcoded by the Docker community simply because they expect that the user of the container will do port mappings. Implementation-wise, suppose every Automaton is given its own IP address. If Automaton A requires Automaton B, and B exposes port 8080, then A's address for B must include the port mapping. A shouldn't care exactly what port it is.
But the relay system must map A's address to that port. If the port expose is a property of the artifact, this means that while the Orchestrator/Relay can choose to use a random port on A's address, it must eventually map to the specified port that has been hardcoded into the image.

Now what do we actually mean by hardcoding? Most network executables allow you to specify the port as a command line parameter. However the port parameter may be fixed internally by an entrypoint script, or even by the Dockerfile. @mokuki082 Can you check how the OCI spec considers this, and where the entrypoint/cmd information is stored in the new container specification? If this is not part of the "image", our Artifact Specification will need to address it somehow.

There is an important consideration for entrypoint/cmd: https://www.ctl.io/developers/blog/post/dockerfile-entrypoint-vs-cmd/ Basically we should always be using the exec syntax, not the shell syntax. Note that the TCP/IP model is not the same as the OSI model; they are different. For more info see: https://tools.ietf.org/html/rfc3439#section-3

If we are utilising containers without the Dockerfile, then I suspect that the container format will need some metadata that indicates what things are configurable and what things are not. If the internal port is configurable, that would be nice, but we cannot expect this. So we have to work around the lowest common denominator. |
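The "bound parameter vs hardcoded value" distinction above can be sketched as a Nix expression where the port is a function argument injected at build time. This is a sketch assuming nixpkgs' dockerTools API; redis is just an illustrative networked application.

```nix
# Sketch: the exposed port as a bound parameter of the build, not a value
# hardcoded deep inside the image. Assumes nixpkgs with dockerTools.
{ pkgs ? import <nixpkgs> {}, port ? 6379 }:

pkgs.dockerTools.buildImage {
  name = "redis-example";
  tag = "latest";
  config = {
    # exec syntax (an argv list), not shell syntax
    Cmd = [ "${pkgs.redis}/bin/redis-server" "--port" (toString port) ];
    # the port expose is recorded as image metadata, derived from the
    # same parameter as the command line
    ExposedPorts = { "${toString port}/tcp" = {}; };
  };
}
```

Note the trade-off: changing `port` changes the image (and therefore its content hash), so this moves the binding from launch time to build time rather than eliminating it.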
Consider the problem of whether the Artifact Specification specifies a fixed-output derivation ("Nix nomenclature") or the actual instructions on how to build the artifact. There's a difference between:
vs
Both expressions express reproducibility, or at least immutable content addressing. The first gives you instructions on how to build the artifact. This can be utilised by the CI/CD system to build the artifact for the orchestrator to deploy. In fact the orchestrator just needs the artifact in the build cache; if it doesn't have it, it triggers the build instructions. The other expression is slightly different, because the build expression is not self-contained. It relies on something external that is assumed to exist. It's sort of like a magic value, assuming the upstream sources are reliable (by using IPFS or otherwise).

What does this mean for the Artifact Specification? Should the Artifact Specification have the actual build expression encoded in it? Or should it just point to the fixed-output derivation hash of some external build system, like Docker or Nix? We can be flexible here, as long as it meets the abstract requirements of "reproducible builds": https://reproducible-builds.org/. The Artifact Specification can either embed an entire build expression, or it can point to an external build file. But we have to be careful here: pointing to an external file is a source of mutability. What if that file got changed? We can operate similarly to Nix here, where the evaluation of this source considers the contents of the file at that point, and saves the output with the hash of all the inputs. If that file changes, then the Nix expression will produce a different output hash. But that is problematic, because Automatons may be composed with other Automatons, and Automatons are all addressable via content hashes. This means that as soon as that file changes, the composing expressions will no longer work (if they were using hashes). So... instead, perhaps relying on external files must use a fixed-output derivation, just like how Nix refers to external sources via a declared content hash.
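As a concrete sketch of the fixed-output approach (the URL and hash below are placeholders): the derivation pins the content of the external source, so if the upstream file changes, the build fails with a hash mismatch instead of silently producing a different artifact.

```nix
# Fixed-output reference to an external source: the declared sha256 pins the
# content itself, not the fetch instructions. URL and hash are placeholders.
{ pkgs ? import <nixpkgs> {} }:

pkgs.fetchurl {
  url = "https://example.org/artifact-src.tar.gz";
  sha256 = "0000000000000000000000000000000000000000000000000000";
}
```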
We also have to be concerned with Dockerfiles that are not reproducible, because they may have aspects that are impure, like network downloads. This is the same problem the nixpkgs community faces when they are given packages whose build scripts perform impure operations or rely on impure properties. Usually this means the nixpkgs community rewrites the build expressions or edits them to make them pure. Can we detect impurities statically? Not in a guaranteed manner. But we can again rely on Nix's way of doing it: they use a pure "container" (sandbox) to run builds, and if the builds fail in their CI system (Hydra), then the package is rejected by the Nix community. See |
I think that the configuration that an automaton depends on (e.g. a Dockerfile) should not affect the addressing of the automaton, but the output of that configuration should. In the end, all we care about is the output of the configuration, rather than the configuration itself. For example, just adding a newline to a Dockerfile shouldn't require us to change the automaton's address, and with it all the other automatons that interact with or depend on this automaton. In the case of the Dockerfile, we might want to compare the produced OCI image manifest rather than the Dockerfile itself, and in the case of NixOS, we would care about the hash of the actual package being installed in the system rather than the build script itself. |
If we imagine that |
One way to deal with this hash-and-change problem is to have a composition of hashes. Even if the Automaton hash changes, if the Artifact hash doesn't change, then we can reuse the built artifact rather than rebuilding speculatively. |
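A minimal sketch of such a hash composition (the attribute names and hash values here are hypothetical, not part of any existing spec): the Automaton address carries both hashes, so a change to the build expression only invalidates the cached build when the artifact hash itself changes.

```nix
# Hypothetical shape of a composed address. If only `expression` changes but
# `artifact` stays the same, the previously built artifact can be reused.
{
  automaton = {
    expression = "sha256-aaaa0000";  # hash of the build expression/config
    artifact   = "sha256-bbbb1111";  # content hash of the built artifact output
  };
}
```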
Perhaps these should have hashes as well, since the
|
Just had an idea today about this. Certain artifacts may require certain hardware properties in order to run. This is also a constraint on the deployment, and on which node can supply the resources required. For example an Automaton may require access to the GPU. These properties may not be recorded in the Artifact image, but instead recorded in the Artifact specification, just like the Image Index in the OCI spec. The Image Index lists multiple artifacts, one per CPU architecture. We could do something similar for GPU architectures and other hardware properties. I'd prefer something that would be a composable list of constraints, rather than magic strings like "x86 with CUDA GPU... etc"; this may be mapped to Node tests or however we are supplying the Nodes. |
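One possible shape for such a composable constraint list (all attribute names here are hypothetical, purely illustrative):

```nix
# Hypothetical hardware constraints attached to an Artifact, expressed as
# structured data rather than magic strings; an Orchestrator could match
# these against Node capabilities.
{
  constraints = [
    { resource = "cpu"; arch = "x86_64"; }
    { resource = "gpu"; vendor = "nvidia"; api = "cuda"; minMemoryGiB = 4; }
  ];
}
```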
So far we've had some ideas on what the artifact component could look like.
Dockerfile
Problems:
Nix Expression
Nix has a special API, dockerTools, which allows creation of Docker images. We will probably be looking into writing our own API, because dockerTools is quite limited and unstable, but it is a nice entrypoint for me since I haven't got much experience in writing Nix expressions. I'll investigate this in more detail. |
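For reference, a minimal dockerTools sketch (assuming a recent nixpkgs): building it with `nix-build` produces an image tarball that `docker load` can import.

```nix
# Minimal Docker image built from a Nix expression. `nix-build` writes a
# tarball to ./result, which `docker load < result` imports.
{ pkgs ? import <nixpkgs> {} }:

pkgs.dockerTools.buildImage {
  name = "hello";
  tag = "latest";
  config.Cmd = [ "${pkgs.hello}/bin/hello" ];
}
```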
We can do more than just Nix expressions that create a container. It may eventually be possible to use any Nix expression. According to the image spec, a container is just a series of filesystem layers unioned together. It is not necessary for that to be the basis of an Automaton; it's just fashionable right now. Using Nix expressions opens up other avenues, such as ISOs, VMs, unikernels, plain executables... etc. But yes, for now we shall focus on Docker/OCI containers. |
With regards to using Nix expressions for the artifact spec: if we want to address the Nix expression used to generate an artifact, we could hash the store derivation generated from the Nix expression. |
We may need to generate Nix expressions, like converting Architect constructs to Nix constructs, and performing other data structure or on-disk Nix expression manipulation. To do this, we can look at how dhall-nix works (https://hackage.haskell.org/package/dhall-nix). Notice it depends on hnix (https://hackage.haskell.org/package/hnix), which calls itself a Haskell implementation of Nix. Alternatively consider: https://hackage.haskell.org/package/language-nix

Also can you try building a Docker/OCI image directly using just Nix expressions? What should we use as our execution language? I'm guessing a scripting language, as that's usually what is used. But I'm wondering if our Architect specification can have a cross-language quasiquoter, and just allow direct specification of Nix or whatever build language that is deterministic. |
As a clarification, I'm referring to grammar composition (this is often useful when embedding external DSLs). See atom-haskell/language-haskell#88 for usage in an IDE context. Also: http://lambda-the-ultimate.org/node/4489 It would be nice to be able to even refer to a Nix expression file written outside and deal with that somehow. But I still need confirmation on how to make the entire thing content addressed, when the very fact of importing an external file is already non-deterministic (as it involves IO). I'll need to investigate more about how the GHC system does quasiquotation, and maybe we can lift some features out of it. |
Just to clarify what I meant with an example: if we write a Nix expression that imports sources such as "./builder.sh" or "http://source.html", this is not deterministic, because the content of "./builder.sh" could change the next time you run the same Nix expression, which means the expected output could change even though the Nix expression remains the same. Therefore the hash of a Nix expression is not enough to be used as a reference to a particular derivation. But what we can do is recursively generate a store derivation over all sources and dependencies used in the Nix expression, which produces a content hash of all the sources mentioned in a derivation, in a format like /nix/store/<hash>-<name>.drv |
The new Nix 2.0 adds some new features and mentions this interesting fact:
It seems that there are different evaluation modes that can produce certain useful properties. |
Nix store paths are composed purely from their inputs, and are not derived from the output of the build. This can be demonstrated by the fact that Nix store derivations are made before the build process. However, this introduces the possibility of duplicate outputs. In Nix, this is usually not a problem, because all its packages come from a centralised repository, i.e. nixpkgs. If someone writes two Nix derivations that produce the same output, they are likely to be told that this package already exists. But for a distributed system with multiple operators, this may become problematic if the operators are not communicating with each other and write different expressions that produce the same output, which causes duplicate data. |
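The input-addressing point can be demonstrated with two derivations whose outputs are byte-identical but whose store paths differ, because the paths are computed from the inputs. This is a sketch assuming nixpkgs' runCommand helper.

```nix
# a and b build byte-identical outputs, yet receive different store paths,
# because the path is derived from the inputs (the .drv), not the output.
{ pkgs ? import <nixpkgs> {} }:

{
  a = pkgs.runCommand "greeting" {} ''echo hello > $out'';
  b = pkgs.runCommand "greeting" { extraInput = "ignored by the builder"; }
        ''echo hello > $out'';
}
```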
After some investigation into how Nix generates store paths, here's what I found (in short):
Input Path
OutPath
Before the final store derivation hash is computed, Nix computes the output path where the final output is going to be built, i.e. the value in the
Fixed-output Derivation
A derivation can take three special attributes: outputHash, outputHashAlgo, and outputHashMode.
If a derivation contains these special attributes, a special
Then we compute the final path just like step 3 in the first section.
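A sketch of a derivation carrying these attributes (the URL and hash are placeholders): because the output hash is declared up front, Nix permits the builder to perform otherwise impure actions such as a network download.

```nix
# Fixed-output derivation: outputHash pins the result, which is what licenses
# the impure network fetch inside the builder. URL and hash are placeholders.
{ pkgs ? import <nixpkgs> {} }:

pkgs.stdenv.mkDerivation {
  name = "upstream-source";
  nativeBuildInputs = [ pkgs.curl ];
  buildCommand = ''
    curl -L -o $out https://example.org/source.tar.gz
  '';
  outputHash = "0000000000000000000000000000000000000000000000000000";
  outputHashAlgo = "sha256";
  outputHashMode = "flat";
}
```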
Final Derivation
-- to be updated. |
The artifact specification is intended to represent the backing artifact that implements the protocol specification. For now we have 2 options, and this depends on the substrate that we want to support. Given that we have specified our substrate for now to be 64-bit Intel Linux running NixOS, the 2 artifacts are Docker/OCI Container Images and Nix Archives. This model should be extendable to include other artifact formats (like the Java stuff).
Docker Image Specification
Previously Docker has been using the V1 image spec. By running

docker save | tar -x

on an ubuntu image, we can see how the V1 spec is structured. (Currently the V1 spec is only used in docker load and docker save, but not in the actual deployment or processing of the image itself.)

There are a few problems:

- Since the manifest.json file in the root of the image already describes the layer hierarchy, the layers don't each need a specific json file that also has the hash of the parent layer.
- Each layer's json file carries configuration such as entrypoint, cmd, and domain name, which should be container specific but not layer specific.

Now, Docker is aware of the issues above and designed a V2 spec. The V2 spec adopts the OCI image specification and is a lot more sophisticated. Its description is as follows:
A mediaType is introduced to indicate the type of each configuration component. Some mediaTypes in Docker are interchangeable with the mediaTypes in the OCI spec (see the Compatibility Matrix).
Images must be referenced by name plus digest, e.g. ubuntu@<content_hash>. This disallows us from referring to a container solely by its content hash. We might have to implement our own image registry if we want to do this differently. (See: how to pull an image by digest.) The differences seem to be:
Fuller hash verification details can be found in the OCI descriptor specification. However, this doesn't really mention what is being hashed, but rather how to verify a hash.
Relation to Artifact Spec
Relation to the underlying implementation
Docker registries require referring to an image by name plus digest (e.g. ubuntu@<hash>). This could be solved by making our own registry; however, further study is needed to know the complexity of this problem. (Potentially we might also be able to use IPFS instead of a container registry.)

RunC is a CLI for container creation/destruction at a lower level; libcontainer is the library that it uses to create OCI-compliant containers. It contains simple mechanisms to initiate containers, but without Docker's orchestration and monitoring of the containers, which is exactly what we want.

Next Step
The OCI Runtime specification is needed to understand how to run the container from the orchestrator. DONE