GPG verification and signing of packages #208

Open
muziker opened this issue Apr 17, 2019 · 17 comments


@muziker

muziker commented Apr 17, 2019

Currently, any package can be uploaded to the repository without any verification of who uploaded the compiled tarballs.

Would it be possible, instead of submitting compiled tarballs directly to the site, to have an upload script on the user's machine that hashes the package and signs that hash before uploading to the site?

During the download phase, the signed hash shipped with the tarball can be used to verify the integrity of the tarball and, at the same time, the authenticity of the person who uploaded it. PKI is complicated, and not every developer wants to deal with maintaining private keys. Still, the upload script could hide the details of submitting a key to a keyserver, while the download script could check for good signatures on the downloaded packages.

This would help in following the provenance of the tarball.
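
A minimal sketch of the hash-and-sign flow described above, assuming GnuPG is installed and a default signing key exists. The function names (`sha256_of`, `sign_command`, `verify_command`) and the `.asc` naming are hypothetical and not taken from deken; only `--armor`, `--output`, `--detach-sign` and `--verify` are actual gpg options.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def signature_path(package: Path) -> Path:
    """Hypothetical convention: the signature sits next to the package."""
    return package.with_name(package.name + ".asc")


def sign_command(package: Path) -> list[str]:
    """Upload side: gpg call that writes a detached, ASCII-armored signature."""
    return ["gpg", "--armor", "--output", str(signature_path(package)),
            "--detach-sign", str(package)]


def verify_command(package: Path) -> list[str]:
    """Download side: gpg call that checks the detached signature."""
    return ["gpg", "--verify", str(signature_path(package)), str(package)]
```

Either command list could be passed to `subprocess.run(..., check=True)`. A good signature only shows that the holder of some key signed the file; whether that key belongs to the claimed uploader is a separate question.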

@umlaeute
Contributor

the deken script attempts to pgp-sign any package and uploads the signature alongside the package itself (this is what you call the "upload phase").

the deken-plugin (as built into Pd, but which can also be updated separately; this is the "download phase"), currently doesn't do any signature verification (which of course should be fixed in the long run).

so the basic infrastructure is already there.

your additional suggestion is about "automatic signatures", which i either don't understand or at least don't see any positive effects in.

signing packages is about adding trust in the origin of these packages.
creating a key and uploading it to a keyserver might be easily automatable, but it doesn't create an iota of trust. anybody and their dog can do that (and a black hat can do it "for" you).
the complicated part is connecting your key to your identity. this is not something we can automate. (this not being easily automatable is also the reason why pgp is not very successful; sure, pgp/gpg also has an ugly user interface.)

@umlaeute
Contributor

umlaeute commented Oct 2, 2019

i think the biggest issue is that I don't know yet how to design the GPG verification.

a package can have one of the following states:

  • the GPG-signature is correct and the signer is known
  • the GPG-signature is correct but the signer is unknown
  • the GPG-signature is incorrect
  • the package has no GPG-signature at all (missing)

the only obvious reaction i can think of is to reject packages that have an incorrect GPG-signature.
there might be an option to accept (or not) packages with a missing signature.
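
The four states and the two reactions named so far can be sketched as follows. This is an illustration, not deken code: `SigState`, `classify` and `basic_reaction` are hypothetical names. The sketch assumes the status lines that `gpg --status-fd` emits (`GOODSIG`, `BADSIG`, `NO_PUBKEY`). Since gpg cannot check a signature without the signer's public key, `NO_PUBKEY` is mapped to the "signer unknown" state here, which is an interpretation.

```python
from enum import Enum
from typing import Optional


class SigState(Enum):
    GOOD_KNOWN = "signature correct, signer known"
    GOOD_UNKNOWN = "signer unknown (key not in keyring)"
    BAD = "signature incorrect"
    MISSING = "no signature at all"


def classify(status_output: Optional[str]) -> SigState:
    """Map gpg status output to a state; None means no signature file exists."""
    if status_output is None:
        return SigState.MISSING
    keywords = set()
    for line in status_output.splitlines():
        parts = line.split()
        if len(parts) > 1 and parts[0] == "[GNUPG:]":
            keywords.add(parts[1])
    if "BADSIG" in keywords:
        return SigState.BAD
    if "GOODSIG" in keywords:
        return SigState.GOOD_KNOWN
    if "NO_PUBKEY" in keywords:
        return SigState.GOOD_UNKNOWN
    return SigState.BAD  # anything unrecognised is treated as a failure


def basic_reaction(state: SigState, allow_unsigned: bool = False) -> Optional[bool]:
    """True = install, False = reject, None = policy still undecided."""
    if state is SigState.BAD:
        return False
    if state is SigState.MISSING:
        return allow_unsigned
    return None
```

The `None` return value marks exactly the cases for which no policy has been settled yet.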

But what to do about the rest?

  • If the signer is unknown, we might want to suggest (and help) to fetch the signer's key from a keyserver (urgh, that sounds like a lot of brittle work)
  • If the signer is known, what does this mean in terms of trust? not much (only that the user has downloaded the key from the keyserver once...).

We should have some WoT verification, e.g. allowing

  • only signatures for which we can establish a chain-of-trust to the user themselves
  • only signatures for which we can establish a chain-of-trust to some gatekeeper
  • ...
  • a combination thereof

One idea is to allow the user to explicitly specify a number of authorities (or rather their keys) from which they accept packages (and to reject all packages signed by other people).
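
A rough sketch of how both ideas (a chain-of-trust and an explicit list of authorities) could share one check. Everything here is hypothetical: `has_trust_chain` is an illustrative name, the fingerprints are placeholders, and the `certifications` mapping (key to the set of keys it has signed) is assumed to have been extracted from a keyring beforehand.

```python
from collections import deque


def has_trust_chain(signer, anchors, certifications, max_depth=3):
    """Return True if an anchor reaches `signer` in at most `max_depth` hops.

    `anchors` are the keys the user trusts directly (their own key, a
    gatekeeper, or a list of authorities). `certifications` maps a key
    to the set of keys it has signed.
    """
    anchors = set(anchors)
    if signer in anchors:
        return True
    seen = set(anchors)
    queue = deque((anchor, 0) for anchor in anchors)
    while queue:
        key, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not follow certifications beyond the hop limit
        for certified in certifications.get(key, ()):
            if certified == signer:
                return True
            if certified not in seen:
                seen.add(certified)
                queue.append((certified, depth + 1))
    return False
```

With `max_depth=0` this degenerates into the strict allowlist (only packages signed directly by a listed authority pass); larger values admit longer chains.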

@muziker
Author

muziker commented Oct 14, 2019

The main problem with public key infrastructure is that, while Diffie-Hellman is fine for exchanging keys, the crux of the matter, namely determining whether the keys are valid in the first place (i.e. whether they belong to whom they are purported to belong), is not really addressed, other than through the "go forth and signing party" mantra of the web of trust.

Browsers do this by preinstalling a set of certificate authorities, akin to pre-shared keys, and regularly updating them through the repository update process. If Pure Data included an independent set of keys that we can trust, that would be a first step toward creating a trusted keyring. For instance, developers could get together at a Pure Data conference and verify each other using actual photo IDs; that would be one method. The other method depends on accrued reputation as a form of verification. For instance, Miller Puckette has an accrued reputation, which means that a key falsely attributed to him would be found out and revoked more quickly, and that he would be less likely to upload malicious code. When two or more individuals with accrued reputation meet and sign each other's keys, a network effect of trust begins. They form the trust anchors on which the entire community can base its downloads. So the first step is requiring developers to sign their uploads; when their accrued reputation becomes significant and they meet others from the community, their uploaded keyrings can serve as trust anchors for others.

As it is, the web of trust for most people is simply downloading keys from a keyserver without actually being able to verify those keys. That's more like 'the thread of transient temporality' really.

@umlaeute
Contributor

so what should we do?

@muziker
Author

muziker commented Oct 15, 2019

Currently there isn't really a great way of doing this. Fundamentally, no one really wants to sift through uploaded code to verify that it functions the way it is supposed to, so this scheme mostly focuses on preventing the download of code that differs from what was uploaded:

Phase 1:

  1. Get all developers to sign their releases
    1.1 users can independently get the public keys from the sks keyservers.
  2. the upload server must keysign the releases too
  3. puredata is distributed with a set of public keys that users can import
    3.1 users can at least verify the downloads have been signed by the upload server
    3.2 users can verify developer keys from the sks keyservers
    3.3 users can verify distro distributed pure data public key with the github downloaded public key or CD distributed distro public key to verify the keys haven't been tampered with

Phase 2:

  1. There should be a way to import the public keys and the ownertrust of individuals with high reputation. Ultimately, if I download a program and it is signed by an individual whose key is trusted by this group of ownertrusts, then I can safely trust that the download can be traced back to an actual developer.
  2. If there is an immutable log of <key, hash digest, program> tuples, then any tampering can easily be found.
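
One way to picture the log of <key, hash digest, program> tuples is a hash chain, where each entry commits to the one before it. The sketch below is an assumption about how such a log might look, not an existing deken or puredata.info feature; `AppendOnlyLog` and its field names are made up for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


class AppendOnlyLog:
    """In-memory hash chain of (key, digest, program) records."""

    def __init__(self):
        self.entries = []

    @staticmethod
    def _hash(record):
        # Hash only the payload fields, in a stable order.
        body = {k: record[k] for k in ("key", "digest", "program", "previous")}
        return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()

    def append(self, key_fingerprint, digest, program):
        previous = self.entries[-1]["entry_hash"] if self.entries else GENESIS
        record = {"key": key_fingerprint, "digest": digest,
                  "program": program, "previous": previous}
        record["entry_hash"] = self._hash(record)
        self.entries.append(record)

    def verify(self):
        """Return False if any entry was altered, removed or reordered."""
        previous = GENESIS
        for record in self.entries:
            if record["previous"] != previous:
                return False
            if record["entry_hash"] != self._hash(record):
                return False
            previous = record["entry_hash"]
        return True
```

A chain like this is only tamper-evident, not immutable: whoever holds the log could rewrite all of it, so the latest `entry_hash` would still have to be published somewhere the server operator cannot change.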

@umlaeute
Contributor

  1. Get all developers to sign their releases

this is probably the biggest hurdle in the entire game.

  2. the upload server must keysign the releases too

what's the "upload server"?

3.3 users can verify distro distributed pure data public key with the github downloaded public key or CD distributed distro public key to verify the keys haven't been tampered with

what does "github" have to do with this? what do you mean by "CD distributed"?
we don't use either of these channels to bring binaries (of either Pd or externals) to the world.

@muziker
Author

muziker commented Oct 16, 2019

  1. Getting developers to sign their releases entails some kind of key management on their part, which is usually a large hassle. Not least, the management of subkeys is also pretty gnarly.

  2. The upload server is the server which receives the uploaded tarballs from deken.

3.3 GitHub, or for that matter any other entity that distributes Pd, should have a copy of the public key. With the public key distributed by many entities, a user could check for tampering by downloading Pd from as many sources as possible and comparing the public keys. CD distribution is the method by which distros are distributed on compact discs through magazines, meetups, etc. CD distros have the advantage of being an immutable checkpoint, where one can say that at this point our processes lead us to believe the distro is OK. A public key distributed with Pd and included as a package on a repo disc would likewise serve as a nice physical immutable checkpoint.
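
The cross-channel comparison could look roughly like this. It is a sketch under assumptions: the channel names are made up, and the key copies are compared byte for byte via SHA-256. Two exports of the same OpenPGP key can differ in bytes (armor, attached signatures), so comparing the fingerprints that gpg reports would be more robust in practice.

```python
import hashlib


def group_by_digest(copies: dict[str, bytes]) -> dict[str, list[str]]:
    """Group distribution channels by the SHA-256 of the key file they ship."""
    groups: dict[str, list[str]] = {}
    for channel, key_bytes in copies.items():
        digest = hashlib.sha256(key_bytes).hexdigest()
        groups.setdefault(digest, []).append(channel)
    return groups


def odd_ones_out(copies: dict[str, bytes]) -> list[str]:
    """Channels whose key copy differs from the most common copy."""
    groups = group_by_digest(copies)
    if len(groups) <= 1:
        return []
    majority = max(groups.values(), key=len)
    return sorted(channel
                  for channels in groups.values() if channels is not majority
                  for channel in channels)
```

A majority vote like this only helps when the channels are truly independent; if they all mirror one compromised source, they will agree with each other.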

@muziker
Author

muziker commented Oct 16, 2019

  1. Also, part of the problem has always been that the keys are on disk in their directory somewhere, as opposed to being kept off disk.

@umlaeute
Contributor

thanks for your clarifications.

regarding "CD as a medium for distros": we already have Pd in linux distros like Debian. These distros come with a lot of packaged externals. and these distros come with an established way of guaranteeing the integrity of their packages (using gpg signatures on a distro level, reproducible builds,...).
So I think if somebody really wants to be sure that all of the externals they installed have a guaranteed integrity, then they should just use a linux distribution and the packages that come with it (We also have a deken plugin that integrates with apt (and is available via apt), so you can install Pd externals from your distro repository from the usual deken interface).

I mention this only, because in terms of developer-cooperation i think it totally illusory to force your ordinary Pd external developer to do a proper key management (as in: creating keys, uploading keys, extending key expiration dates,...; and using proper keylengths and password-protect their keys; and once the keys are password-protected to not forget the password during any period (extending 2 months) of inactivity).

Linux distributions have a different lever and thus they can (and do) force such things on their maintainers.

@Spacechild1

Spacechild1 commented Oct 17, 2019

i think it totally illusory to force your ordinary Pd external developer to do a proper key management

+1. Personally, I don't know how to use GPG signatures at all ¯_(ツ)_/¯

Outside of Linux distros, Pd itself is usually downloaded directly from Miller's website and externals are downloaded via Deken (people need a puredata.info account to upload packages, so I could imagine doing some kind of checking/verification there).

Has there been any case of malware disguised as a Pd external? I don't think Pd users are a valuable target for criminals...

@muziker
Author

muziker commented Oct 18, 2019

Some commentary about reproducible builds. I've taken a bunch of source and compiled them and checked against binaries on the system. They're always invariably different. At the moment, only certain languages can produce repro builds. Rust and go-lang are some. Everything else requires a certain amount of teeth pulling to work.

About some adversity in key usage: at this moment I'm trying to verify an Ubuntu 19.10 ISO. I download the hash sums and the gpg file. The local user keyring doesn't have the public key for the gpg file. I go to pool.sks-keyservers.net and search for the Ubuntu CD automatic signing key. The resulting page lists the searched key and a bunch of other users' keys that verify the searched key. I try to download the result and import it into the keyring. This does not work; there seems to be an issue with some bytes of the file. The page itself is HTTP only, so I change it to HTTPS. The browser tells me the SSL cert is issued to a different site. I look for the key that is already on my system, distributed by the distro. But using a key from an ISO that comes from the same site that I've downloaded the hashes and gpg file from seems to be against the spirit of independent verification.

So now I use gpg --recv-key to get the key from the HKP keyserver directly, but since I'm behind a proxy, this doesn't work. And to cap it off, dirmngr, which is supposed to handle OCSP and CRLs, has not been able to honour the http_proxy environment variable for years.

I'm not sure why there's this extreme friction to getting a valid key off a server to verify these hashes.
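
For what it's worth, an HKP lookup is a plain HTTP GET, so the key can also be fetched without gpg or dirmngr. The sketch below assumes the usual HKP conventions (port 11371, the `/pks/lookup` path with `op=get`); the helper names and the key ID are placeholders. Python's `urllib` picks up the `http_proxy` environment variable by default.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def hkp_lookup_url(keyserver: str, key_id: str, port: int = 11371) -> str:
    """Build the HKP URL that returns an ASCII-armored public key."""
    query = urlencode({"op": "get", "options": "mr", "search": "0x" + key_id})
    return f"http://{keyserver}:{port}/pks/lookup?{query}"


def fetch_key(keyserver: str, key_id: str, timeout: float = 30.0) -> bytes:
    """Download the key; proxies from the environment are used automatically."""
    with urlopen(hkp_lookup_url(keyserver, key_id), timeout=timeout) as response:
        return response.read()
```

The downloaded bytes could then be handed to `gpg --import`. Fetching a key this way says nothing about its authenticity; the fingerprint still has to be compared against an independent source.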

(people need a puredata.info account to upload packages, so I could imagine doing some kind of checking/verification there).

GPG key generation is usually done on the local computer because the private key is meant never to be distributed anywhere. Generating a key on an external system and distributing it from there means the private key could be copied along the way.

@umlaeute
Contributor

Some commentary about reproducible builds. I've taken a bunch of source and compiled them and checked against binaries on the system. They're always invariably different. At the moment, only certain languages can produce repro builds. Rust and go-lang are some. Everything else requires a certain amount of teeth pulling to work.

well.
otoh, if you search through the packages that i maintain in Debian (this page lists all the packages I maintain; for the discussion at hand only the ones that start with pd- are interesting, but I haven't found out how to limit the search results to those), you might notice that almost all Debian-packaged Pd-externals can be built reproducibly (in the last column, the lower checkbox indicates reproducible builds).

A big obstacle when attempting to build reproducibly is that many builds include the build date (__DATE__ and __TIME__) in the binary artifacts.
And of course, you need exactly the same build toolchain (different compilers (or compiler versions) might give different results,...) - something that requires quite a bit of housekeeping, but which is manageable on a distro level :-)
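
The build-date problem can be illustrated with a toy "build" that stamps its output, together with the `SOURCE_DATE_EPOCH` convention from the Reproducible Builds project, which pins that stamp to a fixed value. `fake_build` is a stand-in for a compiler, not an actual toolchain; as far as I know, recent GCC versions apply the same variable to `__DATE__` and `__TIME__`.

```python
import hashlib
import os
import time


def build_timestamp(environ=os.environ) -> int:
    """Use SOURCE_DATE_EPOCH when it is set, otherwise the current time."""
    value = environ.get("SOURCE_DATE_EPOCH")
    return int(value) if value is not None else int(time.time())


def fake_build(source: bytes, environ=os.environ) -> bytes:
    """Stand-in for a compiler that embeds the build date in its output."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(build_timestamp(environ)))
    return source + b"\nbuilt: " + stamp.encode()


def artifact_digest(artifact: bytes) -> str:
    """Digest used to compare two builds of the same source."""
    return hashlib.sha256(artifact).hexdigest()
```

With the variable pinned, two builds of the same source give the same digest; without it, builds made at different seconds differ even though nothing in the source changed.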

@muziker
Author

muziker commented Oct 22, 2019

Repro builds: Having seen that list, I'd say there should be an easier way for developers to do that. For example, maybe an apt-get source gets a dockerfile to create an environment to do a repro build.

@umlaeute
Contributor

probably, but that is not a problem for deken's "GPG verification and signing of packages".

@muziker
Author

muziker commented Nov 7, 2019

Is there someone's gpg public key that i can download that has a large amount of already signed keys?

@umlaeute
Contributor

umlaeute commented Nov 7, 2019 via email

@muziker
Author

muziker commented Nov 7, 2019

So if a person signs another person's key and vice versa, and they update their keys on the keyserver, the web of trust entails that anyone downloading the public key will get a set of those keys as well. But first, finding these individuals is difficult if one is surrounded by non techies.

So based on that criterion, what is the argmax(KeyringSize(key)), i.e. whose key should I download so that I end up with the largest number of trusted (or marginally trusted) public keys?
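
Read literally, the argmax question is a reachability count on the certification graph. A sketch, assuming the graph (key to the set of keys it has signed) has already been extracted from keyserver data; both function names and the graph format are hypothetical.

```python
def reachable_keys(start, certifications):
    """All keys reachable from `start` by following certifications."""
    seen = set()
    stack = [start]
    while stack:
        key = stack.pop()
        for certified in certifications.get(key, ()):
            if certified != start and certified not in seen:
                seen.add(certified)
                stack.append(certified)
    return seen


def best_starting_key(certifications):
    """Key with the largest reachable set; ties go to the smallest key name."""
    if not certifications:
        return None
    return max(sorted(certifications),
               key=lambda key: len(reachable_keys(key, certifications)))
```

Reachability is not the same as trust: GnuPG's own trust model also weighs ownertrust values and limits chain depth, so the key found this way is only an upper bound on what would actually count as trusted.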


3 participants