Rethink the Third Party Modules registry #406
Comments
Anything to stop namesquatting, even if Deno packages are not dependent on being in |
|
The problem is similar to cybersquatting. As long as a good module name is registered, others will no longer be able to submit a module with that name. Even if the module has been deprecated and has not been updated for several years, it cannot be deleted. @denoland developers should pay more attention to the development of core modules and the standard library, instead of wasting a lot of time processing pull requests for database.json. |
Also, making PRs to I'm all for using a GitHub repository as the source of truth rather than a dedicated closed registry like NPM, but maybe a better way to do it is to have a repository called "x" that has a file for each package, e.g., If we use something like Cloudflare as the primary CDN for this registry, we can:
Of course, none of this is necessary if we decide to eliminate the registry altogether in favor of directly using GitHub URLs. However, I believe that not having a central registry will create a vacuum that third-party registries will try to fill. Even from a "global bandwidth usage" and carbon footprint perspective, caching files in a central registry (thanks to the CDN) whose URLs everyone is using is better than the alternative of people using different registries to fetch the same files. In summary: I agree that the current single JSON file doesn't work as we scale, but I see the need for a single source of truth for third-party modules, just because of the ease of use that brings to developers. My proposal is to have a dedicated GitHub repository with each file representing a package, just like Microsoft's |
I agree. I had assumed, my own fault, that |
The Crystal community handled a similar problem by creating a few unofficial web apps that did no more than search GitHub for Crystal projects and return a list. |
The current Deno registry
The problems/features
Thoughts
Deno is decentralized. We are not locked in to a single platform. Git providers, OSS providers, CDNs, self-hosting: there are countless choices. Should we have a central registry? Most of us won't want another npm, but we have to tackle the problems above for the sake of a healthy and thriving ecosystem.
Solutions
Conclusion
A central registry for third party Deno modules should be opposed. That said, we can still have unofficial websites that provide search and index functions for Deno modules hosted by various providers, and tools/helpers for managing imports/dependencies, such as the examples mentioned above. |
I agree, we should then focus on completely independent packages. We are free to create a site (like go/x) where people can search for packages, just as others can build their own. However, if we're removing "official" package names, I highly recommend getting rid of deno.land/x entirely, because it's no different from any other GitHub proxy CDN, and if the URLs are going to be deno.land/x/github/username/repo/branch/file, then even non-Deno related projects are going to be served from that service, which doesn't make sense. There are several options today to serve GitHub URLs, ordered from most to least reliable (in my opinion):
Perhaps people should just use these, then? That might open up a healthy ecosystem of CDN URLs, and different Deno style guides can recommend specific CDNs as well, based on their usage (like how in the jQuery days, Google's CDN became the de facto choice, overtaking jQuery's own URLs). In this case, I don't see the need to endorse a specific URL from Deno. |
Hopefully, my two cents will be helpful. Since there is a distinction between different types of database entries (
import module from "std:path/to/file.ts";
import module from "github:user/repo[@ref]:path/to/file.ts";
import module from "npm:[@scope/]package[@tag|@version]"; // resolves to .tgz, then unpacked to local cache
import module from "https://example.com/path/to/file.ts"; // would make sense to make "url:" qualifier implicit
This would let us reuse a lot of existing stuff in an (imo) clear and elegant way. |
Some examples:
// $DENO_STD_LIB_REPO = https://raw.githubusercontent.com/denoland/deno/master/std
import module from "std:archive/tar.ts";
// "std" resolves to $DENO_STD_LIB_REPO
// $PATH = archive/tar.ts
// $DENO_STD_LIB_REPO/$PATH =
// https://raw.githubusercontent.com/denoland/deno/master/std/archive/tar.ts
// also:
import module from "std:fs/copy.ts";
// https://raw.githubusercontent.com/denoland/deno/master/std/fs/copy.ts
import module from "std:std/std.std";
// https://raw.githubusercontent.com/denoland/deno/master/std/std/std.std
// $GITHUB_DOMAIN = https://raw.githubusercontent.com
import module from "github:microsoft/typescript:src/tsc/tsc.ts";
// "github" resolves to $GITHUB_DOMAIN
// $USER = microsoft
// $REPO = typescript
// $REF = master (implied)
// $PATH = src/tsc/tsc.ts
// $GITHUB_DOMAIN/$USER/$REPO/$REF/$PATH =
// https://raw.githubusercontent.com/microsoft/typescript/master/src/tsc/tsc.ts
// also:
import module from "github:microsoft/typescript@master:src/tsc/tsc.ts";
// (same as above, with explicit ref)
import module from "github:microsoft/typescript@v3.9.2:src/tsc/tsc.ts";
// https://raw.githubusercontent.com/microsoft/typescript/v3.9.2/src/tsc/tsc.ts
import module from "github:github/github@github:github/github.github";
// https://raw.githubusercontent.com/github/github/github/github/github.github
// $NPM_REGISTRY = https://registry.npmjs.org
import module from "npm:qs";
// "npm" resolves to $NPM_REGISTRY
// $SCOPE = (global)
// $NAME = qs
// $TAG = latest (implied)
// $VERSION = 6.9.4 (calculated from tag, see https://registry.npmjs.org/qs)
// $NPM_REGISTRY/$SCOPE/$NAME/-/$NAME-$VERSION.tgz =
// https://registry.npmjs.org/qs/-/qs-6.9.4.tgz
// $PATH = lib/index.js (calculated from archive/package.json)
// $DENO_CACHE/npm/qs-6.9.4.tgz/lib/index.js
// also:
import module from "npm:express";
// https://registry.npmjs.org/express/-/express-4.17.1.tgz
// $DENO_CACHE/npm/express-4.17.1.tgz/index.js
import module from "npm:@angular/core";
// https://registry.npmjs.org/@angular/core/-/core-9.1.7.tgz
// $DENO_CACHE/npm/@angular/core-9.1.7.tgz/bundles/core.umd.js |
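The resolution rules in the examples above can be sketched as a small function. This is only an illustration of the proposal, not something Deno supports: `resolveSpecifier` is a hypothetical name, the two base URLs are the ones used in the comment's own examples, and the `npm:` case is left out because it needs a registry lookup to turn a tag into a version.

```typescript
// Hypothetical resolver for the proposed "std:" and "github:" specifiers.
// Neither scheme exists in Deno; base URLs are taken from the examples above.
const DENO_STD_LIB_REPO =
  "https://raw.githubusercontent.com/denoland/deno/master/std";
const GITHUB_DOMAIN = "https://raw.githubusercontent.com";

function resolveSpecifier(specifier: string): string {
  // Plain URLs pass through unchanged (the implicit "url:" qualifier).
  if (/^https?:\/\//.test(specifier)) return specifier;

  // std:path/to/file.ts
  if (specifier.startsWith("std:")) {
    return `${DENO_STD_LIB_REPO}/${specifier.slice("std:".length)}`;
  }

  // github:user/repo[@ref]:path/to/file.ts
  const match = /^github:([^/@:]+)\/([^/@:]+)(?:@([^:]+))?:(.+)$/.exec(
    specifier,
  );
  if (match) {
    // The ref defaults to "master" when omitted, as in the examples.
    const [, user, repo, ref = "master", path] = match;
    return `${GITHUB_DOMAIN}/${user}/${repo}/${ref}/${path}`;
  }

  throw new Error(`Unsupported specifier: ${specifier}`);
}
```

For instance, `resolveSpecifier("github:microsoft/typescript@v3.9.2:src/tsc/tsc.ts")` yields the raw GitHub URL shown in the example above for that import.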
@parzhitsky This is incompatible with the web platform. Also this is an issue about |
Possible solution to Break it down into separate directories. Use a structure similar to This would solve the review problem if you can write a check and automatically merge the pull request if the This would create something similar to:
The Solves:
Doesn't solve:
My solution to Create a separate website like npmjs.org but don't host the packages. Let users define the name, repo and location of the module. It can also be a CDN to help users and serverless functions import modules faster. Again the URLs will be of the format : This would not force people to register their modules like npm does, but it would be a central registry for most of the modules. People can host them wherever they want and have proxied URLs parallel to their own distribution method. Solves:
|
@maximousblk thanks for sharing your thoughts, which is similar to my solution, but there are some problems:
Anything aside from building a real database (another npmjs.org) will be a disaster in the future. As you said, it won't scale either.
If we ditch an official registry and build one (or two) third-party index, there are two ways to do it:
They do seem alike, but here's the difference: if we let people submit their modules, we will have trouble. Who's going to examine them? How do we update the info? Should we have an account system? What about naming conflicts? Should we have namespaces? Is the namespace the same as the account (if it exists) username or the platform's username? What if different people have the same username on different platforms? And by accepting submissions we are implying that if you don't submit your module to some index, your efforts won't get much attention. You may know that See? We are just building another npm if we go for the first option. |
There are many other things going on, but I think it could be very relevant if the intention is to try and keep this under some control (for example, I noticed #269 (comment), where someone took an active stance to say "this module is not wanted"). Maybe there should be a committee or very clear rules on what goes in and what doesn't?
|
What if the URL doesn't have the extension but the server replies with the In one of my experiments, I discovered that this works with the CLI, but I haven't tried it in a script. I'll do some tests and post the results here. |
You mean EDIT: Sorry, I just tested the following code with Deno 1.0.0 and it runs fine. Either my memory is wrong or the rule changed at some point.
import isInteger from 'https://cdn.pika.dev/lodash.isinteger@%5E4.0.4'
console.log(isInteger(42)) |
I'm new to JavaScript, so you can't expect much from me. If that doesn't work, then URLs will be of the format : |
Things are different between package managers for system and programs.
I myself maintain an index for
Definitely better if we do have one, but the problem becomes who is qualified and willing to devote effort and time.
When Deno becomes widely used, people will go for CDNs (GitHub raw is not a CDN and should only be used for simple cases); just check the CDN stats.
This is where things like GitHub stars and curated lists (like https://github.com/denolib/awesome-deno) come in. |
I'm more interested in the idea of using the BitTorrent protocol to store the module. The developer gives a magnet link (or similar) with the trailing
|
@Nexxkinn I couldn't resist plugging hyperdrive (or |
Oh yeah, Hypercore protocol is much more suited in this case. But we might need an additional tag for version control like in my previous post. |
If there is no central registry, there can be no registry mirror. |
I've been thinking about this for quite some time now and came to the conclusion that it's pretty much impossible to satisfy everyone with a single solution. I think it's important to remember what Deno is trying to achieve at its core, which is to discard the need for something like npm. That does not, however, eliminate the need for a registry that allows searching and browsing modules. For that reason I've been working on a service for some time now, trying to incorporate the idea of being able to host a module wherever you like and at the same time provide a consistent way of importing those modules in your project. If you're interested, have a look at https://deno.to. To sum up the concept, you can publish your module on GitHub, GitLab or BitBucket and add it to the website manually. It's more or less a vanity URL service, similar to https://deno.land/x, but improved. All it does is redirect to the URL you provided. It also addresses most of the issues mentioned above, such as
The only disadvantage is the need for an account system, but it won't be possible to have a service like this without it. I'm still working on getting an alpha version up and running, but my time is somewhat limited at the moment. The consideration could be made to just open source it and let the community build it as a whole. If there are better domain name suggestions, feel free to let me know and I'll buy them if needed. I figured deno.to was clever because it redirects to somewhere. I'm definitely going to keep working on it, the nice thing about Deno is that we don't depend on just one solution. If this turns out to not be the way to go, it can always be discarded. Feel free to share suggestions, improvements or criticism. |
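The vanity URL idea described above ("all it does is redirect to the URL you provided") can be sketched as a pure lookup. This is not how deno.to is implemented; the path layout `/<name>/<file>`, the function name `resolveVanityPath`, and the example.com target are all assumptions made for illustration.

```typescript
// Result of a successful lookup: an HTTP 302 pointing at the real host.
interface Redirect {
  status: 302;
  location: string;
}

// Hypothetical lookup: "/<name>/<path inside the module>" is mapped to the
// base URL the author registered for <name>. Returns null when unknown.
function resolveVanityPath(
  pathname: string,
  entries: Map<string, string>,
): Redirect | null {
  const [, name, ...rest] = pathname.split("/");
  const base = name ? entries.get(name) : undefined;
  if (!base || rest.length === 0) return null;
  return { status: 302, location: `${base}/${rest.join("/")}` };
}

// Illustrative registry content; the URL is a placeholder, not a real module.
const entries = new Map<string, string>([
  ["ttl_storage", "https://example.com/modules/ttl_storage"],
]);
```

Because the service only stores a name and a target URL, the module itself can live on GitHub, GitLab, BitBucket, or a personal web server, which is the property the comment emphasizes.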
@hacked's solution is good, but it cannot solve the scaling problem. If we combine @millette's and @Nexxkinn's solutions with @hacked's, we can solve it. Big companies can invest in a CDN, but indie developers cannot; if we use hyperdrive, the scaling problem goes away. We need a redirect service because Hyper URLs are hard to use,
We can use @hacked's solution for redirects to the Hyper URL, and it will address the other issues too. We just need support for the hypercore protocol ( Maybe Deno can download modules using Merkle trees and seed modules. Hypercore works like a lightweight blockchain crossed with BitTorrent, so it is a more secure, fast, and decentralized approach.
import { Foo } from "hyper://deno.to/ryan/foo/4.17.1";
CC: @mafintosh |
Regarding |
I appreciate you sharing your thoughts. This isn't the highest priority thing for us, so I think we'll be continuing database.json for the foreseeable future. I don't want to complicate the system by adding a database until I really am feeling the pain of database.json. |
@hacked Please keep working on https://deno.to/. I think this will make a lot of things easier for the community and help manage packages better |
I know I probably shouldn't mention PHP in these parts but @hacked's solution does sound similar to Packagist, which is essentially just a front door to GitHub. And to the scale point, I don't believe Packagist had any major issues with scale as ultimately they don't host anything either. I feel if adoption of Deno is going to become widespread then a package manager, even if it is merely a facade, is inevitable. I would definitely encourage @hacked to continue his exploration of this topic. |
I realize this issue is closed, but I wanted to add my 2 cents... maybe this can be referenced when the topic becomes higher priority or while experimenting in https://deno.to. @hacked's proposal is nice, and I would like to add to it by suggesting a way to guarantee package content integrity. I'm proposing that the name, in addition to the readable part, also contains the package Content Identifier (CID), enabling the compiler, on the other side, to recalculate the content hash on the first import occurrence and verify that the content of the bundle was not altered. This would effectively have each bundle release represented by a unique and immutable token:
This token would be published together with the package name & description to the deno.registry, a public/trusted website listing packages and proxying to the hosting CDNs by prepending the serving CDN's domain to it:
Examples:
This doesn’t solve all attack vectors, but it possibly provides higher resistance to tampering with bundle content than current solutions. The downside is the need for a trusted deno.registry database :(
Authoring side: publishing/releasing
Consuming side: importing
So, for this POC, the consumer would grab the URL of the package from deno.registry and:
For better DX on the consuming side, shorter, readable package name could be used, using local compiler config files to get the full URL?
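The "recalculate the content hash and verify" step might look roughly like the sketch below. It swaps in a plain SHA-256 hex digest for a real CID encoding, which the comment does not specify, and the function names are hypothetical. It relies on the Web Crypto API (`crypto.subtle.digest`), which is available in Deno and in browsers.

```typescript
// Compute the SHA-256 digest of a module's source text as lowercase hex.
// Assumption: a hex digest stands in for the CID mentioned in the proposal.
async function sha256Hex(source: string): Promise<string> {
  const digest = await crypto.subtle.digest(
    "SHA-256",
    new TextEncoder().encode(source),
  );
  return Array.from(new Uint8Array(digest))
    .map((byte) => byte.toString(16).padStart(2, "0"))
    .join("");
}

// Returns true only if the downloaded source matches the published digest.
async function verifyModule(
  source: string,
  expectedHex: string,
): Promise<boolean> {
  return (await sha256Hex(source)) === expectedHex.toLowerCase();
}
```

A consumer would fetch the module text, call `verifyModule` with the digest taken from the import name, and refuse to cache or run the module when it returns `false`.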
|
@srdjan I agree that there should be some sort of integrity check that what is downloaded is actually what was intended to be downloaded. But such a check needs to be implemented in Deno itself before any registry (or more generally URL) can make use of it. See #200 for the relevant discussion. Also, the shorter import syntax would need to be supported by Deno and is not up to the registry to decide. I doubt anything that would make imports incompatible with browsers will be implemented, which is a good thing on the way towards a platform-independent scripting language. |
I'm not sure whether or not I should proceed with building (yet another) registry. Someone has started to work on something similar a while back, it's available at https://module.land/. |
As you said, the frontend is built on top of the mindset of a typical package manager. This brings us right back into NPM land with all the configuration files. But if we could consolidate efforts that would be even better. Pinging @pagoru from @moduleland. |
Hey @hacked, you can help us with @moduleland if you want. We want to respect the principles of Deno as much as we can, so it would be really useful if people helped the project with their ideas. |
If we need to keep everything decentralized, why is there a registry in the first place? A search engine that could search Deno modules would fully meet the needs of decentralization, namespacing, and search. deno.land could just do a custom Google (or whatever) search, or even use its own search engine. |
@hazae41 Searching is harder than it seems. How do you guarantee to 1) find only Deno modules 2) find all Deno modules on the Web? No search engine can do that, not even Google. My initial post focused exactly on this. And even limited to Github the search would end in the same issues. To not disadvantage anybody the best solution is to offer a registry where anyone can sign up and publish the URL to their module, no matter where this URL is. The URL could even be on a web page to which there are no links on the web, meaning no search engine could find it (Except if you told the search engine where to look. But then you could as well just publish the URL on the registry.) |
Experimental support for using any GH/NPM repo landed (#659). Any other provider supported by deno.land, can be also mapped via virtual db patterns :)
|
All this |
First, thanks for expressing your ideas in a coherent and organized manner! I agree with mostly everything.
I would only be interested if there are no limitations on where I publish my module. And it does not matter if nobody ever can find my module and use it -- except me. I'm not hiding my module -- I'm just not interested in promoting and advertising it. I want to be free to have no limitation on my module's lack of popularity.
The following is a use case that was real for me and how I handled it; I may be revealing my naivete, but I will share anyway in case it brings a viewpoint that helps others to solve the Deno third party module problem. I ported an NPM module that I wrote to use with Deno in a server-only context. It's an in-memory key-value storage module with parameterized ttl functionality. Initially I did not publish it anywhere. It existed only on my local file system (in its own local directory). And the import statement in my Deno code imported it using local file system syntax. Since I deploy the server as a compiled binary (via Deno's compile mechanism, which is fantastic, though ymmv), which is built locally, import statements for third party modules don't execute at run-time. However, I have two different API servers that use this ttl_storage module. In the spirit of reusability and Deno's import statement via URL syntax, I just put the module's "mod.ts" file up on a web server that I own, accessible via a URL. Deno (v1.16.1) would not accept my custom URL that wasn't coming from deno.land/x/. "What's this?" I thought at the time -- this is hardly decentralized. Moreover, I discovered that if you want to be able to import a third party module, it must be published on GitHub only. At this point I concluded that NPM's approach was no worse than Deno's approach and was superior to what Deno had implemented. So I decided to bite the bullet and publish the little ttl storage module up on GitHub and link it to deno.land/x. It solved my problem, but it would have been so much easier to just put my module up on my own web server and be done. Again, I'm not forcing anyone to use this module. It only matters that I can reuse a library to keep things DRY. (Additionally, being forced to publish code to Microsoft to get this to work makes me think that there are some deals between Deno and Microsoft that have yet to come to light. Call me cynical, but if you don't understand this, then it's not me that's naive.) |
@gold I'm a little confused about what your problem is here. Deno DOES support self hosted modules no matter the origin and it should be working just as you described. This specific issue is meant to discuss origins for modules hosted on deno.land/x, not modules for Deno itself If you want to discuss/debug the problem you are facing, why don't you hop into the discord server instead? |
@Soremwar Thanks for letting me know that "Deno DOES support self hosted modules." I tried again. For some reason, the original site's URL was reachable via a browser, but Deno had encountered an unexpected TLS handshake error during a build. I just now put the module on another one of my personal sites and Deno's import statement worked perfectly. I now feel like a complete idiot! (feel free to use my simple ttl_storage module at deno.land/x) |
This issue has already been brought up multiple times in the old repo #26, #117 and in this issue comment on the main repo. In the meantime this new repo came up, and the old issues were archived. Now with v1 dropping and users starting to adopt Deno, this becomes more and more important, which is why I want to open an issue for it.
The Problem
Modules can be added to the database.json to make them show up on the Third Party Modules page https://deno.land/x/ and available under the URL https://deno.land/x/MODULE_NAME@BRANCH/SCRIPT.ts.
There are a few reasons why this is a suboptimal idea.
If Deno becomes even only half as popular as Node, there would be hundreds of modules added each day. It will totally flood the commit history and make the repo hard to maintain. Every PR needs to be verified by a maintainer, otherwise people could manipulate entries in any way. This is bad. Just looking at the recent commit history most of the commits are already related to adding entries for Third Party Modules.
Since module names must be unique, this will give rise to bad naming conflicts. Only the first module can be added, leaving anybody else with the same module name left out. This will be more like a domain name registry where all the good names are already taken. Even worse, entries for old unmaintained modules will block forever, because somebody out there might still depend on the URL so they can't be removed. This is even worse than domain names, which can at least expire.
The current system allows only links to GitHub repositories. This might currently not be a problem, since GitHub is the de facto standard Git repository host, but being locked into one host by choice of a too narrow system is never good. Also, limiting the registry to GitHub effectively centralizes it, which is the opposite of what Deno tries to achieve by using the ES module system.
In summary, the current module registry doesn't scale. It is high maintenance, requires unique module names, and is centralized.
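To make the mechanism behind these problems concrete, here is a rough sketch of how a `https://deno.land/x/MODULE_NAME@BRANCH/SCRIPT.ts` URL could be mapped onto a GitHub repository through a name-keyed table. The entry shape (`type`, `owner`, `repo`), the default branch of `master`, and the `raw.githubusercontent.com` target are assumptions for illustration, not the actual deno.land implementation.

```typescript
// Assumed shape of one database.json entry; field names are illustrative.
interface GithubEntry {
  type: "github";
  owner: string;
  repo: string;
}

// Hypothetical resolver: looks the module name up in the table and builds a
// raw GitHub URL. Returns null for malformed URLs or unregistered names.
function resolveDenoLandX(
  url: string,
  database: Record<string, GithubEntry>,
): string | null {
  const match = /^https:\/\/deno\.land\/x\/([^/@]+)(?:@([^/]+))?\/(.+)$/.exec(
    url,
  );
  if (!match) return null;
  // Assumption: an omitted @BRANCH falls back to "master".
  const [, name, branch = "master", script] = match;
  const entry = database[name];
  if (!entry) return null;
  return `https://raw.githubusercontent.com/${entry.owner}/${entry.repo}/${branch}/${script}`;
}
```

Because the table is keyed by module name, each name can point to exactly one repository, which is where the uniqueness and squatting concerns listed above come from.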
A solution
What people need is an overview of modules compatible with Deno and a convenient way to link to them.
Currently the Third Party Modules registry solves both, albeit in a very limited way as highlighted above. Entries in the database.json are used to build the overview, as well as generate the links available through the https://deno.land/x/... URLs.
I believe these two features shouldn't be made dependent on each other, but looked at individually.
There needs to be a way for developers to add (and remove) modules that scales well to lots of modules each day. It also shouldn't be a burden on the Deno maintainers; there should be an automated system of some sort.
One idea would be to generate the overview dynamically by some form of a search algorithm. Essentially a Google for Deno modules. For example, a table with entries of all repositories that are tagged "deno" (https://github.com/topics/deno). The single source of truth is the repository and the generated table would adapt dynamically to any changes of the repo, like name, description, or even deletion. Repositories wouldn't need to duplicate the name and description in another place and keep maintaining it if something changes. This would scale much more nicely, as no manual work is needed anymore and Deno maintainers could get back to writing code instead of verifying PRs. Repositories just need to add the tag "deno" to publish their modules which most already have anyways.
There are still some drawbacks to this that should be mentioned. There will be false positive matches by repositories adding the search tag for an unrelated reason. This could be mitigated by a sufficiently specific tag, but never fully solved. Also it doesn't solve the centralization problem, as other hosting providers could be added to the search algorithm, but only so many. Modules hosted on a private website would be hard to add to the search algorithm.
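A dynamically generated overview along these lines could be built on GitHub's repository search endpoint. The sketch below assumes the public REST endpoint `https://api.github.com/search/repositories` with a `topic:` qualifier and uses only the `full_name`, `description`, and `html_url` fields of each result; the helper names and the `Row` shape are made up for illustration, and unauthenticated requests are rate-limited.

```typescript
// Subset of the fields GitHub returns for each repository in a search result.
interface RepoItem {
  full_name: string;
  description: string | null;
  html_url: string;
}

// One row of the generated overview table (hypothetical shape).
interface Row {
  name: string;
  description: string;
  url: string;
}

// Build the search URL for one page of repositories tagged with a topic.
function searchUrl(topic: string, page: number, perPage = 30): string {
  const url = new URL("https://api.github.com/search/repositories");
  url.searchParams.set("q", `topic:${topic}`);
  url.searchParams.set("page", String(page));
  url.searchParams.set("per_page", String(perPage));
  return url.toString();
}

// Convert raw search items into table rows; the repository stays the single
// source of truth for name and description.
function toRows(items: RepoItem[]): Row[] {
  return items.map((item) => ({
    name: item.full_name,
    description: item.description ?? "",
    url: item.html_url,
  }));
}

// Fetch one page of results; requires network access.
async function fetchPage(topic: string, page: number): Promise<Row[]> {
  const response = await fetch(searchUrl(topic, page));
  if (!response.ok) throw new Error(`GitHub API error: ${response.status}`);
  const body = (await response.json()) as { items: RepoItem[] };
  return toRows(body.items);
}
```

Nothing in this sketch filters out false positives or covers other hosts, so it inherits exactly the drawbacks described above.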
If the module overview from above had a convenient way to obtain the URL of a module, then there would be little use case for the https://deno.land/x/... URL shortener. For example, there could be a "Copy URL" button in the dynamically generated table which directly copies the github.com URL such that it can be pasted into the code, similar to what Google Fonts does.
Direct URLs also solve the issue of centralization. Obtaining all modules through the same domain makes the registry fragile. What if https://deno.land/x/... experienced outages, or even attacks? Also the URL shortener goes against the idea of the decentralization that Deno tries to achieve with the ES module system. Making a whole ecosystem depend on the uptime of a single domain is not good. Only raw direct links to wherever the module is hosted are truly decentralized, with no dependency on a single URL proxy.
Btw. GitHub URLs are actually pretty readable already, making another point for abandoning the https://deno.land/x/... URL shortener.
Demo
I made a simple demo of repositories tagged with "deno" on GitHub which however still has the shortcomings mentioned above.
There are several false positives, like personal projects, editor extensions, deno itself, etc.
The mod.ts link serves as a stand-in for the "Copy URL" button. It also shows another problem of the search algorithm approach, as there is no predictable way to link to the entry script. Relying on a convention would not work, and the link would need to point to the repo where people need to read the actual documentation to find out what to import.
Also the console throws an error after the 10th page, since the GitHub API limits unauthenticated requests to 10 per minute. Since each page only loads 30 items, the script would need GitHub authentication to show more than 300 results.
TLDR
The Third Party Modules page https://deno.land/x/ could generate an overview of modules compatible with Deno dynamically from the "deno" tag on GitHub, to not depend on manual addition of modules to database.json. The URL service could be expanded to work as a full URL alias or just be abandoned altogether in favor of convenient links to GitHub directly from the table.
Summary
If Deno grows as huge as we all hope, then the current Third Party Modules registry page just won't do. We should fix it soon, while there are only a few hundred modules and we still have a chance.
EDIT: When I started this issue I favored the dynamic search approach. However, by now I believe an actual registry would be better, because decentralization would not be limited by the search algorithm. The registry should directly link to the modules, wherever they are hosted, be it on GitHub or a tiny personal website (not like NPM, which actually hosts the modules). If the registry makes it easy to obtain the URL (imagine Google Fonts), then there is little reason for using a centralized URL shortener.
EDIT: There is a great proposal by @hacked for a registry with an example implementation.