Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: compute module format based on package.json visible to declarationDir/outDir #54546

Closed
wants to merge 6 commits into from

Conversation

andrewbranch
Copy link
Member

@andrewbranch andrewbranch commented Jun 5, 2023

Note: there are good reasons not to publish dual CJS/ESM npm packages. This is not a blanket endorsement of the practice. However, we’ve noticed that many existing dual packages ship types that don’t correctly describe the JS contents of the package. I believe this is in no small part because doing dual CJS/ESM emit with tsc is a cumbersome UX (and completely undocumented). This is a small change that significantly improves the experience.

In --module nodenext, we decide whether a file index.ts is ESM or CJS by looking up through its containing directories for the nearest package.json file so we can see whether it includes "type": "module". With this PR, we instead begin that search with the file’s declarationDir or outDir. This allows the developer to pre-seed an outDir per module format with a package.json, and run a compilation into each:

dist/esm/package.json:

{ "type": "module" }

dist/cjs/package.json:

{}
# Produce an ESM build with types
tsc --module nodenext --outDir dist/esm --declaration index.ts

# Produce a CJS build with types
tsc --module nodenext --outDir dist/cjs --declaration index.ts

These builds can be linked with a solution-style tsconfig.json:

// tsconfig.json
{
  "files": [],
  "references": [
    { "path": "./tsconfig.cjs.json" },
    { "path": "./tsconfig.esm.json" },
  ]
}

Note that in this configuration, where the input files are covered by two different tsconfigs, TS Server picks uses the second one in the array for errors and intellisense. #54476 discusses ideas for improvements to the editor and orchestration experience.

@typescript-bot typescript-bot added Author: Team For Uncommitted Bug PR for untriaged, rejected, closed or missing bug labels Jun 5, 2023
@@ -3508,11 +3526,58 @@ export function createProgram(rootNamesOrOptions: readonly string[] | CreateProg
return result;
}

function getCreateSourceFileOptions(fileName: string, moduleResolutionCache: ModuleResolutionCache | undefined, host: CompilerHost, options: CompilerOptions): CreateSourceFileOptions {
function getImpliedNodeFormatForFile(fileName: string) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m open to suggestions for a different name to differentiate from ts.getImpliedNodeFormatForFile which is public API, but I intend to expose this one as program.getImpliedNodeFormatForFile, and add JSDoc to the top-level standalone one that says you should use the program version if possible.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think functionality needs to be moved under getImpliedNodeFormatForFileWorker with optional new methods that are required since we expose that method publicly and we want it to be same as what program does to figure out file's implied format.

Comment on lines +3554 to +3566
getOutputDeclarationFileNameWithoutConfigFile(
fileNameForModuleFormatDetection,
options,
!host.useCaseSensitiveFileNames(),
currentDirectory,
getAssumedCommonSourceDirectory) ||
getOutputJSFileNameWithoutConfigFile(
fileNameForModuleFormatDetection,
options,
!host.useCaseSensitiveFileNames(),
currentDirectory,
getAssumedCommonSourceDirectory) ||
fileNameForModuleFormatDetection;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered checking to see if a result computed from declarationDir disagrees with a result computed from outDir and issue an error, but that seems like such an edge case that it wouldn’t be worth spending time doing that for every source file that could be influenced by it. I decided to prefer declarationDir since that seems most likely to give a library author the same analysis that their users will see. But if someone tries really hard, they can make an invalid situation by using separate outDir and declarationDir with conflicting package.json files in each.

@andrewbranch andrewbranch changed the title Compute module format based on package.json visible to declarationDir/outDir Proposal: compute module format based on package.json visible to declarationDir/outDir Jun 5, 2023
@andrewbranch andrewbranch added the Experiment A fork with an experimental idea which might not make it into master label Jun 5, 2023
@andrewbranch
Copy link
Member Author

@typescript-bot pack this

@typescript-bot
Copy link
Collaborator

typescript-bot commented Jun 5, 2023

Heya @andrewbranch, I've started to run the tarball bundle task on this PR at 8a7c262. You can monitor the build here.

@andrewbranch
Copy link
Member Author

@typescript-bot test top100

@typescript-bot
Copy link
Collaborator

typescript-bot commented Jun 5, 2023

Heya @andrewbranch, I'm starting to run the diff-based top-repos suite on this PR at 8a7c262. Hold tight - I'll update this comment with the log link once the build has been queued.

Update: The results are in!

@typescript-bot
Copy link
Collaborator

typescript-bot commented Jun 5, 2023

Hey @andrewbranch, I've packed this into an installable tgz. You can install it for testing by referencing it in your package.json like so:

{
    "devDependencies": {
        "typescript": "https://typescript.visualstudio.com/cf7ac146-d525-443c-b23c-0d58337efebc/_apis/build/builds/155361/artifacts?artifactName=tgz&fileId=92734ACF3FB984F50DE3DB226FAB5407EE92F508175D59EC64F4C3E2778A4C7A02&fileName=/typescript-5.2.0-insiders.20230605.tgz"
    }
}

and then running npm install.


There is also a playground for this build and an npm module you can use via "typescript": "npm:@typescript-deploys/pr-build@5.2.0-pr-54546-4".;

@typescript-bot
Copy link
Collaborator

@andrewbranch Here are the results of running the top-repos suite comparing main and refs/pull/54546/merge:

Something interesting changed - please have a look.

Details

lyswhut/lx-music-desktop

1 of 6 projects failed to build with the old tsc and were ignored

src/common/tsconfig.json

  • error TS2213: The project root is ambiguous, but is required to determine the module format of output '.js' files. Supply the rootDir compiler option to disambiguate.
    • Project Scope

src/main/tsconfig.json

  • error TS2213: The project root is ambiguous, but is required to determine the module format of output '.js' files. Supply the rootDir compiler option to disambiguate.
    • Project Scope

src/renderer-lyric/tsconfig.json

  • error TS2213: The project root is ambiguous, but is required to determine the module format of output '.js' files. Supply the rootDir compiler option to disambiguate.
    • Project Scope

src/renderer/tsconfig.json

  • error TS2213: The project root is ambiguous, but is required to determine the module format of output '.js' files. Supply the rootDir compiler option to disambiguate.
    • Project Scope

@Andarist
Copy link
Contributor

Andarist commented Jun 6, 2023

As you know - I'm not a fan of dual packages so I'm not really advocating here anyhow on their behalf, I'm not invested in the game. I have some thoughts on the DX of this improvement though.

  1. I feel that something like this is a step in the right direction when it comes to enabling dual package emits. Since src is singular - you need to find a way to "fork" it into two somehow.
  2. I don't think that pre-seeding a directory will work super well with existing workflows:
  • a super popular setup involves using a Rollup-based library builder (there are a couple of them, such as tsup and more). Those tools often don't use TS directly to do the declaration emit (or at the very least they don't typecheck sources and what they generate). In those situations, typechecking usually happens in parallel or something and it's using noEmit: true. They also often clean the dist dir before doing a fresh build. I assume that you envision that those tools would pre-seed the appropriate directories with those package.json files so TS could pick them up. I think this introduces an unneeded waterfall dependency between those jobs here. What if one would like to do JS emit and DTS emit completely in parallel? This would require a third script that would just pre-seed this for tsc. I think that TS should be able to create those package.json virtually in those situations.
  • people might actually use tsc for emitting both JS and DTS. In this case, the pre-seeding script would be required but I honestly have a hard time justifying the need for it. I think that the tsc command should be able to do this on its own (without noEmit: true). The motivation for all of this here is to smooth the DX - I feel that requiring additional scripts isn't that.
  • there are likely other combinations of things to consider here as well

@@ -596,7 +596,21 @@ export function getOutputDeclarationFileName(inputFileName: string, configFile:
);
}

function getOutputJSFileName(inputFileName: string, configFile: ParsedCommandLine, ignoreCase: boolean, getCommonSourceDirectory?: () => string) {
/** @internal */
export function getOutputDeclarationFileNameWithoutConfigFile(inputFileName: string, options: CompilerOptions, ignoreCase: boolean, currentDirectory: string, getCommonSourceDirectory: () => string) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need separate functions for these. whats different. i couldnt make out from the function,
configFile used in the helper method only needs options and filenames (root file names) so we have that in program so we could change to Pick<ParsedCommandLine, "options" | "fileNames"> and that should work right. or am i missing something.

@jakebailey
Copy link
Member

One thing I think I'd like to see (which maybe overlaps with @Andarist's commentary) is some way to do this that doesn't involve pre-seeding a package.json (allowing it to come later as a part of a final build step); if we already need multiple tsconfig files to emulate this, I feel like it would make sense to have some sort of config that can specify whether or not .js/.ts files are to be interpreted as commonjs or module. That way, you could still "just clone" a repo and get a working setup without committing what feel like build artifacts. But, maybe that's unsound given a single compilation may actually contain many package.jsons and so it'd be difficult to properly scope to the "right" code. And maybe it's sorta moot as without actually putting down the package.json file, the code tsc just emitted wouldn't even run, or may not even work when loaded by other projects?

That being said, I do like the proposal, given it effectively makes safe the packaging method that @isaacs is now using for his packages (e.g. https://unpkg.com/browse/glob@10.2.6/dist/).

@Andarist
Copy link
Contributor

Andarist commented Jun 6, 2023

That being said, I do like the proposal, given it effectively makes safe the packaging method that @isaacs is now using for his packages (e.g. https://unpkg.com/browse/glob@10.2.6/dist/).

The existence of https://unpkg.com/browse/glob@10.2.6/dist/cjs/package.json in this form... so confusing 🙈 Can't wait for some resolvers to be tripped over non-root exports in this package.json.

@isaacs
Copy link

isaacs commented Jun 6, 2023

I don't have any strong feelings about the details in this approach, but my overall response to the idea is that it feels like y'all are going out of your way to address my niche use case in particular without me even having asked, and I rarely feel so loved in OSS, so thank you very much for making my day ❤️

Shipping hybrid TS modules is indeed a bit of a house of cards currently, but enough people are using CJS (and will continue for the foreseeable future) that it is a worthwhile endeavor imo, whether it's ill-advised or not. It would be really awesome if I could have some set of configs so that I just run tsc and get all the right kinds of files in predictable places. I don't see any way around having two separate tsconfig files (or at least, separate config objects), because sometimes you do have to change other settings between the two.

For example, https://github.com/tapjs/processinfo is a hybrid module, but some of the files are cjs only, and some are esm only, so they have different include/exclude lists. Anything that uses require.resolve() needs to have entirely separate implementations, because import.meta.url is radioactive outside of ESM, and require needs to be loaded using import.meta.url outside of CJS, requiring this kind of post-build step to work properly.

The existence of https://unpkg.com/browse/glob@10.2.6/dist/cjs/package.json in this form... so confusing 🙈 Can't wait for some resolvers to be tripped over non-root exports in this package.json.

Hah, oof. Yeah, that's a bug. When I added a bin script, I made it import from package.json to get the current version, and didn't notice that meant picking up the actual entire package.json. Thanks for bringing it to my attention :)

isaacs added a commit to isaacs/node-glob that referenced this pull request Jun 6, 2023
'npm publish' runs 'npm prepare', but does _not_ run postprepare,
leading to a 'works on my machine' problem.

Via: microsoft/TypeScript#54546 (comment)
@jakebailey
Copy link
Member

Hah, oof. Yeah, that's a bug. When I added a bin script, I made it import from package.json to get the current version, and didn't notice that meant picking up the actual entire package.json. Thanks for bringing it to my attention :)

Here I had thought that was part of the trick (back when I saw it originally), oops.

@Andarist
Copy link
Contributor

Andarist commented Jun 6, 2023

In this case, it was likely just creating a CJS package scope and nothing else. exports are ignored in non-root package.json files (unless you self-import).

Hah, oof. Yeah, that's a bug. When I added a bin script, I made it import from package.json to get the current version, and didn't notice that meant picking up the actual entire package.json. Thanks for bringing it to my attention :)

Nice! ❤️ When re-reading my comment now - I think that it could be taken as snarky or smth, it definitely was not my intention, just noting down something I noticed ;p It's super cool that you addressed this

@isaacs
Copy link

isaacs commented Jun 6, 2023

I think that it could be taken as snarky or smth, it definitely was not my intention, just noting down something I noticed ;p

No worries, I usually try to assume positive intent.

Turns out I'd already had a fix in place, but the fix was in postprepare rather than prepare, which is run on a manual npm run prepare, but isn't run on npm publish, so it was potentially an even more annoying bug than I'd thought.

@andrewbranch
Copy link
Member Author

Thanks for the feedback. I take all these points, but to clarify my suggested workflow, I assume that the outDir package.json files will be committed, while the rest of the outDir contents would typically gitignored. This could be done with scripts, but there’s little reason not to just commit them IMO. So if using other tools in combination with tsc, they don’t need to add package.jsons, they just need to not delete the non-gitignored one that’s already there.

I feel like it would make sense to have some sort of config that can specify whether or not .js/.ts files are to be interpreted as commonjs or module

The only version of this I could see is something that would emit the appropriate package.json into the outDir. That would eliminate the need to commit them, and might make it easier for other tsconfig-reading tools to understand what’s about to happen. If we did something like that, I would suggest it as an addition to what I have in this PR, not a replacement for it.

@emmatown
Copy link
Contributor

emmatown commented Jun 6, 2023

I can't see any tests around this so I'm curious if a package.json in the outDir/declarationDir would also be used for resolving with the imports field (& exports field for self-reference)? I would imagine it should since that's how it would work after emit, right?

@andrewbranch
Copy link
Member Author

In theory, all module resolution should happen from output files, but in practice, it never does. #37378 is essentially a feature request for that.

@emmatown
Copy link
Contributor

emmatown commented Jun 6, 2023

That would also imply that this should be allowed too then right?

// tsconfig.json
{
  "compilerOptions": {
    "module": "nodenext",
    "outDir": "./dist"
  }
}
// src/index.ts
import { a } from "./a";

console.log(a);

// dist/a.d.ts
export declare const a: string;

Which seems quite unintuitive and frustrating if e.g. there used to be a file at src/a.ts and you deleted it but there's now a stale .d.ts file in the dist folder (and running tsc won't delete that) so you're still allowed to import ./a from src/index.ts as if src/a.ts exists

(I'm leaving this comment here rather than on the issue you linked since from my (maybe incorrect?) reading of that, it seems like the consensus there was to not change module resolution like this PR is doing?)

@andrewbranch
Copy link
Member Author

This PR doesn’t change module resolution at all. The only outDir-based logic applied is finding package.jsons for purposes of module format detection.

@Andarist
Copy link
Contributor

Andarist commented Jun 7, 2023

I take all these points, but to clarify my suggested workflow, I assume that the outDir package.json files will be committed, while the rest of the outDir contents would typically gitignored. This could be done with scripts, but there’s little reason not to just commit them IMO.

There might be little reason to not commit them but it's usually just way simpler to not commit them which also extends to...

So if using other tools in combination with tsc, they don’t need to add package.jsons, they just need to not delete the non-gitignored one that’s already there.

It's just way easier to just rm -rf dist (explicite or conceptually).

So overall, I think that it's a somewhat big ask for other tools to tweak how they work today so those files don't get removed.

The only version of this I could see is something that would emit the appropriate package.json into the outDir. That would eliminate the need to commit them, and might make it easier for other tsconfig-reading tools to understand what’s about to happen. If we did something like that, I would suggest it as an addition to what I have in this PR, not a replacement for it.

Playing devil's advocate - if emitting them is not the table here, what exact value the current proposed approach would have (or rather, under what circumstances one would like to use/it would kick in)? I wonder if you need it if you'd go with emitting those package.jsons. If I understand correctly, the emitting mode~ would assume replacing the potentially existing package.jsons - so it's not exactly just an addition on top of what is proposed here.

@andrewbranch
Copy link
Member Author

The value here is mostly for people who won’t use a third-party tool at all, or will only use a third-party tool to post-process what tsc builds.

it's a somewhat big ask for other tools to tweak how they work

I don’t really expect other tools to change, which reinforces why I’m making this PR. If tsc makes it reasonably easy to do type-accurate and runtime-safe dual emit, and it can be shown that every third party tool is runtime-unsafe and/or produces broken/unchecked type declarations, then I think new projects will adopt this, and a small number of existing projects with fairly simple builds that only leverage other tools to do dual emit might switch.

If I understand correctly, the emitting mode~ would assume replacing the potentially existing package.jsons - so it's not exactly just an addition on top of what is proposed here.

I don’t really see a problem with that. It’s not hard to say the package.json that matters is the hypothetical emitted one ?? the one that's there.

Is there something in particular you would suggest instead?

@fatcerberus
Copy link

fatcerberus commented Jun 7, 2023

FWIW isn't it kind of annoying to have to .gitignore "entire dist/ folder except for this one file"?

@fatcerberus
Copy link

fatcerberus commented Jun 7, 2023

@Andarist

It's just way easier to just rm -rf dist (explicite or conceptually).

This is why I love git clean -xdf1, as it won't delete tracked files. You do have to be careful about untracked files though.

Footnotes

  1. If in doubt, git clean -ndf to see what would be deleted

@andrewbranch
Copy link
Member Author

FWIW isn't it kind of annoying to have to .gitignore "entire dist/ folder except for this one file"?

Yeah, it’s kind of annoying for 30 seconds when it doesn’t work the first time and you have to ask Copilot how to do it and then never look at your .gitignore again.

@andrewbranch
Copy link
Member Author

I think the most significant problem with this approach is that it ignores the possibility of TS bundlers and runtimes that consume input files and also care about package.json "type". By analyzing only the output file structure, we solve things for Node, but break them for ts-node. Any solution here needs to ensure the input and output structures agree. I talked through some ideas with @DanielRosenwasser and have an alternative proposal that addresses that problem as well as some of the usability concerns raised in this PR. I’ll open a fresh issue shortly.

@andrewbranch andrewbranch deleted the feature/outDirPackageJson branch June 9, 2023 21:00
@fatcerberus
Copy link

Any solution here needs to ensure the input and output structures agree.

Maybe I’m just not using my imagination enough but this constraint sounds like it would prevent you from doing dual packages from the same source entirely

@andrewbranch
Copy link
Member Author

#54593

@fatcerberus yeah kind of—the main thing I’m thinking is we need to not make agreement a huge pain for projects that aren’t doing dual emit.

@knightedcodemonkey
Copy link

I'm a fan of the intent here for sure. I'm not a fan of the idea of needing two package.json files, especially when that's what exports is supposed to preclude. I've been noodling away at dual packages in Babel (babel-dual-package) and TypeScript (duel) over the past month, and in regards to tsc I believe I've mostly got it working by generating a second tsconfig.json file for the dual build, so that all one has to do is update the exports field to match the build output. However, I could not find a set of compiler options to allow this to happen seamlessly without one of the new file extension's module system being corrupted, so I had to copy files over from one build into the other to preserve the module system. Updating the package.json type field dynamically also created compiler errors, and the road from there just added more complexity I wasn't interested in dealing with. Overall, a second package.json seems like an anti-pattern.

I'd love to see this happen, one way or another.

@andrewbranch
Copy link
Member Author

that's what exports is supposed to preclude

I might be misunderstanding what you’re saying, but if not, I think you may have the wrong idea about what exports conditions do. Conditions cannot affect the format of a file they point to:

{
  "exports": {
    "import": "./esm.js",
    "require": "./cjs.js"
  }
}

No matter how they are accessed, Node will interpret both esm.js and cjs.js as CommonJS, because their extension is .js and the package.json does not have "type": "module". If I import this package, the resolution will go through the import condition to esm.js, but then, Node will attempt to parse esm.js as a script, and will crash with a syntax error when it encounters import or export syntax.

@knightedcodemonkey
Copy link

knightedcodemonkey commented Jul 31, 2023

To my understanding, conditional exports in a package.json are supposed to properly map a file to any module system based on the condition applied, i.e. import or require. The particular package I've been working on @knighted/duel chooses to explicitly define file extensions for each condition where relevant (because that's the world we're in now) to distinguish the CJS variant from the ESM variant, or vice versa. I do this by updating specifiers via @knighted/specifier.

I've added a test to @knighted/duel to ensure the runtime behavior of CJS-first packages work as expected with the conditional exports defined in package.json. The commit is referenced above. You should be able to check that out and run something like npm test and then node test/__fixtures__/cjsProject/dist/index.js to see that there are no runtime errors thrown. I'm on the latest Node.js, v20.5.0.

In short, my approach does update file extensions and associated specifiers. The package.json file in test looks like this:

{
  "version": "0.0.0",
  "type": "commonjs",
  "exports": {
    ".": {
      "import": {
        "types": "./dist/mjs/index.d.mts",
        "default": "./dist/index.mjs"
      },
      "require": {
        "types": "./dist/index.d.ts",
        "default": "./dist/index.js"
      },
      "default": "./dist/index.js"
    },
    "./package.json": "./package.json"
  }
}

@andrewbranch
Copy link
Member Author

conditional exports in a package.json are supposed to properly map a file to any module system based on the condition applied

“supposed to” by convention, not by any magic in the implementation. Your example works only because the file extensions agree. Conditions are only used during module resolution; they have zero impact on module format detection. Conditions are really just arbitrary strings, but Node applies the import and require conditions to its resolution process based on the requesting syntax, so conventionally configured exports can ensure that they don’t feed an ES module to require. But exports are free to mix it up and cause require(esm) errors, or on the other hand feed a CJS module to an import ("import": "./cjs.cjs"), which would be unconventional but won’t actually cause errors, since you can always import a CJS module in Node.js.

@knightedcodemonkey
Copy link

knightedcodemonkey commented Jul 31, 2023

Your example works only because the file extensions agree.

You're right, it does work.

I will invoke your stated design goals 4 and 7 here. Whether you want to support this or not is up to you. I don't particularly need it 👋

@Andarist
Copy link
Contributor

Andarist commented Aug 1, 2023

If anything Andrew is preserving the runtime behavior of all JavaScript code here (goal 7). Non-conventional packages might exist out there and they should be supported by TS, it should be able to reflect their runtime behavior. Therefore the assumption about the module format based on the condition alone is just something that TS can't do.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Author: Team Experiment A fork with an experimental idea which might not make it into master For Uncommitted Bug PR for untriaged, rejected, closed or missing bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants