feat: Manually walk ts sources dependency graph
to determine a set of files to pass to tsickle. Also implements a
transformer that rewrites goog.requireType calls on module .d.ts files into
global variable aliases. For more information, see the respective
file comments. This partly addresses #334.
theseanl committed Sep 17, 2020
1 parent 303a7bb commit aabd397
Showing 41 changed files with 555 additions and 54 deletions.
123 changes: 123 additions & 0 deletions packages/tscc/src/graph/TypescriptDependencyGraph.ts
@@ -0,0 +1,123 @@
/**
 * @fileoverview Starting from a provided set of files, this walks the TypeScript SourceFiles that
 * are referenced, directly or transitively, from them.
 *
 * This information is provided to tsickleHost so that only such referenced files are processed by
 * tsickle. This is mainly concerned with which files are used to generate externs. Why not just
 * feed every `.d.ts` file to generate externs? Currently TypeScript's type inclusion often includes
 * "too many files": if tsconfig.json does not specify the `types` compiler option, it will include
 * every type declaration in `./node_modules/@types`, `../node_modules/@types`,
 * `../../node_modules/@types`. Such behavior is fine for the usual TS use case, because types
 * do not affect the TypeScript transpilation output anyway. However, in our setup they are all
 * used to generate externs, and the more type declarations there are, the longer compilation takes
 * and the more error-prone it becomes.
 *
 * An easy way (for me) would be to require users to provide the name of every such package. But
 * sometimes a package (A) may implicitly refer to another package (B)'s type declarations, and
 * package B then also needs to be provided to tsickle. This would require users to __know__ which
 * other packages package A refers to, which means inspecting its contents, and that is not
 * ergonomic.
 *
 * At the other extreme, we can include every .d.ts that TypeScript "sees". This leads to the
 * most correct behavior in some sense, because it matches what you see in your IDE. But it may
 * produce an enormous externs file and slow down the compilation, as it will include everything
 * in the `node_modules/@types` directory unless you use the `types` or `typeRoots` compiler
 * options. It may also cause more bugs coming from incompatibilities between TypeScript and the
 * Closure side.
 *
 * Therefore, an intermediate approach is taken here. We use the same module resolution logic to
 * find out which files are explicitly referenced by user-provided files. This requires discovering
 * files that are either (1) imported, (2) triple-slash-path-referenced, or (3)
 * triple-slash-types-referenced. However, some declaration files that augment the global scope may
 * not be discoverable in this way, so we also add external modules provided in the spec file and
 * any module listed in the `compilerOptions.types` tsconfig key.
 *
 * There is some work going on in a similar vein on the TypeScript side.
 * {@link https://github.com/microsoft/TypeScript/issues/40124}
 *
 * Currently, this is done using an unexposed API of TypeScript. I'm not sure why this is unexposed:
 * there are APIs such as `getResolvedModuleFileName/setResolvedModuleFileName`, but nothing
 * to iterate over resolved module file names.
 */
import * as ts from 'typescript';
import {getPackageBoundary} from '../tsickle_patches/patch_tsickle_module_resolver';
import path = require('path');

interface SourceFileWithInternalAPIs extends ts.SourceFile {
    resolvedModules?: Map<string, ts.ResolvedModuleFull | undefined>;
    resolvedTypeReferenceDirectiveNames: Map<string, ts.ResolvedTypeReferenceDirective | undefined>;
}

export default class TypescriptDependencyGraph {
    constructor(
        private host: ts.ScriptReferenceHost
    ) {}
    private visited: Set<string> = new Set();
    private defaultLibDir = path.dirname(ts.getDefaultLibFilePath(this.host.getCompilerOptions()));

    private isDefaultLib(fileName: string) {
        return fileName.startsWith(this.defaultLibDir);
    }
    private isTslib(fileName: string) {
        return getPackageBoundary(fileName).endsWith(path.sep + 'tslib' + path.sep);
    }
    private walk(fileName: string | undefined) {
        // Unresolved references show up as undefined; there is nothing to walk for them.
        if (typeof fileName !== 'string') return;

        // Default libraries (lib.*.d.ts) files and tslib.d.ts are not processed by tsickle.
        if (this.isDefaultLib(fileName)) return;
        if (this.isTslib(fileName)) return;

        // Skip files that were already visited, otherwise mark this one as visited.
        if (this.visited.has(fileName)) return;
        this.visited.add(fileName);

        const sf = <SourceFileWithInternalAPIs>this.host.getSourceFile(fileName);
        // getSourceFile may return undefined for a file that is not part of the program.
        if (!sf) return;

        /**
         * Files imported to the current file are available in `resolvedModules` property.
         * See: Microsoft/Typescript/src/compiler/program.ts `ts.createProgram > processImportedModules`
         * function. It calls `setResolvedModule` function for all external module references -->
         * This is the (only, presumably) place where all the external module references are available.
         */
        if (sf.resolvedModules) {
            for (let entry of sf.resolvedModules) {
                this.walk(entry?.[1]?.resolvedFileName);
            }
        }
        /**
         * Files referenced from the current file via /// <reference path="...." /> are available in
         * `referencedFiles` property. Unlike the previous `resolvedModules`, this is a public API.
         * See: Microsoft/Typescript/src/compiler/program.ts `ts.createProgram > processReferencedFiles`
         * These are always initialized, so no if check is needed: see ts.Parser.parseSourceFile
         */
        for (let ref of sf.referencedFiles) {
            // Unlike the above API, this is not a resolved path, so we have to call TS API
            // to resolve it first. See the function body of `processReferencedFiles`.
            const resolvedReferencedFileName = ts.resolveTripleslashReference(ref?.fileName, fileName);
            this.walk(resolvedReferencedFileName);
        }
        /**
         * Files referenced from the current file via /// <reference types="..." /> are available in
         * `resolvedTypeReferenceDirectiveNames` internal API. These are also listed in
         * `typeReferenceDirectives`, but that does not contain the file path a type reference
         * resolves to.
         * See: Microsoft/Typescript/src/compiler/program.ts `ts.createProgram > processTypeReferenceDirectives`
         * see how this function calls `setResolvedTypeReferenceDirective` to mutate `sf.resolvedTypeReferenceDirectiveNames`.
         */
        if (sf.resolvedTypeReferenceDirectiveNames) {
            for (let entry of sf.resolvedTypeReferenceDirectiveNames) {
                this.walk(entry?.[1]?.resolvedFileName);
            }
        }
    }
    addRootFile(fileName: string) {
        this.walk(fileName);
    }
    hasFile(fileName: string) {
        return this.visited.has(fileName);
    }
    // Currently this is only used in tests.
    iterateFiles() {
        return this.visited.values();
    }
}
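
The traversal above boils down to a depth-first walk with a visited set and an ignore rule. The following is a minimal sketch of that idea only, not the code from this commit: it replaces the TypeScript program with a hypothetical in-memory `ReferenceTable`, and the `ReachableFiles` class, the file names, and the `lib/` ignore rule are all made up for illustration.

```typescript
// Hypothetical stand-in for the per-file references that TypeScript resolves
// (imports, /// <reference path>, /// <reference types>), flattened into one list.
type ReferenceTable = Map<string, string[]>;

class ReachableFiles {
    private visited = new Set<string>();
    private references: ReferenceTable;
    private isIgnored: (fileName: string) => boolean;

    constructor(references: ReferenceTable, isIgnored: (fileName: string) => boolean) {
        this.references = references;
        this.isIgnored = isIgnored;
    }
    addRootFile(fileName: string): void {
        this.walk(fileName);
    }
    hasFile(fileName: string): boolean {
        return this.visited.has(fileName);
    }
    private walk(fileName: string | undefined): void {
        if (typeof fileName !== 'string') return; // unresolved reference
        if (this.isIgnored(fileName)) return;     // e.g. default libs
        if (this.visited.has(fileName)) return;   // already seen, also breaks cycles
        this.visited.add(fileName);
        for (const ref of this.references.get(fileName) || []) this.walk(ref);
    }
}

// Assumed example data: main.ts imports package "a", which refers to package "b"
// and back to main.ts (a cycle). The "unused" types package is never referenced.
const table: ReferenceTable = new Map([
    ['src/main.ts', ['node_modules/a/index.d.ts', 'lib/lib.es5.d.ts']],
    ['node_modules/a/index.d.ts', ['node_modules/b/index.d.ts', 'src/main.ts']],
    ['node_modules/@types/unused/index.d.ts', []],
]);

const graph = new ReachableFiles(table, (fileName) => fileName.startsWith('lib/'));
graph.addRootFile('src/main.ts');
```

With this data, `graph.hasFile` is true for `src/main.ts` and for the declaration files of packages `a` and `b`, and false for both the ignored `lib/lib.es5.d.ts` and the unreferenced `@types/unused` declarations, which is the filtering effect the file comment describes.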
9 changes: 9 additions & 0 deletions packages/tscc/src/shared/array_utils.ts
@@ -14,3 +14,12 @@ export function flatten<T>(array: T[][]): T[] {
return out;
}

export function union<T>(array1: T[], array2: T[]): T[] {
    let out: T[] = array1.slice();
    for (let i = 0, l = array2.length; i < l; i++) {
        let el = array2[i];
        if (out.includes(el)) continue;
        out.push(el);
    }
    return out;
}
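
As a quick illustration of how `union` behaves, here is a self-contained sketch that repeats the function from the diff and calls it on made-up inputs. The variable names `merged` and `withDuplicates` are only for this example. It also shows one detail that is easy to miss: duplicates already present in the first array are kept, since only elements of the second array are checked.

```typescript
// Copy of the helper added in this commit, repeated so the example is self-contained.
function union<T>(array1: T[], array2: T[]): T[] {
    const out: T[] = array1.slice();
    for (let i = 0, l = array2.length; i < l; i++) {
        const el = array2[i];
        if (out.includes(el)) continue; // skip elements that are already present
        out.push(el);
    }
    return out;
}

// Order of first appearance is preserved.
const merged = union(['a', 'b'], ['b', 'c']); // ['a', 'b', 'c']

// Duplicates inside the first array survive; only the second array is deduplicated.
const withDuplicates = union([1, 1], [1, 2, 2]); // [1, 1, 2]
```

Because each `includes` call scans `out`, the cost grows with the product of the two lengths. That is likely fine for short lists of file or module names; a `Set`-based variant would be an alternative for large inputs.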
112 changes: 112 additions & 0 deletions packages/tscc/src/shared/escape_goog_identifier.ts
@@ -0,0 +1,112 @@
/**
 * @fileoverview A valid goog.module name must start with [a-zA-Z_$] and only contain [a-zA-Z0-9._$].
 * This file provides an analogue of the JavaScript escape/unescape function pair for string
 * identifiers for goog.module, goog.provide, etc.
 * No information is lost by escaping, so we can faithfully map converted module names
 * back to the original TS source file's name.
 */
import path = require('path');

function codePoint(char: string) {return char.codePointAt(0);}

const LOWERCASE_A_CODE_POINT = codePoint('a');
const LOWERCASE_Z_CODE_POINT = codePoint('z');
const UPPERCASE_A_CODE_POINT = codePoint('A');
const UPPERCASE_Z_CODE_POINT = codePoint('Z');

const PERIOD_CODE_POINT = codePoint('.');
const LOWER_DASH_CODE_POINT = codePoint('_');
const DOLLAR_SIGN_CODE_POINT = codePoint('$');

const ZERO_CODE_POINT = codePoint('0');
const NINE_CODE_POINT = codePoint('9');

const SEP = path.sep;

function isLatin(code: number) {
    return (LOWERCASE_A_CODE_POINT <= code && code <= LOWERCASE_Z_CODE_POINT)
        || (UPPERCASE_A_CODE_POINT <= code && code <= UPPERCASE_Z_CODE_POINT);
}
function isNumber(code: number) {
    return ZERO_CODE_POINT <= code && code <= NINE_CODE_POINT;
}
function isLowerDash(code: number) {
    return code === LOWER_DASH_CODE_POINT;
}
function isPeriod(code: number) {
    return code === PERIOD_CODE_POINT;
}
function isDollarSign(code: number) {
    return code === DOLLAR_SIGN_CODE_POINT;
}

/**
 * Latin ⟹ Latin
 * number ⟹ number
 * "_" ⟹ "_"
 * path separator ⟹ "." (for ergonomic reasons)
 * "." ⟹ "$."
 * Any other character ⟹ "$" followed by the 4-digit base-36 representation of its code point,
 * left-padded with 0.
 *
 * This requires that the first character is not a path separator, in order to make sure that
 * the resulting escaped name does not start with ".", which is disallowed in goog.module. One should
 * always feed relative paths.
 */
export function escapeGoogAdmissibleName(name: string): string {
    let out = "";
    if (name[0] === SEP) throw new TypeError("Name cannot start with a path separator");
    for (let char of name) {
        let code = codePoint(char);
        if (isLatin(code) || isNumber(code) || isLowerDash(code)) {
            out += char;
        } else if (char === SEP) {
            out += ".";
        } else if (isPeriod(code)) {
            out += "$.";
        } else {
            out += "$" + code.toString(36).padStart(4, "0");
        }
    }
    return out;
}

export function unescapeGoogAdmissibleName(escapedName: string): string {
    let out = "";
    let i = 0;
    let code: number;
    // charCodeAt returns NaN when an index is out of range.
    while (!isNaN(code = escapedName.charCodeAt(i))) {
        if (isLatin(code) || isNumber(code) || isLowerDash(code)) {
            out += escapedName[i];
            i++;
        } else if (isPeriod(code)) {
            out += SEP;
            i++;
        } else if (isDollarSign(code)) {
            // If the next character is ".", add "."
            if (isPeriod(escapedName.charCodeAt(i + 1))) {
                out += ".";
                i += 2;
            } else {
                // Read the next 4 chars as a base-36 code point.
                try {
                    // parseInt yields NaN on invalid input, which makes String.fromCodePoint throw.
                    let base36Code = parseInt(escapedName.substr(i + 1, 4), 36);
                    out += String.fromCodePoint(base36Code);
                    i += 5;
                } catch (e) {
                    throw new RangeError(`Invalid characters between position ${i + 1} and ${i + 4}`);
                }
            }
        } else {
            throw new RangeError(`Invalid character at position ${i}`);
        }
    }
    return out;
}

export function escapedGoogNameIsDts(escapedName: string) {
    return escapedName.endsWith("$.d$.ts");
}
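
To make the escaping rules concrete, here is a condensed, self-contained sketch of the same scheme with a round trip on a sample path. It is not the file above: the names `escapeName`, `unescapeName`, and `isPlain` are hypothetical, the path separator is fixed to `/` (the real code uses `path.sep`), and the 4-digit check is stricter than the original's `parseInt`.

```typescript
const SEP = '/'; // assumption for this sketch; the real code uses path.sep

// [a-zA-Z0-9_] pass through unchanged.
function isPlain(code: number): boolean {
    return (97 <= code && code <= 122)  // a-z
        || (65 <= code && code <= 90)   // A-Z
        || (48 <= code && code <= 57)   // 0-9
        || code === 95;                 // _
}

function escapeName(name: string): string {
    if (name[0] === SEP) throw new TypeError('Name cannot start with a path separator');
    let out = '';
    for (const char of Array.from(name)) {
        const code = char.codePointAt(0) as number;
        if (isPlain(code)) out += char;
        else if (char === SEP) out += '.';
        else if (char === '.') out += '$.';
        // "$" plus the code point in base 36, left-padded to 4 digits.
        else out += '$' + ('0000' + code.toString(36)).slice(-4);
    }
    return out;
}

function unescapeName(escaped: string): string {
    let out = '';
    let i = 0;
    while (i < escaped.length) {
        const char = escaped[i];
        if (isPlain(escaped.charCodeAt(i))) { out += char; i++; }
        else if (char === '.') { out += SEP; i++; }
        else if (char === '$' && escaped[i + 1] === '.') { out += '.'; i += 2; }
        else if (char === '$') {
            const digits = escaped.slice(i + 1, i + 5);
            if (!/^[0-9a-z]{4}$/.test(digits)) {
                throw new RangeError(`Invalid characters between position ${i + 1} and ${i + 4}`);
            }
            out += String.fromCodePoint(parseInt(digits, 36));
            i += 5;
        } else {
            throw new RangeError(`Invalid character at position ${i}`);
        }
    }
    return out;
}

const original = 'src/foo-bar.d.ts';
const escaped = escapeName(original);   // "src.foo$0019bar$.d$.ts"
const restored = unescapeName(escaped); // "src/foo-bar.d.ts"
```

In the sample, `-` has code point 45, which is `19` in base 36 and becomes `$0019` after padding. Each `.` in the file name becomes `$.`, which is why a declaration file can be recognized by the `$.d$.ts` suffix that `escapedGoogNameIsDts` checks for.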
104 changes: 104 additions & 0 deletions packages/tscc/src/transformer/dts_requiretype_transformer.ts
@@ -0,0 +1,104 @@
/**
 * @fileoverview Transforms `const tsickle_aaaa = goog.requireType(.....)` calls to external modules
 * into `const tsickle_aaaa = mangled$namespace$declared$in$externs`. When an external module's
 * main type declaration file merely reexports some other file,
 *
 * (details to be tested, some other file or some other file in another module?)
 *
 * tsickle inserts such requireType statements referencing that file directly.
 *
 * Type declarations in such files are already declared in externs, so we can just alias that variable
 * to the namespace on which the file's declarations are written.
 *
 * This code is mostly the same as the one we used to transform goog.require("a-external_module")
 * before we switched to the gluing-module method.
 *
 * The code is copied from commit
 * 1c9824461fcb71814466729b9c1424c4a60ef4ce (feat: use gluing modules for external module support)
 *
 * TODO: improve comment here and documentation.
 */

import * as ts from 'typescript';
import ITsccSpecWithTS from '../spec/ITsccSpecWithTS';
import {TsickleHost} from 'tsickle';
import {moduleNameAsIdentifier} from 'tsickle/src/annotator_host';
import {namespaceToQualifiedName, isGoogRequireLikeStatement} from './transformer_utils';
import {escapedGoogNameIsDts, unescapeGoogAdmissibleName} from '../shared/escape_goog_identifier';

/**
 * This transformer runs after the ts transformation and before the googmodule transformation.
 *
 * In order to wire imports of external modules to their global symbols, we replace
 * top-level `require`s of external modules with an assignment of a global symbol to
 * a local variable. This results in no `goog.require` or `goog.requireType` emit.
 */
export default function dtsRequireTypeTransformer(spec: ITsccSpecWithTS, tsickleHost: TsickleHost)
    : (context: ts.TransformationContext) => ts.Transformer<ts.SourceFile> {
    const externalModuleNames = spec.getExternalModuleNames();
    return (context: ts.TransformationContext): ts.Transformer<ts.SourceFile> => {
        return (sf: ts.SourceFile): ts.SourceFile => {
            function maybeExternalModuleRequireType(
                original: ts.Statement, importedUrl: string, newIdent: ts.Identifier
            ) {
                const setOriginalNode = (range: ts.Statement) => {
                    return ts.setOriginalNode(ts.setTextRange(range, original), original);
                }
                // We are only interested in `requireType`ing .d.ts files.
                if (!escapedGoogNameIsDts(importedUrl)) return null;
                // If imported url is external module, no need to handle it further.
                if (externalModuleNames.includes(importedUrl)) return null;

                // origUrl will be a file path relative to the ts project root.
                let origUrl = unescapeGoogAdmissibleName(importedUrl);

                // We must figure out on what namespace the extern for this module is defined.
                // See tsickle/src/externs.js for precise logic. In our case, goog.requireType(....d.ts)
                // will be emitted for "module .d.ts", in which case a mangled name derived from a
                // .d.ts file's path is used. See how `moduleNamespace`, `rootNamespace` is constructed
                // in tsickle/src/externs.js.
                // This relies on the heuristic of tsickle, so must be carefully validated whenever tsickle updates.
                let mangledNamespace = moduleNameAsIdentifier(tsickleHost, origUrl);

                if (newIdent.escapedText === mangledNamespace) {
                    // Name of the introduced identifier coincides with the global identifier,
                    // no need to emit things.
                    return setOriginalNode(ts.createEmptyStatement());
                }
                // Convert `const importedName = goog.requireType("module d.ts")` to:
                // `const importedName = mangledNamespace;`
                return setOriginalNode(ts.createVariableStatement(
                    undefined,
                    ts.createVariableDeclarationList(
                        [
                            ts.createVariableDeclaration(
                                newIdent,
                                undefined,
                                namespaceToQualifiedName(mangledNamespace)
                            )
                        ],
                        tsickleHost.es5Mode ? undefined : ts.NodeFlags.Const)
                ));
            }

            function visitTopLevelStatement(statements: ts.Statement[], sf: ts.SourceFile, node: ts.Statement) {
                lookupExternalModuleRequire: {
                    let _ = isGoogRequireLikeStatement(node, "requireType");
                    if (!_) break lookupExternalModuleRequire;

                    // Do Things TODO
                    let {importedUrl, newIdent} = _;
                    const require = maybeExternalModuleRequireType(node, importedUrl, newIdent);
                    if (!require) break lookupExternalModuleRequire;
                    statements.push(require);
                    return;
                }
                statements.push(node);
            }

            const stmts: ts.Statement[] = [];
            for (const stmt of sf.statements) visitTopLevelStatement(stmts, sf, stmt);
            return ts.updateSourceFileNode(sf, ts.setTextRange(ts.createNodeArray(stmts), sf.statements));
        }
    }
}