From 21dc346b1686ec58216d53c47cb40963df2f0a7e Mon Sep 17 00:00:00 2001 From: Andrey Lushnikov Date: Thu, 23 Apr 2020 19:52:06 -0700 Subject: [PATCH] devops: auto-correct links in our documentation (#1955) --- docs/README.md | 2 +- docs/core-concepts.md | 12 +- docs/loading.md | 8 +- utils/doclint/Source.js | 3 +- utils/doclint/cli.js | 16 ++- utils/doclint/preprocessor/index.js | 214 +++++++++++++++++++++------- utils/doclint/preprocessor/test.js | 45 +++--- 7 files changed, 203 insertions(+), 97 deletions(-) diff --git a/docs/README.md b/docs/README.md index 70c6b53d1f02b..dd3fe2ef0d06e 100644 --- a/docs/README.md +++ b/docs/README.md @@ -7,7 +7,7 @@ - [Pages and frames](./core-concepts.md#pages-and-frames) - [Selectors](./core-concepts.md#selectors) - [Auto-waiting](./core-concepts.md#auto-waiting) - - [Node.js and browser execution contexts](./core-concepts.md#node-js-and-browser-execution-contexts) + - [Node.js and browser execution contexts](./core-concepts.md#nodejs-and-browser-execution-contexts) - [Object & element handles](./core-concepts.md#object--element-handles) 1. [Input](./input.md) - [Text input](./input.md#text-input) diff --git a/docs/core-concepts.md b/docs/core-concepts.md index 6c5fbd527f595..edfe9708de3e9 100644 --- a/docs/core-concepts.md +++ b/docs/core-concepts.md @@ -14,14 +14,14 @@ the following primitives. - [Pages and frames](#pages-and-frames) - [Selectors](#selectors) - [Auto-waiting](#auto-waiting) - - [Node.js and browser execution contexts](#node-js-and-browser-execution-contexts) + - [Node.js and browser execution contexts](#nodejs-and-browser-execution-contexts) - [Object & element handles](#object--element-handles)
## Browser -A [`Browser`](../api.md#class-browser) refers to an instance of Chromium, Firefox +A [`Browser`](api.md#class-browser) refers to an instance of Chromium, Firefox or WebKit. Playwright scripts generally start with launching a browser instance and end with closing the browser. Browser instances can be launched in headless (without a GUI) or headful mode. @@ -44,7 +44,7 @@ maximize what a single instance can do through multiple browser contexts. ## Browser contexts -A [`BrowserContext`](../api.md#class-browsercontext) is an isolated incognito-alike +A [`BrowserContext`](api.md#class-browsercontext) is an isolated incognito-alike session within a browser instance. Browser contexts are fast and cheap to create. Browser contexts can be used to parallelize isolated test executions. @@ -71,13 +71,13 @@ const context = await browser.newContext({ #### API reference -- [class `BrowserContext`](./api.md#class-browser-context) +- [class `BrowserContext`](./api.md#class-browsercontext)
## Pages and frames -A Browser context can have multiple pages. A [`Page`](../api.md#class-page) +A Browser context can have multiple pages. A [`Page`](api.md#class-page) refers to a single tab or a popup window within a browser context. It should be used to navigate to URLs and interact with the page content. ```js @@ -104,7 +104,7 @@ console.log(page.url()); window.location.href = 'https://example.com'; ``` -A page can have one or more [Frame](../api.md#class-frame) objects attached to +A page can have one or more [Frame](api.md#class-frame) objects attached to it. Each page has a main frame and page-level interactions (like `click`) are assumed to operate in the main frame. diff --git a/docs/loading.md b/docs/loading.md index 7e7c5d566e1e4..5a4b00bb7daf8 100644 --- a/docs/loading.md +++ b/docs/loading.md @@ -64,7 +64,7 @@ Explicit loading handling may be required for more complicated scenarios though. ### Loading a popup -When popup is opened, explicitly calling [`page.waitForLoadState()`](#pagewaitforloadstatestate-options) ensures that popup is loaded to the desired state. +When popup is opened, explicitly calling [`page.waitForLoadState()`](api.md#pagewaitforloadstatestate-options) ensures that popup is loaded to the desired state. ```js const [ popup ] = await Promise.all([ page.waitForEvent('popup'), @@ -76,7 +76,7 @@ await popup.evaluate(() => window.globalVariableInitializedByOnLoadHandler); ### Unusual client-side redirects -Usually, the client-side redirect happens before the `load` event, and `page.goto()` method automatically waits for the redirect. However, when redirecting from a link click or after the `load` event, it would be easier to explicitly [`waitForNavigation()`](#pagewaitfornavigationoptions) to a specific url. +Usually, the client-side redirect happens before the `load` event, and `page.goto()` method automatically waits for the redirect. However, when redirecting from a link click or after the `load` event, it would be easier to explicitly [`waitForNavigation()`](api.md#pagewaitfornavigationoptions) to a specific url. ```js await Promise.all([ page.waitForNavigation({ url: '**/login' }), @@ -88,7 +88,7 @@ Notice the `Promise.all` to click and wait for navigation. Awaiting these method ### Click triggers navigation after a timeout -When `onclick` handler triggers a navigation from a `setTimeout`, use an explicit [`waitForNavigation()`](#pagewaitfornavigationoptions) call as a last resort. +When `onclick` handler triggers a navigation from a `setTimeout`, use an explicit [`waitForNavigation()`](api.md#pagewaitfornavigationoptions) call as a last resort. ```js await Promise.all([ page.waitForNavigation(), // Waits for the next navigation. @@ -108,7 +108,7 @@ await page.waitForFunction(() => window.amILoadedYet()); await page.screenshot(); ``` -When clicking on a button triggers some asynchronous processing, issues a couple GET requests and pushes a new history state multiple times, explicit [`waitForNavigation()`](#pagewaitfornavigationoptions) to a specific url is the most reliable option. +When clicking on a button triggers some asynchronous processing, issues a couple GET requests and pushes a new history state multiple times, explicit [`waitForNavigation()`](api.md#pagewaitfornavigationoptions) to a specific url is the most reliable option. ```js await Promise.all([ page.waitForNavigation({ url: '**/invoice#processed' }), diff --git a/utils/doclint/Source.js b/utils/doclint/Source.js index c966a58bd6eae..8656803840fe6 100644 --- a/utils/doclint/Source.js +++ b/utils/doclint/Source.js @@ -121,7 +121,8 @@ class Source { * @return {!Promise>} */ static async readdir(dirPath, extension = '') { - const filePaths = (await recursiveReadDir(dirPath)).filter(fileName => fileName.endsWith(extension)); + extension = extension.toLowerCase(); + const filePaths = (await recursiveReadDir(dirPath)).filter(fileName => fileName.toLowerCase().endsWith(extension)); return Promise.all(filePaths.map(filePath => Source.readFile(filePath))); } } diff --git a/utils/doclint/cli.js b/utils/doclint/cli.js index 21a665a516a2f..382a9feb243f2 100755 --- a/utils/doclint/cli.js +++ b/utils/doclint/cli.js @@ -18,6 +18,7 @@ const playwright = require('../../index.js'); const path = require('path'); const Source = require('./Source'); +const Message = require('./Message'); const {spawnSync} = require('child_process'); @@ -44,8 +45,8 @@ async function run() { const readme = await Source.readFile(path.join(PROJECT_DIR, 'README.md')); const contributing = await Source.readFile(path.join(PROJECT_DIR, 'CONTRIBUTING.md')); const api = await Source.readFile(path.join(PROJECT_DIR, 'docs', 'api.md')); - const troubleshooting = await Source.readFile(path.join(PROJECT_DIR, 'docs', 'troubleshooting.md')); - const mdSources = [readme, api, contributing, troubleshooting]; + const docs = await Source.readdir(path.join(PROJECT_DIR, 'docs'), '.md'); + const mdSources = [readme, api, contributing, ...docs]; const preprocessor = require('./preprocessor'); const browserVersions = await getBrowserVersions(); @@ -54,13 +55,15 @@ async function run() { chromiumVersion: browserVersions.chromium, firefoxVersion: browserVersions.firefox, }))); - messages.push(...preprocessor.ensureInternalLinksAreValid([api])); + messages.push(...preprocessor.autocorrectInvalidLinks(PROJECT_DIR, mdSources, getRepositoryFiles())); + for (const source of mdSources.filter(source => source.hasUpdatedText())) + messages.push(Message.warning(`WARN: updated ${source.projectPath()}`)); const browser = await playwright.chromium.launch(); const page = await browser.newPage(); const checkPublicAPI = require('./check_public_api'); const jsSources = await Source.readdir(path.join(PROJECT_DIR, 'src')); - messages.push(...await checkPublicAPI(page, mdSources, jsSources)); + messages.push(...await checkPublicAPI(page, [api], jsSources)); await browser.close(); for (const source of mdSources) { @@ -126,6 +129,11 @@ async function getChromeVersion() { return version.trim().split(' ').pop(); } +function getRepositoryFiles() { + const out = spawnSync('git', ['ls-files'], {cwd: PROJECT_DIR}); + return out.stdout.toString().trim().split('\n').map(file => path.join(PROJECT_DIR, file)); +} + async function getFirefoxVersion() { const isWin = os.platform() === 'win32' || os.platform() === 'cygwin'; const out = spawnSync(playwright.firefox.executablePath(), [isWin ? '/version' : '--version'], undefined); diff --git a/utils/doclint/preprocessor/index.js b/utils/doclint/preprocessor/index.js index f5622754a38ac..3e7bf63e679ee 100644 --- a/utils/doclint/preprocessor/index.js +++ b/utils/doclint/preprocessor/index.js @@ -14,6 +14,7 @@ * limitations under the License. */ +const path = require('path'); const Message = require('../Message'); function runCommands(sources, {libversion, chromiumVersion, firefoxVersion}) { @@ -21,13 +22,14 @@ function runCommands(sources, {libversion, chromiumVersion, firefoxVersion}) { const isReleaseVersion = !libversion.includes('-'); const messages = []; - const commands = []; for (const source of sources) { const text = source.text(); const commandStartRegex = //ig; const commandEndRegex = //ig; let start; + const sourceEdits = new SourceEdits(source); + // Extract all commands from source while (start = commandStartRegex.exec(text)) { // eslint-disable-line no-cond-assign commandEndRegex.lastIndex = commandStartRegex.lastIndex; const end = commandEndRegex.exec(text); @@ -35,57 +37,39 @@ function runCommands(sources, {libversion, chromiumVersion, firefoxVersion}) { messages.push(Message.error(`Failed to find 'gen:stop' for command ${start[0]}`)); return messages; } - const name = start[1]; + const commandName = start[1]; const from = commandStartRegex.lastIndex; const to = end.index; - const originalText = text.substring(from, to); - commands.push({name, from, to, originalText, source}); commandStartRegex.lastIndex = commandEndRegex.lastIndex; - } - } - const changedSources = new Set(); - // Iterate commands in reverse order so that edits don't conflict. - commands.sort((a, b) => b.from - a.from); - for (const command of commands) { - let newText = null; - if (command.name === 'version') - newText = isReleaseVersion ? 'v' + libversion : 'Tip-Of-Tree'; - else if (command.name === 'chromium-version') - newText = chromiumVersion; - else if (command.name === 'firefox-version') - newText = firefoxVersion; - else if (command.name === 'chromium-version-badge') - newText = `[![Chromium version](https://img.shields.io/badge/chromium-${chromiumVersion}-blue.svg?logo=google-chrome)](https://www.chromium.org/Home)`; - else if (command.name === 'firefox-version-badge') - newText = `[![Firefox version](https://img.shields.io/badge/firefox-${firefoxVersion}-blue.svg?logo=mozilla-firefox)](https://www.mozilla.org/en-US/firefox/new/)`; - else if (command.name === 'toc') - newText = generateTableOfContents(command.source.text(), command.to, false /* topLevelOnly */); - else if (command.name === 'toc-top-level') - newText = generateTableOfContents(command.source.text(), command.to, true /* topLevelOnly */); - else if (command.name.startsWith('toc-extends-')) - newText = generateTableOfContentsForSuperclass(command.source.text(), 'class: ' + command.name.substring('toc-extends-'.length)); - if (newText === null) - messages.push(Message.error(`Unknown command 'gen:${command.name}'`)); - else if (applyCommand(command, newText)) - changedSources.add(command.source); + let newText = null; + if (commandName === 'version') + newText = isReleaseVersion ? 'v' + libversion : 'Tip-Of-Tree'; + else if (commandName === 'chromium-version') + newText = chromiumVersion; + else if (commandName === 'firefox-version') + newText = firefoxVersion; + else if (commandName === 'chromium-version-badge') + newText = `[![Chromium version](https://img.shields.io/badge/chromium-${chromiumVersion}-blue.svg?logo=google-chrome)](https://www.chromium.org/Home)`; + else if (commandName === 'firefox-version-badge') + newText = `[![Firefox version](https://img.shields.io/badge/firefox-${firefoxVersion}-blue.svg?logo=mozilla-firefox)](https://www.mozilla.org/en-US/firefox/new/)`; + else if (commandName === 'toc') + newText = generateTableOfContents(source.text(), to, false /* topLevelOnly */); + else if (commandName === 'toc-top-level') + newText = generateTableOfContents(source.text(), to, true /* topLevelOnly */); + else if (commandName.startsWith('toc-extends-')) + newText = generateTableOfContentsForSuperclass(source.text(), 'class: ' + commandName.substring('toc-extends-'.length)); + + if (newText === null) + messages.push(Message.error(`Unknown command 'gen:${commandName}'`)); + else + sourceEdits.edit(from, to, newText); + } + sourceEdits.commit(messages); } - for (const source of changedSources) - messages.push(Message.warning(`GEN: updated ${source.projectPath()}`)); return messages; }; -/** - * @param {{name: string, from: number, to: number, source: !Source}} command - * @param {string} editText - * @return {boolean} - */ -function applyCommand(command, editText) { - const text = command.source.text(); - const newText = text.substring(0, command.from) + editText + text.substring(command.to); - return command.source.setText(newText); -} - function getTOCEntriesForText(text) { const ids = new Set(); const titles = []; @@ -122,20 +106,142 @@ function getTOCEntriesForText(text) { /** * @param {string} text */ -function ensureInternalLinksAreValid(sources) { - const messages = []; +function autocorrectInvalidLinks(projectRoot, sources, allowedFilePaths) { + const pathToHashLinks = new Map(); for (const source of sources) { const text = source.text(); - const availableLinks = new Set(getTOCEntriesForText(text).map(entry => entry.id)); - const internalLinkRegex = /\]\(#([#\w\-]*)\)/g; - let match; - while ((match = internalLinkRegex.exec(text)) !== null) { - const link = match[1]; - if (!availableLinks.has(link)) - messages.push(Message.error(`Found invalid link: #${match[1]}`)); + const hashLinks = new Set(getTOCEntriesForText(text).map(entry => entry.id)); + pathToHashLinks.set(source.filePath(), hashLinks); + } + + const messages = []; + for (const source of sources) { + const allRelativePaths = []; + for (const filepath of allowedFilePaths) { + allRelativePaths.push('/' + path.relative(projectRoot, filepath)); + allRelativePaths.push(path.relative(path.dirname(source.filePath()), filepath)); } + const sourceEdits = new SourceEdits(source); + let offset = 0; + const edits = []; + + const lines = source.text().split('\n'); + lines.forEach((line, lineNumber) => { + const linkRegex = /\]\(([^\)]*)\)/gm; + let match; + while (match = linkRegex.exec(line)) { + const hrefOffset = offset + lineNumber + match.index + 2; // +2 since we have to skip ]( + const [, href] = match; + if (href.startsWith('http://') || href.startsWith('https://') || href.startsWith('mailto:')) + continue; + const [relativePath, hash] = href.split('#'); + const hashOffset = hrefOffset + relativePath.length + 1; + + let resolvedPath = resolveLinkPath(source, relativePath); + let hashLinks = pathToHashLinks.get(resolvedPath); + + if (!hashLinks) { + // Attempt to autocorrect + const newRelativePath = autocorrectText(relativePath, allRelativePaths); + if (!newRelativePath) { + messages.push(Message.error(`Bad link in ${source.projectPath()}:${lineNumber + 1}: file ${relativePath} does not exist`)); + continue; + } + resolvedPath = resolveLinkPath(source, newRelativePath); + hashLinks = pathToHashLinks.get(resolvedPath); + sourceEdits.edit(hrefOffset, hrefOffset + relativePath.length, newRelativePath); + } + + if (!hash || hashLinks.has(hash)) + continue; + + const newHashLink = autocorrectText(hash, [...hashLinks]); + if (newHashLink) { + sourceEdits.edit(hashOffset, hashOffset + hash.length, newHashLink); + } else { + messages.push(Message.error(`Bad link in ${source.projectPath()}:${lineNumber + 1}: hash "#${hash}" does not exist in "${path.relative(projectRoot, resolvedPath)}"`)); + } + } + offset += line.length; + }); + + sourceEdits.commit(messages); } return messages; + + function resolveLinkPath(source, relativePath) { + if (!relativePath) + return source.filePath(); + if (relativePath.startsWith('/')) + return path.resolve(projectRoot, '.' + relativePath); + return path.resolve(path.dirname(source.filePath()), relativePath); + } +} + +class SourceEdits { + constructor(source) { + this._source = source; + this._edits = []; + } + + edit(from, to, newText) { + this._edits.push({from, to, newText}); + } + + commit(messages = []) { + if (!this._edits.length) + return; + this._edits.sort((a, b) => a.from - b.from); + for (const edit of this._edits) { + if (edit.from > edit.to) { + messages.push(Message.error('INTERNAL ERROR: incorrect edit!')); + return; + } + } + for (let i = 0; i < this._edits.length - 1; ++i) { + if (this._edits[i].to > this._edits[i + 1].from) { + messages.push(Message.error('INTERNAL ERROR: edits are overlapping!')); + return; + } + } + this._edits.reverse(); + let text = this._source.text(); + for (const edit of this._edits) + text = text.substring(0, edit.from) + edit.newText + text.substring(edit.to); + this._source.setText(text); + } +} + +function autocorrectText(text, options, maxCorrectionsRatio = 0.5) { + if (!options.length) + return null; + const scores = options.map(option => ({option, score: levenshteinDistance(text, option)})); + scores.sort((a, b) => a.score - b.score); + if (scores[0].score > text.length * maxCorrectionsRatio) + return null; + return scores[0].option; +} + +function levenshteinDistance(a, b) { + const N = a.length, M = b.length; + const d = new Int32Array(N * M); + for (let i = 0; i < N * M; ++i) + d[i] = 0; + for (let j = 0; j < M; ++j) + d[j] = j; + for (let i = 0; i < N; ++i) + d[i * M] = i; + for (let i = 1; i < N; ++i) { + for (let j = 1; j < M; ++j) { + const cost = a[i] === b[j] ? 0 : 1; + d[i * M + j] = Math.min( + d[(i - 1) * M + j] + 1, // d[i-1][j] + 1 + d[i * M + j - 1] + 1, // d[i][j - 1] + 1 + d[(i - 1) * M + j - 1] + cost // d[i - 1][j - 1] + cost + ); + } + } + return d[N * M - 1]; } function generateTableOfContents(text, offset, topLevelOnly) { @@ -177,4 +283,4 @@ function generateTableOfContentsForSuperclass(text, name) { return text; } -module.exports = {ensureInternalLinksAreValid, runCommands}; +module.exports = {autocorrectInvalidLinks, runCommands}; diff --git a/utils/doclint/preprocessor/test.js b/utils/doclint/preprocessor/test.js index a658979fd342f..94a8645c9249a 100644 --- a/utils/doclint/preprocessor/test.js +++ b/utils/doclint/preprocessor/test.js @@ -51,9 +51,8 @@ describe('runCommands', function() { Playwright XXX `); const messages = runCommands([source], OPTIONS_REL); - expect(messages.length).toBe(1); - expect(messages[0].type).toBe('warning'); - expect(messages[0].text).toContain('doc.md'); + expect(messages.length).toBe(0); + expect(source.hasUpdatedText()).toBe(true); expect(source.text()).toBe(` Playwright v1.3.0 `); @@ -63,9 +62,8 @@ describe('runCommands', function() { Playwright XXX `); const messages = runCommands([source], OPTIONS_DEV); - expect(messages.length).toBe(1); - expect(messages[0].type).toBe('warning'); - expect(messages[0].text).toContain('doc.md'); + expect(messages.length).toBe(0); + expect(source.hasUpdatedText()).toBe(true); expect(source.text()).toBe(` Playwright Tip-Of-Tree `); @@ -92,9 +90,8 @@ describe('runCommands', function() { #### page.$ #### page.$$`); const messages = runCommands([source], OPTIONS_REL); - expect(messages.length).toBe(1); - expect(messages[0].type).toBe('warning'); - expect(messages[0].text).toContain('doc.md'); + expect(messages.length).toBe(0); + expect(source.hasUpdatedText()).toBe(true); expect(source.text()).toBe(` - [class: page](#class-page) * [page.$](#page) @@ -113,9 +110,8 @@ describe('runCommands', function() { \`\`\` `); const messages = runCommands([source], OPTIONS_REL); - expect(messages.length).toBe(1); - expect(messages[0].type).toBe('warning'); - expect(messages[0].text).toContain('doc.md'); + expect(messages.length).toBe(0); + expect(source.hasUpdatedText()).toBe(true); expect(source.text()).toBe(` - [class: page](#class-page) @@ -131,9 +127,8 @@ describe('runCommands', function() { ### some [link](#foobar) here `); const messages = runCommands([source], OPTIONS_REL); - expect(messages.length).toBe(1); - expect(messages[0].type).toBe('warning'); - expect(messages[0].text).toContain('doc.md'); + expect(messages.length).toBe(0); + expect(source.hasUpdatedText()).toBe(true); expect(source.text()).toBe(` - [some link here](#some-link-here) @@ -150,9 +145,8 @@ describe('runCommands', function() { ## Second `); const messages = runCommands([source], OPTIONS_REL); - expect(messages.length).toBe(1); - expect(messages[0].type).toBe('warning'); - expect(messages[0].text).toContain('doc.md'); + expect(messages.length).toBe(0); + expect(source.hasUpdatedText()).toBe(true); expect(source.text()).toBe(` ## First @@ -173,9 +167,8 @@ describe('runCommands', function() { zzz `); const messages = runCommands([source], OPTIONS_REL); - expect(messages.length).toBe(1); - expect(messages[0].type).toBe('warning'); - expect(messages[0].text).toContain('doc.md'); + expect(messages.length).toBe(0); + expect(source.hasUpdatedText()).toBe(true); expect(source.text()).toBe(` v1.3.0 v1.3.0 @@ -187,9 +180,8 @@ describe('runCommands', function() { Playwright XXX `); const messages = runCommands([source], OPTIONS_REL); - expect(messages.length).toBe(1); - expect(messages[0].type).toBe('warning'); - expect(messages[0].text).toContain('doc.md'); + expect(messages.length).toBe(0); + expect(source.hasUpdatedText()).toBe(true); expect(source.text()).toBe(` Playwright 80.0.4004.0 `); @@ -201,9 +193,8 @@ describe('runCommands', function() { Playwright XXX `); const messages = runCommands([source], OPTIONS_REL); - expect(messages.length).toBe(1); - expect(messages[0].type).toBe('warning'); - expect(messages[0].text).toContain('doc.md'); + expect(messages.length).toBe(0); + expect(source.hasUpdatedText()).toBe(true); expect(source.text()).toBe(` Playwright 73.0b3 `);