[metadata] Separate bots detection utils #74000

Merged: 3 commits, Dec 17, 2024

@huozhi huozhi commented Dec 16, 2024

What

Crawler bots come in two types: headless browser bots and static fetcher bots.

  • Headless browser bots can spin up a headless browser to open the website and execute JavaScript, so they can still handle metadata that is manipulated by JS.
  • Static fetcher bots can only handle the static HTML; they do not parse or execute JS code.

This PR separates the existing bot detection util into two: one for headless browsers and one for static fetchers. Bot detection is currently only used in the pages router, which simply needs to know whether the request comes from a bot. So the final exported bot util is unchanged in behavior: it still checks whether the user agent is either a headless browser or a static fetcher. The separated utils will be used in a later implementation.
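
To illustrate the split described above, here is a minimal sketch. It is not the code from this PR: the function names, the regex names, and the user agent patterns are assumptions chosen for illustration only, and the real lists may differ.

```typescript
// Hypothetical patterns, for illustration only. These are NOT the lists
// used in Next.js; which bot belongs to which group is an assumption.
const HEADLESS_BROWSER_BOT_UA_RE = /Googlebot|HeadlessChrome/i
const STATIC_FETCHER_BOT_UA_RE = /Slackbot|Twitterbot|facebookexternalhit/i

// Bots assumed to run a headless browser, so they can execute JS
// and see JS-manipulated metadata.
function isHeadlessBrowserBot(userAgent: string): boolean {
  return HEADLESS_BROWSER_BOT_UA_RE.test(userAgent)
}

// Bots assumed to read only the static HTML response.
function isStaticFetcherBot(userAgent: string): boolean {
  return STATIC_FETCHER_BOT_UA_RE.test(userAgent)
}

// The combined check keeps the previous behavior: a user agent counts
// as a bot if it matches either category.
function isBot(userAgent: string): boolean {
  return isHeadlessBrowserBot(userAgent) || isStaticFetcherBot(userAgent)
}
```

With this shape, a caller that only needs a yes/no answer keeps using the combined check, while later code can branch on the specific category.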

Closes NDX-538

@ijjk ijjk added created-by: Next.js team PRs by the Next.js team. type: next labels Dec 16, 2024

huozhi commented Dec 16, 2024

This stack of pull requests is managed by Graphite. Learn more about stacking.

@huozhi huozhi changed the title Separate bot utils and remove unused isBot var Separate bot utils Dec 16, 2024
@huozhi huozhi force-pushed the 12-16-separate_bot_utils branch from 1a2f19d to d61fcd8 Compare December 16, 2024 21:33

ijjk commented Dec 16, 2024

Stats from current PR

Default Build
General
| | vercel/next.js canary | vercel/next.js 12-16-separate_bot_utils | Change |
| --- | --- | --- | --- |
| buildDuration | 24s | 24.7s | ⚠️ +739ms |
| buildDurationCached | 20.6s | 17.5s | N/A |
| nodeModulesSize | 410 MB | 410 MB | N/A |
| nextStartRea..uration (ms) | 568ms | 589ms | N/A |
Client Bundles (main, webpack)
| | vercel/next.js canary | vercel/next.js 12-16-separate_bot_utils | Change |
| --- | --- | --- | --- |
| 1187-HASH.js gzip | 51.4 kB | 51 kB | N/A |
| 8276.HASH.js gzip | 169 B | 168 B | N/A |
| 8377-HASH.js gzip | 5.36 kB | 5.36 kB | N/A |
| bccd1874-HASH.js gzip | 53 kB | 53 kB | N/A |
| framework-HASH.js gzip | 57.5 kB | 57.5 kB | N/A |
| main-app-HASH.js gzip | 232 B | 235 B | N/A |
| main-HASH.js gzip | 34.1 kB | 34 kB | N/A |
| webpack-HASH.js gzip | 1.71 kB | 1.71 kB | N/A |
| Overall change | 0 B | 0 B | |
Legacy Client Bundles (polyfills)
| | vercel/next.js canary | vercel/next.js 12-16-separate_bot_utils | Change |
| --- | --- | --- | --- |
| polyfills-HASH.js gzip | 39.4 kB | 39.4 kB | |
| Overall change | 39.4 kB | 39.4 kB | |
Client Pages
| | vercel/next.js canary | vercel/next.js 12-16-separate_bot_utils | Change |
| --- | --- | --- | --- |
| _app-HASH.js gzip | 193 B | 193 B | |
| _error-HASH.js gzip | 193 B | 193 B | |
| amp-HASH.js gzip | 512 B | 510 B | N/A |
| css-HASH.js gzip | 343 B | 342 B | N/A |
| dynamic-HASH.js gzip | 1.84 kB | 1.84 kB | |
| edge-ssr-HASH.js gzip | 265 B | 265 B | |
| head-HASH.js gzip | 363 B | 362 B | N/A |
| hooks-HASH.js gzip | 393 B | 392 B | N/A |
| image-HASH.js gzip | 4.49 kB | 4.49 kB | N/A |
| index-HASH.js gzip | 268 B | 268 B | |
| link-HASH.js gzip | 2.35 kB | 2.34 kB | N/A |
| routerDirect..HASH.js gzip | 328 B | 328 B | |
| script-HASH.js gzip | 397 B | 397 B | |
| withRouter-HASH.js gzip | 323 B | 326 B | N/A |
| 1afbb74e6ecf..834.css gzip | 106 B | 106 B | |
| Overall change | 3.59 kB | 3.59 kB | |
Client Build Manifests
| | vercel/next.js canary | vercel/next.js 12-16-separate_bot_utils | Change |
| --- | --- | --- | --- |
| _buildManifest.js gzip | 749 B | 746 B | N/A |
| Overall change | 0 B | 0 B | |
Rendered Page Sizes
| | vercel/next.js canary | vercel/next.js 12-16-separate_bot_utils | Change |
| --- | --- | --- | --- |
| index.html gzip | 523 B | 523 B | |
| link.html gzip | 538 B | 537 B | N/A |
| withRouter.html gzip | 519 B | 520 B | N/A |
| Overall change | 523 B | 523 B | |
Edge SSR bundle Size
| | vercel/next.js canary | vercel/next.js 12-16-separate_bot_utils | Change |
| --- | --- | --- | --- |
| edge-ssr.js gzip | 128 kB | 128 kB | N/A |
| page.js gzip | 204 kB | 204 kB | N/A |
| Overall change | 0 B | 0 B | |
Middleware size
| | vercel/next.js canary | vercel/next.js 12-16-separate_bot_utils | Change |
| --- | --- | --- | --- |
| middleware-b..fest.js gzip | 671 B | 666 B | N/A |
| middleware-r..fest.js gzip | 155 B | 156 B | N/A |
| middleware.js gzip | 31.3 kB | 31.2 kB | N/A |
| edge-runtime..pack.js gzip | 844 B | 844 B | |
| Overall change | 844 B | 844 B | |
Next Runtimes
| | vercel/next.js canary | vercel/next.js 12-16-separate_bot_utils | Change |
| --- | --- | --- | --- |
| 523-experime...dev.js gzip | 322 B | 322 B | |
| 523.runtime.dev.js gzip | 314 B | 314 B | |
| app-page-exp...dev.js gzip | 324 kB | 323 kB | N/A |
| app-page-exp..prod.js gzip | 128 kB | 127 kB | N/A |
| app-page-tur..prod.js gzip | 141 kB | 140 kB | N/A |
| app-page-tur..prod.js gzip | 136 kB | 135 kB | N/A |
| app-page.run...dev.js gzip | 314 kB | 314 kB | N/A |
| app-page.run..prod.js gzip | 124 kB | 123 kB | N/A |
| app-route-ex...dev.js gzip | 37.5 kB | 37.4 kB | N/A |
| app-route-ex..prod.js gzip | 25.5 kB | 25.5 kB | N/A |
| app-route-tu..prod.js gzip | 25.5 kB | 25.5 kB | N/A |
| app-route-tu..prod.js gzip | 25.3 kB | 25.3 kB | N/A |
| app-route.ru...dev.js gzip | 39.1 kB | 39 kB | N/A |
| app-route.ru..prod.js gzip | 25.3 kB | 25.3 kB | N/A |
| pages-api-tu..prod.js gzip | 9.69 kB | 9.69 kB | |
| pages-api.ru...dev.js gzip | 11.6 kB | 11.6 kB | |
| pages-api.ru..prod.js gzip | 9.68 kB | 9.68 kB | |
| pages-turbo...prod.js gzip | 21.7 kB | 21.7 kB | N/A |
| pages.runtim...dev.js gzip | 27.5 kB | 27.4 kB | N/A |
| pages.runtim..prod.js gzip | 21.7 kB | 21.7 kB | N/A |
| server.runti..prod.js gzip | 916 kB | 916 kB | N/A |
| Overall change | 31.6 kB | 31.6 kB | |
build cache
| | vercel/next.js canary | vercel/next.js 12-16-separate_bot_utils | Change |
| --- | --- | --- | --- |
| 0.pack gzip | 2.08 MB | 2.06 MB | N/A |
| index.pack gzip | 74 kB | 72 kB | N/A |
| Overall change | 0 B | 0 B | |
Diff details
Diff too large to display for:

  • middleware.js
  • edge-ssr.js
  • 1187-HASH.js
  • main-HASH.js
  • app-page-exp..time.prod.js
  • app-page-tur..time.prod.js
  • app-page-tur..time.prod.js
  • app-page.runtime.dev.js
  • app-page.runtime.prod.js
  • app-route-ex..ntime.dev.js
  • app-route-ex..time.prod.js
  • app-route-tu..time.prod.js
  • app-route-tu..time.prod.js
  • app-route.runtime.dev.js
  • app-route.ru..time.prod.js
  • pages-turbo...time.prod.js
  • pages.runtime.dev.js
  • pages.runtime.prod.js
  • server.runtime.prod.js

Failed to diff:

  • app-page-exp..ntime.dev.js

Commit: 620bd8c


ijjk commented Dec 16, 2024

Tests Passed

@huozhi huozhi changed the title Separate bot utils Separate bots detection utils Dec 17, 2024
@huozhi huozhi marked this pull request as ready for review December 17, 2024 21:45
@huozhi huozhi requested a review from ztanner December 17, 2024 21:45
@huozhi huozhi requested a review from ztanner December 17, 2024 22:33
Co-authored-by: Zack Tanner <1939140+ztanner@users.noreply.github.com>
@huozhi huozhi enabled auto-merge (squash) December 17, 2024 22:46
@huozhi huozhi merged commit d068202 into canary Dec 17, 2024
125 of 130 checks passed
@huozhi huozhi deleted the 12-16-separate_bot_utils branch December 17, 2024 23:09
@huozhi huozhi changed the title Separate bots detection utils [metadata] separate bots detection utils Dec 20, 2024
@huozhi huozhi changed the title [metadata] separate bots detection utils [metadata] Separate bots detection utils Dec 20, 2024