Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Scheduler] Profiling features #16145

Merged
merged 4 commits into from
Aug 14, 2019

Conversation

acdlite
Copy link
Collaborator

@acdlite acdlite commented Jul 17, 2019

Adds two new profiling features to Scheduler.

Sample-based approach

The main feature is a way to read the current state of the Scheduler from a worker, using a shared array buffer. The shared array buffer contains (only in the profiling and development builds)

  • the priority at which Scheduler is currently running (or 0 if it's not)
  • the size of the queue
  • the id of the currently running task (if applicable)
  • a unique "run id" per block (if applicable)

Regarding the last point each synchronous block of Scheduler work is given a unique run ID. This is different than a task ID because a single task will have more than one run if it yields with a continuation.

The worker thread can use this information to construct a flamegraph. The advantage of the sample-based approach is that it scales in proportion to the available resources — if the CPU slows down, the sample rate slows down accordingly.

Event-based approach

The disadvantage of the sample-based approach is that it doesn't during the very first part of page initialization, before the worker has initialized.

To address this problem, there's a separate feature to write to a log of events: startLoggingProfilerEvents and stopLoggingProfilerEvents.

stopLoggingProfilerEvents returns a typed, 32-bit integer array buffer (though not a shared array buffer). The log conforms to a custom event format. A profiler can then convert this log into a flamegraph.

An array buffer is used so that the log can be sent to a worker thread via postMessage and the structured clone algorithm.

The test suite works by printing a text representation of a Scheduler flamegraph. Example:

!!! Main thread              │          ██
Task 2 [User-blocking]       │        ░░░░██████
Task 1 [Normal]              │  ████████░░░░░░░░██████

Both approaches combined

The suggested approach is to use the event log during page start up, then switch to the sample-based approach once the profiler worker is ready.

@sizebot
Copy link

sizebot commented Jul 17, 2019

React: size: 🔺+2.7%, gzip: 🔺+2.5%

Details of bundled changes.

Comparing: 62b04cf...d374a95

react

File Filesize Diff Gzip Diff Prev Size Current Size Prev Gzip Current Gzip ENV
react.development.js +6.3% +4.4% 112.65 KB 119.74 KB 28.84 KB 30.1 KB UMD_DEV
react.production.min.js 🔺+2.7% 🔺+2.5% 12.64 KB 12.98 KB 5.02 KB 5.14 KB UMD_PROD
react.profiling.min.js +10.0% +9.8% 14.82 KB 16.29 KB 5.56 KB 6.1 KB UMD_PROFILING
react.development.js 0.0% 0.0% 72 KB 72 KB 18.93 KB 18.93 KB NODE_DEV
react.production.min.js 0.0% 0.0% 6.66 KB 6.66 KB 2.77 KB 2.77 KB NODE_PROD
React-dev.js 0.0% 0.0% 69.97 KB 69.97 KB 18.01 KB 18.01 KB FB_WWW_DEV
React-prod.js 0.0% -0.0% 17.32 KB 17.32 KB 4.53 KB 4.53 KB FB_WWW_PROD
React-profiling.js 0.0% -0.0% 17.32 KB 17.32 KB 4.53 KB 4.53 KB FB_WWW_PROFILING

scheduler

File Filesize Diff Gzip Diff Prev Size Current Size Prev Gzip Current Gzip ENV
scheduler-unstable_mock.development.js +41.1% +29.6% 16.71 KB 23.57 KB 4.1 KB 5.31 KB UMD_DEV
scheduler-tracing.profiling.min.js 0.0% +0.1% 3.25 KB 3.25 KB 992 B 993 B NODE_PROFILING
scheduler-unstable_mock.production.min.js 🔺+6.2% 🔺+4.8% 4.73 KB 5.03 KB 1.98 KB 2.08 KB UMD_PROD
Scheduler-dev.js +27.3% +17.9% 26.23 KB 33.38 KB 6.95 KB 8.19 KB FB_WWW_DEV
Scheduler-prod.js 🔺+3.3% 🔺+3.6% 15.92 KB 16.45 KB 3.59 KB 3.72 KB FB_WWW_PROD
Scheduler-profiling.js n/a n/a 0 B 20.23 KB 0 B 4.38 KB FB_WWW_PROFILING
scheduler-tracing.development.js 0.0% +0.1% 11.59 KB 11.59 KB 2.98 KB 2.98 KB NODE_DEV
scheduler-tracing.production.min.js 0.0% 🔺+0.3% 728 B 728 B 382 B 383 B NODE_PROD
scheduler-unstable_mock.development.js +41.5% +30.0% 16.52 KB 23.38 KB 4.04 KB 5.26 KB NODE_DEV
scheduler-unstable_mock.production.min.js 🔺+6.5% 🔺+4.8% 4.72 KB 5.03 KB 1.92 KB 2.02 KB NODE_PROD
SchedulerMock-dev.js +40.4% +29.7% 17.05 KB 23.94 KB 4.16 KB 5.4 KB FB_WWW_DEV
SchedulerMock-prod.js 🔺+2.5% 🔺+3.8% 12.14 KB 12.44 KB 2.84 KB 2.95 KB FB_WWW_PROD
scheduler.development.js +27.9% +17.9% 25.63 KB 32.78 KB 6.84 KB 8.07 KB NODE_DEV
scheduler.production.min.js 🔺+7.3% 🔺+5.4% 5.23 KB 5.61 KB 2.14 KB 2.26 KB NODE_PROD

Generated by 🚫 dangerJS

@acdlite acdlite force-pushed the scheduler-user-timing-events branch 10 times, most recently from e826e9f to 9f9e490 Compare July 18, 2019 22:47
@acdlite acdlite changed the title [WIP][Scheduler] Mark user-timing events [Scheduler] Mark user-timing events Jul 23, 2019
@acdlite acdlite requested a review from bvaughn July 23, 2019 22:37
@acdlite acdlite force-pushed the scheduler-user-timing-events branch from 9f9e490 to 84a03db Compare July 23, 2019 22:38
@acdlite acdlite marked this pull request as ready for review July 23, 2019 22:39
@acdlite acdlite force-pushed the scheduler-user-timing-events branch 2 times, most recently from a22e71c to 807db7f Compare July 23, 2019 22:48
@acdlite acdlite force-pushed the scheduler-user-timing-events branch from 807db7f to 4e772ce Compare August 2, 2019 18:24
Marks when Scheduler starts and stops running a task. Also marks when
a task is initially scheduled, and when Scheduler is waiting for a
callback, which can't be inferred from a sample-based JavaScript CPU
profile alone.

The plan is to use the user-timing events to build a Scheduler profiler
that shows how the lifetime of tasks interact with each other and
with unscheduled main thread work.

The test suite works by printing an text representation of a
Scheduler flamegraph.
@acdlite acdlite force-pushed the scheduler-user-timing-events branch from 4e772ce to 161aba9 Compare August 13, 2019 02:00
Array contains

- the priority Scheduler is currently running
- the size of the queue
- the id of the currently running task
@acdlite acdlite changed the title [Scheduler] Mark user-timing events [Scheduler] Profiling features Aug 13, 2019
@acdlite acdlite requested a review from sebmarkbage August 13, 2019 18:07
@acdlite acdlite force-pushed the scheduler-user-timing-events branch 3 times, most recently from 917d899 to 72f6bca Compare August 13, 2019 19:24
Events are written to an array buffer using a custom instruction format.
For now, this is only meant to be used during page start up, before the
profiler worker has a chance to start up. Once the worker is ready, call
`stopLoggingProfilerEvents` to return the log up to that point, then
send the array buffer to the worker.

Then switch to the sampling based approach.
@acdlite acdlite force-pushed the scheduler-user-timing-events branch from 8993f6e to e2d2709 Compare August 14, 2019 01:14
@acdlite acdlite force-pushed the scheduler-user-timing-events branch 2 times, most recently from 5a3e528 to 0f2cc1a Compare August 14, 2019 01:53
Each synchronous block of Scheduler work is given a unique run ID. This
is different than a task ID because a single task will have more than
one run if it yields with a continuation.
@acdlite acdlite force-pushed the scheduler-user-timing-events branch from 0f2cc1a to d374a95 Compare August 14, 2019 01:56
@acdlite acdlite merged commit a34ca7b into facebook:master Aug 14, 2019
gaearon pushed a commit to gaearon/react that referenced this pull request Aug 14, 2019
* [Scheduler] Mark user-timing events

Marks when Scheduler starts and stops running a task. Also marks when
a task is initially scheduled, and when Scheduler is waiting for a
callback, which can't be inferred from a sample-based JavaScript CPU
profile alone.

The plan is to use the user-timing events to build a Scheduler profiler
that shows how the lifetime of tasks interact with each other and
with unscheduled main thread work.

The test suite works by printing an text representation of a
Scheduler flamegraph.

* Expose shared array buffer with profiling info

Array contains

- the priority Scheduler is currently running
- the size of the queue
- the id of the currently running task

* Replace user-timing calls with event log

Events are written to an array buffer using a custom instruction format.
For now, this is only meant to be used during page start up, before the
profiler worker has a chance to start up. Once the worker is ready, call
`stopLoggingProfilerEvents` to return the log up to that point, then
send the array buffer to the worker.

Then switch to the sampling based approach.

* Record the current run ID

Each synchronous block of Scheduler work is given a unique run ID. This
is different than a task ID because a single task will have more than
one run if it yields with a continuation.
@@ -144,5 +160,9 @@
return global.React.__SECRET_INTERNALS_DO_NOT_USE_OR_YOU_WILL_BE_FIRED
.Scheduler.unstable_UserBlockingPriority;
},
get unstable_sharedProfilingBuffer() {
return global.React.__SECRET_INTERNALS_DO_NOT_USE_OR_YOU_WILL_BE_FIRED
.Scheduler.unstable_getFirstCallbackNode;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a copy paste mistake?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes :)

@@ -138,5 +154,9 @@
return global.React.__SECRET_INTERNALS_DO_NOT_USE_OR_YOU_WILL_BE_FIRED
.Scheduler.unstable_UserBlockingPriority;
},
get unstable_sharedProfilingBuffer() {
return global.React.__SECRET_INTERNALS_DO_NOT_USE_OR_YOU_WILL_BE_FIRED
.Scheduler.unstable_getFirstCallbackNode;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question

@@ -138,5 +154,9 @@
return global.React.__SECRET_INTERNALS_DO_NOT_USE_OR_YOU_WILL_BE_FIRED
.Scheduler.unstable_UserBlockingPriority;
},
get unstable_sharedProfilingBuffer() {
return global.React.__SECRET_INTERNALS_DO_NOT_USE_OR_YOU_WILL_BE_FIRED
.Scheduler.unstable_getFirstCallbackNode;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question

gaearon added a commit that referenced this pull request Aug 14, 2019
gaearon added a commit that referenced this pull request Aug 14, 2019
gaearon added a commit to gaearon/react that referenced this pull request Aug 15, 2019
typeof SharedArrayBuffer === 'function'
? new SharedArrayBuffer(profilingStateSize * Int32Array.BYTES_PER_ELEMENT)
: // $FlowFixMe Flow doesn't know about ArrayBuffer
new ArrayBuffer(profilingStateSize * Int32Array.BYTES_PER_ELEMENT);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IE9 is still technically supported by us (with polyfills) but AFAIK doesn't have ArrayBuffer. Let's be careful adding new runtime dependencies in unconditional paths, especially for DEV features.

@@ -152,6 +190,8 @@ function flushWork(hasTimeRemaining, initialTime) {
}
// Return whether there's additional work
if (currentTask !== null) {
markSchedulerSuspended(currentTime);
isHostCallbackScheduled = true;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've found this factoring very counterintuitive.

It's like Scheduler knows too much about the host config. It was surprising to see isHostCallbackScheduled = true here without any requests, and only later realize it "knows" host callback will be scheduled because it returned true, and it trusts host config to respect that.

There's similar implicitness in the error code path.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error code path seems inconsistent with what this code seems to imply.

The host config does schedule a callback in the error case (if this throws).
But the catch below doesn't have the corresponding lines:

            markSchedulerSuspended(currentTime);
            isHostCallbackScheduled = true;

Should it?

gaearon added a commit to gaearon/react that referenced this pull request Aug 22, 2019
sebmarkbage pushed a commit that referenced this pull request Aug 22, 2019
* Revert "Revert "[Scheduler] Profiling features (#16145)" (#16392)"

This reverts commit 4ba1412.

* Fix copy paste mistake

* Remove init path dependency on ArrayBuffer

* Add a regression test for cancelling multiple tasks

* Prevent deopt from adding isQueued later

* Remove pop() calls that were added for profiling

* Verify that Suspend/Unsuspend events match up in tests

This currently breaks tests.

* Treat Suspend and Resume as exiting and entering work loop

Their definitions used to be more fuzzy. For example, Suspend didn't always fire on exit, and sometimes fired when we did _not_ exit (such as at task enqueue).

I chatted to Boone, and he's saying treating Suspend and Resume as strictly exiting and entering the loop is fine for their use case.

* Revert "Prevent deopt from adding isQueued later"

This reverts commit 9c30b0b.

Unnecessary because GCC

* Start counter with 1

* Group exports into unstable_Profiling namespace

* No catch in PROD codepath

* No label TODO

* No null checks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants