
proposal: revise usage of property getters #1402

Closed
jerch opened this issue Apr 22, 2018 · 8 comments

Comments

@jerch
Member

jerch commented Apr 22, 2018

While running performance tests for the new parser I found that the internally used property getters are a major bottleneck. Here are some numbers for ls -lh /usr/lib on my Ubuntu machine:

  1. master branch: ~8s in total, ~4.8s for JavaScript
  2. moving .buffer to actionPrint method (see new parser #1399): ~6s in total, ~3.4s for JavaScript

--> 1.5x as fast just by avoiding the .buffer getter? What about all those other getters 😉

@Tyriar It seems property accessors are not well optimized by most JS engines; at least Firefox and Chrome show the same bad performance for them, while direct access via a public attribute is much faster. I am aware that getters/setters provide a nice way of encapsulation while being able to change the underlying implementation, but with such a huge performance impact I propose to at least revise their widespread usage in the core parts. Maybe the access could be realized by attributes that are handled especially carefully.
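
A minimal micro-benchmark sketch to illustrate the difference (the class names are made up and are not xterm.js internals; actual numbers vary by engine and code shape):

class WithGetter {
  private _buffer: number[] = [1, 2, 3];
  public get buffer(): number[] { return this._buffer; }
}

class WithAttribute {
  public buffer: number[] = [1, 2, 3];
}

function bench(label: string, obj: { buffer: number[] }): void {
  const start = Date.now();
  let sum = 0;
  for (let i = 0; i < 100000000; i++) {
    sum += obj.buffer[0];  // hot-path property access
  }
  console.log(label + ': ' + (Date.now() - start) + 'ms (sum=' + sum + ')');
}

bench('getter   ', new WithGetter());
bench('attribute', new WithAttribute());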

Somewhat offtopic:
Btw with an additional debounce buffer in the demo app's server script I was able to get ls -lh /usr/lib with the changes in 2. down to under 3s in total. That is only 300% of the time the native xterm needs for the same command - this is really amazing, congrats for getting the rendering part this fast!
Without the buffer the websocket itself eats much of the time sending one- or two-byte frames (that's the greyish thing in the Chrome summary pie). While this is not a "bug" of xterm.js itself, it still might be worth a note in the docs in case people just copy the demo over to their apps.

Update: Fix timings and add some pics:

[screenshot: master branch]

[screenshot: changes from 2. without debounce buffer]

[screenshot: changes from 2. with debounce buffer]

@Tyriar
Member

Tyriar commented Apr 23, 2018

@jerch interesting 😮. One way we could remove getters and still keep many of the benefits: for properties where we're just exposing a readonly version, we can mark the property as readonly on the interface and expose the variable directly:

interface IBufferSet {
  // Usages via IBufferSet will not allow assignment
  readonly activeBuffer: IBuffer;
}

class BufferSet implements IBufferSet {
  public activeBuffer: IBuffer;
}

// Instead of
class BufferSet {
  private _activeBuffer: IBuffer;
  public get activeBuffer(): IBuffer { return this._activeBuffer; }
}

Terminal.buffer is a little different here though, as it's exposing a property on buffers for convenience only. The workaround in #1399 is good.


What do you mean by an additional debounce buffer?

@mofux
Contributor

mofux commented Apr 23, 2018

Nice observation! I've seen the performance issue with the getters when experimenting with xterm.js a while ago. Buffer.lines is another heavily used getter we can probably get rid of, because it always returns this._lines anyway.

Regarding the debounce buffer:
This is something I've never really thought about, but you are absolutely right: debouncing the buffer at the backend can dramatically reduce the IO overhead between the server and the client. I just implemented a proof of concept in my own app and got a 2x speed gain when running commands that produce high throughput (seq 1000000 is a good test candidate).

@Tyriar
Debounce buffer means that you queue the data of the backend pty stream (like node-pty) up to a certain amount of bytes or time, so you can send larger chunks of data to the frontend in a single message. If you observe the data stream of node-pty, you will notice that the data event (sometimes) fires for every single character that is written to the stream. This can cause tens of thousands of data events per second. In my POC I collected the data and emitted it to the frontend at 60 FPS, which made for a good experience.
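
A minimal sketch of the idea, assuming node-pty and the ws package (the names and parameters here are illustrative, not the actual demo server code; the 16ms interval approximates 60 FPS):

import * as pty from 'node-pty';
import { Server } from 'ws';

const wss = new Server({ port: 3000 });

wss.on('connection', (socket) => {
  const proc = pty.spawn('bash', [], { name: 'xterm-color', cols: 80, rows: 24 });

  // Collect pty output instead of forwarding every (possibly one-byte) event.
  let chunks: string[] = [];
  proc.on('data', (data: string) => {
    chunks.push(data);
  });

  // Flush the aggregated data at ~60 FPS as a single websocket frame.
  const flusher = setInterval(() => {
    if (chunks.length > 0) {
      socket.send(chunks.join(''));
      chunks = [];
    }
  }, 16);

  socket.on('close', () => clearInterval(flusher));
});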

@jerch
Member Author

jerch commented Apr 23, 2018

@ debounce:
Yeah, as mofux said, I simply aggregated bursts of small data events from the pty into one bigger chunk before sending it to the websocket. The buffering was very small (a 2ms timeout or a chunk size limit of 1024 bytes), yet the effect was really big. I did not try to align with Chrome's 60 FPS rule though.

@jerch
Member Author

jerch commented Apr 23, 2018

Gonna close this since most of it should be addressed by #1403.

About the debounce thing:
@Tyriar Is it worth adding some debounce functionality to node-pty? I don't know whether the IO overhead of small data events affects only websocket users; if it also affects local transport, it might be good to give people some options to control it without the headache of implementing it themselves.

@jerch jerch closed this as completed Apr 23, 2018
@Tyriar
Member

Tyriar commented Apr 23, 2018

Is it worth adding some debounce functionality to node-pty?

Yes, an option for this seems like a good idea for remote use cases at the very least. I have doubts that this would be faster in an Electron environment, as there will be additional work/GC from maintaining a buffer of strings, flushing it and concatenating them (unless there was a new API with an option which fired data with a string[]?).

@jerch
Member Author

jerch commented Apr 25, 2018

@Tyriar I am not aware of a string[] data event.

Imho this also depends on how the data is generated in the slave program - if a C program does single-byte writes to stdout and the system is almost idle, those bytes will most likely not be buffered up by the OS itself but propagated to the pty one by one, with many expensive context switches (under Linux at least). That's what I encountered when I tested the alternative poller implementation for node-pty.
NodeJS will then create a new data event for every single byte - at that stage not much can be done about it, since this is how libuv works (well, not quite true - since node-pty controls the C/C++ interface to the pty it could do some buffering at that level, but meh, that's not easy to get right).

At the higher level, inside the JS part of node-pty, it boils down to the question whether many callback invocations are more expensive than a tiny buffer with a small delay plus the additional string/buffer copying and GC'ing. For websockets this seems to be the case, I guess because of the wrapping/unwrapping of a single byte into a much bigger frame (almost no data payload) and the transport abstraction in between (even locally). When I moved the escape sequence parser to JS it originally used function callbacks for the transition actions and was 10x slower than with the switch it has now. This was mainly due to function calls with almost no data payload and non-local state handling (attribute lookups). The parser was busy handling its own object state instead of the data lol.
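
A simplified sketch of the two dispatch styles (illustrative only, not the actual parser code):

const enum Action { PRINT = 0, EXECUTE = 1 }

// Style 1: callback table - every byte pays a function call plus
// non-local state access.
const actions: Array<(code: number) => void> = [
  (code) => { /* print handling */ },
  (code) => { /* C0/C1 handling */ }
];

function parseWithCallbacks(data: Uint8Array, table: Uint8Array): void {
  for (let i = 0; i < data.length; i++) {
    actions[table[data[i]]](data[i]);
  }
}

// Style 2: switch - the hot path stays in a single stack frame,
// which engines handle much better for this workload.
function parseWithSwitch(data: Uint8Array, table: Uint8Array): void {
  for (let i = 0; i < data.length; i++) {
    switch (table[data[i]]) {
      case Action.PRINT: /* print handling */ break;
      case Action.EXECUTE: /* C0/C1 handling */ break;
    }
  }
}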

@Tyriar
Member

Tyriar commented Apr 25, 2018

@jerch I mean if we need to send 3 data events, we could fire data with a string[] instead and avoid concatenation altogether. Not sure if it's worth doing this or not.
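
A hypothetical sketch of what such an API could look like (no such API exists in node-pty; all names here are made up):

type DataChunksListener = (chunks: string[]) => void;

class ChunkedDataEmitter {
  private _pending: string[] = [];
  private _listeners: DataChunksListener[] = [];

  public onDataChunks(listener: DataChunksListener): void {
    this._listeners.push(listener);
  }

  public write(data: string): void {
    this._pending.push(data);
  }

  // Hand over the collected chunks as-is; consumers that really need a
  // single string can join lazily, everyone else avoids the concat cost.
  public flush(): void {
    if (this._pending.length === 0) {
      return;
    }
    const chunks = this._pending;
    this._pending = [];
    for (const listener of this._listeners) {
      listener(chunks);
    }
  }
}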

@jerch
Member Author

jerch commented Apr 26, 2018

Hmm, interesting idea. I think only tests can tell whether there is anything to be saved with an approach like this.
