-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(fonts): make fonts async #7
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mostly just questions! we can talk about them irl in our meeting though
lib/pdffonts.js
Outdated
module.exports = require('bindings')('pdffonts'); | ||
const Bluebird = require('bluebird'); | ||
|
||
const pdffonts = require('bindings')('pdffonts'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: thoughts on capitalizing? PDFFonts
? PdfFonts
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PDFFonts
works for me
lib/pdffonts.js
Outdated
const pdffonts = require('bindings')('pdffonts'); | ||
|
||
module.exports = { | ||
fonts: Bluebird.promisify(pdffonts.fonts) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any reason not to use Bluebird.promisifyAll
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For some reason promisifyAll
didn't work...it throws Error: expected 2 arguments
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh. w e i r d.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Turns out we can use promisifyAll
but I was just forgetting the prefix. Will investigate further.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After further investigation it seems like the prefix is required. I'd prefer that the method name is fonts
, so I'll leave as is.
test/pdffonts.test.js
Outdated
}); | ||
}); | ||
|
||
it('handles Type 3 fonts', () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
curious what a type 3 font is and why we're explicitly testing for it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still don't fully understand but its basically another type of font, characterized by the fact that "glyphs [are] defined using the full PostScript language" (from wiki). This enables them to do fancy stuff like shading, color or patterns in fonts. It does not support hinting which is a feature of certain font types that enables a font to adjust how it is rendered at small scales for readability.
For what it means to us - Type 3 fonts aren't that common in the wild. Enfocus says they're "primarily decorative ornaments fonts with a lot of tonal variations". Type 3 fonts can not have a name, they can just be a collection of user-defined vector graphics bundled into a "font". I've seen this happen in PDFs generated from latex.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gotcha. is there a reason we specifically have this test? was it initially not being included and this is to prevent regression?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah sorry, I didn't mention why. Type 3 fonts can have a null name, so I needed to add this check to prevent a null pointer error. This test covers that code path for coverage.
} | ||
|
||
NAN_MODULE_INIT(Init) { | ||
Nan::Set(target, Nan::New("fonts").ToLocalChecked(), Nan::GetFunction(Nan::New<FunctionTemplate>(Fonts)).ToLocalChecked()); | ||
// Set globalParams once. globalParams is a global variable and is referenced |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are we just setting this once instead of during the function call? is this more efficient?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
globalParams
has already been declared once globally, so it only needs to be set once in order for it to be used. If we set it in the function call multiple invocations of the method could potentially re-set the value. I initially had it set and destroyed in the function call but I ran into intermittent segfaults - I believe it is because multiple threads were running and setting/destroying the same shared memory. I explain more IRL, @dckrooney helped me debug this.
src/pdffonts.cc
Outdated
return; | ||
// Validate that arguments are correct | ||
if (!info[0]->IsString()) { | ||
return Nan::ThrowError(Nan::New("expected arg 0: string filename").ToLocalChecked()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
optional: usually its a slightly worse ux when an async function can possibly throw
as well. cause then, depending on how you use it, youd have to wrap it in a try/catch
to handle a possible error and also handle the possible rejection. since this is intending on being private, i dont think its a huge deal, but just thought id mention it!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Talked IRL - since the sync errors are handled by Bluebird the public facing functionality is that errors are thrown asynchronously. I'm considering just leaving that for now.
src/pdffonts.cc
Outdated
void HandleErrorCallback() { | ||
Nan::HandleScope scope; | ||
v8::Local<v8::Value> argv[] = { | ||
Nan::New(this->ErrorMessage()).ToLocalChecked(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what part of this makes it an Error
object?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch! I don't think it is returned as an Error
object, it is a string. I would prefer an Error
- I'll investigate how to do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait...checking again I think it is returned in JS as an Error
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the default behavior is pretty much what I've written, so I'll take this out:
https://github.com/nodejs/nan/blob/77d0046d4a8378a0cd31d8110130471b44c99d8d/nan.h#L1869-L1876
fontArray | ||
}; | ||
|
||
callback->Call(2, argv); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the 2
arg here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first argument is argc
(I'm assuming the argument count) and the second is argv
, the argument vector: https://github.com/nodejs/nan/blob/master/doc/callback.md
src/pdffonts.cc
Outdated
callback->Call(2, argv); | ||
} | ||
|
||
void HandleErrorCallback() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will this automatically get called over HandleOKCallback
if this->ErrorMessage()
is set?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Talked IRL - I believe this is the case: https://github.com/nodejs/nan/blob/master/nan.h#L1804-L1813
src/pdffonts.cc
Outdated
|
||
const char *fontName; | ||
if (font->getName() == NULL) { | ||
fontName = "null"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we want the string "null"
over a null
object?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah good point - theres probably a more appropriate value. Maybe a null object or an empty string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Going to make it the empty string
src/pdffonts.cc
Outdated
|
||
info.GetReturnValue().Set(fontArray); | ||
// Start the async worker | ||
Nan::AsyncQueueWorker(new FontsWorker( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(keep in mind that i have no idea what im talking about) since we use new
wont that put this on the heap so wed have to clean it up later? or does nan do that for us? or am i thinking about this wrong?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my long-winded attempt at explaining V8 handle scopes - I believe that any V8/nan-related objects created are properly GC'd by V8/nan.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! thanks again for the irl reivew! it was super helpful 👍
What
fonts
method to perform work asynchronously, away from the main Node event loop.fonts
now takes a callback function as a second argument, that it will call witherror, fonts
on completionWhy
We want to be use this package to log PDF fonts on API requests. Previously
fonts
was performing the work in the main Node event loop, thereby blocking other execution from happening on the loop. This PR performs the majority of the work in a separate worker loop, with the goal of improving performance so that it can be used in API without negative side effects.Performance
I ran some tests, the results can be viewed here. A quick summary of the results:
*
Notes
*
If you see the distribution of letter processing times you'll see the majority of times are between 0-40ms, but there are some resources that took over 1s. When those resources are isolated andfonts
is re-run on them, they do not take the same amount of time. My theory is that these times might have to do with the concurrency at which I'm running the tests on my laptop, as well as the resource limitations of my machine, and isn't representative of how it would perform in API.Resources