-
Notifications
You must be signed in to change notification settings - Fork 10.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds node.js logic for networking tasks for PDF.js #8712
Conversation
src/display/node_stream.js
Outdated
}; | ||
|
||
function PDFNetworkStreamRangeReader(path, begin, end) { | ||
this._rangeRequest = fs.createReadStream(path, { begin, end, }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
change begin=>start fs.createReadStream(path, { start, end, });
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done with new commit.
src/display/node_stream.js
Outdated
let fs = require('fs'); | ||
let assert = require('assert'); | ||
|
||
function createPromiseCapabilities() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't duplicate this functionality, just import it from src/shared/util.js
instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
few more changes:
fs.lstat
can be used to get length -- it is async so you can use it as a part of headersReady- rename PDFNetworkStream to PDFNodeStream (and class syntax can be used here)
- at "src/pdf.js", import setPDFNetworkStream and depending on
isNodeJS
, use it to set PDFNetworkStream or PDFNodeStream.
src/display/node_stream.js
Outdated
}, | ||
|
||
cancel(reason) { | ||
this._fullRequest.cancel(reason); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this._rangeRequeste.destroy(reason);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
src/display/node_stream.js
Outdated
} | ||
PDFNetworkStreamRangeReader.prototype = { | ||
get isStreamingSupported() { | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return true;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
src/display/node_stream.js
Outdated
}, | ||
|
||
cancel(reason) { | ||
this._fullRequest.close(reason); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this._fullRequest.destroy(reason);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
src/display/node_stream.js
Outdated
}, | ||
|
||
get isStreamingSupported() { | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return true;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Thanks.
src/display/node_stream.js
Outdated
}, | ||
|
||
get isRangeSupported() { | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return true;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Thanks.
src/display/node_stream.js
Outdated
PDFNetworkStreamFullReader.prototype = { | ||
|
||
get contentLength() { | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when fs.Stats are received, we will know stream length
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
67aaf16
to
712de53
Compare
src/display/network.js
Outdated
if (this._validateRangeRequestCapabilities()) { | ||
// Setting right content length. | ||
this._contentLength = | ||
parseInt(this.getResponseHeader('Content-Length'), 10); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Won't just assigning a value, without doing any validation, mean that if the Content-Length header isn't present then this._contentLength
will be NaN
which probably isn't very good (compare with the old code which uses a !isInt(...)
check)?
The following might not matter, but asking anyway since I'm not sure:
With this patch, you're always setting this._contentLength
whereas before that was only done when range requests were supported; does this matter at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In refactoring patches we want to preserve initial logic as much as possible, so I recommend to return multiple values/object from validateRangeRequestCapabilities, e.g. {allowRangeRequests: ..., suggestedLength: ... }
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this._contentLength
will beNaN
which probably isn't very good (compare with the old code which uses a !isInt(...) check)?
Yeah, you are right. Sorry I didn't noticed that. I will add check for isInt
. Thanks.
you're always setting this._contentLength whereas before that was only done when range requests were supported
This commit is not intended to change any functionality of the network.js file(just to extract the logic for range request support). I think setting this._contentLength
inside if (validateRangeRequestCapability) {...
would be better. Thanks for pointing out.
return multiple values/object from validateRangeRequestCapabilities, e.g.
{allowRangeRequests: ..., suggestedLength: ... }
Okay, we'll do this way. Thanks.
c30ca67
to
de29337
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did a cursory review, and found some issues.
Some observations:
-
It seems that you have based the logic on
src/display/transport_stream.js
. This is not necessarily a bad thing, but I wonder whether this indeed makes most sense in Node.js. -
The implementation does not try to minimize resource usage. At the very least, once a request has finished and the response has yielded, it is probably sufficient to discard the underlying handle to the data (e.g. the IncomingMessage).
-
There are no unit tests that demonstrates the new functionality, let alone that the functionality works as expected. This could have caught some of the bugs that I mentioned below. Please add them. If you need inspiration: Currently we have one Node.js-specific test, namely https://github.com/mozilla/pdf.js/blob/5b5781b45d234666241bf3354c0d390315c31d1a/test/unit/display_svg_spec.js.
src/display/node_stream.js
Outdated
this.source = options.source; | ||
this.url = url.parse(this.source.url); | ||
this.isHttp = this.url.protocol === 'http' || | ||
this.url.protocol === 'https'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the protocol
property of a parsed URL ends with a colon, so replace 'http'
with 'http:'
(similarly for 'https'
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Thanks for pointing out.
src/display/node_stream.js
Outdated
method: 'GET', | ||
headers: stream.httpHeaders, | ||
}; | ||
this.request = http.request(options, (response) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You have to check whether this._url.protocol
is https:
, and if so, use the https.request, and http.request otherwise. Similarly for other places where you use http.request
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Thanks for the advice.
src/display/node_stream.js
Outdated
method: 'GET', | ||
headers: stream.httpHeaders, | ||
}; | ||
this.request = http.request(options, (response) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this.request
should be private, this._request
at the very least.
And also see my previous comment about http.request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Thanks.
src/display/node_stream.js
Outdated
} | ||
|
||
cancel(reason) { | ||
this._rangeRequest.cancel(reason); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_rangeRequest
should not be a property of the base class, because the base class does not need to know about the implementation detail, so you should remove the use of _rangeRequest
here. Looking at the implementation, you should probably not directly expose _rangeRequest
, but possibly store a reference to the .read
method, maybe as _read
.
Besides, the _rangeRequest
object is not the same in the subclasses:
The argument of the callback to http.request
is an IncomingMessage, which does not have a cancel
method - https://nodejs.org/api/http.html#http_class_http_incomingmessage
In PDFNodeStreamFsRangeReader, the type is a ReadableStream, which does not have a cancel
method either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
http.request
is an IncomingMessage, which does not have acancel
method
Sorry for typo. This should be this._rangeRequest.close(reason)
but possibly store a reference to the .read method, maybe as _read.
Sorry, but I really don't understand your concern here. Should we have to remove cancel
from BaseRangeReader
class, and also not call this._rangeReder.read()
directly in read()
method ?
Can you please explain it a bit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but possibly store a reference to the
.read
method, maybe as_read
.
As far as I understand, do you mean to store reference of .read
method of this._rangeRequest
in _read
property of PDFNodeStreamRangeReader
(same for PDFNodeStreamFsRangeReader
), somthing like: this._read = this._rangeRequest.read
, and then use this._read
inside read()
of base class?
src/display/node_stream.js
Outdated
this._loaded = 0; | ||
this._readCapability = createPromiseCapability(); | ||
|
||
this._rangeRequest = null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be removed, see my note below about _rangeRequest
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -593,8 +569,6 @@ PDFNetworkStreamRangeRequestReader.prototype = { | |||
}, | |||
}; | |||
|
|||
setPDFNetworkStreamClass(PDFNetworkStream); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In src/pdf.js, we are trying to detect the environment(using isNodeJS
), and then setting PDFNodeStream
or PDFNetworkStream
, see one of the TODO here: #8712 (review)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But in non-Node.js, the stream class is no longer set. That's not good.
It is safe to repeatedly call setPDFNetworkStreamClass
, so restore this line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
webpack can package node_stream and network in one file and initialize these in some random order. so it's better to call setPDFNetworkStreamClass from api.js and remove this line from the network.js
de29337
to
b148a29
Compare
src/display/network.js
Outdated
}, | ||
|
||
_onHeadersReceived: | ||
function PDFNetworkStreamFullRequestReader_onHeadersReceived() { | ||
let { allowRangeRequests, suggestedLength, } = | ||
validateRangeRequestCapabilities({ | ||
getResponseHeader: this.getResponseHeader.bind(this), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't bind() -- create a function here e.g. getResponseHeader: (name) => { return fullRequestXhr.getResponseHeader(name); },
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Thanks.
src/display/node_stream.js
Outdated
|
||
cancel(reason) { | ||
this._fullRequest.close(reason); | ||
this._fullRequest.destroy(reason); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure if we need two of these. can we use either close() or destroy()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed this._fullRequest.close(reason);
. Thanks.
src/display/node_stream.js
Outdated
} | ||
|
||
get isStreamingSupported() { | ||
return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer to have it set to true here, if it's enabled in options.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
src/display/node_stream.js
Outdated
} | ||
|
||
this._isStreamingSupported = !stream.source.disableStream; | ||
this._isRangeSupported = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer to have the above options applied for Fs readers, can we move them into base reader?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
src/display/node_stream.js
Outdated
this._isRangeSupported = true; | ||
} | ||
this._length = suggestedLength; | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move this entire function body into the handleResponse -- notice that this._headersCapability.resolve() still has to be called.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks better than before. I am still expecting some unit tests to ascertain that the introduced logic works as expected (see my previous review for more details).
@@ -593,8 +569,6 @@ PDFNetworkStreamRangeRequestReader.prototype = { | |||
}, | |||
}; | |||
|
|||
setPDFNetworkStreamClass(PDFNetworkStream); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But in non-Node.js, the stream class is no longer set. That's not good.
It is safe to repeatedly call setPDFNetworkStreamClass
, so restore this line.
src/display/node_stream.js
Outdated
this._length = stream.source.length; | ||
this._loaded = 0; | ||
|
||
this._fullRequest = null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This name is a bit confusing, because you use the same name elsewhere in the file.
Furthermore, _fullRequest
is not an accurate description of the value.
How about renaming _fullRequest
of the *FullReader classes to _readableStream
? This would also make the code more understandable for those who are already familiar with Node.js streams.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about renaming
_fullRequest
of the *FullReader classes to_readableStream
?
Sounds like a good idea. Do you think we should also rename _rangeRequest
with _readableStream
(?) in *RangeReader?
src/display/node_stream.js
Outdated
this.url = url.parse(this.source.url); | ||
this.isHttp = this.url.protocol === 'http:' || | ||
this.url.protocol === 'https:'; | ||
this.isFsUrl = !this.url.host; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is isFsUrl
supposed to say whether a URL refers to the filesystem?
If so, this is not the right way to say so. The file:
scheme is commonly used to refer to local files, and it can have a host part too - see e.g. https://en.wikipedia.org/wiki/File_URI_scheme
What are the expected values for url that you intend to support here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are the expected values for url that you intend to support here?
I am checking here if url refers to the filesystem or not, so that we can choose between PDFNodeStream***Reader
or PDFNodeStreamFs***Reader
.
I am trying to parse URL using url.parse
, and then checking for host. In most of cases I found that url
s are of the form ../....path/to/some/pdf.pdf
or if url
is absolute home/userName/path/to/some/pdf.pdf
or file:///home/user/path/to/some/pdf.pdf
-- (and all these cases works fine with above check). The only case that I found failing is file://localhost/path/to/some/pdf.pdf
, and I think this can be fix by checking for protocol === file:
.
Is it sounds reasonable to you, or you have something else in your mind? If not, can you suggest me something concrete?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The canonical URL of a local file is a file:
-URL, so I suggest to check whether the scheme is file:
.
I am trying to add some tests for Although running
[1]: Note that I am following |
6e9f924
to
f7aa67b
Compare
src/display/network.js
Outdated
@@ -14,10 +14,11 @@ | |||
*/ | |||
|
|||
import { | |||
assert, createPromiseCapability, globalScope, isInt, MissingPDFException, | |||
assert, createPromiseCapability, globalScope, MissingPDFException, | |||
UnexpectedResponseException | |||
} from '../shared/util'; | |||
import { setPDFNetworkStreamClass } from './api'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove setPDFNetworkStreamClass import from this file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there is some confusion at #8712 (comment). Done with new commit, Thanks.
c4bad64
to
d3537c0
Compare
src/display/node_stream.js
Outdated
return Promise.reject(this._reason); | ||
} | ||
|
||
let chunk = this._rangeRequest.read(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is duplication of code. The two implementations of read
are identical.
What stops you from putting read
in the base class?
If -for some reason- you cannot use this._rangeRequest.read()
in the base class, then you can create a method in the subclass (e.g. _read() { return this._rangeRequest.read(); }
and call this._read();
from the base class instead.
And also, I thought that you had replaced "_rangeRequest" with "_readableStream", but I can not find that change (any more?).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What stops you from putting read in the base class?
Sorry, but not sure why when I am doing something like #8712 (comment), I am unable to read from the stream, see #8712 (comment).
Also _read() { return this._rangeReques.read(); }
creates the same issue.
And also, I thought that you had replaced "_rangeRequest" with "_readableStream", but I can not find that change (any more?).
Done with new commit. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The context is missing from the comment that you've linked. I think that the breakage is unrelated to the method being in the base class. You now literally have two (actually, after looking again THREE, BaseFullReader.prototype.read
is almost identical, except with the "length" property in onProgress
) identical methods read
in the sub classes.
The inheritance tree looks like this:
- BaseRangeReader
- PDFNodeStreamRangeReader extends BaseRangeReader
- PDFNodeStreamFsRangeReader extends BaseRangeReader
Your current code duplicates the read
method in the latter two classes (the implementation is byte for byte identical). The runtime behavior is identical regardless of where you put the implementation of the "read" method, so you should put the implementation in the base class (BaseRangeReader), to improve the maintainability of the code.
It seems that we are not passing the right But even if we do, then the error should still be "require is not defined" or something like "require("fs") is not a module", because it does not make any sense to run the Node.js-only test in the browser. At minimum, skip the test when you detect that the environment is not Node, e.g. as done here: pdf.js/test/unit/display_svg_spec.js Lines 112 to 114 in ba5dbc9
|
I've submitted a patch for the |
d3537c0
to
d6e330c
Compare
Okay, sure. Tests for |
src/display/node_stream.js
Outdated
|
||
this._readableStream = null; | ||
let rangeStr = start + '-' + (end - 1); | ||
stream.httpHeaders['Range'] = 'bytes=' + rangeStr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
create new object for httpHeaders (before 'Range' is changed) -- we don't want to pollute original httpHeaders object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in new commit. Thanks,
src/display/node_stream.js
Outdated
|
||
this._readableStream = null; | ||
let rangeStr = start + '-' + (end - 1); | ||
stream.httpHeaders['Range'] = 'bytes=' + rangeStr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use bytes=${start}-${end - 1}
format
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
src/display/node_stream.js
Outdated
path: this._url.path, | ||
method: 'GET', | ||
headers: stream.httpHeaders, | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move options creating into separate function (we have similar code above) and add auth
and port
field, also probably agent
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created createRequestOptions
function that returns options
object. Also tried to add a simple http.Agent
, please check if it looks good to you. Thanks.
src/display/node_stream.js
Outdated
this._errored = true; | ||
this._reason = reason; | ||
this._readCapability.resolve(); | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move all .on('readable'/'end'/'error',
into e.g. private _setReadableStream
into base class -- it can be use by Fs constructor as well.
The same thing needs to be done for "full" readers too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in new commit.
Note: I have moved this._readableStream
property in BaseRangeReader
from concrete classes and it works just fine for me.
src/display/node_stream.js
Outdated
} | ||
return Promise.resolve({ value: chunk, done: false, }); | ||
}); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agree with moving read() and cancal() into base class
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in new commit. Thanks.
src/display/node_stream.js
Outdated
constructor(stream, start, end) { | ||
super(stream); | ||
|
||
this._readableStream = fs.createReadStream(this._url.path, { start, end, }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as mentioned above, use this._setReadableStream(fs.createReadStream(this._url.path));
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
src/display/network_utils.js
Outdated
import { isInt } from '../shared/util'; | ||
|
||
function validateRangeRequestCapabilities({ getResponseHeader, isHttp, | ||
rangeChunkSize, disableRange, }) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add assert(rangeChunkSize > 0);
here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Thanks.
this._rangeChunkSize = stream.source.rangeChunkSize; | ||
if (!this._rangeChunkSize && !this._disableRange) { | ||
this._disableRange = true; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move this_disableRange and this._rangeChunkSize logic into base class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Thanks.
d6e330c
to
f0b7682
Compare
src/display/node_stream.js
Outdated
|
||
let { allowRangeRequests, suggestedLength, } = | ||
validateRangeRequestCapabilities({ | ||
getResponseHeader: this.getResponseHeader.bind(this), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove getResponseHeader function below and inline the response.headers access here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Thanks.
src/display/node_stream.js
Outdated
keepAlive: true, | ||
maxSockets: 10, | ||
}; | ||
let agent = url.protocol === 'http:' ? new http.Agent(agentOptions) : |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
too complicated, let's remove agent for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
src/display/node_stream.js
Outdated
} | ||
|
||
cancel(reason) { | ||
this._readableStream.destroy(reason); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here and at the range base reader, we need to think "what if cancel is called before _readableStream is set". I recommend to extract on('error'
body into _error(reason) method and call it when this._readableStream is not set. in _setReadableStream we can check if this._errored = true
we can quickly destroy it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in new commit. Thanks.
f0b7682
to
0ce321b
Compare
I see |
If I pass HTTP URL is the getinfo.js I get the following error:
|
Strange! seems like we are getting wrong url in var pdfjsLib = require('../../build/lib/pdf.js');
pdfjs.getDocument('http://mozilla.github.io/pdf.js/web/compressed.tracemonkey-pldi-09.pdf').
then((doc) => {...}); Although, earlier running the above version of |
0ce321b
to
c9a9806
Compare
2250094
to
3493f14
Compare
Okay, I find out the problem. The headers received by This create problem here: mukulmishra18@3493f14#diff-35bcb37ef640f9436dd67f5077c08b55R271 -- (returns
|
test/unit/node_stream_spec.js
Outdated
|
||
describe('node_stream', function() { | ||
let pdf1 = | ||
'http://mozilla.github.io/pdf.js/web/compressed.tracemonkey-pldi-09.pdf'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This results in an external network request, and slow down the unit test if network is slow, or even cause the unit test to fail if network is somehow unavailable.
Please use a local file server in Node for testing. Don't forget to close the server regardless of test failure. To do so you can use http.createServer in a
beforeAllhook (and start the server on a random free port using
.listen(0), and call
.close()on the server in an
afterAllhook. The return value of
.listenhas an
addressmethod that contains the random port in the
portproperty. After starting the server, you can use something like
http://127.0.0.1:${port}/tracemonkey.pdf`.
And for the browser... Why does this test have a special case !isNode()
if the API that you are testing is only supposed to work on Node? Am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this script is not added to jasmine-bootstrap.js -- let's also add assert(isNode());
just in case somebody will decide to add it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this test have a special case !isNode() if the API that you are testing is only supposed to work on Node?
Sorry, removed with new commit, as we don't that any more.
Please use a local file server in Node for testing
Added with new commit. Can you please check if it looks good to you?
let's also add assert(isNode()); just in case somebody will decide to add it.
Done. Thanks.
test/unit/node_stream_spec.js
Outdated
let path = __non_webpack_require__('path'); | ||
let url = __non_webpack_require__('url'); | ||
pdf2 = | ||
url.parse(path.join(process.cwd(), './test/pdfs/tracemonkey.pdf')).href; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case, pdf2
looks like /path/to/pdf.js/test/pdfs/tracemonkey.pdf
.
I was expecting a file://
prefix, because otherwise it would not be a URL. Does the test really pass? Regardless, can you make sure that the url really looks like a URL and not a path?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was expecting a file:// prefix, because otherwise it would not be a URL.
Then I think adding file://
with process.cwd()
will help, something like : url.parse(path.join('file://' + process.cwd(), './test/pdfs/tracemonkey.pdf'))
src/display/node_stream.js
Outdated
super(stream); | ||
|
||
this._setReadableStream( | ||
fs.createReadStream(this._url.path, { start, end, })); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still have (node:90529) UnhandledPromiseRejectionWarning: Unhandled promise rejection (rejection id: 1): Error: Bad end offset: 65537
. Node.js operates with end
being inclusive. Make it end: end - 1,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in new commit. Thanks.
test/unit/node_stream_spec.js
Outdated
let range12Reader = stream1.getRangeReader(pdfLength - tailSize, pdfLength); | ||
|
||
let range21Reader = stream2.getRangeReader(pdfLength - tailSize - rangeSize, | ||
pdfLength - tailSize - 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Must be pdfLength - tailSize
without - 1
based on our API definition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good, with Rob's requested changes and off-by-one fix in the fs range reader
let's remove fs.readFileSync() from the getinfo.js example to show that we implemented this functionality and it works |
3493f14
to
8414ac6
Compare
test/unit/node_stream_spec.js
Outdated
import { assert, isNodeJS } from '../../src/shared/util'; | ||
import { PDFNodeStream } from '../../src/display/node_stream'; | ||
|
||
// Make sure that we are not running this script is Node.js environment. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Hasn't this comment accidentally been inverted, since I'm assuming that it should be e.g. Make sure that we are only running this script in Node.js environments.
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops..., Thanks for pointing out.
8414ac6
to
4bf45cc
Compare
4bf45cc
to
d16709f
Compare
/botio test |
From: Bot.io (Windows)ReceivedCommand cmd_test from @yurydelendik received. Current queue size: 0 Live output at: http://54.215.176.217:8877/c7f9509b9d69410/output.txt |
From: Bot.io (Linux m4)ReceivedCommand cmd_test from @yurydelendik received. Current queue size: 0 Live output at: http://54.67.70.0:8877/2ee68f404042106/output.txt |
From: Bot.io (Linux m4)SuccessFull output at http://54.67.70.0:8877/2ee68f404042106/output.txt Total script time: 16.50 mins
|
From: Bot.io (Windows)SuccessFull output at http://54.215.176.217:8877/c7f9509b9d69410/output.txt Total script time: 29.72 mins
|
Thank you for the patch. |
Adds node.js logic for networking tasks for PDF.js
This PR implements IPDFStream interface in node.js environment.
fs.createReadStream
is used to createReadableStream
, that can be used to stream PDF data incrementally.