Add file-like object support to Streaming API #2400

mthrok · 2022-05-18T17:22:30Z

This commit adds file-like object support to Streaming API.

Features

- File-like objects are expected to implement `read(self, n)`.
- Additionally, `seek(self, offset, whence)` is used if available.
- Without a `seek` method, some formats cannot be decoded properly.
  - To work around this, one can use the existing `decoder` option to specify which decoder to use.
  - The `decoder` and `decoder_option` arguments were added to the `add_basic_[audio|video]_stream` methods, similar to `add_[audio|video]_stream`.
  - The order of the arguments was changed so that the arguments common to both audio and video come before the rest.
  - The `dtype` and `format` arguments were also changed to make them consistent across the audio and video methods.
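
To make the interface above concrete, here is a minimal sketch of two source objects: one that offers only `read(n)` and one that also offers `seek(offset, whence)`. The class names `ReadOnlySource` and `SeekableSource` and the helper `capabilities` are hypothetical and exist only for illustration; they are not part of torchaudio.

```python
import io


class ReadOnlySource:
    """Hypothetical source exposing only ``read(n)`` (e.g. a network stream)."""

    def __init__(self, data: bytes):
        self._buf = io.BytesIO(data)

    def read(self, n: int) -> bytes:
        # Return up to ``n`` bytes; an empty result signals end of stream.
        return self._buf.read(n)


class SeekableSource(ReadOnlySource):
    """Same source, additionally exposing ``seek(offset, whence)``."""

    def seek(self, offset: int, whence: int = io.SEEK_SET) -> int:
        # Delegate to the underlying buffer and return the new position.
        return self._buf.seek(offset, whence)


def capabilities(src) -> dict:
    """Report which of the two methods an object provides."""
    return {
        "read": callable(getattr(src, "read", None)),
        "seek": callable(getattr(src, "seek", None)),
    }
```

An object like `ReadOnlySource` is the case where naming the decoder explicitly may be needed, since the reader cannot rewind to probe the format.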

Code structure

The approach is very similar to how file-like objects are supported in sox-based I/O.
In the Streaming API, if the input `src` is a string, it is passed to the implementation bound with TorchBind;
if `src` has a `read` attribute, it is passed to the same implementation bound via PyBind11.
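
The dispatch described above can be sketched roughly as follows. This is an illustration under assumptions, not the actual torchaudio code: `_TorchBindBackend`, `_PyBindBackend`, and `select_backend` are hypothetical stand-ins for the two sets of bindings.

```python
import io
from typing import Any


class _TorchBindBackend:
    """Hypothetical stand-in for the TorchBind-bound implementation."""

    name = "torchbind"

    def __init__(self, src: str):
        self.src = src


class _PyBindBackend:
    """Hypothetical stand-in for the PyBind11-bound implementation."""

    name = "pybind11"

    def __init__(self, src: Any):
        self.src = src


def select_backend(src: Any):
    # Strings (paths, URLs, device names) go to the TorchBind binding.
    if isinstance(src, str):
        return _TorchBindBackend(src)
    # Anything with a ``read`` attribute is treated as a file-like object.
    if hasattr(src, "read"):
        return _PyBindBackend(src)
    raise ValueError("`src` must be a string or an object with a `read` method.")


# Example: an in-memory buffer is routed to the PyBind11 stand-in.
backend = select_backend(io.BytesIO(b""))
```

Because the check happens once when the source is given, every later method call can be delegated to the same backend object.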

Refactoring involved

Extracted to Refactor Streamer implementation #2402
- Some of the implementation in the original TorchBind surface layer was converted to a wrapper class so that it can be re-used from the PyBind11 bindings. The wrapper class serves to simplify the binding.
- The `add_basic_[audio|video]_stream` methods were removed from the C++ layer, as they only constructed a string and passed it to the `add_[audio|video]_stream` method, which is simpler to do in Python.
- The original core Streamer implementation kept the use of types in the `c10` namespace to a minimum: all `c10::optional` and `c10::Dict` values were converted to their `std` equivalents at the binding layer. Since these types work fine with PyBind11, the Streamer core methods now deal with them directly.
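
As an illustration of why the "basic" methods are simple to express in Python, here is a sketch of building an FFmpeg filter description from a sample rate and a sample format. The function name `build_audio_filter` and its parameters are hypothetical; the real torchaudio code may assemble the string differently. `aresample`, `aformat`, and `anull` are existing FFmpeg audio filters, with `anull` passing audio through unchanged.

```python
from typing import Optional


def build_audio_filter(
    sample_rate: Optional[int] = None,
    fmt: Optional[str] = None,
) -> str:
    """Build an FFmpeg audio filter description (hypothetical helper)."""
    parts = []
    if sample_rate is not None:
        # Resample to the requested rate.
        parts.append(f"aresample={sample_rate}")
    if fmt is not None:
        # Convert to the requested sample format, e.g. "fltp" or "s16p".
        parts.append(f"aformat=sample_fmts={fmt}")
    # With no conversion requested, fall back to the pass-through filter.
    return ",".join(parts) or "anull"
```

The resulting string is what a method like `add_audio_stream` would receive as its filter description, so the C++ layer only needs to handle the general case.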

TODO:

Check if it is possible to stream MP4 (yuv420p) from S3 and directly decode (with/without HW decoding).

Summary:
* Move the helper wrapping code in the TorchBind layer to a proper wrapper class so that it can be re-used in PyBind11.
* Move the `add_basic_[audio|video]_stream` methods from C++ to Python, as they are just string manipulation. This makes the PyBind11-based binding simpler, as it need not deal with dtype.
* Move the `add_[audio|video]_stream` wrapper signature to Streamer core, so that Streamer directly deals with `c10::optional`.†

† Related to this, there is a slight change in how the empty filter expression is stored. Originally, if an empty filter expression was given to the `add_[audio|video]_stream` method, `StreamReaderOutputStream` showed it as the empty string `""`, even though internally it was using `"anull"` or `"null"`. Now `StreamReaderOutputStream` shows the filter expression that is actually being used.

Ref #2400

Pull Request resolved: #2402
Reviewed By: nateanl
Differential Revision: D36488808
Pulled By: mthrok
fbshipit-source-id: 877ca731364d10fc0cb9d97e75d55df9180f2047

facebook-github-bot · 2022-05-19T17:08:42Z

@mthrok has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot · 2022-05-19T19:02:29Z

@mthrok has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Summary: This commit adds file-like object support to Streaming API.

## Features

- File-like objects are expected to implement `read(self, n)`.
- Additionally, `seek(self, offset, whence)` is used if available.
- Without a `seek` method, some formats cannot be decoded properly.
  - To work around this, one can use the existing `decoder` option to specify which decoder to use.
  - The `decoder` and `decoder_option` arguments were added to the `add_basic_[audio|video]_stream` methods, similar to `add_[audio|video]_stream`.
  - The order of the arguments was changed so that the arguments common to both audio and video come before the rest.
  - The `dtype` and `format` arguments were also changed to make them consistent across the audio and video methods.

## Code structure

The approach is very similar to how file-like objects are supported in sox-based I/O. In the Streaming API, if the input `src` is a string, it is passed to the implementation bound with TorchBind; if `src` has a `read` attribute, it is passed to the same implementation bound via PyBind11.

![Untitled drawing](https://user-images.githubusercontent.com/855818/169098391-6116afee-7b29-460d-b50d-1037bb8a359d.png)

## Refactoring involved

- Extracted to pytorch#2402
  - Some of the implementation in the original TorchBind surface layer was converted to a wrapper class so that it can be re-used from the PyBind11 bindings. The wrapper class serves to simplify the binding.
  - The `add_basic_[audio|video]_stream` methods were removed from the C++ layer, as they only constructed a string and passed it to the `add_[audio|video]_stream` method, which is simpler to do in Python.
  - The original core Streamer implementation kept the use of types in the `c10` namespace to a minimum: all `c10::optional` and `c10::Dict` values were converted to their `std` equivalents at the binding layer. Since these types work fine with PyBind11, the Streamer core methods now deal with them directly.
- On the Python side, the switch of binding happens in the constructor of the `StreamReader` class. Since all the methods have to be delegated to the same set of bindings, a backend was introduced, which is abstracted away from user code.

## TODO:

- [x] Check if it is possible to stream MP4 (yuv420p) from S3 and directly decode (with/without HW decoding).

Pull Request resolved: pytorch#2400
Differential Revision: D36520073
Pulled By: mthrok
fbshipit-source-id: 3f79875e7635386283893a7c08cd19d4d0f8efa5

facebook-github-bot · 2022-05-19T20:47:21Z

This pull request was exported from Phabricator. Differential Revision: D36520073

facebook-github-bot · 2022-05-19T20:57:47Z

This pull request was exported from Phabricator. Differential Revision: D36520073

facebook-github-bot · 2022-05-19T22:43:39Z

@mthrok has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot · 2022-05-20T00:31:33Z

@mthrok has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot · 2022-05-20T02:44:01Z

This pull request was exported from Phabricator. Differential Revision: D36520073

carolineechen · 2022-05-20T06:12:00Z

torchaudio/io/_stream_reader.py

@@ -232,51 +295,63 @@ class StreamReader:
            You can use this argument to change the input source before it is passed to decoder.

            Default: ``None``.
+
+        buffer_size (int):
+            The internal buffer size in byte. Unsed only when `src` is file-like object.


Suggested change

-            The internal buffer size in byte. Unsed only when `src` is file-like object.
+            The internal buffer size in byte. Used only when `src` is file-like object.

carolineechen · 2022-05-20T06:22:52Z

torchaudio/io/_stream_reader.py

+                If the source stream is exhausted before enough frames are buffered,
+                then the chunk is returned as-is."""
+
+_buffer_chunk_size = """Internal buffer size.


could you add default args to the optional parameters as well?

facebook-github-bot · 2022-05-21T01:12:02Z

@mthrok has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot · 2022-05-21T01:29:03Z

This pull request was exported from Phabricator. Differential Revision: D36520073

github-actions · 2022-05-21T23:53:31Z

Hey @mthrok.
You merged this PR, but labels were not properly added. Please add a primary and secondary label (See https://github.com/pytorch/audio/blob/main/.github/process_commit.py)

facebook-github-bot added the CLA Signed label May 18, 2022

mthrok force-pushed the ffmpeg-fileobj branch 3 times, most recently from d9ff9da to ab4394b Compare May 18, 2022 19:16

mthrok mentioned this pull request May 18, 2022

Refactor Streamer implementation #2402

Closed

mthrok force-pushed the ffmpeg-fileobj branch 7 times, most recently from d71374e to 675694d Compare May 19, 2022 04:11

mthrok force-pushed the ffmpeg-fileobj branch from 675694d to abe8ce9 Compare May 19, 2022 16:59

mthrok force-pushed the ffmpeg-fileobj branch from 1875a40 to 7656df1 Compare May 19, 2022 19:02

mthrok force-pushed the ffmpeg-fileobj branch from 7656df1 to 752cbea Compare May 19, 2022 20:47

mthrok force-pushed the ffmpeg-fileobj branch from 752cbea to 08fbf45 Compare May 19, 2022 20:57

mthrok force-pushed the ffmpeg-fileobj branch from 04b8f3e to beceba4 Compare May 20, 2022 02:43

mthrok marked this pull request as ready for review May 20, 2022 03:27

mthrok requested review from hwangjeff and carolineechen May 20, 2022 03:27

carolineechen reviewed May 20, 2022

mthrok force-pushed the ffmpeg-fileobj branch from b1f4942 to c69b274 Compare May 21, 2022 00:03

carolineechen approved these changes May 21, 2022

mthrok force-pushed the ffmpeg-fileobj branch from 22c1d4e to aa91aa0 Compare May 21, 2022 01:29

facebook-github-bot closed this in a984872 May 21, 2022

mthrok deleted the ffmpeg-fileobj branch May 22, 2022 00:24

mthrok added module: IO new feature labels May 22, 2022
