gfile: conditionally import tensorflow_io #5491

Merged: 9 commits, Jan 13, 2022


yatbear (Member) commented Jan 7, 2022

Conditionally import tensorflow_io module for additional cloud file system support (https://www.tensorflow.org/io). If the module is missing, prompt the user to run pip install tensorflow_io.
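For readers following along, a minimal sketch of the conditional-import pattern described above. This is not the code from the PR: the helper name `try_import_optional` and the wording of the log message are assumptions made for illustration, and only the standard library is used.

```python
import importlib
import logging

logger = logging.getLogger(__name__)


def try_import_optional(module_name="tensorflow_io"):
    """Return the imported module, or None if it is not installed.

    Hypothetical helper: the name and message are illustrative only.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError:
        # The module is optional, so log a hint instead of failing.
        logger.warning(
            "Optional module %r is not installed; run `pip install %s` "
            "to enable additional filesystem support.",
            module_name,
            module_name,
        )
        return None
```

A caller can treat a `None` result as "no extra filesystems available" and continue with whatever schemes are already registered.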

Relevant issues:

#tensorflow_io

@yatbear yatbear linked an issue Jan 7, 2022 that may be closed by this pull request
@yatbear yatbear requested a review from nfelt January 7, 2022 22:27
nfelt (Contributor) left a comment

Thanks for taking this on!

tensorboard/backend/event_processing/data_ingester.py (8 outdated review threads, resolved)

def testTryToSupportTfio(self):
    with mock.patch.object(tf.io, "gfile") as gfile_mock:
        gfile_mock.return_value.get_registered_schemes = mock.MagicMock()
Contributor

Is this multi-level mocking necessary? I would have thought it would suffice to just do

with mock.patch.object(tf.io.gfile, "get_registered_schemes")
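As an illustration of the single-level approach suggested here, the sketch below patches one attribute directly on its owner rather than mocking the parent object. The `fake_gfile` namespace is a hypothetical stand-in for `tf.io.gfile` so that the example runs without TensorFlow; the scheme lists are made up.

```python
import types
from unittest import mock

# Hypothetical stand-in for a module such as tf.io.gfile.
fake_gfile = types.SimpleNamespace(
    get_registered_schemes=lambda: ["", "file"]
)


def list_schemes():
    """Code under test: reads the schemes from the (fake) gfile module."""
    return fake_gfile.get_registered_schemes()


def schemes_with_patch():
    # Patch only the one attribute; the rest of fake_gfile is untouched.
    with mock.patch.object(
        fake_gfile, "get_registered_schemes", return_value=["gs", "s3"]
    ) as mock_schemes:
        result = list_schemes()
        mock_schemes.assert_called_once_with()
    # On exit from the with-block the original attribute is restored.
    return result
```

Because only the attribute is replaced, no `return_value` chaining on a parent mock is needed.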

Member Author

Removed. I initially mocked the first layer to handle getattr(tf.io.gfile, "get_registered_schemes", None), but I now realize that's unnecessary.

Member Author

Made further changes in the mock due to attribute get_registered_schemes not found error: https://github.com/tensorflow/tensorboard/runs/4794613032?check_suite_focus=true

tensorboard/backend/event_processing/data_ingester_test.py (outdated, resolved)
mock_import.assert_not_called()
mock_gfile_exists.assert_called_once_with("gs://bucket/abc")

def testTryToSupportTfio_fallback_raiseError(self):
Contributor

This test might be simpler if you only have one error (the UnimplementedError from exists()) and don't also include the ImportError. Otherwise, it's sort of combining multiple error conditions, which seems unnecessarily complicated?

Member Author

The simpler version is the test above; this one is meant to check the error message (which lists no supported schemes in the fallback case). But I can remove it if the check is unnecessary.

Contributor

Ah ok, I see the reasoning now since it affects the error message.
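To make the point of this thread concrete, here is a rough sketch of a test that asserts on the error message itself, including the case where no schemes are registered. The helper `check_scheme_supported` and its message format are assumptions for illustration and are not taken from the PR.

```python
import unittest


def check_scheme_supported(scheme, registered_schemes):
    """Raise if `scheme` is not registered (hypothetical helper)."""
    if scheme in registered_schemes:
        return
    # Show a placeholder when nothing is registered (the fallback case).
    supported = ", ".join(sorted(registered_schemes)) or "<none>"
    raise NotImplementedError(
        f"Unsupported filesystem scheme {scheme!r}; supported: {supported}"
    )


class CheckSchemeTest(unittest.TestCase):
    def test_message_lists_supported_schemes(self):
        with self.assertRaisesRegex(NotImplementedError, "supported: gs, s3"):
            check_scheme_supported("hdfs", ["s3", "gs"])

    def test_message_when_nothing_registered(self):
        with self.assertRaisesRegex(NotImplementedError, "supported: <none>"):
            check_scheme_supported("gs", [])
```

The second test is the analogue of the fallback case discussed above: the error is the same type, but the message differs, so it needs its own assertion.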

tensorboard/backend/event_processing/data_ingester_test.py (outdated, resolved)
path: A string representing an input log directory.
Returns:
Filesystem scheme, None if the path doesn't contain one. The filesystem
scheme is usually separated by `://` from the local filesystem path if
Contributor

Totally optional but I'd tend to put a longer comment like this up above Args (and after the 1-liner docstring), vs hiding it inside Returns.

Member Author

Fixed.
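The docstring under review describes extracting a filesystem scheme that is separated from the rest of the path by `://`. A minimal sketch of that behavior, assuming a hypothetical function name `get_filesystem_scheme` (the actual name and implementation in the PR are not shown in this thread):

```python
from typing import Optional


def get_filesystem_scheme(path: str) -> Optional[str]:
    """Return the scheme before `://`, or None if the path has none.

    Hypothetical helper written for illustration only.
    """
    if "://" not in path:
        return None
    # Split once so any later "://" in the path is left alone.
    return path.split("://", 1)[0]
```

For example, `get_filesystem_scheme("gs://bucket/abc")` yields `"gs"`, while a plain local path such as `/tmp/logs` yields `None`.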

@@ -30,6 +30,7 @@
from tensorboard.plugins.pr_curve import metadata as pr_curve_metadata
from tensorboard.plugins.scalar import metadata as scalar_metadata
from tensorboard.util import tb_logging
from tensorboard.compat import tf
Contributor

We should add a "//tensorboard/compat:tensorflow" dep on the corresponding BUILD target for this.

Same goes for the test file.

Member Author

Done.

nfelt (Contributor) left a comment

Thanks, generally LG aside from the build deps!

One note - if possible, it's nice if you can avoid force-pushing the PR for cases where that would discard commits that have previous comments on them, since the GitHub UI makes it difficult to respond to previous comments if their commits were already discarded. It should usually be possible to just push new commits w/ whatever changes you've made to the PR, and if you need to merge in new changes from master that can be done w/ a merge commit.

class FileSystemSupport(tb_test.TestCase):
    def testCheckFilesystemSupport(self):
        with mock.patch.object(tf.io, "gfile", autospec=True) as mock_gfile:
            mock_gfile.get_registered_schemes = mock.MagicMock(
Contributor

So I think the explicit mock construction shouldn't be necessary. Re this comment:

Made further changes in the mock due to attribute get_registered_schemes not found error: https://github.com/tensorflow/tensorboard/runs/4794613032?check_suite_focus=true

From the errors there, it looks like get_registered_schemes not found is happening because this test is ending up using the TF stub rather than actually managing to import TF, since the error I see is:

AttributeError: <module 'tensorboard.compat.tensorflow_stub.io.gfile' from '...data_ingester_test.runfiles/org_tensorflow_tensorboard/tensorboard/compat/tensorflow_stub/io/gfile.py'> does not have the attribute 'get_registered_schemes'

I think the failure to import TF might be somehow related to the summary dependency issues (since those were also failing in the same CI run). Would you mind trying again with the unaffected nightly?

Also, since this means that this test does depend on a real TF in order to run properly, we should add a "//tensorboard:expect_tensorflow_installed" placeholder dep on the BUILD rule for this test.
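The AttributeError quoted above is the standard behavior of `mock.patch.object` when the target lacks the attribute being patched. The sketch below reproduces it with a hypothetical `stub_gfile` namespace standing in for the TF stub module. It also shows the `create=True` option, which would silence the error; that is included only for contrast, since the diagnosis in this comment is that the test should run against a real TensorFlow instead.

```python
import types
from unittest import mock

# Hypothetical stand-in for a stub module that lacks the attribute.
stub_gfile = types.SimpleNamespace(exists=lambda path: False)


def patch_missing_attribute():
    """Return the error message raised when patching a missing attribute."""
    try:
        with mock.patch.object(stub_gfile, "get_registered_schemes"):
            pass
    except AttributeError as e:
        return str(e)
    return None


def patch_with_create():
    """Patch with create=True, which adds the attribute temporarily."""
    with mock.patch.object(
        stub_gfile, "get_registered_schemes", create=True, return_value=["gs"]
    ):
        return stub_gfile.get_registered_schemes()
```

With `create=True` the attribute exists only inside the `with` block and is removed again on exit, so the test would pass without ever exercising the real module.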

Member Author

Done, thanks for the info!

nfelt (Contributor) left a comment

LGTM!

@yatbear yatbear merged commit 931fdb8 into tensorflow:master Jan 13, 2022
@yatbear yatbear deleted the tf_io branch January 13, 2022 23:53
bmd3k pushed a commit to bmd3k/tensorboard that referenced this pull request Jan 19, 2022
* conditionally import tensorflow_io

* refactor
bmd3k pushed a commit that referenced this pull request Jan 20, 2022
* conditionally import tensorflow_io

* refactor
rhps commented Mar 15, 2022

Hi, is it already released?

yatbear (Member, Author) commented Mar 15, 2022

@rhps Yes.

yatbear added a commit to yatbear/tensorboard that referenced this pull request Mar 27, 2023
* conditionally import tensorflow_io

* refactor
dna2github pushed a commit to dna2fork/tensorboard that referenced this pull request May 1, 2023
* conditionally import tensorflow_io

* refactor
Linked issue that this pull request may close: v2.7 tensorboard works with s3