Call os.fsync on flush. #729

Open
dchao34 opened this issue Dec 3, 2024 · 0 comments
We encountered this issue while writing tensorboard events into a GCS Fuse mounted directory tree.

It turns out that GCS Fuse requires the OS buffers to be flushed before other clients that mount the same GCS bucket can see the writes (cf. https://github.com/googlecloudplatform/gcsfuse/blob/master/docs/semantics.md#readwrites).

Since https://github.com/lanpa/tensorboardX/blob/master/tensorboardX/record_writer.py#L192 only flushes the Python-level file buffer, one needs to add another line just below it that calls os.fsync. Something like this:

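def flush(self):
    # Flush Python's userspace file buffer to the OS.
    self._writer.flush()
    # Ask the OS to commit its own buffers to the underlying storage.
    os.fsync(self._writer.fileno())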

With this change, GCS Fuse mounted directories behave as you would expect.
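
For anyone who wants to try the flush-then-fsync pattern outside tensorboardX, here is a minimal standalone sketch. `DurableWriter` and `demo` are hypothetical names made up for illustration; this is not tensorboardX's actual class. It only assumes a regular file object opened in binary mode and runs against a local temp directory rather than a GCS Fuse mount.

```python
import os
import tempfile


class DurableWriter:
    """Hypothetical stand-in for a record writer (not tensorboardX's class)."""

    def __init__(self, path):
        self._writer = open(path, "wb")

    def write(self, data):
        self._writer.write(data)

    def flush(self):
        # Step 1: move bytes from Python's userspace buffer to the OS.
        self._writer.flush()
        # Step 2: ask the OS to commit its buffers to the underlying storage.
        os.fsync(self._writer.fileno())

    def close(self):
        self.flush()
        self._writer.close()


def demo():
    # Write a few bytes, flush, and report the on-disk size.
    path = os.path.join(tempfile.mkdtemp(), "events.bin")
    writer = DurableWriter(path)
    writer.write(b"hello")
    writer.flush()
    size = os.path.getsize(path)
    writer.close()
    return size
```

Calling `demo()` writes five bytes and returns the file size as seen after the flush. Whether other mounts of the same bucket see the write at that point depends on the GCS Fuse semantics linked above, which this local sketch does not exercise.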

torch.utils.tensorboard has the behavior on GCS fuse described here out of the box.
