-
Notifications
You must be signed in to change notification settings - Fork 159
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't Sync() on every upload #67
Comments
mostynb
added a commit
to mostynb/bazel-remote
that referenced
this issue
Dec 19, 2019
os.File.Sync() was inneffective on mac in go prior to 1.12. In 1.12 onwards it is effective, but slow: golang/go#26650 This change disables Sync for CAS uploads on mac, because we can (potentially) verify this later. Let's see if this helps performance on mac: buchgr#67 (while being less risky than buchgr#68).
mostynb
added a commit
to mostynb/bazel-remote
that referenced
this issue
Dec 30, 2019
os.File.Sync() was inneffective on mac in go prior to 1.12. In 1.12 onwards it is effective, but slow: golang/go#26650 This change disables Sync for CAS uploads on mac, because we can (potentially) verify this later. Let's see if this helps performance on mac: buchgr#67 (while being less risky than buchgr#68).
mostynb
added a commit
to mostynb/bazel-remote
that referenced
this issue
Jan 1, 2020
os.File.Sync() was inneffective on mac in go prior to 1.12. In 1.12 onwards it is effective, but slow: golang/go#26650 This change disables Sync for CAS uploads on mac, because we can (potentially) verify this later. Let's see if this helps performance on mac: buchgr#67 (while being less risky than buchgr#68).
mostynb
added a commit
to mostynb/bazel-remote
that referenced
this issue
Jan 28, 2020
This pattern is used in the filepath.Walk godocs. Discovered while investigating buchgr#67, committing separately so it doesn't get lost in a larger PR.
mostynb
added a commit
that referenced
this issue
Jan 28, 2020
This pattern is used in the filepath.Walk godocs. Discovered while investigating #67, committing separately so it doesn't get lost in a larger PR.
mostynb
added a commit
to mostynb/bazel-remote
that referenced
this issue
Feb 10, 2020
The header is made up of three fields: 1) Little-endian int32 (4 bytes) representing the REAPIv2 DigestFunction. 2) Little-endian int64 (8 bytes) representing the number of bytes in the blob. 3) The hash bytes from the digest, length determined by the particular DigestFunction. (32 for SHA256. 20 for SHA1, 16 for MD5). Note that we currently only support SHA256, however. This header is simple to parse, and does not require buffering the entire blob in memory if you just want the data. To distinguish blobs with and without this header, we use new directories for the affected blobs: ac.v2/ instead of ac/ and similarly for raw/. We do not use this header to actually verify data yet, and we still os.File.Sync() after file writes (buchgr#67). This also includes a slightly refactored version of PR buchgr#123 (load the items from disk concurrently) by @bdittmer.
mostynb
added a commit
to mostynb/bazel-remote
that referenced
this issue
Feb 10, 2020
The header is made up of three fields: 1) Little-endian int32 (4 bytes) representing the REAPIv2 DigestFunction. 2) Little-endian int64 (8 bytes) representing the number of bytes in the blob. 3) The hash bytes from the digest, length determined by the particular DigestFunction. (32 for SHA256. 20 for SHA1, 16 for MD5). Note that we currently only support SHA256, however. This header is simple to parse, and does not require buffering the entire blob in memory if you just want the data. To distinguish blobs with and without this header, we use new directories for the affected blobs: ac.v2/ instead of ac/ and similarly for raw/. We do not use this header to actually verify data yet, and we still os.File.Sync() after file writes (buchgr#67). This also includes a slightly refactored version of PR buchgr#123 (load the items from disk concurrently) by @bdittmer.
mostynb
added a commit
to mostynb/bazel-remote
that referenced
this issue
Feb 10, 2020
The header is made up of three fields: 1) Little-endian int32 (4 bytes) representing the REAPIv2 DigestFunction. 2) Little-endian int64 (8 bytes) representing the number of bytes in the blob. 3) The hash bytes from the digest, length determined by the particular DigestFunction. (32 for SHA256. 20 for SHA1, 16 for MD5). Note that we currently only support SHA256, however. This header is simple to parse, and does not require buffering the entire blob in memory if you just want the data. To distinguish blobs with and without this header, we use new directories for the affected blobs: ac.v2/ instead of ac/ and similarly for raw/. We do not use this header to actually verify data yet, and we still os.File.Sync() after file writes (buchgr#67). This also includes a slightly refactored version of PR buchgr#123 (load the items from disk concurrently) by @bdittmer.
mostynb
added a commit
to mostynb/bazel-remote
that referenced
this issue
Feb 10, 2020
The header is made up of three fields: 1) Little-endian int32 (4 bytes) representing the REAPIv2 DigestFunction. 2) Little-endian int64 (8 bytes) representing the number of bytes in the blob. 3) The hash bytes from the digest, length determined by the particular DigestFunction. (32 for SHA256. 20 for SHA1, 16 for MD5). Note that we currently only support SHA256, however. This header is simple to parse, and does not require buffering the entire blob in memory if you just want the data. To distinguish blobs with and without this header, we use new directories for the affected blobs: ac.v2/ instead of ac/ and similarly for raw/. We do not use this header to actually verify data yet, and we still os.File.Sync() after file writes (buchgr#67). This also includes a slightly refactored version of PR buchgr#123 (load the items from disk concurrently) by @bdittmer.
mostynb
added a commit
to mostynb/bazel-remote
that referenced
this issue
Feb 11, 2020
The header is made up of three fields: 1) Little-endian int32 (4 bytes) representing the REAPIv2 DigestFunction. 2) Little-endian int64 (8 bytes) representing the number of bytes in the blob. 3) The hash bytes from the digest, length determined by the particular DigestFunction. (32 for SHA256. 20 for SHA1, 16 for MD5). Note that we currently only support SHA256, however. This header is simple to parse, and does not require buffering the entire blob in memory if you just want the data. To distinguish blobs with and without this header, we use new directories for the affected blobs: ac.v2/ instead of ac/ and similarly for raw/. We do not use this header to actually verify data yet, and we still os.File.Sync() after file writes (buchgr#67). This also includes a slightly refactored version of PR buchgr#123 (load the items from disk concurrently) by @bdittmer.
mostynb
added a commit
to mostynb/bazel-remote
that referenced
this issue
Feb 14, 2020
The header is made up of three fields: 1) Little-endian int32 (4 bytes) representing the REAPIv2 DigestFunction. 2) Little-endian int64 (8 bytes) representing the number of bytes in the blob. 3) The hash bytes from the digest, length determined by the particular DigestFunction. (32 for SHA256. 20 for SHA1, 16 for MD5). Note that we currently only support SHA256, however. This header is simple to parse, and does not require buffering the entire blob in memory if you just want the data. To distinguish blobs with and without this header, we use new directories for the affected blobs: ac.v2/ instead of ac/ and similarly for raw/. We do not use this header to actually verify data yet, and we still os.File.Sync() after file writes (buchgr#67). This also includes a slightly refactored version of PR buchgr#123 (load the items from disk concurrently) by @bdittmer.
mostynb
added a commit
to mostynb/bazel-remote
that referenced
this issue
Feb 14, 2020
The header is made up of three fields: 1) Little-endian int32 (4 bytes) representing the REAPIv2 DigestFunction. 2) Little-endian int64 (8 bytes) representing the number of bytes in the blob. 3) The hash bytes from the digest, length determined by the particular DigestFunction. (32 for SHA256. 20 for SHA1, 16 for MD5). Note that we currently only support SHA256, however. This header is simple to parse, and does not require buffering the entire blob in memory if you just want the data. To distinguish blobs with and without this header, we use new directories for the affected blobs: ac.v2/ instead of ac/ and similarly for raw/. We do not use this header to actually verify data yet, and we still os.File.Sync() after file writes (buchgr#67). This also includes a slightly refactored version of PR buchgr#123 (load the items from disk concurrently) by @bdittmer.
mostynb
added a commit
to mostynb/bazel-remote
that referenced
this issue
Feb 18, 2020
The header is made up of three fields: 1) Little-endian int32 (4 bytes) representing the REAPIv2 DigestFunction. 2) Little-endian int64 (8 bytes) representing the number of bytes in the blob. 3) The hash bytes from the digest, length determined by the particular DigestFunction. (32 for SHA256. 20 for SHA1, 16 for MD5). Note that we currently only support SHA256, however. This header is simple to parse, and does not require buffering the entire blob in memory if you just want the data. To distinguish blobs with and without this header, we use new directories for the affected blobs: ac.v2/ instead of ac/ and similarly for raw/. We do not use this header to actually verify data yet, and we still os.File.Sync() after file writes (buchgr#67). This also includes a slightly refactored version of PR buchgr#123 (load the items from disk concurrently) by @bdittmer.
mostynb
added a commit
to mostynb/bazel-remote
that referenced
this issue
Feb 26, 2020
The header is made up of three fields: 1) Little-endian int32 (4 bytes) representing the REAPIv2 DigestFunction. 2) Little-endian int64 (8 bytes) representing the number of bytes in the blob. 3) The hash bytes from the digest, length determined by the particular DigestFunction. (32 for SHA256. 20 for SHA1, 16 for MD5). Note that we currently only support SHA256, however. This header is simple to parse, and does not require buffering the entire blob in memory if you just want the data. To distinguish blobs with and without this header, we use new directories for the affected blobs: ac.v2/ instead of ac/ and similarly for raw/. We do not use this header to actually verify data yet, and we still os.File.Sync() after file writes (buchgr#67). This also includes a slightly refactored version of PR buchgr#123 (load the items from disk concurrently) by @bdittmer.
mostynb
added a commit
to mostynb/bazel-remote
that referenced
this issue
Feb 26, 2020
The header is made up of three fields: 1) Little-endian int32 (4 bytes) representing the REAPIv2 DigestFunction. 2) Little-endian int64 (8 bytes) representing the number of bytes in the blob. 3) The hash bytes from the digest, length determined by the particular DigestFunction. (32 for SHA256. 20 for SHA1, 16 for MD5). Note that we currently only support SHA256, however. This header is simple to parse, and does not require buffering the entire blob in memory if you just want the data. To distinguish blobs with and without this header, we use new directories for the affected blobs: ac.v2/ instead of ac/ and similarly for raw/. We do not use this header to actually verify data yet, and we still os.File.Sync() after file writes (buchgr#67). This also includes a slightly refactored version of PR buchgr#123 (load the items from disk concurrently) by @bdittmer.
mostynb
added a commit
to mostynb/bazel-remote
that referenced
this issue
Mar 4, 2020
The header is made up of three fields: 1) Little-endian int32 (4 bytes) representing the REAPIv2 DigestFunction. 2) Little-endian int64 (8 bytes) representing the number of bytes in the blob. 3) The hash bytes from the digest, length determined by the particular DigestFunction. (32 for SHA256. 20 for SHA1, 16 for MD5). Note that we currently only support SHA256, however. This header is simple to parse, and does not require buffering the entire blob in memory if you just want the data. To distinguish blobs with and without this header, we use new directories for the affected blobs: ac.v2/ instead of ac/ and similarly for raw/. We do not use this header to actually verify data yet, and we still os.File.Sync() after file writes (buchgr#67). This also includes a slightly refactored version of PR buchgr#123 (load the items from disk concurrently) by @bdittmer.
mostynb
added a commit
to mostynb/bazel-remote
that referenced
this issue
Mar 8, 2020
The header is made up of three fields: 1) Little-endian int32 (4 bytes) representing the REAPIv2 DigestFunction. 2) Little-endian int64 (8 bytes) representing the number of bytes in the blob. 3) The hash bytes from the digest, length determined by the particular DigestFunction. (32 for SHA256. 20 for SHA1, 16 for MD5). Note that we currently only support SHA256, however. This header is simple to parse, and does not require buffering the entire blob in memory if you just want the data. To distinguish blobs with and without this header, we use new directories for the affected blobs: ac.v2/ instead of ac/ and similarly for raw/. We do not use this header to actually verify data yet, and we still os.File.Sync() after file writes (buchgr#67). This also includes a slightly refactored version of PR buchgr#123 (load the items from disk concurrently) by @bdittmer.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
We found that on some hardware this absolutely kills the performance of the system e.g. a Mac Pro became completely unresponsive. In general, there is no need for calling Sync() on every write as in the worst case it's ok to lose data. We should only ensure that on loading the entries from disk that there is data integrity.
The text was updated successfully, but these errors were encountered: