Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add batched copy for regular and block device files #748

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

koonwen
Copy link

@koonwen koonwen commented Aug 18, 2024

The default behavior of Flow.copy in the uring backend doesn't leverage batching syscalls to be sent in one shot. In principle, files that are bounded in size can use this batched submission pattern to reduce context switching overhead. The batched copy implementation is taken directly from lib_eio_linux/tests/eurcp.ml.

Old behaviour:
trace_old

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ------------------
 33.44    0.000418          13        30           io_uring_enter

New behaviour:
trace_new

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ------------------
  0.00    0.000000           0         7           io_uring_enter

Tests:

#OLD
+default_fs: 288.79 MB/s
{
  "config": {
    "uname": "Linux debian-thinkpad 6.1.0-23-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.99-1 (2024-07-15) x86_64 GNU/Linux",
    "backend": "linux",
    "recommended_domain_count": 4
  },
  "results": [
    {
      "name": "Flow.copy",
      "metrics": [
        {
          "name": "default_fs",
          "value": 302819851.7110036,
          "units": "bytes/s",
          "description": "default_fs Flow.copy"
        }
      ]
    }
  ]
}

#NEW
+default_fs: 462.35 MB/s
{
  "config": {
    "uname": "Linux debian-thinkpad 6.1.0-23-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.99-1 (2024-07-15) x86_64 GNU/Linux",
    "backend": "linux",
    "recommended_domain_count": 4
  },
  "results": [
    {
      "name": "Flow.copy",
      "metrics": [
        {
          "name": "default_fs",
          "value": 484808364.3907238,
          "units": "bytes/s",
          "description": "default_fs Flow.copy"
        }
      ]
    }
  ]
}

Copy link
Collaborator

@talex5 talex5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for investigating this!

Doing this in Flow.copy might be a problem, because normally it copies bytes in order but this doesn't. For example, if I'm doing tail -f dst then I might miss some parts of the file. But having an Eio.Path.copy that worked this way would be reasonable.

Also, it shouldn't rely on fixed buffers as they're not always available (this is why the benchmarks are failing). I'm not convinced they're any faster either (Anil said they only make a difference for networking, and rarely even then, I think).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants