Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve ntfs_device_size_get for file #45

Open
wants to merge 3 commits into
base: edge
Choose a base branch
from

Conversation

kgermanov
Copy link

@kgermanov kgermanov commented Jul 1, 2022

For avoid random read we can use IO callback for get size from struct stat.
Issue: #46

For avoid random read we can use IO callback for get stat.
@kgermanov
Copy link
Author

@szakacsits What do you think about this?

@jpandre
Copy link
Collaborator

jpandre commented Jul 1, 2022

AFAICT this does not work for a block device. You should check whether you are mounting a regular file.

@kgermanov
Copy link
Author

kgermanov commented Jul 4, 2022

@jpandre Usually block device should be handled by some ioctl codes. But anyway we should handle the situation, when it cannot, thx.

@kgermanov
Copy link
Author

@jpandre Is it ok now?

@unsound
Copy link
Member

unsound commented Jul 12, 2022

@kgermanov While trusting 'stat' is usually fine, the binary search method is more reliable with regard to the actual underlying file characteristics and it doesn't require that many calls. In addition it ensures that the entire length of the file can be accessed (e.g. it's not corrupted in the file system).
Can you explain in what way the binary search method is problematic for you?

@kgermanov
Copy link
Author

kgermanov commented Jul 12, 2022

@unsound For 2TB disk it requred about log(2*1024*1024*1024*1024) = 41 calls. But most problem in that this calls are randomly over whole disk (Did not work buffered IO).
If underlying fs does not fast (for example on slow NFS) there is may be problem.
Corrupted file system can break read calls for any chunk, binary search does not provide guarantees.

@unsound
Copy link
Member

unsound commented Jul 12, 2022

Maybe it's just me but 41 I/O requests doesn't seem like much, even random ones, compared to what the driver would issue during normal filesystem operation. How much does your fix speed up mounting in this particular scenario?

The particular corruption I'm thinking about is when internal extents end before logical ones, and also some filter filesystems can choose to expose a device as a 0-byte (indeterminate) file, which if we apply this patch couldn't be opened.

@kgermanov
Copy link
Author

@unsound My case was extremly corner: each read's call take 1s. In this situation impove from 1 min to 5 sec for 2TB file for mount.
We can do binary search if stat return size less than 512.

@jpandre
Copy link
Collaborator

jpandre commented Jul 12, 2022

AFAICT ntfs_device_size_get() is not used while mounting (mounting relies on data stored in the boot sector). It is only used by a few ntfsprogs : mkntfs, ntfsclone, ntfsfix, ntfslabel and ntfsresize.

@unsound
Copy link
Member

unsound commented Jul 12, 2022

@jpandre Ah, I could have sworn I've seen it used in the library code as a sanity check but I may have confused it with another project. Then the impact is much smaller than I thought initially.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants