How to use rayon to speed up reading files by blocks? #1164
-
We have a huge binary file whose content can be divided into millions of blocks of the same size (order does not matter). What we want is to read each block, map each block into an instance of the same structure, and process each instance. I think maybe we should use …
-
Since you appear to be able to compute the block offsets upfront, you should be able to start with Rayon's `ParallelIterator` implementation for `Range`, e.g.:

```rust
use rayon::prelude::*;
use std::fs::File;
use std::os::unix::fs::FileExt; // for read_exact_at (Unix-only)

let file = File::open("path/to/large/file")?;
let len = file.metadata()?.len();
let block_size: u64 = 4096;
let blocks = len / block_size;
(0..blocks)
    .into_par_iter()
    .map(|block| block * block_size)
    .try_for_each_init(
        // One reusable, zero-initialized buffer per worker thread.
        || vec![0u8; block_size as usize],
        |buf, offset| {
            // read_exact_at takes &self, so the shared File works across threads.
            file.read_exact_at(buf, offset)?;
            // TODO: process buf...
            Ok::<(), std::io::Error>(())
        },
    )?;
```

Alternatively, you might want to look into using the `memmap2` crate to map the whole file into your process' address space.

EDIT: The buffer must be initialized for `read_exact_at`, which is why it starts as `vec![0u8; block_size as usize]` rather than an empty buffer.
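For the question's "map each block into an instance of the same structure" step, the per-block closure could decode the buffer into a struct. A minimal sketch, assuming a hypothetical 16-byte record with two little-endian fields (the real layout is whatever your file format defines):

```rust
/// Hypothetical fixed-size record; replace with your actual layout.
#[derive(Debug, PartialEq)]
struct Record {
    id: u64,
    value: f64,
}

/// Decode one block's bytes into a Record (little-endian fields assumed).
/// Panics if the block is shorter than 16 bytes.
fn decode(block: &[u8]) -> Record {
    Record {
        id: u64::from_le_bytes(block[0..8].try_into().unwrap()),
        value: f64::from_le_bytes(block[8..16].try_into().unwrap()),
    }
}
```

Inside `try_for_each_init`, you would call `decode(buf)` right after `read_exact_at` and then process the resulting `Record`.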