eBay's TSV Utilities: Command line tools for large, tabular data files. Filtering, statistics, sampling, joins and more.
Updated Sep 14, 2022. Language: D.
Random Sampling in Clojure
Efficient reservoir sampling implementation for PyTorch
Performs memory-efficient reservoir sampling on very large input files delimited by newlines
A collection of algorithms in Java 8 for the problem of random sampling with a reservoir
Sampling methods for data streams
Produce a sample of lines from files.
Sample documents from MongoDB collections.
Python implementation of fast approximate reservoir sampling.
SAT'18 Paper: SPUR - Satisfying Perfectly Uniform Random sampler (Winner Best Student Paper)
Reservoir sampling implementation with akka-streams support
A fast implementation of Reservoir Sampling with Immutable Persistent data structures.
Stream sampler that picks a random (representative) sample of size k from a stream of values with unknown and possibly very large length.
Output randomly sampled lines from input stream or file
Data and processor parallelism for fast weighted sampling
Reservoir sampling for group-by queries on the Flink platform, aimed at answering single-aggregate queries effectively.
A stream sampler extracts one or more sample sets, each with a given number of elements, from a stream. Each possible sample set (of the given size) has an equal probability of being extracted. A stream sampler is an online algorithm: The size of the input is unknown, and only one pass over the stream is possible.
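The single-pass, unknown-length behavior described above can be illustrated with the classic "Algorithm R" form of reservoir sampling. This is a minimal sketch, not the implementation used by any repository listed here; the function name `reservoir_sample` and its parameters are hypothetical and chosen only for illustration.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(
    stream: Iterable[T], k: int, rng: Optional[random.Random] = None
) -> List[T]:
    """Return up to k items drawn uniformly at random from `stream`.

    Hypothetical helper: a sketch of Algorithm R, which reads the
    stream once and never needs to know its length in advance.
    """
    rng = rng or random.Random()
    reservoir: List[T] = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i (0-based) replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # randint is inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir
```

For example, `reservoir_sample(open("big.txt"), 100)` would keep 100 lines in memory no matter how long the file is (the file name is only a placeholder). If the stream holds fewer than `k` items, all of them are returned.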