From ed4e9c1f8cfb77084d095d99200b68355cc059f4 Mon Sep 17 00:00:00 2001
From: Shahrokh Daijavad
Date: Mon, 18 Nov 2024 16:40:37 -0800
Subject: [PATCH] Update README.md

utils folder is one level up from the python folder
---
 transforms/universal/fdedup/python/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/transforms/universal/fdedup/python/README.md b/transforms/universal/fdedup/python/README.md
index d2d940344..295862221 100644
--- a/transforms/universal/fdedup/python/README.md
+++ b/transforms/universal/fdedup/python/README.md
@@ -39,7 +39,7 @@ shingles.
 `num_minhashes_per_band` minhashes. For each document, generate a unique signature for every band.
 The values for `num_bands` and `num_minhashes_per_band` determine the likelihood that documents with a certain Jaccard
-similarity will be marked as duplicates. A Jupyter notebook in the [utils](utils) folder generates a graph of this
+similarity will be marked as duplicates. A Jupyter notebook in the [utils](../utils) folder generates a graph of this
 probability function, helping users explore how different settings for `num_bands` and `num_minhashes_per_band` impact
 the deduplication process.
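
For reviewers who want a quick feel for the probability function the README paragraph refers to, here is a minimal sketch. It assumes the textbook MinHash LSH banding formula, `1 - (1 - s^r)^b`, where `s` is the Jaccard similarity, `r` is `num_minhashes_per_band`, and `b` is `num_bands`. The function name `duplicate_probability` and the parameter values in the usage loop are hypothetical, chosen only for illustration; the notebook in the utils folder may compute or plot the curve differently.

```python
def duplicate_probability(jaccard: float, num_bands: int, num_minhashes_per_band: int) -> float:
    """Probability that two documents share at least one identical band signature.

    Assumes the standard LSH banding model: each minhash matches independently
    with probability equal to the Jaccard similarity of the two documents.
    """
    if not 0.0 <= jaccard <= 1.0:
        raise ValueError("jaccard must be between 0 and 1")
    if num_bands < 1 or num_minhashes_per_band < 1:
        raise ValueError("num_bands and num_minhashes_per_band must be positive")
    # Probability that every minhash in a single band matches.
    band_match = jaccard ** num_minhashes_per_band
    # Probability that at least one of the bands matches in full.
    return 1.0 - (1.0 - band_match) ** num_bands


if __name__ == "__main__":
    # Illustrative settings only, not taken from the patch or the README.
    for similarity in (0.5, 0.7, 0.8, 0.9):
        p = duplicate_probability(similarity, num_bands=14, num_minhashes_per_band=8)
        print(f"Jaccard {similarity:.1f}: candidate-duplicate probability {p:.3f}")
```

Under this model, raising `num_minhashes_per_band` makes each band harder to match and pushes the curve toward higher similarities, while raising `num_bands` gives a pair more chances to match and pushes it the other way.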