diff --git a/transforms/code/code_quality/ray/README.md b/transforms/code/code_quality/ray/README.md
index 120814155..e19cfc370 100644
--- a/transforms/code/code_quality/ray/README.md
+++ b/transforms/code/code_quality/ray/README.md
@@ -10,7 +10,7 @@ This module captures code specific metrics of input data. The implementation is
 
 * line specific metrics include mean & max line length
 * character and token ratio - uses the input tokenizer to tokenize the input data & measure the ratio between the characters and tokens
-* identifies the high occurence of the keywords "test " or "config" and tags them as config or test samples
+* identifies the high occurrence of the keywords "test " or "config" and tags them as config or test samples
 * tags the samples as autogenerated if the sample contains keywords like `auto-generated`, `autogenerated` or `automatically generated`
 * programming language specific identification, where:
     * if the input sample is `python` programming language and sample has no reference to constructs like def, class, it is highlighted as `has_no_keywords`