Merge pull request #148 from eltociear/patch-1
Update ray/README.md
shahrokhDaijavad authored May 18, 2024
2 parents 7a851bf + ba33327 commit 3eeaf71
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion transforms/code/code_quality/ray/README.md
@@ -10,7 +10,7 @@ This module captures code specific metrics of input data. The implementation is

* line specific metrics include mean & max line length
* character and token ratio - uses the input tokenizer to tokenize the input data & measure the ratio between the characters and tokens
- * identifies the high occurence of the keywords "test " or "config" and tags them as config or test samples
+ * identifies the high occurrence of the keywords "test " or "config" and tags them as config or test samples
* tags the samples as autogenerated if the sample contains keywords like `auto-generated`, `autogenerated` or `automatically generated`
* programming language specific identification, where:
* if the input sample is `python` programming language and sample has no reference to constructs like def, class, it is highlighted as `has_no_keywords`
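
For readers unfamiliar with the module, the metrics listed in this README hunk can be sketched roughly as follows. This is a minimal illustration and not the transform's actual implementation: every function name, the whitespace tokenizer standing in for "the input tokenizer", and the `threshold` value are assumptions made for the example.

```python
import re

# Markers quoted in the README bullet on autogenerated samples.
AUTOGEN_MARKERS = ("auto-generated", "autogenerated", "automatically generated")


def line_metrics(text):
    """Mean and max line length of a sample (hypothetical helper)."""
    lengths = [len(line) for line in text.splitlines()] or [0]
    return {"line_mean": sum(lengths) / len(lengths), "line_max": max(lengths)}


def char_token_ratio(text, tokenize=str.split):
    """Characters per token. `tokenize` is a stand-in for the input tokenizer;
    whitespace splitting is an assumption used only for illustration."""
    tokens = tokenize(text)
    return len(text) / len(tokens) if tokens else 0.0


def is_autogenerated(text):
    """True if the sample contains one of the autogenerated markers."""
    lowered = text.lower()
    return any(marker in lowered for marker in AUTOGEN_MARKERS)


def is_config_or_test(text, threshold=0.05):
    """True if the share of lines mentioning "test " or "config" exceeds
    `threshold`. The threshold is hypothetical; the README only says
    "high occurrence"."""
    lines = text.splitlines()
    if not lines:
        return False
    hits = sum(1 for line in lines if "test " in line.lower() or "config" in line.lower())
    return hits / len(lines) > threshold


def has_no_keywords(python_source):
    """True if a Python sample never uses `def` or `class`."""
    return re.search(r"\b(def|class)\b", python_source) is None
```

Called on a small sample such as `"def f():\n    return 1\n"`, `line_metrics` reports the mean and max line length, and `has_no_keywords` returns `False` because the sample defines a function.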