Quieter logs in LangKit test suite #135
 load-test:
- poetry run pytest langkit/tests --load
+ poetry run pytest langkit/tests -o log_level=WARN -o log_cli=true --load
ERROR-level output is the most common problem I see when running the load tests; I think the standard unit tests are OK at INFO?
e.g.
ERROR whylogs.experimental.core.udf_schema:udf_schema.py:77 Evaluating UDF response.monosyllable_count failed
Traceback (most recent call last):
File "/home/jamie/.cache/pypoetry/virtualenvs/langkit-EeFODeF5-py3.8/lib/python3.8/site-packages/whylogs/experimental/core/udf_schema.py", line 74, in _apply_udfs_on_row
new_columns[new_col] = udf(values)[0]
File "/home/jamie/projects/v1/TextMetricsToolkit/langkit/textstat.py", line 49, in wrappee
return [stat(input) for input in text[column]]
File "/home/jamie/projects/v1/TextMetricsToolkit/langkit/textstat.py", line 49, in <listcomp>
return [stat(input) for input in text[column]]
File "/home/jamie/.cache/pypoetry/virtualenvs/langkit-EeFODeF5-py3.8/lib/python3.8/site-packages/textstat/textstat.py", line 1407, in monosyllabcount
word_list = self.remove_punctuation(text).split()
File "/home/jamie/.cache/pypoetry/virtualenvs/langkit-EeFODeF5-py3.8/lib/python3.8/site-packages/textstat/textstat.py", line 268, in remove_punctuation
text = re.sub(punctuation_regex, '', text)
File "/usr/lib/python3.8/re.py", line 210, in sub
return _compile(pattern, flags).sub(repl, string, count)
TypeError: expected string or bytes-like object
ERROR whylogs.experimental.core.udf_schema:udf_schema.py:77 Evaluating UDF response.difficult_words failed
Traceback (most recent call last):
File "/home/jamie/.cache/pypoetry/virtualenvs/langkit-EeFODeF5-py3.8/lib/python3.8/site-packages/whylogs/experimental/core/udf_schema.py", line 74, in _apply_udfs_on_row
new_columns[new_col] = udf(values)[0]
File "/home/jamie/projects/v1/TextMetricsToolkit/langkit/textstat.py", line 49, in wrappee
return [stat(input) for input in text[column]]
File "/home/jamie/projects/v1/TextMetricsToolkit/langkit/textstat.py", line 49, in <listcomp>
return [stat(input) for input in text[column]]
File "/home/jamie/.cache/pypoetry/virtualenvs/langkit-EeFODeF5-py3.8/lib/python3.8/site-packages/textstat/textstat.py", line 920, in difficult_words
return len(self.difficult_words_list(text, syllable_threshold))
File "/home/jamie/.cache/pypoetry/virtualenvs/langkit-EeFODeF5-py3.8/lib/python3.8/site-packages/textstat/textstat.py", line 942, in difficult_words_list
words = set(re.findall(r"[\w\='‘’]+", text.lower()))
AttributeError: 'int' object has no attribute 'lower'
ERROR whylogs.experimental.core.udf_schema:udf_schema.py:77 Evaluating UDF response.aggregate_reading_level failed
Traceback (most recent call last):
File "/home/jamie/.cache/pypoetry/virtualenvs/langkit-EeFODeF5-py3.8/lib/python3.8/site-packages/whylogs/experimental/core/udf_schema.py", line 74, in _apply_udfs_on_row
new_columns[new_col] = udf(values)[0]
File "/home/jamie/projects/v1/TextMetricsToolkit/langkit/textstat.py", line 60, in wrappee
return [stat(input, float_output=True) for input in text[column]]
File "/home/jamie/projects/v1/TextMetricsToolkit/langkit/textstat.py", line 60, in <listcomp>
return [stat(input, float_output=True) for input in text[column]]
File "/home/jamie/.cache/pypoetry/virtualenvs/langkit-EeFODeF5-py3.8/lib/python3.8/site-packages/textstat/textstat.py", line 1191, in text_standard
lower = self._legacy_round(self.flesch_kincaid_grade(text))
File "/home/jamie/.cache/pypoetry/virtualenvs/langkit-EeFODeF5-py3.8/lib/python3.8/site-packages/textstat/textstat.py", line 711, in flesch_kincaid_grade
sentence_length = self.avg_sentence_length(text)
File "/home/jamie/.cache/pypoetry/virtualenvs/langkit-EeFODeF5-py3.8/lib/python3.8/site-packages/textstat/textstat.py", line 400, in avg_sentence_length
asl = float(self.lexicon_count(text) / self.sentence_count(text))
File "/home/jamie/.cache/pypoetry/virtualenvs/langkit-EeFODeF5-py3.8/lib/python3.8/site-packages/textstat/textstat.py", line 295, in lexicon_count
text = self.remove_punctuation(text)
File "/home/jamie/.cache/pypoetry/virtualenvs/langkit-EeFODeF5-py3.8/lib/python3.8/site-packages/textstat/textstat.py", line 268, in remove_punctuation
text = re.sub(punctuation_regex, '', text)
File "/usr/lib/python3.8/re.py", line 210, in sub
return _compile(pattern, flags).sub(repl, string, count)
TypeError: expected string or bytes-like object
ERROR whylogs.experimental.core.udf_schema:udf_schema.py:77 Evaluating UDF response.custom_group_count failed
Traceback (most recent call last):
File "/home/jamie/.cache/pypoetry/virtualenvs/langkit-EeFODeF5-py3.8/lib/python3.8/site-packages/whylogs/experimental/core/udf_schema.py", line 74, in _apply_udfs_on_row
new_columns[new_col] = udf(values)[0]
File "/home/jamie/projects/v1/TextMetricsToolkit/langkit/count_regexes.py", line 26, in wrappee
return [count_patterns(pattern_group, input) for input in text[column]]
File "/home/jamie/projects/v1/TextMetricsToolkit/langkit/count_regexes.py", line 26, in <listcomp>
return [count_patterns(pattern_group, input) for input in text[column]]
File "/home/jamie/projects/v1/TextMetricsToolkit/langkit/count_regexes.py", line 18, in count_patterns
if expression.search(text):
After this PR, any ERRORs logged should reflect real unexpected errors that we should investigate.
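One way to keep expected failures out of the log is to raise a logger's threshold only while a test feeds it intentionally bad input. The following is a minimal sketch using only the standard library; the helper name `quiet_logger` is hypothetical and is not taken from this PR, while the logger name comes from the traceback above.

```python
import logging
from contextlib import contextmanager


@contextmanager
def quiet_logger(name, level=logging.CRITICAL):
    """Temporarily raise the threshold of the named logger, then restore it."""
    logger = logging.getLogger(name)
    previous = logger.level
    logger.setLevel(level)
    try:
        yield logger
    finally:
        # Always restore the level, even if the wrapped test code raises.
        logger.setLevel(previous)


# Hypothetical usage: silence the UDF logger while passing non-string input.
udf_logger = logging.getLogger("whylogs.experimental.core.udf_schema")
with quiet_logger(udf_logger.name):
    udf_logger.error("Evaluating UDF failed")  # filtered out inside the block
```

Restoring the previous level in `finally` means an ERROR logged outside the block is still emitted, so unexpected failures stay visible.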
whyuser@222bf565ee10:/workspace/langkit-config1$ make load-test
poetry run pytest langkit/tests -o log_level=WARN -o log_cli=true --load
========================================================================= test session starts ==========================================================================
platform linux -- Python 3.8.17, pytest-7.4.0, pluggy-1.2.0
rootdir: /workspace/langkit-config1
collected 28 items
langkit/tests/test_callback_handler.py::test_callback_passthroughs_undefined_ok PASSED [ 3%]
langkit/tests/test_callback_handler.py::test_callback_passthroughs_undefined_no_args PASSED [ 7%]
langkit/tests/test_callback_handler.py::test_callback_passthroughs_defined_functions PASSED [ 10%]
langkit/tests/test_callback_handler.py::test_callback_passthroughs_defined_logging_functions PASSED [ 14%]
langkit/tests/test_callback_handler.py::test_callback_instance_handler_defined PASSED [ 17%]
langkit/tests/test_callback_handler.py::test_callback_instance_handler_with_metadata PASSED [ 21%]
langkit/tests/test_callback_handler.py::test_callback_instance_handler_defined_getattr PASSED [ 25%]
langkit/tests/test_callback_handler.py::test_callback_instance_three_ply_class_hierarchy PASSED [ 28%]
langkit/tests/test_count_patterns.py::test_count_patterns[False] PASSED [ 32%]
langkit/tests/test_count_patterns.py::test_count_patterns[True] PASSED [ 35%]
langkit/tests/test_injections.py::test_injections PASSED [ 39%]
langkit/tests/test_injections.py::test_injections_long_prompt PASSED [ 42%]
langkit/tests/test_input_output.py::test_init_call PASSED [ 46%]
langkit/tests/test_input_output.py::test_custom_encoder PASSED [ 50%]
langkit/tests/test_input_output.py::test_similarity PASSED [ 53%]
langkit/tests/test_nlp_scores.py::test_bleu_score PASSED [ 57%]
langkit/tests/test_patterns.py::test_ptt[False] PASSED [ 60%]
langkit/tests/test_patterns.py::test_ptt[True] PASSED [ 64%]
langkit/tests/test_patterns.py::test_individual_patterns_isolated PASSED [ 67%]
langkit/tests/test_sentiment.py::test_sentiment PASSED [ 71%]
langkit/tests/test_textstat.py::test_textstat PASSED [ 75%]
langkit/tests/test_themes.py::test_init_call PASSED [ 78%]
langkit/tests/test_themes.py::test_theme_custom PASSED [ 82%]
langkit/tests/test_themes.py::test_theme PASSED [ 85%]
langkit/tests/test_themes.py::test_themes_with_json_string PASSED [ 89%]
langkit/tests/test_themes.py::test_themes_standalone PASSED [ 92%]
langkit/tests/test_toxicity.py::test_toxicity PASSED [ 96%]
langkit/tests/test_toxicity.py::test_toxicity_long_response PASSED [100%]
=========================================================================== warnings summary ===========================================================================
.venv/lib/python3.8/site-packages/textstat/textstat.py:7
/workspace/langkit-config1/.venv/lib/python3.8/site-packages/textstat/textstat.py:7: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
import pkg_resources
langkit/tests/test_injections.py::test_injections
/workspace/langkit-config1/.venv/lib/python3.8/site-packages/transformers/models/open_llama/modeling_open_llama.py:42: DeprecationWarning: The 'warn' method is deprecated, use 'warning' instead
logger.warn(
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=================================================================== 28 passed, 2 warnings in 30.36s ====================================================================
OK, that looks good. We don't need to explicitly set the level to WARNING since that is the default, so it would be better to drop
-o log_level=WARN
from the commands.
Also, I don't see anything that looks broken in the unit tests with INFO logging. Why switch that off?
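The point about the default can be checked with the standard library alone: in a fresh interpreter the root logger is created at WARNING, and a logger with no explicit level inherits from its ancestors. This is a minimal sketch; the logger name `example.unconfigured` is made up for illustration, and the snippet does not exercise pytest's own `log_level` handling.

```python
import logging


def effective_level_name(logger_name: str) -> str:
    """Return the name of the level the given logger actually filters at."""
    logger = logging.getLogger(logger_name)
    return logging.getLevelName(logger.getEffectiveLevel())


# A logger that never had setLevel() called reports NOTSET for its own level
# and defers to its ancestors, ending at the root logger.
example = logging.getLogger("example.unconfigured")
print(example.level == logging.NOTSET)
print(effective_level_name("example.unconfigured"))
```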
Just for quietness... We can set it to whatever the consensus level is.
@@ -78,7 +78,7 @@ def meteor_score(text):
 return result

 else:
- diagnostic_logger.warning(
+ diagnostic_logger.info(
Can we add the reference corpus to LangKitConfig so that we don't have to explicitly reinitialize this module?
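A rough sketch of what that suggestion could look like, purely as an illustration: `ExampleConfig` is a stand-in and not the real `LangKitConfig`, and the `reference_corpus` field and the `init` signature are assumptions rather than the library's API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExampleConfig:
    """Hypothetical stand-in for a config object carrying the corpus."""
    reference_corpus: Optional[str] = None


_corpus: Optional[str] = None


def init(corpus: Optional[str] = None,
         config: Optional[ExampleConfig] = None) -> Optional[str]:
    """Pick the corpus: an explicit argument wins, else fall back to the config."""
    global _corpus
    cfg = config if config is not None else ExampleConfig()
    _corpus = corpus if corpus is not None else cfg.reference_corpus
    return _corpus
```

With a fallback like this, a caller who sets the corpus once in the config would not need to call `init` again with an explicit argument.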
LGTM!
Increase log level around tests with intentional bad inputs. Change logging level to info instead of warn for normal input/output module init.