Quieter logs in LangKit test suite #135

richard-rogers · 2023-08-21T19:34:50Z

Raise the log level around tests with intentional bad inputs. Log normal input/output module initialization at info instead of warn.
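One way to raise the log level only around a test that feeds intentionally bad input is a small context manager, sketched below. This is an illustration under assumptions, not the code from this PR: `raised_log_level`, `flaky_udf`, and the logger name `example.udf` are hypothetical names introduced for the example. pytest's built-in `caplog.at_level` fixture method offers similar behaviour inside a test.

```python
import logging
from contextlib import contextmanager


@contextmanager
def raised_log_level(logger_name: str, level: int = logging.CRITICAL):
    """Temporarily raise a logger's level and restore it afterwards."""
    logger = logging.getLogger(logger_name)
    previous = logger.level
    logger.setLevel(level)
    try:
        yield logger
    finally:
        # Restore the original level even if the test body raises.
        logger.setLevel(previous)


def flaky_udf(value):
    # Hypothetical stand-in for a metric UDF that logs and swallows failures.
    log = logging.getLogger("example.udf")
    try:
        return len(value.split())
    except AttributeError:
        log.error("Evaluating UDF failed for %r", value)
        return None


def test_bad_input_is_tolerated():
    # The ERROR record for the intentional bad input is suppressed here.
    with raised_log_level("example.udf"):
        assert flaky_udf(12345) is None
    # Outside the block the logger behaves as before.
    assert flaky_udf("two words") == 2
```

Scoping the change to a single logger name keeps unrelated loggers at their usual level, so an unexpected ERROR elsewhere would still show up.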

jamie256 · 2023-08-21T20:21:58Z

Makefile


 load-test:
-	poetry run pytest langkit/tests --load
+	poetry run pytest langkit/tests -o log_level=WARN -o log_cli=true --load


ERROR-level output is the most common problem I see when running the load tests; I think the standard unit tests are OK at INFO?

e.g.

ERROR whylogs.experimental.core.udf_schema:udf_schema.py:77 Evaluating UDF response.monosyllable_count failed
Traceback (most recent call last):
  File "/home/jamie/.cache/pypoetry/virtualenvs/langkit-EeFODeF5-py3.8/lib/python3.8/site-packages/whylogs/experimental/core/udf_schema.py", line 74, in _apply_udfs_on_row
    new_columns[new_col] = udf(values)[0]
  File "/home/jamie/projects/v1/TextMetricsToolkit/langkit/textstat.py", line 49, in wrappee
    return [stat(input) for input in text[column]]
  File "/home/jamie/projects/v1/TextMetricsToolkit/langkit/textstat.py", line 49, in <listcomp>
    return [stat(input) for input in text[column]]
  File "/home/jamie/.cache/pypoetry/virtualenvs/langkit-EeFODeF5-py3.8/lib/python3.8/site-packages/textstat/textstat.py", line 1407, in monosyllabcount
    word_list = self.remove_punctuation(text).split()
  File "/home/jamie/.cache/pypoetry/virtualenvs/langkit-EeFODeF5-py3.8/lib/python3.8/site-packages/textstat/textstat.py", line 268, in remove_punctuation
    text = re.sub(punctuation_regex, '', text)
  File "/usr/lib/python3.8/re.py", line 210, in sub
    return _compile(pattern, flags).sub(repl, string, count)
TypeError: expected string or bytes-like object

ERROR whylogs.experimental.core.udf_schema:udf_schema.py:77 Evaluating UDF response.difficult_words failed
Traceback (most recent call last):
  File "/home/jamie/.cache/pypoetry/virtualenvs/langkit-EeFODeF5-py3.8/lib/python3.8/site-packages/whylogs/experimental/core/udf_schema.py", line 74, in _apply_udfs_on_row
    new_columns[new_col] = udf(values)[0]
  File "/home/jamie/projects/v1/TextMetricsToolkit/langkit/textstat.py", line 49, in wrappee
    return [stat(input) for input in text[column]]
  File "/home/jamie/projects/v1/TextMetricsToolkit/langkit/textstat.py", line 49, in <listcomp>
    return [stat(input) for input in text[column]]
  File "/home/jamie/.cache/pypoetry/virtualenvs/langkit-EeFODeF5-py3.8/lib/python3.8/site-packages/textstat/textstat.py", line 920, in difficult_words
    return len(self.difficult_words_list(text, syllable_threshold))
  File "/home/jamie/.cache/pypoetry/virtualenvs/langkit-EeFODeF5-py3.8/lib/python3.8/site-packages/textstat/textstat.py", line 942, in difficult_words_list
    words = set(re.findall(r"[\w\='‘’]+", text.lower()))
AttributeError: 'int' object has no attribute 'lower'

ERROR whylogs.experimental.core.udf_schema:udf_schema.py:77 Evaluating UDF response.aggregate_reading_level failed
Traceback (most recent call last):
  File "/home/jamie/.cache/pypoetry/virtualenvs/langkit-EeFODeF5-py3.8/lib/python3.8/site-packages/whylogs/experimental/core/udf_schema.py", line 74, in _apply_udfs_on_row
    new_columns[new_col] = udf(values)[0]
  File "/home/jamie/projects/v1/TextMetricsToolkit/langkit/textstat.py", line 60, in wrappee
    return [stat(input, float_output=True) for input in text[column]]
  File "/home/jamie/projects/v1/TextMetricsToolkit/langkit/textstat.py", line 60, in <listcomp>
    return [stat(input, float_output=True) for input in text[column]]
  File "/home/jamie/.cache/pypoetry/virtualenvs/langkit-EeFODeF5-py3.8/lib/python3.8/site-packages/textstat/textstat.py", line 1191, in text_standard
    lower = self._legacy_round(self.flesch_kincaid_grade(text))
  File "/home/jamie/.cache/pypoetry/virtualenvs/langkit-EeFODeF5-py3.8/lib/python3.8/site-packages/textstat/textstat.py", line 711, in flesch_kincaid_grade
    sentence_length = self.avg_sentence_length(text)
  File "/home/jamie/.cache/pypoetry/virtualenvs/langkit-EeFODeF5-py3.8/lib/python3.8/site-packages/textstat/textstat.py", line 400, in avg_sentence_length
    asl = float(self.lexicon_count(text) / self.sentence_count(text))
  File "/home/jamie/.cache/pypoetry/virtualenvs/langkit-EeFODeF5-py3.8/lib/python3.8/site-packages/textstat/textstat.py", line 295, in lexicon_count
    text = self.remove_punctuation(text)
  File "/home/jamie/.cache/pypoetry/virtualenvs/langkit-EeFODeF5-py3.8/lib/python3.8/site-packages/textstat/textstat.py", line 268, in remove_punctuation
    text = re.sub(punctuation_regex, '', text)
  File "/usr/lib/python3.8/re.py", line 210, in sub
    return _compile(pattern, flags).sub(repl, string, count)
TypeError: expected string or bytes-like object

ERROR whylogs.experimental.core.udf_schema:udf_schema.py:77 Evaluating UDF response.custom_group_count failed
Traceback (most recent call last):
  File "/home/jamie/.cache/pypoetry/virtualenvs/langkit-EeFODeF5-py3.8/lib/python3.8/site-packages/whylogs/experimental/core/udf_schema.py", line 74, in _apply_udfs_on_row
    new_columns[new_col] = udf(values)[0]
  File "/home/jamie/projects/v1/TextMetricsToolkit/langkit/count_regexes.py", line 26, in wrappee
    return [count_patterns(pattern_group, input) for input in text[column]]
  File "/home/jamie/projects/v1/TextMetricsToolkit/langkit/count_regexes.py", line 26, in <listcomp>
    return [count_patterns(pattern_group, input) for input in text[column]]
  File "/home/jamie/projects/v1/TextMetricsToolkit/langkit/count_regexes.py", line 18, in count_patterns
    if expression.search(text):

After this PR, any ERRORs logged should reflect real unexpected errors that we should investigate.
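The tracebacks above all stem from string-only operations receiving a non-string value (an int, judging by the AttributeError message). A minimal standard-library reproduction of both exception types is sketched below; `strip_punctuation` and `failure_kind` are hypothetical helpers written for this illustration and are not textstat's or LangKit's code.

```python
import re


def strip_punctuation(text):
    # Simplified stand-in for a text-statistics helper; the regex is
    # illustrative only.
    return re.sub(r"[^\w\s]", "", text)


def failure_kind(func, value):
    """Return the name of the exception raised by func(value), or None."""
    try:
        func(value)
    except Exception as exc:
        return type(exc).__name__
    return None


# A string input succeeds; an int fails inside re.sub with a TypeError,
# and calling .lower() on an int fails with an AttributeError.
print(failure_kind(strip_punctuation, "Hello, world!"))
print(failure_kind(strip_punctuation, 12345))
print(failure_kind(lambda t: t.lower(), 12345))
```

Since these failures are expected when a test deliberately passes non-string data, hiding them is what lets a remaining ERROR line stand out as something worth investigating.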

whyuser@222bf565ee10:/workspace/langkit-config1$ make load-test
poetry run pytest langkit/tests -o log_level=WARN -o log_cli=true --load
============================= test session starts ==============================
platform linux -- Python 3.8.17, pytest-7.4.0, pluggy-1.2.0
rootdir: /workspace/langkit-config1
collected 28 items

langkit/tests/test_callback_handler.py::test_callback_passthroughs_undefined_ok PASSED [ 3%]
langkit/tests/test_callback_handler.py::test_callback_passthroughs_undefined_no_args PASSED [ 7%]
langkit/tests/test_callback_handler.py::test_callback_passthroughs_defined_functions PASSED [ 10%]
langkit/tests/test_callback_handler.py::test_callback_passthroughs_defined_logging_functions PASSED [ 14%]
langkit/tests/test_callback_handler.py::test_callback_instance_handler_defined PASSED [ 17%]
langkit/tests/test_callback_handler.py::test_callback_instance_handler_with_metadata PASSED [ 21%]
langkit/tests/test_callback_handler.py::test_callback_instance_handler_defined_getattr PASSED [ 25%]
langkit/tests/test_callback_handler.py::test_callback_instance_three_ply_class_hierarchy PASSED [ 28%]
langkit/tests/test_count_patterns.py::test_count_patterns[False] PASSED [ 32%]
langkit/tests/test_count_patterns.py::test_count_patterns[True] PASSED [ 35%]
langkit/tests/test_injections.py::test_injections PASSED [ 39%]
langkit/tests/test_injections.py::test_injections_long_prompt PASSED [ 42%]
langkit/tests/test_input_output.py::test_init_call PASSED [ 46%]
langkit/tests/test_input_output.py::test_custom_encoder PASSED [ 50%]
langkit/tests/test_input_output.py::test_similarity PASSED [ 53%]
langkit/tests/test_nlp_scores.py::test_bleu_score PASSED [ 57%]
langkit/tests/test_patterns.py::test_ptt[False] PASSED [ 60%]
langkit/tests/test_patterns.py::test_ptt[True] PASSED [ 64%]
langkit/tests/test_patterns.py::test_individual_patterns_isolated PASSED [ 67%]
langkit/tests/test_sentiment.py::test_sentiment PASSED [ 71%]
langkit/tests/test_textstat.py::test_textstat PASSED [ 75%]
langkit/tests/test_themes.py::test_init_call PASSED [ 78%]
langkit/tests/test_themes.py::test_theme_custom PASSED [ 82%]
langkit/tests/test_themes.py::test_theme PASSED [ 85%]
langkit/tests/test_themes.py::test_themes_with_json_string PASSED [ 89%]
langkit/tests/test_themes.py::test_themes_standalone PASSED [ 92%]
langkit/tests/test_toxicity.py::test_toxicity PASSED [ 96%]
langkit/tests/test_toxicity.py::test_toxicity_long_response PASSED [100%]

=============================== warnings summary ===============================
.venv/lib/python3.8/site-packages/textstat/textstat.py:7
  /workspace/langkit-config1/.venv/lib/python3.8/site-packages/textstat/textstat.py:7: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
    import pkg_resources

langkit/tests/test_injections.py::test_injections
  /workspace/langkit-config1/.venv/lib/python3.8/site-packages/transformers/models/open_llama/modeling_open_llama.py:42: DeprecationWarning: The 'warn' method is deprecated, use 'warning' instead
    logger.warn(

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
======================= 28 passed, 2 warnings in 30.36s ========================

OK, that looks good. We don't need to set the level to WARN explicitly since that is the default; better to drop -o log_level=WARN from the commands.

Also, I don't see anything that looks broken in the unit tests with INFO logging. Why switch that off?

Just for quietness. We can set it to whatever the consensus level is.

richard-rogers · 2023-08-21T23:19:43Z

langkit/nlp_scores.py

@@ -78,7 +78,7 @@ def meteor_score(text):
                    return result

    else:
-        diagnostic_logger.warning(
+        diagnostic_logger.info(
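The effect of moving a message from warning to info can be sketched with the standard logging module: a logger whose threshold is WARNING drops info records but still emits warnings. The logger names, the `ListHandler` class, the `emitted_at` helper, and the message texts below are assumptions made for this illustration, not LangKit's actual code or log output.

```python
import logging


class ListHandler(logging.Handler):
    """Collects (level name, message) pairs instead of printing them."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append((record.levelname, record.getMessage()))


def emitted_at(threshold):
    """Log one info and one warning record; return what passes the threshold."""
    logger = logging.getLogger(f"example.nlp_scores.{threshold}")
    logger.propagate = False  # keep the demo isolated from root handlers
    logger.setLevel(threshold)
    handler = ListHandler()
    logger.addHandler(handler)
    # Hypothetical messages: routine module init vs. something noteworthy.
    logger.info("No reference corpus provided; skipping score registration")
    logger.warning("Something a user should look at")
    logger.removeHandler(handler)
    return handler.messages


print(emitted_at(logging.WARNING))  # only the warning record is kept
print(emitted_at(logging.INFO))     # both records are kept
```

Routine initialization notices then remain available when someone opts into INFO output, without cluttering a default run.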


Can we add the reference corpus to LangKitConfig so that we don't have to explicitly reinitialize this module?

Makefile

jamie256

LGTM!

Quieter logs in LanKit test suite

3d6370a

richard-rogers requested review from bernease, FelipeAdachi and jamie256 August 21, 2023 19:34

jamie256 reviewed Aug 21, 2023

richard-rogers commented Aug 21, 2023

Merge branch 'main' into dev/richard/quiet

e82d466

jamie256 reviewed Aug 23, 2023

switch unit test to INFO output

bee2e17

jamie256 approved these changes Aug 23, 2023

richard-rogers merged commit 344d346 into main Aug 24, 2023

richard-rogers deleted the dev/richard/quiet branch August 24, 2023 05:38
