Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MAINTENANCE] Instrument test_yaml_config() #2981

Merged

Conversation

anthonyburdi
Copy link
Member

@anthonyburdi anthonyburdi commented Jul 1, 2021

Changes proposed in this pull request:

  • Add usage statistics instrumentation to test_yaml_config() (golden path and diagnostic_info for errors)
  • Update anonymizers to handle new v3 API data types for use in the test_yaml_config() instrumentation
  • Anonymizers can now check if the parent class of a given class is recognized, this is used to determine if a Datasource is compatible with the V2 or V3 API.
  • Special handling for SimpleSqlalchemyDatasource is included
  • Supported types for test_yaml_config are now class variables of BaseDataContext
  • Tests for the above

Definition of Done

Please delete options that are not relevant.

  • My code follows the Great Expectations style guide
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have added unit tests where applicable and made sure that new and existing tests are passing.
  • I have run any local integration tests and made sure that nothing is broken.

@anthonyburdi anthonyburdi self-assigned this Jul 1, 2021
@netlify
Copy link

netlify bot commented Jul 1, 2021

✔️ Deploy Preview for knoxpod ready!

🔨 Explore the source changes: b2adada

🔍 Inspect the deploy log: https://app.netlify.com/sites/knoxpod/deploys/60e47b793483a50007bbb470

😎 Browse the preview: https://deploy-preview-2981--knoxpod.netlify.app

@github-actions
Copy link
Contributor

github-actions bot commented Jul 1, 2021

HOWDY! This is your friendly 🤖 CHANGELOG bot 🤖

Please don't forget to add a clear and succinct description of your change under the Develop header in docs/changelog.rst, if applicable. This will help us with the release process. See the Contribution checklist in the Great Expectations documentation for the type of labels to use!

Thank you!

Comment on lines +27 to +30
"additionalProperties": True,
# we don't want this to be true, but this is required to allow show_cta_footer
# Note AJB-20201218 show_cta_footer was removed in v 0.9.9 via PR #1249
"required": ["parent_class"],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note : these lines were moved here from another section. Don't worry about this :)

Comment on lines +63 to +70
"anonymized_execution_engine": {
"$ref": "#/definitions/anonymized_class_info"
},
"anonymized_data_connectors": {
"type": "array",
"maxItems": 1000,
"items": {"$ref": "#/definitions/anonymized_class_info"},
},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note : this was added by this PR (excuse the strange diff)

Comment on lines +409 to +412
"__substitution_error__",
"__yaml_parse_error__",
"__custom_subclass_not_core_ge__",
"__class_name_not_provided__",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

more descriptive errors:

  1. substitution : variable substitution error (I dont exist)
  2. yaml parse : tabs are off in yaml
  3. not core GE : the class is not in the list in anonymizer or datacontext
  4. class_name not provided : we didn't have a class name for you

Comment on lines +3240 to +3250
except Exception as e:
usage_stats_event_payload: dict = {
"diagnostic_info": ["__substitution_error__"],
}
send_usage_message(
data_context=self,
event=usage_stats_event_name,
event_payload=usage_stats_event_payload,
success=False,
)
raise e
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is how we generate a substitution_error

Comment on lines +3258 to +3267
except Exception as e:
usage_stats_event_payload: dict = {
"diagnostic_info": ["__yaml_parse_error__"],
}
send_usage_message(
data_context=self,
event=usage_stats_event_name,
event_payload=usage_stats_event_payload,
success=False,
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yaml_parrse_error generated here

Comment on lines +3309 to +3315
if class_name == "SimpleSqlalchemyDatasource":
# Use the raw config here, defaults will be added in the anonymizer
usage_stats_event_payload = (
datasource_anonymizer.anonymize_simple_sqlalchemy_datasource(
name=datasource_name, config=config
)
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SimpleSqlalchemyDatasource has a different class inheritance so handled separately

Comment on lines +3493 to +3500
else:
# If class_name is not a supported type or subclass of a supported type,
# mark it as custom with no additional information since we can't anonymize
usage_stats_event_payload[
"diagnostic_info"
] = usage_stats_event_payload.get("diagnostic_info", []) + [
"__custom_subclass_not_core_ge__"
]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not a core ge class and not a subclass .. then we dont know what it is. error.

Comment on lines +10 to +17
# Test usage stats for test_yaml_config
# - [x] test_test_yaml_config_usage_stats_substitution_error
# - [x] test_test_yaml_config_usage_stats_yaml_parse_error
# See test_data_context_test_yaml_config.test_config_with_yaml_error()
# - [x] test_test_yaml_config_usage_stats_store_type
# See test_data_context_test_yaml_config.test_expectations_store_with_filesystem_store_backend()
# - [NA] test_test_yaml_config_usage_stats_datasource_type_v2
# - [x] test_test_yaml_config_usage_stats_datasource_type_v3
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is ~ sniff ~ beautiful :)

@anthonyburdi anthonyburdi marked this pull request as ready for review July 1, 2021 22:36
@anthonyburdi anthonyburdi enabled auto-merge (squash) July 1, 2021 22:38
Copy link
Contributor

@Shinnnyshinshin Shinnnyshinshin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A true tour de force :) thank you for the well-thought-out design, implementation and testing. Leaving a few non-blocking requests for comments, first in the DataConnectorAnonymizer and also in the classes that are used for testing (which we talked about last week). Otherwise LGTM! :)

@anthonyburdi anthonyburdi merged commit d5fab68 into develop Jul 6, 2021
@anthonyburdi anthonyburdi deleted the MAINTENANCE/GEA-1/GEA-7/update_usage_stats_messages branch July 6, 2021 16:49
pasmavie pushed a commit to pasmavie/great_expectations that referenced this pull request Jul 13, 2021
Instrument test_yaml_config() and update Anonymizers
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants