Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor variable substitution #198

Merged
merged 114 commits into from
Jun 13, 2023
Merged

Refactor variable substitution #198

merged 114 commits into from
Jun 13, 2023

Conversation

sujuka99
Copy link
Contributor

@sujuka99 sujuka99 commented Apr 24, 2023

closes #115

@sujuka99 sujuka99 self-assigned this Apr 24, 2023
@sujuka99 sujuka99 marked this pull request as draft April 24, 2023 13:20
@sujuka99
Copy link
Contributor Author

sujuka99 commented May 2, 2023

All substitution functions OLD
  • yaml_loading.substitute() x Substitute $-placeholders in input using template string.
    • PipelineComponent.substitute_component_names() x Introduces ${component_type} = type_ to the substitution.
      • PipelineComponent.substitute_component_variables() x Introduces ${component_name}, ${error_topic_name}, ${output_topic_name} to the substitution.
        • PipelineComponent.substitute_output_topic_names() x Substitutes ${component_name}, ${error_topic_name}, ${output_topic_name}, ${component_type} ONLY in self.to.topics.
        • StreamsApp.__substitute_autoscaling_topic_names() x Substitutes ${component_name}, ${error_topic_name}, ${output_topic_name}, ${component_type} ONLY in self.app.autoscaling.topics and self.app.autoscaling.consumer_group.
      • PipelineComponent.substitute_name() x Substitute ${component_type} ONLY in self.name
    • PipelineComponent.substitute_prefix() x Substitute any $-placeholders ONLY in self.prefix with environment variables
    • Pipeline.substitute_component_specific_variables() x Substitutes${component_type} and ${component_name} everywhere in a env-specific component.

A nested item is more specialized and calls its parent

Trace of substitution functions OLD
  • substitute()
  • substitute_component_names()
    • substitute()
  • substitute_prefix()
    • substitute()
  • substitute_component_specific_variables()
    • substitute()
  • substitute_component_variables()
    • substitute_component_names()
      • substitute()
  • substitute_name()
    • substitute_component_names()
      • substitute()
  • substitute_output_topic_names()
    • substitute_component_variables()
      • substitute_component_names()
        • substitute()
  • __substitute_autoscaling_topic_names()
    • substitute_component_variables()
      • substitute_component_names()
        • substitute()
Substitution trace in deployment
OLD TRACE
  • create_pipeline_config() Create default config, read values from a yaml if it exists
    • PipelineConfig()
      • yaml_config_settings_source()
        • load_yaml_file()
          • substitute()
  • setup_pipeline()
    • Pipeline.load_from_yaml()
      • load_yaml_file() Substitute all set env vars into pipeline.yaml
        • substitute()
      • Pipeline()
        • create_env_components_index() Create index of all components - turn dict from list to mapping
          • component.substitute_component_names()
            • substitute()
        • parse_components() Go through all components one by one
          • apply_component()
            • enrich_component()
              • substitute_component_specific_variables() Override component config with component config in pipeline environment definition.
                • substitute()
              • component_class() Init method of any PipelineComponent
                • extend_with_defaults() Load defaults from yaml
                  • load_defaults()
                    • defaults_from_yaml()
                    • load_yaml_file()
                    • substitute()
                • substitute_output_topic_names() Substitute $ placeholders in component output topics
                  • substitute_component_variables()
                    • substitute_component_names()
                    • substitute()
                • substitute_name() Substitute $ placeholders in component name
                  • substitute_component_names()
                    • substitute()
                • substitute_prefix() Substitute $-placeholders in pipeline prefix
                  • substitute()
                • IF STREAMSAPP __substitute_autoscaling_topic_names() Substitute autoscaling topics' names
                  • substitute_component_variables()
                    • substitute_component_names()
                    • substitute()
                • IF KAFKAAPP substitute() override StreamsApp broker setting
                • IF KAFKACONNECTOR prepare_connector_config() Substitute all topic names variables
                  • substitute_component_variables()
                    • substitute_component_names()
                    • substitute()
NEW TRACE
  • create_pipeline_config() Create default config, read values from a yaml if it exists
    • PipelineConfig()
      • yaml_config_settings_source()
        • load_yaml_file()
          • substitute()
  • setup_pipeline()
    • Pipeline.load_from_yaml()
      • load_yaml_file() Substitute all set env vars into pipeline.yaml
        • substitute()
      • Pipeline()
        • create_env_components_index() Create index of all components - turn dict from list to mapping
          • substitute_nested()
            • substitute()
        • parse_components() Go through all components one by one
          • apply_component()
            • enrich_component()
              • substitute_in_component() Override component config with component config in pipeline environment definition and carry out substitutions.
                • substitute_nested()
                  • substitute()
              • component_class() Init method of any PipelineComponent
                • extend_with_defaults() Load defaults from yaml
                  • load_defaults()
                    • defaults_from_yaml()
                    • load_yaml_file()
                    • substitute()
  • print_yaml() Prints the generated pipeline definition
    • substitute()
Current solution

Summary

The substitution has been moved from the component class to the pipeline generator. It happens together with the (optional) enrichment of the component with environment-specific definitions.

Details

  • Introduced a function that takes any number of substitutions and repeats them until no further changes are made. Enables chained references.
  • Introduced a function that 'inflates' a mapping. Copies all nested pairs to the top level. Could be useful for automatic creation of variables such as a component's output topic. Currently doesn't change the copied keys at all, might have to add prefixes in the way the usual dict flattening works.
  • Refactored component enrichment to also do the regular component substitution. Might not be ideal, subject to further refactoring.
  • Refactored substitution in component that was previously in the pipeline generator. Now the method is responsible for all substitutions in the component definition.
  • Introduced a function that generates a substitution from a pyudantic BaseModel and optionally merges it with an existing substitution. Is capable of adding prefixes to the variables.
  • Removed all substitutions from all component classes aside from defaults-related operations.
  • Enabled cross-component referencing in pipeline definition

See the docstrings and comments in the diff for further info.

Achievements

  • Enabled substitution everywhere in defaults and pipeline def
  • Enabled component-specific variable generation which covers all component attributes.
  • Enabled variables cross-referencing between components.
Limitations
  • Circular references result in an infinite loop
  • Nesting placeholders doesn't work if at some point during resolving a non-alphanumeric character is introduced (underscores are allowed)
Further suggestions

@sujuka99 sujuka99 changed the title Refactor input/output types Refactor variable substitution May 9, 2023
@disrupted disrupted added the type/refactor Refactoring of existing functionality label Jun 8, 2023
kpops/utils/yaml_loading.py Outdated Show resolved Hide resolved
@sujuka99 sujuka99 dismissed stale reviews from disrupted and raminqaf June 13, 2023 13:13

stale

Copy link
Member

@disrupted disrupted left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@sujuka99 sujuka99 merged commit d8918a4 into main Jun 13, 2023
@sujuka99 sujuka99 deleted the refactor/variable-substitution branch June 13, 2023 15:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/refactor Refactoring of existing functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Refactor variable substitution
5 participants