Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] YAML merge operator not supported in custom maps #1038

Closed
zstadler opened this issue Sep 27, 2024 · 5 comments · Fixed by #1040
Closed

[BUG] YAML merge operator not supported in custom maps #1038

zstadler opened this issue Sep 27, 2024 · 5 comments · Fixed by #1040
Labels
enhancement New feature or request

Comments

@zstadler
Copy link
Contributor

zstadler commented Sep 27, 2024

Describe the bug
The << : operator in YAML is used to merge two mappings (key-value pairs) together. [See this description for additional details.]
It is a powerful feature that simplifies configuration files by allowing copy-modify operation for YAML mappings.

For example, making identical definitions for geometry: line and geometry: polygon features in the same layer:

layers:
  - id: aeroway
    features:
    - &aeroway
      source: osm
      geometry: line
      min_zoom: 11
      include_when:
        aeroway:
          - runway
          - taxiway
      attributes:
        - key: class
          tag_value: aeroway
    - << : *aeroway
      geometry: polygon

instead of

layers:
  - id: aeroway
    features:
      - source: osm
        geometry: line
        min_zoom: 11
        include_when:
          aeroway:
            - runway
            - taxiway
        attributes:
          - key: class
            tag_value: aeroway
      - source: osm
        geometry: polygon
        min_zoom: 11
        include_when:
          aeroway:
            - runway
            - taxiway
        attributes:
          - key: class
            tag_value: aeroway

To Reproduce
Steps to reproduce the behavior:

  1. Save the following merge-operator.yaml file in an empty directory:
    # Use YAML << : merge operator
    schema_name: merge operator
    schema_description: Example for merge operator
    attribution: <a href="https://www.openstreetmap.org/copyright" target="_blank">&copy; OpenStreetMap contributors</a>
    args:
      area:
        description: Geofabrik area to download
        default: monaco
      osm_url:
        description: OSM URL to download
        default: '${ args.area == "planet" ? "aws:latest" : ("geofabrik:" + args.area) }'
    sources:
      osm:
        type: osm
        url: '${ args.osm_url }'
    layers:
      - id: aeroway
        features:
          - &aeroway
            source: osm
            geometry: line
            min_zoom: 11
            include_when:
              aeroway:
                - runway
                - taxiway
            attributes:
              - key: class
                tag_value: aeroway
          - << : *aeroway
            geometry: polygon
  2. Run the following command, or an equivalent:
    docker run --rm -v .:/data ghcr.io/onthegomap/planetiler generate-custom --schema=/data/merge-operator.yml --download --force
  3. The first error in the log file is:
    Exception in thread "main" java.lang.IllegalArgumentException: Unrecognized field "<<" (class com.onthegomap.planetiler.custommap.configschema.FeatureItem), not marked as ignorable (8 known properties: "min_zoom", "include_when", "geometry", "attributes", "source", "max_zoom", "exclude_when", "min_size"])
     at [Source: UNKNOWN; byte offset: #UNKNOWN] (through reference chain: com.onthegomap.planetiler.custommap.configschema.SchemaConfig["layers"]->java.util.ArrayList[0]->com.onthegomap.planetiler.custommap.configschema.FeatureLayer["features"]->java.util.ArrayList[1]->com.onthegomap.planetiler.custommap.configschema.FeatureItem["<<"])
            at com.fasterxml.jackson.databind.ObjectMapper._convert(ObjectMapper.java:4624)
            at com.fasterxml.jackson.databind.ObjectMapper.convertValue(ObjectMapper.java:4555)
            at com.onthegomap.planetiler.util.YAML.convertValue(YAML.java:59)
            at com.onthegomap.planetiler.util.YAML.load(YAML.java:36)
            at com.onthegomap.planetiler.util.YAML.load(YAML.java:27)
            at com.onthegomap.planetiler.custommap.configschema.SchemaConfig.load(SchemaConfig.java:39)
            at com.onthegomap.planetiler.custommap.ConfiguredMapMain.main(ConfiguredMapMain.java:36)
            at com.onthegomap.planetiler.Main.main(Main.java:102)
    

Expected behavior
The custom config file should be accepted and produce the same result as the expanded definition.

Environment (please complete the following information):

  • Hardware: Intel(R) Core(TM) i7-10750H CPU @ 2.60GHz 2.59 GHz
  • OS: WSL on Win 11 Pro
  • Java version and distribution: As in the docker image
  • Maven version: As used for building the docker image

Additional context

@zstadler zstadler added the bug Something isn't working label Sep 27, 2024
@msbarry
Copy link
Contributor

msbarry commented Sep 27, 2024

Snakeyaml supports anchors and aliases, but it looks like the merge operator was initially supported in the original snakeyaml but was explicitly excluded from the new version snakeyaml-engine that planetiler uses.

See:

@zstadler
Copy link
Contributor Author

Given the large potential benefit of the merge operator, could you consider using another YAML parser that would support it?
It is not only about writing less, it is also about avoiding errors like when a copy-modified section gets out-of-sync with its origin.

For reference, the operator is supported by OpenMapTiles and docker compose.

@msbarry msbarry added enhancement New feature or request and removed bug Something isn't working labels Sep 28, 2024
@msbarry
Copy link
Contributor

msbarry commented Sep 28, 2024

I don't think I want to move away from snakeyaml since that is the most robust parser for java out there. One option could be to do a post-processing step, since a yaml document like this:

source: &label
  a: 1
dest:
  <<: *label
  b: 2

gets parsed by snake-yaml into this json:

{"source": {"a": 1}, "dest": {"<<": {"a": 1}, "b": 2}}

I could have the yaml parser traverse that struct and hoist any "<<" keys to the parent? That seems like it would match the behavior of pyyaml which hosts the child even if it's not a ref.

@zstadler
Copy link
Contributor Author

That makes sense!

Although not explicitly mentiond the the yaml merge document, I believe the << operator is allowed only as the first key in a mapping. i.e., the following is not allowed:

source: &label
  a: 1
dest:
  c: 3
  <<: *label
  b: 2

@msbarry
Copy link
Contributor

msbarry commented Sep 28, 2024

Sounds good although with this approach it won't be able to tell whether the merge operator was first or later so either would work. Made a rough POC, check out the test cases here: https://github.com/onthegomap/planetiler/pull/1040/files#diff-fd5539f08265c29356d135323e4f5276e244faf9c16574e159da245abcc642d4

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants