option to not eat comments #154

Open
GoogleCodeExporter opened this issue Mar 30, 2015 · 21 comments
Comments

@GoogleCodeExporter

hrm, I can't find any way to mark this as a feature request. anyways...

I'd like to preserve comments when I parse yaml. This library looks like it can 
write comments, but just eats them when it's reading files. Any chance of 
adding an option to include comment nodes somewhere?

Original issue reported on code.google.com by chan...@gmail.com on 9 Mar 2012 at 10:58

@Mortal42

Mortal42 commented Oct 14, 2016

Has a decision been made regarding attaching comments to nodes during parsing?
Edit:
Or alternatively, treating them as fake nodes, given maps in yaml-cpp now store their parsed order.
For example, they could be inserted into sequences easily enough, or into node_map as a special pair. This behavior could be gated by an option so it doesn't break existing code. Initial feeling is this would be more of a hack than a solution.

Additionally, it seems odd that there is no OnComment event handler.

@jbeder
Owner

jbeder commented Oct 14, 2016

I haven't considered this issue in a while, but I'm happy to accept patches. I think an event handler for comments would be pretty uncontroversial.
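
As a rough illustration of what a comment event could look like, here is a self-contained sketch. It does not use yaml-cpp: `ToyEventHandler`, `RecordingHandler`, and `ScanLines` are hypothetical names made up for this example, and the toy scanner simply splits each line at the first `#`. Real YAML is stricter (a `#` inside quotes, or not preceded by whitespace, is not a comment), so treat this only as a picture of the callback shape.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical handler interface with a comment hook (not yaml-cpp API).
struct ToyEventHandler {
  virtual ~ToyEventHandler() = default;
  virtual void OnScalar(std::size_t line, const std::string& value) = 0;
  virtual void OnComment(std::size_t line, const std::string& text) = 0;
};

// Records every event as a string so the order can be inspected.
struct RecordingHandler : ToyEventHandler {
  std::vector<std::string> events;
  void OnScalar(std::size_t line, const std::string& value) override {
    events.push_back("scalar@" + std::to_string(line) + ":" + value);
  }
  void OnComment(std::size_t line, const std::string& text) override {
    events.push_back("comment@" + std::to_string(line) + ":" + text);
  }
};

// Strips leading and trailing spaces and tabs.
static std::string Trim(const std::string& s) {
  const auto first = s.find_first_not_of(" \t");
  if (first == std::string::npos) return "";
  const auto last = s.find_last_not_of(" \t");
  return s.substr(first, last - first + 1);
}

// Toy scanner: everything before the first '#' is content, the rest is a
// comment. Line numbers are 0-based. Quoting rules are deliberately ignored.
void ScanLines(const std::string& input, ToyEventHandler& handler) {
  std::istringstream stream(input);
  std::string line;
  std::size_t line_no = 0;
  while (std::getline(stream, line)) {
    const auto hash = line.find('#');
    const std::string content = Trim(line.substr(0, hash));
    if (!content.empty()) handler.OnScalar(line_no, content);
    if (hash != std::string::npos) {
      handler.OnComment(line_no, Trim(line.substr(hash + 1)));
    }
    ++line_no;
  }
}
```

A handler that ignores comments would just leave `OnComment` empty, so existing consumers would not have to change behavior.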

@tepperly

I also think this feature would be useful. It would be nice to be able to preserve comments the user put into a file when rewriting the file (as stated above).

@SimplyKnownAsG

@jbeder, I'm looking into adding comment support. Reading through the concerns above and in other related tickets, it looks like folks expect a comment to come after the thing it describes, but I think it may make sense to have both a "pre" and a "post" comment. For example:

# comment before a sequence
- first item
- # comment before a scalar
  second item # trailing comment
              # this is where it gets confusing
              # I guess it could be possible to detect equivalent Mark.column of consecutive
              # comments to determine that it appears as though someone is continuing.
# in my mind, this is clearly a comment describing the last item, but is inconsistent
# with where the comment for the "second item" was placed
- last item

I guess I have a couple questions, then I'll probably have more should I make any progress.

  1. At one point you stated if it was a part of the Event system it wouldn't be that big of a deal. I don't quite understand how yaml-cpp would be able to re-emit without the data stored on the node_data. Am I missing something, or will comments need to be stored on node_data? Alternatively they could be their own nodes, but that loses context. Perhaps the idea is that two NodeBuilders could be created, one that uses comments and one that ignores.
  2. Assuming that a comment should be attached to the previous node, what is the proper way to retrieve the previous node? Is it NodeBuilder.m_stack.back()?
  3. If a comment before a node should reside on the following node, any ideas where to store the previous comment until a node exists?
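
One possible answer to questions 2 and 3, sketched without yaml-cpp: buffer full-line comments until the next node is created, and attach same-line comments to the most recently built node. `ToyNode` and `ToyBuilder` are hypothetical names, the builder is flat (it ignores nesting, so a comment above a sequence lands on the sequence's first item), and whether the real NodeBuilder could work this way is an assumption.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a node that owns its comments.
struct ToyNode {
  std::string value;
  std::vector<std::string> pre_comments;   // full-line comments above the node
  std::vector<std::string> post_comments;  // comments trailing on the same line
};

// Hypothetical builder: holds leading comments until a node exists.
class ToyBuilder {
 public:
  // A comment on its own line: keep it for the node that follows.
  void OnLineComment(const std::string& text) { pending_.push_back(text); }

  // A comment after content on the same line: attach it to the latest node.
  void OnTrailingComment(const std::string& text) {
    if (nodes_.empty()) {
      pending_.push_back(text);  // nothing to attach to yet
    } else {
      nodes_.back().post_comments.push_back(text);
    }
  }

  // A new node takes ownership of every buffered leading comment.
  void OnScalar(const std::string& value) {
    ToyNode node;
    node.value = value;
    node.pre_comments.swap(pending_);  // pending_ is empty afterwards
    nodes_.push_back(node);
  }

  const std::vector<ToyNode>& nodes() const { return nodes_; }

  // Comments left at end of input, with no following node to own them.
  const std::vector<std::string>& dangling() const { return pending_; }

 private:
  std::vector<ToyNode> nodes_;
  std::vector<std::string> pending_;
};
```

Anything still in `dangling()` at the end of a document would need a home of its own, for example on the document root.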

p.s. I am a huge fan of "I don't want yaml-cpp to output nasty YAML." And if this works, one might be able to create a "yaml-format" tool using yaml-cpp.

@mazen-mardini

mazen-mardini commented Mar 12, 2020

Any progress on this one? The ability to edit commented YAML-files would be awesome.

@SimplyKnownAsG From what I can see, Nodes in yaml-cpp are either scalars or collections (mappings or sequences). Because of that yaml-cpp can easily represent things like complex keys (keys that aren't scalars) by implementing the subscript operator like this:

template <typename Key>
Node operator[](const Key& key);

This is very nice and it makes Node versatile. Saying that Node can be a comment would lead to absurdities, at least for the above reason that comments cannot act as a key in a mapping. There are other reasons, such as the fact that comments can be placed between a key and a value, or that it can be placed before a document even starts (before a "---").

Excluding that idea would leave us with (at least) the second solution, to make comments a part of Node-objects. I'm planning to take a look at the code some more and see if/how this could be done.

Wouldn't it be good to extend this feature to include empty lines as well? Because those are also part of the readability of a document.

@nikich340

Would be a great feature!

@filipebeavis

filipebeavis commented Dec 16, 2020

I also need this feature.
For me this is important because I serialize only one YAML node to send to a server, where I verify the configuration's consistency, because the client and the host execution should be independent.

In this example I send node["configurationToSendServer"] and its mark positions to the server, so that I can report error positions. This works perfectly as long as there are no comments inside configurationToSendServer. When there are comments, they are eaten when the node is serialized, so they do not appear on the server side, and the Mark information is wrong.

name: example
description: "some description"
configurationToSendServer:
    name: execution
    # comment to describe something
    # the next line is wrong configuration for the server
    serverProperty: foobarWrongProperty
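
To show the line shift concretely, here is a small self-contained sketch with no yaml-cpp involved. `LineOf` and `StripCommentLines` are hypothetical helpers; stripping full-line comments only approximates what happens when a parsed node is re-emitted without them.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Returns the 0-based line on which `needle` first appears, or -1 if absent.
int LineOf(const std::string& text, const std::string& needle) {
  std::istringstream stream(text);
  std::string line;
  int line_no = 0;
  while (std::getline(stream, line)) {
    if (line.find(needle) != std::string::npos) return line_no;
    ++line_no;
  }
  return -1;
}

// Drops lines that contain only a comment, keeping everything else as-is.
std::string StripCommentLines(const std::string& text) {
  std::istringstream stream(text);
  std::string line;
  std::string out;
  while (std::getline(stream, line)) {
    const auto first = line.find_first_not_of(" \t");
    const bool comment_only =
        first != std::string::npos && line[first] == '#';
    if (!comment_only) out += line + "\n";
  }
  return out;
}
```

For the four lines under configurationToSendServer above, `serverProperty` sits on line 3 (0-based) of the original text but on line 1 once the two comment lines are gone, so a position reported against one text does not match the other.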

@seisowl

seisowl commented Nov 13, 2021

A lot of work, but starting with taking care of the simplest case, i.e., separate comment lines above the actual thing, would help a lot!

@nathanieltagg

> This is very nice and it makes Node versatile. Saying that Node can be a comment would lead to absurdities, at least for the above reason that comments cannot act as a key in a mapping. There are other reasons, such as the fact that comments can be placed between a key and a value, or that it can be placed before a document even starts (before a "---").

I think it can be workable. You simply define a new value type, a "CommentKey". The key could simply be a unique string, for example "#124" for a comment on line 124. You could similarly have a Comment value, or it could simply be a scalar. The nice thing here is that by keeping whitespace and comments in the tree, you can emit the tree back out as it came in, after modifying things.

I would personally like it so that Nodes could have comments attached to them if the comment appears on the same line; this could simply be an attribute in the YAML node, and can be added to the tree structure. I like this, because I can construct a template configuration tree with comments, unify it with the user config, then recreate it (without having to muck with the Emitter)
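
A minimal sketch of the "CommentKey" idea, assuming an ordered list of key/value pairs rather than yaml-cpp's actual node storage. `Entry`, `CommentKey`, `IsCommentKey`, `Emit`, and `Find` are hypothetical names. One known gap: a real document may contain a quoted key such as "#124", which this naive prefix check would mistake for a comment, so a real implementation would need a distinct type rather than a string convention.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical ordered mapping: entries keep their parsed order, and a
// comment is stored as a pair whose key is "#<line>", e.g. "#124".
using Entry = std::pair<std::string, std::string>;

std::string CommentKey(int line) { return "#" + std::to_string(line); }

bool IsCommentKey(const std::string& key) {
  return !key.empty() && key[0] == '#';
}

// Re-emits entries in order; comment entries become "# text" lines.
std::string Emit(const std::vector<Entry>& entries) {
  std::string out;
  for (const Entry& e : entries) {
    out += IsCommentKey(e.first) ? "# " + e.second
                                 : e.first + ": " + e.second;
    out += "\n";
  }
  return out;
}

// Lookup that never returns comment entries, so user code does not see them.
const std::string* Find(const std::vector<Entry>& entries,
                        const std::string& key) {
  if (IsCommentKey(key)) return nullptr;
  for (const Entry& e : entries) {
    if (e.first == key) return &e.second;
  }
  return nullptr;
}
```

Because the comment lives in the same ordered container as the data, editing a value and calling `Emit` again keeps the comment in place.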

@SirNate0

> This is very nice and it makes Node versatile. Saying that Node can be a comment would lead to absurdities, at least for the above reason that comments cannot act as a key in a mapping. There are other reasons, such as the fact that comments can be placed between a key and a value, or that it can be placed before a document even starts (before a "---").

I am inclined to agree with this. Comments cannot be a value, especially with lists:

- first item
- # comment
  second item # trailing comment

If we treat the comment as a value, then I suppose there are actually 4 items here:
["first item", # comment, "second item", # trailing comment], but the actual YAML has only 2: ["first item", "second item"].


Instead, I think the Node needs to own the comments associated with it. I would store them in a few strings: before, inline, and after, or head, text, and tail (etc. regarding the naming):

# Introductory Comment (which still belongs to data as the first item)
# ...
# The following white space is preserved as part of the comment

# The data
data: # Still belongs to the data entry
    # dict head/leading comment
    ? # dict inline comment
      dict # dict trailing comment
    : # 1234 leading comment
      1234 # 1234 trailing comment

# Trailing comment of the root node of the document

Are there situations where this sort of division is not ideal? Sure. Commenting out an item at the end of a list, for example, will add it to the beginning of the next item. But:

  1. The division can be made "smarter" later, while still keeping the divisions I'm proposing.
  2. When adding/changing content the comment will still generally be there, though an extra key/value could be inserted between its final location and where it ends up. Only when removing content do we risk deleting the comments, but we're already deleting content so it's still a significant improvement to only delete a couple comments vs eating all of them.
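
A minimal sketch of the head/inline/tail division for a flat sequence, using only the standard library. `CommentedScalar`, `EmitCommentBlock`, and `EmitSequence` are hypothetical names, not yaml-cpp API, and nested collections are left out.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical node that owns its comments. Multi-line comments are stored
// with '\n' separators; an empty line inside `head` is preserved white space.
struct CommentedScalar {
  std::string value;
  std::string head;            // comment lines above the item
  std::string inline_comment;  // comment on the same line as the item
  std::string tail;            // comment lines below the item
};

// Writes each line of a comment block as "# line"; blank lines stay blank.
static void EmitCommentBlock(const std::string& block, std::string& out) {
  if (block.empty()) return;
  std::istringstream stream(block);
  std::string line;
  while (std::getline(stream, line)) {
    out += line.empty() ? "" : "# " + line;
    out += "\n";
  }
}

// Emits a flat sequence, keeping every comment next to the item that owns it.
std::string EmitSequence(const std::vector<CommentedScalar>& items) {
  std::string out;
  for (const CommentedScalar& item : items) {
    EmitCommentBlock(item.head, out);
    out += "- " + item.value;
    if (!item.inline_comment.empty()) out += " # " + item.inline_comment;
    out += "\n";
    EmitCommentBlock(item.tail, out);
  }
  return out;
}
```

Erasing one item from the vector drops only that item's comments; the comments owned by the remaining items are emitted unchanged, which is the trade-off described in point 2.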
