Proposal: Allow newlines and trailing commas in inline tables #516
Comments
Maybe I'm off-base, but I'm not yet sold on this proposal. Regarding the first point, considering that arrays and tables are different things, the perceived inconsistency in syntax is not a problem, is perfectly acceptable, and sets these different things off nicely. Skipping down to point 3, isn't the following equivalent? It's already legal TOML, it's readable, and it's space-efficient, or so I like to think. You may disagree with me (especially since I swapped two of the keys) but at least take a look:
Also consider this. Any more readable?
Inline tables are fully intended to be small tables, with multiple key/value pairs on one line. If the tables in your (quite readable) example were any larger, then double-bracket notation would make much more sense, even with repeated keys, and you'd get the one-line-one-pair that you seem to find aesthetically appealing. In either case, we don't need to add a pseudo-JSON to get readability, no matter whether it would be simple to implement. |
@eksortso I can see where you're coming from on the first point. I do disagree though, because IMHO they are very much similar: they're both inline data structures. I don't know how to make that argument more convincing though. I think my main point there is: both the arrays and the inline tables have pseudo-JSON syntax, but the inline tables are missing some features right now that you would expect coming from JSON (or any other language that has similar syntax).
On the other two examples you make some good points. I took the second example because it was mentioned in the original issue. I see now though that there was discussion on that issue about whether it was even a good example. The third one is an issue I actually have myself with my configs. I think you made some good points there as well. Especially the aligning of keys in the third one helps quite a lot with readability. I do think you cheated a bit, in a smart way, by moving the keys around. I'll expand on that point and hopefully make my arguments there a bit stronger, but of course you're still allowed to disagree:
Modified point 3
I'll show the same piece of config in different ways below and list some of the disadvantages and advantages of each one.
With double brackets
[main_app.general_settings.logging]
log-lib = "logrus"
[[main_app.general_settings.logging.handlers]]
name = "default"
output = "stdout"
level = "info"
[[main_app.general_settings.logging.handlers]]
name = "stderr"
output = "stderr"
level = "error"
[[main_app.general_settings.logging.handlers]]
name = "http-access"
output = "/var/log/access.log"
level = "info"
[[main_app.general_settings.logging.loggers]]
name = "default"
handlers = ["default", "stderr"]
level = "warning"
[[main_app.general_settings.logging.loggers]]
name = "http-access"
handlers = ["http-access"]
level = "info"
Advantages:
Disadvantages:
With inline tables unaligned
[main_app.general_settings.logging]
log-lib = "logrus"
handlers = [
{name = "default", output = "stdout", level = "info"},
{name = "stderr", output = "stderr", level = "error"},
{name = "http-access", output = "/var/log/access.log", level = "info"},
]
loggers = [
{name = "default", handlers = ["default", "stderr"], level = "warning"},
{name = "http-access", handlers = ["http-access"], level = "info"},
]
Advantages:
Disadvantages:
With inline tables without newlines without reordered keys
[main_app.general_settings.logging]
log-lib = "logrus"
handlers = [
{name = "default",     output = "stdout",              level = "info"},
{name = "stderr",      output = "stderr",              level = "error"},
{name = "http-access", output = "/var/log/access.log", level = "info"},
]
loggers = [
{name = "default",     handlers = ["default", "stderr"], level = "warning"},
{name = "http-access", handlers = ["http-access"],       level = "info"},
]
Advantages:
Disadvantages:
With inline tables without newlines with reordered keys
[main_app.general_settings.logging]
log-lib = "logrus"
handlers = [
{name = "default",     level = "info",  output = "stdout"},
{name = "stderr",      level = "error", output = "stderr"},
{name = "http-access", level = "info",  output = "/var/log/access.log"},
]
loggers = [
{name = "default",     level = "warning", handlers = ["default", "stderr"]},
{name = "http-access", level = "info",    handlers = ["http-access"]},
]
Advantages:
Disadvantages:
With newlines
[main_app.general_settings.logging]
log-lib = "logrus"
handlers = [
{
name = "default",
output = "stdout",
level = "info",
}, {
name = "stderr",
output = "stderr",
level = "error",
}, {
name = "http-access",
output = "/var/log/access.log",
level = "info",
},
]
loggers = [
{
name = "default",
handlers = ["default", "stderr"],
level = "warning",
}, {
name = "http-access",
handlers = ["http-access"],
level = "info",
},
]
Advantages:
Disadvantages:
Conclusion
I think ultimately it's a matter of taste what looks better, and a matter of tradeoffs between repeated keys, vertical space, line length, diff clarity, and logical vs. visually pleasant key ordering. My main point with this example is that it would be nice if users could choose what they find more important. |
I'm all for this. Just started to look into TOML properly for the first time as I was planning on using it for the configuration file for a tool I'm writing. I really like TOML overall, but this one thing makes some specific things really nasty. The bit I'm working on is actually sort of like the Docker Compose syntax in some ways. Take this YAML for example:
version: "3"
services:
  elasticsearch:
    container_name: metrics_elasticsearch
    image: docker.elastic.co/elasticsearch/elasticsearch:5.5.3
    network_mode: host
    environment:
      discovery.type: single-node
      http.cors.enabled: true
      http.cors.allow-origin: "*"
      xpack.security.enabled: false
    ports:
      - 9200:9200
      - 9300:9300
    volumes:
      - elasticsearch-data:/usr/share/elasticsearch/data
  kibana:
    container_name: metrics_kibana
    image: docker.elastic.co/kibana/kibana:5.5.3
    network_mode: host
    environment:
      ELASTICSEARCH_URL: http://localhost:9200
      XPACK_MONITORING_ENABLE: false
    ports:
      - 5601:5601
volumes:
  elasticsearch-data:
    driver: local
And then compare it to the equivalent TOML:
version = "3"
[services]
[services.elasticsearch]
container_name = "metrics_elasticsearch"
image = "docker.elastic.co/elasticsearch/elasticsearch:5.5.3"
network_mode = "host"
ports = [
"9200:9200",
"9300:9300"
]
volumes = [
"elasticsearch-data:/usr/share/elasticsearch/data"
]
[services.elasticsearch.environment]
"discovery.type" = "single-node"
"http.cors.enabled" = true
"http.cors.allow-origin" = "*"
"xpack.security.enabled" = false
[services.kibana]
container_name = "metrics_kibana"
image = "docker.elastic.co/kibana/kibana:5.5.3"
network_mode = "host"
ports = [
"5601:5601"
]
[services.kibana.environment]
ELASTICSEARCH_URL = "http://localhost:9200"
XPACK_MONITORING_ENABLE = false
[volumes]
[volumes.elasticsearch-data]
driver = "local"
That extra level of nesting just makes TOML that much less nice to use in this case. If the environment could be on the same level as the rest of the service configuration it'd tidy it right up.
version = "3"
[services]
[services.elasticsearch]
container_name = "metrics_elasticsearch"
image = "docker.elastic.co/elasticsearch/elasticsearch:5.5.3"
network_mode = "host"
environment = {
"discovery.type" = "single-node",
"http.cors.enabled" = true,
"http.cors.allow-origin" = "*",
"xpack.security.enabled" = false,
}
ports = [
"9200:9200",
"9300:9300"
]
volumes = [
"elasticsearch-data:/usr/share/elasticsearch/data"
]
[services.kibana]
container_name = "metrics_kibana"
image = "docker.elastic.co/kibana/kibana:5.5.3"
network_mode = "host"
environment = {
ELASTICSEARCH_URL = "http://localhost:9200",
XPACK_MONITORING_ENABLE = false,
}
ports = [
"5601:5601"
]
[volumes]
[volumes.elasticsearch-data]
driver = "local" |
At the heart of your issue, you have a large subtable that you wish to keep in the middle of your configurations. Not before it, and not after it. Some relief exists with inline tables and key-path assignments. But with a table nested a few layers deep, the keys would grow very long. I still find multiline tables that look like JSON offputting. But I think I have an idea for a TOML-friendly syntax that could get you what you're wanting. I don't have time to write it down now, but I'll be back later on. |
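For reference, the "key-path assignments" mentioned above can be sketched with the kibana service from the earlier example. This is only an illustration using dotted keys, which are valid in TOML 0.5 and later; the values are copied from the earlier comment, and nothing here is new syntax.

```toml
[services.kibana]
container_name = "metrics_kibana"
image = "docker.elastic.co/kibana/kibana:5.5.3"
network_mode = "host"
# Dotted keys keep the environment subtable in the middle of the service
# settings without opening a [services.kibana.environment] header.
environment.ELASTICSEARCH_URL = "http://localhost:9200"
environment.XPACK_MONITORING_ENABLE = false
ports = ["5601:5601"]
```

The trade-off is the one noted above: each line repeats the `environment.` prefix, and the prefix grows with every extra level of nesting.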
Hi @JelteF! Thanks for filing this issue. I'm deferring any new syntax proposal as I try to ramp up my effort to get us to TOML 1.0, which will not contain any new syntax changes from TOML 0.5. This is definitely an idea I want to explore more -- personally, I still haven't finalized how much TOML should be flat (INI-like) vs nested (JSON-like). Both approaches have their trade-offs and we'll know what we want to do for this specific request, once we finalize that overarching idea. However, I'd appreciate if we hold off that discussion until TOML 1.0 is released. |
The earlier example:
[services]
[services.kibana]
container_name = "metrics_kibana"
image = "docker.elastic.co/kibana/kibana:5.5.3"
network_mode = "host"
[services.kibana.environment]
ELASTICSEARCH_URL = "http://localhost:9200"
XPACK_MONITORING_ENABLE = false
Could instead be:
[services]
[.kibana]
container_name = "metrics_kibana"
image = "docker.elastic.co/kibana/kibana:5.5.3"
network_mode = "host"
[.environment]
ELASTICSEARCH_URL = "http://localhost:9200"
XPACK_MONITORING_ENABLE = false
The example just takes advantage of the dotted keys notation: if the key starts with a dot, it would inherit the parent table keyspace. I went with a dot as it has a related meaning for relative paths as well.
The above deals with the issue of table keys getting progressively longer, where the actual unique table name gets offset to the right (potentially requiring scrolling) and/or lost in the noise of similar table keys, as shown earlier in the thread. Personally, I find that for nested config the table keys or dotted keys can get quite long and repetitive. It's one area that I think JSON and YAML handle better.
@eksortso I take it "later on" never came, or did you raise it in another issue? What do you think about the above?
I did find it odd that inline tables have this special syntax for single lines, unable to break across multiple lines with trailing commas like arrays can. Most newcomers to TOML will be familiar with a table/object being defined this way and it'd click, until they realize it breaks should you want to go to multiple lines, yet arrays don't share this restriction.
I personally prefer curly brackets for additional clarification of scope. TOML appears to rely on namespacing, allowing for a flat format as long as you pay attention to the keys. Some try to indicate the scope a bit more via the optional indentation shown earlier, but that feels uncomfortable/detached to me.
I like that line endings don't need commas in TOML. Although they're required for arrays (and inline tables), could they be dropped or made optional for multi-line variants?
[services]
[.kibana]
container_name = "metrics_kibana"
image = "docker.elastic.co/kibana/kibana:5.5.3"
network_mode = "host"
environment = {
ELASTICSEARCH_URL = "http://localhost:9200"
XPACK_MONITORING_ENABLE = false
}
ports = [
"5601:5601"
]
[.kibana_2]
container_name = "metrics_kibana2"
image = "docker.elastic.co/kibana/kibana:5.5.3"
network_mode = "host"
environment = {
ELASTICSEARCH_URL = "http://localhost:9201"
XPACK_MONITORING_ENABLE = false
}
ports = [
"5602:5601"
]
This example from the project readme is a good case of verbosity/noise that made me do a double take trying to make sense of what was going on:
[[fruit]]
name = "apple"
[fruit.physical] # subtable
color = "red"
shape = "round"
[[fruit.variety]] # nested array of tables
name = "red delicious"
[[fruit.variety]]
name = "granny smith"
[[fruit]]
name = "banana"
[[fruit.variety]]
name = "plantain"
This is probably not much better, and might be asking for too much (does it stray too far from what TOML currently is?):
Applied to the earlier example for arrays of tables:
The use of Note the lack of assignment |
Oh, I didn't mean it that way! 😝
Ah, that's unfortunate.. 😞 I liked the multi-line approach you proposed, substituting commas with new lines. HJSON ended up offering a good enough solution for me offering this feature in the meantime. |
No worries. But there is a link to #525 up there.
Thanks! That's good how HJSON implemented it. I've seen similar patterns in other config formats, whose names I've forgotten. But keep in mind that HJSON is based on JSON, and TOML was originally inspired by informal INI formats. What that means, philosophically, is that nesting in TOML is possible, but deep nesting is, and ought to be, discouraged. By that philosophy, shallow nesting is ideal for a configuration format, and it also works for simple data exchange uses. Over time, I've come to adopt this philosophy myself. I'm still interested in bringing back a little bit of nesting, a la #551, but unless it gains traction, I won't push for it. Other proposals have been offered to use But there's another problem that Regarding commas in arrays and in inline tables, I do feel like the rules for placing those commas ought to be strict, to prevent confusion. It's already decided that arrays require commas between elements, and that a trailing comma is fine. For inline tables, commas must separate the key/value pairs, since they're on the same line. If #551 were reintroduced, newlines could be used to separate the key/value pairs in multi-line inline tables, same as they are used for regular tables. But commas would not be allowed between lines. I'm intrigued by some of your other proposals, particularly the |
Perhaps worth connecting this proposal with #744 as well, for use of placeholders/shortcuts to outer table names. Example:
[servers]
mask = "255.255.255.0"
[*.server1] # subtable 2
ip = "192.168.0.1"
[*.server2] # subtable 3
ip = "192.168.0.2"
The above is the same as explicit/verbose keys |
I agree to support line break. Yes, there exists a form that makes the final result look good and easy to read. But the problem is that the conversion tools and serde tools can’t do it. The conversion tool can only convert a long line of things that cannot be read. If line breaks are allowed, these tools can adjust the indentation to make the results look better. |
For me as a user, the fact that newlines aren't allowed inside inline tables was extremely surprising. For me as an implementer, that's a special case in the parser that I wish I could get rid of. I'm all for allowing it. |
I could say in response that the mere existence of inline tables is surprising, because the INI tradition only allows values on a single line, and only then it's just one key/value pair per line. Multiple lines are the exception, not the rule. And there are two other, more versatile, ways to define a table over multiple lines. Probably ought to go to #781 and join the discussion there. |
@eksortso It's a surprise within the TOML specification. If you see a file with an array with line breaks, it's quite reasonable to assume that all composite values can have line breaks in them, except it's not the case. Same goes for trailing commas. |
This is a good point. I'd support this change for that reason alone; it's a very weird inconsistency in the language. |
Everyone has forgotten that inline tables were intended to allow brief, terse injections of small tables into a configuration. They were never intended to replace table headers and sections, and they were never intended to extend beyond a single line. How consistent must we be? Consistent enough to nullify all intentional design choices? This is still a bad idea. I mean, it wouldn't be hard to implement. Our work is halfway done for us already, because we can reuse the ABNF code for splitting arrays across multiple lines. This would also let us include end-of-line comments. More consistent all around.
And while we're at it, let's allow commas between key/value pairs outside of inline tables, so we can have more than one key/value pair on a single line. This is also a bad idea, but it's consistency, and that's what we want.
Other benefits may come from this. If all headers were replaced with inline tables, then we could define top-level key/value pairs at the bottom of the document, or in the middle, because why not?
Consistency over design, consistency over functionality, consistency over readability, consistency over everything else. Where does it end? When TOML becomes a superset of JSON? Never mind the bitterness. Tell me what you think of these different ideas. Maybe you can put my fears to rest. |
But if we're going to smash this piñata to bits, let's stuff it with some more sweet treats. Once again, I propose we allow newlines to separate key/value pairs as well as commas, just like we can do outside of inline tables. That will make things even more consistent. And we can still have a comma before or after the newline if we wanted.
|
I don't think anyone has forgotten, they just disagree with the intentions behind the design. The design is not sacrosanct and it should not be treated as such.
This entire bug is debating over a specific intentional design decision, and the answer seems to be "at least a tiny bit more consistency than we have now." If TOML's primary goal was to make pretty configs the current design already does poorly when tasked with common config structures. Those examples are at least concise and consistent, even if they're intentionally ugly. TOML doesn't currently force end users to write good looking configs, and if it did it would have to be with parsers rejecting configs that don't follow some strictly mandated style.
The inability to back out of a regular table to the global scope is also a surprising pain point that has come up repeatedly, dictating the order of configuration options to applications. Just because a key/value is top level doesn't mean it's important, it can be much less important than the tables that would appear before it in other languages. I think the primary reason #551 failed to garner interest was because it would result in unexpected and surprising parsing errors for end users (as opposed to developers writing parsers). At least that was my problem with it. They will not realize or appreciate that there are two types of tables using |
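A minimal sketch of the scoping problem described above, with made-up key names (`title`, `server`, and `debug` are illustrative only): once a table header appears, every following key/value pair belongs to that table until the next header.

```toml
title = "example"      # root table: must appear before the first header

[server]
host = "localhost"
port = 8080

# There is no header that returns to the root table, so this key is
# server.debug, not a top-level debug.
debug = true
```

To make `debug` a top-level key in current TOML, it has to be moved above `[server]`.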
Literally nobody is saying that, but you know that. The inconsistency is dumb in this one particular context because it already causes regular, significant confusion for users. I'd also argue that "consistency over design" is conceptually nonsensical; good design is always internally consistent. TOML has two collection value types (as in, two ways of specifying a collection on the right hand side of a KVP assignment); both are comma-delimited, but only one allows newlines. This is internally inconsistent, enough that users are regularly caught out by it. |
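To make the inconsistency concrete, here is a small sketch against TOML 1.0 (the key names are made up for illustration). The array may span lines and end with a trailing comma; the equivalent inline table is shown commented out because a TOML 1.0 parser rejects it.

```toml
# Valid in TOML 1.0: newlines, comments, and a trailing comma inside an array.
ports = [
  "9200:9200",
  "9300:9300",  # trailing comma is fine
]

# Valid in TOML 1.0: an inline table on a single line, no trailing comma.
limits = { cpu = 2, memory = "512m" }

# Invalid in TOML 1.0: newlines and a trailing comma inside an inline table.
# limits = {
#   cpu = 2,
#   memory = "512m",
# }
```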
@pradyunsg I am delighted to hear that this is moving forward. Taking the wording for arrays from the spec and using it for inline tables, the spec for inline tables will be:
"Inline tables can span multiple lines. A terminating comma (also called a trailing comma) is permitted after the last value of the inline tables. Any number of newlines and comments may precede values, commas, and the closing bracket. Indentation between inline table values and commas is treated as whitespace and ignored."
My understanding then is that both the representations below will be valid. Please correct me if I am wrong.
[tool.pydoc-markdown.renderer]
type = "mkdocs"
mkdocs_config = {
site_name = "HDX Python Scraper"
}
pages = [
{ title = "Home"},
{
title = "API Documentation",
children = [
{
title = "Source Readers",
contents = [
"hdx.scraper.readers.*"
]
},
{
title = "Outputs",
contents = [
"hdx.scraper.jsonoutput.*",
"hdx.scraper.googlesheets.*",
"hdx.scraper.exceloutput.*"
]
}
]
}
]
[tool.pydoc-markdown.renderer]
type = "mkdocs"
mkdocs_config = {
site_name = "HDX Python Scraper",
}
pages = [
{ title = "Home", },
{
title = "API Documentation",
children = [
{
title = "Source Readers",
contents = [
"hdx.scraper.readers.*",
]
},
{
title = "Outputs",
contents = [
"hdx.scraper.jsonoutput.*",
"hdx.scraper.googlesheets.*",
"hdx.scraper.exceloutput.*",
],
},
],
},
] |
I just came across this in my project (my first using TOML) and I think my example illustrates why the current solutions just don't "feel" clean, even though they aren't really problematic per se. The starting point in my project was:
# Form 1
[layer.base]
name = 'Base Layer'
buttons = [
'open-test-layer',
'',
'',
'',
'reset',
'exit'
]
However, the buttons array is sparse and I didn't want to have to include empty entries. This is what I tried next, which seemed like a logical way to move from an array to a dict, and looks clean, but isn't currently allowed:
# Form 2
[layer.base]
name = 'Base Layer'
buttons = {
1 = 'open-test-layer'
5 = 'reset'
6 = 'exit'
}
The next version of course works, but with more than 1 or 2 buttons this would begin to completely fall flat in terms of readability:
# Form 3
[layer.base]
name = 'Base Layer'
buttons = { 1 = 'open-test-layer', 5 = 'reset', 6 = 'exit' }
And finally, what I've settled on (for now) as the best available option:
# Form 4 - okay
[layer.base]
name = 'Base Layer'
[layer.base.buttons]
1 = 'open-test-layer'
5 = 'reset'
6 = 'exit'
This is not bad, but I really don't like the duplication in the buttons array. For longer keys (and with multiple sub-tables) this could get quite tiring. I think my issues come down to two things:
And to add one more thing: the ending commas on arrays really seem like they could be optional - I don't believe it would introduce any ambiguity by not requiring them, but maybe others have some more well-researched thoughts on this. Regardless, I'm really liking TOML. It's a breath of fresh air after the feature-creep abomination that YAML has turned into. 🤣 |
@jstm88 Using only the existing syntax, your Form 4 can also be nicely written using dotted keys:
[layer.base]
name = 'Base Layer'
buttons.1 = 'open-test-layer'
buttons.5 = 'reset'
buttons.6 = 'exit'
(It would look even better if the subtable were named "button" instead of "buttons".) |
I have an idea for a single universal separation format. It incorporates the idea of newlines and trailing commas in inline tables, and much more. In a sense, it's a bound on the other extreme of this debate. Take a look at #903. |
I touched a bit on this in #903 (comment), but my main concern with this is generating quality error messages. For example:
Assuming we allow both newlines and trailing commas, in the first example we can generate a good error message: after
The second example is trickier; we left off the
Which is still okay-ish, I guess, but not great either. The difficulty here is that None of this is a show-stopper as far as I'm concerned, but I'm a huge fan of accurate error messages that say "here exactly is your error", rather than "here is where I encountered a parsing error, but your actual error is a few lines up". Currently, TOML allows almost entirely the first type of errors. |
@pradyunsg A while back, you observed that this is "just a matter of changing the ws, comment and comma handling for inline tables to be consistent with arrays," and you would file a PR. I'd like to expedite this. Do you have a PR started? Would you mind if I took a crack at it? I'm leaving #903 open for further discussion, but it's becoming apparent that this change needs to be made. We'll retain the need for commas as separators inside inline tables even if those tables span multiple lines, and we will allow a trailing comma. From the perspective of #903, this change could be seen as a precursor. But it's necessary now. |
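As a sketch of what that decision would mean in practice (assuming the change lands as described in this comment; TOML 1.0 parsers reject the multi-line forms below, and the keys are borrowed from the earlier kibana example purely for illustration):

```toml
# Would become valid: commas still separate the pairs, newlines are allowed,
# and a trailing comma after the last pair is permitted.
environment = {
  ELASTICSEARCH_URL = "http://localhost:9200",
  XPACK_MONITORING_ENABLE = false,
}

# Would remain invalid: newlines alone do not separate key/value pairs.
# environment = {
#   ELASTICSEARCH_URL = "http://localhost:9200"
#   XPACK_MONITORING_ENABLE = false
# }
```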
This backs out the unicode bare keys from toml-lang#891. This does *not* mean we can't include it in a future 1.2 (or 1.3, or whatever); just that right now there doesn't seem to be a clear consensus regarding to normalisation and which characters to include. It's already the most discussed single issue in the history of TOML. I kind of hate doing this as it seems a step backwards; in principle I think we *should* have this so I'm not against the idea of the feature as such, but things seem to be at a bit of a stalemate right now, and this will allow TOML to move forward on other issues. It hasn't come up *that* often; the issue (toml-lang#687) wasn't filed until 2019, and has only 11 upvotes. Other than that, the issue was raised only once before in 2015 as far as I can find (toml-lang#337). I also can't really find anyone asking for it in any of the HN threads on TOML. All of this means we can push forward releasing TOML 1.1, giving people access to the much more frequently requested relaxing of inline tables (toml-lang#516, with 122 upvotes, and has come up on HN as well) and some other more minor things (e.g. `\e` has 12 upvotes in toml-lang#715). Basically, a lot more people are waiting for this, and all things considered this seems a better path forward for now, unless someone comes up with a proposal which addresses all issues (I tried and thus far failed). I proposed this over here a few months ago, and the response didn't seem too hostile to the idea: toml-lang#966 (comment)
Overall I really like TOML and its syntax feels very obvious to me for the most part. The only thing that doesn't is the explicit crippling of inline tables, i.e. inline tables cannot have newlines or trailing commas. I've read the reasoning behind this in the existing issue and PR. However, I don't think that the reason given (discouraging people from using big inline tables instead of sections) outweighs the downsides. That's why I would like to open up a discussion about this.
There's three main downsides I see:
{}-style mappings allow newlines in them (JSON, Python, JavaScript, Go). Also, newlines and trailing commas are allowed in lists in the TOML spec, so it is inconsistent in this regard.
To the one with inline tables with newlines:
Finally, extending current TOML parsers to support this is usually really easy, so that also shouldn't be an argument against it. I changed the https://github.com/pelletier/go-toml implementation to support newlines in tables and I only had to change 5 lines to do it (3 of which I simply had to delete).