Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Preserve dotted-key ordering #808

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

mxndtaylor
Copy link

@mxndtaylor mxndtaylor commented Nov 10, 2024

Fixes #163

Based on @epage's WIP work from #165.

Admittedly not the most efficient solution as it adds another Vec::sort_by_key to the Table::append_values and InlineTable::append_values functions, sorting the Vecs after the values have just been inserted into them.

The slightly more efficient Vec::sort_by_key is not necessarily possible since it might move non-dotted keys all together, thus again breaking the original document's order. I was incorrect about this, and have used Vec::sort_by_key instead

A better solution might be to insert the values into the values Vecs in the intended order, but this would be much more involved.

Note:
This change failed some tests from the edit functionality, specifically, Table::{sort_values, sort_values_by} and the InlineTable versions. So, I created a targeted fix for that, where it just sets the Key.position property to be the index in the iterator over its keys.

The implication here is that any future changes that involve key re-orderings on Tables will have to modify Key.position value, instead of just their order in the items: IndexMap.

In fact, because of this, we could remove the ordering of the IndexMap completely now, and instead rely on the position value, thereby reducing the chances of future confusion.

Without a more comprehensive change like that, I have a feeling there might be an untested use case here somewhere, if something else I'm not aware of is modifying the Key order in Table.items or InlineTable.items, then this will now fail.

crates/toml_edit/src/key.rs Fixed Show fixed Hide fixed
crates/toml_edit/src/key.rs Fixed Show fixed Hide fixed
crates/toml_edit/src/key.rs Fixed Show fixed Hide fixed
crates/toml_edit/src/key.rs Fixed Show fixed Hide fixed
crates/toml_edit/src/table.rs Fixed Show fixed Hide fixed
crates/toml_edit/src/table.rs Fixed Show fixed Hide fixed
@coveralls
Copy link

coveralls commented Nov 10, 2024

Pull Request Test Coverage Report for Build 11785987202

Details

  • 27 of 32 (84.38%) changed or added relevant lines in 6 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.2%) to 68.249%

Changes Missing Coverage Covered Lines Changed/Added Lines %
crates/toml_edit/tests/testsuite/edit.rs 2 3 66.67%
crates/toml_edit/src/inline_table.rs 1 3 33.33%
crates/toml_edit/src/key.rs 7 9 77.78%
Totals Coverage Status
Change from base Build 11667065238: 0.2%
Covered Lines: 3781
Relevant Lines: 5540

💛 - Coveralls

@mxndtaylor
Copy link
Author

Just to add a note about the additional changes, I've switched the Vec::sort_by to Vec::sort_by_key using usize::MAX for the key when position=None.

Originally, I used Vec::sort_by because I thought that any constant value for position=None would cause problems as it wouldn't consider the insertion order, and just move those Keys to the end of the output. And this is largely true, however:

  1. every item parsed by this library will have position=Some
  2. all items manually constructed with this library will have position=None by default
  3. returning Ordering::Equal from Vec::sort_by for comparing position=Some to position=None does not result in a preserving of the order (nor does returning any of the other Ordering values)

If the Key objects are either all from the parser or all manually constructed, then the original Vec::sort_by_key and the previous Vec::sort_by implementations result in identical output.

If the Key objects are mixed, where some are from the parser and some are manually constructed, we would get behavior like this:

let doc = r#"foo.bar = 1

foo.baz.qwer = "a"
goodbye = { forever="no" }

foo.pub = 2
foo.baz.asdf = "b"
"#.parse<MutDocument>().unwrap();

let root = doc.as_table_mut();
root["foo"]["baz"]["zxcv"] = "CCC"
assert_data_eq!(doc.to_string(), r#"foo.bar = 1

foo.baz.qwer = "a"
foo.baz.asdf = "b"
foo.baz.zxcv = "CCC"
goodbye = { forever="no" }

foo.pub = 2
"#);

where foo.baz.asdf and foo.baz.qwer were moved when we made no modification to them. The fix is to consider position=None as simply "greater" than position=Some which can more easily be implemented with Vec::sort_by_key.


The only remaining problem is a somewhat odd one, that would not normally come up, which is to merge tables by iterating through the second table's keys manually like so:

let mut origin_doc = r#"
foo = 1
bar = 2
"#.parse::<DocumentMut>().unwrap();
let mut additional_doc = r#"
baz = 3
qux = 4
"#.parse::<DocumentMut>().unwrap();

let origin_table = origin_doc.as_table_mut();

// if we use origin_table.extend(addition_doc.iter()) this does not occur
// it only occurs when we directly get the key from its parsed 
additional_doc.iter().for_each(|(k, v)| {
    // we get the actually `Key` here with its position
    let (key, _) = additional_doc.get_key_value(k).unwrap();
    origin_table.insert_formatted(key, v.clone());
});

// baz (with position=Some(0) from the second table) gets inserted before 
// bar (with position=Some(1) from the first table)
assert_data_eq!(origin_doc.to_string(), r#"
foo = 1
baz = 3
bar = 2
qux = 4
"#);

But as noted, the more canonical process of using Table.extend(Table.iter()) does not have this problem, so I think this is fine.

crates/toml_edit/src/table.rs Fixed Show fixed Hide fixed
crates/toml_edit/src/table.rs Fixed Show fixed Hide fixed
crates/toml_edit/src/table.rs Fixed Show fixed Hide fixed
crates/toml_edit/src/table.rs Fixed Show fixed Hide fixed
crates/toml_edit/src/table.rs Fixed Show fixed Hide fixed
crates/toml_edit/src/table.rs Fixed Show fixed Hide fixed
@mxndtaylor
Copy link
Author

not really sure if that comment should be a doc or not, maybe I should have just made it into a non-doc comment, but I just did what the linter said to instead, so hopefully that fixes it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, because of this, we could remove the ordering of the IndexMap completely now, and instead rely on the position value, thereby reducing the chances of future confusion.

Table and InlineTable have the semantics of an IndexMap. We need to preserve the order of content as it gets inserted. We could set a position on insertion.

Also, I'm unsure about allowing key positions to be moved between tables because the position is dependent on the other values within the same table and you can't easily translate that from one table to the next.

@epage
Copy link
Member

epage commented Nov 12, 2024

Could you clean up the commits for how you would want me to review and merge this?

My recommendation is to split up commits so that

  • The first commit is a test (that passes), showing the current behavior
  • Any code changes from the fix you can pull out into refactoring commits
  • The actual fix which would also include updating the test to now pass

@epage
Copy link
Member

epage commented Nov 12, 2024

If the Key objects are mixed, where some are from the parser and some are manually constructed, we would get behavior like this:

This is a very common case and is one the motivating reasons for this crate. Grouping dotted keys, rather than treating a dotted key as something to add at the end, is a more natural behavior. We currently get it for free. We don't always guarantee a more natural way of doing things though. However, if we were to change it, it would likely be a breaking behavior change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Dotted key ordering isn't always preserved
3 participants