Fix issue #432 #590

danielzuncke · 2023-10-17T02:20:51Z

Overview:
Array astInformation.structInitEndLocations is used to find index to use in astInformation.indentInfoSortedByEndLocation.
Both are sorted, the first one is used to find the array index, the second one is accessed at that index to get brace indentation information.

Problem:
structInitEndLocations is generated from struct initializers exclusively while the brace information array also contains entries for function literal initializers. Thus when function literal init(s) are used, we get accumulating off by one errors in the second array:

match value in structInitEndLocations and take that index:

[3, 50]
    ^--> index 1

take brace indent information from that index (represented by the same offset values .endLocation):

[3, 15, 50]
    |   ^--> the one we want
    ^------> the one we get (function literal init)

If I missed something please let me know; also I am not 100% sure that I got the allman, knr, otbs examples right, but they look reasonable and throw no errors.

Better search strategies than linear are possible, this should be enough for any sane and most of the insane source files.

Overview: Array astInformation.structInitEndLocations is used to find index to use in astInformation.indentInfoSortedByEndLocation. Both are sorted, first is used to find the array index, second is accessed at that index to get brace indentation information. Problem: structInitEndLocations is generated from struct initializers exclusively while the brace information array also contains entries for function literal initializers. Thus when function literal init(s) are used, we get accumulating off by one errors in the second array: match value in structInitEndLocations and take that index: [3, 50] ^--> index 1 take brace indent information from that index: [3, 15, 50] | ^--> the one we want ^------> the one we get (function literal init) Solution: This guarantees that searching forward works. While better search strategies than linear are possible, this should be enough for any sane and most of the insane code files.

WebFreak001

can you also include the case where there is a lambda inside the array itself in the test cases? e.g. add

immutable S s = {
	1111111111111111111,
	1111111111111111111,
	(x) {
		return x + 1111;
	},
};

as well

src/dfmt/formatter.d

I have formatted them both to look like the the individual bad output and not like the actual output from formatting the issue0432.d file with the current implementation. The current implementation makes it look like a new / different problem.

Using different variable to iterate to guarantee unchanged behaviour if anything doesn't work as intended.

danielzuncke · 2023-10-18T01:04:05Z

I have found some problems with < too, before I clutter up the commit history,
what do you think of this:

// Account for possible function literals in this array which offset
// the previously set index (pos). Fixes issue #432.
size_t newPos = pos;
while (newPos < astInformation.indentInfoSortedByEndLocation.length &&
    astInformation.indentInfoSortedByEndLocation[newPos].endLocation < tokens[index].index)
{
    newPos++;
}

if (newPos < astInformation.indentInfoSortedByEndLocation.length &&
    astInformation.indentInfoSortedByEndLocation[newPos].endLocation == tokens[index].index)
{
    pos = newPos;
}
else
{
    debug
    {
        /+ Provide debug information of what the broken assumption was:
        (todo)

        assert(isSorted(structInitEndLocation), msg1);
        assert(isSorted(indentInfoSortedByEndLocation), msg2);
        assert(isSubsetOf(structInitEndLocation, indentInfoSortedByEndLocation), msg3);

        Alternatively write to stderr and continue execution.
        +/
        
    }
}

This gives us

no out of bounds errors
fast fail
no change in behaviour compared to current implementation
fail silently on errors but complain if compiled with debug switch
- would you prefer asserts or writing to stderr and continue?

Additionally it's easy to see why all of the above are true without having to
look up any behaviour which may not be documented clearly.

!= and < should never fail for the same reasons, but if we expect
one can fail, both can (as far as I can tell, require same expected conditions
to hold to guarantee success in all cases).

WebFreak001 · 2023-10-18T02:31:36Z

I don't think logging the broken assumptions is necessary here, just a comment saying that "broken AST could have messed up the location check above" should be enough, since in the correct case it should all work and in the broken AST case we will get new issues on the issue tracker and the comment might or might not help identifying if that is really the part that's broken.

The current code looks good to me, what kind of issues did you find with it that you would think make it not ready to be merged? Those would be great to have as test cases.

danielzuncke · 2023-10-18T07:42:10Z

Sorry I was unclear, can merge as is.

The explanation got long so I removed it in the previous message. And explaining it here in full was so much worse, so without details:

Pessimistically the argument can be made that if you are right and != can lead to out of bounds, a precondition I have taken from the code is broken which also that also breaks < unless further restricted again. Disproving this / confirming <, requires to make a case everywhere leading up to the change that the preconditions are assumed correctly and stay that way / do they weaken in a way to keep < correct.
Suggesting to compare against .length was the shortcut once the previous explanations got too long.

But I am confident both do exactly the same, you are confident that at least one works, so lets just take it.

WebFreak001 requested changes Oct 17, 2023

View reviewed changes

src/dfmt/formatter.d Outdated Show resolved Hide resolved

danielzuncke added 2 commits October 17, 2023 17:56

Update test to include both known cases

979c74d

I have formatted them both to look like the the individual bad output and not like the actual output from formatting the issue0432.d file with the current implementation. The current implementation makes it look like a new / different problem.

Confirm token equality, safer loop

4a08417

Using different variable to iterate to guarantee unchanged behaviour if anything doesn't work as intended.

WebFreak001 approved these changes Oct 18, 2023

View reviewed changes

WebFreak001 merged commit 08fe5d6 into dlang-community:master Oct 18, 2023
1 check passed

WebFreak001 mentioned this pull request Oct 22, 2023

Anonymous function in struct breaks indentation #432

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix issue #432 #590

Fix issue #432 #590

danielzuncke commented Oct 17, 2023

WebFreak001 left a comment

danielzuncke commented Oct 18, 2023

WebFreak001 commented Oct 18, 2023

danielzuncke commented Oct 18, 2023

Fix issue #432 #590

Fix issue #432 #590

Conversation

danielzuncke commented Oct 17, 2023

WebFreak001 left a comment

Choose a reason for hiding this comment

danielzuncke commented Oct 18, 2023

WebFreak001 commented Oct 18, 2023

danielzuncke commented Oct 18, 2023