Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: redesign StyledString decoding #2816

Merged
merged 3 commits into from
May 23, 2022
Merged

fix: redesign StyledString decoding #2816

merged 3 commits into from
May 23, 2022

Conversation

IgorEisberg
Copy link
Contributor

Fixes #2815 by using recursive writing of nested tags instead of relying on questionable sorting methods.

@IgorEisberg
Copy link
Contributor Author

Somewhat inspired by this:
https://cs.android.com/android-studio/platform/tools/base/+/mirror-goog-studio-main:resource-repository/main/java/com/android/resources/aar/ProtoStyledStringDecoder.java;l=28?q=StyledString

But using recursion allows for cleaner code and easy handling of self-closing tags which neatly outputs pairs like <br></br> as <br/> which is the XHTML standard.

@iBotPeaches
Copy link
Owner

Interesting the only test failing in Isolation in some setups is the test that was added that led to these regressions.

Thanks for PR. Slowly going through w/ debugger to understand.

@IgorEisberg
Copy link
Contributor Author

Yeah I'm not sure what's going on with that test, tried compiling/decompiling with all of the strings in that XML and the output was identical, both with aapt1 and aapt2. Very strange.

@IgorEisberg
Copy link
Contributor Author

IgorEisberg commented May 22, 2022

Looks like an issue with aapt1. When the APK is compiled with it, spans are not stored in the same order as with aapt2.
So for string:
<string name="test_string43"><ul> <li><b>aaaaa aa aaaaa</b> – aaaaaaa aaaaaaaaaa aa aaaaaaaa aaaaaa aaaaa (aaaa) aaaa aaaaaaaaa aaaaa aa aaaaaaaaa aaaaaaa aaaa</li> <li><b>aaaaaaaaa aaaaaaaaaaaaaaa aaaaaaaa</b> – aaaaaaa aaaaaaaaaa aaaaaaaaa aaaa aaaaaa aa aaaa aaaa aaaa aaaa aaaaaaa aaaaaaaaaaaaa, aaaaaa aaa aaaaaaaaaa (aaa) aaaaaaaaaaaaaaa</li> <li><b>aaaaaaaaaaa aaaaaa aaaaaaaaaa</b> – aaaaaaaaaa aaaa aaaaaa aa a aaa aa aaaaaa aaa aaaaaaa (aaaaa aaaaaaaa) aaaaaaaa aa aaaaaa aaaaa aaa aaaa</li> </ul></string>

Spans when compiled with AAPT1:

0: b (1-14)
1: li (1-114)
2: b (116-149)
3: li (116-282)
4: b (284-312)
5: li (284-419)
6: ul (0-420)

Spans when compiled with AAPT2:

0: ul (0-420)
1: li (1-114)
2: b (1-14)
3: li (116-282)
4: b (116-149)
5: li (284-419)
6: b (284-312)

@IgorEisberg
Copy link
Contributor Author

Due to the above span ordering discrepancy between aapt1/2 I had to write a simple sorting function.
Would be more robust if we could somehow detect which aapt was used to build a given resources.arsc, but this is serviceable for most cases, although cases like:
... </li><br/><br/><li><b>Title</b> ...
Would decompile to something like:
... </li><li><b><br/><br/>Title</b> ...
It's an unusual case though...

@iBotPeaches
Copy link
Owner

Thanks for the large investigation and fix. Will try and visit again tomorrow to step through it as well for my own understanding.

@IgorEisberg
Copy link
Contributor Author

By the way, I tried with Google's own ProtoStyledStringDecoder, and it also just handles AAPT2's spans (ordered by opening tag?) properly, while AAPT1's spans (ordered by closing tag?) are heavily misplaced. I don't even know of a single piece of code that decodes AAPT1's spans properly, let alone code that supports both orderings...

@iBotPeaches
Copy link
Owner

My guess is Google can just always depend on aapt2 in Studio prior to bundletool being used. I think every month we see less and less applications being disassembled via aapt1. Of course ignoring legacy apps.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] Incorrect order of HTML tags in strings
2 participants