[cdac] Implement ISOSDacInterface::GetPEFileName in cDAC #106358
Conversation
Tagging subscribers to this area: @tommcdon
```csharp
while (true)
{
    // Read characters until we find the null terminator
    char nameChar = _target.Read<char>(addr);
    if (nameChar == 0)
        break;

    name.Add(nameChar);
    addr += sizeof(char);
}
```
I would consider doing this slightly differently: determine the entire length first by scanning for the null terminator, then reread the entire block of bytes in a single operation. I think that would be slightly easier to debug. I also don't know if endianness is important here. Metadata is always encoded as UTF-16LE, but I'm not sure about non-metadata allocated strings.
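The two-pass approach suggested here can be sketched as follows. This is a hypothetical, language-neutral Python model (the actual cDAC code is C#): a `bytes` buffer stands in for target memory, and `read_utf16_string` stands in for a helper built on the `Target` read primitives.

```python
def read_utf16_string(memory: bytes, addr: int) -> str:
    """Two-pass read: find the 16-bit null terminator, then decode the block."""
    # Pass 1: determine the length by scanning for the two-byte null.
    end = addr
    while memory[end] != 0 or memory[end + 1] != 0:
        end += 2
    # Pass 2: reread the entire block of bytes in a single operation.
    return memory[addr:end].decode("utf-16-le")

# Simulated target memory containing "clr.dll" as UTF-16LE at offset 4.
mem = b"\x00" * 4 + "clr.dll".encode("utf-16-le") + b"\x00\x00"
print(read_utf16_string(mem, 4))  # → clr.dll
```

The debugging benefit is that the string's address and byte length are known before any decoding happens, so a bad read fails in one obvious place rather than partway through a character loop.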
non-metadata allocated strings would be big endian on a big endian machine
I think this implementation is ... fine... but it would be better if we added an API to Target to read null-terminated strings of both 16-bit and 8-bit char types in target endianness. This is a thing which keeps coming up, and we shouldn't be reinventing the wheel again and again.
> non-metadata allocated strings would be big endian on a big endian machine
Would they even be utf-16 at all?
(To be clear, I think this is kind of academic until someone sits down and does a CoreCLR port to POWER64 and actually makes some decision about what an LPCWSTR even is in that platform's version of the CoreCLR PAL. I don't know what a reasonable answer might be. /cc @uweigand @directhex)
Do we want to stash an encoding somewhere in the runtime descriptor?
> but it would be better if we added an api to Target to read null terminated strings of both 16bit and 8 bit char types in target endianness.
It is more complicated than that. We need to know whether the target pointer points into metadata, which is always LE. I was going to suggest the same thing, but the metadata angle makes the correct answer less obvious.
Even with the metadata angle, I think having APIs on Target reading strings in target endianness still makes sense.
> We need to know if the target pointer is into metadata, which is always LE.
And then it is up to the caller / consumer of the target pointer to know that it shouldn't be read in target endianness and to read it their own way?
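To make the endianness concern concrete, here is a small Python illustration (not cDAC code) of why a reader must know the byte order before decoding: the same characters produce different byte sequences in UTF-16LE and UTF-16BE, and decoding with the wrong order yields different characters rather than an error.

```python
s = "PE"
le = s.encode("utf-16-le")   # b'P\x00E\x00'
be = s.encode("utf-16-be")   # b'\x00P\x00E'
assert le != be

# Decoding BE bytes as LE silently produces the wrong string, not an exception.
wrong = be.decode("utf-16-le")
assert wrong != s
```

So a `Target`-level API fixed to one byte order would misdecode strings from the other without any visible failure, which is why the caller (or the API) has to know the source.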
> I think having APIs on Target reading strings in target endianness still makes sense.
Sure, I agree with this. My point is, it isn't as simple as "read the string". In fact it can become very complicated if the API doesn't know the encoding. The API needs to know the encoding a priori or else it won't be able to detect the null terminator properly: is a zero byte the high-order byte of a UTF-16 code unit or an actual null in UTF-8? The string-reading API needs to know. The endianness question for UTF-16 is easier, but it can still confuse the caller: the API said it read UTF-16, but now the caller must consider the source and determine whether it is in machine-endian form or metadata-endian form. Three APIs would be needed.
It would need to be something like:

```csharp
string ReadUTF8String();
string ReadUTF16TargetEndianness();
string ReadUTF16MetadataEndianness();
```
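A hypothetical Python model of those three readers, with a `bytes` buffer standing in for target memory and a `target_big_endian` flag standing in for information the real `Target` would already have. The names mirror the proposed C# signatures but everything here is an illustrative sketch, not the cDAC implementation.

```python
def _scan(memory: bytes, addr: int, char_size: int) -> bytes:
    """Return the bytes up to (not including) the first char_size-wide null."""
    end = addr
    while any(memory[end:end + char_size]):
        end += char_size
    return memory[addr:end]

def read_utf8_string(memory: bytes, addr: int) -> str:
    # 8-bit chars: endianness is irrelevant, but the null width (1 byte) matters.
    return _scan(memory, addr, 1).decode("utf-8")

def read_utf16_target_endianness(memory: bytes, addr: int,
                                 target_big_endian: bool) -> str:
    # Non-metadata strings follow the target machine's byte order.
    encoding = "utf-16-be" if target_big_endian else "utf-16-le"
    return _scan(memory, addr, 2).decode(encoding)

def read_utf16_metadata_endianness(memory: bytes, addr: int) -> str:
    # Metadata strings are always UTF-16LE, regardless of the target machine.
    return _scan(memory, addr, 2).decode("utf-16-le")
```

Note how the null-detection width and the decode step both depend on knowing the encoding up front, which is the point made above: a single generic "read string" API can't choose either one correctly on its own.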
I think @AaronRobinsonMSFT's APIs here are good enough for now. As @lambdageek points out, dealing with actual big-endian machines is quite unlikely in the near term; we can get away with a "reasonable" model now and tweak it to match reality if/when such a machine happens. I also don't think the refactor to a Target-level API should happen in this PR; it should be deferred to a follow-up PR focused on making the string handling sensible.
Yeah, I was expecting multiple APIs, like what @AaronRobinsonMSFT wrote. I put in a TODO for adding/using Target APIs and will do a follow-up change for that.
Uses GetPath in the Loader contract. Contributes to #99302