Cache DocumentUrl in PortablePdbSymbolReader #79804

MichalStrehovsky · 2022-12-19T05:19:12Z

This uses the exact same strategy as the unmanaged reader: a `Dictionary` with a `lock` around `this`. That wouldn't be my first choice, but we should use the same thing.

Without the cache, that was 100,000+ string allocations in a hello world.

Cc @dotnet/ilc-contrib
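
For readers who want the shape of the strategy described above, here is a minimal sketch of a `Dictionary` guarded by `lock (this)`. It is not the code from this PR: the `DocumentUrlCache` type, the `int` row id used as the key, and the `decodeDocumentName` delegate are all hypothetical stand-ins for whatever the symbol reader actually uses.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a symbol reader; not the actual PortablePdbSymbolReader code.
public sealed class DocumentUrlCache
{
    // Maps a document row id (assumed key type) to its already-decoded URL.
    private readonly Dictionary<int, string> _urlCache = new Dictionary<int, string>();

    // Hypothetical decoder that allocates a new string on every call.
    private readonly Func<int, string> _decodeDocumentName;

    public DocumentUrlCache(Func<int, string> decodeDocumentName)
    {
        _decodeDocumentName = decodeDocumentName;
    }

    public string GetDocumentUrl(int documentRowId)
    {
        // Locking on 'this' mirrors the strategy named in the description;
        // a private lock object would be the more common choice.
        lock (this)
        {
            if (!_urlCache.TryGetValue(documentRowId, out string url))
            {
                url = _decodeDocumentName(documentRowId);
                _urlCache.Add(documentRowId, url);
            }
            return url;
        }
    }
}
```

With this shape, repeated lookups for the same document return the cached string instance, so the decoder runs once per document instead of once per lookup.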

ghost · 2022-12-19T05:19:36Z

Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas
See info in area-owners.md if you want to be subscribed.

Author:	MichalStrehovsky
Assignees:	MichalStrehovsky
Labels:	`area-NativeAOT-coreclr`
Milestone:	-

jkotas · 2022-12-19T05:28:04Z

> That was 100,000+ string allocations in a hello world:

Why isn't `CachingMetadataStringDecoder` taking care of caching these string allocations?

MichalStrehovsky · 2022-12-19T05:35:55Z

> > That was 100,000+ string allocations in a hello world:
>
> Why isn't `CachingMetadataStringDecoder` taking care of caching these string allocations?

We call this API:

runtime/src/libraries/System.Reflection.Metadata/src/System/Reflection/Metadata/MetadataReader.cs

Lines 1352 to 1355 in a92c5bc

    
    public string GetString(DocumentNameBlobHandle handle)
    {
        return BlobHeap.GetDocumentName(handle);
    }

Which calls this:

runtime/src/libraries/System.Reflection.Metadata/src/System/Reflection/Metadata/Internal/BlobHeap.cs

Lines 185 to 213 in a92c5bc

    
    public string GetDocumentName(DocumentNameBlobHandle handle)
    {
        var blobReader = GetBlobReader(handle);

        // Spec: separator is an ASCII encoded character in range [0x01, 0x7F], or byte 0 to represent an empty separator.
        int separator = blobReader.ReadByte();
        if (separator > 0x7f)
        {
            throw new BadImageFormatException(SR.InvalidDocumentName);
        }

        var pooledBuilder = PooledStringBuilder.GetInstance();
        var builder = pooledBuilder.Builder;
        bool isFirstPart = true;
        while (blobReader.RemainingBytes > 0)
        {
            if (separator != 0 && !isFirstPart)
            {
                builder.Append((char)separator);
            }

            var partReader = GetBlobReader(blobReader.ReadBlobHandle());
            builder.Append(partReader.ReadUTF8(partReader.Length));
            isFirstPart = false;
        }

        return pooledBuilder.ToStringAndFree();
    }

The other `GetString` overloads pass the `UTF8Decoder` along:

runtime/src/libraries/System.Reflection.Metadata/src/System/Reflection/Metadata/MetadataReader.cs

Lines 1060 to 1063 in a92c5bc

    
    public string GetString(StringHandle handle)
    {
        return StringHeap.GetString(handle, UTF8Decoder);
    }

MichalStrehovsky · 2022-12-19T05:38:25Z

Digging into it more, the problem seems to be that the caching string decoder caches things by pointer, but these document names are assembled from several blob parts joined by a separator, so each call builds a fresh string.
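
To illustrate the point about composed names, the sketch below mirrors the loop in `GetDocumentName` using plain strings instead of blob readers. The `DocumentNameSketch` type and its `Compose` method are hypothetical and exist only for illustration; the parts are assumed to be already decoded.

```csharp
using System;
using System.Text;

public static class DocumentNameSketch
{
    // Joins already-decoded parts with the separator.
    // A separator of '\0' stands for the "empty separator" case from the spec comment.
    public static string Compose(char separator, string[] parts)
    {
        var builder = new StringBuilder();
        bool isFirstPart = true;
        foreach (string part in parts)
        {
            if (separator != '\0' && !isFirstPart)
            {
                builder.Append(separator);
            }

            builder.Append(part);
            isFirstPart = false;
        }

        // Every call materializes a new string, even for identical input.
        return builder.ToString();
    }
}
```

Calling `Compose` twice with the same parts yields two equal but distinct string instances, which is why a cache keyed on the final result (or on the document handle) is needed to avoid the repeated allocation.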

jkotas

LGTM!

Cache DocumentUrl in PortablePdbSymbolReader

15179b1

This uses the exact same strategy as the unmanaged reader (`Dictionary` with a `lock` around `this`).

dotnet-issue-labeler bot added the area-crossgen2-coreclr label Dec 19, 2022

ghost assigned MichalStrehovsky Dec 19, 2022

MichalStrehovsky added area-NativeAOT-coreclr and removed area-crossgen2-coreclr labels Dec 19, 2022

build-analysis bot mentioned this pull request Dec 19, 2022

Precondition failure: File has not had execution verified #79439

Closed

jkotas approved these changes Dec 19, 2022

jkotas merged commit 6e69214 into dotnet:main Dec 19, 2022

MichalStrehovsky deleted the docurl branch December 19, 2022 18:17

ghost locked as resolved and limited conversation to collaborators Jan 18, 2023
