
Replace ConditionalWeakTable with simpler non-static Dictionary-based cache #2506

Merged
merged 16 commits on Sep 19, 2022

Conversation

habbes
Contributor

@habbes habbes commented Sep 9, 2022

Issues

This pull request addresses #2503

When OData Client is reading items from the reader, it stores some metadata as "annotations" in a cache using the SetAnnotation() extension method. It retrieves this metadata during materialization when converting the OData items into client CLR types. For the cache, it uses the custom InternalDictionary static class that is based on ConditionalWeakTable.

The ConditionalWeakTable has the following properties relevant to understanding the issue and the solution implemented in this PR:

  • Entries are held in the table by weak references (DependentHandle to be precise) and are not kept alive simply by being in the ConditionalWeakTable. If an entry's key is not referenced from outside the table, it will be eventually garbage collected even if the table is still alive.
  • The table uses reference equality to identify keys, and this behavior cannot be overridden (e.g. via custom GetHashCode, Equals or comparer implementations).
  • The ConditionalWeakTable is thread-safe. It uses locks to synchronize writes to the table.
  • The ConditionalWeakTable contains an internal Container class that actually holds the entries array. This class defines a destructor but is not IDisposable. When the array gets full, a new Container instance with more capacity is created and the old one is eventually garbage collected. The destructor also needs to acquire the lock to clean up properly and avoid data corruption.
  • The ConditionalWeakTable does not shrink in size. It can recycle the space of entries that have expired or been removed, but over time it either stays the same size or grows larger (by creating containers with larger capacities).
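The reference-equality behavior described above can be seen in a small standalone sketch (the Annotatable type here is hypothetical, used only for illustration):

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical key type for illustration; not an actual OData type.
class Annotatable
{
    public string Name;
}

class Program
{
    static void Main()
    {
        var table = new ConditionalWeakTable<Annotatable, string>();

        var a = new Annotatable { Name = "a" };
        var b = new Annotatable { Name = "a" }; // value-equal, but a different reference

        table.Add(a, "annotation-for-a");

        // Keys are compared by reference, so the value-equal instance
        // 'b' is not found; this behavior cannot be overridden.
        Console.WriteLine(table.TryGetValue(a, out _)); // True
        Console.WriteLine(table.TryGetValue(b, out _)); // False
    }
}
```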

OData Client creates static ConditionalWeakTable<ODataAnnotatable, T> instances; a different static instance is created for each type T that is used. MaterializerEntry is used as the value type for ODataResource keys, IEnumerable<ODataResource> as the value type for ODataResourceSet keys, and so on.
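As a sketch, the old arrangement looked roughly like this (the names and signatures approximate the InternalDictionary pattern described above; they are not copied from the actual source):

```csharp
using System;
using System.Runtime.CompilerServices;

// Simplified stand-in for the real ODataAnnotatable base class.
public class ODataAnnotatable { }

// One static ConditionalWeakTable per closed generic type T. Because the
// field is static, each table lives for the lifetime of the process and
// is shared by every concurrent request.
internal static class InternalDictionary<T> where T : class
{
    private static readonly ConditionalWeakTable<ODataAnnotatable, T> Table =
        new ConditionalWeakTable<ODataAnnotatable, T>();

    public static void SetAnnotation(ODataAnnotatable item, T annotation) =>
        Table.Add(item, annotation);

    public static bool TryGetAnnotation(ODataAnnotatable item, out T annotation) =>
        Table.TryGetValue(item, out annotation);
}

class Program
{
    static void Main()
    {
        var item = new ODataAnnotatable();
        InternalDictionary<string>.SetAnnotation(item, "metadata");
        InternalDictionary<string>.TryGetAnnotation(item, out string value);
        Console.WriteLine(value); // metadata
    }
}
```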

The fact that these tables are static means they never get garbage collected and are shared by all requests. This is why it's important that the tables are thread-safe. But it also causes efficiency and performance issues: when there are many concurrent requests, there are concurrent writes to the same ConditionalWeakTable, which results in:

  • Lock contention issues as different threads try to acquire locks in order to add entries.
  • More concurrent requests mean more cache entries are in use at the same time, so their space cannot be recycled. This increases the chances of the ConditionalWeakTable being resized.
  • Resizing the table not only allocates more memory; the old internal container also gets garbage collected, which means it ends up in the finalization queue and makes GC more expensive.
  • Because the Container's destructor needs to acquire a lock, it can be a victim (or cause) of lock contention and clog the finalization process.

This explains (at least in part) why the customer reported so much memory use and GC activity (up to an 11-12GB heap size). It is not sufficient, however, to explain the discrepancy between the heap allocations observed on .NET Core 3.1 (~1-3GB) and on .NET 6.0.

Description

This PR removes the ConditionalWeakTable-based static cache and replaces it with a new MaterializerCache. The new cache is based on a simple Dictionary<ODataAnnotatable, object> and is not static. Instead, it's created for each request and only remains alive for the duration of the request. This has the following benefits over the previous implementation:

  • No need for thread synchronization since each request has its own dictionary
  • Dictionary size does not grow indefinitely. Since each cache is short-lived, its memory is reclaimed once it is garbage collected.
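A minimal sketch of this design (a simplified approximation for illustration, not the actual MaterializerCache source; ReferenceEqualityComparer here is the comparer built into .NET 5+):

```csharp
using System;
using System.Collections.Generic;

// Simplified stand-in for the real ODataAnnotatable base class.
public class ODataAnnotatable { }

// Per-request cache: a plain, non-thread-safe dictionary. No locks are
// needed because each request owns its own instance, and the whole cache
// becomes garbage once the request completes.
internal sealed class MaterializerCache
{
    private readonly Dictionary<ODataAnnotatable, object> cache =
        new Dictionary<ODataAnnotatable, object>(ReferenceEqualityComparer.Instance);

    public void SetAnnotation<T>(ODataAnnotatable item, T annotation) where T : class =>
        cache.Add(item, annotation);

    public T GetAnnotation<T>(ODataAnnotatable item) where T : class =>
        cache.TryGetValue(item, out object value) ? (T)value : null;
}

class Program
{
    static void Main()
    {
        var cache = new MaterializerCache();
        var item = new ODataAnnotatable();
        cache.SetAnnotation(item, "metadata");
        Console.WriteLine(cache.GetAnnotation<string>(item)); // metadata

        // A different instance is a different key (reference equality),
        // matching the semantics of the old ConditionalWeakTable.
        Console.WriteLine(cache.GetAnnotation<string>(new ODataAnnotatable()) ?? "null");
    }
}
```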

The customers tested this implementation in a nightly build and reported that memory use dropped from 11-12GB to about 4.5-6GB (~50% improvement) and remains stable at that level. This is still higher than it was on .NET Core 3.1, however, so more investigation is needed.

When implementing this PR I had to make the following considerations:

  • Ensure the MaterializerCache is functionally equivalent to the old cache
  • At what point in the request/response pipeline the cache should be created
  • How to pass the cache along wherever it's needed

How MaterializerCache is functionally equivalent to the old cache implementation

The old cache was actually a handful of static caches. For example, in a given request, an <ODataResource, MaterializerEntry> pair would be stored in a different ConditionalWeakTable than an <ODataProperty, MaterializedPropertyValue> pair. So the ConditionalWeakTable<ODataAnnotatable, MaterializedPropertyValue> actually contains ODataProperty keys and values from different requests. The important thing that makes this work is that each OData item has only a single value: an ODataProperty instance is used only once as a key and exists in only one dictionary; you won't find an ODataProperty key in the ConditionalWeakTable<ODataAnnotatable, MaterializerEntry> dictionary. Otherwise, that would lead to incorrect behaviour.

Since each instance is only used as a key once, it's safe to have keys of different types in the same Dictionary<ODataAnnotatable, object>. We know that when we fetch the value for an ODataResource it will be a MaterializerEntry, and when we fetch the value for an ODataProperty it will be a MaterializedPropertyValue, since this is how OData Client already uses the cache. If these assumptions did not hold, a different design would be necessary. I have also ensured that the dictionary uses a reference-equality comparer.
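To illustrate the invariant (with hypothetical simplified stand-ins for the real OData types), mixing key types in one dictionary is safe because the caller always knows which value type it stored for a given key:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical simplified stand-ins for the real OData types.
public abstract class ODataAnnotatable { }
public sealed class ODataResource : ODataAnnotatable { }
public sealed class ODataProperty : ODataAnnotatable { }
public sealed class MaterializerEntry { }
public sealed class MaterializedPropertyValue { }

class Program
{
    static void Main()
    {
        // ReferenceEqualityComparer (built into .NET 5+) reproduces the
        // key semantics of ConditionalWeakTable.
        var cache = new Dictionary<ODataAnnotatable, object>(ReferenceEqualityComparer.Instance);

        var resource = new ODataResource();
        var property = new ODataProperty();

        // Different key types with different value types share one dictionary.
        cache[resource] = new MaterializerEntry();
        cache[property] = new MaterializedPropertyValue();

        // Each instance is used as a key exactly once, so the cast is safe:
        // a resource key always maps to a MaterializerEntry, and a property
        // key always maps to a MaterializedPropertyValue.
        var entry = (MaterializerEntry)cache[resource];
        var value = (MaterializedPropertyValue)cache[property];

        Console.WriteLine(entry != null && value != null); // True
    }
}
```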

Where to create the MaterializerCache

I noticed that the ODataMaterializer and similar classes accept or create an ODataMaterializerContext instance. Initially I created the MaterializerCache inside the context class and exposed it as a property. This worked fine for ExecuteAsync() requests but did not work well for SaveChangesAsync requests. The latter calls HandleOperationResponse at least twice, leading to two different materializer contexts, but expects to find data in the cache that was stored via a previous context. This caused tests to fail. For this reason, I moved the MaterializerCache to BaseAsyncResult. For context, QueryResult (which extends BaseAsyncResult) is created inside the Execute() or GetValue() methods of a DataServiceRequest object, so the cache's lifetime is limited to those methods. I don't know if there's a way to give it a shorter lifespan.

How to pass MaterializerCache around

Since the materializer cache is no longer static, I needed to refactor the methods that access it to accept it as an argument. For most methods, instead of passing the materializer cache directly, I passed the materializer context. This appeared more convenient and more robust to future changes, since we can add things to the context instead of adding more granular parameters. It also avoids the inconsistency of having to figure out when to pass the materializer context and when to pass something else. On the other hand, it has coupled some methods to the materializer context even though they only need access to the cache, and in some cases it forced me to create a new materializer context instance where one did not previously exist just so I could pass it around. If people believe this is not a good approach, I'm open to refactoring to reduce the places where I use the materializer context as a parameter (it still makes sense to keep the materializer cache as a property of the materializer context in classes that already had a dependency on the context).
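The parameter-threading trade-off can be sketched as follows (all signatures here are hypothetical illustrations, not the actual refactored methods):

```csharp
using System;
using System.Collections.Generic;

// Hypothetical simplified stand-ins for the real OData types.
public abstract class ODataAnnotatable { }
public sealed class ODataResource : ODataAnnotatable { }
public sealed class MaterializerEntry { }

// Hypothetical slice of the materializer context: methods receive the
// whole context rather than the bare cache, so future context-dependent
// needs can be met by adding members here instead of new parameters.
public interface IODataMaterializerContext
{
    Dictionary<ODataAnnotatable, object> MaterializerCache { get; }
}

public sealed class ODataMaterializerContext : IODataMaterializerContext
{
    public Dictionary<ODataAnnotatable, object> MaterializerCache { get; } =
        new Dictionary<ODataAnnotatable, object>(ReferenceEqualityComparer.Instance);
}

static class Materializer
{
    // Before (conceptually): a static cache meant no extra parameter.
    //   static MaterializerEntry GetEntry(ODataResource resource)
    // After: the context travels with the call.
    public static MaterializerEntry GetEntry(ODataResource resource, IODataMaterializerContext context) =>
        context.MaterializerCache.TryGetValue(resource, out object value)
            ? (MaterializerEntry)value
            : null;
}

class Program
{
    static void Main()
    {
        var context = new ODataMaterializerContext();
        var resource = new ODataResource();
        context.MaterializerCache[resource] = new MaterializerEntry();
        Console.WriteLine(Materializer.GetEntry(resource, context) != null); // True
    }
}
```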

Checklist (Uncheck if it is not completed)

  • Test cases added
  • Build and test with one-click build and test script passed

Additional work necessary

If a documentation update is needed, please add the "Docs Needed" label to the issue and provide details about the required document change in the issue.

@habbes habbes marked this pull request as ready for review September 15, 2022 05:40
@habbes habbes changed the title Replace ConditionalWeakTable with MaterializerContext.AnnotationsCache Replace ConditionalWeakTable with simpler non-static Dictionary-based cache Sep 15, 2022
corranrogue9
corranrogue9 previously approved these changes Sep 16, 2022
Contributor

@corranrogue9 corranrogue9 left a comment


I'm approving due to the urgent nature of this change, and because I find the design changes to be a significant improvement over what we currently have. I have not reviewed in detail.

@mikepizzo
Member

I haven't had time to do a line-by-line review, but I agree with the intent and design of this solution. Thanks, Clement, for the research and for putting a solution together so quickly, and thanks everyone who has jumped on to provide feedback and help turn this around.

mikepizzo
mikepizzo previously approved these changes Sep 16, 2022
Member

@mikepizzo mikepizzo left a comment


:shipit:

…er.cs

Co-authored-by: Elizabeth Okerio <elizaokerio@gmail.com>
@habbes habbes dismissed stale reviews from mikepizzo and corranrogue9 via 819ca96 September 17, 2022 16:09
gathogojr
gathogojr previously approved these changes Sep 17, 2022
Contributor

@gathogojr gathogojr left a comment


:shipit:

…ontext.cs

Co-authored-by: Kennedy Kang'ethe <kemunga@microsoft.com>
@habbes habbes dismissed stale reviews from ElizabethOkerio and gathogojr via 310e03b September 19, 2022 07:47
@pull-request-quantifier-deprecated

This PR has 485 quantified lines of changes. In general, a change size of up to 200 lines is ideal for the best PR experience!


Quantification details

Label      : Extra Large
Size       : +264 -221
Percentile : 82.83%

Total files changed: 40

Change summary by file extension:
.cs : +264 -221

Change counts above are quantified counts, based on the PullRequestQuantifier customizations.


Contributor

@gathogojr gathogojr left a comment


:shipit:

@habbes habbes merged commit a2bef7f into OData:master Sep 19, 2022
@habbes habbes deleted the eliminate-conditional-weak-table branch September 19, 2022 08:52
src/Microsoft.OData.Client/BaseAsyncResult.cs
/// <summary>
/// Cache used to store temporary metadata used for materialization of OData items.
/// </summary>
protected MaterializerCache materializerCache = new MaterializerCache();
Contributor


We probably want to expose this through the constructor rather than always using the default cache in case caches become configurable in the future

Contributor Author

@habbes habbes Oct 21, 2022


I think it's a good idea, but since this is an internal class, I think it's better to instantiate the cache internally until there's a need to make it configurable. This way, we treat it as an implementation detail of BaseAsyncResult that the caller doesn't need to know about. If we pass it in the constructor, the caller has to take responsibility for passing down the cache, and one could argue it should be configurable at that level too, all the way up to the DataServiceContext. But the buck has to stop somewhere since this isn't user input. I think BaseAsyncResult is a good place because it's what glues the request together, and this cache is scoped to a request.

src/Microsoft.OData.Client/BatchSaveResult.cs
src/Microsoft.OData.Client/QueryResult.cs
src/Microsoft.OData.Client/SaveResult.cs
src/Microsoft.OData.Client/SaveResult.cs