Cache container elements in ODataUriResolver model elements cache #2623

habbes · 2023-02-22T04:08:38Z

Issues

This pull request fixes #2534.

Description

This is a follow-up to PR #2610. It adds container elements (operation imports and navigation sources) to the case-insensitive cache used during case-insensitive URI resolution.

Checklist (Uncheck if it is not completed)

Test cases added
Build and test with one-click build and test script passed

Additional work necessary

If documentation update is needed, please add "Docs Needed" label to the issue and provide details about the required document change in the issue.

xuzhg · 2023-02-22T17:20:07Z

src/Microsoft.OData.Core/UriParser/Resolver/NormalizedModelElementsCache.cs


        /// <summary>
        /// Builds a case-insensitive cache of schema elements from
        /// the specified <paramref name="model"/>.
        /// </summary>
        /// <param name="model">The model whose schema elements to cache. This model should be immutable. See <see cref="ExtensionMethods.MarkAsImmutable(IEdmModel)"/>.</param>
-        public NormalizedSchemaElementsCache(IEdmModel model)
+        public NormalizedModelElementsCache(IEdmModel model)


Why can't we do the cache in the 'EdmModel'?

What do you mean by cache in the EdmModel? Does the EdmModel have a cache?

Also, in this case we're dealing with the IEdmModel interface, so we can't guarantee whether it's an EdmModel, a CsdlSemanticsModel or some other implementation.

I've checked around other users in the codebase and in WebAPI, and any time some metadata is added to the model, it's added as an "annotation". In this case this cache will also be an annotation that's bound to the model. It's not a static cache.
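To illustrate the difference between a cache bound to the model and a static cache, here is a minimal sketch. It does not use the real IEdmModel annotation API: SketchModel, SketchElementsCache and SketchCacheLookup are hypothetical stand-ins invented for this example, and the sketch makes no attempt at thread safety.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a model that can carry annotations.
// The real IEdmModel annotation API is not reproduced here.
public sealed class SketchModel
{
    private readonly Dictionary<Type, object> annotations = new Dictionary<Type, object>();

    public bool TryGetAnnotation<T>(out T value) where T : class
    {
        if (annotations.TryGetValue(typeof(T), out object stored))
        {
            value = (T)stored;
            return true;
        }

        value = null;
        return false;
    }

    public void SetAnnotation<T>(T value) where T : class
    {
        annotations[typeof(T)] = value;
    }
}

// Hypothetical cache type; its contents are irrelevant to this sketch.
public sealed class SketchElementsCache
{
    public SketchElementsCache(SketchModel model)
    {
        Model = model;
    }

    public SketchModel Model { get; }
}

public static class SketchCacheLookup
{
    // Returns the cache attached to this model, building it on first use.
    // The cache is stored on the model itself, so each model gets its own
    // instance and nothing is held in a static field.
    public static SketchElementsCache GetOrCreate(SketchModel model)
    {
        if (model == null)
        {
            throw new ArgumentNullException(nameof(model));
        }

        if (!model.TryGetAnnotation(out SketchElementsCache cache))
        {
            cache = new SketchElementsCache(model);
            model.SetAnnotation(cache);
        }

        return cache;
    }
}
```

Calling GetOrCreate twice with the same model returns the same cache instance, while a different model gets a separate one.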

Have I answered your question?

xuzhg · 2023-02-22T17:21:33Z

src/Microsoft.OData.Core/UriParser/Resolver/NormalizedModelElementsCache.cs

+        /// <returns>A list of matching navigation sources, or null if no navigation source matches the name.</returns>
+        public List<IEdmNavigationSource> FindNavigationSources(string name)
+        {
+            if (navigationSourcesCache.TryGetValue(name, out List<IEdmNavigationSource> results))


Is the name the simple identifier of the navigation source, not the qualified name?

How about searching for the navigation source in a referenced model, with the same name but a different namespace?

Or does it only search the top-level model's entity container?

Yes, it's the simple identifier of the navigation source. And yes, it's only searching the top-level model entity container.

The reason I implemented it this way is because I wanted to retain the same behaviour as the existing implementation which performs the search as follows:

    IEdmEntityContainer container = model.EntityContainer;
    if (container == null)
    {
        return null;
    }

    var result = container.Elements.OfType<IEdmNavigationSource>()
        .Where(source => string.Equals(identifier, source.Name, StringComparison.OrdinalIgnoreCase));

see: https://github.com/OData/odata.net/blob/master/src/Microsoft.OData.Core/UriParser/Resolver/ODataUriResolver.cs#L94-L102

As you can see the existing implementation only searches the top-level model's container and it only compares the source.Name, not the fully qualified name (including the container name).

If I change the behaviour of the cache to include referenced models or fully qualified identifier, then there would be an inconsistency between the cache and non-cache code paths.

On the other hand, if the expected behaviour is that the URI resolver should search referenced models and/or fully qualified name, then the current implementation is a bug. And I think that should be a separate discussion since that change of behaviour could be observable to the customer.

The doc comment of the ResolveOperationImports describes the identifier parameter as follows:

The qualified name of the operation import which may or may not include the container name.

However, the existing implementation does not consider the qualified name in its search:

    IEdmEntityContainer container = model.EntityContainer;
    if (container == null)
    {
        return Enumerable.Empty<IEdmOperationImport>();
    }

    return container.Elements.OfType<IEdmOperationImport>()
        .Where(source => string.Equals(identifier, source.Name, StringComparison.OrdinalIgnoreCase));

If we are specifying the navigation source identifier and not searching in referenced models, is there a scenario where what gets returned is a list and not just a single navigation source?

For example, can we have several navigation sources with the same identifier in the same model?

Since we're searching a single container, I don't think that's possible. Maybe if we were searching in multiple containers that could be possible, but that would be arguably a new feature, extending the existing behaviour, which is beyond the scope of this PR.

However, because we are dealing with a case insensitive scenario, we can have multiple navigation sources in the same container whose names only differ by case. That's why we return a list.

Can we have navigation sources in the same container whose names only differ by case? Or can we access a navigation source using different cases?

The EDM model is case-sensitive by default, so it is possible to have navigation sources which only differ by case because they are different names as far as the IEdmModel is concerned. If the IEdmModel were case-insensitive, then this cache would not be necessary.

Since this cache is case-insensitive, we might have multiple matching entries for the same key. That's why we store the values in lists.
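A minimal sketch of that idea, using plain strings instead of EDM elements: a dictionary keyed with StringComparer.OrdinalIgnoreCase whose values are lists. CaseInsensitiveIndex and Build are hypothetical names for this example, not the names used in the PR.

```csharp
using System;
using System.Collections.Generic;

public static class CaseInsensitiveIndex
{
    // Groups names under a case-insensitive key so that "People" and "peoPle"
    // land in the same bucket while each entry keeps its original spelling.
    public static Dictionary<string, List<string>> Build(IEnumerable<string> names)
    {
        var index = new Dictionary<string, List<string>>(StringComparer.OrdinalIgnoreCase);

        foreach (string name in names)
        {
            if (!index.TryGetValue(name, out List<string> bucket))
            {
                bucket = new List<string>();
                index.Add(name, bucket);
            }

            bucket.Add(name);
        }

        return index;
    }
}
```

Building the index from "People", "peoPle" and "Persons" yields two buckets: one holding both "People" and "peoPle", reachable through any casing of the key, and one holding only "Persons".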

ElizabethOkerio · 2023-02-28T04:29:58Z

src/Microsoft.OData.Core/UriParser/Resolver/NormalizedModelElementsCache.cs

+
+        private void PopulateContainerElements(IEdmModel model)
+        {
+            if (model.EntityContainer is null)


Can the model be null here?

It's the responsibility of the caller to ensure that the model is not null here. Since this is an internal class, I placed a Debug.Assert(model != null) in the constructor.

ElizabethOkerio · 2023-02-28T04:32:05Z

src/Microsoft.OData.Core/UriParser/Resolver/NormalizedModelElementsCache.cs

+
+            return null;
+        }
+
        private void PopulateSchemaElements(IEdmModel model)
        {
            foreach (IEdmSchemaElement element in model.SchemaElements)


Can the model be null here, or is the check done by the caller?

That is the responsibility of the caller. In this class I only placed a Debug.Assert(model != null) in the constructor.

That said, I checked the public methods in the public ODataUriResolver class and they're not performing any null-checks on the args.

I have added null argument checks to the public methods of ODataUriResolver.

ElizabethOkerio · 2023-02-28T06:05:30Z

src/Microsoft.OData.Core/UriParser/Resolver/ODataUriResolver.cs

+
+                    if (cachedResults.Count > 1)
+                    {
+                        throw new ODataException(Strings.UriParserMetadata_MultipleMatchingNavigationSourcesFound(identifier));


Why throw an exception when cachedResults contains more than one item, given that you said this is a possibility? What does this mean?

This cache-based code path replicates the existing behaviour of the ResolveNavigationSource method, which is to throw an exception if multiple matches are found.

Notice that this method returns a single IEdmNavigationSource, not a collection. If there are multiple matches, then it means the URI resolver doesn't know which one to use and throws an exception.
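The single-result rule can be sketched as follows. SingleMatchResolver and ResolveSingle are hypothetical names, InvalidOperationException stands in for the ODataException thrown in the PR, and returning null when nothing matches is an assumption of this sketch rather than something confirmed by the snippets above.

```csharp
using System;
using System.Collections.Generic;

public static class SingleMatchResolver
{
    // Returns the only match, null when there is none (an assumption of this
    // sketch), and throws when the lookup is ambiguous.
    public static T ResolveSingle<T>(IList<T> matches, string identifier) where T : class
    {
        if (matches == null || matches.Count == 0)
        {
            return null;
        }

        if (matches.Count > 1)
        {
            // The caller asked for exactly one element, so there is no
            // sensible way to pick between several candidates.
            throw new InvalidOperationException(
                "Multiple matches found for '" + identifier + "'.");
        }

        return matches[0];
    }
}
```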

ElizabethOkerio · 2023-02-28T06:07:58Z

src/Microsoft.OData.Core/UriParser/Resolver/ODataUriResolver.cs

+
+                if (cachedResults != null)
+                {
+                    return cachedResults;


Why don't you check for multiple operations here, like you're doing with navigation sources?

Because this cache-based code path aims to replicate the existing behaviour of the ResolveOperationImports method, which is to return all the matches that were found.

Unlike the ResolveNavigationSource method, this method returns a collection. Keep in mind that the IEdmModel can have multiple operation overloads with the same name but different parameters. So this is handled differently from navigation sources.

How do you check for the params? When two operations with the same name but different params are returned, how do you know which is which?

I don't know exactly. That is outside the scope of the cache, and possibly resolved elsewhere. The cache is only concerned with replicating the already existing behaviour in a more efficient manner.

Here's a snippet of the existing implementation (this code path will still be used when the model is not immutable):

    return container.Elements.OfType<IEdmOperationImport>()
        .Where(source => string.Equals(identifier, source.Name, StringComparison.OrdinalIgnoreCase));

It would be a bug if the cache returned something different from the non-cached version.

ElizabethOkerio · 2023-02-28T06:10:06Z

src/Microsoft.OData.Core/UriParser/Resolver/ODataUriResolver.cs

@@ -545,11 +579,11 @@ internal static ODataUriResolver GetUriResolver(IServiceProvider container)
            }
        }

-        private static NormalizedSchemaElementsCache GetNormalizedSchemaElementsCache(IEdmModel model)
+        private static NormalizedModelElementsCache GetNormalizedModelElementsCache(IEdmModel model)
        {
            Debug.Assert(model != null);


Why are you doing this check here if you already did it in the constructor?

I could remove it. I think someone else suggested having it here just to be sure. But I don't think it makes a major difference to have it here or not.

I removed it. Added argument null checks to public methods in this class instead.

ElizabethOkerio · 2023-02-28T06:20:36Z

...onalTests/Microsoft.OData.Core.Tests/UriParser/Metadata/NormalizedModelElementsCacheTests.cs

+
+            var matches = cache.FindNavigationSources(name);
+
+            Assert.Equal(2, matches.Count);


Why do I think this should be 1 match? The entity set for person is Persons and not People, but you are using People to find the navigation sources. Or is there no need to have Persons in the test?

It's two matches because the test is searching for "People" (and also "people"), and there are two navigation sources that match these keywords, the "People" entity set and the "peoPle" singleton.

The reason for having "Persons" in the test is to verify that navigation sources that do not match the key are not included in the result.

ElizabethOkerio · 2023-02-28T06:21:48Z

...onalTests/Microsoft.OData.Core.Tests/UriParser/Metadata/NormalizedModelElementsCacheTests.cs

+            var container = model.AddEntityContainer("NS", "Container");
+            var entitySet1 = container.AddEntitySet("People", person);
+            var entitySet2 = container.AddSingleton("peoPle", person);
+            container.AddEntitySet("Persons", person);


I don't think this line is necessary here.

I added this line to ensure that the method returns only the items that match the keyword.

To explain the rationale, let's say I had not added "Persons" to the container and only had "People" and "peoPle". Now let's say that in the implementation of cache.FindNavigationSources I had a bug that caused every navigation source in the container to be returned, as opposed to only those that match. Such a bug would not be caught by this test: the test would pass because everything in the container happens to match the keyword. So, to be able to catch such bugs, I added something that does not match to the container to verify that such items are not included in the result.
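The same reasoning in a small, self-contained sketch. FindCorrect and FindBuggy are hypothetical helpers over plain strings, written only to show why the non-matching entry matters; they are not the cache's actual methods.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NegativeCaseDemo
{
    // Correct lookup: keeps only names equal to the key, ignoring case.
    public static List<string> FindCorrect(IEnumerable<string> names, string key)
    {
        return names
            .Where(n => string.Equals(n, key, StringComparison.OrdinalIgnoreCase))
            .ToList();
    }

    // Deliberately buggy lookup: ignores the key and returns everything.
    public static List<string> FindBuggy(IEnumerable<string> names, string key)
    {
        return names.ToList();
    }
}
```

With only "People" and "peoPle" in the input, both helpers return two items, so a count assertion cannot tell them apart. Once "Persons" is added, the buggy helper returns three items and the same assertion fails.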

pull-request-quantifier-deprecated · 2023-02-28T08:29:51Z

This PR has 264 quantified lines of changes. In general, a change size of up to 200 lines is ideal for the best PR experience!

Quantification details

Label      : Large
Size       : +235 -29
Percentile : 66.4%

Total files changed: 4

Change summary by file extension:
.cs : +235 -29

Change counts above are quantified counts, based on the PullRequestQuantifier customizations.

gathogojr · 2023-02-28T12:59:02Z

src/Microsoft.OData.Core/UriParser/Resolver/ODataUriResolver.cs

+                NormalizedModelElementsCache cache = GetNormalizedModelElementsCache(model);
+                IList<IEdmNavigationSource> cachedResults = cache.FindNavigationSources(identifier);
+
+                if (cachedResults != null)


This is the reason why it's preferable to return an empty collection rather than null from a function that has a collection return type.

While I agree that it's generally preferable, returning an empty collection from the cache would introduce more complexity to the code than this single null-check that's not exposed to the user. Most of the null checks against the result occur in code paths where we already have an if statement checking whether the collection count is 0, so they didn't result in much uglier code, in my opinion.

To return empty collections efficiently, I would have to create a static empty list for each element type that the cache supports:

    static readonly List<IEdmSchemaType> emptySchemaTypesList = new List<IEdmSchemaType>();
    static readonly List<IEdmOperation> emptyOperationsList = new List<...>;
    static readonly List<IEdmTerm> emptyTermsList = ...;
    static readonly List<IEdmNavigationSource> emptyNavigationSourcesList = ...;
    static readonly List<IEdmOperationImport> emptyOperationImportsList = ...;

    // then in the find methods:
    if (navigationSources.TryGetValue(...))
    {
        return results;
    }

    return emptyNavigationSourcesList;

The runtime overhead is not that bad since these collections are allocated only once, but it didn't feel worthwhile since the cache is an internal implementation detail used in a very limited scope.

That said, I don't hold this opinion strongly. If you still believe that I should return an empty collection despite the explanation above, I can go ahead and make the change.
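For comparison, here is a sketch of one way to avoid the per-type static fields. It is not what the PR does and rests on two assumptions: that the find methods could return IReadOnlyList<T> instead of List<T>, and that Array.Empty<T>() is available on the target frameworks. SketchLookup is a hypothetical type for this example.

```csharp
using System;
using System.Collections.Generic;

public sealed class SketchLookup<T>
{
    private readonly Dictionary<string, List<T>> items =
        new Dictionary<string, List<T>>(StringComparer.OrdinalIgnoreCase);

    public void Add(string name, T item)
    {
        if (!items.TryGetValue(name, out List<T> bucket))
        {
            bucket = new List<T>();
            items.Add(name, bucket);
        }

        bucket.Add(item);
    }

    // Never returns null: a miss yields the shared empty array for T,
    // so no dedicated static empty list is declared per element type.
    public IReadOnlyList<T> Find(string name)
    {
        if (items.TryGetValue(name, out List<T> results))
        {
            return results;
        }

        return Array.Empty<T>();
    }
}
```

The trade-off is that callers receive a read-only view rather than a List<T>, which may or may not suit the existing call sites.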

KenitoInc

LGTM

gathogojr

Cache container elements in ODataUriResolver model elements cache

5d3119d

pull-request-quantifier-deprecated bot added the Extra Small label Feb 22, 2023

Add tests to NormalizedSchemaElementsCache

5c30ad1

pull-request-quantifier-deprecated bot added Medium and removed Extra Small labels Feb 22, 2023

Add ODataUriResolver tests

0abc32a

pull-request-quantifier-deprecated bot added Large and removed Medium labels Feb 22, 2023

habbes marked this pull request as ready for review February 22, 2023 07:27

habbes requested review from gathogojr, corranrogue9, ElizabethOkerio, KenitoInc, lisicase, mikepizzo and xuzhg February 22, 2023 07:31

xuzhg previously approved these changes Feb 22, 2023

Minor method rename

54c9e19

habbes dismissed xuzhg’s stale review via 54c9e19 February 28, 2023 03:52

habbes requested a review from xuzhg February 28, 2023 03:53