CosmosDBOutput Unsupported PartitionKey value component '[]' #2112

Closed
Benjiiim opened this issue Nov 30, 2023 · 23 comments
@Benjiiim

Benjiiim commented Nov 30, 2023

In a .NET 8 Isolated function (migrating from .NET 6 In-Process), I'm trying to use CosmosDBOutput to write multiple documents without knowing their types but I'm getting the following error:

System.Private.CoreLib: Exception while executing function: Functions.Function1. Microsoft.Azure.Cosmos.Client: Unsupported PartitionKey value component '[]'. Numeric, string, bool, null, Undefined are the only supported types.

Here is a repro:

using Microsoft.Azure.Functions.Worker;
using Microsoft.Azure.Functions.Worker.Http;
using Newtonsoft.Json;

namespace TestCosmos
{
    public class Function1
    {
        [Function("Function1")]
        [CosmosDBOutput(databaseName: "Test", containerName: "Test", CreateIfNotExists = true, PartitionKey = "/email", Connection = "cosmosConnectionString")]
        public Object[] Run([HttpTrigger(AuthorizationLevel.Function, "get", "post")] HttpRequestData req)
        {
            var input = @"[
                        {
                            ""email"": ""hello@toto.com"",
                            ""event"": ""processed""
                        },
                        {
                            ""email"": ""hello@toto.com"",
                            ""event"": ""deferred"",
                            ""category"": ""cat facts""
                        }]";

            dynamic data = JsonConvert.DeserializeObject(input);

            List<Object> outputList = new List<Object>();

            foreach(var item in data)
            {
                item.id = System.Guid.NewGuid().ToString();

                outputList.Add(item);
            }

            return outputList.ToArray();
        }
    }
}

Am I doing anything wrong?

@kshyju
Member

kshyju commented Nov 30, 2023

@ealsur Could you help with this?

@ealsur
Member

ealsur commented Dec 1, 2023

@Benjiiim Are you sure the container is partitioned by /email?

This is the first time I have seen this kind of syntax. Following the public documentation (https://learn.microsoft.com/en-us/azure/azure-functions/functions-bindings-cosmosdb-v2-output?tabs=python-v2%2Cin-process%2Cnodejs-v4%2Cextensionv4&pivots=programming-language-csharp#queue-trigger-write-docs-using-iasynccollector), I would expect IAsyncCollector to be used.

@kshyju Is this something particular to the Worker packages?

@ealsur
Member

ealsur commented Dec 1, 2023

@Benjiiim Can you also post the full Stack Trace of the exception? This message is coming from the Cosmos DB SDK (not the Extension code). The Stack Trace would normally show where it is happening. My hunch is that the SDK is attempting to extract the Partition Key Value from the documents and cannot do so, maybe because of the type of the objects.

Output bindings work in 2 ways:

  • Using out/return variables -> In this case it supports 1 item.
  • Using IAsyncCollector -> For cases when you want to store multiple items.

You seem to be using the first approach but attempting to save multiple items.

@Benjiiim
Author

Benjiiim commented Dec 1, 2023

Thanks @ealsur

Yes, I'm sure that the container is partitioned by /email.

From my understanding, IAsyncCollector<T> was used in the in-process model, while the isolated model must use T[].
I should be able to output multiple items with T[] in the isolated model.

https://learn.microsoft.com/en-us/azure/azure-functions/migrate-dotnet-to-isolated-model?tabs=net8

For output bindings, if the in-process model version used an IAsyncCollector<T>, you can replace this with binding to an array of the target type: T[].

On https://learn.microsoft.com/en-us/azure/azure-functions/functions-bindings-cosmosdb-v2-output, you should use the isolated tab and not the in-process tab.

I think the issue comes from the use of dynamic, but I have to use it because I don't know the properties of the JSON objects.

The same output was working with IAsyncCollector when my function was In-process.

@Benjiiim
Author

Benjiiim commented Dec 1, 2023

Stack trace:

[2023-12-01T18:48:08.427Z] Executed 'Functions.Function1' (Failed, Id=6f6c0851-87fa-43c5-8b71-b51a9246551e, Duration=2672ms)
[2023-12-01T18:48:08.430Z] Microsoft.Azure.WebJobs.Host.FunctionInvocationException: Exception while executing function: Functions.Function1
[2023-12-01T18:48:08.431Z]  ---> System.ArgumentException: Unsupported PartitionKey value component '[]'. Numeric, string, bool, null, Undefined are the only supported types.
[2023-12-01T18:48:08.432Z]    at Microsoft.Azure.Cosmos.ContainerCore.CosmosElementToPartitionKeyObject(IReadOnlyList`1 cosmosElementList)
[2023-12-01T18:48:08.433Z]    at Microsoft.Azure.Cosmos.ContainerCore.GetPartitionKeyValueFromStreamAsync(Stream stream, ITrace trace, CancellationToken cancellation)
[2023-12-01T18:48:08.436Z]    at Microsoft.Azure.Cosmos.ContainerCore.ExtractPartitionKeyAndProcessItemStreamAsync[T](Nullable`1 partitionKey, String itemId, T item, OperationType operationType, ItemRequestOptions requestOptions, ITrace trace, CancellationToken cancellationToken)
[2023-12-01T18:48:08.437Z]    at Microsoft.Azure.Cosmos.ContainerCore.UpsertItemAsync[T](T item, ITrace trace, Nullable`1 partitionKey, ItemRequestOptions requestOptions, CancellationToken cancellationToken)
[2023-12-01T18:48:08.438Z]    at Microsoft.Azure.Cosmos.ClientContextCore.RunWithDiagnosticsHelperAsync[TResult](String containerName, String databaseName, OperationType operationType, ITrace trace, Func`2 task, Func`2 openTelemetry, String operationName, RequestOptions requestOptions)
[2023-12-01T18:48:08.439Z]    at Microsoft.Azure.Cosmos.ClientContextCore.OperationHelperWithRootTraceAsync[TResult](String operationName, String containerName, String databaseName, OperationType operationType, RequestOptions requestOptions, Func`2 task, Func`2 openTelemetry, TraceComponent traceComponent, TraceLevel traceLevel)
[2023-12-01T18:48:08.441Z]    at Microsoft.Azure.WebJobs.Extensions.CosmosDB.CosmosDBAsyncCollector`1.AddAsync(T item, CancellationToken cancellationToken) in D:\a\_work\1\s\src\WebJobs.Extensions.CosmosDB\Bindings\CosmosDBAsyncCollector.cs:line 26
[2023-12-01T18:48:08.442Z]    at Microsoft.Azure.WebJobs.Extensions.CosmosDB.CosmosDBAsyncCollector`1.AddAsync(T item, CancellationToken cancellationToken) in D:\a\_work\1\s\src\WebJobs.Extensions.CosmosDB\Bindings\CosmosDBAsyncCollector.cs:line 48
[2023-12-01T18:48:08.443Z]    at Microsoft.Azure.WebJobs.Script.Binding.FunctionBinding.BindAsyncCollectorAsync[T](BindingContext context) in /_/src/WebJobs.Script/Binding/FunctionBinding.cs:line 200
[2023-12-01T18:48:08.444Z]    at Microsoft.Azure.WebJobs.Script.Binding.ExtensionBinding.BindAsync(BindingContext context) in /_/src/WebJobs.Script/Binding/ExtensionBinding.cs:line 84
[2023-12-01T18:48:08.445Z]    at Microsoft.Azure.WebJobs.Script.Description.WorkerFunctionInvoker.<>c__DisplayClass13_0.<<BindOutputsAsync>b__0>d.MoveNext() in /_/src/WebJobs.Script/Description/Workers/WorkerFunctionInvoker.cs:line 182
[2023-12-01T18:48:08.446Z] --- End of stack trace from previous location ---
[2023-12-01T18:48:08.447Z]    at Microsoft.Azure.WebJobs.Script.Description.WorkerFunctionInvoker.BindOutputsAsync(Object input, Binder binder, ScriptInvocationResult result) in /_/src/WebJobs.Script/Description/Workers/WorkerFunctionInvoker.cs:line 186
[2023-12-01T18:48:08.448Z]    at Microsoft.Azure.WebJobs.Script.Description.WorkerFunctionInvoker.InvokeCore(Object[] parameters, FunctionInvocationContext context) in /_/src/WebJobs.Script/Description/Workers/WorkerFunctionInvoker.cs:line 109
[2023-12-01T18:48:08.449Z]    at Microsoft.Azure.WebJobs.Script.Description.FunctionInvokerBase.Invoke(Object[] parameters) in /_/src/WebJobs.Script/Description/FunctionInvokerBase.cs:line 82
[2023-12-01T18:48:08.450Z]    at Microsoft.Azure.WebJobs.Script.Description.FunctionGenerator.Coerce[T](Task`1 src) in /_/src/WebJobs.Script/Description/FunctionGenerator.cs:line 225
[2023-12-01T18:48:08.451Z]    at Microsoft.Azure.WebJobs.Host.Executors.FunctionInvoker`2.InvokeAsync(Object instance, Object[] arguments) in D:\a\_work\1\s\src\Microsoft.Azure.WebJobs.Host\Executors\FunctionInvoker.cs:line 52
[2023-12-01T18:48:08.452Z]    at Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor.InvokeWithTimeoutAsync(IFunctionInvoker invoker, ParameterHelper parameterHelper, CancellationTokenSource timeoutTokenSource, CancellationTokenSource functionCancellationTokenSource, Boolean throwOnTimeout, TimeSpan timerInterval, IFunctionInstance instance) in D:\a\_work\1\s\src\Microsoft.Azure.WebJobs.Host\Executors\FunctionExecutor.cs:line 581
[2023-12-01T18:48:08.453Z]    at Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor.ExecuteWithWatchersAsync(IFunctionInstanceEx instance, ParameterHelper parameterHelper, ILogger logger, CancellationTokenSource functionCancellationTokenSource) in D:\a\_work\1\s\src\Microsoft.Azure.WebJobs.Host\Executors\FunctionExecutor.cs:line 527
[2023-12-01T18:48:08.454Z]    at Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor.ExecuteWithLoggingAsync(IFunctionInstanceEx instance, FunctionStartedMessage message, FunctionInstanceLogEntry instanceLogEntry, ParameterHelper parameterHelper, ILogger logger, CancellationToken cancellationToken) in D:\a\_work\1\s\src\Microsoft.Azure.WebJobs.Host\Executors\FunctionExecutor.cs:line 306
[2023-12-01T18:48:08.455Z]    --- End of inner exception stack trace ---
[2023-12-01T18:48:08.456Z]    at Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor.ExecuteWithLoggingAsync(IFunctionInstanceEx instance, FunctionStartedMessage message, FunctionInstanceLogEntry instanceLogEntry, ParameterHelper parameterHelper, ILogger logger, CancellationToken cancellationToken) in D:\a\_work\1\s\src\Microsoft.Azure.WebJobs.Host\Executors\FunctionExecutor.cs:line 352
[2023-12-01T18:48:08.457Z]    at Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor.TryExecuteAsync(IFunctionInstance functionInstance, CancellationToken cancellationToken) in D:\a\_work\1\s\src\Microsoft.Azure.WebJobs.Host\Executors\FunctionExecutor.cs:line 108

@ealsur
Member

ealsur commented Dec 1, 2023

@Benjiiim Yes, this confirms the case. The problem is that you are using the return pattern, which is for 1 item as per the documentation, but you are returning an object which is an array.

Please use the IAsyncCollector approach for Output bindings meant to save multiple items (this applies to any output binding, not just Cosmos DB).

@Benjiiim
Author

Benjiiim commented Dec 1, 2023

Sorry but the doc says that IAsyncCollector should not be used in isolated mode.
According to the doc, in isolated mode, using an array of objects as the output is the way to save multiple objects.
This is working well with strongly typed objects in isolated mode but doesn't work with the object I'm passing in the code above (was working with IAsyncCollector in in-process mode).

@ealsur
Member

ealsur commented Dec 4, 2023

I personally have no knowledge of how isolated mode works or why IAsyncCollector won't work there.

Are you saying that when you use a Strongly Typed Array, it works, but in this example (using Object) it does not? Why are you using Object and not, for example, JObject, or dynamic?

List<JObject> outputList = new List<JObject>();

I wonder if the problem is the underlying Cosmos DB SDK cannot work with Object.

@Benjiiim
Author

Benjiiim commented Dec 7, 2023

I don't want to be rude, but why not involve someone who knows about the difference between how the CosmosDB output worked with in-process functions and how it is working now with isolated functions?
I've actually found a way to output another type of object while migrating to System.Text.Json and avoiding the use of dynamic.
However, understanding why the code above doesn't work while it was working with IAsyncCollector in an in-process function might be useful to either fix something or document the change?
Let me know if and how I can help.

@ealsur
Member

ealsur commented Dec 7, 2023

involving someone who knows about the difference between how the CosmosDB output worked with in-process functions and how it is working now with isolated functions

I sadly do not know. @kshyju, any ideas who knows how Isolated Functions work in terms of Output Bindings?

understanding why the code above doesn't work

The key error is this: Unsupported PartitionKey value component '[]'. It tells us that the Cosmos DB SDK, when attempting to extract the Partition Key of the document, is finding a value whose type is an Array. Hence my first thought was that the Array is being sent and treated as 1 document instead of an array of documents.
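
As a rough illustration of the check described above, the sketch below classifies the value found at a partition key property. It is not the Cosmos DB SDK's code: the PartitionKeyCheck class and its method are hypothetical names, it only handles a top-level property such as /email, and it assumes the document is a JSON object.

```csharp
using System;
using System.Text.Json;

// Hypothetical helper, for illustration only (not part of the Cosmos DB SDK).
public static class PartitionKeyCheck
{
    // Returns true when the value stored under propertyName is one of the kinds
    // the error message lists as supported: numeric, string, bool, null, Undefined.
    // Assumes documentJson is a JSON object.
    public static bool IsSupportedPartitionKeyValue(string documentJson, string propertyName)
    {
        using JsonDocument doc = JsonDocument.Parse(documentJson);

        if (!doc.RootElement.TryGetProperty(propertyName, out JsonElement value))
        {
            // A missing property plays the role of "Undefined" here.
            return true;
        }

        return value.ValueKind switch
        {
            JsonValueKind.String or JsonValueKind.Number or
            JsonValueKind.True or JsonValueKind.False or
            JsonValueKind.Null => true,
            // Arrays and nested objects cannot be used as a partition key value.
            _ => false,
        };
    }
}
```

With this helper, a document whose email is a string passes the check, while one whose email is an empty array does not, which matches the '[]' in the error message.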

I've actually found a way to output another type of object

Do you mean that you have working code that still returns a List but uses a different type? Can you share which Type?

@chocvanstraw

I was able to output multiple documents to the CosmosDBOutput binding as follows. Perhaps it might be helpful to you. I use JsonNode since I only care about a few properties on the document.

        [CosmosDBOutput("%DatabaseId%", "Container2", Connection = "ConnectionString")]
        public IEnumerable<JsonNode> Run([CosmosDBTrigger(
            databaseName: "%DatabaseId%",
            containerName: "Container1",
            Connection = "ConnectionString",
            LeaseContainerName = "leases",
            LeaseContainerPrefix = "xxx_",
            CreateLeaseContainerIfNotExists = true)] IReadOnlyList<JsonNode> documents)
        {
            var outputDocs = new List<JsonNode>();

            foreach (var doc in documents)
            {
                doc["partitionKey"] = Guid.NewGuid().ToString(); //change the partition key to something else

                outputDocs.Add(doc);
            }

            return outputDocs;
        }

@ealsur
Member

ealsur commented Dec 9, 2023

Thanks @chocvanstraw for the example. JsonNode works if you are using System.Text.Json (are you customizing the serializer for the extension?). The default serialization for the Cosmos DB SDK uses Newtonsoft.Json, which is why I suggested using JObject: the OP's example uses dynamic data = JsonConvert.DeserializeObject(input); (Newtonsoft.Json).

@chocvanstraw

chocvanstraw commented Dec 9, 2023 via email

@ealsur
Member

ealsur commented Dec 11, 2023

Interesting. It seems like the Isolated mode might be using System.Text.Json to serialize/communicate between processes? @kshyju is there anyone who works on the Worker/Isolated mode area that can confirm this?

@tim-SIOA

tim-SIOA commented Feb 7, 2024

Hey @kshyju @mattchenderson.

I've just hit this issue as well.

Is there an ETA on this one getting sorted? Or any workarounds for it?

Thanks!

@ealsur
Member

ealsur commented Feb 7, 2024

@tim-SIOA Is the above workaround working?

@tim-SIOA

tim-SIOA commented Feb 7, 2024

@tim-SIOA Is the above workaround working?

I've just changed to use 'JsonNode' and it seems to work - thanks @chocvanstraw !

@kshyju
Member

kshyju commented Jun 24, 2024

@Benjiiim I was able to look into this and could reproduce the issue with the code you provided. Thank you!

The issue here is that your code is mixing two serializers. Your function code uses Newtonsoft.Json.JsonConvert.DeserializeObject to create an object from the JSON string. By default, JsonConvert.DeserializeObject returns a JToken (the base class for JSON objects in Newtonsoft.Json), which can be a JObject, JArray, JValue, etc. However, the function app uses System.Text.Json as the default serializer. When the function app invocation pipeline code tries to serialize this JToken using System.Text.Json.JsonSerializer (abstracted behind Azure SDK's ObjectSerializer), it doesn't handle the JToken structure correctly and produces an incorrect JSON string like the one below for the output payload, where every property value has become an empty array:

[{"email":[],"event":[],"id":[]},{"email":[],"event":[],"category":[],"id":[]}]

This will eventually cause an error while trying to persist this to Cosmos DB.
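
The mismatch can be reproduced outside of Functions with a few lines. This is a minimal sketch, assuming the Newtonsoft.Json NuGet package is referenced. SerializerMixDemo and its method are hypothetical names, and the host's real pipeline goes through ObjectSerializer rather than calling JsonSerializer directly.

```csharp
using System;
using Newtonsoft.Json.Linq;

// Hypothetical demo class; requires the Newtonsoft.Json NuGet package.
public static class SerializerMixDemo
{
    public static string SerializeWithSystemTextJson(string json)
    {
        // Parse with Newtonsoft.Json: the result is a JObject whose
        // property values are JValue instances.
        JObject item = JObject.Parse(json);

        // Serialize with System.Text.Json, which has no knowledge of JToken.
        // It treats the JObject as a dictionary and each JValue as an
        // enumerable with no children, so the values are written as [].
        return System.Text.Json.JsonSerializer.Serialize(item);
    }
}
```

Passing a single document such as {"email":"hello@toto.com"} should give a payload of the same shape as the one quoted above, with an empty array where the string was.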

You have 3 options to resolve the issue:

  1. Use the System.Text.Json serializer in your function code to deserialize.
[CosmosDBOutput(databaseName: "Test", containerName: "Test",
                                 CreateIfNotExists = true, PartitionKey = "/email",
                                 Connection = "cosmosConnectionString")]
public Object[] Run3([HttpTrigger(AuthorizationLevel.Anonymous, "get", "post")] HttpRequestData req)
{
    var input = @"[
                {
                    ""email"": ""two@toto.com"",
                    ""event"": ""processed""
                },
                {
                    ""email"": ""hello@toto.com"",
                    ""event"": ""deferred"",
                    ""category"": ""facts""
                }]";

    var data = System.Text.Json.JsonSerializer.Deserialize<JsonNode[]>(input);

    foreach (var item in data)
    {
        item["id"] = System.Guid.NewGuid().ToString();
    }

    return data;
}
  2. Update your app startup code to use the Newtonsoft.Json serializer.
var host = new HostBuilder()
        .ConfigureFunctionsWorkerDefaults(builder =>
        {
            builder.UseNewtonsoftJson();
        })
        .Build();

host.Run();

The UseNewtonsoftJson extension method source code can be seen here in our samples.

  3. Use a strongly typed class for the data.

Create a POCO representing your data and use that in your function code.

[CosmosDBOutput(databaseName: "Test", containerName: "Test",
                                 CreateIfNotExists = true, PartitionKey = "/email",
                                 Connection = "cosmosConnectionString")]
public MyType[] Run2([HttpTrigger(AuthorizationLevel.Anonymous, "get", "post")] HttpRequestData req)
{
   // Super minimal example returning an array. You may update this as needed.
    return new MyType[]
    {
        new MyType { id = Guid.NewGuid().ToString(), Email = "two@toto.com", Category = "Fin", Event = "Order" }
    };
}

Hopefully you can adopt one of these solutions to resolve your issue. Let us know if you still run into problems.

kshyju added the Needs: Author Feedback label and removed the bug and needs-investigation labels on Jun 24, 2024
@Benjiiim
Author

Thanks @kshyju !
I was able to work around this problem months ago, indeed by getting rid of Newtonsoft.Json but without understanding what was happening.
It's great to finally understand the ins and outs of this issue. Thanks for your time and this detailed explanation.
Do you think something could be done in the documentation (maybe on the "Azure Cosmos DB trigger and bindings for Azure Functions 2.x and higher overview" page) to help people avoid this trap?

@chocvanstraw

I have switched my code to use a POCO in the input and output bindings instead of JsonNode. The POCO uses Newtonsoft [JsonProperty] attributes on its properties so that the property names serialize properly in Cosmos. However, because the function app uses System.Text.Json as the serializer by default, the Cosmos output binding fails. Your options are:

  1. Change the function app to use Newtonsoft.Json as the serializer and continue to use Newtonsoft [JsonProperty] in the POCO
  2. Keep the function app using the default System.Text.Json serializer and change the POCO to use System.Text.Json [JsonPropertyName] attributes

If you use option 2 you may not be able to share POCOs between the function app and any projects that use the actual CosmosClient as CosmosClient seems to have a dependency on Newtonsoft for serialization.
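
For option 2, a POCO annotated for System.Text.Json could look like the sketch below. EventDocument, its properties, and EventDocumentDemo are hypothetical names, modeled on the email/event documents from the original repro.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical POCO: [JsonPropertyName] is the System.Text.Json counterpart
// of Newtonsoft's [JsonProperty] for controlling the serialized name.
public class EventDocument
{
    [JsonPropertyName("id")]
    public string Id { get; set; } = "";

    [JsonPropertyName("email")]
    public string Email { get; set; } = "";

    [JsonPropertyName("event")]
    public string Event { get; set; } = "";
}

public static class EventDocumentDemo
{
    // Serializes with the default System.Text.Json options, so the
    // attribute-provided lowercase names are the ones written out.
    public static string ToJson(EventDocument doc) => JsonSerializer.Serialize(doc);
}
```

The lowercase names matter here because the container in the repro is partitioned by /email, and JSON property names are case-sensitive.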

@ealsur
Member

ealsur commented Jun 28, 2024

If you use option 2 you may not be able to share POCOs between the function app and any projects that use the actual CosmosClient as CosmosClient seems to have a dependency on Newtonsoft for serialization.

You can if you set your CosmosClient with a custom System.Text.Json serializer. We are shipping support soon (Azure/azure-cosmos-dotnet-v3#4332) that should make it easier.

@kshyju
Member

kshyju commented Jul 1, 2024

@Benjiiim Serialization is not specific to any binding; it applies to the entire application. We have documentation on the default serialization behavior and how to customize it available here. I'm not certain that this information should be included on the Cosmos binding-specific page.

@Benjiiim
Author

Benjiiim commented Jul 1, 2024

@kshyju That makes sense. No reason to include it in the Cosmos binding page indeed. Thanks a lot for your time.
