Weaviate int needs to be represented as java Long instead of Double #149

Open
samos123 opened this issue Nov 21, 2022 · 4 comments

Comments

@samos123
Contributor

In Weaviate, the int datatype is represented as int64. In the Java client, however, it gets converted to a java.lang.Double, which is an 8-byte floating-point type with only 53 bits of integer precision.

I suspect that if we stored a value of 9223372036854775807 or -9223372036854775807 in a Weaviate property of datatype int, the Java client wouldn't be able to return the correct value.

@samos123
Contributor Author

samos123 commented Nov 21, 2022

It's indeed an issue. Currently, an int stored as 9223372036854775807 is returned as the double value 9.223372036854776E18, which loses precision: the number becomes 9223372036854776000 instead of 9223372036854775807.
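To illustrate the precision loss described above, here is a minimal, self-contained Java sketch. The class and method names are made up for illustration and are not part of the Weaviate Java client; the code only shows what happens when a long passes through a double.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical demo class, not part of the Weaviate client.
class DoublePrecisionDemo {

    // Converts a long to a double and back, as happens when an integer
    // property is held in a java.lang.Double.
    static long throughDouble(long value) {
        return (long) (double) value;
    }

    // Returns the exact integer value that the nearest double actually holds.
    static BigInteger exactValueOfDouble(long value) {
        return new BigDecimal((double) value).toBigInteger();
    }

    public static void main(String[] args) {
        // A double has a 53-bit significand, so 2^53 + 1 is the first
        // positive integer it cannot represent exactly.
        long firstLost = 9007199254740993L; // 2^53 + 1
        System.out.println(firstLost + " -> " + throughDouble(firstLost));

        // Long.MAX_VALUE is rounded up to 2^63 when stored in a double.
        System.out.println(Long.MAX_VALUE + " -> " + exactValueOfDouble(Long.MAX_VALUE));

        // Smaller values, such as millisecond timestamps, survive unchanged.
        long timestamp = 1669064501700L;
        System.out.println(timestamp + " -> " + throughDouble(timestamp));
    }
}
```

Note that casting the double back to long is not a reliable check for Long.MAX_VALUE itself, because the narrowing conversion saturates at Long.MAX_VALUE; that is why the sketch inspects the exact value through BigDecimal instead.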

I created an example demonstrating the issue here: samos123/weaviate-java-example@e8506e0

It also seems that the Article wordCount is not stored correctly when the Java client is given a long value:

curl http://localhost:8080/v1/objects | jq ".objects[0]"
{
  "class": "Article",
  "creationTimeUnix": 1669064501700,
  "id": "43e43a84-1370-4b21-8796-545c7d222382",
  "lastUpdateTimeUnix": 1669064501700,
  "properties": {
    "content": "Sam",
    "title": "Sam",
    "wordCount": 9223372036854776000
  },
  "vectorWeights": null
}

@samos123
Contributor Author

I'm starting to wonder whether the Weaviate docs are incorrect in saying that it uses int64:

curl -X POST -H 'Content-Type: application/json' -d '{
      "class": "Article",
      "properties": {
          "title": "Large int64",
          "wordCount": 92233720368547758079
      }
  }' http://localhost:8080/v1/objects
{"error":[{"message":"invalid object: invalid integer property 'wordCount' on class 'Article': the JSON number '92233720368547758079' could not be converted to an int"}]}
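One detail worth checking in the request above: the literal 92233720368547758079 has 20 digits, while the largest int64 value, 9223372036854775807, has 19. The following minimal sketch checks whether a decimal literal fits in a signed 64-bit integer. The class and method names are hypothetical and are not part of Weaviate or its clients.

```java
import java.math.BigInteger;

// Hypothetical helper, not part of Weaviate or its Java client.
class Int64RangeCheck {

    private static final BigInteger MIN = BigInteger.valueOf(Long.MIN_VALUE);
    private static final BigInteger MAX = BigInteger.valueOf(Long.MAX_VALUE);

    // Returns true if the decimal integer literal fits in a signed 64-bit integer.
    static boolean fitsInInt64(String literal) {
        BigInteger value = new BigInteger(literal);
        return value.compareTo(MIN) >= 0 && value.compareTo(MAX) <= 0;
    }

    public static void main(String[] args) {
        System.out.println(fitsInInt64("9223372036854775807"));  // largest int64 value
        System.out.println(fitsInInt64("92233720368547758079")); // value from the curl request
    }
}
```

If the check reports that the second literal does not fit, the error message from the server would be consistent with an int64-backed property rather than evidence against it.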

@samos123
Contributor Author

samos123 commented Nov 22, 2022

Does the Weaviate int datatype support 64-bit numbers? Support for int64 feels flaky, so maybe we should only support int32 in the Java client.

Relevant issue: weaviate/weaviate#1563

The docs do mention this:
(*) Although Weaviate supports int64, GraphQL currently only supports int32, and does not support int64. This means that currently integer data fields in Weaviate with integer values larger than int32, will not be returned using GraphQL queries. We are working on solving this issue. As current workaround is to use a string instead.
However, I'm not using GraphQL to store the object; I'm using the REST API.

@aliszka
Member

aliszka commented Nov 24, 2022

Thanks, @samos123, for reporting the issue.
You are right: Double is not suitable for holding int64 numbers such as 9223372036854775807 (it is rounded to 9223372036854776000), and Long would be a better fit in that case.
The unmarshaller used in the client is not aware of the property type, so it uses Double for every number-like value because it is the most versatile type. This works well for smaller numbers, and therefore for the majority of use cases, but fails for big ones like the one in your example.
We need to figure out how that can be fixed.
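As a rough illustration of one possible direction, the sketch below picks Long for integer-looking JSON number literals and falls back to Double otherwise, without needing to know the property type. This is not the client's actual unmarshaller: the class name, the method name, and the idea of working on the raw literal string are all assumptions made for the example.

```java
// Hypothetical sketch, not the Weaviate Java client's unmarshalling code.
class NumberParsingSketch {

    // Returns a Long when the literal is an integer within the int64 range,
    // and a Double for fractional, exponent, or out-of-range literals.
    static Number parseJsonNumber(String literal) {
        boolean looksIntegral = literal.indexOf('.') < 0
                && literal.indexOf('e') < 0
                && literal.indexOf('E') < 0;
        if (looksIntegral) {
            try {
                return Long.parseLong(literal);
            } catch (NumberFormatException outOfRange) {
                // Too large or too small for a long: fall back to Double below.
            }
        }
        return Double.parseDouble(literal);
    }

    public static void main(String[] args) {
        System.out.println(parseJsonNumber("9223372036854775807").getClass().getSimpleName());
        System.out.println(parseJsonNumber("1.5").getClass().getSimpleName());
        System.out.println(parseJsonNumber("92233720368547758079").getClass().getSimpleName());
    }
}
```

A trade-off of this approach is that callers would receive either a Long or a Double depending on the value, so existing code that casts every number to Double would need to go through Number instead.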

As for the related issue: Weaviate itself is also affected by the rounding problem, which should be fixed independently: weaviate/weaviate#1563 (comment)


2 participants