
FATAL orionldState error during concurrent POST requests #408

Open
michaeI-s opened this issue Mar 10, 2020 · 3 comments


@michaeI-s

Hi,

I'm sending a lot of POST /entities requests from a request queue in a Node.js script.
After a number of successfully served requests, I get a "socket hang up" error in my script.
The Orion-LD log reads:

time=Tuesday 10 Mar 17:16:35 2020.571Z | lvl=FATAL | corr=N/A | trans=N/A | from=N/A | srv=N/A | subsrv=N/A | comp=Orion | op=orionldState.cpp[239]:orionldStateDelayedKjFreeEnqueue | msg=Internal Error (the size of orionldState.delayedKjFreeVec needs to be augmented)

It's not actually a crash, since there is no stack trace, but afterwards Orion is no longer reachable; its Docker container has exited.
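Judging from the log message alone, orionldStateDelayedKjFreeEnqueue appends pointers to a fixed-size vector (orionldState.delayedKjFreeVec) whose entries are freed after the request, and the broker exits when the vector fills up. A minimal sketch of that pattern; all names and the capacity are illustrative assumptions, not taken from the actual source:

```c
/* Hypothetical fixed-capacity "delayed free" vector, mimicking the
 * pattern suggested by the log message. The struct name, field names,
 * and the capacity of 100 are assumptions for illustration only. */
#define DELAYED_FREE_VEC_SIZE 100

typedef struct DelayedFreeState
{
  void* delayedFreeVec[DELAYED_FREE_VEC_SIZE];
  int   delayedFreeVecIndex;
} DelayedFreeState;

/* Returns 0 on success, -1 when the vector is full - the point at
 * which the real broker logs FATAL and exits instead of growing. */
int delayedFreeEnqueue(DelayedFreeState* stateP, void* ptr)
{
  if (stateP->delayedFreeVecIndex >= DELAYED_FREE_VEC_SIZE)
    return -1;  /* "the size ... needs to be augmented" */

  stateP->delayedFreeVec[stateP->delayedFreeVecIndex++] = ptr;
  return 0;
}
```

If a single request performs enough allocations (e.g. rendering many entities), the index reaches the cap and the broker bails out rather than reallocating, which would match the behavior described above.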

I'm using the following Orion-LD version:

{
  "Orion-LD version": "post-v0.2.0",
  "based on orion": "1.15.0-next",
  "kbase version": "0.4",
  "kalloc version": "0.4",
  "khash version": "0.4",
  "kjson version": "0.4",
  "boost version": "1_62",
  "microhttpd version": "0.9.48-0",
  "openssl version": "OpenSSL 1.1.0l  10 Sep 2019",
  "mongo version": "1.1.3",
  "rapidjson version": "1.0.2",
  "libcurl version": "7.52.1",
  "libuuid version": "UNKNOWN",
  "branch": "(HEAD",
  "Next File Descriptor": 18
}

Kind regards,
Michael

@kzangeli
Collaborator

Ok!
The broker exits willingly and tells us what's wrong.
The fix is simply to increase a size.
I will take some time, though, to try to understand why a bigger size is needed.

@michaeI-s
Author

Still happening:

time=Wednesday 06 May 13:07:59 2020.051Z | lvl=FATAL | corr=N/A | trans=N/A | from=N/A | srv=N/A | subsrv=N/A | comp=Orion | op=orionldState.cpp[243]:orionldStateDelayedKjFreeEnqueue | msg=Internal Error (the size of orionldState.delayedKjFreeVec needs to be augmented)

with the latest version:

{
  "orionld version": "post-v0.2.0",
  "orion version": "1.15.0-next",
  "uptime": "0 d, 0 h, 0 m, 30 s",
  "git_hash": "c38344b87377681a4ef8aea84a9937b7f2319d9b",
  "compile_time": "Tue May 5 17:58:03 UTC 2020",
  "compiled_by": "root",
  "compiled_in": "9e9a6eb98eaa",
  "release_date": "Tue May 5 17:58:03 UTC 2020",
  "doc": "https://fiware-orion.readthedocs.org/en/master/"
}

At least on my test machine and with my data, this error is reproducible when setting the limit parameter to >=500:

curl --location --request GET '<ORION-LD-ADDRESS>/ngsi-ld/v1/entities?type=WeatherObserved&limit=500' \
--header 'Accept: application/ld+json' \
--header 'Link: <https://fiware.github.io/data-models/context.jsonld>; rel="http://www.w3.org/ns/json-ld#context"; type="application/ld+json"'

In Orion v2 there is a maximum of 1000, so this value might be expected to work here, too. Otherwise an error should be returned indicating that only a maximum value of x is allowed for 'limit'.
Currently Orion-LD simply stops responding after this fatal error has occurred. Of course this must not happen.
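Rejecting an out-of-range limit up front would turn the fatal exit into an ordinary client error. A hedged sketch of such a check; the 1000 maximum mirrors Orion v2 as mentioned above, while the function and macro names are invented for illustration:

```c
/* Hypothetical validation for the NGSI-LD 'limit' URI parameter.
 * The maximum of 1000 mirrors Orion v2; names are illustrative. */
#define URI_PARAM_LIMIT_MAX 1000

/* Returns 1 if the limit is acceptable, 0 if the request should be
 * answered with 400 Bad Request instead of crashing the broker. */
int uriParamLimitCheck(long limit)
{
  return (limit >= 0) && (limit <= URI_PARAM_LIMIT_MAX);
}
```

With a check like this, limit=500 would be served (or fail gracefully), and anything above the documented maximum would produce a clear error response.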

@kzangeli
Collaborator

kzangeli commented May 6, 2020

Yes, this is most definitely a bug.
Orion-LD has a default limit of 1000, just like Orion.
I'm pretty sure this is a problem with the buffer size of the rendered output.
It's easy to fix by simply making it work for such big buffers (allocate a buffer of 1 gigabyte :) ).
Not so easy to streamline the calls to malloc and realloc ...
I will need to take a look at this ASAP.
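One common way to "streamline" those calls is to grow the render buffer geometrically with realloc instead of preallocating one huge block, so a large response costs O(log n) reallocations. A sketch under assumed names, not the broker's actual code:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable render buffer: capacity doubles on demand.
 * Struct and function names are illustrative assumptions. */
typedef struct RenderBuf
{
  char*  buf;   /* heap buffer, may be NULL initially */
  size_t used;  /* bytes of payload (excluding the NUL) */
  size_t cap;   /* allocated capacity */
} RenderBuf;

/* Appends the NUL-terminated string 's'; returns 0 on success,
 * -1 on allocation failure (caller decides - no exit()). */
int renderBufAppend(RenderBuf* rbP, const char* s)
{
  size_t len = strlen(s);

  while (rbP->used + len + 1 > rbP->cap)
  {
    size_t newCap = (rbP->cap == 0) ? 256 : rbP->cap * 2;
    char*  newBuf = realloc(rbP->buf, newCap);

    if (newBuf == NULL)
      return -1;

    rbP->buf = newBuf;
    rbP->cap = newCap;
  }

  memcpy(&rbP->buf[rbP->used], s, len + 1);
  rbP->used += len;
  return 0;
}
```

The key property is that allocation failure is reported to the caller (who can answer 500 for that one request) rather than terminating the whole broker, which is the behavior the thread complains about.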
